Speaker
Description
Surgical data technologies have not only successfully integrated inputs from various data sources (e.g., medical devices, trackers, robots and cameras) but have also applied a range of machine learning and deep learning methods (e.g., classification, segmentation or synthesis) to data-driven interventional healthcare. However, the diversity of data sources, acquisition and pre-processing methods, data types, and training and inference pipelines makes it challenging to implement low-latency applications in surgery. Recently, transformer-based models have emerged as the dominant neural networks, owing to their attention mechanisms and capacity for parallelism when processing multimodal medical data. Despite this progress, state-of-the-art transformer-based models remain heavyweight, with parameter footprints on the order of 100 MB, and are challenging to optimise for real-time applications. Hence, in this work we concentrate on a lightweight transformer-based model and employ pruning techniques to reduce model size across both training and testing workflows, aiming to enhance real-time performance. We present preliminary results from a machine learning workflow designed for real-time classification in surgical skills assessment. We also present a reproducible workflow for data collection with multimodal sensors, including a USB video camera and Bluetooth-based inertial sensors. This highlights the potential of models with small memory footprints and parameter counts to speed up inference for surgical applications. Code, data and other resources to reproduce this work are available at https://github.com/mxochicale/rtt4ssa
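
To make the pruning idea concrete, below is a minimal sketch (assuming PyTorch) of magnitude-based pruning applied to a small transformer encoder, followed by a rough latency and non-zero-parameter check. The encoder dimensions, sparsity level and input shape are hypothetical placeholders, not the actual rtt4ssa configuration.

```python
# Minimal sketch: L1 unstructured pruning of a lightweight transformer encoder.
# All dimensions and the 30% sparsity level are hypothetical examples.
import time

import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(
        d_model=64, nhead=4, dim_feedforward=128, batch_first=True
    ),
    num_layers=2,
)
model.eval()

# Zero out the 30% smallest-magnitude weights in every linear layer,
# then make the pruning permanent by removing the reparametrisation.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")

# Rough latency check on a dummy window of sensor features
# (batch=1, sequence length=30, feature dim=64).
x = torch.randn(1, 30, 64)
with torch.no_grad():
    start = time.perf_counter()
    _ = model(x)
print(f"inference: {(time.perf_counter() - start) * 1e3:.2f} ms")

# Unstructured pruning zeroes weights rather than removing them,
# so report non-zero parameters against the total count.
nonzero = sum(int((p != 0).sum()) for p in model.parameters())
total = sum(p.numel() for p in model.parameters())
print(f"non-zero parameters: {nonzero}/{total}")
```

Note that unstructured pruning on its own yields sparse weight tensors; realising actual speed-ups typically requires sparse kernels or a subsequent structured compression step.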
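The multimodal capture workflow could look like the sketch below, assuming OpenCV for the USB camera and the bleak library for the Bluetooth Low Energy inertial sensor. The device address, characteristic UUID and IMU packet layout are hypothetical placeholders that depend on the specific sensor; the blocking frame read is also simplified for brevity.

```python
# Minimal sketch: time-stamped capture of USB video frames and BLE IMU
# notifications. Device address and characteristic UUID are placeholders.
import asyncio
import time

import cv2
from bleak import BleakClient

IMU_ADDRESS = "AA:BB:CC:DD:EE:FF"  # hypothetical: replace with your sensor
IMU_CHAR_UUID = "00000000-0000-1000-8000-00805f9b34fb"  # hypothetical UUID

def on_imu_packet(_sender, data: bytearray) -> None:
    # Log raw IMU bytes with a host timestamp so the stream can later be
    # synchronised with the video frames.
    print(f"{time.time():.6f} imu {data.hex()}")

async def capture(duration_s: float = 10.0) -> None:
    cap = cv2.VideoCapture(0)  # first USB camera
    async with BleakClient(IMU_ADDRESS) as client:
        await client.start_notify(IMU_CHAR_UUID, on_imu_packet)
        t_end = time.time() + duration_s
        while time.time() < t_end:
            ok, frame = cap.read()  # blocking read, simplified for the sketch
            if ok:
                # Timestamped frames could be written to disk or a queue here.
                print(f"{time.time():.6f} frame {frame.shape}")
            await asyncio.sleep(0)  # yield so BLE notifications are handled
        await client.stop_notify(IMU_CHAR_UUID)
    cap.release()

if __name__ == "__main__":
    asyncio.run(capture())
```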