Self-Supervised Learning (SSL) is revolutionizing deep learning by enabling models to be trained without human-labeled data. In signal and image processing, cutting-edge systems rely on large Transformer networks pretrained on extensive signal/image datasets using methods such as contrastive or multitask learning. SSL is also increasingly explored in biosignal and medical image processing, where it supports diagnosis and disease prediction by leveraging unlabeled data, reducing the need for manual annotation. This approach is transforming these fields by improving model performance and enabling the use of large, diverse datasets, leading to more accurate and efficient medical diagnostic and analysis tools.
Despite the surge in accepted articles mentioning SSL techniques at recent top-tier conferences, several challenges hinder broader adoption in real-world applications: SSL models face complexity issues, lack standardized evaluation protocols, exhibit bias and robustness concerns, and are rarely integrated with related modalities such as text or video.
The LbM workshop addresses these challenges by fostering interactions among experts. Through two keynotes, a panel, and paper presentations, LbM aims to unite the SSL community, including experts from various modalities, to frame SSL as a groundbreaking solution for biosignals and biomedical image processing and beyond. This workshop provides a dedicated platform for discussing and advancing SSL technologies, paving the way for their broader application in diverse fields.
The effectiveness of deep learning relies heavily on the quality of the representations it uncovers from data [1]. These representations should capture essential structures within the raw input, such as intermediate concepts, features, or latent variables, which are beneficial for subsequent tasks. While supervised learning with large annotated datasets can yield valuable representations, acquiring such datasets is often expensive, time-consuming, and impractical. Additionally, annotations are often insufficiently detailed for many potential applications, and the supervised representations they produce may be biased toward the specific task they were trained on, limiting their utility in other contexts [2]. This challenge is particularly pronounced in various applications. For instance, annotation in medical data is challenging due to the need for specialized expertise, time-consuming manual processes, and potential inconsistencies in annotations. This leads to limited availability of annotated datasets, hindering the development and evaluation of machine learning models in medical applications, especially for rare conditions or specialized domains [3].
To address these challenges, researchers are exploring unsupervised [1] and self-supervised learning (SSL) [4-7]. In particular, SSL, i.e., the process of training models to produce meaningful representations using unlabeled data, is a promising solution to challenges caused by difficulties in curating large-scale annotations. Unlike supervised learning, SSL can create generalist models that can be fine-tuned for many downstream tasks without large-scale labeled datasets. While there is already a growing trend to leverage SSL in recent medical imaging AI literature, as well as a few narrative reviews [8, 9], the most suitable strategies and best practices for medical images have not been sufficiently investigated.
Moreover, biomedical signals (biosignals) are a fundamental resource in the biomedical domain, spanning many modalities, such as electroencephalography (EEG), electromyography (EMG), electrocardiography (ECG), magnetoencephalography (MEG), bioacoustics (BAC), electrooculography (EOG), electrodermal activity (EDA), and speech. With the progress of the Internet of Things (IoT) and the spread of wearable devices, their role is becoming increasingly relevant, especially in telemedicine and precision medicine [10]. Contrastive learning (e.g., SimCLR, BYOL, SwAV, CPC) is the primary choice in most works dealing with single- or multi-class pathology classification for SSL-based biosignal analysis [11, 12], combined with different data augmentation techniques [13]. Regardless of the choice of upstream and downstream tasks within SSL, it is important to introduce modules or blocks that can exploit important prior domain knowledge. In this regard, data augmentation techniques that are biologically meaningful and compatible with the medical task at hand may be more beneficial than adding complexity to a standard SSL pre-training strategy designed around weak augmentations.
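To make the above concrete, the following is a minimal NumPy sketch (not a prescribed pipeline; all function names are illustrative) of a SimCLR-style contrastive objective paired with two biologically plausible biosignal augmentations: jittering (additive sensor noise) and amplitude scaling (mimicking electrode-contact variability). A real system would apply the loss to encoder outputs rather than raw signals.

```python
import numpy as np

def jitter(x, sigma=0.05, seed=0):
    """Additive Gaussian noise, mimicking sensor noise (biologically plausible)."""
    rng = np.random.default_rng(seed)
    return x + rng.normal(0.0, sigma, x.shape)

def amplitude_scale(x, sigma=0.1, seed=1):
    """Per-sample amplitude scaling, mimicking electrode-contact variability."""
    rng = np.random.default_rng(seed)
    return x * rng.normal(1.0, sigma, (x.shape[0], 1))

def nt_xent(z1, z2, tau=0.5):
    """SimCLR's NT-Xent loss over two batches of embeddings (rows = samples)."""
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine similarity
    sim = z @ z.T / tau
    n = z1.shape[0]
    np.fill_diagonal(sim, -np.inf)                     # drop self-pairs
    pos_idx = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    pos = sim[np.arange(2 * n), pos_idx]               # augmented-pair logits
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    return float(np.mean(logsumexp - pos))

# Two augmented "views" of the same batch of 1-D signals
rng = np.random.default_rng(42)
signals = rng.normal(size=(8, 64))
loss = nt_xent(jitter(signals), amplitude_scale(signals))
```

Because jitter and scaling preserve each signal's identity, the two views of a sample remain similar, and the loss rewards matching them over unrelated samples.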
The current trend is to evaluate SSL models solely on their downstream performance, which necessitates potentially dozens or hundreds of costly fine-tunings. Another direction, currently represented by scarce literature [14], proposes to measure the quality and robustness of a given representation without downstream fine-tuning, drastically speeding up the development process. The LbM workshop will offer a forum for the SSL community to actively discuss best practices for exploiting important domain knowledge, designing biologically meaningful augmentations, and evaluating SSL models.
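As an illustration of such label-free evaluation, one proxy from this line of work scores a representation by the effective rank of its embedding matrix, computed from the entropy of the singular-value spectrum: a collapsed representation scores near 1, a well-spread one near the embedding dimension. The NumPy sketch below shows one such metric under our own assumptions, not the protocol of any particular paper.

```python
import numpy as np

def effective_rank(z, eps=1e-7):
    """Label-free representation quality proxy: exponentiated entropy of the
    normalized singular-value spectrum of the embedding matrix z (rows =
    samples). Higher values indicate less dimensional collapse."""
    s = np.linalg.svd(z, compute_uv=False)
    p = s / (s.sum() + eps) + eps      # normalized spectrum, kept positive
    return float(np.exp(-(p * np.log(p)).sum()))

rng = np.random.default_rng(0)
healthy = rng.normal(size=(200, 32))                              # well-spread
collapsed = np.outer(rng.normal(size=200), rng.normal(size=32))   # rank-1
```

A metric like this can flag a collapsed pretraining run in seconds, where confirming the same failure through fine-tuning would cost a full downstream training cycle.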
Multimodality is certainly the use case where SSL could unleash its true potential. In the medical domain, multimodal data are often complementary to each other, meaning that each type of data (e.g., images, sensor data, text reports) can be used to extract unique latent representations and allow a better understanding of pathology at the subject level. Combining embeddings from different sources may improve the generalization capability of models and open new possibilities for deep phenotyping [15] and precision medicine. Nevertheless, achieving multimodal SSL is a long-term goal that we should tackle step by step as a community. With LbM, we hope to encourage original research on SSL for biosignals combined with other modalities (biomedical images, text descriptions).
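One simple starting point for combining embeddings from different sources is late fusion; the NumPy sketch below (encoder names and dimensions are hypothetical) L2-normalizes each modality's embedding so that no single modality dominates the joint representation, then concatenates them into one subject-level vector. Contrastive alignment across modalities, in the spirit of CLIP, would be a natural next step beyond this.

```python
import numpy as np

def late_fusion(embeddings):
    """Concatenate L2-normalized per-modality embeddings (rows = subjects)."""
    parts = [e / np.linalg.norm(e, axis=1, keepdims=True) for e in embeddings]
    return np.concatenate(parts, axis=1)

rng = np.random.default_rng(0)
ecg_emb = rng.normal(size=(10, 128))     # e.g., from a biosignal encoder
image_emb = rng.normal(size=(10, 256))   # e.g., from an imaging encoder
text_emb = rng.normal(size=(10, 64))     # e.g., from a report encoder
joint = late_fusion([ecg_emb, image_emb, text_emb])  # per-subject joint vector
```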
Another SSL problem is the data aggregation procedure. Although there is evidence that the aggregation of multiple datasets can improve the accuracy of downstream models [16], [17], domain adaptation [18], i.e., the problem of avoiding significant performance degradation due to changes in the marginal distribution of the feature space (domain shift), remains a major issue that needs further investigation. Since SSL for medical images is a promising yet nascent research area and the optimal strategies for training these models are still to be explored, we hope that the LbM workshop will stimulate researchers to discuss and explore different categories of self-supervised learning for their biosignal/medical image datasets, in addition to fine-tuning strategies and pre-trained weights.
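Before aggregating datasets, it can help to quantify the domain shift between them. One common discrepancy measure is the (biased) maximum mean discrepancy with an RBF kernel, sketched below in NumPy under the assumption that features from each site are given as row matrices; the "site" variables are illustrative stand-ins for features extracted from different acquisition sources.

```python
import numpy as np

def rbf_kernel(a, b, gamma=0.1):
    """Pairwise RBF kernel between rows of a and b."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq)

def mmd(x, y, gamma=0.1):
    """Biased MMD^2 estimate: near 0 when x and y share a distribution,
    larger under domain shift (a changed marginal feature distribution)."""
    return float(rbf_kernel(x, x, gamma).mean()
                 + rbf_kernel(y, y, gamma).mean()
                 - 2.0 * rbf_kernel(x, y, gamma).mean())

rng = np.random.default_rng(0)
site_a = rng.normal(size=(50, 4))
site_b = rng.normal(size=(50, 4))            # same distribution as site_a
site_c = rng.normal(loc=3.0, size=(50, 4))   # shifted acquisition conditions
```

A large discrepancy between two candidate datasets can signal that naive pooling will hurt, suggesting per-site normalization or an explicit domain-adaptation step before aggregation.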
Although there exist several tools that facilitate the processing of biosignals/biomedical images, the lack of data standardization is still a relevant issue, which could negatively affect the investigation of novel SSL approaches. In addition, social and technical biases in SSL models are an active field of research [18-20]. Interestingly, however, and despite the clear growing adoption of SSL, the inclusiveness and robustness of state-of-the-art (SotA) models remains a completely open question. More precisely, SSL architectures currently struggle to encompass information from diverse populations and different pathologies, making them potentially unfair or unreliable under realistic conditions [19-22]. Furthermore, the complexity of biosignals/biomedical images and videos is reflected not only in the neural architectures that must be tailored to the considered domain, but also in the extreme compute resources required to train such models, which often takes weeks or months on hundreds of high-end GPUs, each worth many thousands of euros [23]. Hence, despite an astonishing short-term jump in performance, large-scale SSL models may quickly become a major barrier for academic research, as it is already impossible for the vast majority of institutions to train them, leaving the field reliant on two or three companies. Very few attempts have been made to solve this issue [24], and we hope that the LbM workshop will foster interest around the efficiency of SSL models, which appears to be a critical topic in a world facing climate change.
The subjects proposed for the LbM workshop fit squarely within the aims of ICIP 2024: they address an emerging approach in signal processing and computer vision applied to biomedical engineering, and would complement and add value to the conference. By proposing SSL-based solutions in the areas of medical diagnostics, precision medicine, and disease prediction, the workshop will attract researchers from both computer science and biomedical engineering and generate new momentum in the field. The LbM focus on the emerging field of SSL within the scope of biomedical engineering fosters interdisciplinarity and synergies with other communities, such as medical doctors and researchers, following the example of flagship conferences such as IEEE ISBI, EMBC, and ICASSP.
References:
Planned composition: The LbM workshop will be structured as follows:
Measures to ensure diversity:


Sheikh Zayed Institute for Pediatric Surgical Innovation at Children's National Hospital, Washington, D.C.; Dept. of Biomedical Engineering, George Washington University, School of Medicine and Health Sciences, Washington, D.C. (MLingura@childrensnational.org)
