We present CrissCross, a self-supervised framework for learning audio-visual representations. Our framework introduces a novel notion: in addition to learning intra-modal and standard synchronous cross-modal relations, CrissCross also learns asynchronous cross-modal relations. We show that relaxing the temporal synchronicity between the audio and visual modalities lets the network learn strong time-invariant representations. Our experiments show that strong augmentations for both audio and visual modalities, combined with relaxed cross-modal temporal synchronicity, yield the best performance. We pretrain the framework on three datasets of varying size: Kinetics-Sound, Kinetics-400, and AudioSet. The learned representations are evaluated on several downstream tasks, namely action recognition, sound classification, and retrieval. CrissCross achieves state-of-the-art performance on action recognition (UCF101 and HMDB51) and sound classification (ESC50). The code and pretrained models are publicly available.
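As a rough illustration of the synchronous versus asynchronous pairing described above, the sketch below samples start times for a visual clip and an audio clip from the same video. The function name `sample_clip_pair` and the parameters `clip_len` and `max_offset` are hypothetical; the abstract does not specify how CrissCross actually samples its clips, so this is only one plausible scheme.

```python
import random


def sample_clip_pair(video_duration, clip_len, max_offset, rng=random):
    """Return (visual_start, audio_start) in seconds for one training pair.

    Hypothetical sampling scheme, not taken from the source:
    max_offset = 0 gives a synchronous pair, while a larger value relaxes
    synchronicity by letting the audio window drift from the visual one.
    """
    last_start = video_duration - clip_len
    if last_start < 0:
        raise ValueError("clip_len exceeds video_duration")
    visual_start = rng.uniform(0.0, last_start)
    offset = rng.uniform(-max_offset, max_offset)
    # Clamp so the audio window stays inside the video.
    audio_start = min(max(visual_start + offset, 0.0), last_start)
    return visual_start, audio_start


# Usage: one synchronous pair and one asynchronous pair from a 10 s video.
rng = random.Random(0)
sync_pair = sample_clip_pair(10.0, 2.0, 0.0, rng)
async_pair = sample_clip_pair(10.0, 2.0, 3.0, rng)
```

With `max_offset` set to zero the two start times coincide, which corresponds to the standard synchronous setting; any positive value produces audio and visual clips that come from the same video but need not overlap in time.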

The first row shows the generated ECG corresponding to the input PPG (second row). The last row shows the original ECG.


We propose CardioGAN, a novel framework for generating ECG signals from PPG inputs. We use attention-based generators and dual time- and frequency-domain discriminators with a CycleGAN backbone to obtain realistic ECG signals. To the best of our knowledge, no other study has attempted to generate ECG from PPG (or indeed any cross-modality signal-to-signal translation in the biosignal domain) using GANs or other deep learning techniques. Moreover, CardioGAN makes it possible to monitor cardiac activity continuously in daily life.
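A CycleGAN backbone typically trains two generators so that translating a signal to the other domain and back reproduces the input. The sketch below shows that cycle-consistency term on plain Python lists, assuming an L1 reconstruction error as in the original CycleGAN formulation. The names `ppg_to_ecg` and `ecg_to_ppg` are hypothetical stand-ins for the two generators, and the exact loss used by CardioGAN is not stated in this summary.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equal-length signals."""
    if len(a) != len(b):
        raise ValueError("signals must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(ppg, ecg, ppg_to_ecg, ecg_to_ppg):
    """Sum of PPG -> ECG -> PPG and ECG -> PPG -> ECG reconstruction errors."""
    ppg_cycle = ecg_to_ppg(ppg_to_ecg(ppg))
    ecg_cycle = ppg_to_ecg(ecg_to_ppg(ecg))
    return l1_distance(ppg, ppg_cycle) + l1_distance(ecg, ecg_cycle)


# Usage with toy "generators" (assumptions for illustration only).
def double(signal):
    return [2.0 * x for x in signal]


def halve(signal):
    return [x / 2.0 for x in signal]


# Perfectly inverse mappings give zero cycle loss.
loss = cycle_consistency_loss([1.0, 2.0], [2.0, 4.0], double, halve)
```

In a full model this term would be added to the adversarial losses from the time-domain and frequency-domain discriminators; those parts are omitted here.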

Self-supervised ECG Representation Learning

We exploit a self-supervised deep multi-task learning framework for electrocardiogram (ECG)-based emotion recognition. To the best of our knowledge, this is the first time self-supervised learning has been used for ECG-based emotion recognition.

Classification of Cognitive Load and Expertise for Adaptive Simulation

We propose an end-to-end framework for a trauma simulation that actively classifies a participant’s level of cognitive load and expertise for the development of a dynamically adaptive simulation.

Computer-Aided Diagnosis

This paper presents a deep learning method for computer-aided differential diagnosis of benign and malignant breast cancer tumors that avoids potential errors caused by poor feature selection and by class imbalance in the dataset.

