Umberto's profile photo
Howdy! I'm Umberto Cappellazzo, and I'm a Research Associate in the Department of Computing at Imperial College London, UK. I'm a member of the iBUG group led by Maja Pantic, and I'm fortunate to be advised by Stavros Petridis. My research focuses on the parameter-efficient scaling of large audio-visual models using Mixture of Experts and Large Language Models.

Previously, I obtained my PhD in Information Engineering and Computer Science from the University of Trento, Italy. During my PhD, I explored diverse topics, including continual learning for speech processing, parameter-efficient fine-tuning techniques (e.g., adapters, LoRA) for audio and speech tasks, and multimodal LLMs for audio-visual speech recognition, which led to nine publications at top-tier conferences. In the final year of my PhD, I spent nine months as a visiting researcher with the iBUG team at Imperial.

News

Publications

Adaptive Audio-Visual Speech Recognition via Matryoshka-Based Multimodal LLMs
U. Cappellazzo, M. Kim, S. Petridis
arXiv preprint

Large Language Models Are Strong Audio-Visual Speech Recognition Learners
U. Cappellazzo, M. Kim, H. Chen, P. Ma, S. Petridis, D. Falavigna, A. Brutti, M. Pantic
ICASSP 2025

Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers
U. Cappellazzo, D. Falavigna, A. Brutti, M. Ravanelli
IEEE MLSP 2024

Continual Contrastive Spoken Language Understanding
U. Cappellazzo, E. Fini, M. Yang, D. Falavigna, A. Brutti, B. Raj
ACL Findings 2024

Efficient Fine-tuning of Audio Spectrogram Transformers via Soft Mixture of Adapters
U. Cappellazzo, D. Falavigna, A. Brutti
Interspeech 2024

Evaluating and Improving Continual Learning in Spoken Language Understanding
M. Yang, X. Li, U. Cappellazzo, S. Watanabe, B. Raj
Interspeech 2024

Improving Continual Learning of Acoustic Scene Classification via Mutual Information Optimization
M. Yang, U. Cappellazzo, X. Li, S. Watanabe, B. Raj
ICASSP 2024

Training Dynamic Models using Early Exits for Automatic Speech Recognition on Resource-constrained Devices
G. A. Wright, U. Cappellazzo, S. Zaiem, D. Raj, L. Ondel Yang, D. Falavigna, M. Ali, A. Brutti
Self-supervision in Audio, Speech and Beyond Workshop, ICASSP 2024

Sequence-Level Knowledge Distillation for Class-Incremental End-to-End Spoken Language Understanding
U. Cappellazzo, M. Yang, D. Falavigna, A. Brutti
Interspeech 2023 (Oral)

An Investigation of the Combination of Rehearsal and Knowledge Distillation in Continual Learning for Spoken Language Understanding
U. Cappellazzo, D. Falavigna, A. Brutti
Interspeech 2023 (Poster)

Work Experience

Education

Contact

Feel free to reach out if you have any questions about my research. Plus, I'm always open to new collaborations! Please contact me at umbertocappellazzo [at] gmail [dot] com.