I’m a PhD student at Johns Hopkins University, working at the Center for Language and Speech Processing, advised by Prof. Sanjeev Khudanpur. I was also a JHU–Amazon AI2AI Fellow during 2024–2025.
My research focuses on building real-time speech translation models that can handle conversational and long-form speech, as well as multimodal representation learning. I have also worked on problems involving multilingual, code-switched, and low-resource ASR, along with neural audio codecs.
I completed my master’s degree in Electrical and Computer Engineering at the American University of Beirut, where I was advised by Prof. Hazem Hajj and was a member of the AUB MIND Lab. My research there focused on time series and sensing analytics, domain adaptation, and multitask learning. After that, I worked as a Machine Learning Consultant at Kanari.ai under the supervision of Dr. Ahmed Ali.
Updates
- May 2025: Interning with the NVIDIA RIVA team
- May 2024: Interning at MERL with the Speech & Audio team
- June 2023: Joining the SCALE 2023 Workshop at Johns Hopkins University
- June 2022: Joining the JSALT 2022 Workshop at Johns Hopkins University
Selected Publications
Improving Multilingual ASR in the Wild Using Simple N-best Re-ranking
Brian Yan, Vineel Pratap, Shinji Watanabe, Michael Auli
Pre-print, 2024
paper
Improving Massively Multilingual ASR With Auxiliary CTC Objectives
William Chen, Brian Yan, Jiatong Shi, Yifan Peng, Soumi Maiti, Shinji Watanabe
2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023
Best student paper award at IEEE ICASSP 2023
paper
Exploration of Efficient End-to-End ASR Using Discretized Input from Self-Supervised Learning
Xuankai Chang, Brian Yan, Yuya Fujita, Takashi Maekaku, Shinji Watanabe
24th Annual Conference of the International Speech Communication Association (INTERSPEECH), 2023
paper
ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit
Brian Yan*, Jiatong Shi*, Yun Tang, Hirofumi Inaguma, Yifan Peng, Siddharth Dalmia, Peter Polák, Patrick Fernandes, Dan Berrebbi, Tomoki Hayashi, Xiaohui Zhang, Zhaoheng Ni, Moto Hira, Soumi Maiti, Juan Pino, Shinji Watanabe
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), 2023
paper / poster
CMU’s IWSLT 2023 Simultaneous Speech Translation System
Brian Yan*, Jiatong Shi*, Soumi Maiti, William Chen, Xinjian Li, Yifan Peng, Siddhant Arora, Shinji Watanabe
Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT), 2023
Winning submission to the IWSLT 2023 Simultaneous Speech-to-Speech Translation Track (English-to-German)
paper
Prompting the Hidden Talent of Web-Scale Speech Models for Zero-Shot Task Generalization
Puyuan Peng, Brian Yan, Shinji Watanabe, David Harwath
24th Annual Conference of the International Speech Communication Association (INTERSPEECH), 2023
paper
CTC Alignments Improve Autoregressive Translation
Brian Yan, Siddharth Dalmia, Yosuke Higuchi, Graham Neubig, Florian Metze, Alan W Black, Shinji Watanabe
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
paper / talk / poster / TLDR
Towards Zero-Shot Code-Switched Speech Recognition
Brian Yan, Matthew Wiesner, Ondrej Klejch, Preethi Jyothi, Shinji Watanabe
2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023
paper / poster / TLDR
CMU’s IWSLT 2022 Dialect Speech Translation System
Brian Yan, Patrick Fernandes, Siddharth Dalmia, Jiatong Shi, Yifan Peng, Dan Berrebbi, Xinyi Wang, Graham Neubig, Shinji Watanabe
Proceedings of the 19th International Conference on Spoken Language Translation (IWSLT), 2022
Winning submission to the IWSLT 2022 Dialectal Track
paper / talk
Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization
Brian Yan, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Siddharth Dalmia, Dan Berrebbi, Chao Weng, Shinji Watanabe, Dong Yu
2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022
paper / talk / poster / TLDR
My Google Scholar is more comprehensive.
Activities
Talks
Controllable and Explainable End-to-End Speech Translation
SIG SLT Seminar, 2022
Code-Switched Modeling
JSALT Workshop, Johns Hopkins University, 2022
Building End-to-End Speech Translation Systems
JSALT Workshop, Johns Hopkins University, 2022
Teaching
EN 520.666: Information Extraction
Teaching Assistant
Johns Hopkins University, Spring 2025
Academic Service
Reviewer
JCMDS, Speech Communication, ACL, EMNLP
Contact
Email: ahussei6[at]jh.edu
