Amir Hussein

I’m a PhD student at Johns Hopkins University, working at the Center for Language and Speech Processing, advised by Prof. Sanjeev Khudanpur. I was also a JHU–Amazon AI2AI Fellow during 2024–2025.

My research focuses on building real-time speech translation models that can handle conversational and long-form speech, as well as multimodal representation learning. I have also worked on problems involving multilingual, code-switched, and low-resource ASR, along with neural audio codecs.

I completed my master’s degree in Electrical and Computer Engineering at the American University of Beirut, where I was advised by Prof. Hazem Hajj and was a member of the AUB MIND Lab. My research there focused on time series and sensing analytics, domain adaptation, and multitask learning. After that, I worked as a Machine Learning Consultant at Kanari.ai under the supervision of Dr. Ahmed Ali.

Updates

May 2025: Interning with the NVIDIA Riva team
May 2024: Interning at MERL with the Speech & Audio team
June 2023: Joining the SCALE 2023 Workshop at Johns Hopkins University
June 2022: Joining the JSALT 2022 Workshop at Johns Hopkins University

Selected Publications

Improving Multilingual ASR in the Wild Using Simple N-best Re-ranking
_{Brian Yan, Vineel Pratap, Shinji Watanabe, Michael Auli}
_{Pre-print, 2024}
_paper

Improving Massively Multilingual ASR With Auxiliary CTC Objectives
_{William Chen, Brian Yan, Jiatong Shi, Yifan Peng, Soumi Maiti, Shinji Watanabe}
_{2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023}
_{Best student paper award at IEEE ICASSP 2023}
_paper

Exploration of Efficient End-to-End ASR Using Discretized Input from Self-Supervised Learning
_{Xuankai Chang, Brian Yan, Yuya Fujita, Takashi Maekaku, Shinji Watanabe}
_{24th Annual Conference of the International Speech Communication Association (INTERSPEECH), 2023}
_paper

ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit
_{Brian Yan*, Jiatong Shi*, Yun Tang, Hirofumi Inaguma, Yifan Peng, Siddharth Dalmia, Peter Polák, Patrick Fernandes, Dan Berrebbi, Tomoki Hayashi, Xiaohui Zhang, Zhaoheng Ni, Moto Hira, Soumi Maiti, Juan Pino, Shinji Watanabe}
_{Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), 2023}
_{paper / poster}

CMU’s IWSLT 2023 Simultaneous Speech Translation System
_{Brian Yan*, Jiatong Shi*, Soumi Maiti, William Chen, Xinjian Li, Yifan Peng, Siddhant Arora, Shinji Watanabe}
_{Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT), 2023}
_{Winning submission to the IWSLT 2023 Simultaneous Speech-to-Speech Translation Track (English-to-German)}
_paper

Prompting the Hidden Talent of Web-Scale Speech Models for Zero-Shot Task Generalization
_{Puyuan Peng, Brian Yan, Shinji Watanabe, David Harwath}
_{24th Annual Conference of the International Speech Communication Association (INTERSPEECH), 2023}
_paper

CTC Alignments Improve Autoregressive Translation
_{Brian Yan, Siddharth Dalmia, Yosuke Higuchi, Graham Neubig, Florian Metze, Alan W Black, Shinji Watanabe}
_{Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023}
_{paper / talk / poster / TLDR}

Towards Zero-Shot Code-Switched Speech Recognition
_{Brian Yan, Matthew Wiesner, Ondrej Klejch, Preethi Jyothi, Shinji Watanabe}
_{2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023}
_{paper / poster / TLDR}

CMU’s IWSLT 2022 Dialect Speech Translation System
_{Brian Yan, Patrick Fernandes, Siddharth Dalmia, Jiatong Shi, Yifan Peng, Dan Berrebbi, Xinyi Wang, Graham Neubig, Shinji Watanabe}
_{Proceedings of the 19th International Conference on Spoken Language Translation (IWSLT), 2022}
_{Winning submission to the IWSLT 2022 Dialectal Track}
_{paper / talk}

Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization
_{Brian Yan, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Siddharth Dalmia, Dan Berrebbi, Chao Weng, Shinji Watanabe, Dong Yu}
_{2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022}
_{paper / talk / poster / TLDR}

My Google Scholar is more comprehensive.

Activities

Talks

Controllable and Explainable End-to-End Speech Translation
_{SIG SLT Seminar, 2022}

Code-Switched Modeling
_{JSALT Workshop, Johns Hopkins University, 2022}

Building End-to-End Speech Translation Systems
_{JSALT Workshop, Johns Hopkins University, 2022}

Teaching

EN 520.666: Information Extraction
_{Teaching Assistant}
_{Johns Hopkins University, Spring 2025}

Academic Service

Reviewer
_{JCMDS, Speech Communication, ACL, EMNLP}

Contact

Email: ahussei6[at]jh.edu