Hi, I am a PhD student at the School of Data Science, The Chinese University of Hong Kong, Shenzhen, supervised by Prof. Haizhou Li. Prior to that, I received my bachelor’s degree from Southern University of Science and Technology, where I was supervised by Prof. Tom Ko. My research interests include automatic speech recognition, speech pre-training, and spoken language models. I have published several papers at top international AI conferences and journals such as TASLP, NeurIPS, ICLR, ACL, EMNLP, and ICASSP.

📖 Education

2022.09 - now, Ph.D., The Chinese University of Hong Kong, Shenzhen.
2024.01 - 2024.12, Visiting Student, National University of Singapore.
2016.09 - 2020.06, B.Eng, Southern University of Science and Technology.
2018.09 - 2019.05, Visiting Student, the University of Edinburgh.

💻 Internships

2025.05 - 2025.11, Research Scientist Intern, Meta Superintelligence Labs.
2024.03 - 2025.05, Research Intern, ByteDance, mentored by Prof. Zhizheng Wu and Dr. Xiaohai Tian.
2022.06 - 2022.12, Research Intern, ByteDance, mentored by Prof. Tom Ko.
2021.06 - 2022.04, Research Intern, MSRA NLC group, Beijing, mentored by Dr. Long Zhou and Dr. Shujie Liu.
2019.06 - 2019.08, Machine Learning Intern, Tencent, Shenzhen.

📝 Publications

A Two-Stage Self-Supervised Speech Representation Learning for Acoustic, Phonetic and Semantic Modeling, Jingru Lin, Junyi Ao, Meng Ge, Mengling Feng, Haizhou Li, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2026
Scaling Speech Tokenizers with Diffusion Autoencoders, Yuancheng Wang, Zhenyu Tang, Yun Wang, Arthur Hinsvark, Yingru Liu, Yinghao Li, Kainan Peng, Junyi Ao, Mingbo Ma, Mike Seltzer, Qing He, Xubo Liu, ICLR 2026
EchoMind: An Interrelated Multi-Level Benchmark for Evaluating Empathetic Speech Language Models, Li Zhou, Lutong Yu, You Lyu, Yihang Lin, Zefeng Zhao, Junyi Ao, Yuhao Zhang, Benyou Wang, Haizhou Li, ICLR 2026
USED: Universal Speaker Extraction and Diarization, Junyi Ao, Mehmet Sinan Yıldırım, Ruijie Tao, Meng Ge, Shuai Wang, Yanmin Qian, Haizhou Li, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2025
Leveraging Language Information for Target Language Extraction, Mehmet Sinan Yıldırım, Ruijie Tao, Wupeng Wang, Junyi Ao, Haizhou Li, APSIPA ASC 2025
SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words, Junyi Ao, Yuancheng Wang, Xiaohai Tian, Dekun Chen, Jun Zhang, Lu Lu, Yuxuan Wang, Haizhou Li, Zhizheng Wu, NeurIPS Datasets and Benchmarks Track 2024
SA-WavLM: Speaker-Aware Self-Supervised Pre-Training for Mixture Speech, Jingru Lin, Meng Ge, Junyi Ao, Liqun Deng, Haizhou Li, INTERSPEECH 2024
Text-Guided HuBERT: Self-Supervised Speech Pre-Training via Generative Adversarial Networks, Duo Ma, Xianghu Yue, Junyi Ao, Xiaoxue Gao, Haizhou Li, IEEE Signal Processing Letters, 2024
CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning, Chutong Meng, Junyi Ao, Tom Ko, Mingxuan Wang, Haizhou Li, INTERSPEECH 2023
Token2vec: A Joint Self-Supervised Pre-Training Framework Using Unpaired Speech and Text, Xianghu Yue, Junyi Ao, Xiaoxue Gao, Haizhou Li, ICASSP 2023
Self-Supervised Acoustic Word Embedding Learning via Correspondence Transformer Encoder, Jingru Lin, Xianghu Yue, Junyi Ao, Haizhou Li, INTERSPEECH 2023
Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Speech Data, Junyi Ao, Ziqiang Zhang, Long Zhou, Shujie Liu, Haizhou Li, Tom Ko, Lirong Dai, Jinyu Li, Yao Qian, Furu Wei, INTERSPEECH 2022
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing, Junyi Ao, Rui Wang, Long Zhou, Chengyi Wang, Shuo Ren, Yu Wu, Shujie Liu, Tom Ko, Qing Li, Yu Zhang, Zhihua Wei, Yao Qian, Jinyu Li, Furu Wei, ACL 2022
The YiTrans Speech Translation System for the IWSLT 2022 Offline Shared Task, Ziqiang Zhang, Junyi Ao, Proceedings of the 19th International Conference on Spoken Language Translation, 2022
SpeechUT: Bridging Speech and Text with Hidden-Unit for Encoder-Decoder Based Speech-Text Pre-Training, Ziqiang Zhang, Long Zhou, Junyi Ao, Shujie Liu, Lirong Dai, Jinyu Li, Furu Wei, EMNLP 2022
LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT, Rui Wang, Qibing Bai, Junyi Ao, Long Zhou, Zhixiang Xiong, Zhihua Wei, Yu Zhang, Tom Ko, Haizhou Li, INTERSPEECH 2022
Multi-View Self-Attention Based Transformer for Speaker Recognition, Rui Wang, Junyi Ao, Long Zhou, Shujie Liu, Zhihua Wei, Tom Ko, Qing Li, Yu Zhang, ICASSP 2022
Improving Attention-based End-to-end ASR by Incorporating an N-gram Neural Network, Junyi Ao, Tom Ko, ISCSLP 2021

📜 Preprints

Audio Deepfake Verification, Li Wang, Junyi Ao, Linyong Gan, Yuancheng Wang, Xueyao Zhang, Zhizheng Wu, arXiv preprint arXiv:2509.08476, 2025
Solla: Towards a Speech-Oriented LLM That Hears Acoustic Context, Junyi Ao, Dekun Chen, Xiaohai Tian, Wenjie Feng, Jun Zhang, Lu Lu, Yuxuan Wang, Haizhou Li, Zhizheng Wu, arXiv preprint arXiv:2503.15338, 2025
Overview of the Amphion Toolkit (v0.2), Jiaqi Li, Xueyao Zhang, Yuancheng Wang, Haorui He, Chaoren Wang, Li Wang, Huan Liao, Junyi Ao, Zeyu Xie, Yiqiao Huang, et al., arXiv preprint arXiv:2501.15442, 2025
The NUS-HLT System for the ICASSP 2024 ICMC-ASR Grand Challenge, Meng Ge, Yizhou Peng, Yidi Jiang, Jingru Lin, Junyi Ao, Mehmet Sinan Yildirim, Shuai Wang, Haizhou Li, Mengling Feng, arXiv preprint arXiv:2312.16002, 2023

🎖 Others

Reviewer

IEEE Transactions on Multimedia (TMM)
The International Conference on Learning Representations (ICLR)
The Conference on Neural Information Processing Systems (NeurIPS)
The Annual Meeting of the Association for Computational Linguistics (ACL)
IEEE Signal Processing Letters (SPL)
Computer Speech and Language
The International Conference on Acoustics, Speech and Signal Processing (ICASSP)
INTERSPEECH
International Joint Conference on Neural Networks (IJCNN)
National Conference on Man-Machine Speech Communication (NCMMSC)

Teaching

Leading TA, DDA3020 Machine Learning, Spring 2023
TA, CSC3100 Data Structures, Fall 2022

Junyi Ao (敖君逸)
