Hi, I am a PhD student at the School of Data Science, the Chinese University of Hong Kong, Shenzhen, supervised by Prof. Haizhou Li. Before that, I received my bachelor’s degree from the Southern University of Science and Technology, where I was supervised by Prof. Tom Ko. My research interests include automatic speech recognition, speech pre-training, and large language models. I have published papers in leading international AI conferences and journals, including TASLP, NeurIPS, ACL, and ICASSP.
📖 Education
- 2022.09 - now, Ph.D., the Chinese University of Hong Kong, Shenzhen.
- 2024.01 - now, Visiting Student, National University of Singapore.
- 2016.09 - 2020.06, B.Eng, Southern University of Science and Technology.
- 2018.09 - 2019.05, Visiting Student, the University of Edinburgh.
💻 Internships
- 2024.03 - now, Research Intern, ByteDance, mentored by Prof. Zhizheng Wu and Dr. Xiaohai Tian.
- 2022.06 - 2022.12, Research Intern, ByteDance, mentored by Prof. Tom Ko.
- 2021.06 - 2022.04, Research Intern, MSRA NLC group, Beijing, mentored by Dr. Long Zhou and Dr. Shujie Liu.
- 2019.06 - 2019.08, Machine Learning Intern, Tencent, Shenzhen.
📝 Publications
- USED: Universal Speaker Extraction and Diarization, Junyi Ao, Mehmet Sinan Yıldırım, Ruijie Tao, Meng Ge, Shuai Wang, Yanmin Qian, Haizhou Li, TASLP 2024
- SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words, Junyi Ao, Yuancheng Wang, Xiaohai Tian, Dekun Chen, Jun Zhang, Lu Lu, Yuxuan Wang, Haizhou Li, Zhizheng Wu, NeurIPS 2024
- Text-guided HuBERT: Self-Supervised Speech Pre-training via Generative Adversarial Networks, Duo Ma, Xianghu Yue, Junyi Ao, Xiaoxue Gao, Haizhou Li, IEEE Signal Processing Letters 2024
- SA-WavLM: Speaker-Aware Self-Supervised Pre-training for Mixture Speech, Jingru Lin, Meng Ge, Junyi Ao, Liqun Deng, Haizhou Li, INTERSPEECH 2024
- CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning, Chutong Meng, Junyi Ao, Tom Ko, Mingxuan Wang, Haizhou Li, INTERSPEECH 2023
- Self-Supervised Acoustic Word Embedding Learning via Correspondence Transformer Encoder, Jingru Lin, Xianghu Yue, Junyi Ao, Haizhou Li, INTERSPEECH 2023
- token2vec: A Joint Self-Supervised Pre-training Framework Using Unpaired Speech and Text, Xianghu Yue, Junyi Ao, Xiaoxue Gao, Haizhou Li, ICASSP 2023
- Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Speech Data, Junyi Ao, Ziqiang Zhang, Long Zhou, Shujie Liu, Haizhou Li, Tom Ko, Lirong Dai, Jinyu Li, Yao Qian, Furu Wei, INTERSPEECH 2022
- SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing, Junyi Ao, Rui Wang, Long Zhou, Chengyi Wang, Shuo Ren, Yu Wu, Shujie Liu, Tom Ko, Qing Li, Yu Zhang, Zhihua Wei, Yao Qian, Jinyu Li, Furu Wei, ACL 2022
- SpeechUT: Bridging Speech and Text with Hidden-Unit for Encoder-Decoder Based Speech-Text Pre-training, Ziqiang Zhang, Long Zhou, Junyi Ao, Shujie Liu, Lirong Dai, Jinyu Li, Furu Wei, EMNLP 2022
- LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT, Rui Wang, Qibing Bai, Junyi Ao, Long Zhou, Zhixiang Xiong, Zhihua Wei, Yu Zhang, Tom Ko, Haizhou Li, INTERSPEECH 2022
- The YiTrans Speech Translation System for IWSLT 2022 Offline Shared Task, Ziqiang Zhang, Junyi Ao, Long Zhou, Shujie Liu, Furu Wei, Jinyu Li, ACL@IWSLT 2022
- Multi-View Self-Attention Based Transformer for Speaker Recognition, Rui Wang, Junyi Ao, Long Zhou, Shujie Liu, Zhihua Wei, Tom Ko, Qing Li, Yu Zhang, ICASSP 2022
- Improving Attention-based End-to-end ASR by Incorporating an N-gram Neural Network, Junyi Ao, Tom Ko, ISCSLP 2021
🎖 Others
Reviewer
- IEEE Transactions on Multimedia (TMM)
- The International Conference on Learning Representations (ICLR)
- IEEE Signal Processing Letters (SPL)
- The International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- INTERSPEECH
- National Conference on Man-Machine Speech Communication (NCMMSC)
Teaching
- Leading TA, DDA3020 Machine Learning, Spring 2023
- TA, CSC3100 Data Structures, Fall 2022