About me

I am currently an Assistant Professor at Nanjing University. Previously, I was a Postdoctoral Fellow at the Hong Kong University of Science and Technology, working with Prof. Yike Guo and Prof. Wei Xue. I was also a Postdoctoral Fellow at the Chinese University of Hong Kong, Shenzhen (CUHK-SZ), working with Prof. Haizhou Li and Prof. Zhizheng Wu. I obtained my Ph.D. degree from the Audio, Speech and Language Processing Laboratory at Northwestern Polytechnical University (ASLP@NWPU), supervised by Prof. Lei Xie. During my Ph.D. studies, I conducted research at JD AI Lab, Tencent AI Lab, and Microsoft.

My research interests mainly focus on audio, speech, and language processing; speech, audio, and music understanding and generation; emotional and expressive speech generation; conversational AI; and AI agents.

I am committed to building open-source tools and data resources for the research community, including Amphion, a toolkit for audio, music, and speech generation; Audio-FLAN, an open-source instruction-following dataset for unified understanding and generation of speech, music, and sound; and WenetSpeech4TTS, an open-source Mandarin dataset for speech generation. I have also contributed to research on large-scale generative models for speech, music, and audio, including Llasa, Spark-TTS, YuE, and AudioX.

📢 I am actively recruiting self-motivated Ph.D. students, Master’s students, and research interns. If you are interested in my research or would like to explore related topics together, please feel free to reach out.

News

Research Experience

  • 2021.11 - 2022.10, Researcher, Microsoft.
  • 2021.06 - 2021.11, Researcher, Tencent AI Lab.
  • 2019.04 - 2020.06, Researcher, Microsoft.
  • 2018.10 - 2019.04, Researcher, JD.COM AI Lab.

Academic Activities

Selected Publications
