Biography
I am currently a Ph.D. student at MMLab, HKU, supervised by Prof. Ping Luo. During my master's studies, I was a member of the IIG group at Tsinghua University, supervised by Prof. Yujiu Yang. I received my bachelor's degree from the Department of Automation at Tsinghua University in 2021. My research interests lie in multi-modal learning, including vision-language pre-training and large multimodal models. Recently, I have worked on both large vision-language models and visual generation.
Recent News
Selected Publications
- IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model
- Control-A-Video: Controllable Text-to-Video Diffusion Models with Motion Prior and Reward Feedback Learning
- Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning
- MAP: Multimodal Uncertainty-Aware Vision-Language Pre-training Model
- MIRTT: Learning Multimodal Interaction Representations from Trilinear Transformers for Visual Question Answering
Awards
- 2023.7, Tencent Rhino-Bird Research Scholarship
- 2024.6, Outstanding Master’s Thesis Award of Tsinghua University
- 2024.9, Shenzhen Universiade International Scholarship
Internship
- 2022.6~2023.7, AMAI, Department of Data Platform, Tencent
- 2023.9~2024.8, AI Platform, Intelligence Creation Department, ByteDance
(Last updated in September 2024)