
Yatai Ji

Ph.D. Student, MMLab, HKU

Biography

I am a Ph.D. student in MMLab at HKU, supervised by Prof. Ping Luo. During my master's studies, I was a member of the IIG group at Tsinghua University, supervised by Prof. Yujiu Yang. I received my bachelor's degree from the Department of Automation at Tsinghua University in 2021. My research interests lie in multi-modal learning, including vision-language pre-training and large multimodal models. Recently, I have been working on both large vision-language models and visual generation.


Recent News

Selected Publications

  • IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model

    • Yatai Ji, Shilong Zhang, Jie Wu, Peize Sun, Weifeng Chen, Xuefeng Xiao, Sidi Yang, Yujiu Yang, Ping Luo.
    • arXiv [pdf] [code]
  • Control-A-Video: Controllable Text-to-Video Diffusion Models with Motion Prior and Reward Feedback Learning

    • Weifeng Chen*, Yatai Ji*, Jie Wu, Hefeng Wu, Pan Xie, Jiashi Li, Xin Xia, Xuefeng Xiao, Liang Lin.
    • arXiv [pdf] [code]
  • Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning

    • Yatai Ji*, Rongcheng Tu*, Jie Jiang, Weijie Kong, Chengfei Cai, Wenzhe Zhao, Hongfa Wang, Yujiu Yang, Wei Liu.
    • CVPR2023 (CCF A, research paper) [pdf] [code]
  • MAP: Multimodal Uncertainty-Aware Vision-Language Pre-training Model

    • Yatai Ji*, Junjie Wang*, Yuan Gong, Lin Zhang, Yanru Zhu, Hongfa Wang, Jiaxing Zhang, Tetsuya Sakai, Yujiu Yang.
    • CVPR2023 (CCF A, research paper) [pdf] [code]
  • MIRTT: Learning Multimodal Interaction Representations from Trilinear Transformers for Visual Question Answering

    • Junjie Wang*, Yatai Ji*, Jiaqi Sun, Yujiu Yang, Tetsuya Sakai.
    • EMNLP2021 Findings (CCF B, research paper) [pdf] [code]

Awards

  • 2023.7, Tencent Rhino-Bird Research Scholarship
  • 2024.6, Outstanding Master’s Thesis Award of Tsinghua University
  • 2024.9, Shenzhen Universiade International Scholarship

Internship

  • 2022.6~2023.7, AMAI, Department of Data Platform, Tencent
  • 2023.9~2024.8, AI Platform, Intelligence Creation Department, ByteDance

(Last updated in September 2024)