Yuheng Ji (冀昱衡)

I am Yuheng Ji, a lyric poet, a passionate lover of life, and a master's student at the Institute of Automation, Chinese Academy of Sciences (CASIA). I am supervised by Prof. Xiaolong Zheng. My research interests include vision-language models and embodied AI.

Email  /  Google Scholar  /  Poetry Anthology

Research

My research interests primarily lie in embodied AI and computer vision.
* denotes equal contributions.

Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning
Huajie Tan*, Yuheng Ji*, Xiaoshuai Hao*, Minglan Lin, Pengwei Wang, Shanghang Zhang
arXiv, 2025
Project / Paper

We developed Reason-RFT, a novel reinforcement fine-tuning framework that enhances visual reasoning capabilities in Vision-Language Models (VLMs). Reason-RFT employs a two-phase training strategy: (1) Supervised Fine-Tuning (SFT) with curated Chain-of-Thought (CoT) data to activate reasoning potential, followed by (2) Group Relative Policy Optimization (GRPO)-based reinforcement learning to generate diverse reasoning-response pairs.

RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete
Yuheng Ji*, Huajie Tan*, Jiayu Shi*, Xiaoshuai Hao*, Yuan Zhang, Hengyuan Zhang, Pengwei Wang, Mengdi Zhao, Yao Mu, Pengju An, Xinda Xue, Qinghang Su, Huaihai Lyu, Xiaolong Zheng, Jiaming Liu, Zhongyuan Wang, Shanghang Zhang
CVPR, 2025
Project / Paper

We developed RoboBrain, an MLLM-based model that combines robotic and general multi-modal data, utilizes a multi-stage training strategy, and incorporates long videos and high-resolution images to improve its robotic manipulation capabilities. Extensive experiments demonstrate that RoboBrain achieves state-of-the-art performance across various robotic tasks, highlighting its potential to advance robotic brain capabilities.

MSC-Bench: Benchmarking and Analyzing Multi-Sensor Corruption for Driving Perception
Xiaoshuai Hao, Guanqun Liu, Yuting Zhao, Yuheng Ji, Mengchuan Wei, Haimei Zhao, Lingdong Kong, Rong Yin, Yu Liu
ICME, 2025
Project / Paper

This work introduces the Multi-Sensor Corruption Benchmark (MSC-Bench), the first comprehensive benchmark for evaluating the robustness of multi-sensor autonomous driving perception models against various sensor corruptions.

Alleviating Performance Disparity in Adversarial Spatiotemporal Graph Learning under Zero-inflated Distribution
Songran Bai, Yuheng Ji, Yue Liu, Xingwei Zhang, Xiaolong Zheng, Daniel Dajun Zeng
AAAI (Oral), 2025
Paper

Spatiotemporal Graph Learning (SGL) under Zero-Inflated Distribution (ZID) is vital for urban risk management but is susceptible to adversarial attacks. Traditional adversarial training (AT) increases performance disparities between classes. We propose the MinGRE framework to reduce these disparities while enhancing robustness, yielding more equitable models.

Enhancing Adversarial Robustness of Vision-Language Models through Low-Rank Adaptation
Yuheng Ji*, Yue Liu*, Zhicheng Zhang, Zhao Zhang, Yuting Zhao, Xiaoshuai Hao, Gang Zhou, Xingwei Zhang, Xiaolong Zheng
arXiv, 2024
Paper

We propose AdvLoRA, a parameter-efficient adversarial adaptation method based on low-rank adaptation, to improve the robustness of vision-language models.

Learning Hash Subspace from Large-Scale Multi-modal Pre-Training: A CLIP-Based Cross-modal Hashing Framework
Yuheng Ji*, Xingwei Zhang*, Gang Zhou, Xiaolong Zheng, Daniel Dajun Zeng
China Conference on Command and Control (Outstanding Paper Award), 2023
Paper

We propose a cross-modal hashing framework called CCMH (CLIP-based Cross-Modal Hashing), which transfers a well-trained real-valued semantic subspace to a hash semantic subspace.

Experience
Service
  • Reviewer for ICME'25
  • Reviewer for CVPR'25
  • Reviewer for ICLR'25
Award
  • [2024] Merit Student, UCAS, School Award.
  • [2023] Outstanding Graduate, Provincial Award.
  • [2022] Recommendation for admission to CASIA.
  • [2022] Merit Student, Provincial Award.
  • [2022] China National Scholarship for Undergraduate Students, National Award.
  • [2021] China National Scholarship for Undergraduate Students, National Award.
  • [2020] China National Scholarship for Undergraduate Students, National Award.
  • [2019-2023] Scholarships, School Award.
Others
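  • [2024] Yuheng Ji, Zhao Zhang, Xiaolong Zheng, "Low-Rankness in Large Model Fine-Tuning," Communications of the Chinese Institute of Command and Control (中国指挥与控制学会通讯) 55 (1), 44-49.
  • [2023] Yuheng Ji, Xingwei Zhang, Xiaolong Zheng, "Research on Cross-Modal Retrieval Algorithms Based on Multi-modal Pre-Training," Communications of the Chinese Institute of Command and Control (中国指挥与控制学会通讯) 46 (4), 10-16.
  • [2023] A cross-modal hashing retrieval system based on multi-modal pre-training, invention patent, first inventor.
  • [2023] A credit card fraud detection system based on graph neural networks, invention patent, first inventor.
  • [2023] An online privacy protection system for retrieval models, invention patent, second inventor.
  • [2022] A text sentiment classification system based on news topic sentences, invention patent, second inventor.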
Participation in Research Projects

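During my master's and doctoral studies, I have participated in the following research projects, where I am mainly responsible for topics such as cross-modal semantic fusion and understanding: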

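  • Research on Intelligent Social Risk Early Warning Based on Multi-modal Data Fusion, Key Program of the National Natural Science Foundation of China.
  • Management of Complex Social Systems Driven by New Technologies, National Science Fund for Distinguished Young Scholars.
  • Strategic Research on Information Technology Supporting the Modernization of National Governance, Major Consulting Project of the Academic Divisions of the Chinese Academy of Sciences.
  • Social Risk Perception and Understanding Driven by Cross-modal Multilingual Big Data, 2030 "New Generation Artificial Intelligence" Major Project.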