📝 Publications

🧑‍🎨 Controllable World Model

CVPR 2026

LottieGPT: Tokenizing Vector Animation for Autoregressive Generation

Junhao Chen, Kejun Gao, Yuehan Cui, Mingze Sun, Mingjin Chen, Shaohui Wang, Xiaoxiao Long, Fei Ma, Qi Tian, Hao Zhao †, Ruqi Huang †

  • Tokenizes Lottie vector animations and finetunes a multimodal model to generate coherent, editable vector animations from text or visual prompts.
CVPR 2026

HVG-3D: Bridging Real and Simulation Domains for 3D-Conditional Hand-Object Interaction Video Synthesis

Mingjin Chen *, Junhao Chen *, Zhaoxin Fan †, Yujian Lee, Zichen Dang, Lili Wang, Yawen Cui, Lap-Pui Chau †, Yi Wang

  • HVG-3D: A 3D-aware HOI video diffusion framework with 3D ControlNet that turns one image plus 3D control signals into spatially precise, temporally coherent interaction videos.
CVPR 2026

Animator-Centric Skeleton Generation on Objects with Fine-Grained Details

Mingze Sun, Cheng Zeng, Jiansong Pei, Junhao Chen, Chaoyue Song, Shaohui Wang, Tianyuan Chang, Bin Huang, Zijiao Zeng, Ruqi Huang †

  • Uses semantic-aware tokenization, a large rigged-mesh corpus, and a density-control module to generate high-quality, controllable skeletons for complex 3D assets.
ICLR 2026

GarmentGPT: Compositional Garment Pattern Generation via Discrete Latent Tokenization

Fangsheng Weng *, Junhao Chen *, Xiang Li, Jie Qin, Hanzhong Guo, Shaochun Hao, Xiaoguang Han †

[📜Paper]

  • Uses RVQ-VAE tokenization and a VLM generator to produce garment sewing patterns from discrete latent tokens, achieving strong accuracy on large curated datasets.
arXiv 2026

From Frames to Sequences: Temporally Consistent Human-Centric Dense Prediction

Xingyu Miao, Junting Dong †, Qin Zhao, Yuhang Yang, Junhao Chen, Yang Long †

arXiv

  • Learns temporally consistent human-centric segmentation, depth, and normals via synthetic video supervision and a two-stage static→dynamic training pipeline.
ICLR 2026
Machine Vision and Applications 2026

🎙 Multi-modal

COLING 2024

MMAD: Multi-modal Movie Audio Description

Xiaojun Ye, Junhao Chen, Xiang Li, Haidong Xin, Chao Li, Sheng Zhou †, Jiajun Bu

GitHub Repo Stars [📜Paper]

  • Generates audio descriptions for movies from multi-modal input, making films more accessible to visually impaired audiences.
arXiv 2023

FineStyler: Text-guided Instance-level Fine-grained Image Style Transfer

Junhao Chen, Rong Peng, Xiang Li, Jingbo Sun, Hao Zhao, Ruqi Huang

GitHub Repo Stars Open In Colab arXiv

  • Enables text-guided, instance-level fine-grained style transfer on a single image.

👀 Large Language Model

EMNLP 2025

LLMsPark: A Benchmark for Evaluating Large Language Models in Strategic Gaming Contexts

Junhao Chen, Jingbo Sun, Xiang Li, Haidong Xin, Yuhao Xue, Yibin Xu, Hao Zhao †

  • Benchmarks LLMs in strategic gaming contexts through a game-theoretic framework.
ACL 2025
EMNLP 2023

ZhuJiu: A Multi-dimensional, Multi-faceted Chinese Benchmark for Large Language Models

Baoli Zhang, Haining Xie, Pengfan Du, Junhao Chen, Pengfei Cao, Yubo Chen †, Shengping Liu, Kang Liu, Jun Zhao

[🏆Leaderboard] arXiv [📜Paper] [🎥Video]

  • Provides a multi-dimensional, multi-faceted benchmark for evaluating the Chinese language capabilities of large language models.
ICANN 2023

Towards Energy-Efficient Sentiment Classification with Spiking Neural Networks

Junhao Chen, Xiaojun Ye, Jingbo Sun, Chao Li †

[📜Paper]

  • Applies spiking neural networks to natural language sentiment classification, with leading energy efficiency.