Embodied AI · Motion Generation · World Models

Xiangyue ZHANG (Ian)

M.Sc. student in Computer Applications, Wuhan University

Hi there 👋. I am Xiangyue ZHANG (章湘粤), an incoming Ph.D. student in Mechano-Informatics at The University of Tokyo, supervised by Prof. Tatsuya Harada.

My research focuses on Embodied AI, 3D/2D motion generation, and world models, with a long-term interest in how intelligent agents perceive, move, and interact in physical environments.

Outside research, music is where I learn timing without equations. I sing in Xia Qian Yu (夏千嶼), an indie-rock band, and have performed at music festivals, bars, campus shows, and solo stages. Music keeps my research from becoming too mechanical: it asks me to listen before moving, to feel rhythm before explaining it, and to remember that expressive motion is something lived before it is modeled.

I am open to remote or on-site internship and visiting opportunities in embodied intelligence, motion generation, and human-centered AI systems.

News

Mar 31, 2026
🎉 MACE-Dance was accepted by SIGGRAPH 2026.
Nov 8, 2025
🎉 GlobalDiff was accepted by AAAI 2026.
Jul 5, 2025
🎉 EchoMask was accepted by ACM MM 2025.
Jun 26, 2025
🎉 SemTalk was accepted by ICCV 2025.
Apr 16, 2025
🎉 One paper has been accepted by IEEE T-CSVT.
Dec 22, 2024
🥂 Our band performed at the Hua Young Music Festival! Cheers! See the More page for pictures.


Featured Open-Source System

  1. Technical Report
    Deep Researcher Agent: An Autonomous Framework for 24/7 Deep Learning Experimentation with Zero-Cost Monitoring
    Xiangyue Zhang
    arXiv preprint arXiv:2604.05854, 2026


Selected Publications

  1. arXiv 2026
    PersonaGesture: Single-Reference Co-Speech Gesture Personalization for Unseen Speakers
    Xiangyue Zhang, Yiyi Cai, Kunhang Li, Kaixing Yang, You Zhou, Zhengqing Li, Xuangeng Chu, Jiaxu Zhang, and Haiyang Liu
    arXiv, 2026
  2. AAAI 2026
    Mitigating Error Accumulation in Co-Speech Motion Generation via Global Rotation Diffusion and Multi-Level Constraints
    Xiangyue Zhang*, Jianfang Li*†, Jianqiang Ren, and Jiaxu Zhang
    Annual AAAI Conference on Artificial Intelligence (AAAI), 2026
  3. ICCV 2025
    SemTalk: Holistic Co-speech Motion Generation with Frame-level Semantic Emphasis
    Xiangyue Zhang*, Jianfang Li*, Jiaxu Zhang, Ziqiang Dang, Jianqiang Ren, Liefeng Bo, and Zhigang Tu†
    International Conference on Computer Vision (ICCV), 2025
  4. ACM MM 2025
    EchoMask: Speech-Queried Attention-based Mask Modeling for Holistic Co-Speech Motion Generation
    Xiangyue Zhang*, Jianfang Li*, Jiaxu Zhang, Jianqiang Ren, Liefeng Bo, and Zhigang Tu†
    ACM International Conference on Multimedia (ACM MM), 2025
  5. T-CSVT 2025
    Robust 2D Skeleton Action Recognition via Decoupling and Distilling 3D Latent Features
    Xiangyue Zhang*, Yifan Jia*, Jiaxu Zhang, Yijie Yang, and Zhigang Tu†
    IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT), 2025


Experience & Education

2025.12 - 2026.03, Shenzhen

ByteDance

Research Intern, Intelligent Creation Team

Advisor: Youjiang Xu. Working on large models for streaming motion generation.



Awards & Honors

Jan 2026
Wang Zhizhuo Scholarship for Innovative Talents (top 0.3%)
Oct 2025
National Scholarship (top 3%)
Jun 2023
Outstanding Graduate Award


Service

Reviewer
NeurIPS, AAAI, ACM MM, T-CSVT