SIGGRAPH 2026 Conference Paper

MACE-Dance

Motion-Appearance Cascaded Experts for Music-Driven Dance Video Generation

Kaixing Yang1 Jiashu Zhu2,* Xulong Tang5 Ziqiao Peng1 Xiangyue Zhang4 Puwei Wang1,† Jiahong Wu2,† Xiangxiang Chu2 Hongyan Liu3,† Jun He1

1Renmin University of China 2AMap, Alibaba 3Tsinghua University 4Wuhan University 5Malou Tech Inc

*Project Leader   †Corresponding Authors

Abstract

We present MACE-Dance, a cascaded expert framework for music-driven dance video generation. It decouples the problem into music-to-3D motion generation and motion-guided appearance synthesis, producing dance videos with plausible motion, expressive style, coherent identity, and stable temporal details.

A Motion Expert first generates rhythm-aligned 3D SMPL motion from music; an Appearance Expert then animates the reference subject with the generated motion. This motion-appearance cascade is compared below against methods for music-driven video generation, 3D dance generation, and pose-driven animation, and is further evaluated on long sequences and diverse dance genres.
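The two-stage cascade described above can be sketched as a simple composition of two callables. This is only an illustrative sketch of the data flow, not the authors' implementation: the class name `CascadedDancePipeline`, both placeholder experts, the per-frame feature format, and the 72-value pose vector (the standard SMPL layout of 24 joints with 3 axis-angle values each) are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Standard SMPL body pose: 24 joints x 3 axis-angle values. Assumed here for
# illustration; the paper's exact motion representation may differ.
SMPL_POSE_DIM = 72

MusicFeatures = List[List[float]]  # hypothetical: one feature vector per frame
Motion = List[List[float]]         # hypothetical: one SMPL pose vector per frame
Frame = Dict[str, object]          # stand-in for a rendered video frame


@dataclass
class CascadedDancePipeline:
    """Two-stage cascade: music -> 3D motion -> video frames."""
    motion_expert: Callable[[MusicFeatures], Motion]
    appearance_expert: Callable[[str, Motion], List[Frame]]

    def generate(self, music: MusicFeatures, reference_image: str) -> List[Frame]:
        # Stage 1: the motion expert sees only the music.
        motion = self.motion_expert(music)
        # Stage 2: the appearance expert sees only the reference and the motion.
        return self.appearance_expert(reference_image, motion)


def placeholder_motion_expert(music: MusicFeatures) -> Motion:
    # Placeholder: emits a rest pose (all zeros) for every music frame.
    return [[0.0] * SMPL_POSE_DIM for _ in music]


def placeholder_appearance_expert(reference_image: str, motion: Motion) -> List[Frame]:
    # Placeholder: tags each frame with the reference identity and its pose.
    return [{"reference": reference_image, "index": i, "pose": pose}
            for i, pose in enumerate(motion)]


pipeline = CascadedDancePipeline(placeholder_motion_expert, placeholder_appearance_expert)
music = [[0.1, 0.2, 0.3] for _ in range(4)]  # 4 frames of made-up audio features
frames = pipeline.generate(music, "reference.png")
print(len(frames))
```

Because the appearance stage consumes only the reference image and the motion sequence, either expert could in principle be swapped or evaluated on its own, which is the kind of decoupling the abstract describes.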

Overview Video

Results

Comparison

Dance Video Generation

3D Dance Generation

Image Animation

Ablation Studies

Long Sequence

Diverse Genres

Citation

BibTeX

@article{yang2025mace,
  title={MACE-Dance: Motion-Appearance Cascaded Experts for Music-Driven Dance Video Generation},
  author={Yang, Kaixing and Zhu, Jiashu and Tang, Xulong and Peng, Ziqiao and Zhang, Xiangyue and Wang, Puwei and Wu, Jiahong and Chu, Xiangxiang and Liu, Hongyan and He, Jun},
  journal={arXiv preprint arXiv:2512.18181},
  year={2025}
}