# Awesome-3D-Human-Motion-Generation

3D Human Motion Generation aims to synthesize natural and plausible motions from conditioning signals such as text descriptions, action labels, and music.

This repository mainly tracks mainstream Text-to-Motion works, and also collects related papers and datasets.

Last updated: 2024/07/24 (Partial ECCV'24 added)

## Content Catalog

- Datasets
- Text-to-Motion

## Metrics

### Motion quality

- Frechet Inception Distance (FID) $\downarrow$
  - FID is the principal metric for motion quality: it measures the distance between the feature distributions of generated and real motions. The feature extractor employed is from [T2M]. A sketch of the computation follows this list.
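For reference, here is a minimal NumPy sketch of how this flavor of FID is typically computed. It assumes `real_feats` and `gen_feats` are `(num_samples, feat_dim)` arrays of motion features already extracted with the [T2M] evaluator; the function and array names are illustrative, not the official evaluation code.

```python
import numpy as np
from scipy import linalg

def compute_fid(real_feats: np.ndarray, gen_feats: np.ndarray) -> float:
    """Frechet distance between Gaussians fitted to real and generated motion features."""
    mu_r, mu_g = real_feats.mean(axis=0), gen_feats.mean(axis=0)
    cov_r = np.cov(real_feats, rowvar=False)
    cov_g = np.cov(gen_feats, rowvar=False)

    # Matrix square root of the covariance product; numerical noise can make it complex
    covmean, _ = linalg.sqrtm(cov_r @ cov_g, disp=False)
    if np.iscomplexobj(covmean):
        covmean = covmean.real

    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r + cov_g - 2.0 * covmean))
```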

### Motion diversity

- MultiModality (MModality) $\uparrow$
  - MModality measures generation diversity conditioned on the same text. For each text prompt, multiple motions are generated and the average Euclidean distance over 10 sampled pairs of their features is computed; the result is averaged over prompts.
- Diversity $\rightarrow$ (closer to the real-motion value is better)
  - Diversity measures the variability and richness of the generated motion sequences. It is computed as the average Euclidean distance between the features of 300 randomly sampled pairs of generated motions. A sketch of both metrics follows this list.
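A minimal sketch of one common variant of both diversity metrics, assuming motion features have already been extracted; `feats`, `per_prompt_feats`, and the disjoint pair-sampling scheme are illustrative assumptions, with the pair counts following the numbers quoted above.

```python
import numpy as np

def diversity(feats: np.ndarray, num_pairs: int = 300, seed: int = 0) -> float:
    """Average Euclidean distance between randomly sampled pairs of motion features.

    feats: (num_motions, feat_dim); assumes num_motions >= 2 * num_pairs.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(feats), 2 * num_pairs, replace=False)
    first, second = feats[idx[:num_pairs]], feats[idx[num_pairs:]]
    return float(np.linalg.norm(first - second, axis=1).mean())


def multimodality(per_prompt_feats: list, num_pairs: int = 10, seed: int = 0) -> float:
    """Average pairwise distance among motions generated for the *same* prompt, averaged over prompts.

    per_prompt_feats: one (num_generations, feat_dim) array per text prompt;
    assumes num_generations >= 2 * num_pairs for every prompt.
    """
    rng = np.random.default_rng(seed)
    scores = []
    for feats in per_prompt_feats:
        idx = rng.choice(len(feats), 2 * num_pairs, replace=False)
        a, b = feats[idx[:num_pairs]], feats[idx[num_pairs:]]
        scores.append(np.linalg.norm(a - b, axis=1).mean())
    return float(np.mean(scores))
```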

### Condition matching

- R-Precision $\uparrow$
  - R-Precision measures how well a generated motion matches its text description: each generated motion is ranked against a pool containing its ground-truth description plus mismatched ones, and Top-k reports the probability that the ground-truth description appears among the k nearest. A sketch of both matching metrics follows this list.
- Multi-Modal Distance (MM Dist) $\downarrow$
  - MM Dist is the average Euclidean distance between the motion feature of each generated motion and the text feature of its corresponding description in the test set.
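A sketch of both matching metrics under the common evaluation setup, where motion and text features live in a shared embedding space and each motion is ranked against a pool of 32 descriptions (its ground truth plus 31 mismatched ones); the function and array names are illustrative.

```python
import numpy as np

def r_precision(motion_feats: np.ndarray, text_feats: np.ndarray,
                top_k: int = 3, pool_size: int = 32, seed: int = 0) -> float:
    """Fraction of motions whose ground-truth description ranks in the top-k of its pool.

    motion_feats, text_feats: aligned (N, feat_dim) arrays in a shared embedding
    space; row i of text_feats describes row i of motion_feats.
    """
    rng = np.random.default_rng(seed)
    n = len(motion_feats)
    hits = 0
    for i in range(n):
        # Pool index 0 is the ground-truth description; the rest are mismatched.
        mismatched = rng.choice(np.delete(np.arange(n), i), pool_size - 1, replace=False)
        pool = np.concatenate(([i], mismatched))
        dists = np.linalg.norm(text_feats[pool] - motion_feats[i], axis=1)
        if 0 in np.argsort(dists)[:top_k]:
            hits += 1
    return hits / n


def mm_dist(motion_feats: np.ndarray, text_feats: np.ndarray) -> float:
    """Average Euclidean distance between each motion feature and its paired text feature."""
    return float(np.linalg.norm(motion_feats - text_feats, axis=1).mean())
```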

## Performance Tables

Note: the prefixes 'o-' and 'u-' in the code links denote official and unofficial implementations, respectively.

### HumanML3D

| ID | Year | Venue | Model (or Authors) | R-Precision Top-1 ↑ | R-Precision Top-2 ↑ | R-Precision Top-3 ↑ | FID ↓ | MM Dist ↓ | MultiModality ↑ | Diversity → | Code |
|---|---|---|---|---|---|---|---|---|---|---|---|
| - | - | - | Real Motion | $0.511^{\pm.003}$ | $0.703^{\pm.003}$ | $0.797^{\pm.002}$ | $0.002^{\pm.000}$ | $2.974^{\pm.008}$ | - | $9.503^{\pm.065}$ | - |
| - | - | - | Real Motion † | $0.539^{\pm.004}$ | $0.721^{\pm.003}$ | $0.810^{\pm.003}$ | $0.001^{\pm.000}$ | $1.462^{\pm.006}$ | - | $5.298^{\pm.047}$ | - |
| 1 | 2018 | NeurIPS | Seq2Seq | $0.180^{\pm.002}$ | $0.300^{\pm.002}$ | $0.396^{\pm.002}$ | $11.75^{\pm.035}$ | $5.529^{\pm.007}$ | - | $6.223^{\pm.061}$ | [u-pytorch] |
| 2 | 2019 | 3DV | Language2Pose | $0.246^{\pm.002}$ | $0.387^{\pm.002}$ | $0.486^{\pm.004}$ | $11.02^{\pm.046}$ | $5.296^{\pm.008}$ | - | $7.676^{\pm.058}$ | [o-pytorch] |
| 3 | 2021 | IEEE VR | Text2Gesture | $0.165^{\pm.001}$ | $0.267^{\pm.002}$ | $0.345^{\pm.002}$ | $5.012^{\pm.030}$ | $6.030^{\pm.008}$ | - | $6.409^{\pm.071}$ | [o-pytorch] |
| 4 | 2021 | ICCV | Hier | $0.301^{\pm.002}$ | $0.425^{\pm.002}$ | $0.552^{\pm.004}$ | $6.523^{\pm.024}$ | $5.012^{\pm.018}$ | - | $8.332^{\pm.042}$ | [o-pytorch] |
| 5 | 2022 | ECCV | TEMOS | $0.424^{\pm.002}$ | $0.612^{\pm.002}$ | $0.722^{\pm.002}$ | $3.734^{\pm.028}$ | $3.703^{\pm.008}$ | $0.368^{\pm.018}$ | $8.973^{\pm.071}$ | [o-pytorch] |
| 6 | 2022 | ECCV | TM2T | $0.424^{\pm.003}$ | $0.618^{\pm.003}$ | $0.729^{\pm.002}$ | $1.501^{\pm.017}$ | $3.467^{\pm.011}$ | $2.424^{\pm.093}$ | $8.589^{\pm.076}$ | [o-pytorch] |
| 7 | 2022 | CVPR | T2M | $0.455^{\pm.003}$ | $0.636^{\pm.003}$ | $0.736^{\pm.002}$ | $1.087^{\pm.021}$ | $3.347^{\pm.008}$ | $2.219^{\pm.074}$ | $9.175^{\pm.083}$ | [o-pytorch] |
| 8 | 2023 | ICLR | MDM | $0.320^{\pm.005}$ | $0.498^{\pm.004}$ | $0.611^{\pm.007}$ | $0.544^{\pm.044}$ | $5.566^{\pm.027}$ | $2.799^{\pm.072}$ | $9.559^{\pm.086}$ | [o-pytorch] |
| 9 | 2022 (2024) | arXiv (TPAMI) | MotionDiffuse | $0.491^{\pm.001}$ | $0.681^{\pm.001}$ | $0.782^{\pm.001}$ | $0.630^{\pm.001}$ | $3.113^{\pm.001}$ | $1.553^{\pm.042}$ | $9.410^{\pm.049}$ | [o-pytorch] |
| 10 | 2023 | CVPR | MLD | $0.481^{\pm.003}$ | $0.673^{\pm.003}$ | $0.772^{\pm.002}$ | $0.473^{\pm.013}$ | $3.196^{\pm.010}$ | $2.413^{\pm.079}$ | $9.724^{\pm.082}$ | [o-pytorch] |
| 11 | 2023 | CVPR | T2M-GPT | $0.491^{\pm.003}$ | $0.680^{\pm.003}$ | $0.775^{\pm.002}$ | $0.116^{\pm.004}$ | $3.118^{\pm.011}$ | $1.856^{\pm.011}$ | $9.761^{\pm.081}$ | [o-pytorch] |
| 12 | 2023 | ICCV | Fg-T2M | $0.492^{\pm.002}$ | $0.683^{\pm.003}$ | $0.783^{\pm.002}$ | $0.243^{\pm.019}$ | $3.109^{\pm.007}$ | $1.614^{\pm.049}$ | $9.278^{\pm.072}$ | - |
| 13 | 2023 | ICCV | M2DM | $0.497^{\pm.003}$ | $0.682^{\pm.002}$ | $0.763^{\pm.003}$ | $0.352^{\pm.005}$ | $3.134^{\pm.010}$ | $3.587^{\pm.072}$ | $9.926^{\pm.073}$ | - |
| 14 | 2023 | ICCV | AttT2M | $0.499^{\pm.003}$ | $0.690^{\pm.002}$ | $0.786^{\pm.002}$ | $0.112^{\pm.006}$ | $3.038^{\pm.007}$ | $2.452^{\pm.051}$ | $9.700^{\pm.090}$ | [o-pytorch] |
| 15 | 2023 | NeurIPS | MotionGPT | $0.492^{\pm.003}$ | $0.681^{\pm.003}$ | $0.778^{\pm.002}$ | $0.232^{\pm.008}$ | $3.096^{\pm.008}$ | $2.008^{\pm.084}$ | $9.528^{\pm.071}$ | [o-pytorch] |
| 16 | 2023 | NeurIPS | ReMoDiffuse † | $0.510^{\pm.005}$ | $0.698^{\pm.006}$ | $0.795^{\pm.004}$ | $0.103^{\pm.004}$ | $2.974^{\pm.016}$ | $1.795^{\pm.043}$ | $9.018^{\pm.075}$ | [o-pytorch] |
| 17 | 2024 | CVPR | MMM | $0.504^{\pm.003}$ | $0.696^{\pm.003}$ | $0.794^{\pm.002}$ | $0.080^{\pm.003}$ | $2.998^{\pm.007}$ | $1.164^{\pm.041}$ | $9.411^{\pm.058}$ | [o-pytorch] |
| 18 | 2024 | CVPR | MoMask | $0.521^{\pm.002}$ | $0.713^{\pm.002}$ | $0.807^{\pm.002}$ | $0.045^{\pm.002}$ | $2.958^{\pm.008}$ | $1.241^{\pm.040}$ | - | [o-pytorch] |
| 19 | 2024 | ECCV | MotionLCM | $0.502^{\pm.003}$ | $0.698^{\pm.002}$ | $0.798^{\pm.002}$ | $0.304^{\pm.012}$ | $3.012^{\pm.007}$ | $2.259^{\pm.092}$ | $9.607^{\pm.066}$ | [o-pytorch] |
| 20 | 2024 | ECCV | Motion Mamba | $0.502^{\pm.003}$ | $0.693^{\pm.002}$ | $0.792^{\pm.002}$ | $0.281^{\pm.009}$ | $3.060^{\pm.058}$ | $2.294^{\pm.058}$ | $9.871^{\pm.084}$ | [o-pytorch] |
| 21 | 2024 | ECCV | BAMM | $0.525^{\pm.002}$ | $0.720^{\pm.003}$ | $0.814^{\pm.003}$ | $0.055^{\pm.002}$ | $2.919^{\pm.008}$ | $1.687^{\pm.051}$ | $9.717^{\pm.089}$ | - |

### KIT-ML

| ID | Year | Venue | Model (or Authors) | R-Precision Top-1 ↑ | R-Precision Top-2 ↑ | R-Precision Top-3 ↑ | FID ↓ | MM Dist ↓ | MultiModality ↑ | Diversity → | Code |
|---|---|---|---|---|---|---|---|---|---|---|---|
| - | - | - | Real Motion (GT) | $0.424^{\pm.005}$ | $0.649^{\pm.006}$ | $0.779^{\pm.006}$ | $0.031^{\pm.004}$ | $2.788^{\pm.012}$ | - | $11.08^{\pm.097}$ | - |
| - | - | - | Real Motion † | $0.475^{\pm.006}$ | $0.690^{\pm.004}$ | $0.791^{\pm.005}$ | $0.002^{\pm.000}$ | $1.337^{\pm.012}$ | - | $6.371^{\pm.058}$ | - |
| 1 | 2018 | NeurIPS | Seq2Seq | $0.103^{\pm.003}$ | $0.178^{\pm.005}$ | $0.241^{\pm.006}$ | $24.86^{\pm.348}$ | $7.960^{\pm.031}$ | - | $6.744^{\pm.106}$ | [u-pytorch] |
| 2 | 2019 | 3DV | Language2Pose | $0.221^{\pm.005}$ | $0.373^{\pm.004}$ | $0.483^{\pm.005}$ | $6.545^{\pm.072}$ | $5.147^{\pm.030}$ | - | $9.073^{\pm.100}$ | [o-pytorch] |
| 3 | 2021 | IEEE VR | Text2Gesture | $0.156^{\pm.004}$ | $0.255^{\pm.004}$ | $0.338^{\pm.005}$ | $12.12^{\pm.183}$ | $6.964^{\pm.029}$ | - | $9.334^{\pm.079}$ | [o-pytorch] |
| 4 | 2021 | ICCV | Hier | $0.255^{\pm.006}$ | $0.432^{\pm.007}$ | $0.531^{\pm.007}$ | $5.203^{\pm.107}$ | $4.986^{\pm.027}$ | - | $9.563^{\pm.072}$ | [o-pytorch] |
| 5 | 2022 | ECCV | TEMOS | $0.353^{\pm.006}$ | $0.561^{\pm.007}$ | $0.687^{\pm.005}$ | $3.717^{\pm.051}$ | $3.417^{\pm.017}$ | $0.532^{\pm.034}$ | $10.84^{\pm.100}$ | [o-pytorch] |
| 6 | 2022 | ECCV | TM2T | $0.280^{\pm.005}$ | $0.463^{\pm.006}$ | $0.587^{\pm.005}$ | $3.599^{\pm.153}$ | $4.591^{\pm.026}$ | $3.292^{\pm.081}$ | $9.473^{\pm.117}$ | [o-pytorch] |
| 7 | 2022 | CVPR | T2M | $0.361^{\pm.006}$ | $0.559^{\pm.007}$ | $0.681^{\pm.007}$ | $3.022^{\pm.107}$ | $3.488^{\pm.028}$ | $2.052^{\pm.107}$ | $10.72^{\pm.145}$ | [o-pytorch] |
| 8 | 2023 | ICLR | MDM | $0.164^{\pm.004}$ | $0.291^{\pm.004}$ | $0.396^{\pm.004}$ | $0.497^{\pm.021}$ | $9.191^{\pm.022}$ | $1.907^{\pm.214}$ | $10.85^{\pm.109}$ | [o-pytorch] |
| 9 | 2022 (2024) | arXiv (TPAMI) | MotionDiffuse | $0.417^{\pm.004}$ | $0.621^{\pm.004}$ | $0.739^{\pm.004}$ | $1.954^{\pm.064}$ | $2.958^{\pm.005}$ | $0.730^{\pm.013}$ | $11.10^{\pm.143}$ | [o-pytorch] |
| 10 | 2023 | CVPR | MLD | $0.390^{\pm.008}$ | $0.609^{\pm.008}$ | $0.734^{\pm.007}$ | $0.404^{\pm.027}$ | $3.204^{\pm.027}$ | $2.192^{\pm.071}$ | $10.80^{\pm.117}$ | [o-pytorch] |
| 11 | 2023 | CVPR | T2M-GPT | $0.402^{\pm.006}$ | $0.619^{\pm.005}$ | $0.737^{\pm.006}$ | $0.717^{\pm.041}$ | $3.053^{\pm.026}$ | $1.912^{\pm.036}$ | $10.86^{\pm.094}$ | [o-pytorch] |
| 12 | 2023 | ICCV | Fg-T2M | $0.418^{\pm.005}$ | $0.626^{\pm.004}$ | $0.745^{\pm.004}$ | $0.571^{\pm.047}$ | $3.114^{\pm.015}$ | $1.019^{\pm.029}$ | $10.93^{\pm.083}$ | - |
| 13 | 2023 | ICCV | M2DM | $0.416^{\pm.004}$ | $0.628^{\pm.004}$ | $0.743^{\pm.004}$ | $0.515^{\pm.029}$ | $3.015^{\pm.017}$ | $3.325^{\pm.370}$ | $11.417^{\pm.970}$ | - |
| 14 | 2023 | ICCV | AttT2M | $0.413^{\pm.006}$ | $0.632^{\pm.006}$ | $0.751^{\pm.006}$ | $0.870^{\pm.039}$ | $3.039^{\pm.021}$ | $2.281^{\pm.047}$ | $10.96^{\pm.123}$ | [o-pytorch] |
| 15 | 2023 | NeurIPS | MotionGPT | $0.366^{\pm.005}$ | $0.558^{\pm.004}$ | $0.680^{\pm.005}$ | $0.510^{\pm.016}$ | $3.527^{\pm.021}$ | $2.328^{\pm.117}$ | $10.35^{\pm.084}$ | [o-pytorch] |
| 16 | 2023 | NeurIPS | ReMoDiffuse † | $0.427^{\pm.014}$ | $0.641^{\pm.004}$ | $0.765^{\pm.055}$ | $0.155^{\pm.006}$ | $2.814^{\pm.012}$ | $1.239^{\pm.028}$ | $10.80^{\pm.105}$ | [o-pytorch] |
| 17 | 2024 | CVPR | MMM | $0.381^{\pm.005}$ | $0.590^{\pm.006}$ | $0.718^{\pm.005}$ | $0.429^{\pm.019}$ | $3.146^{\pm.019}$ | $1.105^{\pm.026}$ | $10.633^{\pm.097}$ | [o-pytorch] |
| 18 | 2024 | CVPR | MoMask | $0.433^{\pm.007}$ | $0.656^{\pm.005}$ | $0.781^{\pm.005}$ | $0.204^{\pm.011}$ | $2.779^{\pm.022}$ | $1.131^{\pm.043}$ | - | [o-pytorch] |
| 19 | 2024 | ECCV | Motion Mamba | $0.419^{\pm.006}$ | $0.645^{\pm.005}$ | $0.765^{\pm.006}$ | $0.307^{\pm.041}$ | $3.021^{\pm.025}$ | $1.678^{\pm.064}$ | $11.02^{\pm.098}$ | [o-pytorch] |
| 20 | 2024 | ECCV | BAMM | $0.438^{\pm.009}$ | $0.661^{\pm.009}$ | $0.788^{\pm.005}$ | $0.183^{\pm.013}$ | $2.723^{\pm.026}$ | $1.609^{\pm.065}$ | $11.008^{\pm.098}$ | - |

## Paper List

### Text-to-Motion

  1. [Seq2Seq] | NeurIPS'18 | Generating Animated Videos of Human Activities from Natural Language Descriptions | [pdf] | [u-pytorch] |
  2. [Language2Pose] | 3DV'19 | Language2Pose: Natural Language Grounded Pose Forecasting | [pdf] | [o-pytorch] |
  3. [Text2Gesture] | IEEE VR'21 | Text2Gestures: A Transformer-Based Network for Generating Emotive Body Gestures for Virtual Agents | [pdf] | [o-pytorch] |
  4. [Hier] | ICCV'21 | Synthesis of Compositional Animations from Textual Descriptions | [pdf] | [o-pytorch] |
  5. [TEMOS] | ECCV'22 | TEMOS: Generating diverse human motions from textual descriptions | [pdf] | [o-pytorch] |
  6. [TM2T] | ECCV'22 | TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and Texts | [pdf] | [o-pytorch] |
  7. [T2M] | CVPR'22 | Generating Diverse and Natural 3D Human Motions from Text | [pdf] | [o-pytorch] |
  8. [MDM] | ICLR'23 | MDM: Human Motion Diffusion Model | [pdf] | [o-pytorch] |
  9. [MotionDiffuse] | arXiv'22 (TPAMI'24) | MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model | [pdf] | [o-pytorch] |
  10. [MLD] | CVPR'23 | Executing your Commands via Motion Diffusion in Latent Space | [pdf] | [o-pytorch] |
  11. [T2M-GPT] | CVPR'23 | T2M-GPT: Generating Human Motion from Textual Descriptions with Discrete Representations | [pdf] | [o-pytorch] |
  12. [Fg-T2M] | ICCV'23 | Fg-T2M: Fine-Grained Text-Driven Human Motion Generation via Diffusion Model | [pdf] | - |
  13. [M2DM] | ICCV'23 | Priority-Centric Human Motion Generation in Discrete Latent Space | [pdf] | - |
  14. [AttT2M] | ICCV'23 | AttT2M: Text-Driven Human Motion Generation with Multi-Perspective Attention Mechanism | [pdf] | [o-pytorch] |
  15. [MotionGPT] | NeurIPS'23 | MotionGPT: Human Motion as a Foreign Language | [pdf] | [o-pytorch] |
  16. [ReMoDiffuse †] | NeurIPS'23 | ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model | [pdf] | [o-pytorch] |
  17. [MMM] | CVPR'24 | MMM: Generative Masked Motion Model | [pdf] | [o-pytorch] |
  18. [MoMask] | CVPR'24 | MoMask: Generative Masked Modeling of 3D Human Motions | [pdf] | [o-pytorch] |
  19. [MotionLCM] | ECCV'24 | MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model | [pdf] | [o-pytorch] |
  20. [Motion Mamba] | ECCV'24 | Motion Mamba: Efficient and Long Sequence Motion Generation with Hierarchical and Bidirectional Selective SSM | [pdf] | [o-pytorch] |
  21. [BAMM] | ECCV'24 | BAMM: Bidirectional Autoregressive Motion Model | [pdf] | - |

### Motion Control (e.g., Spatial Constraints)

  1. [GMD] | ICCV'23 | Guided motion diffusion for controllable human motion synthesis | [pdf] | [o-pytorch] |
  2. [PhysDiff] | ICCV'23 | PhysDiff: Physics-Guided Human Motion Diffusion Model | [pdf] | - |
  3. [PriorMDM] | ICLR'24 | Human Motion Diffusion as a Generative Prior | [pdf] | [o-pytorch] |
  4. [OmniControl] | ICLR'24 | OmniControl: Control Any Joint at Any Time for Human Motion Generation | [pdf] | [o-pytorch] |
  5. [MotionLCM] | ECCV'24 | MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model | [pdf] | [o-pytorch] |

## Feedback

If you have any suggestions or find missing papers, please feel free to contact me.

## Thanks

The format of this awesome list follows this project; thanks for such a pretty template!
