Research Milestone

HUGS: Human Gaussian Splats

Jointly learns a static scene and an animatable human, both represented as 3D Gaussians, from a short monocular clip. Skinning weights are optimized alongside the Gaussians to reduce articulation artifacts, and the result renders at high frame rates.

Authors / Team

Muhammed Kocabas · Researcher

Year

2024

Deep Dive

HUGS represents both the person and the scene with 3D Gaussians. The human Gaussians are initialized from the SMPL body model and are allowed to deviate from it to capture clothing and hair. Because articulating a point-based avatar can tear it apart near joints, the method optimizes linear blend skinning weights jointly with the Gaussians so the ensemble stays coherent under novel poses. Training takes about 30 minutes on a short monocular clip of 50 to 100 frames, and the trained model renders at 60 FPS. The authors report training roughly 100× faster than prior work while supporting novel-pose and novel-view synthesis.
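To make the skinning step concrete, here is a minimal sketch of linear blend skinning applied to a single Gaussian center. It is a generic illustration, not the HUGS implementation: the function names, the softmax parameterization of the weights, and the two-joint toy setup are all assumptions. It moves only the center, whereas a full Gaussian avatar would also transform each Gaussian's rotation and covariance.

```python
import math

def softmax(logits):
    """Turn unconstrained logits into non-negative weights that sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def rotation_z(angle):
    """3x3 rotation matrix about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply_rigid(rotation, translation, point):
    """Apply one joint's rigid transform: R @ point + t."""
    return [
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]

def blend_skin(point, weight_logits, joint_transforms):
    """Linear blend skinning: weighted sum of the point under each joint's transform.

    weight_logits stands in for per-Gaussian learnable parameters (an assumption
    of this sketch); joint_transforms is a list of (rotation, translation) pairs.
    """
    weights = softmax(weight_logits)
    posed = [0.0, 0.0, 0.0]
    for w, (rotation, translation) in zip(weights, joint_transforms):
        moved = apply_rigid(rotation, translation, point)
        posed = [p + w * m for p, m in zip(posed, moved)]
    return posed

# Toy setup: joint 0 stays fixed, joint 1 rotates 90 degrees about z.
identity = ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]], [0.0, 0.0, 0.0])
bent = (rotation_z(math.pi / 2), [0.0, 0.0, 0.0])
center = [1.0, 0.0, 0.0]

rigid = blend_skin(center, [20.0, -20.0], [identity, bent])   # follows joint 0 almost entirely
blended = blend_skin(center, [0.0, 0.0], [identity, bent])    # equal influence from both joints
```

With nearly one-hot weights the center follows a single joint, so neighboring Gaussians assigned to different joints can separate when the pose changes. Softer weights place the center between the two rigid motions, which is the degree of freedom that joint optimization can adjust to keep neighbors together.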

What we learn

01
Explicit Gaussian avatars still need a skeletal prior; otherwise pose and appearance remain under-determined from monocular video.
02
Co-optimizing skinning weights is a direct fix for tearing in point-based avatars near joints.

Verbatim quote

"Recent advances in neural rendering have improved both training and rendering times by orders of magnitude."