印刻万物 TOP3DGS
Research Milestone

LangSplat: 3D Language Gaussian Splatting

Stores distilled language features on Gaussians and splats them for open-vocabulary 3D grounding, avoiding costly NeRF volume rendering.

Authors / Team

Minghan Qin · Researcher

Year

2024

Deep Dive

LangSplat targets open-vocabulary querying in 3D by attaching language features to 3D Gaussians and rendering them with a tile-based rasterizer, the same way RGB is splatted, instead of ray marching through a NeRF. A scene-wise language autoencoder compresses the CLIP embeddings into low-dimensional latents, so each Gaussian stores a compact code rather than a full CLIP vector. Hierarchical semantics derived from segmentation priors (SAM masks in the paper) sharpen object boundaries. The paper reports a 199× speedup over LERF, a prior NeRF-based language field, at 1440 × 1080 resolution, along with improved grounding quality.
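To make the "rendered like RGB" point concrete, here is a minimal sketch of front-to-back alpha blending applied to per-Gaussian feature vectors for a single pixel. It assumes the Gaussians are already depth-sorted and that their per-pixel opacities are given; the function name `composite_features` and the toy inputs are illustrative, not taken from the LangSplat code base, and a real rasterizer does this per tile on the GPU.

```python
import numpy as np

def composite_features(features, alphas):
    """Front-to-back alpha blending of per-Gaussian features for one pixel.

    features: (N, D) array, one latent language feature per Gaussian,
              sorted near-to-far along the viewing ray.
    alphas:   (N,) array of per-Gaussian opacities in [0, 1] at this pixel.
    Returns the blended (D,) feature and the (N,) blending weights.
    """
    features = np.asarray(features, dtype=np.float64)
    alphas = np.asarray(alphas, dtype=np.float64)
    # Transmittance in front of Gaussian i: prod_{j<i} (1 - alpha_j).
    transmittance = np.concatenate(([1.0], np.cumprod(1.0 - alphas)[:-1]))
    weights = alphas * transmittance
    return weights @ features, weights

# Toy example (hypothetical values): three Gaussians with one-hot features.
feats = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
alphas = np.array([0.5, 0.5, 1.0])
pixel_feature, weights = composite_features(feats, alphas)
print(pixel_feature)  # [0.5  0.25 0.25]
```

The weights are the same ones used for color in RGB splatting, so swapping the 3-channel color for a low-dimensional language code changes the per-pixel payload rather than the rendering procedure.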

What we learn

  1. Splatting language features on explicit primitives brings inference cost close to that of RGB splatting.

  2. Scene-specific latents and segmentation priors mitigate the fuzzy boundaries of CLIP features in 3D.
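The scene-specific latent idea can be sketched as follows: fit a small encoder/decoder on one scene's features, store only the low-dimensional codes, and decode them back before comparing against a text embedding. This sketch swaps in PCA (a linear codec) for the learned autoencoder, uses synthetic vectors in place of real CLIP features, and scores with plain cosine similarity. The names `fit_linear_codec` and `cosine_relevance`, the 512-dimensional embedding size, and `latent_dim=3` are assumptions for illustration only.

```python
import numpy as np

def fit_linear_codec(feats, latent_dim):
    """Fit a PCA encoder/decoder on one scene's features.

    This is a linear stand-in for a learned scene-wise autoencoder.
    """
    mean = feats.mean(axis=0)
    # Rows of vt are the principal directions of the centered features.
    _, _, vt = np.linalg.svd(feats - mean, full_matrices=False)
    basis = vt[:latent_dim]  # (latent_dim, D)

    def encode(x):
        return (x - mean) @ basis.T

    def decode(z):
        return z @ basis + mean

    return encode, decode

def cosine_relevance(decoded, text_embedding):
    """Cosine similarity between each decoded feature and a text embedding."""
    a = decoded / np.linalg.norm(decoded, axis=1, keepdims=True)
    b = text_embedding / np.linalg.norm(text_embedding)
    return a @ b

rng = np.random.default_rng(0)
# Synthetic stand-in for one scene's features: 200 vectors of dimension 512
# lying in a 3-dimensional affine subspace, so 3 latents are enough here.
scene_feats = (rng.normal(size=(200, 3)) @ rng.normal(size=(3, 512))
               + rng.normal(size=512))

encode, decode = fit_linear_codec(scene_feats, latent_dim=3)
latents = encode(scene_feats)   # (200, 3): the compact code that would be stored
recon = decode(latents)         # (200, 512): back in the embedding space
# Pretend row 7 is the embedding of a text query and score every feature.
scores = cosine_relevance(recon, scene_feats[7])
print(latents.shape, int(np.argmax(scores)))
```

Because the codec is fit per scene, it only has to cover the features that actually occur there, which is why a very small latent can work. Real CLIP features will not lie exactly in a low-dimensional subspace, so reconstruction is lossy in practice.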

Verbatim quote

"Humans live in a 3D world and commonly use natural language to interact with a 3D scene." (source)

Tags

Paper · Semantic · Optimization · Real-time
