Polygon Mesh: Classical Dominance
Polygon meshes are the most classic and widely used 3D representation in computer graphics. The core idea is simple: stitch countless small triangles to form object surfaces. Like mosaic tiles, more and smaller triangles yield smoother, more detailed surfaces. A standard mesh consists of vertices, edges, and faces. Triangles offer a natural advantage: three points are always coplanar, simplifying collision detection and lighting calculations.
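The vertex/face structure described above can be sketched in a few lines of numpy. This is a minimal illustration, not any particular library's mesh format: vertices are an N×3 position array, faces index into it, and the coplanarity of a triangle's three points is exactly what makes its normal well defined.

```python
import numpy as np

# A minimal triangle mesh: vertex positions plus faces that index into them.
vertices = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

faces = np.array([   # each row: indices of one triangle's three vertices
    [0, 1, 2],
    [0, 1, 3],
])

def face_normal(v, f):
    """Unit normal of one triangle: well defined because three points are always coplanar."""
    a, b, c = v[f]
    n = np.cross(b - a, c - a)
    return n / np.linalg.norm(n)

print(face_normal(vertices, faces[0]))  # normal of the first triangle
```

Sharing vertices between faces (rather than storing each triangle independently) is also why meshes can stay compact while still feeding the GPU one triangle at a time.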
Mesh dominance stems from precise geometric control, mature toolchains (Maya, Blender, 3ds Max), and GPU hardware optimized specifically for triangle rasterization. However, limitations emerge when capturing the real world: high modeling costs, difficulty expressing complex lighting, large storage footprints, and the difficulty of generating meshes automatically from real photographs.
Point Clouds & NeRF: Strengths and Bottlenecks
Point clouds are collections of discrete 3D points, each containing position (x, y, z) and color data. LiDAR scanners capture millions of points in minutes, making them ideal for digitizing the real world. However, point clouds have fundamental flaws: they lack surface information, appear grainy, carry massive data with low information density, and are difficult to edit or animate. Point clouds are more like raw 3D material than finished products.
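The "massive data, low information density" point can be made concrete. Below is an illustrative sketch (the point count, value ranges, and 0.1-unit voxel size are all arbitrary assumptions): a cloud is just an unordered N×6 array of position plus color, and voxel downsampling is one common way to thin it out.

```python
import numpy as np

# A point cloud is an unordered N x 6 array: xyz position + rgb colour.
# No connectivity, no surfaces: hence the grainy look and editing pain.
rng = np.random.default_rng(0)
n = 100_000
points = rng.uniform(-1.0, 1.0, size=(n, 3))                 # positions (x, y, z)
colors = rng.integers(0, 256, size=(n, 3), dtype=np.uint8)   # per-point RGB
cloud = np.hstack([points, colors.astype(np.float64)])

# Voxel downsampling: keep one point per occupied 0.1-unit voxel,
# a standard way to tame raw scanner output.
voxel = np.floor(points / 0.1).astype(np.int64)
_, keep = np.unique(voxel, axis=0, return_index=True)
print(cloud.shape, cloud[keep].shape)
```

Note that downsampling discards points but still produces no surface; recovering one requires a separate reconstruction step, which is exactly the gap the later methods address.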
In 2020, NeRF (Neural Radiance Fields) proposed a revolutionary idea: train a neural network to learn a continuous function mapping position and viewing direction to color and density. NeRF captures extraordinarily subtle lighting effects and renders from any viewpoint. However, its Achilles heel is speed: training a single scene takes hours, and rendering a single frame can take tens of seconds. That bottleneck stood until 3D Gaussian Splatting found a new balance.
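The color-and-density outputs above are turned into a pixel by volume rendering along each camera ray. The sketch below shows that quadrature in numpy; the density, colors, and sample spacing are placeholder values standing in for what the trained network would actually predict.

```python
import numpy as np

def volume_render(rgb, sigma, deltas):
    """NeRF-style volume rendering along one ray:
    alpha_i = 1 - exp(-sigma_i * delta_i), weighted by transmittance T_i."""
    alpha = 1.0 - np.exp(-sigma * deltas)
    # T_i = product of (1 - alpha_j) for j < i: how much light survives to sample i
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = trans * alpha
    return (weights[:, None] * rgb).sum(axis=0)  # composited ray colour

# One ray with 64 samples; in a real NeRF, sigma and rgb come from the MLP.
n = 64
sigma = np.full(n, 2.0)                  # assumed uniform density along the ray
rgb = np.tile([1.0, 0.0, 0.0], (n, 1))   # a solid red medium
deltas = np.full(n, 0.05)                # spacing between samples
color = volume_render(rgb, sigma, deltas)
print(color)
```

Evaluating the network at dozens of samples per ray, for every pixel, is precisely why per-frame rendering is so slow.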
The Compromise Art of 3DGS
3D Gaussian Splatting (3DGS), proposed in 2023, finds the best balance between NeRF's continuity and Mesh's efficiency. The core idea is to fill space with millions of 3D Gaussian distributions (ellipsoids), each carrying color and opacity information. During rendering, these 3D Gaussians are 'splatted' onto the 2D image plane, generating the final frame through fast rasterization.
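The "splat" step can be sketched as a covariance projection: a 3D Gaussian's ellipsoid becomes a 2D Gaussian on the image plane via the camera rotation and a local affine approximation of the perspective projection. The numbers below (focal length, depth, covariance) are purely illustrative.

```python
import numpy as np

# Project one 3D Gaussian's covariance into screen space:
# Sigma2D = J W Sigma3D W^T J^T, where W is the world-to-camera rotation
# and J the Jacobian of the perspective projection.
Sigma3D = np.diag([0.04, 0.01, 0.01])   # anisotropic ellipsoid (assumed values)
W = np.eye(3)                           # identity camera rotation for simplicity
x, y, z = 0.0, 0.0, 2.0                 # Gaussian centre in camera space
f = 500.0                               # focal length in pixels (assumed)
J = np.array([                          # affine approximation of projection
    [f / z, 0.0, -f * x / z**2],
    [0.0, f / z, -f * y / z**2],
])
Sigma2D = J @ W @ Sigma3D @ W.T @ J.T   # 2x2 screen-space covariance
print(Sigma2D)
```

The resulting 2×2 covariance defines the elliptical footprint the rasterizer fills, which is why each Gaussian costs little more than drawing a small sprite.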
3DGS cleverly combines multiple technical advantages: it starts from SfM point clouds, inherits their ease of capture; uses explicit geometric primitives (Gaussian ellipsoids) with directly optimizable parameters; employs an efficient GPU rasterization pipeline for real-time rendering; and uses adaptive density control to automatically increase Gaussian density in detail-rich regions. On high-end GPUs, 3DGS achieves 100+ FPS rendering, hundreds of times faster than NeRF, with training typically completing in 10–30 minutes.
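The real-time speed comes from the final compositing step: per pixel, the rasterizer blends the depth-sorted splats front to back and stops once the pixel is effectively opaque. A minimal sketch of that blending, with made-up splat colors and opacities:

```python
import numpy as np

def composite(colors, alphas):
    """Front-to-back alpha blending of depth-sorted splats for one pixel:
    C = sum_i T_i * alpha_i * c_i, with T_i = prod_{j<i} (1 - alpha_j)."""
    trans = 1.0
    out = np.zeros(3)
    for c, a in zip(colors, alphas):
        out += trans * a * c
        trans *= 1.0 - a
        if trans < 1e-4:   # early termination once the pixel is saturated
            break
    return out

# Three splats covering one pixel, already sorted near-to-far.
colors = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
alphas = np.array([0.5, 0.5, 1.0])
print(composite(colors, alphas))  # nearest red splat dominates; blue is mostly occluded
```

Because this loop touches only the few Gaussians overlapping each pixel, instead of querying a network at dozens of ray samples, the whole frame fits comfortably in a GPU rasterization pass.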
Philosophical Reflections on Technical Evolution
The evolution from Mesh to Point Cloud, then to NeRF and 3DGS, reveals a clear trend: from manual design to data-driven approaches, from explicit to implicit geometry, and back to a fusion of both. Mesh represents the 'Craftsman Era', Point Cloud the 'Capture Era', NeRF the 'Intelligence Era', and 3DGS the 'Balance Era'. No single representation is universally superior — the choice depends on the specific application context.