Initialization: The Starting Point
3DGS training doesn't start from scratch: it relies on sparse point clouds from Structure-from-Motion (SfM). Given a set of photos, COLMAP detects and matches feature points, estimates camera intrinsics and extrinsics, and triangulates a sparse 3D point cloud, typically thousands to tens of thousands of points, which anchors Gaussian initialization. Each SfM point seeds one initial Gaussian: position is taken directly from the point's 3D coordinates, color initializes the 0th-order spherical-harmonics (DC) term from the point's RGB value, scale is set from the distance to nearby points (the reference implementation uses the mean distance to the three nearest neighbors), rotation is the identity quaternion, and opacity starts small (e.g., 0.1).
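As a concrete sketch of this initialization, assuming NumPy arrays for the point cloud (the function name, the isotropic log-scale layout, and the brute-force neighbor search are illustrative choices, not the reference code):

```python
import numpy as np

def init_gaussians(points, colors, k=3):
    """Initialize per-Gaussian parameters from an SfM point cloud.

    points: (N, 3) triangulated 3D positions; colors: (N, 3) RGB in [0, 1].
    """
    n = len(points)
    # Position: taken directly from the SfM points.
    positions = points.copy()
    # Color: RGB seeds the 0th-order SH (DC) coefficient;
    # rendering recovers it as 0.5 + C0 * sh_dc with C0 ≈ 0.2821.
    sh_dc = (colors - 0.5) / 0.28209479177387814
    # Scale: mean distance to the k nearest neighbors, stored in log space
    # (brute-force O(N^2) distances; fine for a sketch).
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    knn_dist = np.sqrt(np.sort(d2, axis=1)[:, :k]).mean(axis=1)
    log_scales = np.repeat(np.log(knn_dist)[:, None], 3, axis=1)  # isotropic start
    # Rotation: identity quaternion (w, x, y, z) = (1, 0, 0, 0).
    rotations = np.tile([1.0, 0.0, 0.0, 0.0], (n, 1))
    # Opacity: start small so early pruning can discard dead points.
    opacities = np.full((n, 1), 0.1)
    return positions, sh_dc, log_scales, rotations, opacities
```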
Adaptive Density Control: Split and Clone
The core idea of Adaptive Density Control (ADC) is to use gradient magnitude to determine which regions need more Gaussians. During training, each Gaussian's position parameters accumulate gradients. When accumulated gradients exceed a threshold (e.g., 0.0002), the Gaussian becomes a candidate for density control. For Gaussians with large gradients and large size, a 'split' operation is applied: the Gaussian is split in two along its longest axis, with child Gaussians having halved scale in that direction, inheriting the parent's color and opacity, then the parent is deleted.
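The split rule can be sketched directly from this description. Note this follows the simplified longest-axis account above; the reference implementation instead samples child positions from the parent's distribution and divides the scales by a fixed factor of 1.6:

```python
import numpy as np

def split_gaussian(position, scale):
    """Split one large Gaussian into two children along its longest axis.

    `scale` holds per-axis standard deviations. The children sit at +/- half
    the parent's extent along the longest axis, with the scale on that axis
    halved; color and opacity (not shown) are inherited unchanged, and the
    caller deletes the parent.
    """
    axis = int(np.argmax(scale))
    offset = np.zeros(3)
    offset[axis] = 0.5 * scale[axis]
    child_scale = scale.copy()
    child_scale[axis] *= 0.5
    return [(position + offset, child_scale),
            (position - offset, child_scale)]
```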
For Gaussians with large gradients but small size, splitting is not appropriate; 3DGS instead uses 'cloning': all of the Gaussian's parameters are copied, the copy's position is offset slightly along the gradient direction, and the original is kept. After cloning, the two Gaussians gradually differentiate during subsequent optimization, each settling into its own position and parameters. Density control doesn't run every iteration: positional gradients are accumulated at every step and averaged over the interval, split/clone executes every 100 iterations (after a 500-iteration warm-up), and density control stops after iteration 15000 to enter pure optimization.
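Putting the split/clone rules together, the per-Gaussian decision and the clone step look roughly like this. The `size_threshold` default is illustrative; the reference code derives it from the scene extent (a small fraction of the scene radius). The clone offset is likewise a sketch of "slightly along the gradient direction":

```python
import numpy as np

GRAD_THRESHOLD = 0.0002  # accumulated-gradient threshold from the text

def densify_decision(avg_grad_norm, max_scale,
                     grad_threshold=GRAD_THRESHOLD, size_threshold=0.01):
    """Pick the adaptive-density-control action for one Gaussian."""
    if avg_grad_norm <= grad_threshold:
        return "keep"                      # region is already well covered
    # Large gradient: big Gaussians split, small ones clone.
    return "split" if max_scale > size_threshold else "clone"

def clone_gaussian(position, grad, step=1e-4):
    """Clone: keep the original and add a copy nudged along the gradient.

    `step` is an illustrative offset size; subsequent optimization is what
    actually differentiates the two copies.
    """
    direction = grad / (np.linalg.norm(grad) + 1e-12)
    return position, position + step * direction
```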
Pruning: Removing Redundancy and Noise
The most direct pruning criterion is opacity: Gaussians with opacity below a threshold (e.g., 0.005) contribute negligibly to rendering and can be safely deleted. Additionally, oversized Gaussians (maximum scale exceeding 10% of the scene extent) often signal optimization failure and should be removed. 3DGS also employs a clever strategy of periodically resetting Gaussian opacities: every 3000 iterations, all opacities are clamped down to a small value, forcing the model to re-prove each Gaussian's worth. Those that consistently fail to recover high opacity are deleted in the next pruning pass.
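Both pruning criteria and the periodic reset are a few lines each. This is a sketch of the rules above; the reset cap of 0.01 follows the reference implementation, while the function names are illustrative:

```python
import numpy as np

def prune_mask(opacities, max_scales, scene_extent,
               min_opacity=0.005, max_scale_frac=0.1):
    """Boolean mask of Gaussians to delete.

    opacities, max_scales: (N,) arrays; scene_extent: scalar scene size.
    """
    too_faint = opacities < min_opacity              # negligible contribution
    too_big = max_scales > max_scale_frac * scene_extent  # runaway Gaussians
    return too_faint | too_big

def reset_opacities(opacities, cap=0.01):
    """Periodic opacity reset: clamp every opacity down to a small value,
    forcing each Gaussian to re-earn its opacity through optimization."""
    return np.minimum(opacities, cap)
```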
Loss Function and Full Training Pipeline
The 3DGS loss function combines L1 loss and D-SSIM (structural similarity): L = (1-λ)L₁ + λL_D-SSIM, where λ is typically 0.2. L1 loss encourages pixel-level accuracy while D-SSIM preserves image structure and texture. The complete training pipeline has two phases: iterations 0-15000 for rapid density growth with periodic split/clone and pruning; iterations 15000-30000 for fine optimization with fixed Gaussian count and decaying learning rates. Typical training takes 30-60 minutes on a modern GPU.
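The combined loss is easy to write down. As a minimal sketch, the SSIM below is a single-window (global) stand-in; real implementations, including the 3DGS reference code, compute SSIM over local windows (an 11x11 Gaussian kernel):

```python
import numpy as np

def l1_loss(pred, gt):
    """Mean absolute error: pixel-level accuracy term."""
    return np.abs(pred - gt).mean()

def ssim_global(x, y, c1=0.01**2, c2=0.03**2):
    """Single-window SSIM stand-in (no local windowing)."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx**2 + my**2 + c1) * (vx + vy + c2))

def combined_loss(pred, gt, lam=0.2):
    """L = (1 - lam) * L1 + lam * D-SSIM, with D-SSIM = 1 - SSIM."""
    d_ssim = 1.0 - ssim_global(pred, gt)
    return (1 - lam) * l1_loss(pred, gt) + lam * d_ssim
```

For identical images the loss is zero (L1 vanishes and SSIM equals 1), and it grows as the rendering drifts from the ground truth.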