印刻万物 TOP3DGS

stage 08


Published · Last reviewed: 2026-05-08 · Recommended tools: PostShot

Training

Inktoys Engrave Everything — 3DGS Tutorial Series · Chapter 08 · Training

Concept and Positioning

Training is the "magic happens here" step of the 3DGS pipeline. Every previous chapter—capture, frame extraction, color grading, SfM—has been preparing inputs for this moment. The essence of training is this: starting from the sparse point cloud produced by SfM, use differentiable rendering and gradient descent to optimize a few thousand initial 3D Gaussians into millions of Gaussian ellipsoids that precisely describe the scene's appearance.

Each Gaussian carries the following attributes:

• Position (x, y, z): the center point in 3D space

• Covariance matrix (parameterized by scale + rotation quaternion): controls the ellipsoid's shape and orientation

• Opacity: transparency between 0 and 1

• SH coefficients (spherical harmonics): encode view-dependent color information
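To make the covariance parameterization concrete, here is a minimal plain-Python sketch of how a scale vector and a rotation quaternion combine into a covariance matrix, Σ = R S Sᵀ Rᵀ, the form used in the original 3DGS paper. The function names are illustrative and not taken from any library, and a (w, x, y, z) quaternion order is assumed.

```python
import math


def quat_to_rotmat(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix, normalizing first."""
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def covariance(scale, quat):
    """Sigma = R S S^T R^T with S = diag(scale)."""
    R = quat_to_rotmat(quat)
    # M = R @ S: scale each column of R by the matching axis length
    M = [[R[i][j] * scale[j] for j in range(3)] for i in range(3)]
    # Sigma = M @ M^T, symmetric and positive semi-definite by construction
    return [[sum(M[i][k] * M[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]


# With the identity rotation the covariance is simply diag(scale^2)
sigma = covariance((0.1, 0.2, 0.3), (1.0, 0.0, 0.0, 0.0))
```

Optimizing scale and rotation separately, rather than the nine matrix entries directly, keeps the covariance valid after every gradient step.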

The core logic of the training loop:

  1. From the current Gaussian set, render an image using a differentiable rasterizer

  2. Compute the loss between the rendered image and the corresponding ground-truth photo (L1 + SSIM)

  3. Backpropagate gradients and update every Gaussian's parameters

  4. Periodically run adaptive density control: split/clone Gaussians in high-gradient regions, prune Gaussians with low opacity

  5. Repeat for 30,000 iterations
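Step 4 can be pictured as a per-Gaussian decision. The sketch below is a simplified illustration that assumes the default thresholds quoted later in this chapter (gradient 0.0002, opacity 0.005, size 1% of scene extent). The function name and the returned labels are hypothetical; real trainers apply these rules to whole tensors at once and add further screen-size checks.

```python
def density_action(grad_norm, max_scale, opacity, scene_extent=1.0,
                   grad_thresh=0.0002, percent_dense=0.01, min_opacity=0.005):
    """Return the density-control action for one Gaussian (simplified)."""
    if opacity < min_opacity:
        return "prune"      # nearly transparent: remove it
    if grad_norm <= grad_thresh:
        return "keep"       # already well fitted: leave it alone
    if max_scale > percent_dense * scene_extent:
        return "split"      # large Gaussian over fine detail: break it up
    return "clone"          # small Gaussian in an under-covered area: duplicate it


print(density_action(grad_norm=0.001, max_scale=0.05, opacity=0.5))   # split
print(density_action(grad_norm=0.001, max_scale=0.001, opacity=0.5))  # clone
```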

The output after training: a PLY file containing the complete parameters of hundreds of thousands to millions of Gaussians. That file is your 3D scene—it can be rendered in real time from any viewpoint.
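For a rough sense of how large that file gets, the sketch below counts the float properties per Gaussian in the PLY layout written by the original INRIA implementation (position, normals, SH DC and rest coefficients, opacity, scale, rotation). Other exporters may drop fields or compress them, so treat the result as an estimate; the function names are illustrative.

```python
def ply_floats_per_gaussian(sh_degree=3):
    """Number of float32 properties stored per Gaussian (INRIA-style PLY layout)."""
    sh_coeffs = (sh_degree + 1) ** 2      # SH coefficients per color channel
    f_dc = 3                              # degree-0 (base color) term, RGB
    f_rest = 3 * (sh_coeffs - 1)          # higher-order terms, RGB
    # xyz + normals + f_dc + f_rest + opacity + scale + rotation quaternion
    return 3 + 3 + f_dc + f_rest + 1 + 3 + 4


def ply_size_mb(num_gaussians, sh_degree=3):
    """Approximate uncompressed PLY size in megabytes (header ignored)."""
    return num_gaussians * ply_floats_per_gaussian(sh_degree) * 4 / 1e6


print(ply_floats_per_gaussian(3))   # 62
print(ply_size_mb(1_000_000))       # 248.0
```

At SH degree 3, most of each Gaussian's storage is view-dependent color, which is why lowering the SH degree is an effective way to shrink the file.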


Decision Points

Decision 1: Cloud Training vs Desktop Training

This is the most fundamental choice in this chapter. Each path fits a different scenario:

Cloud option: upload images/video, wait for the server to process, download the result. No GPU required.

Desktop option: run training on your local machine. Most tools require an NVIDIA GPU (minimum 6 GB VRAM); Brush is the exception and also runs on AMD, Intel, and Apple GPUs.

| Dimension | Cloud | Desktop |
|---|---|---|
| GPU requirement | None | NVIDIA RTX 2060+ |
| Control | Low (black box) | High (fully tunable) |
| Privacy | Data uploaded to a third party | Data never leaves your machine |
| Cost | Subscription or per-job | One-time hardware investment |
| Quality ceiling | Good (~27.5–27.8 dB PSNR) | Highest (~28.1–28.4 dB PSNR) |
| Learning value | Low | High (you understand the training process) |
| Best for | Quick results, no-GPU users | Quality-driven users, parameter tuning, research |


Decision 2: Cloud Options Compared

| Tool | Price | Processing time | Quality | Export formats | Highlights |
|---|---|---|---|---|---|
| Polycam | $8/month Pro | 15–45 min | ★★★★ | PLY, OBJ, FBX, USDZ and 15+ formats | LiDAR support, floor plans, UE5 integration |
| Luma AI | Free basic tier | 3–5 min | ★★★★☆ | PLY | Fastest processing, best visual quality, zero barrier to entry |
| KIRI Engine | $10/month | 30–90 min | ★★★☆ | PLY, OBJ | LiDAR integration, 3 free scans |
| Scaniverse (Niantic) | Free | 5–10 min (on-device) | ★★★☆ | SPZ, PLY | Fully on-device processing, no upload required |
| Teleport | Per-project pricing | Varies | ★★★★ | PLY, SPLAT | Professional grade, supports large scenes |
| Pointcosm | Free beta | 15–30 min | ★★★☆ | PLY | Emerging platform, continuously improving |

Selection guide:

• Want the fastest and easiest → Luma AI (free, results in 3 minutes, excellent quality)

• Need professional export formats → Polycam (15+ formats, LiDAR, floor plans)

• Fully offline / privacy first → Scaniverse (on-device processing, no data upload)

• Tight budget → Luma AI free tier + Scaniverse free tier

Decision 3: Desktop Options Compared

| Tool | Price | Minimum GPU | Training time (83 images) | Quality | Highlights |
|---|---|---|---|---|---|
| PostShot ⭐ recommended | €17/month Indie | RTX 2060 | ~15 min (4090) | ★★★★ | Real-time preview, RealityScan import, UE5 plugin, best GUI |
| Nerfstudio (splatfacto) | Free, open source | RTX 3060 (8 GB) | ~25 min (3060) | ★★★★ | Research framework, multiple methods, active community |
| LichtFeld Studio | Free, open source | RTX 2060 (NVIDIA only) | ~10 min (4090) | ★★★★☆ | Fastest speed, native C++/CUDA, Python plugins, MCP automation |
| Brush | Free, open source | Any GPU (incl. AMD/Intel) | ~20 min | ★★★★ | Cross-platform, hardware-agnostic, WebGPU |
| gsplat | Free, open source | RTX 3060 | ~20 min (3060) | ★★★★☆ | PyTorch library, most modular, 4× memory savings |
| Original 3DGS (INRIA) | Free, open source | RTX 3090 (24 GB recommended) | ~30 min (3060) | ★★★★★ | Reference implementation, highest baseline quality, academic standard |

Inktoys' recommended ranking:

  1. PostShot — if you want the "best experience". The real-time training preview lets you watch Gaussians densify and refine progressively, which is invaluable for understanding the training process. Supports importing poses from RealityScan, skipping the COLMAP wait. Excellent GUI design—pause/resume/parameter tweaking are all intuitive.

  2. LichtFeld Studio — if you chase "maximum speed + open source". Native C++/CUDA implementation, training speed crushes Python solutions. Supports MCMC strategy and ImprovedGS+ and other cutting-edge algorithms. The Python plugin system allows custom extensions.

  3. Nerfstudio — if you are a "researcher/developer". The most complete research framework, supporting splatfacto, instant-ngp, nerfacto and many other methods. End-to-end from data processing to training. The most active community and the first choice for paper reproduction.

  4. Brush — if you "do not have an NVIDIA GPU". The only desktop training tool that supports AMD, Intel, and even Apple Metal. Built on WebGPU, truly cross-platform.

Decision 4: Number of Training Iterations

| Iterations | Use case | Quality | Time (RTX 4090) |
|---|---|---|---|
| 7,000 | Quick preview / test | 60–70% of final quality | 2–3 min |
| 15,000 | Medium quality | 85–90% of final quality | 5–6 min |
| 30,000 | Standard full training | 100% baseline | 8–12 min |
| 50,000+ | Extreme quality / difficult scenes | Marginal gains (diminishing returns) | 15–20 min |

Inktoys' suggestion: first run 7,000 iterations to confirm there are no obvious issues (incorrect poses, poor input quality), then run the full 30,000. Going beyond 30,000 is usually not worth the extra time.


Operating Steps

Plan A: PostShot

PostShot currently offers the best GUI experience among 3DGS desktop training tools. Here is the complete workflow:

Step 1: Import Data

Method A: import images/video directly

  1. Open PostShot → New Project

  2. Drag in an image folder or video file

  3. PostShot automatically invokes its built-in COLMAP for SfM

  4. Wait for pose estimation to finish (progress bar shown)

Method B: import RealityScan/COLMAP poses (recommended)

  1. In RealityScan, complete alignment → Export → COLMAP format

  2. PostShot → Import → COLMAP Sparse Model

  3. Select the sparse/0/ directory and the images/ directory

  4. Poses load directly, skipping the SfM wait

Step 2: Configure Training Parameters

PostShot's defaults are already good for most scenes, but the following parameters are worth attention:

| Parameter | Default | Tuning advice |
|---|---|---|
| Stop Step | 30,000 | Drop to 7,000 for quick previews |
| Max Splat Count | Auto | Limit to 500K for small objects, raise to 5M for large scenes |
| SH Degree | 3 | Keep default (higher SH order = better view-dependent effects) |
| Densification Interval | 100 | Keep default |
| Position Learning Rate | 0.00016 | Keep default |
| Opacity Reset Interval | 3000 | Keep default |

Step 3: Start Training & Monitor in Real Time

  1. Click the Train button

  2. Watch the live preview window—you will see Gaussians evolve from a sparse point cloud into a complete scene

  3. Monitor the training curves:

◦ Loss should keep decreasing

◦ Splat Count should grow quickly at first, then stabilize

◦ PSNR should keep increasing (> 25 dB is acceptable, > 28 dB is excellent)

  4. If you spot problems, you can Pause at any time:

◦ Floaters appearing → raise the opacity pruning threshold (more low-opacity Gaussians get removed)

◦ Insufficient density → lower the densification gradient threshold (more Gaussians get split)

◦ Overfitting → reduce iterations

Step 4: Export

After training:

  1. File → Export → Gaussian Splat (.ply)

  2. Choose the export path

  3. Optional: also export the camera trajectory (for later rendering)

PostShot export formats:

• PLY (standard Gaussian Splat, compatible with all viewers)

• SPLAT (compressed format, suitable for the Web)

• Direct Unreal Engine import (via the PostShot UE5 plugin)

Plan B: Nerfstudio (splatfacto)

bash
# Prerequisite: ns-process-data already complete (see Chapter 07)

# Train splatfacto (Nerfstudio's 3DGS implementation)
ns-train splatfacto \
  --data ./processed/ \
  --output-dir ./splat_output/ \
  --max-num-iterations 30000 \
  --pipeline.model.cull-alpha-thresh 0.005 \
  --pipeline.model.densify-grad-thresh 0.0002 \
  --pipeline.model.sh-degree 3

# During training you can open the Viewer:
# visit http://localhost:7007 in a browser to watch live training progress

Key parameters explained:

bash
# Note: flag names vary between Nerfstudio releases, and several names below
# follow the original INRIA argument names. Confirm the exact spelling with:
#   ns-train splatfacto --help

# Core training parameters
--max-num-iterations 30000                      # Total iterations
--pipeline.model.sh-degree 3                    # SH degree (0=flat color, 1=simple lighting, 3=full view-dependent effects)

# Density control parameters
--pipeline.model.densify-grad-thresh 0.0002     # Densification gradient threshold (lower = more splits)
--pipeline.model.cull-alpha-thresh 0.005        # Pruning opacity threshold
--pipeline.model.densify-until-iter 15000       # Densification cutoff iteration
--pipeline.model.densify-from-iter 500          # Densification start iteration

# Learning rates
--pipeline.model.position-lr-init 0.00016       # Initial position learning rate
--pipeline.model.position-lr-final 0.0000016    # Final position learning rate
--pipeline.model.feature-lr 0.0025              # Color feature learning rate
--pipeline.model.opacity-lr 0.05                # Opacity learning rate
--pipeline.model.scaling-lr 0.005               # Scale learning rate
--pipeline.model.rotation-lr 0.001              # Rotation learning rate
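The initial and final position learning rates define a decay schedule. Below is a minimal sketch, assuming log-linear (exponential) interpolation between the two values over the whole run, which is the scheme the original 3DGS code uses. The function name is illustrative, and warm-up or delay options are left out.

```python
import math


def position_lr(step, max_steps=30000, lr_init=0.00016, lr_final=0.0000016):
    """Exponentially decay the position learning rate from lr_init to lr_final."""
    t = min(max(step / max_steps, 0.0), 1.0)   # training progress in [0, 1]
    # Linear interpolation in log space equals exponential decay in linear space
    return math.exp(math.log(lr_init) * (1 - t) + math.log(lr_final) * t)


for step in (0, 15000, 30000):
    print(step, f"{position_lr(step):.2e}")
```

Halfway through training the rate is the geometric mean of the two endpoints (about 1.6e-5), so Gaussians move freely early on and only make fine adjustments near the end.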

Export PLY after training:

bash
# Export as standard PLY
ns-export gaussian-splat \
  --load-config ./splat_output/splatfacto/CONFIG_TIMESTAMP/config.yml \
  --output-dir ./export/

Plan C: LichtFeld Studio

LichtFeld Studio is a high-performance native C++/CUDA implementation with extremely fast training.

Installation

Download the Windows binary from portal.lichtfeld.io, or build from source on GitHub:

bash
git clone https://github.com/MrNeRF/LichtFeld-Studio.git
cd LichtFeld-Studio
# Build per the README (requires CUDA 12+, CMake, Visual Studio 2022)

Training Workflow

  1. Open LichtFeld Studio

  2. File → Open Dataset → choose the COLMAP sparse directory

  3. Pick a training strategy:

◦ ADC (Adaptive Density Control): the classic 3DGS strategy

◦ MCMC: a newer Markov Chain Monte Carlo strategy that often converges faster

◦ ImprovedGS+: the latest high-performance strategy

  4. Click Train → watch the reconstruction in real time

  5. When training is done → Export PLY

Unique Advantages of LichtFeld

• Python plugin system: write Python scripts to extend functionality

• MCP automation: external automation via the Model Context Protocol

• Multiple training strategies: switch between ADC / MCMC / ImprovedGS+ for comparison

• Checkpoint resume: resume training from any iteration

Plan D: gsplat

gsplat is a low-level CUDA-accelerated library developed by the Nerfstudio team, offering maximum flexibility:

python
#!/usr/bin/env python3
"""
08_train_gsplat.py
Inktoys · 3DGS training with gsplat
"""
from pathlib import Path

import cv2
import numpy as np
import pycolmap
import torch
from gsplat import rasterization
from gsplat.strategy import DefaultStrategy
from torchmetrics.functional import structural_similarity_index_measure as ssim

SH_C0 = 0.28209479177387814  # degree-0 SH basis constant, used for RGB -> SH


def load_colmap_data(colmap_path: str, image_dir: str):
    """Load training data from COLMAP output"""
    reconstruction = pycolmap.Reconstruction()
    reconstruction.read(colmap_path)

    # Extract camera intrinsics (assumes one camera shared by all images)
    camera = list(reconstruction.cameras.values())[0]
    K = np.array([
        [camera.focal_length_x, 0, camera.principal_point_x],
        [0, camera.focal_length_y, camera.principal_point_y],
        [0, 0, 1],
    ])

    # Extract image extrinsics and paths
    # (registered / rotmat() / tvec are the older pycolmap Image API; newer
    # releases expose the pose through cam_from_world instead)
    viewmats = []
    image_paths = []
    for img in reconstruction.images.values():
        if not img.registered:
            continue
        # World-to-camera transform
        viewmat = np.eye(4)
        viewmat[:3, :3] = img.rotmat()
        viewmat[:3, 3] = img.tvec
        viewmats.append(viewmat)
        image_paths.append(Path(image_dir) / img.name)

    # Extract initial point cloud
    points = []
    colors = []
    for point in reconstruction.points3D.values():
        points.append(point.xyz)
        colors.append(point.color / 255.0)

    return {
        "K": torch.tensor(K, dtype=torch.float32),
        "viewmats": torch.tensor(np.array(viewmats), dtype=torch.float32),
        "image_paths": image_paths,
        "points": torch.tensor(np.array(points), dtype=torch.float32),
        "colors": torch.tensor(np.array(colors), dtype=torch.float32),
        "width": camera.width,
        "height": camera.height,
    }


class GaussianModel:
    """3D Gaussian Splatting model"""

    def __init__(self, points: torch.Tensor, colors: torch.Tensor,
                 sh_degree: int = 3, device: str = "cuda"):
        self.device = device
        self.sh_degree = sh_degree
        N = points.shape[0]

        quats = torch.zeros(N, 4, device=device)
        quats[:, 0] = 1.0  # Unit quaternion (w, x, y, z)

        # Spherical harmonics (SH) coefficients
        num_sh = (sh_degree + 1) ** 2
        sh_coeffs = torch.zeros(N, num_sh, 3, device=device)
        # Initialize the DC component from the input colors (RGB -> SH)
        sh_coeffs[:, 0, :] = (colors.to(device) - 0.5) / SH_C0

        # Learnable parameters. They live in a ParameterDict because the
        # density-control strategy replaces entries when it splits or prunes.
        self.params = torch.nn.ParameterDict({
            "means": torch.nn.Parameter(points.to(device)),
            "scales": torch.nn.Parameter(
                torch.log(torch.full((N, 3), 0.01, device=device))),
            "quats": torch.nn.Parameter(quats),
            "opacities": torch.nn.Parameter(
                torch.logit(torch.full((N,), 0.5, device=device))),
            "sh_coeffs": torch.nn.Parameter(sh_coeffs),
        })

    def parameters(self):
        return list(self.params.values())


def train(colmap_path: str, image_dir: str, output_path: str,
          num_iterations: int = 30000, device: str = "cuda"):
    """Full training pipeline"""
    print("Loading data...")
    data = load_colmap_data(colmap_path, image_dir)
    print(f"Initializing model: {data['points'].shape[0]} Gaussians")
    model = GaussianModel(data["points"], data["colors"], device=device)
    params = model.params

    # One Adam optimizer per parameter (the layout gsplat strategies expect),
    # each with its own learning rate
    lrs = {"means": 0.00016, "scales": 0.005, "quats": 0.001,
           "opacities": 0.05, "sh_coeffs": 0.0025}
    optimizers = {
        name: torch.optim.Adam(
            [{"params": params[name], "lr": lr, "name": name}], eps=1e-15)
        for name, lr in lrs.items()
    }

    # Density control strategy (argument names as in gsplat >= 1.0)
    strategy = DefaultStrategy(
        grow_grad2d=0.0002,       # densification gradient threshold
        refine_start_iter=500,    # densification start iteration
        refine_stop_iter=15000,   # densification cutoff iteration
        refine_every=100,         # densification interval
        prune_opa=0.005,          # pruning opacity threshold
        reset_every=3000,         # opacity reset interval
    )
    strategy.check_sanity(params, optimizers)
    strategy_state = strategy.initialize_state()

    # Load all training images. They stay on the CPU as uint8 and one view is
    # moved to the GPU per step, so full-resolution sets do not exhaust VRAM.
    images = []
    for p in data["image_paths"]:
        img = cv2.imread(str(p))
        if img is None:
            raise FileNotFoundError(f"Could not read image: {p}")
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        images.append(torch.from_numpy(img))

    num_views = len(images)
    K = data["K"].to(device)
    viewmats = data["viewmats"].to(device)

    print(f"Starting training: {num_iterations} iterations, {num_views} views")
    for iteration in range(num_iterations):
        # Randomly pick a view
        idx = torch.randint(0, num_views, (1,)).item()
        gt_image = images[idx].to(device, dtype=torch.float32) / 255.0
        viewmat = viewmats[idx]

        # Render
        renders, alphas, info = rasterization(
            means=params["means"],
            quats=params["quats"],
            scales=torch.exp(params["scales"]),
            opacities=torch.sigmoid(params["opacities"]),
            colors=params["sh_coeffs"],
            viewmats=viewmat.unsqueeze(0),
            Ks=K.unsqueeze(0),
            width=data["width"],
            height=data["height"],
            sh_degree=model.sh_degree,
        )
        rendered_image = renders[0]  # [H, W, 3]

        # Compute loss (L1 + SSIM)
        l1_loss = torch.abs(rendered_image - gt_image).mean()
        ssim_loss = 1.0 - ssim(
            rendered_image.permute(2, 0, 1).unsqueeze(0),
            gt_image.permute(2, 0, 1).unsqueeze(0),
            data_range=1.0,
        )
        loss = 0.8 * l1_loss + 0.2 * ssim_loss

        # Backpropagation, with the density-control hooks around it.
        # The strategy itself stops refining after refine_stop_iter.
        strategy.step_pre_backward(params, optimizers, strategy_state, iteration, info)
        loss.backward()
        for optimizer in optimizers.values():
            optimizer.step()
            optimizer.zero_grad(set_to_none=True)
        strategy.step_post_backward(params, optimizers, strategy_state, iteration, info)

        # Logging
        if iteration % 1000 == 0:
            with torch.no_grad():
                psnr = -10 * torch.log10(
                    torch.mean((rendered_image - gt_image) ** 2)
                )
            print(f"Iter {iteration:5d} | loss {loss.item():.4f} | PSNR {psnr.item():.2f} dB")

    # Writing the trained Gaussians to output_path (PLY export) is not covered
    # by this listing.

Plan E: Original 3DGS (INRIA) Reference Implementation

The academic gold standard. The baseline of every paper is built on this implementation:

bash
# Install (the repository ships a conda environment; --recursive pulls the CUDA submodules)
git clone --recursive https://github.com/graphdeco-inria/gaussian-splatting.git
cd gaussian-splatting
conda env create --file environment.yml
conda activate gaussian_splatting

# Train (requires COLMAP-format input)
# -m sets the output directory; --eval holds out test views for metrics.py
python train.py \
  -s ./data/my_scene/ \
  -m ./output/my_scene/ \
  --eval \
  --iterations 30000 \
  --densify_until_iter 15000 \
  --densification_interval 100 \
  --opacity_reset_interval 3000 \
  --densify_grad_threshold 0.0002 \
  --sh_degree 3

# Render test views
python render.py \
  -m ./output/my_scene/ \
  --skip_train

# Evaluate metrics
python metrics.py \
  -m ./output/my_scene/

Required input directory structure:

text
my_scene/
├── images/                # Undistorted images
├── sparse/
│   └── 0/
│       ├── cameras.bin
│       ├── images.bin
│       └── points3D.bin
└── (optional)
    images_2/              # 1/2 resolution (faster training)
    images_4/              # 1/4 resolution
    images_8/              # 1/8 resolution
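Before starting a long training run it can save time to confirm that the layout above is in place. The following is a small sketch; `check_scene_dir` is a hypothetical helper, not part of the INRIA repository, and it only looks for the binary COLMAP model and for JPG or PNG images.

```python
from pathlib import Path

REQUIRED_FILES = [
    "sparse/0/cameras.bin",
    "sparse/0/images.bin",
    "sparse/0/points3D.bin",
]
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def check_scene_dir(scene_dir):
    """Return a list of problems; an empty list means the layout looks usable."""
    root = Path(scene_dir)
    problems = []

    images = root / "images"
    if not images.is_dir():
        problems.append("missing images/ directory")
    elif not any(p.suffix.lower() in IMAGE_SUFFIXES for p in images.iterdir()):
        problems.append("images/ contains no JPG or PNG files")

    for rel in REQUIRED_FILES:
        if not (root / rel).is_file():
            problems.append(f"missing {rel}")
    return problems


if __name__ == "__main__":
    for problem in check_scene_dir("./data/my_scene"):
        print("Problem:", problem)
```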

Hardware × Runtime Reference Table

Based on the standard test scene of 83 iPhone 15 photos (4000×3000):

| GPU | VRAM | Training time (30K iter) | Max scene size | Power | Reference price |
|---|---|---|---|---|---|
| RTX 2060 | 6 GB | ~90 min | < 100 images | 160W | ¥1,500 used |
| RTX 3060 | 12 GB | ~30 min | < 300 images | 170W | ¥2,000 |
| RTX 4060 Ti | 16 GB | ~22 min | < 400 images | 160W | ¥3,200 |
| RTX 4070 | 12 GB | ~20 min | < 300 images | 200W | ¥4,000 |
| RTX 4080 | 16 GB | ~15 min | < 500 images | 320W | ¥7,500 |
| RTX 4090 | 24 GB | ~8–12 min | < 1000 images | 450W | ¥13,000 |
| RTX 5090 | 32 GB | ~6–8 min | < 1500 images | 575W | ¥16,000 |

VRAM usage rule of thumb:

text
VRAM ≈ base overhead (2GB) + image cache (N × resolution × 12B) + Gaussian parameters (M × 200B)

• 100 4K images + 1M Gaussians ≈ 6–8 GB

• 300 4K images + 3M Gaussians ≈ 12–16 GB

• 1000 4K images + 5M Gaussians ≈ 20–24 GB
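The rule of thumb can be evaluated directly. The sketch below is a literal transcription of the formula, assuming "4K" means 3840×2160 pixels; the function name is illustrative. Note that at 12 B per pixel (float32 RGB) the formula gives a noticeably larger figure than the examples above, so those examples probably assume lighter caching such as 8-bit images (3 B per pixel), downscaled inputs, or on-demand loading. That reading is an assumption, which is why `bytes_per_pixel` is left as a parameter.

```python
def estimate_vram_gb(num_images, width, height, num_gaussians,
                     bytes_per_pixel=12, bytes_per_gaussian=200, base_gb=2.0):
    """VRAM ~ base overhead + image cache + Gaussian parameters (rule of thumb)."""
    image_cache = num_images * width * height * bytes_per_pixel
    gaussians = num_gaussians * bytes_per_gaussian
    return base_gb + (image_cache + gaussians) / 1e9


# 100 images at 3840x2160 with 1M Gaussians
print(round(estimate_vram_gb(100, 3840, 2160, 1_000_000), 2))                     # float32 cache
print(round(estimate_vram_gb(100, 3840, 2160, 1_000_000, bytes_per_pixel=3), 2))  # 8-bit cache
```

In either case the image cache dominates, which is why lowering resolution or keeping images in CPU memory is the first thing to try when VRAM runs out.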

Alternatives if you have no GPU:

| Option | Cost | Speed | Notes |
|---|---|---|---|
| Cloud GPU rental (Vast.ai) | RTX 4090: $0.25/h | Same as local | Hourly billing, best price-performance |
| Google Colab Pro | $10/month | T4/A100 | Time limits and disconnect risk |
| Cloud services (Luma/Polycam) | Free–$8/month | 3–45 min | Easiest, but no parameter tuning |
| OpenSplat CPU mode | Free | ~100× slower | For verification only, not practical |

Training Parameters in Depth

Core Parameter Groups

  1. Iterations and scheduling

| Parameter | Default | Purpose | Tuning direction |
|---|---|---|---|
| iterations | 30,000 | Total training steps | Bump to 50K for difficult scenes |
| densify_from_iter | 500 | Iteration at which densification begins | Keep default |
| densify_until_iter | 15,000 | Iteration at which densification stops | Lower if you have too many Gaussians |
| densification_interval | 100 | Run densification every N steps | Keep default |
| opacity_reset_interval | 3,000 | Reset opacity every N steps | Keep default |

  2. Density control (the biggest driver of final quality)

| Parameter | Default | Purpose | Tuning direction |
|---|---|---|---|
| densify_grad_threshold | 0.0002 | Gaussians with gradient above this value get split | Lower → more splits → denser → more VRAM |
| percent_dense | 0.01 | Gaussians larger than 1% of scene scale are split rather than cloned | Keep default |
| min_opacity | 0.005 | Gaussians below this value are pruned | Raise → more aggressive pruning → fewer floaters |

  3. Learning rates

| Parameter | Initial | Final | Notes |
|---|---|---|---|
| position_lr | 0.00016 | 0.0000016 | Exponential decay, controls how fast Gaussians move |
| feature_lr | 0.0025 | - | Color / SH coefficient learning rate |
| opacity_lr | 0.05 | - | Opacity learning rate |
| scaling_lr | 0.005 | - | Scale learning rate |
| rotation_lr | 0.001 | - | Rotation learning rate |

  4. Loss function

text
Total Loss = (1 - λ) × L1_loss + λ × (1 - SSIM)

Default λ = 0.2. L1 enforces pixel-level accuracy; SSIM enforces structural similarity.
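As an illustration of how the two terms combine, here is a dependency-free sketch on flat lists of pixel values in [0, 1]. The SSIM below is a deliberate simplification that uses whole-image statistics (a single window); real trainers compute SSIM over a sliding Gaussian window. All function names are illustrative.

```python
def l1(a, b):
    """Mean absolute difference between two equally sized pixel lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def global_ssim(a, b, data_range=1.0):
    """SSIM from whole-image statistics (one window); a simplification."""
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    n = len(a)
    mu_a, mu_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / n
    var_b = sum((y - mu_b) ** 2 for y in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2))


def total_loss(rendered, target, lam=0.2):
    """Total Loss = (1 - lam) * L1 + lam * (1 - SSIM)."""
    return (1 - lam) * l1(rendered, target) + lam * (1 - global_ssim(rendered, target))


target = [0.0, 0.5, 1.0, 0.5]
print(total_loss(target, target))                # essentially zero for identical images
print(total_loss([0.1, 0.4, 0.9, 0.6], target))  # small positive value
```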


Common Errors and Troubleshooting

Issue 1: Floaters

Symptoms: random colored specks or translucent fog appear in mid-air

Causes:

• Over-densification: gradient threshold too low, Gaussians spawn even in empty regions

• Dust/water droplets/lens smudges in the input images

• Background sky region wrongly reconstructed

Solutions:

bash
# Solution 1: raise the pruning threshold
# (called cull-alpha-thresh in splatfacto; the original INRIA train.py
# hard-codes this value instead of exposing a flag)
--min_opacity 0.01                # Up from 0.005

# Solution 2: make densification less aggressive
--densify_grad_threshold 0.0005   # Up from 0.0002

# Solution 3: use a mask to exclude the sky
# Generate a sky mask before training, exclude these regions from the loss

Inside PostShot: after training, use the built-in Crop tool to manually delete floater regions.

Issue 2: Overfitting

Symptoms: looks perfect from training viewpoints, but new viewpoints show severe artifacts

Causes:

• Too few training viewpoints (< 50 images)

• Uneven view coverage (some directions have no images)

• Too many iterations

Solutions:

bash
# Solution 1: reduce iterations
--iterations 15000                # Down from 30000

# Solution 2: add regularization
# Use Nerfstudio's depth regularization (needs depth maps in the dataset;
# confirm these flags exist in your version with: ns-train splatfacto --help)
ns-train splatfacto \
  --pipeline.model.use-depth-loss True \
  --pipeline.model.depth-loss-mult 0.1

# Solution 3: address the root cause: add more training views
# Go back to capture and fill in the missing angles

Issue 3: Underdense

Symptoms: visible holes or transparent regions in the scene

Causes:

• Initial point cloud too sparse (COLMAP extracted too few features)

• Densification gradient threshold too high

• Densification stopped too early

Solutions:

bash
# Solution 1: lower the densification threshold
--densify_grad_threshold 0.0001   # Down from 0.0002

# Solution 2: extend the densification phase
--densify_until_iter 20000        # Up from 15000

# Solution 3: increase initial point-cloud density
# Use more feature points in COLMAP (add your usual database and image path arguments)
colmap feature_extractor --SiftExtraction.max_num_features 16384

Issue 4: Color Bleeding

Symptoms: edge colors of an object bleed into adjacent regions

Causes:

• Gaussians too large, crossing object boundaries

• SH degree too low to model view variation correctly

Solutions:

bash
# Solution 1: cap maximum Gaussian size
# Add scale clipping in your training code

# Solution 2: raise SH degree
# (only where the trainer supports it; the original INRIA implementation
# accepts degrees no larger than 3)
--sh_degree 4                     # Up from 3 (increases memory usage)

# Solution 3: use masks to assist training
# Generate a mask for foreground objects to keep Gaussians from crossing boundaries

Issue 5: Training Crash / NaN

Symptoms: Loss suddenly becomes NaN or Inf, training stops

Causes:

• Learning rate too high

• Outliers in initial point cloud (very far points)

• Numerical overflow caused by insufficient VRAM

Solutions:

bash
# Solution 1: lower the learning rate
--position_lr_init 0.00008        # Halved

# Solution 2: clean the initial point cloud
# Remove outliers from the COLMAP output
python -c "
import numpy as np
import pycolmap
r = pycolmap.Reconstruction()
r.read('./sparse/0/')
# Remove points farther than 3 std deviations from center
pts = np.array([p.xyz for p in r.points3D.values()])
center = pts.mean(axis=0)
dists = np.linalg.norm(pts - center, axis=1)
threshold = dists.mean() + 3 * dists.std()
# filter...
"

# Solution 3: check numeric precision
# Half precision (FP16) has a much narrower range than FP32 and makes overflow
# more likely, so train in full precision while chasing NaNs
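The filtering step itself can be sketched on plain coordinate tuples, independent of pycolmap. This is an illustrative version of the same 3-standard-deviation rule; `remove_outliers` is a hypothetical helper, and writing the surviving points back into a COLMAP model is not shown.

```python
import math


def remove_outliers(points, num_std=3.0):
    """Drop points whose distance from the centroid exceeds mean + num_std * std."""
    n = len(points)
    center = [sum(p[i] for p in points) / n for i in range(3)]
    dists = [math.dist(p, center) for p in points]
    mean = sum(dists) / n
    std = math.sqrt(sum((d - mean) ** 2 for d in dists) / n)
    threshold = mean + num_std * std
    return [p for p, d in zip(points, dists) if d <= threshold]


# A 10x10 grid of nearby points plus one point very far away
cloud = [(0.1 * (i % 10), 0.1 * (i // 10), 0.0) for i in range(100)]
cloud.append((1000.0, 0.0, 0.0))
cleaned = remove_outliers(cloud)
print(len(cloud), "->", len(cleaned))
```

Because the outlier also drags the centroid and the standard deviation toward itself, a single pass may miss milder outliers; running the filter twice is a common remedy.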

Issue 6: VRAM Insufficient (OOM)

Symptoms: CUDA out of memory error

Solutions in order of priority:

bash
# 1. Lower image resolution (most effective)
# Use the images_2/ or images_4/ subdirectory
python train.py -s ./data/ --resolution 2   # Use 1/2 resolution

# 2. Limit the number of Gaussians
--densify_grad_threshold 0.0005             # Fewer splits
# Or set Max Splat Count in PostShot

# 3. Reduce image cache
--data_device cpu                           # Keep images in CPU memory, load to GPU on demand

# 4. Lower SH degree
--sh_degree 2                               # Down from 3, saves ~30% memory

# 5. Reduce the number of training images
# Uniformly sample 200 from 500
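For the last option, evenly spaced picking keeps coverage of the capture path roughly intact. Here is a minimal sketch, assuming the file list is already sorted in capture order; the function name and the file names in the usage lines are illustrative.

```python
def uniform_subsample(items, k):
    """Pick k evenly spaced items from a sequence, preserving order."""
    n = len(items)
    if k >= n:
        return list(items)
    # Index i * n // k steps through the sequence at a constant average stride
    return [items[i * n // k] for i in range(k)]


frames = [f"frame_{i:04d}.jpg" for i in range(500)]
subset = uniform_subsample(frames, 200)
print(len(subset), subset[0], subset[-1])
```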

Training Quality Evaluation

Quantitative Metrics

After training, use a test set to evaluate quality:

bash
# Nerfstudio automatic evaluation
ns-eval \
  --load-config ./output/config.yml \
  --output-path ./eval_results.json

# Original 3DGS evaluation
python metrics.py -m ./output/my_scene/

Interpretation:

| Metric | Meaning | Excellent | Acceptable | Poor |
|---|---|---|---|---|
| PSNR (dB) | Peak signal-to-noise ratio | > 28 | 25–28 | < 25 |
| SSIM | Structural similarity | > 0.92 | 0.85–0.92 | < 0.85 |
| LPIPS | Perceptual similarity (lower is better) | < 0.10 | 0.10–0.20 | > 0.20 |
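PSNR follows directly from the mean squared error between a rendered image and its ground-truth photo. The sketch below shows that relationship together with the bands from the table, assuming pixel values normalized to [0, 1]; the function names are illustrative.

```python
import math


def psnr(mse, max_value=1.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10 * math.log10(max_value ** 2 / mse)


def rate_psnr(value_db):
    """Map a PSNR value to the bands used in the table above."""
    if value_db > 28:
        return "excellent"
    if value_db >= 25:
        return "acceptable"
    return "poor"


for mse in (0.001, 0.003, 0.01):
    value = psnr(mse)
    print(f"MSE {mse}: {value:.1f} dB ({rate_psnr(value)})")
```

Because the scale is logarithmic, moving from 25 dB to 28 dB means cutting the squared error roughly in half.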

Qualitative Checks

Numerical metrics cannot replace human inspection. Below are items you must verify visually:

  1. Edge sharpness: are object edges crisp, or are there blurry halos?

  2. New-view consistency: from interpolated positions between training views, do you see flicker or popping?

  3. Detail preservation: are text and texture details still legible?

  4. Floaters: are there colored specks in the air that don't belong to the scene?

  5. Transparency: have any regions become transparent that shouldn't be?

Inktoys' Take

On choosing a training tool, here is my hands-on experience:

I have used nearly every tool listed above. My daily workflow has settled on this combination:

Quick preview: Luma AI (shoot on phone, upload directly, see results in 3 minutes, decide whether the scene is worth doing seriously).

Production projects: RealityScan alignment → PostShot training. RealityScan's SfM is 10× faster and more robust than COLMAP, and PostShot's live preview lets me catch problems mid-training and adjust on the fly.

Research / experiments: Nerfstudio splatfacto. When I need to compare methods, test new parameters, or reproduce papers, Nerfstudio's modular design is irreplaceable.

On "how long is enough":

This is the question beginners ask most. The answer is simple: watch the loss curve. When the loss has fallen by less than 1% over the most recent 5,000 iterations, continuing to train is meaningless. For most scenes, that inflection point shows up around 20,000–25,000 iterations.
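That stopping rule is easy to automate. Here is a minimal sketch, assuming one logged loss value per iteration; the helper name is hypothetical. Raw per-iteration loss is noisy, so in practice it is worth feeding in a smoothed curve (for example a moving average), which is an assumption beyond what the rule itself states.

```python
def has_converged(loss_history, window=5000, min_rel_drop=0.01):
    """True if the loss fell by less than min_rel_drop over the last `window` steps."""
    if len(loss_history) <= window:
        return False                      # not enough history yet
    old, new = loss_history[-window - 1], loss_history[-1]
    if old <= 0:
        return True                       # nothing left to improve
    return (old - new) / old < min_rel_drop


still_improving = [1.0 - i * 1e-4 for i in range(6001)]
flat = [0.05] * 6001
print(has_converged(still_improving))  # False
print(has_converged(flat))             # True
```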

There is one exception: if your scene has lots of high-frequency detail (foliage, grass, text), it may take 40,000–50,000 iterations to fully converge. At that point SH degree 3 may also be insufficient; try degree 4 if your trainer supports it (the original INRIA implementation stops at degree 3).

On hardware investment:

If you take 3DGS seriously, the RTX 4090 is currently the best price-performance choice. Its 24 GB of VRAM handles the vast majority of scenes, and it trains 3–4× faster than an RTX 3060. Given the time you save, the investment pays for itself quickly.

If your budget is tight, the RTX 3060 12 GB is the minimum usable configuration. The 6 GB RTX 2060 can technically run training, but you will spend a great deal of time downscaling resolution, capping Gaussian counts, and fighting OOM—time you could have spent on more valuable work.

In one sentence: training is the most "automated" stage of the 3DGS pipeline—once the upstream prep is done right, training itself is just clicking start and waiting. Spend your energy on capture and SfM; 90% of the problems that surface during training are upstream errors becoming visible.

Further Reading

• Kerbl, B. et al. (2023). "3D Gaussian Splatting for Real-Time Radiance Field Rendering." SIGGRAPH.

• Ye, V. et al. (2025). "gsplat: An Open-Source Library for Gaussian Splatting." JMLR.

• Kheradmand, A. et al. (2024). "3D Gaussian Splatting as Markov Chain Monte Carlo." NeurIPS.

• Rota Bulò, S. et al. (2024). "Revising Densification in Gaussian Splatting." ECCV.

• PostShot official documentation: https://jawset.com/postshot/docs

• Nerfstudio splatfacto documentation: https://docs.nerf.studio/

• LichtFeld Studio GitHub: https://github.com/MrNeRF/LichtFeld-Studio

• gsplat documentation: https://docs.gsplat.studio/