3D Gaussian Splatting has garnered extensive attention and application in real-time neural rendering. Concurrently, concerns have been raised about the limitations of this technology in aspects such as point cloud storage, performance, and robustness under sparse viewpoints, leading to various improvements. However, little attention has been paid to the fundamental problem of projection errors introduced by the local affine approximation inherent in the splatting itself, and to the consequent impact of these errors on the quality of photo-realistic rendering. This paper addresses the projection error function of 3D Gaussian Splatting, commencing with the residual error from the first-order Taylor expansion of the projection function. The analysis establishes a correlation between the error and the Gaussian mean position. Subsequently, leveraging function optimization theory, this paper analyzes the function's minima to provide an optimal projection strategy for Gaussian Splatting, referred to as Optimal Gaussian Splatting, which can accommodate a variety of camera models. Experimental validation further confirms that this projection methodology reduces artifacts, resulting in more convincingly photo-realistic rendering.
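The residual mentioned above can be sketched as follows. This is a hedged reconstruction using standard 3D-GS/EWA notation (the symbols \(\phi\), \(t_k\), \(J_k\), \(\Sigma_k\) are assumptions and may differ from the paper): splatting replaces the nonlinear camera-to-ray-space map \(\phi\) with its first-order Taylor expansion around each Gaussian mean \(t_k\), and the projection error is the expansion's residual.

```latex
% First-order Taylor expansion of the projection map phi around the
% Gaussian mean t_k (notation assumed, not taken from the paper):
\begin{align}
\phi(t) &\approx \phi(t_k) + J_k\,(t - t_k),
  \qquad J_k = \left.\frac{\partial \phi}{\partial t}\right|_{t = t_k},\\
\varepsilon(t) &= \phi(t) - \phi(t_k) - J_k\,(t - t_k),
\end{align}
% The error analyzed is the expectation of this residual over the
% Gaussian's own distribution, which depends on the mean position t_k:
\begin{equation}
\mathbb{E}_{t \sim \mathcal{N}(t_k,\,\Sigma_k)}
  \bigl[\lVert \varepsilon(t) \rVert\bigr].
\end{equation}
```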
We derive the mathematical expectation of the projection error (Top left), visualize the graph of the error function over two distinct domains, and analyze where this function attains its extrema using methods from function optimization (Top right). We further derive the projection error function with respect to image coordinates and focal length via the coordinate transformation between image coordinates and polar coordinates, and visualize this function; from left to right: the 3D-GS rendering under a long focal length, the 3D-GS rendering under a short focal length, the error function under a long focal length, and the error function under a short focal length (Below).
Illustration of the rendering pipeline of our Optimal Gaussian Splatting and the projection of 3D Gaussian Splatting. The blue box depicts the projection process of the original 3D-GS, which straightforwardly projects all Gaussians onto the same projection plane. In contrast, the red box illustrates our approach, where we project each Gaussian onto its corresponding tangent plane.
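The two projections contrasted above can be sketched in a few lines. This is an illustrative toy, not the paper's rasterizer: the function names are made up, and only the mean position is projected (the full method also transforms the covariance).

```python
import numpy as np

def project_pinhole(mu, f):
    """Original 3D-GS: perspective projection of a camera-space mean mu
    onto the single image plane z = f (all Gaussians share this plane)."""
    return f * mu[:2] / mu[2]

def project_tangent_plane(mu, f):
    """Tangent-plane sketch: project along the central ray through mu onto
    the plane tangent to the sphere of radius f at that ray, and return
    2D coordinates in that plane's own orthonormal basis."""
    d = mu / np.linalg.norm(mu)            # unit ray through the Gaussian mean
    p = f * d                              # tangent point on the sphere of radius f
    # build an orthonormal basis (u, v) of the tangent plane at p
    # (degenerate if d is parallel to the up vector; fine for a sketch)
    up = np.array([0.0, 1.0, 0.0])
    u = np.cross(up, d); u /= np.linalg.norm(u)
    v = np.cross(d, u)
    # map a camera-space point t onto the plane d . x = f along its own ray
    def to_plane(t):
        x = (f / np.dot(d, t)) * t - p
        return np.array([np.dot(x, u), np.dot(x, v)])
    # the mean itself lands exactly at the tangent point, i.e. at (0, 0)
    return to_plane(mu)
```

Points near the mean map onto a plane perpendicular to their own viewing ray, which is what keeps the first-order approximation error small far from the image center.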
As the focal length decreases, the field of view expands, so more Gaussians deviate from the projection center and the overall projection error increases. In such cases, 3D-GS exhibits more artifacts, such as needle-like structures or large cloud-like Gaussians, which obscure regions that render well under long focal lengths and thereby significantly degrade overall image quality. Because our method employs a central radial projection onto the tangent plane, it does not suffer from these defects.
Because Mip-Splatting and Scaffold-GS still employ the conventional 3D-GS projection, their errors remain larger than ours.
By straightforwardly modifying the transformation from image space to camera space, we achieve adaptation to various camera models, which the original 3D-GS projection struggles to support. Moreover, since our method is built on a differentiable rasterizer, it can also be trained on datasets with radial or tangential distortion. We show results from training directly on the non-pinhole Matterport dataset, where 3D-GS fails entirely to reconstruct the scene.
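The image-space-to-camera-space change mentioned above amounts to swapping the pixel-to-ray back-projection per camera model. A minimal sketch, assuming a pinhole model and an equidistant fisheye model (r = f·θ) as two examples; the function names and the choice of distortion model are ours, not the paper's:

```python
import numpy as np

def pixel_to_ray_pinhole(px, py, fx, fy, cx, cy):
    """Pinhole model: back-project a pixel to a unit camera-space ray."""
    d = np.array([(px - cx) / fx, (py - cy) / fy, 1.0])
    return d / np.linalg.norm(d)

def pixel_to_ray_fisheye_equidistant(px, py, f, cx, cy):
    """Equidistant fisheye model (r = f * theta): back-project a pixel
    to a unit camera-space ray; theta is the angle from the optical axis."""
    u, v = px - cx, py - cy
    r = np.hypot(u, v)
    if r < 1e-12:                      # pixel at the projection center
        return np.array([0.0, 0.0, 1.0])
    theta = r / f
    s = np.sin(theta) / r
    return np.array([u * s, v * s, np.cos(theta)])
```

Because each Gaussian is projected along its own central ray, only this back-projection routine differs between camera models; the rest of the pipeline is unchanged.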
@article{huang2024erroranalysis3dgaussian,
  title={On the Error Analysis of 3D Gaussian Splatting and an Optimal Projection Strategy},
  author={Letian Huang and Jiayang Bai and Jie Guo and Yuanqi Li and Yanwen Guo},
  journal={arXiv preprint arXiv:2402.00752},
  year={2024}
}