NeuralPVS: Learned Estimation of Potentially Visible Sets

Xiangyu Wang1, Thomas Köhler2, Jun Lin Qiu2, Shohei Mori1, Markus Steinberger2, Dieter Schmalstieg1,2
1VISUS, University of Stuttgart, Germany, 2Graz University of Technology, Austria
SIGGRAPH Asia 2025
NeuralPVS Teaser

Overview of the NeuralPVS pipeline. The left side illustrates the overall system and task, where the camera is colored purple and the white rendering indicates geometry invisible to the camera. A froxelized representation of the input scene is fed into the neural network with interleaving layers, which outputs the potentially visible set (PVS) in froxelized form, as displayed in the middle. The network is trained with pairs consisting of a froxelized scene and the corresponding ground-truth PVS in froxelized form. The network runs at 100 Hz (10 ms per frame) on the GPU and achieves an error rate below 1%, without introducing noticeable artifacts in the rendered images. The right side shows the rendered PVS of the frame from a bird's-eye view.
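To make the froxelized input concrete, below is a minimal froxelization sketch in Python (NumPy). It bins surface sample points of the scene into a frustum-aligned voxel ("froxel") occupancy grid for a given view; the function name, grid resolution, and the use of simple point sampling are illustrative assumptions, not the paper's implementation.

# Hypothetical froxelization sketch: bins surface sample points into a
# frustum-aligned voxel ("froxel") occupancy grid. Illustrative only.
import numpy as np

def froxelize(points_world, view_proj, grid_res=(128, 64, 128)):
    """points_world: (N, 3) surface samples; view_proj: (4, 4) view-projection matrix.
    Returns a boolean occupancy grid of shape grid_res."""
    n = points_world.shape[0]
    homo = np.concatenate([points_world, np.ones((n, 1))], axis=1)  # homogeneous coordinates
    clip = homo @ view_proj.T
    ndc = clip[:, :3] / clip[:, 3:4]                                # perspective divide
    inside = (clip[:, 3] > 0) & np.all(np.abs(ndc) <= 1.0, axis=1)  # keep points in the frustum
    res = np.array(grid_res)
    idx = np.clip(((ndc[inside] + 1.0) * 0.5 * res).astype(int), 0, res - 1)
    grid = np.zeros(grid_res, dtype=bool)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True                    # mark occupied froxels
    return grid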

Abstract

Real-time visibility determination in expansive or dynamically changing environments has long posed a significant challenge in computer graphics. Existing techniques are computationally expensive and often applied as a precomputation step on a static scene. We present NeuralPVS, the first deep-learning approach for visibility computation that efficiently determines from-region visibility in a large scene, running at approximately 100 Hz while missing less than 1% of the geometry. This is made possible by a neural network operating on a voxelized representation of the scene. The network's performance is achieved by combining sparse convolution with a 3D volume-preserving interleaving for data compression. Moreover, we introduce a novel repulsive visibility loss that effectively guides the network to converge to the correct data distribution. This loss provides enhanced robustness and generalization to unseen scenes. Our results demonstrate that NeuralPVS outperforms existing methods in terms of both accuracy and efficiency, making it a promising solution for real-time visibility computation.
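As a rough illustration only, the sketch below runs one optimization step on a (froxelized scene, ground-truth PVS) pair. The weighted per-froxel binary cross-entropy is a simple stand-in that merely penalizes missed visible geometry more heavily; the repulsive visibility loss introduced in the paper is defined differently, and the tensor shapes and weighting factor are assumptions.

# Illustrative training step on a (geometry grid, ground-truth PVS) pair.
# The loss below is NOT the paper's repulsive visibility loss, only a
# weighted-BCE placeholder that punishes false negatives more strongly.
import torch.nn.functional as F

def visibility_loss(pred_logits, gt_pvs, miss_weight=10.0):
    # gt_pvs is a binary float grid; visible froxels receive a larger weight.
    weights = 1.0 + (miss_weight - 1.0) * gt_pvs
    return F.binary_cross_entropy_with_logits(pred_logits, gt_pvs, weight=weights)

def train_step(model, optimizer, geometry_grid, gt_pvs):
    # geometry_grid, gt_pvs: (B, 1, X, Y, Z) float tensors.
    optimizer.zero_grad()
    pred_logits = model(geometry_grid)
    loss = visibility_loss(pred_logits, gt_pvs)
    loss.backward()
    optimizer.step()
    return loss.item()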

Video

Pipeline

NeuralPVS Pipeline

For each viewcell, the scene's geometry is froxelized into a GV (geometry froxel-grid), which is input to the PVS (potentially visible set) estimator network. A 3D interleaving function first compresses the GV channels; a CNN then predicts the visible part of the geometry grid; afterwards, a 3D deinterleaving function reconstructs the full PVS. Geometric primitives in froxels marked invisible in the PVS are culled from all further rendering computations.
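One way to read the interleaving step is as a 3D space-to-depth rearrangement: the spatial resolution shrinks by a factor r along each axis while the channel count grows by r^3, so no voxel data is discarded and the convolutions run on a smaller grid. The PyTorch sketch below shows such a volume-preserving interleave/deinterleave pair; the paper's exact operator and block size may differ.

# Volume-preserving 3D interleaving, interpreted here as space-to-depth (assumption).
import torch

def interleave_3d(x, r=2):
    # (B, C, X, Y, Z) -> (B, C*r^3, X/r, Y/r, Z/r)
    b, c, X, Y, Z = x.shape
    x = x.view(b, c, X // r, r, Y // r, r, Z // r, r)
    x = x.permute(0, 1, 3, 5, 7, 2, 4, 6).contiguous()
    return x.view(b, c * r ** 3, X // r, Y // r, Z // r)

def deinterleave_3d(x, r=2):
    # Exact inverse: (B, C*r^3, x, y, z) -> (B, C, x*r, y*r, z*r)
    b, cr3, xs, ys, zs = x.shape
    c = cr3 // r ** 3
    x = x.view(b, c, r, r, r, xs, ys, zs)
    x = x.permute(0, 1, 5, 2, 6, 3, 7, 4).contiguous()
    return x.view(b, c, xs * r, ys * r, zs * r)

# Round-trip sanity check: deinterleave_3d(interleave_3d(g)) reproduces g exactly.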

Results

NeuralPVS Results

(FNR: false negative rate in froxels; FPR: false positive rate in froxels; PER: pixel error rate in the rendered image.)
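Assuming the standard definitions over binary froxel grids, the froxel-level rates could be computed as sketched below; the pixel error rate additionally requires comparing rendered images and is not shown, and the paper's exact normalization may differ.

# Froxel-level error rates, assuming standard FNR/FPR definitions (illustrative).
import numpy as np

def froxel_error_rates(pred_pvs, gt_pvs):
    # pred_pvs, gt_pvs: boolean arrays of identical shape, one entry per froxel.
    fn = np.logical_and(~pred_pvs, gt_pvs).sum()    # visible froxels wrongly culled
    fp = np.logical_and(pred_pvs, ~gt_pvs).sum()    # invisible froxels wrongly kept
    fnr = fn / max(int(gt_pvs.sum()), 1)            # fraction of visible froxels that are missed
    fpr = fp / max(int((~gt_pvs).sum()), 1)         # fraction of invisible froxels kept
    return fnr, fpr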

Per-frame PVS estimation performance with key-frame images of the Viking Village scene in the video above. The sequence has 1800 frames in total, of which 410 frames with computed PVS are shown in the figure. The error pixels of the key frames are marked in red on the rendered images.

For more results, please refer to the paper.

BibTeX

@misc{wang2025neuralpvs,
  title={NeuralPVS: Learned Estimation of Potentially Visible Sets},
  author={Xiangyu Wang and Thomas Köhler and Jun Lin Qiu and Shohei Mori and Markus Steinberger and Dieter Schmalstieg},
  year={2025},
  eprint={2509.24677},
  archivePrefix={arXiv},
  primaryClass={cs.GR}
}