We study 3D point cloud attribute compression via a volumetric approach: assuming the point cloud geometry is known at both encoder and decoder, parameters $\theta$ of a continuous attribute function $f_\theta : \mathbb{R}^3 \to \mathbb{R}$ are quantized to $\hat{\theta}$ and encoded, so that discrete samples $f_{\hat{\theta}}(\mathbf{x}_i)$ can be recovered at the known 3D points $\mathbf{x}_i$ at the decoder. Specifically, we consider a nested sequence of function subspaces $\mathcal{F}^{(p)}_{0} \subseteq \mathcal{F}^{(p)}_{1} \subseteq \cdots$, where $\mathcal{F}^{(p)}_{l}$ is a family of functions spanned by B-spline basis functions of order $p$, $f^*_l$ is the projection of $f$ on $\mathcal{F}^{(p)}_{l}$ and encoded as low-pass coefficients $F^*_l$, and $g^*_l$ is the residual function in the orthogonal subspace $\mathcal{G}^{(p)}_{l}$ (where $\mathcal{F}^{(p)}_{l} \oplus \mathcal{G}^{(p)}_{l} = \mathcal{F}^{(p)}_{l+1}$) and encoded as high-pass coefficients $G^*_l$. In this paper, to improve coding performance over [1], we study predicting $f^*_{l+1}$ at level $l+1$ given $f^*_l$ at level $l$ and encoding of $g^*_l$ for the $p=1$ case (RAHT(1)). For the prediction, we formalize RAHT(1) linear prediction in MPEG-PCC in a theoretical framework, and propose a new nonlinear predictor using a polynomial of the bilateral filter. We derive equations to efficiently compute the critically sampled high-pass coefficients $G^*_l$ amenable to encoding. We optimize the parameters of the resulting feed-forward network on a large training set of point clouds by minimizing a rate-distortion Lagrangian. Experimental results show that our improved framework outperformed the MPEG G-PCC predictor in bit rate reduction.
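As a minimal sketch of the decomposition described above, assuming the generic symbols $f^*_l$, $g^*_l$, $\mathcal{F}^{(p)}_l$, $\mathcal{G}^{(p)}_l$ introduced in the reconstruction (the paper's exact notation may differ); the second display is the standard RAHT(1) butterfly used in MPEG G-PCC, shown here only to illustrate the $p=1$ low-pass/high-pass split:

% Two-channel split of the order-p B-spline space at level l+1 (notation assumed):
\[
  \mathcal{F}^{(p)}_{l+1} \;=\; \mathcal{F}^{(p)}_{l} \oplus \mathcal{G}^{(p)}_{l},
  \qquad
  f^*_{l+1} \;=\; f^*_{l} + g^*_{l},
\]
% so coding proceeds level by level: low-pass coefficients represent f^*_l and
% high-pass coefficients represent the residual g^*_l.
% For p = 1, the RAHT(1) butterfly combines two sibling nodes with attribute
% values a_1, a_2 and point counts w_1, w_2 into one low-pass (l) and one
% high-pass (h) coefficient:
\[
  \begin{bmatrix} \ell \\ h \end{bmatrix}
  \;=\;
  \frac{1}{\sqrt{w_1 + w_2}}
  \begin{bmatrix}
    \sqrt{w_1} & \sqrt{w_2} \\
    -\sqrt{w_2} & \sqrt{w_1}
  \end{bmatrix}
  \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}.
\]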