Mainstream NeRF methods capable of real-time inference, such as Plenoxels and DVGO, suffer from enormous storage costs: a single model can take hundreds of megabytes, which hinders practical deployment of NeRF rendering. This project compresses NeRF models with vector quantization, greatly reducing model size while preserving reconstruction quality.
```shell
git clone https://www.modelscope.cn/DAMOXR/cv_nerf_3d-reconstruction_vector-quantize-compression.git
```
By analyzing what current NeRF models actually store, this method finds that a large fraction of voxels are redundant. It therefore first scores the importance of each voxel, using a scene-adaptive threshold to partition voxels by importance. The important voxels are then compressed with vector quantization, shrinking the model's storage footprint.
Rendered visualizations on multiple datasets also show that compression has almost no effect on quality.
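The prune-then-quantize idea above can be sketched in a few lines of NumPy. This is a toy illustration only, not the project's actual implementation: the function name `vq_compress`, the codebook size, and the plain k-means training loop are all assumptions for demonstration.

```python
import numpy as np

def vq_compress(features, importance, threshold, n_codes=64, n_iter=10, seed=0):
    """Toy VQRF-style compression sketch: drop low-importance voxels,
    then vector-quantize the surviving feature vectors with k-means."""
    rng = np.random.default_rng(seed)
    keep = importance >= threshold          # scene-adaptive threshold decides which voxels survive
    kept = features[keep]
    # Initialize the codebook from random kept voxels, then run Lloyd iterations
    codebook = kept[rng.choice(len(kept), n_codes, replace=False)]
    for _ in range(n_iter):
        dist = ((kept[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
        assign = dist.argmin(1)
        for c in range(n_codes):
            members = kept[assign == c]
            if len(members):
                codebook[c] = members.mean(0)
    # Store one small code index per kept voxel instead of the full feature vector
    dist = ((kept[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    codes = dist.argmin(1).astype(np.uint8)
    return keep, codes, codebook
```

After compression, only the boolean keep-mask, the uint8 codes, and the small codebook need to be stored, which is where the order-of-magnitude size reduction comes from.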
Scope: two dataset types are currently supported, nerf_synthetic and LLFF.
Runtime environment:
Blender dataset:
```python
import os

from modelscope.msdatasets import MsDataset
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

model_id = 'DAMOXR/cv_nerf_3d-reconstruction_vector-quantize-compression'
pretrained_model = 'ficus_demo.pt'

# Download the nerf_synthetic dataset and point at the 'ficus' scene
data_dir = MsDataset.load(
    'nerf_recon_dataset', namespace='damo',
    split='train').config_kwargs['split_config']['train']
nerf_synthetic_dataset = os.path.join(data_dir, 'nerf_synthetic')
blender_scene = 'ficus'
data_dir = os.path.join(nerf_synthetic_dataset, blender_scene)

nerf_vq_compression = pipeline(
    Tasks.nerf_recon_vq_compression,
    model=model_id,
    dataset_name='blender',
    data_dir=data_dir,
    downsample=1,
    ndc_ray=False,
    ckpt_path=pretrained_model)

render_dir = os.path.join('./exp/', blender_scene)
# Evaluate on the test split to get metrics such as PSNR and SSIM; N_vis is the number of test images
nerf_vq_compression(dict(test_mode='evaluation_test', render_dir=render_dir, N_vis=5))
# Render novel views along a path; N_vis is the number of views
nerf_vq_compression(dict(test_mode='render_path', render_dir=render_dir, N_vis=30))
```
LLFF dataset:
```python
import os

from modelscope.msdatasets import MsDataset
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

model_id = 'DAMOXR/cv_nerf_3d-reconstruction_vector-quantize-compression'
pretrained_model = 'fern_demo.pt'

# Download the LLFF dataset and point at the 'fern' scene
data_dir = MsDataset.load(
    'DAMOXR/nerf_llff_data',
    subset_name='default',
    split='test',
).config_kwargs['split_config']['test']
nerf_llff = os.path.join(data_dir, 'nerf_llff_data')
llff_scene = 'fern'
data_dir = os.path.join(nerf_llff, llff_scene)

nerf_vq_compression = pipeline(
    Tasks.nerf_recon_vq_compression,
    model=model_id,
    dataset_name='llff',
    data_dir=data_dir,
    downsample=4,   # LLFF images are rendered at 1/4 resolution
    ndc_ray=True,   # forward-facing scenes use NDC rays
    ckpt_path=pretrained_model)

render_dir = os.path.join('./exp/', llff_scene)
# Evaluate on the test split to get metrics such as PSNR and SSIM; N_vis is the number of test images
nerf_vq_compression(dict(test_mode='evaluation_test', render_dir=render_dir, N_vis=5))
# Render novel views along a path; N_vis is the number of views
nerf_vq_compression(dict(test_mode='render_path', render_dir=render_dir, N_vis=10))
```
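The tables below report PSNR, which is derived from the mean squared error between rendered and ground-truth images. A minimal sketch for reference (the helper name `psnr` and the `max_val` default are illustrative, not part of this project's API):

```python
import numpy as np

def psnr(img, ref, max_val=1.0):
    """Peak signal-to-noise ratio in dB between a rendered image and ground truth."""
    mse = np.mean((img.astype(np.float64) - ref.astype(np.float64)) ** 2)
    return float(10.0 * np.log10(max_val ** 2 / mse))
```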
Before compression (nerf_synthetic):
Scene | PSNR | SSIM | LPIPS_ALEX | LPIPS_VGG | SIZE (MB)
---|---|---|---|---|---
chair | 35.7760 | 0.9846 | 0.0095 | 0.0215 | 67.8907 |
drums | 26.0150 | 0.9370 | 0.0494 | 0.0718 | 67.7655 |
ficus | 34.1024 | 0.9826 | 0.0122 | 0.0222 | 70.7913 |
hotdog | 37.5654 | 0.9828 | 0.0129 | 0.0301 | 84.0902 |
lego | 36.5248 | 0.9834 | 0.0072 | 0.0178 | 68.8282 |
materials | 30.1193 | 0.9523 | 0.0259 | 0.0584 | 84.6609 |
mic | 34.9041 | 0.9885 | 0.0078 | 0.0144 | 66.6982 |
ship | 30.7121 | 0.8941 | 0.0831 | 0.1376 | 70.9092 |
Mean | 33.2149 | 0.9632 | 0.0260 | 0.0467 | 72.7043 |
After VQRF compression (nerf_synthetic):
Scene | PSNR | SSIM | LPIPS_ALEX | LPIPS_VGG | SIZE (MB)
---|---|---|---|---|---
chair | 35.1320 | 0.9806 | 0.0185 | 0.0355 | 5.6044 |
drums | 25.9492 | 0.9316 | 0.0624 | 0.0983 | 4.9731 |
ficus | 33.8504 | 0.9813 | 0.0136 | 0.0288 | 6.0117 |
hotdog | 37.2531 | 0.9807 | 0.0154 | 0.0367 | 9.6663 |
lego | 35.9736 | 0.9808 | 0.0087 | 0.0242 | 6.1286 |
materials | 30.0846 | 0.9511 | 0.0279 | 0.0617 | 8.2645 |
mic | 34.3908 | 0.9861 | 0.0128 | 0.0241 | 3.9262 |
ship | 30.5233 | 0.8908 | 0.0852 | 0.1430 | 7.0612 |
Mean | 32.8946 | 0.9604 | 0.0306 | 0.0565 | 6.4545 |
Before compression (LLFF):
Scene | PSNR | SSIM | LPIPS_ALEX | LPIPS_VGG | SIZE (MB)
---|---|---|---|---|---
fern | 24.9985 | 0.7981 | 0.1595 | 0.2513 | 179.9157 |
flower | 28.1521 | 0.8574 | 0.1043 | 0.1775 | 179.8139 |
room | 32.1206 | 0.9514 | 0.0770 | 0.1622 | 179.8682 |
leaves | 21.1095 | 0.7439 | 0.1425 | 0.2215 | 179.6977 |
horns | 28.3430 | 0.8835 | 0.1027 | 0.1801 | 179.8110 |
trex | 27.6879 | 0.9107 | 0.0793 | 0.2019 | 179.8421 |
fortress | 31.4531 | 0.8982 | 0.0666 | 0.1425 | 179.8664 |
orchids | 19.8818 | 0.6468 | 0.1914 | 0.2786 | 179.9021 |
Mean | 26.7183 | 0.8362 | 0.1154 | 0.2019 | 179.8396 |
After VQRF compression (LLFF):
Scene | PSNR | SSIM | LPIPS_ALEX | LPIPS_VGG | SIZE (MB)
---|---|---|---|---|---
fern | 24.7626 | 0.7861 | 0.1725 | 0.2687 | 16.4782 |
flower | 27.8500 | 0.8421 | 0.1183 | 0.2032 | 16.6604 |
room | 31.7010 | 0.9446 | 0.0900 | 0.1842 | 16.6127 |
leaves | 20.9996 | 0.7262 | 0.1580 | 0.2559 | 16.4771 |
horns | 27.8376 | 0.8613 | 0.1289 | 0.2195 | 16.3832 |
trex | 27.1907 | 0.8979 | 0.0930 | 0.2256 | 16.4172 |
fortress | 31.0268 | 0.8764 | 0.1027 | 0.1895 | 17.7717 |
orchids | 19.7541 | 0.6346 | 0.2073 | 0.2984 | 16.3075 |
Mean | 26.3903 | 0.8212 | 0.1338 | 0.2306 | 16.6385 |
@inproceedings{li2023compressing,
title={Compressing Volumetric Radiance Fields to 1 {MB}},
author={Li, Lingzhi and Shen, Zhen and Wang, Zhongshu and Shen, Li and Bo, Liefeng},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={4222--4231},
year={2023}
}