Wanhua Li*, Renping Zhou*, Jiawei Zhou, Yingwei Song, Johannes Herter, Minghan Qin, Gao Huang†, Hanspeter Pfister†
(* indicates equal contribution, † indicates co-corresponding author)
| Project page | Full Paper | Video |
| Dataset Annotations | Google Drive | Baidu Wangpan |
| Pretrained Model | Google Drive | Baidu Wangpan |
| Pregenerated Point Clouds by COLMAP | Google Drive | Baidu Wangpan |
This repository contains the official implementation of the paper "4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models" (CVPR 2025).
@inproceedings{li20254dlangsplat4dlanguage,
title={4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models},
author={Wanhua Li and Renping Zhou and Jiawei Zhou and Yingwei Song and Johannes Herter and Minghan Qin and Gao Huang and Hanspeter Pfister},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year={2025}
}
This repository contains submodules, so please check it out with
git clone git@github.com:zrporz/4DLangSplat.git --recursive
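If you have already cloned the repository without `--recursive`, the submodule directories will be empty; they can be fetched afterwards from inside the checkout without re-cloning:

```shell
# Fetch (and initialize) all submodules of an existing clone.
# Run this from inside the 4DLangSplat checkout directory.
git submodule update --init --recursive
```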
4D LangSplat uses the following software versions:
- Python 3.10
- CUDA 12.4
- GCC 10.2.0
By default, run the following commands to install the required packages:
conda create -n 4DLangSplat python=3.10
conda activate 4DLangSplat
pip install -r requirements.txt
### submodules for gaussian rasterization ###
pip install -e submodules/simple-knn
pip install -e submodules/4d-langsplat-rasterization
### submodules for generating segmentation maps ###
pip install -e submodules/4d-langsplat-tracking-anything-with-deva
Our models are trained and evaluated on the HyperNeRF and Neu3D datasets. Please follow their instructions to prepare your dataset, or run the following commands:
bash scripts/download_hypernerf.sh data/hypernerf
bash scripts/download_neu3d.sh data/neu3d
To evaluate the rendering results, we use Roboflow to annotate the datasets. The annotations can be accessed through this link: Download the Annotations.
Following 4DGaussians, we use COLMAP to generate the point clouds. Please follow their pipeline, or use ours: Download the Point Clouds.
Then put them under data/<hypernerf or neu3d>/<dataset name>. You need to ensure that the data folder is organized as follows:
|——data
|   |——hypernerf
|   |   |——americano
|   |   |   |——annotations
|   |   |   |   |——train
|   |   |   |   |   |——README
|   |   |   |   |   |——video_annotations.json
|   |   |   |——camera
|   |   |   |——rgb
|   |   |   |   |——1x
|   |   |   |   |   |——000001.png
|   |   |   |   |   |——...
|   |   |   |   |——2x
|   |   |   |   |   |——...
|   |   |   |——dataset.json
|   |   |   |——metadata.json
|   |   |   |——points.npy
|   |   |   |——scene.json
|   |   |   |——points3D_downsample2.ply
|   |   |——chickchicken
|   |   |   |——...
|   |——neu3d
|   |   |——coffee_martini
|   |   |   |——annotations
|   |   |   |   |——train
|   |   |   |   |   |——README
|   |   |   |——cam00
|   |   |   |   |——images
|   |   |   |   |   |——0000.png
|   |   |   |   |   |——...
|   |   |   |——cam01
|   |   |   |   |——...
|   |   |   |——cam00.mp4
|   |   |   |——cam01.mp4
|   |   |   |——...
|   |   |   |——poses_bounds.npy
|   |   |   |——points3D_downsample2.ply
|   |   |——cur_roasted_beef
|   |   |   |——...
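Because the training and rendering scripts assume this exact layout, a missing file typically surfaces only later as a file-not-found error. A minimal, hypothetical sanity-check sketch (path names taken from the tree above; adjust the required list per scene) can catch this early:

```python
from pathlib import Path

def check_scene(root: str, required: list[str]) -> list[str]:
    """Return the required relative paths that are missing under `root`."""
    base = Path(root)
    return [p for p in required if not (base / p).exists()]

# Required entries for a HyperNeRF scene, following the "americano" example above.
HYPERNERF_REQUIRED = [
    "annotations/train/video_annotations.json",
    "rgb/1x",
    "dataset.json",
    "metadata.json",
    "scene.json",
    "points3D_downsample2.ply",
]

missing = check_scene("data/hypernerf/americano", HYPERNERF_REQUIRED)
if missing:
    print("Missing entries:", *missing, sep="\n  ")
else:
    print("Scene layout looks complete.")
```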
We provide pretrained checkpoints for the Gaussian model and the autoencoder: Download Pretrained Checkpoint.
For the HyperNeRF dataset, take americano as an example. Put the checkpoint folder under output/hypernerf/americano and run the following commands for rendering and evaluation:
bash scripts/render-hypernerf.sh
bash scripts/eval-hypernerf.sh
For the Neu3D dataset, take coffee_martini as an example. Put the checkpoint folder under output/neu3d/coffee_martini and run the following commands for rendering and evaluation:
bash scripts/render-neu3d.sh
bash scripts/eval-neu3d.sh
The evaluation results will be saved under eval/eval_results.
- release the code of the 4d-langsplat-rasterization
- release the code of the 4d-langsplat-tracking-anything-with-deva
- release the code of the evaluation
- release the code of the autoencoder
- release the code of preprocessing
- release the code of training
- release the pretrained model
- release the preprocessed dataset
- update the arXiv link