diff --git a/README.md b/README.md
index c654998e..4e947c10 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,10 @@
-# UT CODa Model Training Configurations
-We utilize the ST3D++ code repository and OpenPCDet model training configuration files for all empirical analyses conducted for the UT CODa paper.
+# University of Texas Campus Object Dataset Object Detection Models
+
+Official model development kit for CODa. We strongly recommend using this repository to run our pretrained
+models and train on custom datasets. Thanks to the authors of ST3D++ and OpenPCDet, from whose work this
+repository was adapted.
+
+![Sequence 0 Clip](./docs/codademo.gif)
 
 ## Installation
 
@@ -13,13 +18,9 @@ Please refer to [GETTING_STARTED.md](docs/GETTING_STARTED.md) to learn more usag
 
 Our code is released under the Apache 2.0 license.
 
-## Acknowledgement
-
-Our code is heavily based on [OpenPCDet v0.3](https://github.com/open-mmlab/OpenPCDet/commit/e3bec15f1052b4827d942398f20f2db1cb681c01). Thanks OpenPCDet Development Team for their awesome codebase.
-
-## Citation
+## Paper Citation
 
-If you find our work useful in your research, please consider citing:
+If you find our work useful in your research, please consider citing it:
 ```
 @inproceedings{zhang2023utcoda,
   title={Towards Robust 3D Robot Perception in Urban Environments: The UT Campus Object Dataset},
@@ -29,15 +30,25 @@ If you find our work useful in your research, please consider citing:
 }
 ```
 
-If you find the ST3D++ or OpenPCDet useful, please cite:
+## Dataset Citation
 
 ```
-@inproceedings{yang2021st3d,
-  title={ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection},
-  author={Yang, Jihan and Shi, Shaoshuai and Wang, Zhe and Li, Hongsheng and Qi, Xiaojuan},
-  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
-  year={2021}
+@data{T8/BBOQMV_2023,
+author = {Zhang, Arthur and Eranki, Chaitanya and Zhang, Christina and Hong, Raymond and Kalyani, Pranav and Kalyanaraman, Lochana and Gamare, Arsh and Bagad, Arnav and Esteva, Maria and Biswas, Joydeep},
+publisher = {Texas Data Repository},
+title = {{UT Campus Object Dataset (CODa)}},
+year = {2023},
+version = {DRAFT VERSION},
+doi = {10.18738/T8/BBOQMV},
+url = {https://doi.org/10.18738/T8/BBOQMV}
 }
 ```
+
+## Acknowledgement
+
+Our code is heavily based on [OpenPCDet v0.3](https://github.com/open-mmlab/OpenPCDet/commit/e3bec15f1052b4827d942398f20f2db1cb681c01). Thanks to the OpenPCDet Development Team for their awesome codebase.
+
+Thank you as well to the authors of ST3D++ for an awesome codebase! If you find ST3D++ useful, please cite:
 ```
 @article{yang2021st3d++,
   title={ST3D++: Denoised Self-training for Unsupervised Domain Adaptation on 3D Object Detection},
diff --git a/docs/DEMO.md b/docs/DEMO.md
deleted file mode 100644
index da63e044..00000000
--- a/docs/DEMO.md
+++ /dev/null
@@ -1,48 +0,0 @@
-# Quick Demo
-
-Here we provide a quick demo to test a pretrained model on the custom point cloud data and visualize the predicted results.
-
-We suppose you already followed the [INSTALL.md](INSTALL.md) to install the `OpenPCDet` repo successfully.
-
-1. Download the provided pretrained models as shown in the [README.md](../README.md).
-
-2. Make sure you have already installed the `mayavi` visualization tools. If not, you could install it as follows:
-   ```
-   pip install mayavi
-   ```
-
-3. Prepare you custom point cloud data (skip this step if you use the original KITTI data).
- * You need to transform the coordinate of your custom point cloud to -the unified normative coordinate of `OpenPCDet`, that is, x-axis points towards to front direction, -y-axis points towards to the left direction, and z-axis points towards to the top direction. - * (Optional) the z-axis origin of your point cloud coordinate should be about 1.6m above the ground surface, - since currently the provided models are trained on the KITTI dataset. - * Set the intensity information, and save your transformed custom data to `numpy file`: - ```python - # Transform your point cloud data - ... - - # Save it to the file. - # The shape of points should be (num_points, 4), that is [x, y, z, intensity] (Only for KITTI dataset). - # If you doesn't have the intensity information, just set them to zeros. - # If you have the intensity information, you should normalize them to [0, 1]. - points[:, 3] = 0 - np.save(`my_data.npy`, points) - ``` - -4. Run the demo with a pretrained model (e.g. PV-RCNN) and your custom point cloud data as follows: -```shell -python demo.py --cfg_file cfgs/kitti_models/pv_rcnn.yaml \ - --ckpt pv_rcnn_8369.pth \ - --data_path ${POINT_CLOUD_DATA} -``` -Here `${POINT_CLOUD_DATA}` could be the following format: -* Your transformed custom data with a single numpy file like `my_data.npy`. -* Your transformed custom data with a directory to test with multiple point cloud data. -* The original KITTI `.bin` data within `data/kitti`, like `data/kitti/training/velodyne/000008.bin`. - -Then you could see the predicted results with visualized point cloud as follows: - -

- -

diff --git a/docs/GETTING_STARTED.md b/docs/GETTING_STARTED.md index 6a68f31c..050a5804 100644 --- a/docs/GETTING_STARTED.md +++ b/docs/GETTING_STARTED.md @@ -2,109 +2,94 @@ The dataset configs are located within [tools/cfgs/dataset_configs](../tools/cfgs/dataset_configs), and the model configs are located within [tools/cfgs](../tools/cfgs) for different datasets. - ## Dataset Preparation -Currently we provide the dataloader of KITTI dataset and NuScenes dataset, and the supporting of more datasets are on the way. - -### KITTI Dataset -* Please download the official [KITTI 3D object detection](http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d) dataset and organize the downloaded files as follows (the road planes could be downloaded from [[road plane]](https://drive.google.com/file/d/1d5mq0RXRnvHPVeKx6Q612z0YRO1t2wAp/view?usp=sharing), which are optional for data augmentation in the training): -* NOTE: if you already have the data infos from `pcdet v0.1`, you can choose to use the old infos and set the DATABASE_WITH_FAKELIDAR option in tools/cfgs/dataset_configs/kitti_dataset.yaml as True. The second choice is that you can create the infos and gt database again and leave the config unchanged. +Currently, we support the CODa, JRDB, KITTI, nuScenes, and Waymo datasets. For JRDB support, you should use the jrdb branch in this repository. +We describe how to prepare the other datasets in the [preparing external datasets section](#external-datasets). -``` -OpenPCDet -├── data -│ ├── kitti -│ │ │── ImageSets -│ │ │── training -│ │ │ ├──calib & velodyne & label_2 & image_2 & (optional: planes) -│ │ │── testing -│ │ │ ├──calib & velodyne & image_2 -├── pcdet -├── tools -``` +### Campus Object Dataset (CODa) -* Generate the data infos by running the following command: -```python -python -m pcdet.datasets.kitti.kitti_dataset create_kitti_infos tools/cfgs/dataset_configs/kitti_dataset.yaml -``` +Please download the official CODa using the [dataset development kit](https://github.com/ut-amrl/coda-devkit) +and link it in the data directory in this repo as follows. You may need to symlink the `3d_raw` directory +to the `3d_comp` directory if you downloaded CODa by split rather than by sequence. 
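+For example, assuming CODa was downloaded to `/path/to/CODa` (placeholder path) and the split-based
+download produced a `3d_comp` directory instead of `3d_raw`, the links could be created roughly as
+shown here, producing the directory layout below:
+
+```
+# Link the downloaded dataset into this repository (adjust the source path to your download location)
+ln -s /path/to/CODa data/coda_original_format
+
+# Expose the split-based point cloud directory under the name the converter expects
+ln -s 3d_comp data/coda_original_format/3d_raw
+```
+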
-### NuScenes Dataset -* Please download the official [NuScenes 3D object detection dataset](https://www.nuscenes.org/download) and -organize the downloaded files as follows: ``` -OpenPCDet +coda-models ├── data -│ ├── nuscenes -│ │ │── v1.0-trainval (or v1.0-mini if you use mini) -│ │ │ │── samples -│ │ │ │── sweeps -│ │ │ │── maps -│ │ │ │── v1.0-trainval +│ ├── coda_original_format +│ ├── 2d_rect +| ├── 3d_raw +│ ├── calibrations +│ ├── metadata +│ ├── poses +| ├── timestamps ├── pcdet ├── tools ``` -* Install the `nuscenes-devkit` with version `1.0.5` by running the following command: -```shell script -pip install nuscenes-devkit==1.0.5 +* Convert from CODa file format to KITTI format +```python +python tools/create_data.py coda ``` -* Generate the data infos by running the following command (it may take several hours): -```python -python -m pcdet.datasets.nuscenes.nuscenes_dataset --func create_nuscenes_infos \ - --cfg_file tools/cfgs/dataset_configs/nuscenes_dataset.yaml \ - --version v1.0-trainval +* Generate the data infos by running the following command: +```python +python -m pcdet.datasets.coda.coda_dataset create_coda_infos tools/cfgs/dataset_configs/da_coda_oracle_dataset_full.yaml ``` -### Waymo Open Dataset -* Please download the official [Waymo Open Dataset](https://waymo.com/open/download/), -including the training data `training_0000.tar~training_0031.tar` and the validation -data `validation_0000.tar~validation_0007.tar`. -* Unzip all the above `xxxx.tar` files to the directory of `data/waymo/raw_data` as follows (You could get 798 *train* tfrecord and 202 *val* tfrecord ): +You should now see the following file structure. + ``` -OpenPCDet +coda-models ├── data -│ ├── waymo -│ │ │── ImageSets -│ │ │── raw_data -│ │ │ │── segment-xxxxxxxx.tfrecord -| | | |── ... -| | |── waymo_processed_data -│ │ │ │── segment-xxxxxxxx/ -| | | |── ... -│ │ │── pcdet_gt_database_train_sampled_xx/ -│ │ │── pcdet_waymo_dbinfos_train_sampled_xx.pkl +│ ├── coda_original_format +│ ├── 2d_rect +| ├── 3d_raw +│ ├── calibrations +│ ├── metadata +│ ├── poses +| ├── timestamps +│ ├── coda128_allclass_full +│ ├── gt_database +| ├── ImageSets +│ ├── testing +│ ├── training +│ ├── coda_dbinfos_train.pkl +| ├── coda_infos_test.pkl +| ├── coda_infos_train.pkl +| ├── coda_infos_trainval.pkl +| ├── coda_infos_val.pkl ├── pcdet ├── tools ``` -* Install the official `waymo-open-dataset` by running the following command: -```shell script -pip3 install --upgrade pip -# tf 2.0.0 -pip3 install waymo-open-dataset-tf-2-0-0==1.2.0 --user -``` +## Using Pretrained Models +The examples below use the provided pretrained model on 32 vertical channel resolution. To download pretrained +model weights for the other resolutions. Refer to the [Pretrained Models] + +### Live Visualization using ROS (ROS Installation Required) + +You will need to have installed the ROS in your conda environment according to the [INSTALL.md](./INSTALL.md) +for the following to work. In a separate terminal, publish your point clouds over ROS. Run the following command, +replacing `YOUR_POINT_CLOUD_TOPIC_NAME` with the point cloud topic being published. Depending on your ROS +configuration, the maximum inference frequency varies between 2-5 Hz. 
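+If your point clouds are stored in a bag file rather than published live, one simple way to publish
+them (the bag name here is a placeholder) is to replay the bag in that separate terminal:
+
+```
+rosbag play --loop your_recording.bag
+```
+
+Then start inference with the command below.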
-* Extract point cloud data from tfrecord and generate data infos by running the following command (it takes several hours, -and you could refer to `data/waymo/waymo_processed_data` to see how many records that have been processed): -```python -python -m pcdet.datasets.waymo.waymo_dataset --func create_waymo_infos \ - --cfg_file tools/cfgs/dataset_configs/waymo_dataset.yaml +``` +python ros_demo.py --pc YOUR_POINT_CLOUD_TOPIC_NAME ``` -Note that you do not need to install `waymo-open-dataset` if you have already processed the data before and do not need to evaluate with official Waymo Metrics. +You should something that looks like this: -# Using Pretrained Models +![Sequence 0 Clip](./rosdemo.png) -## Visualize Detector Performance on CODa +### Visualize Detector Performance on CODa (Open3D) Before visualizing object detections on CODa, you will first need to download the pre-trained model weights. Then, run the following command, specifying the path to the model weights. ``` -python demo.py --ckpt ../output/da-coda-coda_models/waymocenterhead/pvrcnn_allclass128full_finetune_headfull/coda128bestwaymoLR0.010000OPTadam_onecycle/ckpt/checkpoint_epoch_30.pth +python demo.py ``` -## Visualize Detector Performance on Custom Dataset +### Visualize Detector Performance on Custom Dataset (Open3D) To visualize the pre-trained model predictions on your dataset. Create a directory named `velodyne` and place the `.bin` files that you would like to use in this directory. Then set the `--data_path` cli argument to the parent directory for your `velodyne` directory. The file structure should look as follows: @@ -121,40 +106,7 @@ Then, run the set the `--ckpt` argument to your model weights path and run the c python demo.py --ckpt ../output/da-coda-coda_models/waymocenterhead/pvrcnn_allclass128full_finetune_headfull/coda128bestwaymoLR0.010000OPTadam_onecycle/ckpt/checkpoint_epoch_30.pth --dataset_name demo --data_path {PARENT_DIRECTORY} --ext .bin ``` -## Training & Testing - - -### Test and evaluate the pretrained models -* Test with a pretrained model: -```shell script -python test.py --cfg_file ${CONFIG_FILE} --batch_size ${BATCH_SIZE} --ckpt ${CKPT} -``` - -* To test all the saved checkpoints of a specific training setting and draw the performance curve on the Tensorboard, add the `--eval_all` argument: -```shell script -python test.py --cfg_file ${CONFIG_FILE} --batch_size ${BATCH_SIZE} --eval_all -``` - -* Notice that if you want to test on the setting with KITTI as **target domain**, - please add `--set DATA_CONFIG_TAR.FOV_POINTS_ONLY True` to enable front view - point cloud only: -```shell script -python test.py --cfg_file ${CONFIG_FILE} --batch_size ${BATCH_SIZE} --eval_all --set DATA_CONFIG_TAR.FOV_POINTS_ONLY True -``` - -* To test with multiple GPUs: -```shell script -sh scripts/dist_test.sh ${NUM_GPUS} \ - --cfg_file ${CONFIG_FILE} --batch_size ${BATCH_SIZE} - -# or - -sh scripts/slurm_test_mgpu.sh ${PARTITION} ${NUM_GPUS} \ - --cfg_file ${CONFIG_FILE} --batch_size ${BATCH_SIZE} -``` - - -### Train a model +# Training You could optionally add extra command line parameters `--batch_size ${BATCH_SIZE}` and `--epochs ${EPOCHS}` to specify your preferred parameters. 
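+
+For example, using the generic multi-GPU commands shown below (the config file, batch size, and epoch
+count here are only illustrative), a 4-GPU run could be launched as:
+
+```shell script
+sh scripts/dist_train.sh 4 --cfg_file cfgs/coda_models/pvrcnn_oracle.yaml --batch_size 4 --epochs 30
+```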
@@ -167,11 +119,6 @@ sh scripts/dist_train.sh ${NUM_GPUS} --cfg_file ${CONFIG_FILE} sh scripts/slurm_train.sh ${PARTITION} ${JOB_NAME} ${NUM_GPUS} --cfg_file ${CONFIG_FILE} ``` -* Train with a single GPU: -```shell script -python train.py --cfg_file ${CONFIG_FILE} -``` - ### Train the Pre-trained Take Source Only model with SECOND-IoU on Waymo -> KITTI as an example: ```shell script @@ -191,122 +138,144 @@ sh scripts/dist_train.sh ${NUM_GPUS} --cfg_file cfgs/da-waymo-kitti_models/secon ``` Notice that you also need to focus the performance of the **best model**. -# Reproducing Oracle Benchmarks - -## UT CODa - PP -`cfgs/coda_models/pointpillar_1x.yaml` - -## UT CODa - CP -`cfgs/code_models/centerpoint.yaml` - -## UT CODa - PVRCNN -`cfgs/coda_models/pvrcnn_oracle.yaml` - -## nuScenes - PP -`cfgs/da-nuscenes-coda_models/cbgs_pp_multihead.yaml` - -## nuScenes - CP -`cfgs/da-nuscenes-coda_models/da_cbgs_voxel0075_res3d_centerpoint.yaml` - -## nuScenes - PVRCNN -tbd - -## Waymo - PP -`cfgs/da-waymo-coda_models/da_pointpillar_1x.yaml` -## Waymo - CP -`cfgs/da-waymo-coda_models/da_centerpoint.yaml` -## Waymo - PVRCNN -`cfgs/da-waymo-coda_models/pvrcnn/da_pvrcnn_with_centerhead_rpn.yaml` - -# Reproducing Domain Adaptation Results - -## Waymo - Direct - -Train waymo using coda pvrcnn architecture - -`cfgs/da-waymo-coda_models/direct_coda_pvrcnn_oracle.yaml` - -Run with ./dist_test.sh to eval on coda +# Testing -`cfgs/da-waymo-coda_models/direct_coda_pvrcnn_oracle.yaml` - -## Waymo - ST - -Use Waymo direct as pretrained model - -`cfgs/da-waymo-coda_models/pvrcnn_st3d/pvrcnn_st3d.yaml` - -## Waymo - FT - -Use Waymo direct as pretrained model - -`cfgs/da-waymo-coda_models/pvrcnn_st3d/pvrcnn_st3d_finetune_head.yaml` -`cfgs/da-waymo-coda_models/pvrcnn_st3d/pvrcnn_st3d_finetune_headfull.yaml` - -## Waymo - ST FT - -Use Waymo ST as pretrained model - -`cfgs/da-waymo-coda_models/pvrcnn_st3d/pvrcnn_st3d_finetune_headfull.yaml` - -## nuScenes - Direct -TBD - -# Reproducing UT CODa to AV Datasets - -## UT CODa32 FT nuScenes - -Dist Train (train on UT CODa eval on nuScenes) Choose best epoch to finetune later - -`cfgs/da-coda-nuscenes_models/pvrcnn_32oracle_coda.yaml` - -Dist Train (finetune head) - -`cfgs/da-coda-nuscenes_models/pvrcnn_32oracle_finetune_head.yaml` +* Test with a pretrained model: +```shell script +python test.py --cfg_file ${CONFIG_FILE} --batch_size ${BATCH_SIZE} --ckpt ${CKPT} +``` -Dist Train (finetune full) +* To test all the saved checkpoints of a specific training setting and draw the performance curve on the Tensorboard, add the `--eval_all` argument: +```shell script +python test.py --cfg_file ${CONFIG_FILE} --batch_size ${BATCH_SIZE} --eval_all +``` -`cfgs/da-coda-nuscenes_models/pvrcnn_32oracle_finetune_headfull.yaml` +* Notice that if you want to test on the setting with KITTI as **target domain**, + please add `--set DATA_CONFIG_TAR.FOV_POINTS_ONLY True` to enable front view + point cloud only: +```shell script +python test.py --cfg_file ${CONFIG_FILE} --batch_size ${BATCH_SIZE} --eval_all --set DATA_CONFIG_TAR.FOV_POINTS_ONLY True +``` -## UT CODa128 FT nuScenes +* To test with multiple GPUs: +```shell script +sh scripts/dist_test.sh ${NUM_GPUS} \ + --cfg_file ${CONFIG_FILE} --batch_size ${BATCH_SIZE} -Dist Train (Train on UT Coda eval on nuScenes) Choose best epoch to finetune later +# or -`cfgs/da-coda-nuscenes_models/pvrcnn_128oracle_coda.yaml` +sh scripts/slurm_test_mgpu.sh ${PARTITION} ${NUM_GPUS} \ + --cfg_file ${CONFIG_FILE} --batch_size ${BATCH_SIZE} +``` -Dist Train 
(finetune head) -`cfgs/da-coda-nuscenes_models/pvrcnn_128oracle_finetune_head.yaml` +# Preparing Other Datasets + -Dist Train (finetune full) +### KITTI Dataset +* Please download the official [KITTI 3D object detection](http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d) dataset and organize the downloaded files as follows (the road planes could be downloaded from [[road plane]](https://drive.google.com/file/d/1d5mq0RXRnvHPVeKx6Q612z0YRO1t2wAp/view?usp=sharing), which are optional for data augmentation in the training): +* NOTE: if you already have the data infos from `pcdet v0.1`, you can choose to use the old infos and set the DATABASE_WITH_FAKELIDAR option in tools/cfgs/dataset_configs/kitti_dataset.yaml as True. The second choice is that you can create the infos and gt database again and leave the config unchanged. -`cfgs/da-coda-nuscenes_models/pvrcnn_128oracle_finetune_headfull.yaml` +``` +OpenPCDet +├── data +│ ├── kitti +│ │ │── ImageSets +│ │ │── training +│ │ │ ├──calib & velodyne & label_2 & image_2 & (optional: planes) +│ │ │── testing +│ │ │ ├──calib & velodyne & image_2 +├── pcdet +├── tools +``` -## UT CODa32 FT Waymo +* Generate the data infos by running the following command: +```python +python -m pcdet.datasets.kitti.kitti_dataset create_kitti_infos tools/cfgs/dataset_configs/kitti_dataset.yaml +``` -Dist Train (Train on UT Coda eval on Waymo) Choose best epoch to finetune later +### NuScenes Dataset +* Please download the official [NuScenes 3D object detection dataset](https://www.nuscenes.org/download) and +organize the downloaded files as follows: +``` +OpenPCDet +├── data +│ ├── nuscenes +│ │ │── v1.0-trainval (or v1.0-mini if you use mini) +│ │ │ │── samples +│ │ │ │── sweeps +│ │ │ │── maps +│ │ │ │── v1.0-trainval +├── pcdet +├── tools +``` -`cfgs/da-coda-waymo_models/pvrcnn_32oracle_coda.yaml` +* Install the `nuscenes-devkit` with version `1.0.5` by running the following command: +```shell script +pip install nuscenes-devkit==1.0.5 +``` -Dist Train (finetune head) +* Generate the data infos by running the following command (it may take several hours): +```python +python -m pcdet.datasets.nuscenes.nuscenes_dataset --func create_nuscenes_infos \ + --cfg_file tools/cfgs/dataset_configs/nuscenes_dataset.yaml \ + --version v1.0-trainval +``` -`cfgs/da-coda-waymo_models/pvrcnn_32oracle_finetune_head.yaml` +### Waymo Open Dataset +* Please download the official [Waymo Open Dataset](https://waymo.com/open/download/), +including the training data `training_0000.tar~training_0031.tar` and the validation +data `validation_0000.tar~validation_0007.tar`. +* Unzip all the above `xxxx.tar` files to the directory of `data/waymo/raw_data` as follows (You could get 798 *train* tfrecord and 202 *val* tfrecord ): +``` +OpenPCDet +├── data +│ ├── waymo +│ │ │── ImageSets +│ │ │── raw_data +│ │ │ │── segment-xxxxxxxx.tfrecord +| | | |── ... +| | |── waymo_processed_data +│ │ │ │── segment-xxxxxxxx/ +| | | |── ... 
+│ │ │── pcdet_gt_database_train_sampled_xx/ +│ │ │── pcdet_waymo_dbinfos_train_sampled_xx.pkl +├── pcdet +├── tools +``` +* Install the official `waymo-open-dataset` by running the following command: +```shell script +pip3 install --upgrade pip +# tf 2.0.0 +pip3 install waymo-open-dataset-tf-2-0-0==1.2.0 --user +``` -Dist Train (finetune full) +* Extract point cloud data from tfrecord and generate data infos by running the following command (it takes several hours, +and you could refer to `data/waymo/waymo_processed_data` to see how many records that have been processed): +```python +python -m pcdet.datasets.waymo.waymo_dataset --func create_waymo_infos \ + --cfg_file tools/cfgs/dataset_configs/waymo_dataset.yaml +``` -`cfgs/da-coda-waymo_models/pvrcnn_32oracle_finetune_heafull.yaml` +Note that you do not need to install `waymo-open-dataset` if you have already processed the data before and do not need to evaluate with official Waymo Metrics. -## UT CODa128 FT Waymo -Dist Train (Train on UT Coda eval on Waymo) Choose best epoch to finetune later -`cfgs/da-coda-waymo_models/pvrcnn_128oracle_coda.yaml` -Dist Train (finetune head) +# Reproducing Oracle Benchmarks -`cfgs/da-coda-waymo_models/pvrcnn_128oracle_finetune_head.yaml` +We provide all of the model configurations with short descriptions for each of the models trained for the CODa +paper in the scripts directory. We list the experiments that use each of the slurn job files. -Dist Train (finetune full) +## ls6_slurn_train.bash +1. AV Datasets to CODa (Train from scratch, ST3D++ self-training, Finetuning) +2. AV Dataset Benchmarks +3. CODa to AV Datasets +4. CODa pretrained models for 3 classes -`cfgs/da-coda-waymo_models/pvrcnn_128oracle_finetune_headfull.yaml` +## ls6_slurn_jrdbtrain.bash +1. CODa/Waymo to JRDB +2. JRDB to JRDB +## ls6_slurn_codatrain.bash +1. CODa pretrained models for all classes \ No newline at end of file diff --git a/docs/INSTALL.md b/docs/INSTALL.md index 05f5214f..a3a6fba0 100644 --- a/docs/INSTALL.md +++ b/docs/INSTALL.md @@ -1,30 +1,55 @@ # Installation -### Requirements +## Requirements All the codes are tested in the following environment: -* Linux (tested on Ubuntu 14.04/16.04) -* Python 3.6+ -* PyTorch >= 1.1 -* CUDA >= 9.0 -* [`spconv v1.0 (commit 8da6f96)`](https://github.com/traveller59/spconv/tree/8da6f967fb9a054d8870c3515b1b44eca2103634) or [`spconv v1.2`](https://github.com/traveller59/spconv) +* Linux (tested on Ubuntu 20.04/22.04) +* Python 3.9+ +* [PyTorch <= 1.10](https://pytorch.org/get-started/previous-versions/) [(Deprecation of THC.h in PyTorch >=1.11.*)](https://github.com/open-mmlab/mmdetection3d/issues/1332) +* CUDA >= 10.0 +* [`spconv v2.x`](https://github.com/traveller59/spconv) - -### Install `pcdet v0.3` -NOTE: Please re-install `pcdet v0.3` by running `python setup.py develop` even if you have already installed previous version. +## Install `coda-models` a. Clone this repository. -```shell -git clone https://github.com/CVMI-Lab/ST3D.git +```bash +git clone git@github.com:ut-amrl/coda-models.git +``` + +b. Install the dependent libraries as follows + +``` +conda env create -f environment.yml ``` -b. 
Install the dependent libraries as follows: +If the above command fails, you can also run the following commands in order: -* Install the dependent python libraries: +ROS support (optional) +``` +conda install mamba -c conda-forge +conda config --env --add channels conda-forge +conda config --env --add channels robostack-staging +conda config --env --remove channels defaults +conda install ros-noetic-desktop ``` -pip install -r requirements.txt + +Model Inference (mandatory) ``` +conda install -c "nvidia/label/cuda-11.3.0" cuda + +conda install pytorch==1.10.0 torchvision==0.11.0 torchaudio==0.10.0 cudatoolkit=11.3 -c pytorch -c conda-forge -c. Install this `pcdet` library by running the following command: -```shell +pip install spconv-cu113 + +pip install -r requirements.txt +``` + +c. Install this repository + +``` python setup.py develop ``` + +## Post Installation Steps + +Verify your installation by following any of the tutorials in the [Getting Started](./GETTING_STARTED.md) +section. \ No newline at end of file diff --git a/docs/codademo.gif b/docs/codademo.gif new file mode 100644 index 00000000..58d50ed6 Binary files /dev/null and b/docs/codademo.gif differ diff --git a/docs/demo.png b/docs/demo.png deleted file mode 100644 index 6c6e6353..00000000 Binary files a/docs/demo.png and /dev/null differ diff --git a/docs/rosdemo.png b/docs/rosdemo.png new file mode 100644 index 00000000..8618bd0f Binary files /dev/null and b/docs/rosdemo.png differ diff --git a/environment.yml b/environment.yml new file mode 100644 index 00000000..b60d611c --- /dev/null +++ b/environment.yml @@ -0,0 +1,208 @@ +name: coda11_3 +channels: + - pytorch + - defaults + - robostack-staging + - conda-forge +dependencies: + - _libgcc_mutex=0.1=conda_forge + - _openmp_mutex=4.5=2_kmp_llvm + - blas=1.0=mkl + - boltons=23.0.0=pyhd8ed1ab_0 + - brotli-python=1.1.0=py38h17151c0_0 + - bzip2=1.0.8=h7f98852_4 + - c-ares=1.19.1=hd590300_0 + - ca-certificates=2023.7.22=hbcca054_0 + - cffi=1.15.1=py38h4a40e3a_3 + - charset-normalizer=3.2.0=pyhd8ed1ab_0 + - colorama=0.4.6=pyhd8ed1ab_0 + - conda=23.7.4=py38h578d9bd_0 + - conda-package-handling=2.2.0=pyh38be061_0 + - conda-package-streaming=0.9.0=pyhd8ed1ab_0 + - cryptography=41.0.3=py38hcdda232_0 + - cudatoolkit=11.3.1=h9edb442_10 + - ffmpeg=4.3=hf484d3e_0 + - fmt=10.1.1=h00ab1b0_0 + - freetype=2.10.4=h0708190_1 + - gmp=6.2.1=h58526e2_0 + - gnutls=3.6.13=h85f3911_1 + - icu=73.2=h59595ed_0 + - idna=3.4=pyhd8ed1ab_0 + - intel-openmp=2021.4.0=h06a4308_3561 + - jbig=2.1=h7f98852_2003 + - jpeg=9e=h166bdaf_1 + - jsonpatch=1.32=pyhd8ed1ab_0 + - jsonpointer=2.4=py38h578d9bd_2 + - keyutils=1.6.1=h166bdaf_0 + - krb5=1.21.2=h659d440_0 + - lame=3.100=h7f98852_1001 + - lcms2=2.12=hddcbb42_0 + - ld_impl_linux-64=2.38=h1181459_1 + - lerc=2.2.1=h9c3ff4c_0 + - libarchive=3.6.2=h039dbb9_1 + - libcurl=8.3.0=hca28451_0 + - libdeflate=1.7=h7f98852_5 + - libedit=3.1.20191231=he28a2e2_2 + - libev=4.33=h516909a_1 + - libffi=3.4.4=h6a678d5_0 + - libgcc-ng=13.2.0=h807b86a_0 + - libiconv=1.17=h166bdaf_0 + - libmamba=1.5.1=h744094f_0 + - libmambapy=1.5.1=py38h5cd715c_0 + - libnghttp2=1.52.0=h61bc06f_0 + - libpng=1.6.37=h21135ba_2 + - libsolv=0.7.24=hfc55251_4 + - libssh2=1.11.0=h0841786_0 + - libstdcxx-ng=13.2.0=h7e041cc_0 + - libtiff=4.3.0=hf544144_1 + - libuv=1.43.0=h7f98852_0 + - libwebp-base=1.2.2=h7f98852_1 + - libxml2=2.11.5=h232c23b_1 + - libzlib=1.2.13=hd590300_5 + - llvm-openmp=16.0.6=h4dfa4b3_0 + - lz4-c=1.9.3=h9c3ff4c_1 + - lzo=2.10=h516909a_1000 + - mamba=1.5.1=py38haad2881_0 + - 
mkl=2021.4.0=h06a4308_640 + - mkl-service=2.4.0=py38h95df7f1_0 + - mkl_fft=1.3.1=py38h8666266_1 + - mkl_random=1.2.2=py38h1abd341_0 + - ncurses=6.4=h6a678d5_0 + - nettle=3.6=he412f7d_0 + - numpy-base=1.24.3=py38h31eccc5_0 + - olefile=0.46=pyh9f0ad1d_1 + - openh264=2.1.1=h780b84a_0 + - openjpeg=2.4.0=hb52868f_1 + - openssl=3.1.2=hd590300_0 + - pip=23.2.1=py38h06a4308_0 + - pluggy=1.3.0=pyhd8ed1ab_0 + - pybind11-abi=4=hd8ed1ab_3 + - pycosat=0.6.4=py38h0a891b7_1 + - pycparser=2.21=pyhd8ed1ab_0 + - pyopenssl=23.2.0=pyhd8ed1ab_1 + - pysocks=1.7.1=pyha2e5f31_6 + - python=3.8.18=h955ad1f_0 + - python_abi=3.8=2_cp38 + - pytorch=1.10.0=py3.8_cuda11.3_cudnn8.2.0_0 + - pytorch-mutex=1.0=cuda + - readline=8.2=h5eee18b_0 + - reproc=14.2.4=h0b41bf4_0 + - reproc-cpp=14.2.4=hcb278e6_0 + - requests=2.31.0=pyhd8ed1ab_0 + - ruamel.yaml=0.17.32=py38h01eb140_0 + - ruamel.yaml.clib=0.2.7=py38h1de0b5d_1 + - setuptools=68.0.0=py38h06a4308_0 + - six=1.16.0=pyh6c4a22f_0 + - sqlite=3.41.2=h5eee18b_0 + - tk=8.6.12=h1ccaba5_0 + - toolz=0.12.0=pyhd8ed1ab_0 + - torchaudio=0.10.0=py38_cu113 + - torchvision=0.11.0=py38_cu113 + - typing_extensions=4.7.1=pyha770c72_0 + - urllib3=2.0.4=pyhd8ed1ab_0 + - wheel=0.38.4=py38h06a4308_0 + - xz=5.4.2=h5eee18b_0 + - yaml-cpp=0.7.0=h27087fc_2 + - zlib=1.2.13=hd590300_5 + - zstandard=0.19.0=py38h0a891b7_0 + - zstd=1.5.5=hfc55251_0 + - pip: + - addict==2.4.0 + - ansi2html==1.8.0 + - appdirs==1.4.4 + - asttokens==2.4.0 + - attrs==23.1.0 + - backcall==0.2.0 + - ccimport==0.4.2 + - certifi==2023.7.22 + - click==8.1.7 + - comm==0.1.4 + - configargparse==1.7 + - contourpy==1.1.0 + - cumm-cu113==0.4.11 + - cycler==0.11.0 + - dash==2.13.0 + - dash-core-components==2.0.0 + - dash-html-components==2.0.0 + - dash-table==5.0.0 + - decorator==5.1.1 + - docker-pycreds==0.4.0 + - easydict==1.10 + - executing==1.2.0 + - fastjsonschema==2.18.0 + - fire==0.5.0 + - flask==2.2.5 + - fonttools==4.42.1 + - gitdb==4.0.10 + - gitpython==3.1.36 + - imageio==2.31.3 + - importlib-metadata==6.8.0 + - importlib-resources==6.0.1 + - ipython==8.12.2 + - ipywidgets==8.1.1 + - itsdangerous==2.1.2 + - jedi==0.19.0 + - jinja2==3.1.2 + - joblib==1.3.2 + - jsonschema==4.19.0 + - jsonschema-specifications==2023.7.1 + - jupyter-core==5.3.1 + - jupyterlab-widgets==3.0.9 + - kiwisolver==1.4.5 + - lark==1.1.7 + - llvmlite==0.41.0rc1 + - matplotlib==3.7.3 + - matplotlib-inline==0.1.6 + - nbformat==5.7.0 + - nest-asyncio==1.5.7 + - networkx==3.1 + - ninja==1.11.1 + - numba==0.58.0rc2 + - numpy==1.24.4 + - open3d==0.17.0 + - packaging==23.1 + - pandas==2.0.3 + - parso==0.8.3 + - pathtools==0.1.2 + - pccm==0.4.8 + - pexpect==4.8.0 + - pickleshare==0.7.5 + - pillow==10.0.0 + - pkgutil-resolve-name==1.3.10 + - platformdirs==3.10.0 + - portalocker==2.7.0 + - prompt-toolkit==3.0.39 + - protobuf==4.24.3 + - ptyprocess==0.7.0 + - pure-eval==0.2.2 + - pybind11==2.11.1 + - pygments==2.16.1 + - pyparsing==3.1.1 + - pyquaternion==0.9.9 + - python-dateutil==2.8.2 + - pytz==2023.3.post1 + - pywavelets==1.4.1 + - pyyaml==6.0.1 + - referencing==0.30.2 + - retrying==1.3.4 + - rpds-py==0.10.3 + - scikit-learn==1.3.0 + - scipy==1.10.1 + - sentry-sdk==1.31.0 + - setproctitle==1.3.2 + - sharedarray==3.2.3 + - smmap==5.0.0 + - spconv-cu113==2.3.6 + - stack-data==0.6.2 + - tensorboardx==2.6.2.2 + - threadpoolctl==3.2.0 + - tifffile==2023.8.30 + - tqdm==4.66.1 + - traitlets==5.9.0 + - tzdata==2023.3 + - wandb==0.15.10 + - wcwidth==0.2.6 + - werkzeug==2.2.3 + - widgetsnbextension==4.0.9 + - zipp==3.16.2 +prefix: ~/miniconda3/envs/coda11_3 diff --git 
a/pcdet/version.py b/pcdet/version.py index 8b21d291..85e72417 100644 --- a/pcdet/version.py +++ b/pcdet/version.py @@ -1 +1 @@ -__version__ = "0.5.0+9c9ba39" +__version__ = "0.5.0+ddae3e5" diff --git a/tools/create_data.py b/tools/create_data.py new file mode 100644 index 00000000..62224bf9 --- /dev/null +++ b/tools/create_data.py @@ -0,0 +1,74 @@ +# Copyright (c) OpenMMLab. All rights reserved. +import argparse +from os import path as osp + +from tools.data_converter import coda_converter as coda_converter + +def coda_data_prep(root_path, + info_prefix, + out_dir, + workers, + resolution): + """Prepare the info file for CODa dataset. + + Args: + root_path (str): Path of dataset root. + info_prefix (str): The prefix of info filenames. + out_dir (str): Output directory of the generated info file. + workers (int): Number of threads to be used. + """ + + from tools.data_converter import coda_converter as coda + + ### TODO: Set these manually + channels=32 + load_dir = osp.join(root_path, 'coda_original_format') + save_dir = osp.join(out_dir, f'coda{channels}_allclass_full') + ### + + splits = ['training', 'testing', 'validation'] + for split in splits: + converter = coda.CODa2KITTI( + load_dir, + save_dir, + workers=workers, + split=split, + channels=channels + ) + converter.convert() + print("length dataset ", len(converter)) + +parser = argparse.ArgumentParser(description='Data converter arg parser') +parser.add_argument('dataset', metavar='coda', help='name of the dataset') +parser.add_argument( + '--root-path', + type=str, + default='./data', + help='specify the root path of dataset. Defaults to ./data') +parser.add_argument( + '--version', + type=str, + default='v1.0', + required=False, + help='specify the dataset version, no need for coda') +parser.add_argument( + '--out-dir', + type=str, + default='./data', + required=False, + help='output directory of kitti formatted dataset. Defaults to ./data') +parser.add_argument('--extra-tag', type=str, default='coda') +parser.add_argument('--resolution', type=int, default=32) +parser.add_argument( + '--workers', type=int, default=4, help='number of threads to be used') +args = parser.parse_args() + +if __name__ == '__main__': + if args.dataset == 'coda': + coda_data_prep( + root_path=args.root_path, + info_prefix=args.extra_tag, + out_dir=args.out_dir, + workers=args.workers, + resolution=args.resolution + ) diff --git a/tools/data_converter/coda_converter.py b/tools/data_converter/coda_converter.py new file mode 100644 index 00000000..8e19bcbf --- /dev/null +++ b/tools/data_converter/coda_converter.py @@ -0,0 +1,636 @@ +# Copyright (c) AMRL. All rights reserved. +r"""Adapted from `Waymo to KITTI converter + `_. +""" + +from glob import glob +import os +import json +from os.path import join, isfile +import shutil + +import yaml +import numpy as np +from scipy.spatial.transform import Rotation as R + +from PIL import Image + +import tqdm +from multiprocessing import Pool + +class CODa2KITTI(object): + """CODa to KITTI converter. + This class serves as the converter to change the CODa raw data to KITTI + format. + Args: + load_dir (str): Directory to load CODa raw data. + save_dir (str): Directory to save data in KITTI format. + prefix (str): Prefix of filename. In general, 0 for training, 1 for + validation and 2 for testing. + workers (int, optional): Number of workers for the parallel process. + test_mode (bool, optional): Whether in the test_mode. Default: False. 
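+        split (str, optional): Dataset split to convert, one of 'training',
+            'validation' or 'testing'. Default: 'training'.
+        channels (int, optional): Number of vertical lidar channels to keep when
+            downsampling the point clouds. Default: 128.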
+ """ + + def __init__(self, + load_dir, + save_dir, + workers=64, + split="training", + test_mode=False, + channels=128): + self.filter_empty_3dboxes = True + self.filter_no_label_zone_points = True + + self.lidar_list = [ + 'os1' + ] + self.type_list = [ + 'CAR', + 'PEDESTRIAN', + 'BIKE', + 'MOTORCYCLE', + 'SCOOTER', + 'TREE', + 'TRAFFIC SIGN', + 'CANOPY', + 'TRAFFIC LIGHT', + 'BIKE RACK', + 'BOLLARD', + 'PARKING KIOSK', + 'MAILBOX', + 'FIRE HYDRANT', + 'FREESTANDING PLANT', + 'POLE', + 'INFORMATIONAL SIGN', + 'DOOR', + 'FENCE', + 'RAILING', + 'CONE', + 'CHAIR', + 'BENCH', + 'TABLE', + 'TRASH CAN', + 'NEWSPAPER DISPENSER', + 'ROOM LABEL', + 'STANCHION', + 'SANITIZER DISPENSER', + 'CONDIMENT DISPENSER', + 'VENDING MACHINE', + 'EMERGENCY AID KIT', + 'FIRE EXTINGUISHER', + 'COMPUTER', + 'OTHER', + 'HORSE', + 'PICKUP TRUCK', + 'DELIVERY TRUCK', + 'SERVICE VEHICLE', + 'UTILITY VEHICLE', + 'FIRE ALARM', + 'ATM', + 'CART', + 'COUCH', + 'TRAFFIC ARM', + 'WALL SIGN', + 'FLOOR SIGN', + 'DOOR SWITCH', + 'EMERGENCY PHONE', + 'DUMPSTER', + 'SEGWAY', + 'BUS', + 'SKATEBOARD', + 'WATER FOUNTAIN' + # Classes below have not been annotated + # 'GOLF CART' + # 'TRUCK' + # 'CONSTRUCTION BARRIER' + # 'TELEVISION', + # 'VACUUM CLEANER', + ] + self.coda_to_kitti_class_map = { + # Full Class List + 'CAR': 'Car', + 'PEDESTRIAN': 'Pedestrian', + 'BIKE': 'Cyclist', + 'MOTORCYCLE': 'Motorcycle', + 'SCOOTER': 'Scooter', + 'TREE': 'Tree', + 'TRAFFIC SIGN': 'TrafficSign', + 'CANOPY': 'Canopy', + 'TRAFFIC LIGHT': 'TrafficLight', + 'BIKE RACK': 'BikeRack', + 'BOLLARD': 'Bollard', + 'CONSTRUCTION BARRIER': 'ConstructionBarrier', + 'PARKING KIOSK': 'ParkingKiosk', + 'MAILBOX': 'Mailbox', + 'FIRE HYDRANT': 'FireHydrant', + 'FREESTANDING PLANT': 'FreestandingPlant', + 'POLE': 'Pole', + 'INFORMATIONAL SIGN': 'InformationalSign', + 'DOOR': 'Door', + 'FENCE': 'Fence', + 'RAILING': 'Railing', + 'CONE': 'Cone', + 'CHAIR': 'Chair', + 'BENCH': 'Bench', + 'TABLE': 'Table', + 'TRASH CAN': 'TrashCan', + 'NEWSPAPER DISPENSER': 'NewspaperDispenser', + 'ROOM LABEL': 'RoomLabel', + 'STANCHION': 'Stanchion', + 'SANITIZER DISPENSER': 'SanitizerDispenser', + 'CONDIMENT DISPENSER': 'CondimentDispenser', + 'VENDING MACHINE': 'VendingMachine', + 'EMERGENCY AID KIT': 'EmergencyAidKit', + 'FIRE EXTINGUISHER': 'FireExtinguisher', + 'COMPUTER': 'Computer', + 'TELEVISION': 'Television', + 'OTHER': 'Other', + 'HORSE': 'Other', + 'PICKUP TRUCK': 'PickupTruck', + 'DELIVERY TRUCK': 'DeliveryTruck', + 'SERVICE VEHICLE': 'ServiceVehicle', + 'UTILITY VEHICLE': 'UtilityVehicle', + 'FIRE ALARM': 'FireAlarm', + 'ATM': 'ATM', + 'CART': 'Cart', + 'COUCH': 'Couch', + 'TRAFFIC ARM': 'TrafficArm', + 'WALL SIGN': 'WallSign', + 'FLOOR SIGN': 'FloorSign', + 'DOOR SWITCH': 'DoorSwitch', + 'EMERGENCY PHONE': 'EmergencyPhone', + 'DUMPSTER': 'Dumpster', + 'VACUUM CLEANER': 'VacuumCleaner', + 'SEGWAY': 'Segway', + 'BUS': 'Bus', + 'SKATEBOARD': 'Skateboard', + 'WATER FOUNTAIN': 'WaterFountain' + } + #MAP Classes not found in KITTI to DontCare + for class_type in self.type_list: + class_name = class_type.upper() + if class_name not in self.coda_to_kitti_class_map.keys(): + self.coda_to_kitti_class_map[class_name] = 'DontCare' + + self.coda_to_kitti_occlusion = { + "None": 0, + "unknown": 0, + "Unknown": 0, + "Light": 1, + "Medium": 1, + "Heavy": 2, + "Full": 2 + } + + self.load_dir = load_dir + if split=="validation" or split=="training": + self.save_dir = join(save_dir, "training") + else: + self.save_dir = join(save_dir, split) + self.workers = int(workers) + self.split = 
split + self.test_mode = test_mode + + self.label_save_dir = f'{self.save_dir}/label_' + self.label_all_save_dir = f'{self.save_dir}/label_all' + self.image_save_dir = f'{self.save_dir}/image_' + self.calib_save_dir = f'{self.save_dir}/calib' + self.point_cloud_save_dir = f'{self.save_dir}/velodyne' + self.pose_save_dir = f'{self.save_dir}/pose' + self.timestamp_save_dir = f'{self.save_dir}/timestamp' + self.imageset_save_dir = f'{save_dir}/ImageSets' + + self.bbox_label_files = [] + self.image_files = [] # Store cam0 as paths + self.lidar_files = [] + + self.cam_ids = [0, 1] + self.sens_name_to_id = { + 'cam0': 0, + 'cam1': 1, + 'os1': 2 + } + + # Used to downsample lidar vertical channels + self.channels = channels + + self.process_metadata() + self.create_folder() + self.create_imagesets() + + def process_metadata(self): + metadata_path = join(self.load_dir, "metadata") + assert os.path.exists(metadata_path), "Metadata directory %s does not exist" % metadata_path + + metadata_files = glob("%s/*.json" % metadata_path) + metadata_files = sorted(metadata_files, key=lambda fname: int(fname.split('/')[-1].split('.')[0]) ) + + for mfile in metadata_files: + assert os.path.isfile(mfile), '%s does not exist' % mfile + meta_json = json.load(open(mfile, "r")) + + label_list = meta_json["ObjectTracking"][self.split] + self.bbox_label_files.extend(label_list) + + lidar_list = [label_path.replace('3d_label', '3d_raw').replace('.json', '.bin') + for label_path in label_list] + self.lidar_files.extend(lidar_list) + + image_list = [label_path.replace('3d_label', '2d_rect') + .replace('os1', 'cam0').replace('.json', '.png') for label_path in label_list] + self.image_files.extend(image_list) + + def create_imagesets(self): + if self.split=="testing": + imageset_file = "test.txt" + elif self.split=="training": + imageset_file = "train.txt" + elif self.split=="validation": + imageset_file = "val.txt" + + imageset_path = join(self.imageset_save_dir, imageset_file) + imageset_fp = open(imageset_path, 'w+') + + for lidar_path in self.lidar_files: + lidar_file = lidar_path.split('/')[-1] + _, _, traj, frame_idx = self.get_filename_info(lidar_file) + frame_name = f'{str(traj).zfill(2)}{str(frame_idx).zfill(5)}' + imageset_fp.write(frame_name+'\n') + + imageset_fp.close() + + def convert(self): + """Convert action.""" + print('Start converting ...') + + + pool = Pool(processes=self.workers) + + file_list = list(range(len(self))) + for _ in tqdm.tqdm(pool.imap_unordered(self.convert_one, [(self, i) for i in file_list]), total=len(file_list)): + pass + print('\nFinished ...') + + @staticmethod + def get_filename_info(filename): + filename_prefix = filename.split('.')[0] + filename_prefix = filename_prefix.split('_') + + modality = filename_prefix[0]+"_"+filename_prefix[1] + sensor_name = filename_prefix[2] + trajectory = filename_prefix[3] + frame = filename_prefix[4] + return (modality, sensor_name, trajectory, frame) + + @staticmethod + def set_filename_by_prefix(modality, sensor_name, trajectory, frame): + if "2d_rect"==modality: + filetype = "jpg" # change to jpg later + elif "2d_bbox"==modality: + filetype = "txt" + elif "3d_raw"==modality: + filetype = "bin" + elif "3d_bbox"==modality: + filetype = "json" + sensor_filename = "%s_%s_%s_%s.%s" % ( + modality, + sensor_name, + trajectory, + frame, + filetype + ) + return sensor_filename + + @staticmethod + def get_calibration_info(filepath): + filename = filepath.split('/')[-1] + filename_prefix = filename.split('.')[0] + filename_split = 
filename_prefix.split('_') + + calibration_info = None + src, tar = filename_split[1], filename_split[-1] + if len(filename_split) > 3: + #Sensor to Sensor transform + extrinsic = yaml.safe_load(open(filepath, 'r')) + calibration_info = extrinsic + else: + #Intrinsic transform + intrinsic = yaml.safe_load(open(filepath, 'r')) + calibration_info = intrinsic + + return calibration_info, src, tar + + def load_calibrations(self, outdir, trajectory): + calibrations_path = os.path.join(outdir, "calibrations", str(trajectory)) + calibration_fps = [os.path.join(calibrations_path, file) for file in os.listdir(calibrations_path) if file.endswith(".yaml")] + + calibrations = {} + for calibration_fp in calibration_fps: + cal, src, tar = self.get_calibration_info(calibration_fp) + cal_id = "%s_%s"%(src, tar) + + if cal_id not in calibrations.keys(): + calibrations[cal_id] = {} + + calibrations[cal_id].update(cal) + + return calibrations + + def convert_one(self, args): + """Convert action for single file. + Args: + file_idx (int): Index of the file to be converted. + """ + _, file_idx = args + relpath = self.bbox_label_files[file_idx] + filename = relpath.split('/')[-1] + fullpath = join(self.load_dir, relpath) + _, _, traj, frame_idx = self.get_filename_info(filename) + + for cam_id in self.cam_ids: + cam = "cam%i" % cam_id + img_file = self.set_filename_by_prefix("2d_rect", cam, traj, frame_idx) #change to rect later + img_path = join(self.load_dir, '2d_rect', cam, str(traj), img_file) + + cam_id = cam[-1] + self.save_image(traj, img_path, cam_id, frame_idx, file_idx) + if not self.test_mode: + self.save_label(traj, cam_id, frame_idx, file_idx) + + calibrations_path = os.path.join(self.load_dir, "calibrations", str(traj)) + self.save_calib(traj, frame_idx, file_idx) + + self.save_lidar(traj, frame_idx, file_idx, self.channels) + + self.save_pose(traj, frame_idx, file_idx) + self.save_timestamp(traj, frame_idx, file_idx) + + def __len__(self): + """Length of the filename list.""" + return len(self.bbox_label_files) + + def save_image(self, traj, src_img_path, cam_id, frame_idx, file_idx): + """Parse and save the images in jpg format. Jpg is the original format + used by Waymo Open dataset. Saving in png format will cause huge (~3x) + unnesssary storage waste. + + Assumes images are rectified + Args: + frame_path (str): Absolute filepath to image file + file_idx (int): Current file index. + frame_idx (int): Current frame index. + """ + assert isfile(src_img_path), "Image file does not exist: %s" % src_img_path + kitti_img_path = f'{self.image_save_dir}{str(cam_id)}/' + \ + f'{str(traj).zfill(2)}{str(frame_idx).zfill(5)}.jpg' + shutil.copyfile(src_img_path, kitti_img_path) + + def save_calib(self, traj, frame_idx, file_idx): + """Parse and save the calibration data. + Args: + calib_path (str): Filepath to calibration file + traj (int): Current trajectory index. 
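+            frame_idx (str): Current frame index.
+            file_idx (int): Index of the file being converted.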
+ """ + calibrations = self.load_calibrations(self.load_dir, traj) # TODO figure out import + + # Save transform lidar to cameras + Tr_os1_to_cam0 = np.array(calibrations['os1_cam0']['extrinsic_matrix']['data']).reshape(4,4) + + R_cam0_to_cam1 = np.array(calibrations['cam0_cam1']['extrinsic_matrix']['R']['data']).reshape(3,3) + T_cam0_to_cam1 = np.array(calibrations['cam0_cam1']['extrinsic_matrix']['T']).reshape(3, 1) + Tr_cam0_to_cam1 = np.eye(4) + Tr_cam0_to_cam1[:3, :] = np.hstack((R_cam0_to_cam1, T_cam0_to_cam1)) + + Tr_os1_to_cam1 = Tr_cam0_to_cam1 @ Tr_os1_to_cam0 + Tr_os1_to_cams_np = [ + Tr_os1_to_cam0[:3, :].reshape((12, )), Tr_os1_to_cam1[:3, :].reshape((12, )) + ] + + camera_calibs = [] + Tr_os1_to_cams = [] + calib_context = '' + for cam_id in self.cam_ids: + # Save projection matrix for cameras + cam = "cam%i" % cam_id + camera_calib = calibrations['%s_intrinsics'%cam]['projection_matrix']['data'] + camera_calib = [f'{i:e}' for i in camera_calib] + camera_calibs.append(camera_calib) + + Tr_os1_to_cams.append([f'{i:e}' for i in Tr_os1_to_cams_np[cam_id]]) + + # Save rectification matrix for coplanar just identity (since images are rectified) + R0_rect = [f'{i:e}' for i in np.eye(3).flatten()] + + for cam_id in self.cam_ids: + calib_context += 'P' + str(cam_id) + ': ' + \ + ' '.join(camera_calibs[cam_id]) + '\n' + calib_context += 'R0_rect' + ': ' + ' '.join(R0_rect) + '\n' + for cam_id in self.cam_ids: + calib_context += 'Tr_velo_to_cam_' + str(cam_id) + ': ' + \ + ' '.join(Tr_os1_to_cams[cam_id]) + '\n' + + with open( + f'{self.calib_save_dir}/' + + f'{str(traj).zfill(2)}{str(frame_idx).zfill(5)}.txt', + 'w+') as fp_calib: + fp_calib.write(calib_context) + fp_calib.close() + + def save_lidar(self, traj, frame_idx, file_idx, channels=128): + """Parse and save the lidar data in psd format. + Args: + traj (int): Current trajectory index. + frame_idx (int): Current frame index. + """ + bin_file = self.set_filename_by_prefix("3d_raw", "os1", traj, frame_idx) + bin_path = join(self.load_dir, "3d_raw", "os1", traj, bin_file) + assert isfile(bin_path), "Bin file for traj %s frame %s does not exist: %s" % (traj, frame_idx, bin_path) + point_cloud = np.fromfile(bin_path, dtype=np.float32).reshape(-1, 4) + + point_cloud = self.downsample_lidar(point_cloud, channels) + + pc_path = f'{self.point_cloud_save_dir}/' + \ + f'{str(traj).zfill(2)}{str(frame_idx).zfill(5)}.bin' + + point_cloud.astype(np.float32).tofile(pc_path) + + def downsample_lidar(self, pc, channels): + # Downsamples by selecting vertical channels with different step sizes + vert_ds = 128 // int(channels) + pc = pc[:, :4].reshape(128, 1024, -1) + ds_pc = pc[np.arange(0, 128, vert_ds), :, :] + return ds_pc.reshape(-1, 4) + + def save_label(self, traj, cam_id, frame_idx, file_idx): + """Parse and save the label data in txt format. + The relation between coda and kitti coordinates is noteworthy: + 1. x, y, z correspond to l, w, h (coda) -> l, h, w (kitti) + 2. x-y-z: front-left-up (coda) -> right-down-front(kitti) + 3. bbox origin at volumetric center (coda) -> bottom center (kitti) + 4. rotation: +x around y-axis (kitti) -> +x around z-axis (coda) + Args: + traj (str): Current trajectory index. + frame_idx (str): Current frame index. 
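+            cam_id: Camera index (0 or 1) selecting which per-camera label file is written.
+            file_idx (int): Index of the file being converted.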
+ """ + anno_file = self.set_filename_by_prefix("3d_bbox", "os1", traj, frame_idx) + anno_path = join(self.load_dir, "3d_bbox", "os1", traj, anno_file) + anno_dict = json.load(open(anno_path)) + + twod_anno_file = self.set_filename_by_prefix("2d_bbox", "cam0", traj, frame_idx) + twod_anno_path = join(self.load_dir, "2d_bbox", "cam0", traj, twod_anno_file) + + twod_anno_dict = np.loadtxt(twod_anno_path, dtype=int).reshape(-1, 6) + + calibrations = self.load_calibrations(self.load_dir, traj) # TODO figure out import + # Save transform lidar to cameras + Tr_os1_to_camx = np.array(calibrations['os1_cam0']['extrinsic_matrix']['data']).reshape(4,4) + + if cam_id==1: + R_cam0_to_cam1 = np.array(calibrations['cam0_cam1']['extrinsic_matrix']['R']['data']).reshape(3,3) + T_cam0_to_cam1 = np.array(calibrations['cam0_cam1']['extrinsic_matrix']['T']).reshape(3, 1) + Tr_cam0_to_cam1 = np.eye(4) + Tr_cam0_to_cam1[:3, :] = np.hstack((R_cam0_to_cam1, T_cam0_to_cam1)) + + Tr_os1_to_camx = Tr_cam0_to_cam1 @ Tr_os1_to_camx + + fp_label_all = open( + f'{self.label_all_save_dir}/' + + f'{str(traj).zfill(2)}{str(frame_idx).zfill(5)}.txt', 'w+') + id_to_bbox = dict() + id_to_name = dict() + + #Reset fp_label file if it exists + fp_label = open( + f'{self.label_save_dir}{cam_id}/' + + f'{str(traj).zfill(2)}{str(frame_idx).zfill(5)}.txt', 'w') + fp_label.close() + for anno_idx, tred_anno in enumerate(anno_dict["3dbbox"]): + bounding_box = None + name = tred_anno['classId'].upper() + if name not in self.type_list: + # print("Unknown class %s found, skipping..."%name) + continue + + class_idx = self.type_list.index(name) + + if name not in self.coda_to_kitti_class_map: + # print("Class %s not mapped to KITTI, skipping..."%name) + continue + my_type = self.coda_to_kitti_class_map[name] + + #TODO: Add in bbox filtering by number of points + bounding_box = twod_anno_dict[anno_idx, :] + + height = tred_anno['h'] + length = tred_anno['l'] + width = tred_anno['w'] + + x = tred_anno['cX'] + y = tred_anno['cY'] + z = tred_anno['cZ'] - height / 2 + + # not available + truncated = 0 + alpha = -10 + coda_occluded = tred_anno['labelAttributes']['isOccluded'] + occluded = self.coda_to_kitti_occlusion[coda_occluded] + + pt_ref = Tr_os1_to_camx @ \ + np.array([x, y, z, 1]).reshape((4, 1)) + x, y, z, _ = pt_ref.flatten().tolist() + + rotation_y = -tred_anno['y'] - np.pi / 2 # Convert to correct direction and limit range + track_id = tred_anno['instanceId'] + + line = my_type + \ + ' {} {} {} {} {} {} {} {} {} {} {} {} {} {}\n'.format( + round(truncated, 2), occluded, round(alpha, 2), + round(bounding_box[0], 2), round(bounding_box[1], 2), + round(bounding_box[2], 2), round(bounding_box[3], 2), + round(height, 2), round(width, 2), round(length, 2), + round(x, 2), round(y, 2), round(z, 2), + round(rotation_y, 2)) + + line_all = line[:-1] + '\n' + + fp_label = open( + f'{self.label_save_dir}{cam_id}/' + + f'{str(traj).zfill(2)}{str(frame_idx).zfill(5)}.txt', 'a') + fp_label.write(line) + fp_label.close() + + fp_label_all.write(line_all) + + fp_label_all.close() + + def save_pose(self, traj, frame_idx, file_idx): + """Parse and save the pose data. + Note that SDC's own pose is not included in the regular training + of KITTI dataset. KITTI raw dataset contains ego motion files + but are not often used. Pose is important for algorithms that + take advantage of the temporal information. + Args: + traj (str): Current trajectory. + frame_idx (str): Current frame index. 
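+            file_idx (int): Index of the file being converted.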
+ """ + pose_dir = join(self.load_dir, "poses", "dense") + pose_path = join(pose_dir, "%s.txt"%traj) + assert isfile(pose_path), "Pose file for traj %s frame %s does not exist: %s" % (traj, frame_idx, pose_path) + pose_np = np.loadtxt(pose_path, skiprows=int(frame_idx), max_rows=1) + + pose_T = pose_np[1:4].reshape(3, -1) + pose_quat_xyzw = np.append(pose_np[5:8], pose_np[4]) + pose_R = R.from_quat(pose_quat_xyzw).as_matrix() + + pose_kitti = np.eye(4) + pose_kitti[:3, :] = np.hstack((pose_R, pose_T)) + + np.savetxt( + join(f'{self.pose_save_dir}/' + + f'{str(traj).zfill(2)}{str(frame_idx).zfill(5)}.txt'), + pose_kitti) + + def save_timestamp(self, traj, frame_idx, file_idx): + """Save the timestamp data in a separate file instead of the + pointcloud. + Note that SDC's own pose is not included in the regular training + of KITTI dataset. KITTI raw dataset contains ego motion files + but are not often used. Pose is important for algorithms that + take advantage of the temporal information. + Args: + traj (str): Current trajectory + frame_idx (str): Current frame index. + """ + ts_dir = join(self.load_dir, "timestamps") + frame_to_ts_file = "%s.txt" % traj + frame_to_ts_path = join(ts_dir, frame_to_ts_file) + ts_s_np = np.loadtxt(frame_to_ts_path, skiprows=int(frame_idx), max_rows=1) + ts_us_np = int(ts_s_np * 1e6) + + with open( + join(f'{self.timestamp_save_dir}/' + + f'{str(traj).zfill(2)}{str(frame_idx).zfill(5)}.txt'), + 'w') as f: + f.write(str(ts_us_np)) + + def create_folder(self): + """Create folder for data preprocessing.""" + if not self.test_mode: + dir_list1 = [ + self.label_all_save_dir, self.calib_save_dir, + self.point_cloud_save_dir, self.pose_save_dir, + self.timestamp_save_dir, self.imageset_save_dir + ] + dir_list2 = [self.label_save_dir, self.image_save_dir] + else: + dir_list1 = [ + self.calib_save_dir, self.point_cloud_save_dir, + self.pose_save_dir, self.timestamp_save_dir, self.imageset_save_dir + ] + dir_list2 = [self.image_save_dir] + + for d in dir_list1: + os.makedirs(d, exist_ok=True) + for d in dir_list2: + for i in range(2): + os.makedirs(f'{d}{str(i)}', exist_ok=True) diff --git a/tools/demo.py b/tools/demo.py index 49c1d491..ddd9078a 100644 --- a/tools/demo.py +++ b/tools/demo.py @@ -90,11 +90,11 @@ def normalize_color(color): def parse_config(): parser = argparse.ArgumentParser(description='arg parser') - parser.add_argument('--cfg_file', type=str, default='cfgs/da-coda-coda_models/waymocenterhead/pvrcnn_allclass128full_finetune_headfull.yaml', + parser.add_argument('--cfg_file', type=str, default='cfgs/da-coda-coda_models/waymocenterhead/pvrcnn_allclass32full_finetune_headfull.yaml', help='specify the config for demo') - parser.add_argument('--data_path', type=str, default='../data/coda128_allclass_full', + parser.add_argument('--data_path', type=str, default='../data/coda32_allclass_full', help='specify the point cloud data file or directory') - parser.add_argument('--ckpt', type=str, default=None, help='specify the pretrained model') + parser.add_argument('--ckpt', type=str, default='../ckpts/waymocenterhead32/checkpoint_epoch_30.pth', help='specify the pretrained model') parser.add_argument('--show_gt', type=str, default=False, help='Visualize ground truth annotations') parser.add_argument('--show_preds', type=str, default=True, help='Predict and visualize bounding boxes') parser.add_argument('--dataset_name', type=str, default="coda", help='Dataloader to use for demo Options: [coda, jrdb, demo]') diff --git a/tools/ros_demo.py b/tools/ros_demo.py index 
954311c7..ce0627e7 100644 --- a/tools/ros_demo.py +++ b/tools/ros_demo.py @@ -41,9 +41,9 @@ def parse_config(): parser = argparse.ArgumentParser(description='arg parser') parser.add_argument('--cfg_file', type=str, default='cfgs/da-coda-coda_models/waymocenterhead/pvrcnn_allclass32full_finetune_headfull.yaml', help='specify the config for demo') - parser.add_argument('-pc','--point_cloud_topic', type=str, default='/coda/ouster/points', + parser.add_argument('--point_cloud_topic', type=str, default='/coda/ouster/points', help='specify the point cloud ros topic name') - parser.add_argument('--ckpt', type=str, default=None, help='specify the pretrained model') + parser.add_argument('--ckpt', type=str, default='../ckpts/waymocenterhead32/checkpoint_epoch_30.pth', help='specify the pretrained model') args = parser.parse_args() diff --git a/tools/scripts/ls6_slurn_train.bash b/tools/scripts/ls6_slurn_train.bash index 49648f9d..67c143a2 100644 --- a/tools/scripts/ls6_slurn_train.bash +++ b/tools/scripts/ls6_slurn_train.bash @@ -116,8 +116,6 @@ export WANDB_API_KEY=dfd81f8955f7587d12b13da5256e56f80a89c014 # export CONFIG_FILE2=cfgs/da-kitti-coda_models/pvrcnn/pvrcnn_old_anchor.yaml # export EXTRA_TAG2=kitti_oracle - - # Launch CODa to AV dataset models # CODa32 nuscenes oracle