We adopt the same data preparation protocol as BGNN for the Visual Genome and OpenImages datasets.

The following is adapted from BGNN, which follows the protocol of Unbiased Scene Graph Generation from Biased Training. You can download the annotations directly with the following steps.
- Download the VG images part1 and part2. Extract these images to `/path/to/vg/VG_100K`.
- Download the scene graph annotations and extract them to `/path/to/vg/vg_motif_anno`.
- Link the images into the project folder:

  ```bash
  ln -s /path/to/vg datasets/vg
  ```
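For reference, a minimal end-to-end sketch of these steps is below. The image archive URLs are the standard Visual Genome download links and may change; adjust the paths to your setup.

```bash
# Sketch: download, extract, and link Visual Genome (assumes the standard
# image archives; the annotation archive comes from the link above).
mkdir -p /path/to/vg/VG_100K /path/to/vg/vg_motif_anno
wget https://cs.stanford.edu/people/rak248/VG_100K/images.zip
wget https://cs.stanford.edu/people/rak248/VG_100K_2/images2.zip
unzip -j images.zip  -d /path/to/vg/VG_100K   # -j drops the archives' inner folders
unzip -j images2.zip -d /path/to/vg/VG_100K
# Extract the scene graph annotations into /path/to/vg/vg_motif_anno, then:
ln -s /path/to/vg datasets/vg
```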
We adopt the OpenImages dataset from BGNN.
- The original annotation files (`oidv6/v4-train/test/validation-annotations-vrd.csv`) can be downloaded from the official website.
- OpenImages is a very large dataset, but most of its images have no relationship annotations. We therefore filter out the images without relationship annotations and obtain a subset of the dataset (see the processing `.ipynb`; a rough sketch of the filtering step is shown after this list).
- You can download the processed dataset: OpenImages V6 (38 GB), OpenImages V4 (28 GB).
- After unzipping the downloaded datasets, the dataset directory contains the `images` and `annotations` folders. Link the `open_imagev4` and `open_image_v6` directories into `datasets/openimages`, and you are ready to go:

  ```bash
  mkdir datasets/openimages
  ln -s /path/to/open_imagev6 datasets/openimages
  ln -s /path/to/open_imagev4 datasets/openimages
  ```
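The released notebook does the actual preprocessing; a rough shell equivalent of the "keep only images that have relationships" step, assuming the official VRD CSV schema (`ImageID` in the first column), looks like this:

```bash
# Collect the IDs of training images that have at least one relationship
# annotation (the csv has a header row, hence `tail -n +2`).
cut -d, -f1 oidv6-train-annotations-vrd.csv | tail -n +2 | sort -u > train_ids_with_rel.txt
wc -l train_ids_with_rel.txt  # number of images kept for the training subset

# Sanity-check the linked layout:
ls datasets/openimages/open_image_v6/images datasets/openimages/open_image_v6/annotations
```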
Here we give instructions for downloading and preprocessing the dataset. You can download the images and annotations directly with the following steps.
- Download the VG images part1 and part2. Extract these images to `path_to/vg/VG_100k_images`.
- Download the scene graph annotations and extract them to `path_to/vg/vg_motif_anno`. Here, `path_to` is where you extract the dataset (the dataset is about 27 GB, so choose a directory with enough space).
- Link the data folder to the project folder using the following command:

  ```bash
  ln -s path_to/vg datasets/vg
  ```
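A couple of quick sanity checks after linking (a sketch; the exact annotation file names depend on the motif annotation release you downloaded):

```bash
ls datasets/vg/vg_motif_anno           # the scene graph annotation files should be listed
ls datasets/vg/VG_100k_images | wc -l  # roughly 108k images once part1 and part2 are merged
```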
- Change the directory to `datasets` using the following command:

  ```bash
  cd datasets/
  ```

- Create the `s1_data` directory using the following command:

  ```bash
  mkdir s1_data
  ```

- Create the Learning Scenario S1 dataset using the following command:

  ```bash
  python data_generation_s1.py --file_path "vg/vg_motif_anno/"
  ```
- Change the directory to `datasets` using the following command:

  ```bash
  cd datasets/
  ```

- Create the `s2_data` directory using the following command:

  ```bash
  mkdir s2_data
  ```

- Create the Learning Scenario S2 dataset using the following command:

  ```bash
  python data_generation_s2.py --file_path "vg/vg_motif_anno/"
  ```
- Change the directory to `datasets` using the following command:

  ```bash
  cd datasets/
  ```

- Create the `s3_data` directory using the following command:

  ```bash
  mkdir s3_data
  ```

- Create the Learning Scenario S3 dataset using the following command:

  ```bash
  python data_generation_s3.py --file_path "vg/vg_motif_anno/"
  ```
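Since the three scenarios follow the same pattern, the steps above can also be run in one pass:

```bash
cd datasets/
for s in s1 s2 s3; do
    mkdir -p "${s}_data"
    python "data_generation_${s}.py" --file_path "vg/vg_motif_anno/"
done
```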
Register your implemented Dataset and Evaluator by editing `cvpods/data/datasets/paths_route.py`:
```python
_PREDEFINED_SPLITS_VG_STANFORD_SGDET = {
    "dataset_type": "VGStanfordDataset",  # Visual Genome Stanford split
    "evaluator_type": {
        "vgs": "vg_sgg",
    },
    "vgs": {
        # the former is the image directory path, the latter is the annotation directory path
        "vgs_train": ("vg/VG_100k_images", "vg/vg_motif_anno"),
        "vgs_val": ("vg/VG_100k_images", "vg/vg_motif_anno"),
        "vgs_test": ("vg/VG_100k_images", "vg/vg_motif_anno"),
    },
}
```
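To confirm the registration is picked up, you can import the table directly (a quick optional check, assuming `cvpods` is importable from your environment):

```bash
python -c "from cvpods.data.datasets.paths_route import _PREDEFINED_SPLITS_VG_STANFORD_SGDET as s; print(s['vgs'])"
```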
Now look at HOW_TO_USE.md for the various commands to run the training and evaluation scripts (especially if you are using multiple GPUs). For more details, refer to the cvpods tutorial.