visions-datasets

Datasets for training machine learning models to read MTG card names in images.

Validation dataset

Validation image with visualized target detections and related annotations

This dataset, located in the validation folder, was used to validate the method used in Visions. On this dataset the method achieves 0.96 recall and 0.993 precision in recognizing cards.

The inputs are images that each contain multiple MTG cards. They can be found in validation/images. There are 500 images containing 1750 cards in total: 875 modern frame cards and 875 M15 frame cards.

In addition to the images, there are ground truth annotations containing target detections and card names in ICDAR 2015 format. They can be found in validation/gt. Ground truth (gt) and image files are paired by name, e.g. the image IMG_20191130151746.jpg has a gt file called gt_IMG_20191130151746.txt.
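The pairing rule and annotation layout can be sketched as follows. This is a minimal sketch, assuming the usual ICDAR 2015 convention of one detection per line, with eight comma-separated corner coordinates followed by the transcription. The helper names `gt_path_for`, `parse_gt_line` and `load_gt` are hypothetical and not part of this repository. Because card names can themselves contain commas, only the first eight commas are treated as separators.

```python
from pathlib import Path


def gt_path_for(image_path, gt_dir="validation/gt"):
    """Return the ground truth path paired with an image by name."""
    stem = Path(image_path).stem  # e.g. "IMG_20191130151746"
    return Path(gt_dir) / f"gt_{stem}.txt"


def parse_gt_line(line):
    """Parse one annotation line, assumed to be x1,y1,...,x4,y4,text."""
    # Strip a possible byte order mark and the trailing line ending.
    fields = line.lstrip("\ufeff").rstrip("\r\n").split(",", 8)
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}")
    coords = [int(float(value)) for value in fields[:8]]
    # Group the flat coordinate list into four (x, y) corner points.
    corners = list(zip(coords[0::2], coords[1::2]))
    return corners, fields[8]


def load_gt(gt_file):
    """Read all detections from one ground truth file, skipping blank lines."""
    with open(gt_file, encoding="utf-8-sig") as handle:
        return [parse_gt_line(line) for line in handle if line.strip()]
```

For example, `load_gt(gt_path_for("validation/images/IMG_20191130151746.jpg"))` would return a list of `(corners, card_name)` pairs for that image, provided the files follow the assumed layout.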

While the files in validation/gt contain target detections for full card names, there are also target detections for single words in validation/gt_split. Many text detection algorithms detect individual words, so this variant is useful for evaluating them.

In addition to card names, the annotations may include the card type when it is fully visible.

Connected components of text and noise

Name text connected components

Other connected components

The FASText algorithm searches images for connected components that could be parts of text. Each component it finds then has to be classified as either part of text or something else.

In Visions the classification is done by a convolutional neural network. It was trained with a collection of images of connected components. Components were mined from synthetically created images similar to the images in the validation dataset.

The connected component images are 24x24 color images. There are 347 783 positive connected components that are parts of names and 267 875 negative connected components that are not. Samples of both classes can be found in samples/cc_positive and samples/cc_negative. Because the dataset contains a large number of files, it is split into packages.
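As a rough illustration of how the two classes could be gathered for training, the sketch below pairs each sample file with a binary label based on its folder. It assumes the samples are stored as ordinary image files directly inside samples/cc_positive and samples/cc_negative; the function name `list_labeled_components` and the accepted file suffixes are hypothetical choices, not something this repository defines.

```python
from pathlib import Path

# Assumption: components are stored with one of these file suffixes.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_labeled_components(root="samples"):
    """Return (path, label) pairs: 1 = part of a name, 0 = anything else.

    Positives are listed first, and files are sorted within each folder.
    """
    pairs = []
    for folder, label in (("cc_positive", 1), ("cc_negative", 0)):
        directory = Path(root) / folder
        if not directory.is_dir():
            continue  # Tolerate a missing folder instead of failing.
        for path in sorted(directory.iterdir()):
            if path.suffix.lower() in IMAGE_SUFFIXES:
                pairs.append((path, label))
    return pairs
```

The resulting list can be shuffled and fed to whatever image loading code the training pipeline uses.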

Folder link

Positive connected components

Negative connected components

Images of card names

Short names

Long names

Images of card names were used to train the text recognition network. They are color images with a height of 32 pixels and varying width. The ground truth is included in the file name, e.g. Black_Lotus_1.jpg. The number at the end of the file name distinguishes variations of the same name.
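The file naming scheme can be decoded with a few lines of code. The sketch below is based only on the Black_Lotus_1.jpg example: it assumes that the text after the last underscore is the variation number and that the remaining underscores stand for spaces in the card name. The function name `parse_name_file` is hypothetical.

```python
from pathlib import Path


def parse_name_file(file_name):
    """Split e.g. 'Black_Lotus_1.jpg' into ('Black Lotus', 1)."""
    stem = Path(file_name).stem  # drop the directory and the extension
    text, _, variation = stem.rpartition("_")
    if not text or not variation.isdigit():
        raise ValueError(f"unexpected file name: {file_name}")
    # Assumption: underscores in the file name stand for spaces.
    return text.replace("_", " "), int(variation)
```

Names whose spelling cannot be represented in a file name may be encoded differently, so the result should be checked against the actual files before it is used as a training label.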

The dataset is split into three parts with different maximum lengths for the content. The first set contains the beginnings of names, with a maximum of 10 characters. The other two are similar sets with maximums of 20 and 44 characters. There are 25 685 images in the 10 character set, 85 380 in the 20 character set and 88 365 in the 44 character set.

Samples from each set can be seen in samples/names_10, samples/names_20 and samples/names_44.

10 characters

20 characters

44 characters