MIDV2020 "rectified photos" for Field Localisation and Recognition in Identity Documents

Field Localisation and Recognition in Identity Documents (IDs) is a challenging task because of the variability of capture conditions, the complexity of backgrounds, and the diversity of document layouts.
It is also common to encounter ID Types that were not seen at training time, such as new versions of a document or documents from newly addressed countries. These new ID Types come with new fonts, backgrounds, and layouts.
The layout of an ID Type is the same across all instances of that type and can be modelled to improve the performance of the system. We call such a model the template.

To facilitate research in this area, we built this new dataset from the MIDV2020 dataset (https://arxiv.org/abs/2107.00396) and organized it for Field Localisation and Recognition in IDs in the context of new ID Types.

Based on this new dataset, we propose the following task:
The input is the rectified document image together with the associated template. The goal is to output, for each field, its localisation, its name, and its textual content.

A git project (https://gitlab.inria.fr/tneittho/midv2020-rectified-photo) is associated with this dataset and proposes a cross-validation training protocol to evaluate the performance of systems on new ID types. To that end, all documents of one type are held out for validation and all documents of another type are held out for testing.
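The type-level hold-out described above can be sketched as follows. This is only an illustration: the function name `type_level_folds`, the placeholder type names, and the rule used to pair a validation type with a test type are assumptions, and the associated repository may organize its folds differently.

```python
def type_level_folds(doc_types: list[str]):
    """Yield (train_types, validation_type, test_type) for each fold.

    Assumption: fold i tests on type i and validates on the next type
    (cyclically); the associated repository may pair types differently.
    """
    n = len(doc_types)
    if n < 3:
        raise ValueError("need at least three document types")
    for i, test_type in enumerate(doc_types):
        val_type = doc_types[(i + 1) % n]
        # Every remaining type is used for training in this fold.
        train_types = [t for t in doc_types if t not in (test_type, val_type)]
        yield train_types, val_type, test_type


# Placeholder type names, not the dataset's actual document types.
for train, val, test in type_level_folds(["type_a", "type_b", "type_c", "type_d"]):
    print(train, val, test)
```

In every fold the validation and test types never appear in the training set, which is what lets the evaluation measure behaviour on unseen ID types.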

This dataset and task are presented in the following paper: https://link.springer.com/chapter/10.1007/978-3-031-41501-2_21

Download link: https://www.irisa.fr/intuidoc/data/database/rectified_photos.tar.xz

Data

The images folder contains the documents from the in-the-wild captures, rectified to a planar document.
File names are formatted as follows: `<document type>_<number of document>.jpg`.
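Since a document type may itself contain underscores, a file name is most safely split on its last underscore. The snippet below is a minimal sketch of that idea; the function name `parse_image_name` and the example file name are illustrative assumptions, not part of the dataset tooling.

```python
from pathlib import Path


def parse_image_name(name: str) -> tuple[str, str]:
    """Split '<document type>_<number of document>.jpg' into its two parts."""
    stem = Path(name).stem  # drop the '.jpg' extension
    # Split on the LAST underscore so types containing '_' stay intact.
    doc_type, sep, number = stem.rpartition("_")
    if not sep or not doc_type or not number:
        raise ValueError(f"unexpected image name: {name!r}")
    return doc_type, number


# Hypothetical file name, used only for illustration.
print(parse_image_name("alb_id_00.jpg"))
```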

alphabet.json

Contains the alphabets of the different types, together with the mapping from the original alphabet to the alphabet actually used, which is applied when characters have low representation.

annotations.json

{
    <image_name>: {
        "doc_type": ,
        "camer_type": ,
        "capture_condition": ,
        "fields": [
            {
                "label": <text label>,
                "map_label": <text label based on mapped alphabet>,
                "type": <type of field>,
                "x1": ,
                "y1": ,
                "x2": ,
                "y2":
            },
            …
        ]
    }
}
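As a rough illustration of how this structure can be traversed, the sketch below flattens the annotations into one record per field. The helper name `iter_fields` and the sample values are made up for the example, and the sample assumes that image names are used as keys with their `.jpg` extension, which the schema does not state.

```python
import json


def iter_fields(annotations: dict):
    """Yield (image_name, field_type, text, box) for every annotated field."""
    for image_name, doc in annotations.items():
        for field in doc["fields"]:
            box = (field["x1"], field["y1"], field["x2"], field["y2"])
            yield image_name, field["type"], field["label"], box


# Tiny in-memory stand-in for the content of annotations.json (values are made up).
sample = json.loads("""
{
  "some_type_00.jpg": {
    "doc_type": "some_type",
    "fields": [
      {"label": "DOE", "map_label": "DOE", "type": "surname",
       "x1": 10, "y1": 20, "x2": 110, "y2": 45}
    ]
  }
}
""")

for row in iter_fields(sample):
    print(row)
```

With the real file, `sample` would be replaced by the result of `json.load` on an opened `annotations.json`.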

templates.json

{
    <type name>: {
        "h": <height of the format>,
        "w": <width of the format>,
        "fields": [
            {
                "type": <type of field>,
                "x1": ,
                "y1": ,
                "x2": ,
                "y2":
            },
            …
        ]
    }
}
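One possible use of a template is to project its field boxes onto a rectified image of a different size. The sketch below assumes that the box coordinates are expressed in the template's own `w` × `h` frame, which the schema does not state explicitly; the function name `scale_template_fields` and the sample values are illustrative only.

```python
def scale_template_fields(template: dict, image_w: int, image_h: int) -> list[dict]:
    """Rescale template boxes from the template frame (w x h) to an image size."""
    sx = image_w / template["w"]  # horizontal scale factor
    sy = image_h / template["h"]  # vertical scale factor
    return [
        {
            "type": f["type"],
            "x1": f["x1"] * sx,
            "y1": f["y1"] * sy,
            "x2": f["x2"] * sx,
            "y2": f["y2"] * sy,
        }
        for f in template["fields"]
    ]


# Made-up template entry shaped like one value of templates.json.
template = {
    "h": 100,
    "w": 200,
    "fields": [{"type": "surname", "x1": 20, "y1": 10, "x2": 120, "y2": 30}],
}
scaled = scale_template_fields(template, image_w=400, image_h=200)
print(scaled)
```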

train/validation/test

Contain the lists of images for the train/validation/test splits, with one line per document.
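Reading such a list takes only a few lines. The following sketch assumes a plain-text file with one image name per line; the helper name `read_split` and the temporary file used for the demonstration are hypothetical, since the exact file names are not given here.

```python
from pathlib import Path
import tempfile


def read_split(path) -> list[str]:
    """Return the image names listed in a split file, one per non-empty line."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


# Self-contained demo with a temporary file standing in for a split file.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "train"  # hypothetical file name
    demo.write_text("some_type_00.jpg\nsome_type_01.jpg\n", encoding="utf-8")
    names = read_split(demo)

print(names)
```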

Contact

For any questions, contact Timothée Neitthoffer: timothee.neitthoffer@irisa.fr (also @idnow.io or @gmail.com)