HBF49 Database

Presentation

HBF49 is a unified feature representation for universal online symbol recognition (see [1]). It was constructed by aggregating 49 features that cover the aspects needed to recognize symbols with differing characteristics (single-stroke or multi-stroke, sensitive or insensitive to orientation, stroke order…). The details of the approach and the precise definition of the 49 features are given in the related paper [1].

HBF49 is intended to serve as a universal benchmarking representation and to provide, in association with standard classifiers, a baseline method for future online symbol recognition systems. The representation was shown to yield consistently accurate recognition results with standard statistical classifiers (1NN, SVM) over a range of highly diverse symbol categories, including handwritten digits, mathematical symbols, iconic gestures, geometrical objects, architectural objects, single-stroke gestural commands, and user-defined gestural commands. Experiments were conducted in both Writer-Independent and Writer-Dependent settings.

Contents

As a benchmarking representation for evaluating novel symbol recognition methods and systems, the HBF49 feature descriptions of several benchmark symbol databases are available for download on this page. The zip file (see Download) contains the extracted HBF49 representation of symbols from 8 datasets, including IRONOFF (digits), HHReco (geometrical shapes), NicIcon (multi-stroke pen gestures), IMIsketchDB-S (vectorized symbols from architectural sketches), Sign (single-stroke gestures) and ILGDB (user-defined single-stroke gestures).

The WEKA script files are also included to make the protocols transparent and to allow easy reproduction of all the results presented in [1].

Usage

The HBF49 database can be used in several scenarios for research on online symbol recognition. Researchers can:
– compare the performance of their own symbol recognition systems against the HBF49 representation of the benchmarks
– use HBF49 as a starting point for designing a representation targeted at a specific dataset, by selecting features from it or by adding new ones
– test their own machine learning algorithms on real-world handwritten symbols, pre-extracted under the HBF49 representation and stored in a standard format (Weka ARFF), in order to compare classification methods
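
As an illustration of the last scenario, the sketch below reads feature vectors from an ARFF stream and scores a 1NN classifier using only the Python standard library. It is a minimal sketch under stated assumptions, not the protocol used in [1]: it assumes dense ARFF data with numeric features followed by the class label as the last attribute and no quoted values, it uses plain Euclidean distance without feature normalization, and the helper names (`read_arff`, `predict_1nn`, `accuracy_1nn`) and the two-feature demo data are hypothetical.

```python
import io
import math


def read_arff(stream):
    """Parse a dense ARFF stream and return (feature_vectors, labels).

    Assumption: numeric features followed by the class label as the
    last attribute, with no quoted values containing commas.
    """
    features, labels = [], []
    in_data = False
    for raw in stream:
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and ARFF comments
        if not in_data:
            if line.lower().startswith("@data"):
                in_data = True
            continue  # header lines (@relation, @attribute) are ignored
        *values, label = [v.strip() for v in line.split(",")]
        features.append([float(v) for v in values])
        labels.append(label)
    return features, labels


def predict_1nn(train_x, train_y, sample):
    """Return the label of the training vector closest to `sample` (Euclidean)."""
    best = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], sample))
    return train_y[best]


def accuracy_1nn(train_x, train_y, test_x, test_y):
    """Fraction of test samples whose 1NN prediction matches the true label."""
    hits = sum(predict_1nn(train_x, train_y, x) == y for x, y in zip(test_x, test_y))
    return hits / len(test_y)


# Tiny in-memory stand-in for an ARFF file (2 features instead of 49).
DEMO_ARFF = """@relation demo
@attribute f1 numeric
@attribute f2 numeric
@attribute class {circle,square}
@data
0.0,0.1,circle
0.9,1.0,square
0.1,0.0,circle
1.0,0.9,square
"""

train_x, train_y = read_arff(io.StringIO(DEMO_ARFF))
test_x, test_y = [[0.05, 0.05], [0.95, 0.95]], ["circle", "square"]
print(accuracy_1nn(train_x, train_y, test_x, test_y))
```

To run this on real data, pass an open file handle for one of the ARFF files from the zip archive to `read_arff` in place of the in-memory string. The file names and the train/test splits should be taken from the archive and its WEKA scripts rather than from this sketch.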

Download

The database can be downloaded as a zip file containing all the data, along with the information files that describe the collected data.

Click here to download the zip file.

Contact

For any questions or suggestions, please contact Eric Anquetil.

Reference

These datasets may be used freely for research purposes. However, any published work that uses them should explicitly cite the following reference paper.

[1] Adrien Delaye, Eric Anquetil. HBF49 feature set: A first unified baseline for online symbol recognition. Pattern Recognition, 46(1):117-130, 2013.