ImadocSen-OnDB is a database containing on-line handwritten sentences made up of English words in lowercase letters. It can be used to train and evaluate handwriting recognition systems, either on sentences or on isolated words, or for writer identification tasks. The data were collected on Tablet PCs.
The first version of this database was presented at ICDAR 2005 [Quiniou2005]. The database has grown since then [Quiniou2009a] (see References for further details).
The sentence texts are taken from the Brown corpus [Francis1979] (see References). The database consists of files in the InkML format, which store each sentence along with information on the acquisition device, the writer, and the sentence transcription. The words of the sentences, which were manually extracted, are also provided and can be used, for example, to perform isolated word recognition.
The data collection protocol and the storage format are described in more detail in the files provided (see dataset_infos.txt in the zipfile).
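As an illustration of how such files might be read, here is a minimal Python sketch that extracts annotations and pen traces from an InkML document. It assumes the generic W3C InkML layout (`<annotation type="...">` elements and `<trace>` elements holding comma-separated "x y" samples with plain, explicit values). The annotation names `transcription` and `writer` and the sample content are hypothetical; the actual field names and channels used in ImadocSen-OnDB are those documented in dataset_infos.txt.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix such as '{http://www.w3.org/2003/InkML}'."""
    return tag.rsplit("}", 1)[-1]


def parse_inkml(xml_text):
    """Return (annotations, traces) from an InkML document given as a string.

    annotations: dict mapping each annotation's 'type' attribute to its text.
    traces: list of strokes, each a list of (x, y) float pairs.
    Assumes every sample starts with explicit x and y values; any extra
    channels (pressure, time, ...) are ignored.
    """
    root = ET.fromstring(xml_text)
    annotations = {}
    traces = []
    for element in root.iter():
        name = local_name(element.tag)
        if name == "annotation":
            annotations[element.get("type")] = (element.text or "").strip()
        elif name == "trace":
            points = []
            for sample in (element.text or "").split(","):
                values = sample.split()
                if len(values) >= 2:
                    points.append((float(values[0]), float(values[1])))
            traces.append(points)
    return annotations, traces


# Hypothetical miniature file; not taken from the database.
SAMPLE = """<ink xmlns="http://www.w3.org/2003/InkML">
  <annotation type="transcription">the cat</annotation>
  <annotation type="writer">w01</annotation>
  <trace>10 20, 11 22, 13 25</trace>
  <trace>30 20, 31 21</trace>
</ink>"""

annotations, traces = parse_inkml(SAMPLE)
print(annotations)
print(len(traces), "traces")
```

Matching on the local tag name keeps the sketch working whether or not the files declare the InkML namespace. To read a file from the unpacked zipfile, pass its contents to `parse_inkml`.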
Examples of on-line handwritten sentences from the database
ImadocSen-OnDB contains the following (as of 11/20/2010):
* 51 writers (including 42 different writers)
* 1,017 handwritten sentences
* 15,849 extracted words
The handwritten sentence database can be downloaded as a zipfile containing all the data as well as information on the data collected.
For any questions or suggestions, please contact Eric Anquetil.
[Quiniou2005] S. Quiniou, E. Anquetil, and S. Carbonnel. Statistical Language Models for On-line Handwritten Sentence Recognition. Proceedings of the International Conference on Document Analysis and Recognition, 2005, pp. 516-520.
[Quiniou2009a] S. Quiniou, F. Bouteruche, and E. Anquetil. Word Extraction Associated with a Confidence Index for On-Line Handwritten Sentence Recognition. International Journal of Pattern Recognition and Artificial Intelligence, vol. 23(5), 2009, pp. 945-966.
[Francis1979] W. Francis and H. Kucera. Brown Corpus Manual. Brown University, 1979.