Fusion of Knowledge in Document Analysis
Defended on Thursday, 6 December 2012
Recognizing digitized collections of structured documents, such as archival documents, is difficult not only because of the complexity of their organization but also because of their degradation (stains, tears, ink bleeding through the paper, curvature introduced by scanning…). To improve recognition quality while coping with the noise these degradations produce, the analysis process must use as much knowledge as possible.
However, document analysis draws on multiple sources of knowledge. Taking the page as the unit of reference, we propose to divide them into three types: a priori knowledge about the page (linked to the document type), knowledge from inside the page (present in the image but not directly available, which is why it must be extracted) and knowledge from outside the page (from other pages of a document collection or from users in an interactive process).
We show how these types of knowledge can be fused and used by introducing four elements: a document description language, a perceptual layer, a visual memory and an iterative analysis. These elements can be added to an existing system to give it new capabilities. To validate this, we propose DMOS-PI, a generic multi-resolution system for the analysis of document collections. It makes it possible to build perceptual vision mechanisms, with asynchronous interaction, that bring to a page knowledge from outside the page (from a user, from other pages or from other processing).
This perceptual system both improves recognition quality and makes the intrinsic complexity of the knowledge manageable: the knowledge is expressed simply, while the complex mechanism that exploits it is generated automatically, with reduced combinatorial complexity.
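As an illustration only, the interplay between a visual memory and an iterative analysis can be sketched in a few lines of Python. This is not DMOS-PI's actual interface: `VisualMemory`, `analyse_page`, `iterative_analysis` and the dictionary-based page are hypothetical names and simplifications chosen for this sketch. A page is reduced to a mapping from field names to recognized values (`None` when the field is unreadable), and knowledge from outside the page is reduced to a mapping of answers.

```python
from dataclasses import dataclass, field


@dataclass
class VisualMemory:
    """Hypothetical store shared between analysis passes (an assumption of this sketch)."""
    facts: dict = field(default_factory=dict)      # knowledge brought from outside the page
    questions: list = field(default_factory=list)  # fields waiting for an outside answer


def analyse_page(page, memory):
    """One analysis pass over a page, modelled here as {field name: value or None}."""
    results = {}
    for name, value in page.items():
        if name in memory.facts:
            # Knowledge from outside the page (user, other pages) is used first.
            results[name] = memory.facts[name]
        elif value is not None:
            # Knowledge extracted from inside the page.
            results[name] = value
        elif name not in memory.questions:
            # Unreadable field: record a question instead of blocking the analysis.
            memory.questions.append(name)
    return results


def iterative_analysis(page, memory, outside_answers, max_iterations=5):
    """Re-run the analysis until no question is pending or the iteration limit is hit."""
    results = {}
    for _ in range(max_iterations):
        results = analyse_page(page, memory)
        if not memory.questions:
            break
        # Stand-in for asynchronous interaction: collect whatever answers exist so far.
        pending, memory.questions = memory.questions, []
        for name in pending:
            if name in outside_answers:
                memory.facts[name] = outside_answers[name]
    return results


page = {"name": "Dupont", "date": None}          # "date" is unreadable on this page
answers = {"date": "1852"}                       # e.g. supplied by a user or another page
print(iterative_analysis(page, VisualMemory(), answers))
```

In this sketch the first pass recognizes only what the page itself provides and records a question for the unreadable field; the second pass reuses the answer stored in the memory. A field for which no answer ever arrives is simply left out of the result once the iteration limit is reached.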
These principles have been validated at large scale, both in breadth, on different kinds of documents (musical scores, mathematical formulae, forms, tables, archival documents, newspapers…), and in depth, on large collections. This validation was carried out at an industrial level on more than 700,000 pages of documents in total, mainly in the context of 8 research contracts and the creation of Evodia, a spin-off from our research group.
- Reviewers:
- Andreas Dengel (DFKI, Kaiserslautern)
- Rolf Ingold (Université de Fribourg)
- Karl Tombre (Université de Lorraine)
- Examiners:
- David Doermann (University of Maryland)
- Jean-Marc Jézéquel (Université de Rennes 1)
- Josep Lladós (Universitat Autònoma de Barcelona)
- Jean-Marc Ogier (Université de La Rochelle)