The IFN/ENIT-database contains material for training and testing Arabic handwriting recognition software. It comprises more than 2200 binary images of handwriting sample forms from 411 writers. About 26,000 binary word images have been isolated from the forms and saved individually for ease of access. A ground truth file has been compiled for each word in the database. This file contains information about the word, such as the position of the word's baseline and the individual characters used in the word.

The features of the IFN/ENIT-database are:

300 dpi binary handwritten words (town/village names)

26,459 city words

212,211 characters and ligatures
Each word is supplied with an automatically determined and manually verified ground truth
Extracted from artificial forms filled in by Tunisian people
Simulates writing on a letter: the writing style is unrestricted, and no writing lines or boxes are used
Divided into 4 disjoint sets for training and testing, which makes performance comparisons possible
Image format documentation included
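
One way the four disjoint sets could be used for comparable results is a rotation in which each set is held out once for testing while the other three are used for training. The Python sketch below is only an illustration of that idea under stated assumptions: the set names (set1 to set4), the word identifiers, and the helper functions are hypothetical and are not taken from the database or its documentation.

```python
from typing import Dict, List, Tuple


def assert_disjoint(sets: Dict[str, List[str]]) -> None:
    """Raise ValueError if any word identifier appears in more than one set."""
    seen = set()
    for name, items in sets.items():
        overlap = seen.intersection(items)
        if overlap:
            raise ValueError(f"set {name!r} shares items with another set: {sorted(overlap)}")
        seen.update(items)


def leave_one_set_out(sets: Dict[str, List[str]]) -> List[Tuple[List[str], List[str], str]]:
    """Return one (train_items, test_items, held_out_name) triple per set."""
    assert_disjoint(sets)
    folds = []
    for held_out in sorted(sets):
        # Training data: every item from all sets except the held-out one.
        train = [item for name in sorted(sets) if name != held_out for item in sets[name]]
        test = list(sets[held_out])
        folds.append((train, test, held_out))
    return folds


# Hypothetical set names and word identifiers, for illustration only.
example_sets = {
    "set1": ["word_0001", "word_0002"],
    "set2": ["word_0003", "word_0004"],
    "set3": ["word_0005", "word_0006"],
    "set4": ["word_0007", "word_0008"],
}

for train_items, test_items, held_out_name in leave_one_set_out(example_sets):
    print(f"test on {held_out_name}: {len(train_items)} training words, {len(test_items)} test words")
```

Reporting which set was held out alongside each result keeps figures from different recognizers comparable, since every system is then trained and tested on the same partition.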

The IFN/ENIT-database was developed by:

Ecole Nationale d'Ingénieurs de Tunis (ENIT), Tunisia


This project is supported by the Volkswagen Stiftung under contract number I/75 084.

 
September 06
webmaster@ifnenit.com