Welcome to the TCR Neuroph project home page!
You can download TCR Neuroph from SourceForge at this link: TCR Neuroph v 1.0
This application is part of the Neuroph project and demonstrates how neural networks can be applied to text recognition. It is an open source A.I. research student project. The project was started in October 2009, and its first version was developed by students of FON - School of Business Administration, University of Belgrade, Serbia, mentored by Zoran Ševarac, teaching assistant at the Department of Software Engineering. This first version of the application is for demonstration purposes.
The TCR Neuroph application uses an artificial neural network to recognize text characters in an image (scanned document, photograph, etc.) and transforms them into an editable document (such as an MS Word .doc file, or a .txt file for Notepad or WordPad). It is based on a neural network that can be trained to recognize more characters.
Java 6 - Sun Microsystems
How does it work
The application cleans the input image and extracts (cuts) each character from it. Every extracted character image is cropped to the edges of the character and resized to 15x19 pixels. This size is used in this application as the standard letter size.
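The crop-and-resize step can be sketched roughly as follows. This is only an illustration, not the application's actual code: it uses plain java.awt instead of the java-image-scaling library, and the class name CharacterNormalizer, the method names and the darkness threshold of 128 are all assumptions made for the example.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

final class CharacterNormalizer {
    static final int WIDTH = 15;   // standard letter width from the text
    static final int HEIGHT = 19;  // standard letter height from the text
    static final int DARK_THRESHOLD = 128; // assumption: darker pixels count as ink

    /** A pixel is treated as ink when its average RGB brightness is below the threshold. */
    static boolean isInk(BufferedImage img, int x, int y) {
        int rgb = img.getRGB(x, y);
        int r = (rgb >> 16) & 0xFF, g = (rgb >> 8) & 0xFF, b = rgb & 0xFF;
        return (r + g + b) / 3 < DARK_THRESHOLD;
    }

    /** Returns {minX, minY, maxX, maxY} of all ink pixels, or null for a blank image. */
    static int[] inkBounds(BufferedImage img) {
        int minX = Integer.MAX_VALUE, minY = Integer.MAX_VALUE, maxX = -1, maxY = -1;
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                if (isInk(img, x, y)) {
                    minX = Math.min(minX, x);
                    minY = Math.min(minY, y);
                    maxX = Math.max(maxX, x);
                    maxY = Math.max(maxY, y);
                }
            }
        }
        return maxX < 0 ? null : new int[] {minX, minY, maxX, maxY};
    }

    /** Crops to the edges of the character and scales the result to 15x19 pixels. */
    static BufferedImage normalize(BufferedImage img) {
        int[] b = inkBounds(img);
        if (b == null) {
            return null; // nothing to recognize
        }
        BufferedImage cropped =
                img.getSubimage(b[0], b[1], b[2] - b[0] + 1, b[3] - b[1] + 1);
        BufferedImage out = new BufferedImage(WIDTH, HEIGHT, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = out.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_BILINEAR);
        g.drawImage(cropped, 0, 0, WIDTH, HEIGHT, null);
        g.dispose();
        return out;
    }
}
```

Scaling to a fixed size distorts the aspect ratio of narrow letters such as "I", which is acceptable here only because the network would see the same distortion during training.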
After processing, every image is sent to the neural network for recognition. The neural network recognizes each image using the fonts and sizes it has already learned and returns the character with the best match. The demo in this version of the application uses a neural network trained on the Times New Roman font, capital letters only.
The characters are assembled as editable text, as the final result of recognizing the currently loaded image. Other images can be loaded, recognized and added to the same editable text.
The main advantage is that the neural network is packaged as a single file and can be replaced with another neural network (in practice the same network, trained on a wider range of characters) without changing the application or its code.
*(The only condition is that the new neural network must be trained under the same conditions as the one it replaces.)
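Picking "the character with the best recognition" can be illustrated with a simple maximum search over the network outputs. The sketch below assumes one output neuron per capital letter, in alphabetical order; the class name BestMatch and that output layout are assumptions for the example, not details confirmed by the project.

```java
final class BestMatch {
    // Assumption: output neuron i corresponds to the i-th capital letter.
    static final String LABELS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    /** Returns the letter whose output neuron has the highest activation. */
    static char bestCharacter(double[] outputs) {
        if (outputs.length != LABELS.length()) {
            throw new IllegalArgumentException(
                    "expected " + LABELS.length() + " outputs, got " + outputs.length);
        }
        int best = 0;
        for (int i = 1; i < outputs.length; i++) {
            if (outputs[i] > outputs[best]) {
                best = i;
            }
        }
        return LABELS.charAt(best);
    }
}
```

With this rule the network always returns some letter, even for an image that matches nothing well.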
For training the neural network we used the Java Neural Network Framework - Neuroph 2.3.1
The network currently in use was trained on the Times New Roman font. The input image size for the network is 15x19 in FULL_COLOR mode. Training took 15 minutes and 753 iterations (performed on a dual-core 2.41 GHz computer with 2 GB of RAM). The learning rate was set to 0.1, the max error to 0.01, and the momentum to 0.0. The hidden layer neuron count was set to 26.
We also used the java-image-scaling 0.8.4 release to scale images to the preferred size.
The main issue was recognizing connected letters, as shown below:
There were also problems with smudges and imprecision in JPEG images. These can cause the application to hang while trying to recognize an undefined image.
Starting the application by double-clicking the jar file can be problematic for Windows 7 users; it can be started from the Command Prompt instead. No issues have been reported on other operating systems (such as Windows XP, Vista, Mac OS etc.).
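To give a feel for the size of the network input, the sketch below flattens a 15x19 image into one value per color channel per pixel, which gives 15 x 19 x 3 = 855 inputs. It assumes that FULL_COLOR mode feeds separate red, green and blue values scaled to the range 0 to 1; the class name FullColorInput and the channel ordering (all reds, then all greens, then all blues) are illustrative assumptions, and this is not Neuroph's own code.

```java
import java.awt.image.BufferedImage;

final class FullColorInput {
    /** Flattens an image into [all red | all green | all blue] fractions in 0..1. */
    static double[] flatten(BufferedImage img) {
        int w = img.getWidth();
        int h = img.getHeight();
        int n = w * h;
        double[] input = new double[3 * n];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int rgb = img.getRGB(x, y);
                int i = y * w + x; // row-major pixel index
                input[i] = ((rgb >> 16) & 0xFF) / 255.0;       // red
                input[n + i] = ((rgb >> 8) & 0xFF) / 255.0;    // green
                input[2 * n + i] = (rgb & 0xFF) / 255.0;       // blue
            }
        }
        return input;
    }
}
```

Under these assumptions a replacement network has to accept the same 855-value input, which is one reason it must be trained under the same conditions as the network it replaces.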
Current plans for the future of this project are:
- To resolve issues listed above
- To train a new neural network with a larger knowledge base (capital and small letters, numbers, punctuation characters, various fonts)
- To move the application to the NetBeans Platform
jHRT Handwriting recognition tool based on Neuroph - Project page
This application was developed in Java in the NetBeans 6.7.1 IDE
Ivana Jovičić, student of FON - School of Business Administration, University of Belgrade, Serbia
mail to: firstname.lastname@example.org
Vladimir Kolarević, student of FON - School of Business Administration, University of Belgrade, Serbia
mail to: email@example.com
Marko Ivanović, student of FON - School of Business Administration, University of Belgrade, Serbia
mail to: firstname.lastname@example.org
Mentor - Zoran Ševarac
mail to: email@example.com