The first step was extending the existing project to compile a testbench: letters were selected in the GUI in English (which had not previously been included as one of the languages), and each letter in the test file was identified. Each letter was stored as an array of 27 floats representing the amount of blackspace in each subsection of the character.
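The feature extraction described above can be sketched as follows. The project's exact partition of a character into 27 subsections is not specified, so the horizontal-band split and the function name below are assumptions for illustration only:

```python
import numpy as np

def blackspace_features(bitmap, num_regions=27):
    """Fraction of black pixels in each of `num_regions` bands of a character.

    `bitmap` is a 2-D array of 0/1 values (1 = black). Splitting into
    horizontal bands is an assumed stand-in for the project's real
    27-subsection scheme.
    """
    bands = np.array_split(np.asarray(bitmap, dtype=float), num_regions, axis=0)
    return np.array([band.mean() if band.size else 0.0 for band in bands])
```

An all-black character bitmap would thus map to 27 values of 1.0, and an all-white one to 27 zeros.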
The next step was sending this data to the Python TensorFlow model file and parsing the training data there. The C and Python programs communicated over the command line: the C program wrote the data for each identified character to stdout, and the Python file then imported it.
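The Python side of this handoff might look like the sketch below. The actual wire format between the two programs is not documented here; a whitespace-separated line of one label followed by 27 floats is an assumption:

```python
import sys

def read_features(stream):
    """Parse lines of the assumed form 'A 0.1 0.2 ... 0.9' (label + 27 floats).

    `stream` would typically be sys.stdin when the C program pipes its
    output into this script.
    """
    labels, rows = [], []
    for line in stream:
        parts = line.split()
        if len(parts) != 28:
            continue  # skip malformed or empty lines
        labels.append(parts[0])
        rows.append([float(x) for x in parts[1:]])
    return labels, rows
```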
The TensorFlow file then divided the data into test data and solutions: the 27 floats representing each character were the test data, and the matching correct characters found through the GUI were the solutions. The program also tested files that Professor Finkel had manually trained using his nearest neighbor search algorithm on texts in a variety of languages. These already achieved 95%-99% accuracy and were used to gauge the comparative success of the TensorFlow model.
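Turning the parsed characters into a feature matrix and integer targets could be done as below. The mapping of letters 'a'..'z' to indices 0..25 is an assumed label encoding, not taken from the original project:

```python
import numpy as np

def to_dataset(labels, rows):
    """Convert parsed characters into model inputs and integer targets.

    `rows` holds the 27-float feature vectors (the test data); `labels`
    holds the matching correct characters (the solutions).
    """
    x = np.asarray(rows, dtype=np.float32)  # shape (n, 27)
    y = np.array([ord(c.lower()) - ord("a") for c in labels], dtype=np.int32)
    return x, y
```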
The next step was to train and test the neural network model. The final model contained one sequential layer with 256 weights and was trained for 55 epochs. Only one layer was used because adding more layers caused overfitting.
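One plausible reading of this architecture, sketched with the Keras Sequential API, is a single 256-unit hidden layer feeding a 26-way softmax output; the exact layer layout, activations, and optimizer below are assumptions rather than details from the original project:

```python
import tensorflow as tf

def build_model(num_features=27, num_classes=26):
    # A single 256-unit hidden layer; per the report, adding further
    # layers led to overfitting. Activations and optimizer are assumed.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(num_features,)),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Training would then run for the 55 epochs mentioned above, e.g.:
# model.fit(x_train, y_train, epochs=55)
```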
Finally, the Python program printed its output back to stdout, and this was sent back into the original C file. Depending on the user input, this output was then routed either to the original OCR program's translation of the text on stdout or to GUI mode.
The Makefile was edited so that using the TensorFlow algorithm instead of the original nearest neighbor search is only a matter of passing the -t flag in the command instead of the default -b.
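The flag dispatch described above happens in the C program and its Makefile targets, but the same selection logic can be sketched in Python with argparse; the function name and return values here are illustrative only:

```python
import argparse

def choose_classifier(argv):
    """Mirror the -t / -b selection described above (illustrative only;
    the real dispatch is done by the C program and Makefile)."""
    parser = argparse.ArgumentParser()
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-t", dest="tensorflow", action="store_true",
                       help="use the TensorFlow model")
    group.add_argument("-b", dest="baseline", action="store_true",
                       help="use the nearest neighbor search (default)")
    args = parser.parse_args(argv)
    return "tensorflow" if args.tensorflow else "nearest-neighbor"
```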