This is the final-year project report of Mr. Rahul Thapa Magar, a graduate of the University of Wolverhampton. A link to his project will be provided soon. You can download the report using the download button at the end of this page.
1. Introduction
1.1. Academic Question
a. How will the system work, and what techniques, tools and technologies will it
use to extract text from images?
b. Will users need to log in to use the system?
c. How accurate is the system? Will its performance decrease if the image
contains a significant amount of noise?
d. How will users benefit from the system?
1.2. Aims & Objectives
1.2.1. Aims
• To train the model on a synthetic dataset.
• To extract handwritten words not just from paper or electronic documents
but also from natural scene images.
• To help people store data digitally without manually copying it from
documents.
1.2.2. Objectives
• To research the topic using the Internet, books, journals, articles, etc.
• To implement suitable classifiers and algorithms.
• To build a platform, in the form of a web application, for people to perform
Optical Character Recognition (OCR).
1.3. Brief Details of the Artifact Produced and Background to the project.
Data has become one of the most valuable assets in the world. People store data in both electronic and paper-based formats, and they need that data in their daily lives to run their businesses. Retyping stored data by hand is time-consuming and unproductive.
Traditionally, text recognition has been performed on document images because their digitised, planar, paper-based format is well suited to it. With natural scene images, however, accuracy drops drastically because the text varies widely in appearance and layout. Natural images also suffer from noise, inconsistent lighting, occlusion, varying orientation, etc., which makes it harder for a classifier to detect and recognize text than in document images. In recent years, advances in computer vision techniques and the large volume of datasets produced over the last decades have made it possible to recognize text even in natural scene images.
In this project, text spotting on natural images is done in two stages: word detection followed by word recognition. The project does not perform character recognition; instead, it recognizes whole words through a word spotting mechanism. The detector is built with Tesseract and OpenCV, and recognition is done by a Convolutional Neural Network (CNN). The CNN is trained on a synthetic dataset known as the VGG synthetic word dataset. The system is a Flask web application in which users perform OCR by uploading images.
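The two-stage flow described above (detect word regions, then recognise each region as a whole word) can be sketched in a few lines. This is only an illustration of the pipeline shape, not the project's implementation: `detect_words` is a hypothetical stand-in that splits one binarised text line at runs of empty columns, where the project uses Tesseract and OpenCV, and `toy_recogniser` is a hypothetical placeholder for the trained CNN.

```python
from typing import Callable, List, Sequence, Tuple

Image = Sequence[Sequence[int]]   # binary image rows: 1 = ink, 0 = background
Box = Tuple[int, int]             # (start_col, end_col), end exclusive


def detect_words(image: Image, min_gap: int = 2) -> List[Box]:
    """Stand-in detector (assumption): split one text line into words at
    runs of at least `min_gap` empty columns."""
    width = len(image[0]) if image else 0
    has_ink = [any(row[c] for row in image) for c in range(width)]
    boxes: List[Box] = []
    start, gap = None, 0
    for c, ink in enumerate(has_ink):
        if ink:
            if start is None:
                start = c            # first inked column of a new word
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:       # gap is wide enough to end the word
                boxes.append((start, c - gap + 1))
                start, gap = None, 0
    if start is not None:            # word that runs to the image edge
        boxes.append((start, width - gap))
    return boxes


def spot_text(image: Image, recognise: Callable[[list], str]) -> List[str]:
    """Stage 1: detect word boxes. Stage 2: recognise each cropped word."""
    words = []
    for start, end in detect_words(image):
        crop = [row[start:end] for row in image]
        words.append(recognise(crop))
    return words


def toy_recogniser(crop: list) -> str:
    """Placeholder for the CNN: labels a crop by its width only."""
    return f"word[{len(crop[0])}px]"


line = [
    [1, 1, 0, 0, 0, 1, 1, 1],
    [1, 0, 0, 0, 0, 1, 0, 1],
]
boxes = detect_words(line)
labels = spot_text(line, toy_recogniser)
print(boxes, labels)
```

Keeping the recogniser as a plain callable means the detector can be tested on its own and a real model can be plugged in later without changing the pipeline function.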
Artefacts (proposed) to be developed
Artefact 1: Image upload
Artefact 2: Word Detection
Artefact 3: Word Recognition
1.4. Potential Users
The system is not aimed at any specific group of users; anyone can use it to perform OCR. Today, corporations around the world have moved to digital formats; for instance, they store corporate data and information in electronic form. Moreover, people in every field are recognizing the importance of OCR because it spares them the hassle of copying text word by word from hard-copy documents. Since its development, OCR has been applied in many fields, and its horizon is still widening. Applications of OCR include handwriting recognition, receipt imaging, the legal industry, banking, healthcare, CAPTCHA, automatic number plate recognition, ATMA (Android Travel Mate Application), etc. Such systems are needed almost everywhere in today's world, where data has become a valuable asset, so the application of OCR cannot be restricted to just a few fields or users.
1.5. Scope and Limitations of the project.
Text is considered one of the principal tools for preserving and communicating information. The modern world is designed to be interpreted and navigated using text clues, labels, signs, etc. found in the surroundings, so text is scattered across many images and videos for communication purposes. Extracting such text from images and storing the information in digital format helps protect it against the loss or theft of hard-copy documents. Sometimes the text in an image also needs to be replicated digitally. In such cases OCR can play an important role.
The system is based on word recognition rather than character recognition. Unlike character recognition, which recognizes a word by recognizing its letters, word recognition has to be trained with whole words as input.
As a result, recognition is constrained to the words in the model's dictionary, because this method cannot cover every possible word. The accuracy of the system is also low because the model was trained on only a small amount of data, owing to computational limitations. The other limitations are that the system does not work offline and that it recognizes only English alphabetic words.
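The dictionary constraint described above can be made concrete with a small sketch. It assumes, hypothetically, that the recogniser outputs one score per dictionary word and that the prediction is the highest-scoring entry; the four-word `LEXICON` and the score values are made up for illustration and are not taken from the project.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical dictionary: the model has exactly one output class per word.
LEXICON = ["bank", "exit", "open", "stop"]


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode_word(scores: Sequence[float],
                lexicon: Sequence[str] = LEXICON) -> Tuple[str, float]:
    """Return the most probable dictionary word and its probability."""
    if len(scores) != len(lexicon):
        raise ValueError("expected one score per dictionary word")
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return lexicon[best], probs[best]


# Made-up scores standing in for the network output on one cropped word.
word, confidence = decode_word([0.1, 2.0, 0.3, 0.2])
print(word, round(confidence, 3))
```

Whatever the input image shows, `decode_word` can only ever return an entry of `LEXICON`, which is why a word outside the dictionary is always mapped to some in-dictionary word.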
1.6. Report Structure
• Introduction: It provides an overall introduction to the project. It
includes topics such as the project aims, objectives, scope, limitations,
academic questions, and artefacts.
• Literature Review: It includes the information needed to complete
the project, such as background research, components, and similar
systems.
• Development: This section covers the project from planning through
to development. It includes all the plans, designs, and testing.
• Answering Academic Question: This section provides the answers
to the academic questions.
• Conclusion: It concludes the whole project and outlines possible
future enhancements.
• Critical Evaluation: It includes the necessary evaluation of the
report, the system, and the development process.