Online-text-recognizer-OCR

Online text recognizer -based on Tesseract-OCR engine- allows users to easily convert scanned documents, screenshots and photos into editable and searchable text, such as DOC, TXT or PDF.

Main page

Result page: Displaying uploaded image

Result page: Displaying preprocessed image

files tree

index.php : main page including the form for the input
bridge.php : called when the user submit the form to the system. it uploads the image to the server and execute the python file from the Core component. finally, redirecting the user to the result page.
result.php : the page file called by "bridge.php".
download.php : responsible for download the result in different format. called from "result.php".
Container : this folder is used to save the input/output file to make their accesibility easy by the component. It includes:
1. original image
2. image with fixed rotation
3. output : txt, pdf and doc filse
Core : represents the Core component.
vendor : includes the packages related to the View component( bootstrap, jQuery...)

System requirements

The main open sources used are:

Tesseract-OCR engine

https://github.com/tesseract-ocr/tesseract: This package contains the latest version 4 of the OCR engine “libtesseract” and a command line program “tesseract”.

The next step after installing the engine is to set up the environment path to get recognized in python script.

Get the data file for each language you want to include in the system from: https://github.com/tesseract-ocr/tessdata.

Afterward, copy the data file inside “testdata” folder: :......\Tesseract-OCR\tessdata.

Pytesseract: python interface for Tesseract-OCR Engine.

https://pypi.org/project/pytesseract/

Cv2: python interface for OpenCV library.

https://pypi.org/project/opencv-python/

N.B:

To set up python in the web server in order to execute python script through php, you have to add the following lines in Apache configuration file httpd.conf:

Options +ExecCGI 
AddHandler cgi-script .py 
ScriptInterpreterSource Registry-Strict

for further information/question, feel free to email at : itaha.kamil@gmail.com

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
Online-text-recognizer-OCR		Online-text-recognizer-OCR
core		core
css		css
img		img
scss		scss
vendor		vendor
README.md		README.md
ReadMe.txt		ReadMe.txt
bridge.php		bridge.php
demo-main.png		demo-main.png
demo-result1.png		demo-result1.png
demo-result2.png		demo-result2.png
download.php		download.php
index.php		index.php
result.php		result.php

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Online-text-recognizer-OCR

Main page

Result page: Displaying uploaded image

Result page: Displaying preprocessed image

files tree

System requirements

N.B:

About

Uh oh!

Releases

Packages

Languages

quickailabs/Online-text-recognizer-OCR

Folders and files

Latest commit

History

Repository files navigation

Online-text-recognizer-OCR

Main page

Result page: Displaying uploaded image

Result page: Displaying preprocessed image

files tree

System requirements

N.B:

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages