pyocr
pip install pyocr
pip install tesseract-ocr
If tesseract-ocr fails to install, follow the install-tesseract-ocr-on-ubunt guide linked below.
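Once pyocr and tesseract are both installed, a minimal sketch of reading text from an image might look like the following. The function name ocr_image and the file name scan.png are placeholders chosen for illustration. The imports are deferred into the function only so the definition loads even where pyocr is missing.

```python
def ocr_image(path, lang="eng"):
    """Return the text pyocr recognizes in the image at `path`.

    `ocr_image` is a hypothetical helper name, not part of pyocr.
    """
    # Deferred imports: defining this function does not require pyocr.
    import pyocr
    import pyocr.builders
    from PIL import Image

    # pyocr looks for installed OCR back ends such as tesseract.
    tools = pyocr.get_available_tools()
    if not tools:
        raise RuntimeError("No OCR tool found; is tesseract installed and on PATH?")
    tool = tools[0]

    with Image.open(path) as img:
        return tool.image_to_string(
            img,
            lang=lang,
            builder=pyocr.builders.TextBuilder(),
        )


if __name__ == "__main__":
    # scan.png is an assumed example file.
    print(ocr_image("scan.png"))
```

If get_available_tools() returns an empty list, the tesseract binary is most likely not installed or not on PATH.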
Tesseract
sudo apt-get install tesseract-ocr
sudo apt-get install tesseract-ocr-eng
Test command to check that tesseract is working (writes the recognized text to scanned.txt):
tesseract scan.png scanned -l eng
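The same check can be scripted from Python with the standard library. This is a rough sketch that assumes the tesseract binary is on PATH. The helper names build_tesseract_command and run_tesseract are made up for illustration. Note that tesseract appends .txt to the output base name it is given, so the sketch reads the result from output_base + ".txt".

```python
import subprocess


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Build the argument list for: tesseract IMAGE OUTPUTBASE -l LANG."""
    return ["tesseract", image_path, output_base, "-l", lang]


def run_tesseract(image_path, output_base, lang="eng"):
    """Run tesseract and return the recognized text.

    Raises subprocess.CalledProcessError if tesseract exits with an error.
    """
    subprocess.run(build_tesseract_command(image_path, output_base, lang), check=True)
    # tesseract adds the .txt extension to the output base itself.
    with open(output_base + ".txt", encoding="utf-8") as f:
        return f.read()


if __name__ == "__main__":
    # scan.png is an assumed example file.
    print(run_tesseract("scan.png", "scanned"))
```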
https://github.com/tesseract-ocr/tesseract
https://github.com/tesseract-ocr/tesseract/wiki
https://github.com/tesseract-ocr/tesseract/wiki/Compiling
http://miphol.com/muse/2013/05/install-tesseract-ocr-on-ubunt.html
Use Python / PIL or similar to shrink whitespace
Stack Overflow, "Using python and PIL how can I grab a block of text in an image?": http://stackoverflow.com/a/9398422
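One possible way to shrink the whitespace with PIL is to compare the image against a plain background and crop to the bounding box of whatever differs. This is only a sketch and not necessarily the approach taken in the linked answer. The function name trim_whitespace is made up for illustration, and the code assumes a pure white background, so noisy scans would probably need thresholding first.

```python
from PIL import Image, ImageChops


def trim_whitespace(img, background=(255, 255, 255)):
    """Crop `img` to the smallest box containing non-background pixels.

    Assumes the margin is exactly `background`; returns the image
    unchanged if it is entirely background.
    """
    rgb = img.convert("RGB")
    # A solid image of the background color, same size as the input.
    bg = Image.new("RGB", rgb.size, background)
    # Pixels equal to the background become black (zero) in the difference.
    diff = ImageChops.difference(rgb, bg)
    # getbbox() returns the bounding box of non-zero pixels, or None.
    bbox = diff.getbbox()
    return img.crop(bbox) if bbox else img


if __name__ == "__main__":
    # scan.png and scan_trimmed.png are assumed example file names.
    trim_whitespace(Image.open("scan.png")).save("scan_trimmed.png")
```

The trimmed file can then be passed to tesseract in place of the original scan.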