Intelligently Extract Text & Data from Document with OCR NER

Partner: Udemy
Affiliate Name:
Area:
Description: Welcome to the course "Intelligently Extract Text & Data from Document with OCR NER"! In this course you will learn how to develop a customized Named Entity Recognizer. The main idea of the course is to extract entities from scanned documents such as invoices, business cards, shipping bills, and bills of lading. For the sake of data privacy we restrict our examples to business cards, but you can apply the framework explained here to all kinds of financial documents. Below is the curriculum we follow to develop the project.

To develop this project we will use two main technologies in data science: Computer Vision and Natural Language Processing. In the Computer Vision module, we will scan the document, identify the location of the text, and finally extract the text from the image. Then, in the Natural Language Processing module, we will extract the entities from the text, do the necessary text cleaning, and parse the entities from the text.

Python libraries used in the Computer Vision module: OpenCV, Numpy, Pytesseract.
Python libraries used in the Natural Language Processing module: Spacy, Pandas, Regular Expression, String.

As we are combining two major technologies to develop the project, we divide the course into several stages of development to make it easier to follow.

Stage 1: We will set up the project by doing the necessary installations and requirements. Install Python; Install Dependencies.
Stage 2: We will do data preparation. That is, we will extract text from images using Pytesseract and also do the necessary cleaning. Gather Images; Overview on Pytesseract; Extract Text from all Image; Clean and Prepare te
Category: Development > Data Science > Text Mining
Partner ID:
Price: 149.99
Commission:
Source: Impact
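As a rough illustration of the text cleaning and entity parsing step mentioned in the description, here is a minimal sketch using only the Python standard library (`re` and `string`). It is not taken from the course: the function names, the regular expressions, and the sample business-card text are all assumptions made for this example, and the OCR (Pytesseract) and Spacy steps are left out entirely.

```python
import re
import string

# Hypothetical patterns for this sketch; the course may use different rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def clean_text(raw: str) -> str:
    """Drop characters outside string.printable and collapse whitespace runs."""
    printable = "".join(ch for ch in raw if ch in string.printable)
    return re.sub(r"\s+", " ", printable).strip()


def parse_entities(text: str) -> dict:
    """Pull e-mail addresses and phone numbers out of OCR-like text."""
    cleaned = clean_text(text)
    return {
        "email": EMAIL_RE.findall(cleaned),
        # Keep only the digits of each phone-like match.
        "phone": [re.sub(r"\D", "", m) for m in PHONE_RE.findall(cleaned)],
    }


# Made-up text standing in for the OCR output of a business card.
sample = "Jane  Doe\nData Scientist\nPhone: +1 (555) 010-9999\njane.doe@example.com\x0c"
entities = parse_entities(sample)
print(entities)
```

Regular expressions like these only cover entities with a rigid shape, such as e-mail addresses and phone numbers; names, job titles, and organizations are the kind of entity a trained Named Entity Recognizer is meant to handle.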