Optical character recognition (OCR) is a technology that converts scanned paper documents, PDF files, and images into searchable text. An OCR engine detects the characters in an image and assembles them into words, so that developers can search and edit the document's content. Tesseract is one of the most accurate open-source OCR engines available to .NET developers. It is licensed under Apache 2.0, and its development was sponsored by Google starting in 2006. Tesseract is available for several operating systems, including Windows, Linux, and macOS. In this article, I will demonstrate how to extract text from an image using Tesseract and C# on Windows.