# pythontesseract

> Python wrapper for Google's Tesseract-OCR

**Wikidata**: [Q100153106](https://www.wikidata.org/wiki/Q100153106)  
**Source**: https://4ort.xyz/entity/pythontesseract

## Summary
pythontesseract, distributed on PyPI as `pytesseract`, is a Python wrapper for Google's Tesseract-OCR Engine that adds optical character recognition (OCR) to Python applications. It is released under the Apache Software License 2.0. The library was first published on June 24, 2009, and is maintained by Matthias A Lee.

## Key Facts
- **Initial Release Date**: June 24, 2009.
- **Creator**: Matthias A Lee.
- **License**: Apache Software License 2.0.
- **Programming Language**: Python.
- **Version 0.2.7**: Released January 29, 2019. Later tagged releases run through 0.3.13.
- **Repository**: Hosted on GitHub at https://github.com/madmaze/pytesseract.
- **Related Projects**: Used by Gramps Web API (an open-source genealogy database server).
- **Package Manager**: Available as `python-pytesseract` on openSUSE.

## FAQs
### Q: What is pythontesseract used for?
A: pythontesseract provides a Python interface to Google's Tesseract-OCR Engine, allowing developers to extract text from images or scanned documents programmatically.

### Q: Is pythontesseract free to use?
A: Yes, it is open-source software released under the Apache Software License 2.0, permitting free use and redistribution.

### Q: How do I install pythontesseract?
A: The library can be installed via PyPI using `pip install pytesseract`, though Tesseract-OCR must also be installed separately on the system.
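
Because the Python package and the Tesseract-OCR engine are installed separately, it can help to confirm that both are present before running any OCR. The following is a minimal sketch using only the standard library; the function name `check_ocr_prerequisites` is hypothetical, and the executable name `tesseract` is an assumption about how the engine is exposed on `PATH`.

```python
import importlib.util
import shutil


def check_ocr_prerequisites(binary_name="tesseract"):
    """Report whether the Python package and the Tesseract binary are discoverable."""
    return {
        # True when the pytesseract package can be imported in this environment.
        "python_package": importlib.util.find_spec("pytesseract") is not None,
        # Full path of the executable found on PATH, or None when it is missing.
        "tesseract_binary": shutil.which(binary_name),
    }


if __name__ == "__main__":
    status = check_ocr_prerequisites()
    print("pytesseract importable:", status["python_package"])
    print("tesseract executable:", status["tesseract_binary"] or "not found on PATH")
```

If the package is importable but the executable is reported as missing, the engine still needs to be installed through the operating system's package manager or an installer.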

## Why It Matters
pythontesseract simplifies OCR tasks for Python developers by hiding the details of invoking Tesseract-OCR and reading its output. This reduces development time and lowers the barrier to entry for applications that need optical character recognition, such as document scanning, data digitization, and automated testing. Its open-source code and permissive license make it straightforward to integrate into larger projects, in both academic and commercial contexts. Because Python is widely used in data science and automation, the library is useful across many domains.

## Notable For
- **Pythonic Interface**: Provides a straightforward API for Tesseract-OCR, so callers do not have to invoke the Tesseract command line themselves.
- **Active Maintenance**: Tagged releases run from 0.1.7 (2017) through 0.3.13, keeping the library compatible with evolving Python and Tesseract versions.
- **Downstream Use**: Used by projects such as Gramps Web API, an open-source genealogy database server.
- **Permissive Licensing**: Apache 2.0 license encourages reuse in proprietary and open-source projects alike.

## Body
### Overview
pythontesseract is a software library that wraps Google’s Tesseract-OCR Engine, enabling Python developers to incorporate OCR functionality into their applications. It acts as an intermediary, simplifying the execution of Tesseract commands and parsing their output.
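
The intermediary role described above can be illustrated with a small sketch of the general wrapper pattern: build a command line, run the external program, and return its output. This is not the library's actual implementation. The function names `build_tesseract_command` and `run_ocr` are hypothetical, and the sketch assumes a Tesseract version whose command line accepts `stdout` as the output base along with the `-l` and `--psm` options.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, lang="eng", psm=None, binary="tesseract"):
    """Assemble a Tesseract CLI invocation that writes recognized text to stdout."""
    command = [binary, str(image_path), "stdout", "-l", lang]
    if psm is not None:
        # Page segmentation mode, passed through to the engine unchanged.
        command += ["--psm", str(psm)]
    return command


def run_ocr(image_path, **options):
    """Run Tesseract on an image and return the text it prints."""
    command = build_tesseract_command(image_path, **options)
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(f"{command[0]} was not found on PATH")
    # check=True raises CalledProcessError when Tesseract exits with an error.
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    return completed.stdout.rstrip()
```

Keeping command construction separate from execution means the argument list can be inspected or tested without the engine installed, for example `build_tesseract_command("scan.png", psm=6)`.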

### Development History
- **First Published**: June 24, 2009.
- **Notable Releases**:
  - **0.1.7**: Released May 17, 2017.
  - **0.2.0**: Released January 30, 2018, the first release in the 0.2 series.
  - **0.2.7**: Released January 29, 2019.
  - **Later releases**: Tagged releases continue from 0.2.8 through 0.3.13.

### Technical Details
- **Dependencies**: Requires Tesseract-OCR to be installed on the system.
- **Compatibility**: Works with Python 3.x and is used by downstream projects such as Gramps Web API.
- **Functionality**: Supports text extraction from images in multiple formats (e.g., PNG, JPEG) and provides configuration options for OCR accuracy tuning.
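
As a rough illustration of text extraction with configuration options, the sketch below calls `pytesseract.image_to_string` on an image opened with Pillow. The helper names `build_config` and `extract_text` are hypothetical, and the `--psm` and `--oem` flags are Tesseract command-line options passed through the `config` string; which values improve accuracy depends on the input and is not specified here. The OCR imports are done lazily, so the configuration helper can be used without the OCR stack installed.

```python
def build_config(psm=None, oem=None):
    """Build the extra-options string handed to Tesseract via `config`."""
    parts = []
    if oem is not None:
        parts += ["--oem", str(oem)]  # OCR engine mode
    if psm is not None:
        parts += ["--psm", str(psm)]  # page segmentation mode
    return " ".join(parts)


def extract_text(image_path, lang="eng", psm=None, oem=None):
    """Return the text recognized in the image at `image_path`."""
    # Imported here so build_config works even when these packages are absent.
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as image:
        return pytesseract.image_to_string(
            image, lang=lang, config=build_config(psm=psm, oem=oem)
        )
```

A call such as `extract_text("scan.png", psm=6)` would treat the page as a single uniform block of text, assuming a file named `scan.png` exists and both pytesseract and the Tesseract engine are installed.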

### Licensing
- **License Type**: Apache Software License 2.0.
- **Copyright Status**: Copyrighted under the author’s name (Matthias A Lee).

### Related Projects
- **Gramps Web API**: An open-source genealogy database server that uses pythontesseract.
- **Tesseract-OCR**: The underlying OCR engine, originally developed at Hewlett-Packard and later sponsored by Google.

## References

1. [Release 0.1.7. 2017](https://github.com/madmaze/pytesseract/releases/tag/v0.1.7)
2. [Release 0.1.8. 2018](https://github.com/madmaze/pytesseract/releases/tag/v0.1.8)
3. [Release 0.1.9. 2018](https://github.com/madmaze/pytesseract/releases/tag/v0.1.9)
4. [Release 0.2.0. 2018](https://github.com/madmaze/pytesseract/releases/tag/v0.2.0)
5. [Release 0.2.1. 2018](https://github.com/madmaze/pytesseract/releases/tag/v0.2.1)
6. [Release 0.2.2. 2018](https://github.com/madmaze/pytesseract/releases/tag/v0.2.2)
7. [Release 0.2.4. 2018](https://github.com/madmaze/pytesseract/releases/tag/v0.2.4)
8. [Release 0.2.5. 2018](https://github.com/madmaze/pytesseract/releases/tag/v0.2.5)
9. [Release 0.2.6. 2018](https://github.com/madmaze/pytesseract/releases/tag/v0.2.6)
10. [Release 0.2.7. 2019](https://github.com/madmaze/pytesseract/releases/tag/v0.2.7)
11. [Release 0.2.8. 2019](https://github.com/madmaze/pytesseract/releases/tag/v0.2.8)
12. [Release 0.2.9. 2019](https://github.com/madmaze/pytesseract/releases/tag/v0.2.9)
13. [Release 0.3.0. 2019](https://github.com/madmaze/pytesseract/releases/tag/v0.3.0)
14. [Release 0.3.1. 2019](https://github.com/madmaze/pytesseract/releases/tag/v0.3.1)
15. [Release 0.3.2. 2020](https://github.com/madmaze/pytesseract/releases/tag/v0.3.2)
16. [Release 0.3.3. 2020](https://github.com/madmaze/pytesseract/releases/tag/v0.3.3)
17. [Release 0.3.4. 2020](https://github.com/madmaze/pytesseract/releases/tag/v0.3.4)
18. [Release 0.3.5. 2020](https://github.com/madmaze/pytesseract/releases/tag/v0.3.5)
19. [Release 0.3.6. 2020](https://github.com/madmaze/pytesseract/releases/tag/v0.3.6)
20. [Release 0.3.7. 2020](https://github.com/madmaze/pytesseract/releases/tag/v0.3.7)
21. [Release 0.3.8. 2021](https://github.com/madmaze/pytesseract/releases/tag/v0.3.8)
22. [Release 0.3.9. 2022](https://github.com/madmaze/pytesseract/releases/tag/v0.3.9)
23. [Release 0.3.10. 2022](https://github.com/madmaze/pytesseract/releases/tag/v0.3.10)
24. [Release 0.3.11. 2023](https://github.com/madmaze/pytesseract/releases/tag/v0.3.11)
25. [Release 0.3.12. 2023](https://github.com/madmaze/pytesseract/releases/tag/v0.3.12)
26. [Release 0.3.13. 2023](https://github.com/madmaze/pytesseract/releases/tag/v0.3.13)