Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
Stars: 65.9k
Forks: 9.5k
Issues: 210
Contributors: 311
Watchers: 502
Topics: ocr, chineseocr, pdf2markdown, pp-ocr, pp-structure, document-parsing, document-translation, kie, ai4science, pdf-extractor-rag, pdf-parser, rag, paddleocr-vl
Language: Python
License: Apache License 2.0 (SPDX: Apache-2.0)
Project Description
A rich, leading, and practical OCR tool library.
