getomni-ai

zerox

未分类

getomni-ai

OCR & Document Extraction using vision models

11.1k
Stars
730
Forks
53
Issues
17
Contributors
56
Watchers
ocrpdf
TypeScript
{"name":"MIT License","spdxId":"MIT"}

Project Description

Zerox OCR is a versatile tool designed to convert documents (PDF, DOCX, images, etc.) into Markdown format for AI ingestion. It processes files by converting them into images, then uses vision models (e.g., OpenAI, Azure OpenAI, AWS Bedrock, Google Gemini) to extract and aggregate Markdown content. Key features include support for multiple file types, structured data extraction, per-page processing, and customizable system prompts. Available as both Node.js and Python packages, Zerox offers options for maintaining format, concurrent processing, and error handling. It’s ideal for handling complex layouts, tables, and charts, making it a powerful solution for document-to-text conversion in AI workflows.

© 2025 GitHub Fun. All rights reserved.