# MinerU

## Chinese Version Introduction

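> MinerU is an advanced document parsing platform developed by OpenDataLab (Open Data Laboratory). It uses multimodal and generative AI capabilities to extract structured content from documents and process it intelligently. It converts PDF, Word documents, PPT, images, HTML, and other formats into structured data such as Markdown, JSON, LaTeX, and HTML, and integrates with large language model clients and Agent frameworks. An open-source tool with more than 56,000 stars on GitHub, MinerU focuses on high-precision extraction from complex documents, including tables, mathematical formulas, chemical equations, and multilingual text. The platform offers **a quick API channel that requires no login**, **an authenticated professional API service**, offline deployment options, and desktop clients for Windows, macOS, and Linux. Through the **MinerU Client Protocol (MCP)**, it works with LLM clients such as Cursor in a **parameterized, structured** way, turning natural language instructions into precise document parsing operations. MinerU also packages its capabilities as skills that can be integrated and orchestrated through **OpenClaw-style streaming workflows**, supporting automated processing pipelines.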

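### Core Features

- **High-fidelity structured document conversion**: Converts PDF, Word, PPT, images, HTML, and other formats into structured formats such as Markdown, JSON, LaTeX, and HTML while preserving content integrity and format accuracy.
- **Advanced table recognition and processing**: Recognizes rotated tables, cells that span pages, and merged cells, with export to CSV, HTML, or Markdown.
- **Precise formula recognition**: Outputs mathematical formulas in LaTeX or MathML.
- **Multilingual OCR**: Provides optical character recognition across multiple languages.
- **Chemistry paper analysis**: Supports domain-specific parsing such as molecular structure detection.
- **Batch processing**: Processes large numbers of PDF documents efficiently.
- **Image and chart extraction**: Identifies and extracts images and charts from documents.
- **MinerU Client Protocol (MCP) support**: Integrates with LLM clients through a lightweight client for parameterized, structured document parsing calls.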

### Agents

- [Skills & MCP](https://mineru.net/apiManage/docs): Details on the MinerU Agent ecosystem and the MCP Client, including how to package MinerU capabilities as skills that AI Agents can call.
- [No-Login Agent API](https://mineru.net/apiManage/docs): No Token required. Use `POST https://mineru.net/api/v1/agent/parse/url` to parse a PDF URL directly. Designed for AI Agent workflows (files ≤ 10MB, ≤ 20 pages).
- [Precision Parsing API (Login Required)](https://mineru.net/apiManage/docs): Requires a Token. Submit parsing tasks via `POST https://mineru.net/api/v4/extract/task`. Supports up to 200MB / 600 pages and outputs Markdown/JSON/docx/html/latex.
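
The size and page limits of the two endpoints listed above can be checked before a request is sent. The following is a minimal sketch, not part of any MinerU SDK: the helper names (`Limits`, `fits`, `choose_endpoint`) are hypothetical, "MB" is assumed to mean 1024 × 1024 bytes, and preferring the no-login endpoint for small files is an illustrative policy choice rather than documented behavior.

```python
from dataclasses import dataclass

# Endpoints quoted in the list above.
AGENT_URL = "https://mineru.net/api/v1/agent/parse/url"
PRECISION_URL = "https://mineru.net/api/v4/extract/task"

MB = 1024 * 1024  # assumption: "MB" is read as 1024 * 1024 bytes


@dataclass(frozen=True)
class Limits:
    """Upper bounds on file size and page count for one endpoint."""
    max_bytes: int
    max_pages: int


AGENT_LIMITS = Limits(max_bytes=10 * MB, max_pages=20)        # no-login API
PRECISION_LIMITS = Limits(max_bytes=200 * MB, max_pages=600)  # Token API


def fits(size_bytes: int, pages: int, limits: Limits) -> bool:
    """Return True if the document is within both limits."""
    return size_bytes <= limits.max_bytes and pages <= limits.max_pages


def choose_endpoint(size_bytes: int, pages: int, have_token: bool) -> str:
    """Pick an endpoint for a document (hypothetical policy).

    Small documents go to the no-login endpoint; larger ones need a Token.
    """
    if fits(size_bytes, pages, AGENT_LIMITS):
        return AGENT_URL
    if have_token and fits(size_bytes, pages, PRECISION_LIMITS):
        return PRECISION_URL
    raise ValueError("document exceeds the limits of every available endpoint")


if __name__ == "__main__":
    print(choose_endpoint(5 * MB, 12, have_token=False))
    print(choose_endpoint(80 * MB, 300, have_token=True))
```

A caller that always wants the precision pipeline could skip the first branch; the point of the sketch is only that both limits (size and pages) must hold at the same time.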

### API CLI

- [CLI/SDK](https://github.com/opendatalab/MinerU-Ecosystem/blob/main/cli/README.md): Command-line tools and SDKs with copy-and-run commands, so developers can integrate MinerU into their workflows quickly.

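### Customization

- [OpenClaw Skills](https://clawhub.ai/MinerU-Extract/mineru-ai): Explains how to package MinerU's document parsing capabilities as skills that comply with the OpenClaw specification, for integration with other AI workflows.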

### Documentation

- [API Reference](https://mineru.net/apiManage/docs): REST API documentation with Python code examples.
- [GitHub Repository](https://github.com/opendatalab/MinerU): Source code, issue tracking, and releases.
- [Rate Limits](https://mineru.net/apiManage/limit): API quota and rate limiting policies.
- [KIE SDK](https://mineru.net/apiManage/kie-sdk): Key Information Extraction SDK guide.
- [KIE Usage](https://mineru.net/apiManage/kie-usage): KIE tutorials and examples.

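### Research

- [MinerU Paper (arXiv:2409.18839)](https://arxiv.org/abs/2409.18839): MinerU: An Open-Source Solution for Precise Document Content Extraction.
- [MinerU 2.5 Paper (arXiv:2509.22186)](https://arxiv.org/abs/2509.22186): MinerU 2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing.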

### Products

- [Online Usage](https://mineru.net/OpenSourceTools/Extractor): Online demo for document/webpage parsing.
- [Online API](https://mineru.net/apiManage/docs): RESTful API for document/webpage parsing.
- [Desktop Client](https://mineru.net/client): Free desktop application for Windows, macOS, and Linux.
- [Hugging Face Demo](https://huggingface.co/spaces/opendatalab/MinerU): Try MinerU online without installation.
- [Python SDK](https://github.com/opendatalab/MinerU-Ecosystem/tree/main/sdk/python): Install with `pip install mineru-open-sdk`.
- [Go SDK](https://github.com/opendatalab/MinerU-Ecosystem/tree/main/sdk/go): Install with `go get github.com/opendatalab/MinerU-Ecosystem/sdk/go@latest`.
- [TypeScript SDK](https://github.com/opendatalab/MinerU-Ecosystem/tree/main/sdk/typescript): Install with `npm install mineru-open-sdk`.

### Ecosystem Integrations

- [Dify Plugin](https://marketplace.dify.ai/plugins/langgenius/mineru): Official MinerU plugin for Dify, for fast document content retrieval and augmented generation.
- [Coze Plugin](https://www.coze.cn/store/plugin/7527957359730360354): Official MinerU plugin for Coze, supporting document parsing on the Coze platform.
- [n8n Node](https://www.npmjs.com/package/n8n-nodes-mineru): Official MinerU node for n8n, for building automated workflows.
- [FastGPT Integration](https://opendatalab.github.io/MinerU/zh/usage/plugin/FastGPT/): Integrate MinerU into FastGPT to improve information extraction.
- [RagFlow Integration](https://opendatalab.github.io/MinerU/zh/usage/plugin/RagFlow/): Use MinerU in RagFlow for document parsing and data structuring.
- [LangChain](https://github.com/opendatalab/MinerU-Ecosystem/tree/main/langchain_mineru): LangChain plugin for building RAG pipelines.
- [Cherry Studio](https://opendatalab.github.io/MinerU/zh/usage/plugin/Cherry_Studio/): MinerU integration in Cherry Studio, offering more document processing options.
- [LlamaIndex](https://github.com/opendatalab/MinerU-Ecosystem/tree/main/llama-index-readers-mineru): LlamaIndex plugin for building RAG pipelines.

### About MinerU

Developed and maintained by OpenDataLab, part of Shanghai AI Laboratory. The project is open source under the AGPL-3.0 license.

---

## English Version Introduction

> MinerU, developed by OpenDataLab (Open Data Laboratory), is an advanced document parsing platform that uses multimodal and generative AI capabilities to extract structured content from documents and process it intelligently. It converts PDF, Word documents, PPT, images, HTML, and other formats into structured data such as Markdown, JSON, LaTeX, and HTML, and integrates with large language model clients and Agent frameworks. An open-source tool with more than 56,000 stars on GitHub, MinerU specializes in high-precision information extraction from complex documents, including tables, mathematical formulas, chemical equations, and multilingual text. The platform offers **a free quick-access API channel that requires no login**, **authenticated professional API services**, offline deployment options, and desktop clients for Windows, macOS, and Linux. Through its **MinerU Client Protocol (MCP)**, it works with LLM clients such as Cursor in a **parameterized, structured** way, translating natural language instructions into precise document parsing operations. MinerU also packages its capabilities as skills that can be integrated and orchestrated via **OpenClaw-style streaming workflows**, supporting automated processing.

### Core Features

- **High-Fidelity Document Structural Conversion**: Supports converting PDF, Word, PPT, images, HTML, and other formats into structured formats like Markdown/JSON/LaTeX/HTML, ensuring content integrity and format accuracy.
- **Advanced Table Recognition & Processing**: Capable of identifying rotated tables, cross-page cells, and merged cells, with flexible export options to CSV, HTML, and Markdown.
- **Precise Formula Recognition**: Outputs mathematical formulas in LaTeX/MathML format.
- **Multilingual OCR Text Recognition**: Possesses robust multilingual optical character recognition capabilities.
- **Chemical Paper Analysis**: Supports specialized document parsing, including molecular structure detection.
- **Batch Processing Capability**: Efficiently handles large volumes of PDF documents.
- **Image and Chart Extraction**: Intelligently identifies and extracts images and charts from documents.
- **MinerU Client Protocol (MCP) Support**: Seamless integration with LLM clients via a lightweight client for parameterized, structured document parsing calls.

### Agents

- [Skills & MCP](https://mineru.net/apiManage/docs): Details on the MinerU Agent ecosystem and the MCP Client, including how to package MinerU capabilities as skills that AI Agents can call.
- [Agent Lightweight API (No Token)](https://mineru.net/apiManage/docs): No login required. Use `POST https://mineru.net/api/v1/agent/parse/url` to parse a PDF URL directly. Designed for AI Agent workflows (≤ 10MB, ≤ 20 pages).
- [Precision Parsing API (Token Required)](https://mineru.net/apiManage/docs): Apply for a Token at mineru.net/apiManage/docs. Submit tasks via `POST https://mineru.net/api/v4/extract/task` with `{"url": "...", "model_version": "vlm"}`. Poll results via `GET https://mineru.net/api/v4/extract/task/{task_id}`. Supports up to 200MB / 600 pages and outputs Markdown/JSON/docx/html/latex.
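
The submit-and-poll flow of the Precision Parsing API can be sketched with the Python standard library. This is a hedged illustration, not the official client: the endpoint, the `url` and `model_version` fields, and the `{task_id}` path come from the list above, while the `Authorization: Bearer <token>` header format and the helper names (`build_submit_request`, `build_poll_request`) are assumptions. The code only constructs the requests; it sends nothing and makes no claim about the response format.

```python
import json
import urllib.parse
import urllib.request

# Endpoint quoted in the list above.
TASK_URL = "https://mineru.net/api/v4/extract/task"


def _auth_headers(token: str) -> dict:
    # Assumption: the Token is passed as a Bearer credential.
    return {"Authorization": f"Bearer {token}"}


def build_submit_request(pdf_url: str, token: str,
                         model_version: str = "vlm") -> urllib.request.Request:
    """Build the POST request that submits a parsing task."""
    body = json.dumps({"url": pdf_url, "model_version": model_version})
    headers = {"Content-Type": "application/json", **_auth_headers(token)}
    return urllib.request.Request(
        TASK_URL, data=body.encode("utf-8"), headers=headers, method="POST"
    )


def build_poll_request(task_id: str, token: str) -> urllib.request.Request:
    """Build the GET request that polls a task by its id."""
    # Percent-encode the id so it cannot alter the URL path.
    safe_id = urllib.parse.quote(task_id, safe="")
    return urllib.request.Request(
        f"{TASK_URL}/{safe_id}", headers=_auth_headers(token), method="GET"
    )


if __name__ == "__main__":
    req = build_submit_request("https://example.com/sample.pdf", "YOUR_TOKEN")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

To actually send a request, pass it to `urllib.request.urlopen(req)`. The structure of the JSON that comes back (including where the task id appears) is not described here, so check the API Reference before parsing it.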

### API CLI

- [CLI/SDK](https://github.com/opendatalab/MinerU-Ecosystem/blob/main/cli/README.md): Command-line tools and SDKs with copy-and-run commands, so developers can integrate MinerU into their workflows quickly.

### Customization

- [OpenClaw Skills](https://clawhub.ai/MinerU-Extract/mineru-ai): Explains how to package MinerU's document parsing capabilities as skills that comply with the OpenClaw specification, for integration with other AI workflows.

### Documentation

- [API Reference](https://mineru.net/apiManage/docs): REST API documentation with Python code examples.
- [GitHub Repository](https://github.com/opendatalab/MinerU): Source code, issue tracking, and releases.
- [Rate Limits](https://mineru.net/apiManage/limit): API quota and rate limiting policies.
- [KIE SDK](https://mineru.net/apiManage/kie-sdk): Key Information Extraction SDK guide.
- [KIE Usage](https://mineru.net/apiManage/kie-usage): KIE tutorials and examples.

### Research

- [MinerU Paper (arXiv:2409.18839)](https://arxiv.org/abs/2409.18839): MinerU: An Open-Source Solution for Precise Document Content Extraction.
- [MinerU 2.5 Paper (arXiv:2509.22186)](https://arxiv.org/abs/2509.22186): MinerU 2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing.

### Products

- [Online Usage](https://mineru.net/OpenSourceTools/Extractor): Online demo for document/webpage parsing.
- [Online API](https://mineru.net/apiManage/docs): RESTful API for document/webpage parsing.
- [Desktop Client](https://mineru.net/client): Free desktop application for Windows, macOS, and Linux.
- [Hugging Face Demo](https://huggingface.co/spaces/opendatalab/MinerU): Try MinerU online without installation.

### Ecosystem Integrations

- [Dify Plugin](https://marketplace.dify.ai/plugins/langgenius/mineru): Official MinerU plugin for Dify, for fast document content retrieval and augmented generation.
- [Coze Plugin](https://www.coze.cn/store/plugin/7527957359730360354): Official MinerU plugin for Coze, supporting document parsing on the Coze platform.
- [n8n Node](https://www.npmjs.com/package/n8n-nodes-mineru): Official MinerU node for n8n, for building automated workflows.
- [FastGPT Integration](https://opendatalab.github.io/MinerU/zh/usage/plugin/FastGPT/): Integrate MinerU into FastGPT to improve information extraction.
- [RagFlow Integration](https://opendatalab.github.io/MinerU/zh/usage/plugin/RagFlow/): Use MinerU in RagFlow for document parsing and data structuring.
- [Cherry Studio](https://opendatalab.github.io/MinerU/zh/usage/plugin/Cherry_Studio/): MinerU integration in Cherry Studio, offering more document processing options.

### About MinerU

Developed and maintained by OpenDataLab, part of Shanghai AI Laboratory. The project is open source under the AGPL-3.0 license.