Chinese version: README.zh.md
SnapOCR brings Baidu PaddleOCR's document parsing workflow into Raycast.
PaddleOCR positions itself as a toolkit that turns PDF and image documents into structured, AI-friendly data such as Markdown and JSON, with multilingual support for global document workflows. SnapOCR packages that capability into a lightweight desktop workflow for screenshots, PDFs, and images.
This is not a plain text OCR wrapper. SnapOCR uses Paddle's layout-parsing API to understand document structure before recognition, then reconstructs headings, paragraphs, tables, formulas, charts, and image regions into reusable Markdown. The output is much closer to the original hierarchy of the page than a flat OCR dump.
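As a rough illustration of the reconstruction step, the sketch below stitches per-page Markdown into one document. The `ParsedPage` shape and the `joinPages` name are hypothetical and chosen only for this example; the real response format of the layout-parsing API is not described in this README and may differ.

```typescript
// Hypothetical shape of one parsed page. The field names are assumptions
// made for this sketch, not a documented contract.
interface ParsedPage {
  markdown: { text: string };
}

// Join per-page Markdown into one document, skipping empty pages and
// separating the remaining pages with a blank line.
function joinPages(pages: ParsedPage[]): string {
  return pages
    .map((page) => page.markdown.text.trim())
    .filter((text) => text.length > 0)
    .join("\n\n");
}
```

Keeping this step as a pure function would let the same stitching logic serve the clipboard, preview, and export paths.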
SnapOCR is not limited to one language or one region. The underlying PaddleOCR stack is built for multilingual document understanding, and its document parsing pipeline targets complex, real-world pages rather than only clean text screenshots.
That makes SnapOCR suitable for users worldwide who work with:
PaddleOCR's own product direction maps directly to SnapOCR's value:
SnapOCR turns those capabilities into a fast desktop workflow without requiring local OCR models, native dependencies, or command-line tooling.
| Command | Description |
|---|---|
| Quick OCR | Capture a screenshot and instantly copy structured OCR text |
| Preview OCR | Capture a screenshot and review structured OCR output in Raycast |
| Export OCR Markdown | Choose a PDF or image file and export a Markdown bundle with extracted images |
Use Quick OCR when you want the fastest workflow for turning a screenshot into reusable text.
Use Preview OCR when you want to inspect the recognized result in Raycast first, especially for complex pages with tables, formulas, or dense document structure.
Use Export OCR Markdown when you want a real output package:
- `document.md` file

Because SnapOCR uses Paddle's layout parsing endpoint, the Markdown output preserves much more structure than a plain OCR pipeline:
Enable optional features in extension preferences when you need better recovery from messy real-world inputs:
You need a free Baidu AIStudio account to get the API credentials. The OCR service is provided through Baidu AIStudio's PaddleOCR platform.
Baidu AIStudio also offers a free PaddleOCR allowance that is usually enough for personal use. Exact quotas can change, so check the official PaddleOCR page for the latest details.
- `API_URL` - the base URL, for example `https://xxx.aistudio-app.com`, without `/layout-parsing`
- `TOKEN` - the access token string

Open Raycast, search for one of the commands, and enter:
| Setting | Value |
|---|---|
| Access Token | The `TOKEN` value from step 1 |
| API URL | The base URL from `API_URL`, for example `https://xxx.aistudio-app.com` |
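Because the API URL setting expects the base URL without `/layout-parsing`, a client has to append that path itself. The sketch below shows one way this could be done, and it also tolerates a pasted full endpoint. The helper names `resolveEndpoint` and `buildHeaders` are hypothetical, and the `token <TOKEN>` authorization scheme is an assumption, not something this README confirms.

```typescript
// Hypothetical helper: turn the "API URL" setting into the full endpoint URL.
function resolveEndpoint(apiUrl: string): string {
  const base = apiUrl
    .trim()
    .replace(/\/+$/, "") // drop trailing slashes
    .replace(/\/layout-parsing$/, ""); // tolerate a pasted full endpoint
  return `${base}/layout-parsing`;
}

// Hypothetical helper: request headers for a JSON call.
// Assumption: the service expects "token <TOKEN>" in the Authorization header.
function buildHeaders(token: string): Record<string, string> {
  return {
    Authorization: `token ${token.trim()}`,
    "Content-Type": "application/json",
  };
}
```

For example, `resolveEndpoint("https://xxx.aistudio-app.com")` yields `https://xxx.aistudio-app.com/layout-parsing`, and the same result comes back if the full endpoint was pasted with a trailing slash.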