Quick Start
This guide will help you run your first OCR task with Kiri OCR using both the Command Line Interface (CLI) and Python API.
The CLI is the fastest way to test the model on an image without writing any code.
Run OCR on a single image file:
```shell
kiri-ocr predict path/to/document.jpg
```

What happens?
- Auto-Download: The model is automatically downloaded from Hugging Face (first run only).
- Detection: The text detector finds all text regions in the image.
- Recognition: The OCR model reads the text in each region.
- Output: The extracted text is printed to your terminal.
To save the extracted text and visual reports, use the --output flag:
```shell
kiri-ocr predict document.jpg --output results/ --verbose
```

This creates a results/ directory containing:
| File | Description |
|---|---|
| `extracted_text.txt` | The plain text content of the document. |
| `ocr_results.json` | Detailed structured data with bounding boxes and confidence scores. |
| `ocr_result.png` | The input image with recognized text overlaid on top. |
| `boxes.png` | The input image with detected bounding boxes drawn. |
| `report.html` | An interactive HTML report showing the image and results side by side. |
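The JSON report is convenient for post-processing with standard tools. A minimal sketch, assuming each entry in `ocr_results.json` carries the same `box`/`text`/`confidence` keys as the Python API results shown later on this page (the exact file schema is an assumption; the sample data below is illustrative):

```python
import json

# Stand-in for results/ocr_results.json; the real file's schema is assumed
# to mirror the Python API's per-line results.
sample = [
    {"box": [10, 12, 200, 24], "text": "Invoice #1234", "confidence": 0.97},
    {"box": [10, 48, 180, 22], "text": "Tot4l: $99", "confidence": 0.41},
]
with open("ocr_results.json", "w") as f:
    json.dump(sample, f)

# Load the report and keep only confidently recognized lines.
with open("ocr_results.json") as f:
    results = json.load(f)

reliable = [r["text"] for r in results if r["confidence"] >= 0.8]
print(reliable)  # → ['Invoice #1234']
```

Filtering on the confidence score like this is a common way to drop noisy detections before feeding the text into downstream steps.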
- Use GPU: Add `--device cuda` for faster processing.
- Word Mode: Use `--mode words` to detect individual words (better for sparse text). Default is `--mode lines`.
- JSON Only: Add `--no-render` to skip generating image/HTML reports (faster).
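The options above can be combined in a single invocation; a sketch using the flags as listed (the file and output paths are illustrative):

```shell
# GPU inference, word-level detection, JSON output only, saved to results/
kiri-ocr predict document.jpg --device cuda --mode words --no-render --output results/
```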
For integration into your own Python applications, use the `OCR` class.
```python
from kiri_ocr import OCR

# Initialize (downloads model automatically)
ocr = OCR()

# Run inference
text, results = ocr.extract_text('document.jpg')

# Print extracted text
print(text)
```

You can access detailed information like bounding boxes, confidence scores, and line numbers.
```python
from kiri_ocr import OCR

# Initialize with GPU support and verbose logging
ocr = OCR(device='cuda', verbose=True)

# Process document
text, results = ocr.extract_text('document.jpg')

# Iterate through detailed results
print(f"Found {len(results)} text regions:")
for line in results:
    box = line['box']            # [x, y, width, height]
    text = line['text']          # Recognized text string
    conf = line['confidence']    # Confidence score (0.0 - 1.0)
    print(f"[{conf:.1%}] {text} at {box}")
```

If you already have cropped images of text lines (e.g., from a separate detection process), you can skip the detection step:
```python
# Recognize a single cropped line image
text, confidence = ocr.recognize_single_line_image('line_crop.png')
print(f"Recognized: '{text}' with confidence {confidence:.2f}")
```

Load a model you trained yourself or downloaded separately:
```python
# Load a local model file
ocr = OCR(model_path="path/to/my_model.safetensors")

# Load from a different Hugging Face repo
ocr = OCR(model_path="my-username/my-custom-kiri-model")
```
© 2026 Kiri OCR. Released under the Apache 2.0 License.