OCR

Module · Stable

Extract structured fields from a document image using the configured OCR provider.

The input is a document image (JPEG, PNG). In field_list mode, set ocr_fields to a list of field names (e.g. ['invoice_number', 'total', 'date']); in structured_schema mode, supply a JSON schema dict via response_schema. Downstream steps can read each extracted field as a child key, e.g. ocr_a.ocr_fields.invoice_number.
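As a rough illustration of the two modes, the sketch below writes each configuration as a plain Python dict. Only the key names (file_path, extraction_mode, ocr_fields, response_schema) and the mode names come from this page; the dict layout, the placeholder path, the schema contents, and the validate_step helper are assumptions made for illustration and are not the module's actual API.

```python
# Hypothetical step configurations. Key names are taken from the Inputs list;
# the overall layout and the sample path are assumptions.
field_list_step = {
    "file_path": "/tmp/invoice.png",  # placeholder path
    "extraction_mode": "field_list",
    "ocr_fields": ["invoice_number", "total", "date"],
}

structured_schema_step = {
    "file_path": "/tmp/invoice.png",  # placeholder path
    "extraction_mode": "structured_schema",
    # An ordinary JSON Schema object, written as a Python dict.
    "response_schema": {
        "type": "object",
        "properties": {
            "invoice_number": {"type": "string"},
            "total": {"type": "number"},
            "date": {"type": "string"},
        },
        "required": ["invoice_number", "total"],
    },
}


def validate_step(config):
    """Hypothetical helper: check that a config carries the input its mode needs."""
    mode = config.get("extraction_mode")
    if mode == "field_list":
        fields = config.get("ocr_fields")
        is_name_list = (
            isinstance(fields, list)
            and len(fields) > 0
            and all(isinstance(name, str) for name in fields)
        )
        if not is_name_list:
            raise ValueError("field_list mode needs ocr_fields as a non-empty list of names")
    elif mode == "structured_schema":
        if not isinstance(config.get("response_schema"), dict):
            raise ValueError("structured_schema mode needs response_schema as a dict")
    else:
        raise ValueError(f"unknown extraction_mode: {mode!r}")
    return mode


for step in (field_list_step, structured_schema_step):
    print(validate_step(step))
```

The point of the sketch is that each mode reads exactly one of the two inputs: ocr_fields in field_list mode, response_schema in structured_schema mode.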

When to use

Use when a file path arrives from an upstream step — an Incoming email attachment, a Document Store output, or an actor upload — and you need its fields (invoice number, total, date, etc.) as named context variables. The resulting ocr_fields dict feeds directly into Create Deal, Add Note, Set values, or any send step.
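To make the downstream hand-off concrete, here is a minimal sketch of how a later step might read the extracted fields through a dotted path. It assumes the workflow context behaves like a nested dict keyed by step name; the resolve helper, the context layout, and the sample values are all hypothetical, and only the path form ocr_a.ocr_fields.invoice_number comes from this page.

```python
def resolve(context, dotted_path):
    """Walk a nested dict along a dotted path such as 'ocr_a.ocr_fields.total'."""
    current = context
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            raise KeyError(f"{dotted_path}: missing key {key!r}")
        current = current[key]
    return current


# Made-up sample context, standing in for what an OCR step named ocr_a
# might leave behind. The values are invented for this example.
context = {
    "ocr_a": {
        "ocr_fields": {
            "invoice_number": "INV-0001",
            "total": "100.00",
            "date": "2024-01-31",
        },
    },
}

# Build the body of a note from the named fields.
note = "Invoice {number} for {total}, dated {date}".format(
    number=resolve(context, "ocr_a.ocr_fields.invoice_number"),
    total=resolve(context, "ocr_a.ocr_fields.total"),
    date=resolve(context, "ocr_a.ocr_fields.date"),
)
print(note)  # prints: Invoice INV-0001 for 100.00, dated 2024-01-31
```

Raising KeyError on a missing segment, rather than returning None, is a deliberate choice in this sketch: a field the OCR step failed to extract surfaces as an error instead of silently producing an empty value in a deal or note.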

When not to use

If the document is a native PDF with selectable text rather than a scanned image, use Extract Proposal Field — it uses text extraction and is faster and more reliable on digital PDFs. If you only need a free-form summary of a document rather than specific named fields, use Ask AI with a vision-capable model instead.

Inputs

Configured per use: file_path, image_path, ocr_engine, ocr_provider, extraction_mode, ocr_fields, response_schema, extraction_instructions.

Outputs

ocr_result
ocr_fields
ocr_engine_used

Auto-generated from the skill registry (load_skills()). Do not edit by hand.