Read PDF
Controlled node
Overview
Extracts text content from PDF files. The node supports two parsing modes: Analytical (traditional text extraction) and AI (vision-based parsing for complex layouts, scanned documents, or images within PDFs). The output is returned in markdown format.
Parser Options
| Parser | Description | Best For |
|---|---|---|
| analytical | Traditional PDF text extraction using document parsing algorithms. | Standard text-based PDFs, digital documents. |
| ai | Vision-based parsing using AI models to read the document as images. | Scanned documents, complex layouts, images within PDFs, handwritten content. |
Choosing a Parser
Use analytical for standard digital PDFs with selectable text. Use ai for scanned documents, PDFs with complex layouts, or when the analytical parser fails to extract text correctly.
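The guidance above amounts to a simple decision rule. The Python sketch below is illustrative only: `choose_parser` and its parameters are hypothetical names, not part of the node. Only the returned strings, analytical and ai, come from the Parser options documented here.

```python
def choose_parser(
    has_selectable_text: bool,
    complex_layout: bool = False,
    analytical_failed: bool = False,
) -> str:
    """Pick a Parser value following the guidance above (hypothetical helper)."""
    # Scanned documents (no selectable text), complex layouts, or a failed
    # analytical attempt all point to the vision-based parser.
    if not has_selectable_text or complex_layout or analytical_failed:
        return "ai"
    # Standard digital PDFs with selectable text use the default parser.
    return "analytical"
```

Trying analytical first and switching to ai only when extraction fails keeps the slower, token-consuming mode for the documents that need it.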
Inputs
| Input | Type | Description | Default |
|---|---|---|---|
| Run | Event | Triggers the node to start parsing the PDF. | - |
| File | FileSource | The PDF file to parse. Accepts a single file from the library. | - |
| Parser | Enum | The parsing method to use. Options: analytical or ai. | analytical |
Outputs
| Output | Type | Description |
|---|---|---|
| Done | Event | Fires when the node has finished parsing. |
| Text | Text | The extracted text content in markdown format. Returns an error object if parsing fails. |
Runtime Behavior and Defaults
- File Handling: If an array of files is provided, the node processes only the first file.
- Project Context: Requires a valid project ID from the workflow runtime context. Returns an error if no project ID is found.
- Analytical Mode: Uses traditional PDF parsing to extract text content directly from the document structure.
- AI Mode: Generates a temporary download link for the file and processes it through an AI vision service. This mode is better for scanned documents or complex layouts but may take longer and consume additional tokens.
- Error Handling: Returns an error object in the Text output if the file is invalid, the project ID is missing, or parsing fails.
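The checks listed above can be modeled roughly as follows. This is a minimal sketch, not the node's actual implementation: the function name `prepare_read_pdf`, the dictionary shape of the error object, and the message strings are all assumptions made for illustration.

```python
from typing import Any, Optional


def prepare_read_pdf(
    file: Any, project_id: Optional[str], parser: str = "analytical"
) -> dict:
    """Model the documented pre-parse checks (hypothetical helper)."""
    # File Handling: if an array of files is provided, keep only the first.
    if isinstance(file, (list, tuple)):
        file = file[0] if file else None
    # Project Context: a project ID from the runtime context is required.
    if not project_id:
        return {"error": "missing project ID"}  # assumed error shape
    # Error Handling: an invalid (here: absent) file yields an error object.
    if file is None:
        return {"error": "invalid file"}  # assumed error shape
    return {"file": file, "parser": parser}
```

For example, `prepare_read_pdf(["a.pdf", "b.pdf"], "p1")` keeps only `"a.pdf"` and falls back to the default analytical parser.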
Example Usage
Basic PDF Text Extraction
[Start Node] → Run → [Read PDF Node] → Done → [AI Write Node]
                           ↓
                  File (PDF from library)
                           ↓
                  Text → Prompt input of AI Write
Processing Scanned Documents
- Set the Parser property to ai in the properties panel.
- Connect a file source containing a scanned PDF to the File input.
- Connect the Text output to an AI Write node to summarize or analyze the scanned content.
Error Handling Pattern
[Read PDF Node] → Text → [If Node] → True (text contains content) → [Process Text]
                              ↓
                         False (text contains error) → [Log Error Node]
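The branch in the pattern above could be expressed as the check below. This is a sketch under stated assumptions: the error object is taken to be a dictionary with an `error` key, and `is_error_output`, `route_text_output`, and the branch labels are hypothetical names. The real shape of the error object may differ.

```python
from typing import Any


def is_error_output(text_output: Any) -> bool:
    """True if the Text output looks like an error object (assumed shape)."""
    return isinstance(text_output, dict) and "error" in text_output


def route_text_output(text_output: Any) -> str:
    """Return the branch to follow, mirroring the If Node in the diagram."""
    if is_error_output(text_output):
        return "log_error"
    # Non-empty markdown text goes on to processing.
    if isinstance(text_output, str) and text_output.strip():
        return "process_text"
    # Empty or unexpected output is treated as a failure in this sketch.
    return "log_error"
```

Routing empty text to the error branch is a design choice of this sketch: it keeps a downstream AI Write node from receiving a blank prompt.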