AI Vision Write Data
Controlled node
Overview
AI Vision Write Data uses a vision-language model to analyze images and extract structured information. Unlike the AI Vision Write node, which generates free-form text, this node requires a schema definition and returns its output as JSON structured to match it. This makes it well suited to tasks like extracting specific fields from documents, analyzing visual data, or converting image content into structured records.
Provide an image, a prompt describing what information to extract, and a schema defining the desired output structure. The node will analyze the image and return data matching your schema specification.
Model options
Intellectible provides several vision-language models for AI Vision Write Data.
| Model | Description | Context Size (tokens) |
|---|---|---|
| ultra light | Fast, lightweight vision model suitable for simple extraction tasks | 128000 |
| standard | Balanced performance and quality for most vision tasks | 128000 |
| advanced | High-quality vision model (Kimi K2.5) for complex analysis and detailed extraction | 128000 |
This node requires a valid schema input to define the structure of the output data. The schema can be a JSON Schema object or a regular object that will be automatically converted to JSON Schema. If the schema is invalid or missing, the node will return an error.
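The conversion from a regular object to JSON Schema can be pictured with a small sketch. This is only an illustration of the idea, not the node's actual `toJsonSchema` implementation: the function names `is_json_schema`, `to_json_schema`, and `normalize_schema`, the set of accepted type names, and the choice to mark every field as required are all assumptions made for this example.

```python
from typing import Any

# Type names accepted in the shorthand form (an assumption for this sketch).
PRIMITIVES = {"string", "number", "integer", "boolean"}
SCHEMA_TYPES = PRIMITIVES | {"object", "array", "null"}


def is_json_schema(schema: Any) -> bool:
    """Heuristic: a dict whose "type" is a JSON Schema type name is treated as JSON Schema."""
    if not isinstance(schema, dict):
        return False
    declared = schema.get("type")
    return isinstance(declared, str) and declared in SCHEMA_TYPES


def to_json_schema(shape: Any) -> dict:
    """Convert a shorthand shape such as {"total": "number"} into a JSON Schema dict."""
    if isinstance(shape, str):
        if shape not in PRIMITIVES:
            raise ValueError(f"unknown type name: {shape!r}")
        return {"type": shape}
    if isinstance(shape, list):
        # A one-element list is read as "array of that element shape".
        if len(shape) != 1:
            raise ValueError("a list shape must contain exactly one element template")
        return {"type": "array", "items": to_json_schema(shape[0])}
    if isinstance(shape, dict):
        return {
            "type": "object",
            "properties": {key: to_json_schema(value) for key, value in shape.items()},
            "required": list(shape),  # assumption: every listed field is required
        }
    raise ValueError(f"unsupported shape: {shape!r}")


def normalize_schema(schema: Any) -> dict:
    """Pass JSON Schema through unchanged, convert shorthand, reject a missing schema."""
    if schema is None:
        raise ValueError("schema is missing")
    return schema if is_json_schema(schema) else to_json_schema(schema)
```

Under these assumptions, `normalize_schema({"total": "number"})` yields an object schema with a single required `total` property of type `number`, while a dict that already declares `"type": "object"` is returned as-is.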
Inputs
| Input | Type | Description | Default |
|---|---|---|---|
| Run | Event | Fires when the node starts running | - |
| Model | Enum | The vision model to use for analysis | advanced |
| Prompt | Text | Instructions for what information to extract from the image | - |
| Image | Data | The image to analyze (accepts Jimp image objects from Read Image or image processing nodes) | - |
| Schema | Data | JSON Schema or object defining the desired output structure | - |
| Max Tokens | Number | Maximum number of tokens to generate in the response | 2000 |
| Temperature | Number | Controls randomness/creativity in the output (0-1) | 0.7 |
| Seed | Number | Random seed for reproducible results | 1 |
Outputs
| Output | Type | Description |
|---|---|---|
| Done | Event | Fires when the node has finished processing |
| Output | Data | Structured JSON data matching the provided schema |
Runtime Behavior
The node converts the input image to a base64-encoded JPEG and sends it to the vision-language model along with your prompt. The model is instructed to respond only in JSON format.
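As a rough picture of that step, the sketch below base64-encodes JPEG bytes and bundles them with the prompt. The function name `build_vision_request` and the payload field names are hypothetical; the real request format used by the node is not documented here.

```python
import base64


def build_vision_request(jpeg_bytes: bytes, prompt: str) -> dict:
    """Bundle a prompt with a base64-encoded JPEG (hypothetical payload shape)."""
    # b64encode returns bytes; decode to ASCII text so it can be embedded in JSON.
    encoded = base64.b64encode(jpeg_bytes).decode("ascii")
    return {
        "instructions": "Respond only in JSON format.",
        "prompt": prompt,
        "image": f"data:image/jpeg;base64,{encoded}",
    }
```

Base64 is lossless, so decoding the part of `image` after the comma returns the original JPEG bytes.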
If the Schema input is provided, the node will:
- Validate if it's already a JSON Schema, or convert it using `toJsonSchema` if it's a regular object
- Pass the schema to the model to ensure structured output
- Parse the model's response and validate it against the schema
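The final parse-and-validate step might look roughly like the sketch below. It is a minimal stand-in that checks only a small subset of JSON Schema (`type`, `properties`, `required`, `items`); the names `validate` and `parse_response` are hypothetical, and the node's real validator may behave differently.

```python
import json
from typing import Any

# Maps JSON Schema type names to checks on the Python values json.loads produces.
TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
    "object": lambda v: isinstance(v, dict),
    "array": lambda v: isinstance(v, list),
}


def validate(value: Any, schema: dict, path: str = "$") -> list[str]:
    """Return a list of human-readable errors; an empty list means the value matches."""
    expected = schema.get("type")
    if isinstance(expected, str) and expected in TYPE_CHECKS:
        if not TYPE_CHECKS[expected](value):
            return [f"{path}: expected {expected}"]
    errors: list[str] = []
    if expected == "object":
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: missing required field")
        for key, sub_schema in schema.get("properties", {}).items():
            if key in value:
                errors.extend(validate(value[key], sub_schema, f"{path}.{key}"))
    elif expected == "array" and "items" in schema:
        for index, item in enumerate(value):
            errors.extend(validate(item, schema["items"], f"{path}[{index}]"))
    return errors


def parse_response(raw: str, schema: dict) -> Any:
    """Parse the model's text as JSON and raise ValueError if it does not fit the schema."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on non-JSON text
    errors = validate(data, schema)
    if errors:
        raise ValueError("; ".join(errors))
    return data
```

Collecting every error with its path, rather than stopping at the first one, makes it easier to see which fields the model got wrong.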
If Randomize Output is enabled (in the properties panel), the seed is ignored and outputs will vary between runs. Otherwise, the same inputs and seed will produce identical outputs.
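The relationship between Seed and Randomize Output can be illustrated with Python's own random number generator standing in for the model's sampler. This is an analogy only: `effective_seed` and `sample_choice` are hypothetical names, and the node's sampling is performed by the model provider, not by this code.

```python
import random
from typing import Optional


def effective_seed(seed: int, randomize_output: bool) -> Optional[int]:
    """Return the seed to use, or None when Randomize Output causes it to be ignored."""
    return None if randomize_output else seed


def sample_choice(options: list[str], seed: Optional[int]) -> str:
    """Pick one option; a fixed seed always picks the same one."""
    # random.Random(None) seeds itself from system entropy, so results vary between runs.
    return random.Random(seed).choice(options)
```

With a fixed seed, repeated calls to `sample_choice` on the same options return the same value; with `None`, the pick may differ from run to run.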
The node automatically bills tokens based on the prompt tokens (including image processing) and completion tokens used during the generation.
Example
Extracting structured data from a receipt:
- Use a Read Image node to load a receipt image
- Connect the image output to the Image input of AI Vision Write Data
- Set the Prompt to: "Extract the merchant name, date, total amount, and list of items from this receipt"
- Create a Schema node (or Dictionary node) defining the structure:
{
"merchant": "string",
"date": "string",
"total": "number",
"items": [{"name": "string", "price": "number"}]
}
- Connect the schema to the Schema input
- When run, the node outputs a JSON object like:
{
"merchant": "Coffee Shop",
"date": "2024-01-15",
"total": 24.50,
"items": [
{"name": "Latte", "price": 4.50},
{"name": "Sandwich", "price": 12.00}
]
}
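If the output is consumed by custom code downstream, it can be loaded into typed records. The sketch below assumes the output arrives as a plain dictionary with the fields shown above; the `Item` and `Receipt` classes and the `receipt_from_output` helper are hypothetical names for this example.

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    price: float


@dataclass
class Receipt:
    merchant: str
    date: str
    total: float
    items: list[Item]


def receipt_from_output(output: dict) -> Receipt:
    """Build a Receipt from the node's output dictionary; raises KeyError on a missing field."""
    return Receipt(
        merchant=output["merchant"],
        date=output["date"],
        total=float(output["total"]),
        items=[Item(entry["name"], float(entry["price"])) for entry in output["items"]],
    )
```

Converting at the boundary like this means a missing field fails immediately, before the data reaches later processing steps.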