AI Vision Write Data

Controlled node

Overview

AI Vision Write Data uses a vision-language model to analyze images and extract structured information. Unlike the AI Vision Write node, which generates free-form text, this node requires a schema definition and structures its output as JSON. This makes it well suited to tasks such as extracting specific fields from documents, analyzing visual data, or converting image content into structured records.

Provide an image, a prompt describing what information to extract, and a schema defining the desired output structure. The node will analyze the image and return data matching your schema specification.

Model options

Intellectible provides several vision-language models for AI Vision Write Data.

Model	Description	Context Size (tokens)
ultra light	Fast, lightweight vision model suitable for simple extraction tasks	128000
standard	Balanced performance and quality for most vision tasks	128000
advanced	High-quality vision model (Kimi K2.5) for complex analysis and detailed extraction	128000

Schema required

This node requires a valid schema input to define the structure of the output data. The schema can be a JSON Schema object or a regular object that will be automatically converted to JSON Schema. If the schema is invalid or missing, the node will return an error.
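
As a rough illustration of the two accepted forms, the sketch below expands a shorthand object (type names as values, as in the receipt example further down this page) into a JSON Schema. The function `shorthand_to_json_schema` is hypothetical and written for this page only. The node's internal conversion may map values differently, so treat the mapping shown here as an assumption.

```python
import json

# Hypothetical helper: NOT the node's actual conversion logic.
# Assumes shorthand values are type names ("string", "number", ...),
# nested dicts are objects, and one-element lists describe array items.
TYPE_NAMES = {"string", "number", "integer", "boolean", "null"}


def shorthand_to_json_schema(shape):
    if isinstance(shape, str):
        if shape not in TYPE_NAMES:
            raise ValueError(f"unknown type name: {shape!r}")
        return {"type": shape}
    if isinstance(shape, list):
        if len(shape) != 1:
            raise ValueError("array shorthand must contain exactly one item shape")
        return {"type": "array", "items": shorthand_to_json_schema(shape[0])}
    if isinstance(shape, dict):
        return {
            "type": "object",
            "properties": {k: shorthand_to_json_schema(v) for k, v in shape.items()},
            "required": list(shape.keys()),
        }
    raise TypeError(f"unsupported shorthand value: {shape!r}")


receipt_shape = {
    "merchant": "string",
    "date": "string",
    "total": "number",
    "items": [{"name": "string", "price": "number"}],
}
receipt_schema = shorthand_to_json_schema(receipt_shape)
print(json.dumps(receipt_schema, indent=2))
```

Marking every property as required is a choice made for this sketch. Supplying an explicit JSON Schema is the safer route when some fields are optional.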

Inputs

Input	Type	Description	Default
Run	Event	Fires when the node starts running	-
Model	Enum	The vision model to use for analysis	advanced
Prompt	Text	Instructions for what information to extract from the image	-
Image	Data	The image to analyze (accepts Jimp image objects from Read Image or image processing nodes)	-
Schema	Data	JSON Schema or object defining the desired output structure	-
Max Tokens	Number	Maximum number of tokens to generate in the response	2000
Temperature	Number	Controls randomness/creativity in the output (0-1)	0.7
Seed	Number	Random seed for reproducible results	1

Outputs

Output	Type	Description
Done	Event	Fires when the node has finished processing
Output	Data	Structured JSON data matching the provided schema

Runtime Behavior

The node converts the input image to a base64-encoded JPEG and sends it to the vision-language model along with your prompt. The model is instructed to respond only in JSON format.
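
To make the encoding step concrete, here is a minimal sketch of base64-encoding JPEG bytes and attaching them to a request. The data URL form, the `jpeg_bytes_to_data_url` helper, and the payload field names are all assumptions for illustration; the node's actual request format is not documented here.

```python
import base64


def jpeg_bytes_to_data_url(jpeg_bytes: bytes) -> str:
    """Encode already-compressed JPEG bytes as a base64 data URL."""
    encoded = base64.b64encode(jpeg_bytes).decode("ascii")
    return f"data:image/jpeg;base64,{encoded}"


# JPEG files start with the bytes FF D8 FF; this stub stands in for a real image.
stub_jpeg = bytes([0xFF, 0xD8, 0xFF, 0xE0]) + b"placeholder"
data_url = jpeg_bytes_to_data_url(stub_jpeg)

# Hypothetical request payload: the field names are illustrative only.
request = {
    "prompt": "Extract the merchant name and total from this receipt",
    "image": data_url,
    "response_format": "json",
}
print(request["image"][:30])
```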

When the Schema input is provided, the node will:

1. Check whether it is already a JSON Schema, and convert it using toJsonSchema if it is a regular object
2. Pass the schema to the model to ensure structured output
3. Parse the model's response and validate it against the schema
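
The parse-and-validate step can be sketched as follows. The `validate` function is a hypothetical stand-in that handles only a small subset of JSON Schema (`type`, `properties`, `required`, `items`); it is not the node's validator, and a complete JSON Schema validator would be the usual choice in practice.

```python
import json


def validate(value, schema, path="$"):
    """Return a list of error strings; an empty list means the value matches."""
    errors = []
    expected = schema.get("type")
    if expected == "object":
        if not isinstance(value, dict):
            return [f"{path}: expected object"]
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: missing required property")
        for key, sub_schema in schema.get("properties", {}).items():
            if key in value:
                errors.extend(validate(value[key], sub_schema, f"{path}.{key}"))
    elif expected == "array":
        if not isinstance(value, list):
            return [f"{path}: expected array"]
        item_schema = schema.get("items", {})
        for index, item in enumerate(value):
            errors.extend(validate(item, item_schema, f"{path}[{index}]"))
    elif expected == "string":
        if not isinstance(value, str):
            errors.append(f"{path}: expected string")
    elif expected == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            errors.append(f"{path}: expected number")
    return errors


schema = {
    "type": "object",
    "properties": {"merchant": {"type": "string"}, "total": {"type": "number"}},
    "required": ["merchant", "total"],
}

# Simulated model responses: one valid, one with a string where a number is expected.
good = json.loads('{"merchant": "Coffee Shop", "total": 24.5}')
bad = json.loads('{"merchant": "Coffee Shop", "total": "24.50"}')

good_errors = validate(good, schema)
bad_errors = validate(bad, schema)
print(good_errors, bad_errors)
```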

If Randomize Output is enabled (in the properties panel), the seed is ignored and outputs will vary between runs. Otherwise, the same inputs and seed will produce identical outputs.
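
The seed behavior can be pictured with a small sketch. Both `effective_seed` and `fake_generation` are hypothetical: the first mimics ignoring the configured seed when Randomize Output is on, and the second uses Python's seeded random generator as a stand-in for model sampling. Neither reflects how the node is implemented.

```python
import random


def effective_seed(seed: int, randomize_output: bool) -> int:
    """Pick the seed a run would use (hypothetical helper)."""
    if randomize_output:
        # Ignore the configured seed and draw a fresh one for each run.
        return random.SystemRandom().randrange(2**31)
    return seed


def fake_generation(seed: int) -> list:
    """Stand-in for seeded sampling: the same seed gives the same sequence."""
    rng = random.Random(seed)
    return [rng.randint(0, 9) for _ in range(5)]


fixed_a = fake_generation(effective_seed(1, randomize_output=False))
fixed_b = fake_generation(effective_seed(1, randomize_output=False))
print(fixed_a == fixed_b)  # True: identical seed, identical sequence
```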

Usage is billed automatically, based on the prompt tokens (including image processing) and the completion tokens consumed during generation.

Example

Extracting structured data from a receipt:

1. Use a Read Image node to load a receipt image.
2. Connect the image output to the Image input of AI Vision Write Data.
3. Set the Prompt to: "Extract the merchant name, date, total amount, and list of items from this receipt"
4. Create a Schema node (or Dictionary node) defining the structure:

{
  "merchant": "string",
  "date": "string",
  "total": "number",
  "items": [{"name": "string", "price": "number"}]
}

5. Connect the schema to the Schema input.

When run, the node outputs a JSON object like:

{
  "merchant": "Coffee Shop",
  "date": "2024-01-15",
  "total": 24.50,
  "items": [
    {"name": "Latte", "price": 4.50},
    {"name": "Sandwich", "price": 12.00}
  ]
}
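
If the output is consumed by custom code downstream, it may help to load it into typed records. The sketch below is one possible approach; the `Receipt` and `Item` classes and the `receipt_from_output` helper are hypothetical names introduced for illustration and are not part of the node.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    name: str
    price: float


@dataclass
class Receipt:
    merchant: str
    date: str
    total: float
    items: List[Item]


def receipt_from_output(output: dict) -> Receipt:
    """Convert the node's JSON output (as a dict) into typed records."""
    return Receipt(
        merchant=output["merchant"],
        date=output["date"],
        total=float(output["total"]),
        items=[Item(i["name"], float(i["price"])) for i in output["items"]],
    )


# The example output from above, parsed from its JSON text.
node_output = json.loads("""
{
  "merchant": "Coffee Shop",
  "date": "2024-01-15",
  "total": 24.50,
  "items": [
    {"name": "Latte", "price": 4.50},
    {"name": "Sandwich", "price": 12.00}
  ]
}
""")

receipt = receipt_from_output(node_output)
print(receipt.merchant, receipt.total, [item.name for item in receipt.items])
```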
