AI Vision Write Data

Controlled node

Overview

AI Vision Write Data uses a vision-language model to analyze images and extract structured information. Unlike the AI Vision Write node, which generates free-form text, this node requires a schema definition and returns its output as JSON that follows that schema. This makes it well suited to tasks such as extracting specific fields from documents, analyzing visual data, or converting image content into structured records.

Provide an image, a prompt describing what information to extract, and a schema defining the desired output structure. The node will analyze the image and return data matching your schema specification.

Model options

Intellectible provides several vision-language models for AI Vision Write Data.

| Model | Description | Context Size (tokens) |
| --- | --- | --- |
| ultra light | Fast, lightweight vision model suitable for simple extraction tasks | 128000 |
| standard | Balanced performance and quality for most vision tasks | 128000 |
| advanced | High-quality vision model (Kimi K2.5) for complex analysis and detailed extraction | 128000 |

Schema required

This node requires a valid schema input to define the structure of the output data. The schema can be a JSON Schema object or a regular object that will be automatically converted to JSON Schema. If the schema is invalid or missing, the node will return an error.
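How the node converts a regular object is not documented here, so the following is only a sketch under stated assumptions: it treats string values such as "string" or "number" as type names (the convention used in the Example section), treats a one-element list as "array of that shape", and marks every field as required. The helper names `is_json_schema` and `to_json_schema` below are hypothetical and are not the node's actual implementation.

```python
# Hypothetical shorthand-to-JSON-Schema conversion; NOT the node's actual converter.
JSON_TYPES = {"string", "number", "integer", "boolean", "null"}


def is_json_schema(obj):
    """Heuristic (an assumption): an object schema declares type 'object' and 'properties'."""
    return isinstance(obj, dict) and obj.get("type") == "object" and "properties" in obj


def to_json_schema(shape):
    """Convert a shorthand shape into a JSON Schema fragment."""
    if isinstance(shape, str) and shape in JSON_TYPES:
        return {"type": shape}
    if isinstance(shape, list):
        # A one-element list describes the shape of every array item.
        items = to_json_schema(shape[0]) if shape else {}
        return {"type": "array", "items": items}
    if isinstance(shape, dict):
        if is_json_schema(shape):
            return shape  # already a JSON Schema; pass it through unchanged
        return {
            "type": "object",
            "properties": {key: to_json_schema(value) for key, value in shape.items()},
            "required": list(shape),  # assumption: every listed field is required
        }
    raise ValueError(f"Unsupported schema shape: {shape!r}")


receipt_shape = {
    "merchant": "string",
    "total": "number",
    "items": [{"name": "string", "price": "number"}],
}
receipt_schema = to_json_schema(receipt_shape)
```

Passing `receipt_schema` back through `to_json_schema` returns it unchanged, which mirrors the idea that an input that is already a JSON Schema needs no conversion.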

Inputs

| Input | Type | Description | Default |
| --- | --- | --- | --- |
| Run | Event | Fires when the node starts running | - |
| Model | Enum | The vision model to use for analysis | advanced |
| Prompt | Text | Instructions for what information to extract from the image | - |
| Image | Data | The image to analyze (accepts Jimp image objects from Read Image or image processing nodes) | - |
| Schema | Data | JSON Schema or object defining the desired output structure | - |
| Max Tokens | Number | Maximum number of tokens to generate in the response | 2000 |
| Temperature | Number | Controls randomness/creativity in the output (0-1) | 0.7 |
| Seed | Number | Random seed for reproducible results | 1 |

Outputs

| Output | Type | Description |
| --- | --- | --- |
| Done | Event | Fires when the node has finished processing |
| Output | Data | Structured JSON data matching the provided schema |

Runtime behavior

The node converts the input image to a base64-encoded JPEG and sends it to the vision-language model along with your prompt. The model is instructed to respond only in JSON format.
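As a rough illustration of the encoding step, the sketch below base64-encodes bytes that are assumed to already be JPEG data. Wrapping the result in a `data:image/jpeg;base64,` URL is a common convention for vision APIs and is an assumption here, not something this page confirms; the function name `to_data_url` is hypothetical.

```python
import base64


def to_data_url(jpeg_bytes: bytes) -> str:
    """Base64-encode JPEG bytes and wrap them in a data URL (assumed transport format)."""
    encoded = base64.b64encode(jpeg_bytes).decode("ascii")
    return f"data:image/jpeg;base64,{encoded}"


# Every JPEG file starts with the bytes FF D8 FF, which encode to "/9j/" in base64.
jpeg_magic = b"\xff\xd8\xff"
url = to_data_url(jpeg_magic)
```

A payload that begins with `/9j/` is therefore a quick sanity check that the encoded data really is a JPEG.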

Once it receives the Schema input, the node will:

  • Check whether the input is already a JSON Schema; if it is a regular object, convert it using toJsonSchema
  • Pass the schema to the model to ensure structured output
  • Parse the model's response and validate it against the schema
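The parse-and-validate step can be pictured roughly as follows. This is a minimal sketch, not the node's implementation: it covers only a small subset of JSON Schema (`type`, `properties`, `required`, `items`), and the names `validate`, `parse_response`, and `SCHEMA` are hypothetical.

```python
import json

# Map JSON Schema type names to Python types (subset only).
_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "object": dict,
    "array": list,
}


def validate(value, schema, path="$"):
    """Return a list of error strings; an empty list means the value conforms."""
    expected = schema.get("type")
    if expected in _TYPES:
        ok = isinstance(value, _TYPES[expected])
        # bool is a subclass of int in Python, so reject it for numeric types.
        if expected in ("number", "integer") and isinstance(value, bool):
            ok = False
        if not ok:
            return [f"{path}: expected {expected}"]
    errors = []
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: missing required field")
        for key, sub_schema in schema.get("properties", {}).items():
            if key in value:
                errors += validate(value[key], sub_schema, f"{path}.{key}")
    elif isinstance(value, list) and "items" in schema:
        for index, item in enumerate(value):
            errors += validate(item, schema["items"], f"{path}[{index}]")
    return errors


def parse_response(text, schema):
    """Parse the model's reply as JSON and raise ValueError if it violates the schema."""
    data = json.loads(text)  # raises json.JSONDecodeError (a ValueError) for non-JSON text
    errors = validate(data, schema)
    if errors:
        raise ValueError("; ".join(errors))
    return data


SCHEMA = {
    "type": "object",
    "properties": {"merchant": {"type": "string"}, "total": {"type": "number"}},
    "required": ["merchant", "total"],
}
result = parse_response('{"merchant": "Coffee Shop", "total": 24.5}', SCHEMA)
```

Collecting errors with their paths, rather than stopping at the first failure, makes it easier to see every field the model got wrong in a single run.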

If Randomize Output is enabled (in the properties panel), the seed is ignored and outputs will vary between runs. Otherwise, the same inputs and seed will produce identical outputs.
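The seed behavior can be illustrated with an ordinary pseudo-random generator standing in for the model's sampler. This is an analogy only: the functions `effective_seed` and `sample` are hypothetical, and nothing here reflects how the vision model itself is seeded.

```python
import random


def effective_seed(seed, randomize_output):
    """Ignore the seed when Randomize Output is enabled (None means 'unseeded')."""
    return None if randomize_output else seed


def sample(seed, randomize_output, count=3):
    """Draw a few values from a PRNG that stands in for the model's sampling."""
    rng = random.Random(effective_seed(seed, randomize_output))
    return [rng.random() for _ in range(count)]


first = sample(1, randomize_output=False)
second = sample(1, randomize_output=False)  # same seed, so the same values as `first`
varied = sample(1, randomize_output=True)   # seed ignored, so values differ run to run
```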

Billing is automatic and based on the prompt tokens (including those consumed by image processing) and the completion tokens used during generation.

Example

Extracting structured data from a receipt:

  1. Use a Read Image node to load a receipt image
  2. Connect the image output to the Image input of AI Vision Write Data
  3. Set the Prompt to: "Extract the merchant name, date, total amount, and list of items from this receipt"
  4. Create a Schema node (or Dictionary node) defining the structure:
    {
      "merchant": "string",
      "date": "string",
      "total": "number",
      "items": [{"name": "string", "price": "number"}]
    }
  5. Connect the schema to the Schema input
  6. When run, the node outputs a JSON object like:
    {
      "merchant": "Coffee Shop",
      "date": "2024-01-15",
      "total": 24.50,
      "items": [
        {"name": "Latte", "price": 4.50},
        {"name": "Sandwich", "price": 12.00}
      ]
    }