Skip to main content

AI Vision Write

Controlled node

Overview

AI Vision Write uses a vision-capable language model to analyze images and generate text based on a provided prompt. This node accepts an image input (as a Jimp image object) and processes it alongside your text prompt to produce descriptive outputs, analysis, or any text generation task that requires visual understanding.

By default, the output is locked to a seed (1) for reproducibility. If you want different outputs each time, enable the Randomize Output option in the panel. If none of the inputs change and randomization is disabled, the generation output is cached for faster development.

Model options

Intellectible provides three vision model tiers optimized for different performance and quality needs:

ModelDescription
ultra lightFast, lightweight vision model suitable for quick analysis and simple image descriptions.
standardBalanced performance model offering good quality vision analysis at moderate cost.
advancedHigh-quality vision model (default) providing detailed image understanding and analysis.
Image Format

The node automatically converts input images to JPEG format for processing. Ensure your image input is a valid Jimp image object (e.g., from the Read Image or Screenshot Webpage nodes).

Inputs

InputTypeDescriptionDefault
ModelEnumThe vision model to use for generation.advanced
PromptTextThe text prompt describing what you want the AI to analyze or generate based on the image.-
ImageDataThe input image to analyze (must be a Jimp image object).-
Max TokensNumberSpecifies the maximum number of tokens to generate.2000
TemperatureNumberControls the creativity/randomness of the output (0.0 to 1.0+).0.7
SeedNumberSets the random seed for reproducible results. Ignored if Randomize Output is enabled.1
RunEventTriggers the node to start processing.-

Outputs

OutputTypeDescription
OutputTextContains the generated text based on the image and prompt analysis.
DoneEventFires when the node has finished processing and the output is ready.

Runtime Behavior and Defaults

When triggered, AI Vision Write:

  1. Validates that the image input is a valid Jimp image object; if not, outputs an empty string
  2. Clones and converts the image to a base64-encoded JPEG string
  3. Constructs a vision message payload with your prompt and the encoded image
  4. Sends the request to the Together AI API using the selected vision model
  5. Returns the generated text content via the Output socket and fires the Done event

Default Values:

  • Model: advanced
  • Max Tokens: 2000
  • Temperature: 0.7
  • Seed: 1

Token Billing: Vision model calls are billed at 2x the standard token rate to account for image processing costs.

Example Usage

Basic Image Analysis:

  1. Connect a Read Image node (or Screenshot Webpage) to the Image input
  2. Connect a Text node with your analysis question (e.g., "Describe what you see in this image") to the Prompt input
  3. Trigger the Run event to start the analysis
  4. The Output will contain the AI's description of the image

Workflow Integration:

[Read Image] ──image──> [AI Vision Write] ──output──> [Show]
│ │
└──────prompt──────────┘

In this example, the Read Image node loads an image from the library, passes it to AI Vision Write along with a descriptive prompt, and the resulting analysis is displayed using the Show node.