AI Vision Write
Controlled node
Overview
AI Vision Write uses a vision-capable language model to analyze images and generate text based on a provided prompt. This node accepts an image input (as a Jimp image object) and processes it alongside your text prompt to produce descriptive outputs, analysis, or any text generation task that requires visual understanding.
By default, the output is locked to a seed (1) for reproducibility. If you want different outputs each time, enable the Randomize Output option in the panel. If none of the inputs change and randomization is disabled, the generation output is cached for faster development.
Model options
Intellectible provides three vision model tiers optimized for different performance and quality needs:
| Model | Description |
|---|---|
| ultra light | Fast, lightweight vision model suitable for quick analysis and simple image descriptions. |
| standard | Balanced performance model offering good quality vision analysis at moderate cost. |
| advanced | High-quality vision model (default) providing detailed image understanding and analysis. |
The node automatically converts input images to JPEG format for processing. Ensure your image input is a valid Jimp image object (e.g., from the Read Image or Screenshot Webpage nodes).
Inputs
| Input | Type | Description | Default |
|---|---|---|---|
| Model | Enum | The vision model to use for generation. | advanced |
| Prompt | Text | The text prompt describing what you want the AI to analyze or generate based on the image. | - |
| Image | Data | The input image to analyze (must be a Jimp image object). | - |
| Max Tokens | Number | Specifies the maximum number of tokens to generate. | 2000 |
| Temperature | Number | Controls the creativity/randomness of the output (0.0 to 1.0+). | 0.7 |
| Seed | Number | Sets the random seed for reproducible results. Ignored if Randomize Output is enabled. | 1 |
| Run | Event | Triggers the node to start processing. | - |
Outputs
| Output | Type | Description |
|---|---|---|
| Output | Text | Contains the generated text based on the image and prompt analysis. |
| Done | Event | Fires when the node has finished processing and the output is ready. |
Runtime Behavior and Defaults
When triggered, AI Vision Write:
- Validates that the image input is a valid Jimp image object; if not, outputs an empty string
- Clones and converts the image to a base64-encoded JPEG string
- Constructs a vision message payload with your prompt and the encoded image
- Sends the request to the Together AI API using the selected vision model
- Returns the generated text content via the Output socket and fires the Done event
Default Values:
- Model:
advanced - Max Tokens:
2000 - Temperature:
0.7 - Seed:
1
Token Billing: Vision model calls are billed at 2x the standard token rate to account for image processing costs.
Example Usage
Basic Image Analysis:
- Connect a Read Image node (or Screenshot Webpage) to the Image input
- Connect a Text node with your analysis question (e.g., "Describe what you see in this image") to the Prompt input
- Trigger the Run event to start the analysis
- The Output will contain the AI's description of the image
Workflow Integration:
[Read Image] ──image──> [AI Vision Write] ──output──> [Show]
│ │
└──────prompt──────────┘
In this example, the Read Image node loads an image from the library, passes it to AI Vision Write along with a descriptive prompt, and the resulting analysis is displayed using the Show node.