Code Nodes: Run Custom JavaScript and Python
Code nodes unlock the full power of programming within your Intellectible workflows. Use them to perform complex data transformations, integrate with any API, run machine learning models, or implement custom business logic that goes beyond standard nodes.
What are Code Nodes?
Code nodes allow you to execute custom JavaScript (Node.js 22) or Python (3.11) code as part of your workflow. Your code runs in isolated, secure containers with access to:
- Popular packages - Install any npm or pip package
- Workflow inputs - Receive data from previous nodes
- External APIs - Make HTTP requests to any service
- Machine learning libraries - Use PyTorch, TensorFlow, scikit-learn, etc.
- Data processing tools - Pandas, NumPy, Lodash, and more
Each execution runs in a fresh, isolated environment and is automatically cleaned up afterward — giving you full control without infrastructure management.
When to Use Code Nodes
Code nodes are ideal when standard nodes aren't enough:
| Use Case | Why Use Code Nodes |
|---|---|
| Custom API Integrations | Call APIs with complex authentication, custom headers, or special request formatting |
| Advanced Data Transformation | Complex calculations, data reshaping, or custom algorithms |
| Machine Learning | Run inference with pre-trained models or custom ML pipelines |
| Business Logic | Implement company-specific rules, validation, or decision-making |
| Data Analysis | Statistical analysis, data aggregation, or custom reporting |
| External Libraries | Use specialized libraries not available in standard nodes |
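As a concrete sketch of the Business Logic row above, a code node might enforce order-validation rules before data moves downstream. The rules here (minimum total, allowed regions) are hypothetical placeholders for your own policy:

```python
import json

# Hypothetical company rules -- replace with your own policy
MIN_ORDER_TOTAL = 10.0
ALLOWED_REGIONS = {"US", "CA", "EU"}

def validate_order(order):
    """Return a list of human-readable validation errors (empty = valid)."""
    errors = []
    if order.get("total", 0) < MIN_ORDER_TOTAL:
        errors.append(f"order total below minimum of {MIN_ORDER_TOTAL}")
    if order.get("region") not in ALLOWED_REGIONS:
        errors.append(f"region {order.get('region')!r} not supported")
    return errors

# In a code node, the order would come from inputs.json and the result
# would be written to output.json (see Key Concepts below)
order = {"total": 25.0, "region": "US"}
errors = validate_order(order)
result = {"valid": not errors, "errors": errors}
print(json.dumps(result))
```

Keeping the rules in one pure function makes them easy to unit-test outside the workflow as well.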
How Code Nodes Work
The Execution Flow
1. Write Code → 2. Install Packages → 3. Receive Inputs → 4. Execute → 5. Return Output
- Write your code in the workflow editor
- Specify packages (optional) - npm or pip packages to install
- Add prebuild script (optional) - setup commands like downloading models
- Receive inputs from previous nodes via `inputs.json`
- Code executes in an isolated container (10-minute timeout)
- Output returned via `output.json` or stdout
Fast Cold Starts
Intellectible uses a content-addressed caching system to achieve sub-second cold starts:
- First run: Packages are installed, prebuild runs (10-60 seconds)
- Subsequent runs: Cached environment loads instantly (~500ms)
- Automatic deduplication: Identical code + packages = same cached environment
This means you can use heavy packages like PyTorch or TensorFlow without worrying about slow execution times after the first run.
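The deduplication rule above ("identical code + packages = same cached environment") behaves like a content-addressed lookup: hash the code together with the sorted package list, and reuse the environment whenever the key matches. The sketch below illustrates the idea only; it is not Intellectible's actual implementation:

```python
import hashlib
import json

def cache_key(code: str, packages: list[str]) -> str:
    """Derive a deterministic environment key from code + packages (illustrative)."""
    # Sorting makes the key independent of package order
    payload = json.dumps({"code": code, "packages": sorted(packages)})
    return hashlib.sha256(payload.encode()).hexdigest()

# Same code and packages (in any order) map to the same cached environment
a = cache_key("print('hi')", ["pandas==2.0.3", "numpy==1.24.3"])
b = cache_key("print('hi')", ["numpy==1.24.3", "pandas==2.0.3"])
assert a == b
```

This is also why pinned package versions matter: changing a version string changes the key, which triggers a fresh build.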
Key Concepts
Inputs and Outputs
Reading Inputs (Python):
```python
import json

# Inputs automatically available in inputs.json
with open('inputs.json', 'r') as f:
    inputs = json.load(f)

name = inputs.get('name', 'World')
count = inputs.get('count', 1)
```
Reading Inputs (JavaScript):
```javascript
const fs = require('fs');

// Inputs automatically available in inputs.json
const inputs = JSON.parse(fs.readFileSync('inputs.json', 'utf8'));

const name = inputs.name || 'World';
const count = inputs.count || 1;
```
Writing Outputs (Python):
```python
import json

# Write result to output.json
result = {
    'message': f'Hello {name}!',
    'processed': True,
    'count': count * 2
}

with open('output.json', 'w') as f:
    json.dump(result, f)
```
Writing Outputs (JavaScript):
```javascript
const fs = require('fs');

// Write result to output.json
const result = {
  message: `Hello ${name}!`,
  processed: true,
  count: count * 2
};

fs.writeFileSync('output.json', JSON.stringify(result, null, 2));
```
Structuring Your Code
Code nodes run as standalone scripts. In JavaScript (CommonJS, i.e. `require`-style modules), `await` cannot be used at the top level, so wrap your logic in an async function:
```javascript
const fs = require('fs');

async function main() {
  const inputs = JSON.parse(fs.readFileSync('inputs.json', 'utf8'));
  // ... your async logic here
  fs.writeFileSync('output.json', JSON.stringify(result, null, 2));
}

main().catch(err => {
  console.error(err);
  process.exit(1); // exit non-zero so the workflow sees the failure
});
```
In Python, use a main() function with a __main__ guard:
```python
import json

def main():
    with open('inputs.json', 'r') as f:
        inputs = json.load(f)
    # ... your logic here
    with open('output.json', 'w') as f:
        json.dump(result, f)

if __name__ == '__main__':
    main()
```
For more details on structuring code with the SDK, see the LibraryClient SDK and DatabaseClient SDK reference docs.
Installing Packages
Specify packages as an array or comma-separated list in the code node's packages configuration:
Python packages:
```json
["requests==2.31.0", "pandas==2.0.3", "numpy==1.24.3"]
```
JavaScript packages:
```json
["axios@1.6.0", "lodash@4.17.21", "moment@2.29.4"]
```
Version pinning recommended: Always specify exact versions for reproducibility.
Prebuild Scripts
Use prebuild scripts for one-time setup tasks:
Download a machine learning model:
```bash
#!/bin/bash
pip install sentence-transformers==3.3.1

python3 << DOWNLOAD_MODEL
from sentence_transformers import SentenceTransformer

# Download and cache the model
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
model.save("./model")
print("Model downloaded successfully")
DOWNLOAD_MODEL
```
Install system dependencies:
```bash
#!/bin/bash
apt-get update && apt-get install -y ffmpeg
pip install moviepy
```
Prebuild scripts run once during the initial build. Subsequent executions skip this step and use the cached environment.
Execution Timeout
Code nodes have a 10-minute execution timeout. If your code needs longer:
- ❌ Avoid: Long-running computations or API calls
- ✅ Instead: Optimize algorithms, use streaming, or break into multiple steps
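One hedged way to "break into multiple steps" is to track a time budget inside the node and checkpoint partial progress, so a follow-up run can resume from the remaining items. This is a sketch of the pattern, not a platform feature; the 9-minute budget simply leaves a buffer before the timeout:

```python
import time

TIME_BUDGET_SECONDS = 9 * 60  # stop well before the 10-minute timeout

def process_with_budget(items, process_one, budget=TIME_BUDGET_SECONDS):
    """Process items until the budget is spent; return results plus leftovers."""
    start = time.monotonic()
    done, remaining = [], list(items)
    while remaining and time.monotonic() - start < budget:
        done.append(process_one(remaining.pop(0)))
    return {"processed": done, "remaining": remaining}

# In a code node you would write this state to output.json so a later
# step can re-run the node with just the 'remaining' items
state = process_with_budget([1, 2, 3], lambda x: x * 2)
```

A zero budget leaves every item in `remaining`, which is what makes the resume step safe to call repeatedly.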
Security & Isolation
Every execution runs in a fresh, isolated container with:
- No persistent state - Containers are destroyed after execution
- Network restrictions - Limited egress (S3, DNS only by default)
- Non-root user - Runs as UID 10000 with restricted permissions
- Read-only filesystem - Only `/tmp` is writable
Your code is isolated from other workflows and users.
Common Patterns
Pattern 1: API Integration with Error Handling
```python
import json
import requests

with open('inputs.json', 'r') as f:
    inputs = json.load(f)

api_key = inputs.get('api_key')
query = inputs.get('query', '')

try:
    response = requests.get(
        'https://api.example.com/search',
        headers={'Authorization': f'Bearer {api_key}'},
        params={'q': query},
        timeout=25  # Leave time for processing
    )
    response.raise_for_status()
    result = {
        'success': True,
        'data': response.json(),
        'status_code': response.status_code
    }
except requests.exceptions.RequestException as e:
    result = {
        'success': False,
        'error': str(e),
        'error_type': type(e).__name__
    }

# Write result to output.json
with open('output.json', 'w') as f:
    json.dump(result, f, indent=2)
```
Pattern 2: Data Transformation with Pandas
```python
import json
import pandas as pd

with open('inputs.json', 'r') as f:
    inputs = json.load(f)

# Convert input data to DataFrame
df = pd.DataFrame(inputs['records'])

# Transform data
df['total'] = df['price'] * df['quantity']
df['category'] = df['category'].str.upper()

# Filter and aggregate
summary = df.groupby('category').agg({
    'total': 'sum',
    'quantity': 'sum'
}).reset_index()

result = {
    'summary': summary.to_dict(orient='records'),
    'total_revenue': float(df['total'].sum()),  # native float for JSON
    'record_count': len(df)
}

# Write result to output.json
with open('output.json', 'w') as f:
    json.dump(result, f, indent=2)
```
Pattern 3: Machine Learning Inference
```python
import json

import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

with open('inputs.json', 'r') as f:
    inputs = json.load(f)

# Load cached model from prebuild
model = SentenceTransformer('./model')

query = inputs.get('query', '')
documents = inputs.get('documents', [])

# Generate embeddings
query_embedding = model.encode(query)
doc_embeddings = model.encode(documents)

# Compute similarity scores
scores = cosine_similarity([query_embedding], doc_embeddings)[0]

# Rank the top 5 results
top_indices = np.argsort(scores)[::-1][:5]
results = []
for idx in top_indices:
    results.append({
        'document': documents[idx],
        'score': float(scores[idx]),
        'rank': len(results) + 1
    })

result = {
    'success': True,
    'query': query,
    'results': results
}

# Write result to output.json
with open('output.json', 'w') as f:
    json.dump(result, f, indent=2)
```
Choosing Between JavaScript and Python
| Consideration | JavaScript (Node.js) | Python |
|---|---|---|
| Data Processing | Good for JSON/text | Better for numerical/scientific |
| ML/AI Libraries | Limited | Extensive (PyTorch, TensorFlow, scikit-learn) |
| API Integrations | Excellent | Excellent |
| Package Ecosystem | npm (massive ecosystem) | pip (strong data science focus) |
| Startup Time | Fast (~500ms) | Fast (~500ms) |
| Syntax Familiarity | JavaScript developers | Python developers |
Recommendation: Use Python for data science, ML, and scientific computing. Use JavaScript for general-purpose logic, API integrations, and when you prefer JavaScript syntax.
Performance Tips
- Pin package versions - Ensures consistent cached environments
- Minimize package count - Fewer packages = faster first build
- Use prebuild for one-time setup - Downloads, compilations, etc.
- Keep execution under 9 minutes - Leaves a buffer before the 10-minute timeout
- Return only necessary data - Large inputs and outputs (>100KB) are automatically compressed and streamed in and out of code nodes
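"Return only necessary data" usually means projecting the result down to the fields downstream nodes actually use before writing `output.json`. A minimal sketch, with hypothetical field names:

```python
import json

def project(records, fields):
    """Keep only the listed fields from each record."""
    return [{k: r[k] for k in fields if k in r} for r in records]

# 'debug_trace' stands in for any bulky field downstream nodes don't need
raw = [
    {"id": 1, "name": "a", "debug_trace": "..." * 1000},
    {"id": 2, "name": "b", "debug_trace": "..." * 1000},
]
slim = project(raw, ["id", "name"])

with open("output.json", "w") as f:
    json.dump({"records": slim}, f)
```

Dropping bulky fields at the source keeps every later node's input small as well.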
Limitations
- 10-minute execution timeout - Workflows terminate if code runs longer
- No persistent storage - Use library or database nodes for persistence
- Limited network access - Restricted to external APIs (no internal services)
- 10MB input limit - For larger inputs, use library file references
Accessing Project Data from Code Nodes
Code nodes can securely access your project's library files and databases, enabling powerful data-driven workflows:
Library Access:
- List and download files from your project library
- Upload generated reports, CSVs, or processed data
- Delete or manage files programmatically
Database Access:
- Query project databases for lookups and analysis
- Insert execution logs, metrics, or workflow results
- Create tables dynamically and manage schemas
How it works:
- Automatic authentication (no API keys in your code)
- Project-scoped access (isolated from other projects)
- Pre-installed SDK (no installation required)
Example - Download and process library file:
```javascript
const { LibraryClient } = require('@intellectible/execution-sdk');

async function main() {
  const library = new LibraryClient();

  const files = await library.listFiles();
  const csvFile = files.find(f => f.type === 'text/csv');

  const downloadUrl = await library.createDownloadUrl(csvFile.id);
  const response = await fetch(downloadUrl);
  const data = await response.text();
  // Process data...
}

main().catch(err => console.error(err));
```
Example - Log workflow execution to database:
```python
from datetime import datetime

from intellectible_execution import DatabaseClient

database = DatabaseClient()

# Create execution log table (assuming db_id from list_databases())
databases = database.list_databases()
db_id = databases[0]['id']
database.create_table(db_id, 'executions')
database.create_column(db_id, 'executions', 'timestamp', 'text')
database.create_column(db_id, 'executions', 'status', 'text')

# Log execution
database.insert_rows(db_id, 'executions', [{
    'timestamp': datetime.now().isoformat(),
    'status': 'success'
}])
```
Learn more:
- 📖 Code Node Data Access Guide - Conceptual overview and patterns
- 📚 LibraryClient SDK Reference - Complete API documentation
- 📚 DatabaseClient SDK Reference - Complete API documentation
Getting Started
Ready to build your first code node workflow? Check out the tutorial:
👉 Tutorial: Build a Semantic Search Workflow with Python
See Also
- Code Node Data Access Guide - Access library files and databases from code
- Workflows Guide - Understanding workflow basics
- Nodes Overview - All available node types
- Library Guide - Store and retrieve files
- Database Guide - Structured data storage