Write Parquet
Controlled node
Overview
The Write Parquet node converts tabular data (arrays of records/objects) into Apache Parquet format and saves it to the project's library. Parquet files are columnar storage files that provide efficient compression and encoding schemes, making them ideal for storing large datasets and analytical workloads.
This node is commonly used to persist processed dataframes, database query results, or transformed CSV data in a format optimized for downstream analytics and storage efficiency.
Inputs
| Input | Type | Description | Default |
|---|---|---|---|
| Run | Event | Triggers the write operation. The node will not execute until this event fires. | - |
| Data | Data | The tabular data to write. Accepts an array of objects/records (e.g., [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]). | - |
| Name | Text | The filename for the output. If the name does not end with .parquet, the extension is automatically appended. | - |
| Path | Text | Optional folder path within the library where the file should be saved (e.g., "exports/2024"). If the path does not exist, directories are created automatically. If omitted, the file is saved to the root directory. | root |
Outputs
| Output | Type | Description |
|---|---|---|
| Done | Event | Fires when the Parquet file has been successfully written to storage and metadata has been updated. |
| File | Data | Returns a file object containing metadata about the saved Parquet file: id, name, mimeType (application/vnd.apache.parquet), dir (directory ID), and size (bytes). |
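The shape of the File output can be illustrated with a hypothetical helper (field names are taken from the table above; the `id` generation and function name are assumptions):

```python
import uuid

def write_parquet_result(name: str, dir_id: str, size: int) -> dict:
    """Build a file-metadata object in the shape the File output describes."""
    return {
        "id": str(uuid.uuid4()),          # assumed: a generated file ID
        "name": name,                      # filename, ending in .parquet
        "mimeType": "application/vnd.apache.parquet",
        "dir": dir_id,                     # directory ID within the library
        "size": size,                      # file size in bytes
    }

meta = write_parquet_result("monthly_sales_report.parquet", "dir_123", 2048)
```

Downstream nodes can read `meta["id"]` to locate the file in the library.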
Runtime Behavior and Defaults
- Automatic Extension: If the provided filename does not include the `.parquet` extension, it is automatically appended to ensure proper file typing.
- Directory Creation: The node automatically creates any missing directories specified in the Path input using the library's folder structure.
- Storage: Files are written to Google Cloud Storage at `gs://{bucket}/libraryDocuments/{projectId}/{fileId}.parquet` and registered in the project's document metadata.
- Error Handling: If no data is provided, the project ID is unavailable, or the filename is invalid, the node outputs an error object on the File output instead of the file metadata.
- Data Format: The input data should be an array of plain objects. Nested objects and arrays are supported but will be serialized according to Parquet schema inference rules.
Example Usage
Connect the output of a database query or data processing node to the Data input, specify a filename, and trigger the Run event to save the results:
- Connect a Query Database node's `result` output to the Data input of Write Parquet.
- Set the Name input to `"monthly_sales_report"` (the node will save it as `monthly_sales_report.parquet`).
- Set the Path input to `"reports/2024"` to organize the file in a subfolder.
- Connect a Start node or event trigger to the Run input.
- When executed, the node fires the Done event and outputs the file reference, which can then be passed to Get Library Download URLs to generate a shareable link or used as input to other file processing nodes.