Pivot Dataframe

Controlled node

Overview

The Pivot Dataframe node transforms tabular data by pivoting (reshaping) it, similar to Excel pivot tables or pandas pivot operations. It takes a dataframe file (Parquet or CSV) and reorganizes it by turning unique values from one column into new columns, while aggregating values from another column.

This is useful for:

Reshaping data from long format to wide format
Creating summary tables with aggregated metrics
Preparing data for visualization or reporting
Transforming categorical data into feature columns

Inputs

Input	Type	Description	Default
Run	Event	Triggers the pivot operation	-
File	Data	The input dataframe file (Parquet or CSV format)	-
Index Columns	Text	Comma-separated list of column names to use as row identifiers (index)	-
Column to Pivot	Text	The column whose unique values will become new columns	-
Value Column	Text	The column containing values to aggregate	-
Aggregation Function	Enum	How to aggregate values when multiple rows map to the same index/pivot combination: `sum`, `mean`, `median`, `min`, `max`, `count`, `first`, `last`	`sum`

Outputs

Output	Type	Description
Done	Event	Fires when the pivot operation completes
Output	Data	The pivoted dataframe as a new Parquet file object

Runtime Behavior and Defaults

When the node runs, it:

Reads the input file (Parquet or CSV) from storage
Parses the indexColumns input - if provided as a string, splits by commas and trims whitespace
Performs the pivot operation using the specified aggregation function
Creates a new Parquet file in the project's temporary files directory
Returns a file object with metadata including the new file ID, name, and size

Default Values:

aggFunction: sum

File Handling:

Input files must be Parquet (.parquet, .pqt) or CSV (.csv) format
Output is always a Parquet file
The output file is automatically named with a timestamp (e.g., pivot_result_1699123456789.parquet)
Files are stored in the __temporaryFiles__ directory of the project library

Example Usage

Scenario: You have a sales dataset with columns date, product, and revenue, and you want to see total revenue per product for each date.

Configuration:

File: Connect your sales dataset file
Index Columns: date
Column to Pivot: product
Value Column: revenue
Aggregation Function: sum

Result: The output will be a pivoted table where:

Rows represent unique dates
Columns represent unique products (e.g., "Product_A", "Product_B")
Cells contain the sum of revenue for that date-product combination

Input data (long format):

date	product	revenue
2024-01-01	A	100
2024-01-01	B	150
2024-01-02	A	200

Output data (wide format):

date	A	B
2024-01-01	100	150
2024-01-02	200	null

Note: If multiple rows exist for the same index/pivot combination, they are aggregated according to the selected function. For example, with count, you would get the number of transactions per date-product pair.

Overview​

Inputs​

Outputs​

Runtime Behavior and Defaults​

Example Usage​

Overview

Inputs

Outputs

Runtime Behavior and Defaults

Example Usage