Combine Search RRF
Uncontrolled node
Overview
The Combine Search RRF node merges multiple ranked search result lists into a single unified result set using Reciprocal Rank Fusion (RRF). This algorithm is particularly effective for hybrid search scenarios where you want to combine results from different retrieval methods (e.g., vector similarity search and keyword search) without requiring score normalization between different scoring systems.
RRF works by assigning each item a score based on its rank in each input list rather than its absolute score. The formula for the RRF score is:
RRF Score = Σ (1 / (rank_in_list + k))
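Here rank_in_list is the item's 1-based position in a given list, and k is a constant (default 60) that controls how much weight is given to lower-ranked items: a larger k flattens the difference between high and low ranks. Items that appear in multiple lists receive higher combined scores than items that appear in only one list at comparable ranks.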
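As a rough illustration of the formula, the sketch below computes an item's score from its 1-based ranks. The function name rrf_score is hypothetical and is not part of the node's API.

```python
def rrf_score(ranks, k=60):
    """Sum 1 / (rank + k) over the item's 1-based rank in each list."""
    return sum(1.0 / (rank + k) for rank in ranks)


# An item ranked 1st in two lists outscores one ranked 1st in a single list.
both = rrf_score([1, 1])
single = rrf_score([1])

# A larger k narrows the score gap between rank 1 and rank 10.
gap_k60 = rrf_score([1], k=60) - rrf_score([10], k=60)
gap_k600 = rrf_score([1], k=600) - rrf_score([10], k=600)
```

With k = 60 the gap between rank 1 and rank 10 is about 0.0021; with k = 600 it shrinks to about 0.00002, which is why larger values of K make the fused order less sensitive to exact positions.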
Key Features
- Score Agnostic: Works with any ranking system (cosine similarity, BM25, etc.) without requiring normalization
- Dynamic Inputs: Supports 2-10 input result lists, configurable via the resultsCount property
- Rank Preservation: Maintains ranking information from each source list in the output
- Deduplication: Automatically identifies and merges duplicate items across lists using a configurable match field
Inputs
| Input | Type | Description | Default |
|---|---|---|---|
| K | Number | The RRF constant used in the scoring formula. Higher values give more weight to lower-ranked items. | 60 |
| Results 0 | Data | The first ranked list of search results. Each item should have a field matching the Match Field setting (default: intellectible_id). | - |
| Results 1 | Data | The second ranked list of search results. | - |
| Results N | Data | Additional result lists (up to Results 9) controlled by the Results Count setting. | - |
Dynamic Input Configuration
The number of available result inputs is controlled by the resultsCount property (set in the properties panel). This can be set between 2 and 10. When you increase the count, additional inputs (Results 2, Results 3, etc.) will appear on the node.
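The naming pattern of the dynamic inputs can be sketched as follows. This is only an illustration of the 2 to 10 limit and the zero-based "Results N" labels described here; the helper result_input_names is hypothetical and does not correspond to anything in the node itself.

```python
MIN_RESULTS, MAX_RESULTS = 2, 10  # limits stated in this document


def result_input_names(results_count):
    """Return the input labels exposed for a given resultsCount value."""
    if not MIN_RESULTS <= results_count <= MAX_RESULTS:
        raise ValueError(
            f"resultsCount must be between {MIN_RESULTS} and {MAX_RESULTS}"
        )
    # Inputs are numbered from zero, so a count of 3 yields Results 0..2.
    return [f"Results {i}" for i in range(results_count)]


names = result_input_names(3)
```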
Outputs
| Output | Type | Description |
|---|---|---|
| Result | Data | A single array containing the fused search results, sorted by RRF score in descending order (highest score first). |
Output Structure
Each item in the output array contains:
- Original fields: All fields from the input item (e.g., text, metadata, intellectible_id), except score, which is replaced by the per-list score fields below
- rrfScore: The calculated Reciprocal Rank Fusion score (number)
- score0, score1, ... scoreN: The original score from each input list (null if the item wasn't present in that list)
- Rank information: Not emitted as separate fields; each list's rank is reflected only through its contribution to rrfScore
Items are sorted by rrfScore in descending order.
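The shape of one output item can be sketched as below. This is a minimal illustration under the field names listed above; build_output_item is a hypothetical helper, and how the node picks the original fields when an item appears in several lists is not specified here.

```python
def build_output_item(item, scores_by_list, rrf):
    """Shape one fused item: drop `score`, add rrfScore and score0..scoreN."""
    # Copy every original field except the source-specific score.
    out = {key: value for key, value in item.items() if key != "score"}
    out["rrfScore"] = rrf
    # One score field per input list; None stands in for JSON null.
    for i, score in enumerate(scores_by_list):
        out[f"score{i}"] = score
    return out


item = {"intellectible_id": "doc1", "score": 0.95, "text": "..."}
fused_item = build_output_item(item, [0.95, None], 1 / 61)
```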
Runtime Behavior and Defaults
- Node Type: Uncontrolled (runs automatically when inputs change; does not require an event trigger)
- Default K: 60 (standard RRF constant)
- Default Match Field: intellectible_id (used to identify the same item across different result lists)
- Max Inputs: 10 result lists
- Min Inputs: 2 result lists
- Ranking: Items not found in a particular list are assigned a rank of list_length + 1 for that list
- Deduplication: Items are matched across lists using the field specified in Match Field
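The ranking rule in the list above can be sketched as a rank lookup. This assumes 1-based ranks and the list_length + 1 penalty exactly as stated; the helper ranks_for is hypothetical, and the node's real matching logic may differ in details such as how duplicates within one list are handled.

```python
def ranks_for(item_id, result_lists, match_field="intellectible_id"):
    """Return the item's rank in each list, using the penalty rank if absent."""
    ranks = []
    for results in result_lists:
        ids = [row.get(match_field) for row in results]
        if item_id in ids:
            ranks.append(ids.index(item_id) + 1)  # 1-based position
        else:
            ranks.append(len(results) + 1)  # penalty rank: list_length + 1
    return ranks


vector_hits = [{"intellectible_id": "doc1"}, {"intellectible_id": "doc2"}]
text_hits = [{"intellectible_id": "doc2"}, {"intellectible_id": "doc3"}]

doc2_ranks = ranks_for("doc2", [vector_hits, text_hits])  # found in both lists
doc1_ranks = ranks_for("doc1", [vector_hits, text_hits])  # absent from text_hits
```

Under this rule an empty list assigns every item the same rank (0 + 1), so it adds the same amount to every score and leaves the order unchanged.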
Important Notes
- Input Format: Each result list should be an array of objects. Objects must contain the field specified by Match Field (default: intellectible_id) to be matched across lists.
- Score Handling: The original score field is removed from output items and replaced with score0, score1, etc. to avoid confusion with the RRF score.
- Empty Lists: If an input list is empty or missing, it is treated as having no items (all other items receive the penalty rank for that list).
Example Usage
Hybrid Search (Vector + Text Search)
A common use case is combining vector similarity search results with traditional text search results:
- Connect a Vector Search Database node output to Results 0
- Connect a Text Search Database node output to Results 1
- Set Match Field to intellectible_id (assuming both searches return rows with this ID)
- The Result output will contain a unified list in which items appearing in both searches tend to rank higher than items appearing in only one
Multi-Strategy Retrieval
You can combine more than two search strategies by increasing the Results Count property:
- Set Results Count to 3 in the properties panel
- Connect results from:
  - Dense vector search → Results 0
  - Sparse vector search → Results 1
  - Keyword search → Results 2
- The node will automatically calculate RRF scores across all three lists
Example Output
Given two input lists:
Results 0 (from vector search):
[
{"intellectible_id": "doc1", "score": 0.95, "text": "..."},
{"intellectible_id": "doc2", "score": 0.87, "text": "..."}
]
Results 1 (from text search):
[
{"intellectible_id": "doc2", "score": 0.92, "text": "..."},
{"intellectible_id": "doc3", "score": 0.85, "text": "..."}
]
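The Result output might look like:
[
{
"intellectible_id": "doc2",
"text": "...",
"rrfScore": 0.0325,
"score0": 0.87,
"score1": 0.92
},
{
"intellectible_id": "doc1",
"text": "...",
"rrfScore": 0.0164,
"score0": 0.95,
"score1": null
},
{
"intellectible_id": "doc3",
"text": "...",
"rrfScore": 0.0161,
"score0": null,
"score1": 0.85
}
]
Note: doc2 ranks highest because it appears in both lists (rank 2 in list 0, rank 1 in list 1: 1/(2+60) + 1/(1+60) ≈ 0.0325). The scores shown for doc1 and doc3 count only the list each appears in (1/61 ≈ 0.0164 and 1/62 ≈ 0.0161); if the penalty rank described under Runtime Behavior and Defaults is also added, each gains 1/(3+60) ≈ 0.0159 and the order is unchanged.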
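For readers who want to reproduce this kind of output outside the node, the whole fusion can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the node's actual implementation: the function name combine_search_rrf and the penalize_missing flag are hypothetical, the first occurrence of an item supplies its original fields, and every row is assumed to carry the match field.

```python
def combine_search_rrf(result_lists, k=60, match_field="intellectible_id",
                       penalize_missing=True):
    """Fuse ranked lists with Reciprocal Rank Fusion (illustrative sketch)."""
    n = len(result_lists)
    merged = {}  # match value -> original fields, per-list scores and ranks
    for i, results in enumerate(result_lists):
        for rank, row in enumerate(results or [], start=1):
            key = row[match_field]
            if key not in merged:
                fields = {f: v for f, v in row.items() if f != "score"}
                merged[key] = {"fields": fields,
                               "scores": [None] * n,
                               "ranks": [None] * n}
            entry = merged[key]
            if entry["ranks"][i] is None:  # keep the best rank within a list
                entry["scores"][i] = row.get("score")
                entry["ranks"][i] = rank

    fused = []
    for entry in merged.values():
        total = 0.0
        for i, rank in enumerate(entry["ranks"]):
            if rank is None:
                if not penalize_missing:
                    continue  # a missing list contributes nothing
                rank = len(result_lists[i] or []) + 1  # list_length + 1
            total += 1.0 / (rank + k)
        item = dict(entry["fields"], rrfScore=total)
        for i, score in enumerate(entry["scores"]):
            item[f"score{i}"] = score
        fused.append(item)

    # Highest RRF score first.
    fused.sort(key=lambda it: it["rrfScore"], reverse=True)
    return fused


vector_hits = [
    {"intellectible_id": "doc1", "score": 0.95, "text": "..."},
    {"intellectible_id": "doc2", "score": 0.87, "text": "..."},
]
text_hits = [
    {"intellectible_id": "doc2", "score": 0.92, "text": "..."},
    {"intellectible_id": "doc3", "score": 0.85, "text": "..."},
]

fused = combine_search_rrf([vector_hits, text_hits])
order = [item["intellectible_id"] for item in fused]  # doc2, doc1, doc3
```

The penalize_missing flag exists only to show both readings of a missing item: with the penalty rank applied, or with a missing list contributing nothing. For this data the resulting order is the same either way.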