Remove Stop Words
Uncontrolled node
Overview
The Remove Stop Words node filters out common stop words from text input. Stop words are frequently occurring words (such as "the", "is", "at", "which") that are often not useful for text analysis, natural language processing, or AI model inputs.
This node tokenizes the input text, removes stop words based on the selected language dictionary, and returns both the filtered text and the list of remaining tokens. It supports multiple languages and allows for custom stop word lists.
Supported Languages
The node includes built-in stop word dictionaries for the following languages:
| Language Code | Language |
|---|---|
| eng | English |
| ara | Arabic |
| cat | Catalan |
| dan | Danish |
| deu | German |
| ell | Greek |
| fin | Finnish |
| fra | French |
| hun | Hungarian |
| ind | Indonesian |
| ita | Italian |
| lit | Lithuanian |
| nep | Nepali |
| nor | Norwegian |
| por | Portuguese |
| ron | Romanian |
| rus | Russian |
| spa | Spanish |
| swe | Swedish |
| tur | Turkish |
You can provide additional custom stop words via the Custom input. These are combined with the language-specific stop words. Enter them as a comma-separated string (e.g., "custom, words, here") or as an array of strings.
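The two accepted forms of the Custom input can be reduced to one set of words before filtering. The following is a minimal Python sketch of that idea, not the node's actual implementation: the function names are hypothetical, and trimming whitespace and lowercasing the custom words are assumptions made so that they match the lowercased tokens.

```python
from typing import Iterable, Optional, Set, Union


def normalize_custom_stop_words(custom: Optional[Union[str, Iterable[str]]]) -> Set[str]:
    """Turn the Custom input (string or array) into a set of lowercase words."""
    if custom is None:
        return set()
    # A string is read as a comma-separated list; anything else as a sequence of words.
    items = custom.split(",") if isinstance(custom, str) else custom
    # Assumption: surrounding whitespace is ignored and empty entries are dropped.
    return {item.strip().lower() for item in items if item.strip()}


def combine_stop_words(language_words: Iterable[str], custom=None) -> Set[str]:
    """Union of the language dictionary and the custom words."""
    return set(language_words) | normalize_custom_stop_words(custom)
```

With this sketch, `"custom, words, here"` and `["custom", "words", "here"]` produce the same set.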
Inputs
| Input | Type | Description | Default |
|---|---|---|---|
| Text | Text | The input text from which to remove stop words. | - |
| Language | Enum | The language code for the stop word dictionary to use. | eng |
| Custom | Text | Additional custom stop words to remove (comma-separated string or array). | - |
Outputs
| Output | Type | Description |
|---|---|---|
| Text | Text | The filtered text with stop words removed, joined by single spaces. |
| List | List | An array of the remaining tokens after stop word removal. |
Runtime Behavior
The node processes text immediately when the Text input changes (uncontrolled behavior).
Processing Steps:
- Tokenization: Converts text to lowercase, replaces punctuation with spaces, and splits on whitespace
- Filtering: Removes any tokens that match the selected language's stop word dictionary
- Custom Filtering: Additionally removes any tokens matching the custom stop words list (if provided)
- Output: Returns the remaining tokens as both a joined string and an array
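The processing steps above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the node's source: `SAMPLE_STOP_WORDS` is a hypothetical, heavily abbreviated stand-in for the built-in English dictionary, only ASCII punctuation is replaced, and the `text` and `list` keys simply mirror the node's two outputs.

```python
import re
import string

# Hypothetical stand-in for a language dictionary; the real lists are much longer.
SAMPLE_STOP_WORDS = {"the", "is", "at", "which"}

# Assumption: "punctuation" means ASCII punctuation characters.
_PUNCTUATION = re.compile("[" + re.escape(string.punctuation) + "]")


def tokenize(text):
    """Lowercase, replace punctuation with spaces, split on whitespace."""
    return _PUNCTUATION.sub(" ", text.lower()).split()


def remove_stop_words(text, stop_words=SAMPLE_STOP_WORDS, custom=()):
    """Filter out dictionary and custom stop words, keeping token order."""
    blocked = set(stop_words) | {word.lower() for word in custom}
    tokens = [token for token in tokenize(text) if token not in blocked]
    # Return both forms: tokens joined by single spaces, and the token list.
    return {"text": " ".join(tokens), "list": tokens}
```

Because punctuation is replaced with spaces rather than deleted, a string such as "cat,dog" yields two tokens instead of one fused token.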
Defaults:
- If no language is specified, defaults to English (eng)
- If the input text is empty or not a string, returns an error object
- Custom stop words are optional and can be provided as either a comma-separated string or an array
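A short Python sketch of how these defaults could be applied before any filtering happens. The function name and the shape of the error object are hypothetical; the documentation only states that an error object is returned.

```python
DEFAULT_LANGUAGE = "eng"


def resolve_inputs(text, language=None, custom=None):
    """Apply the documented defaults and reject unusable text input."""
    if not isinstance(text, str) or not text:
        # Hypothetical error shape; the real node's error object may differ.
        return {"error": "Text input must be a non-empty string"}
    return {
        "text": text,
        # Fall back to English when no language code is given.
        "language": language or DEFAULT_LANGUAGE,
        # Custom stop words are optional, so None is passed through unchanged.
        "custom": custom,
    }
```

Validating the text first means that a missing language or missing custom list never causes an error on its own.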
Example
Input:
- Text: "The quick brown fox jumps over the lazy dog"
- Language: eng
- Custom: "quick, lazy"
Output:
- Text: "brown fox jumps over dog"
- List: ["brown", "fox", "jumps", "over", "dog"]
Use Case: This node is commonly used to preprocess text before:
- Sending it to AI models for summarization or analysis
- Creating word clouds or running frequency analysis
- Running text classification or sentiment analysis workflows
- Indexing text for database search