Node Description
String Repair Node
The String Repair node looks through an Input Product Array and cleans up and repairs the String columns selected by the user. Dirty strings typically arise when data is crawled from websites.
The columns typically selected for repair include the Product Name, Product Description, and Product Brand columns. All other columns pass through to the Output Product Array unchanged.
Note that the String Repair node may be applied to any input table that contains crawled or dirty data.
This Premium Node is not available as part of the free Community Edition. Premium Nodes help clean and connect real-world data to Market Simulations, and provide advanced Market Science analysis. Note that these descriptions are often deliberately vague.
Downloads
String Repair
The String Repair node looks through an Input Product Array and cleans up and repairs the String columns selected by the user. Dirty strings can arise when data is crawled from competitor websites.
Inputs
Input Product Array
The Input Product Array, or any other table, containing the String columns to be cleaned.
Node
Configuration
The node can be configured to perform the following types of string cleaning:
- Unescape HTML
- Remove Leading / Trailing Spaces
- Remove Irregular Patterns
- Replace Unicode Punctuation with ASCII Punctuation
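The four cleaning options above can be illustrated with a minimal Python sketch. This is not the node's actual implementation, which is not documented here. The function names `repair_string` and `repair_columns`, the punctuation mapping, and the reading of "irregular patterns" as control characters plus runs of whitespace are all assumptions made for illustration.

```python
import html
import re

# Assumed mapping of common Unicode punctuation to ASCII equivalents.
UNICODE_TO_ASCII = str.maketrans({
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "-",    # em dash
    "\u2026": "...",  # horizontal ellipsis
    "\u00a0": " ",    # no-break space
})

# Assumption: "irregular patterns" means control characters and whitespace runs.
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")
WHITESPACE_RUNS = re.compile(r"\s+")


def repair_string(text, unescape_html=True, ascii_punctuation=True,
                  remove_irregular=True, strip_spaces=True):
    """Apply the selected cleaning steps to one string (hypothetical helper)."""
    if unescape_html:
        # Turn entities such as &amp; or &rsquo; back into characters.
        text = html.unescape(text)
    if ascii_punctuation:
        text = text.translate(UNICODE_TO_ASCII)
    if remove_irregular:
        text = CONTROL_CHARS.sub("", text)
        text = WHITESPACE_RUNS.sub(" ", text)
    if strip_spaces:
        text = text.strip()
    return text


def repair_columns(rows, columns):
    """Repair only the selected String columns; leave other columns unchanged."""
    return [
        {
            key: repair_string(value)
            if key in columns and isinstance(value, str)
            else value
            for key, value in row.items()
        }
        for row in rows
    ]


if __name__ == "__main__":
    rows = [{"Product Name": "  Acme &amp; Co\u2019s   Widget ", "Price": 9.99}]
    print(repair_columns(rows, {"Product Name"}))
```

HTML unescaping runs first in this sketch so that entities such as `&rsquo;` are decoded before the punctuation replacement sees them. Trimming runs last so that any spaces exposed by the earlier steps are also removed.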