Node Description

String Repair Node

The String Repair node looks through an Input Product Array and cleans up and repairs the String columns selected by the user. Dirty strings typically result from problems encountered when crawling data from websites.

The columns typically selected for repair include the Product Name, Product Description, and Product Brand columns. All other columns pass through to the Output Product Array unchanged.

Note that the String Repair node may be applied to any input table that contains crawled or dirty data.

This Premium Node is not available as part of the free Community Edition. Premium Nodes help clean and connect real-world data to Market Simulations, and provide advanced Market Science analysis. Note that these descriptions are often deliberately vague.

Downloads

String Repair

The String Repair node looks through an Input Product Array and cleans up and repairs the String columns selected by the user. Dirty string problems can arise when crawling data from competitor websites.

Inputs

Input Product Array

The Input Product Array, or another table, containing the columns of input Strings that will be cleaned.

Node

Configuration

The node can be configured to perform the following types of string cleaning:

  • Unescape HTML
  • Remove Leading / Trailing Spaces
  • Remove Irregular Patterns
  • Replace Unicode Punctuation with ASCII Punctuation
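The four cleaning options listed above can be approximated outside the node with a short Python sketch. This is not the node's actual implementation: the function name `repair_string`, its keyword flags, and the punctuation table are hypothetical. In particular, the documentation does not define "irregular patterns", so the sketch assumes they mean control characters and runs of whitespace.

```python
import html
import re

# Hypothetical mapping of common Unicode punctuation to ASCII equivalents.
PUNCT_MAP = str.maketrans({
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "-",    # em dash
    "\u2026": "...",  # horizontal ellipsis
    "\u00a0": " ",    # non-breaking space
})

# Assumption: "irregular patterns" are control characters and whitespace runs.
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")
WHITESPACE_RUN = re.compile(r"\s+")


def repair_string(text, unescape_html=True, strip_spaces=True,
                  remove_irregular=True, ascii_punctuation=True):
    """Apply the selected cleaning steps to a single string."""
    if unescape_html:
        # Turn entities such as &amp; or &eacute; back into characters.
        text = html.unescape(text)
    if ascii_punctuation:
        # Replace curly quotes, dashes, and similar with ASCII punctuation.
        text = text.translate(PUNCT_MAP)
    if remove_irregular:
        # Drop control characters, then collapse whitespace runs to one space.
        text = CONTROL_CHARS.sub("", text)
        text = WHITESPACE_RUN.sub(" ", text)
    if strip_spaces:
        # Remove leading and trailing whitespace.
        text = text.strip()
    return text
```

Each flag mirrors one configuration option, so a step can be switched off independently, for example `repair_string(name, ascii_punctuation=False)` to keep the original punctuation.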

Outputs

Output Product Array

The Output Product Array corresponds to the Input Product Array, with the selected String columns updated to contain the cleaned strings.
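The relationship between input and output can be illustrated with a minimal Python sketch, assuming the table is represented as a list of dictionaries. The helper names `clean` and `repair_columns`, the sample row, and the `Price` column are hypothetical; the stand-in cleaner only unescapes HTML and trims spaces.

```python
import html


def clean(value):
    # Stand-in cleaner (assumption): unescape HTML entities and trim spaces.
    return html.unescape(value).strip()


def repair_columns(rows, columns):
    """Return a copy of rows with only the selected string columns cleaned."""
    repaired = []
    for row in rows:
        new_row = dict(row)  # copy so the input table is left untouched
        for col in columns:
            value = new_row.get(col)
            if isinstance(value, str):
                new_row[col] = clean(value)
        repaired.append(new_row)
    return repaired


# Hypothetical input row; only the two selected columns are repaired.
products = [
    {"Product Name": "  Widget &amp; Co ", "Product Brand": " Acme", "Price": 9.99},
]
cleaned = repair_columns(products, ["Product Name", "Product Brand"])
```

Columns that are not selected, such as the numeric `Price` here, are copied through without modification.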