Node Description

Brand Discover Node

The Brand Discover node is designed to take a long list of Brand names and intelligently group them into a Brand Dictionary. The Brand Dictionary can then be used by a downstream Brand Repair node to clean up and repair the Brand names found in an Input Product Array.

Brand names that are contained within longer Brand names are grouped together in the Brand Dictionary. For example, “BRAND” is grouped with “BRANDNAME”. Similar names are also grouped. For example, “BRANDNME” is grouped with “BRANDNAME” because the string distance between the two is very small. The longest English name and the longest Chinese name found among the grouped raw Brands together define the clean Brand name in the ‘Then Brand’ output column.
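The grouping rules above can be sketched as follows. This is a minimal illustration only, not the node's actual algorithm: the function names `edit_distance` and `group_brands`, the use of Levenshtein distance as the "string distance", and the `max_distance=1` threshold are all assumptions, and the sketch handles English names only.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def group_brands(names, max_distance=1):
    """Map each raw name to the longest name of the group it joins.

    Hypothetical rule: a name joins a group when it is contained in the
    group's longest name or is within max_distance edits of it.
    """
    groups = {}  # clean name -> raw names grouped under it
    # Visit the longest names first so each group is keyed by its longest member.
    for name in sorted(set(names), key=lambda n: (-len(n), n)):
        for clean in groups:
            if name in clean or edit_distance(name, clean) <= max_distance:
                groups[clean].append(name)
                break
        else:
            groups[name] = [name]
    return {raw: clean for clean, raws in groups.items() for raw in raws}


dictionary = group_brands(["BRAND", "BRANDNAME", "BRANDNME", "OTHER"])
```

In this sketch “BRAND” and “BRANDNME” both map to “BRANDNAME”, while “OTHER” stays in its own group. A plain containment test would over-group very short names, so a real implementation would presumably need further safeguards.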

Market Model Brands comprise an upper-case English Part, a Chinese Part, and a Numeric Part. Each part, if it exists, is separated by a slash when output. For example, “BRANDNAME/品牌/1969”. If the Brand has no Chinese Part and no Numeric Part, then just the English Part is used without any slashes. For example, “BRANDNAME”. The same applies when the English Part or the Numeric Part is missing: only the parts that exist are output. All spaces and punctuation are removed, and all lower-case characters are converted to upper-case.
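The output format described above can be illustrated with a short sketch. The helper name `format_brand` is hypothetical, and the character classes are assumptions: ASCII letters for the English Part, the basic CJK Unified Ideographs range for the Chinese Part, and ASCII digits for the Numeric Part. How the node treats names whose parts are interleaved is not documented here.

```python
import re


def format_brand(raw: str) -> str:
    """Hypothetical helper: build 'ENGLISH/中文/123' from a raw Brand name."""
    # Collect each part separately; spaces and punctuation match none of
    # the patterns, so they are dropped.
    english = "".join(re.findall(r"[A-Za-z]+", raw)).upper()
    chinese = "".join(re.findall(r"[\u4e00-\u9fff]+", raw))
    numeric = "".join(re.findall(r"[0-9]+", raw))
    # Join only the parts that exist, so a single part has no slashes.
    return "/".join(part for part in (english, chinese, numeric) if part)


full = format_brand("Brand Name 品牌 1969")
english_only = format_brand("brand-name!")
no_english = format_brand("品牌 1969")
```

Here `full` is “BRANDNAME/品牌/1969”, `english_only` is “BRANDNAME”, and `no_english` is “品牌/1969”.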

This Premium Node is not available as part of the free Community Edition. Premium Nodes help clean and connect real-world data to Market Simulations, and provide advanced Market Science analysis. Note that these descriptions are often deliberately vague.

Downloads

Brand Discovery

The Brand Discover node is designed to take a long list of Brand names and intelligently group them into a Brand Dictionary. The Brand Dictionary can then be used by a downstream Brand Repair node to clean up and repair the Brand names found in an Input Product Array. Both Chinese and English Brand names are currently supported.

Inputs

Input Product Array

Contains a long list of all Brand names (both raw and clean names) used to describe each of the Products.

Node

Configuration

The user can specify two thresholds. The first is the minimum number of times a raw Brand name must appear in the Input Product Array for it to be included in the Output Brand Dictionary. The second is the minimum combined number of times that all raw Brand names associated with a final clean Brand must appear in the Input Product Array for that clean Brand to be included in the Output Brand Dictionary.
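One way to read these two thresholds is sketched below. The function name `filter_dictionary` and the parameter names `min_raw_count` and `min_clean_count` are hypothetical, not the node's actual option names. The sketch also assumes that the combined total for a clean Brand counts every associated raw name, including those that fail the first threshold; the node may behave differently.

```python
from collections import Counter


def filter_dictionary(occurrences, raw_to_clean, min_raw_count, min_clean_count):
    """Keep raw -> clean entries that satisfy both minimum-count thresholds."""
    raw_counts = Counter(occurrences)
    # Combined occurrences per clean Brand, summed over all of its raw names.
    clean_counts = Counter()
    for raw, clean in raw_to_clean.items():
        clean_counts[clean] += raw_counts[raw]
    return {
        raw: clean
        for raw, clean in raw_to_clean.items()
        if raw_counts[raw] >= min_raw_count
        and clean_counts[clean] >= min_clean_count
    }


occurrences = ["BRANDNAME"] * 5 + ["BRAND"] * 2 + ["BRANDNME"] + ["OTHER"]
raw_to_clean = {
    "BRANDNAME": "BRANDNAME",
    "BRAND": "BRANDNAME",
    "BRANDNME": "BRANDNAME",
    "OTHER": "OTHER",
}
result = filter_dictionary(occurrences, raw_to_clean, min_raw_count=2, min_clean_count=5)
```

With these illustrative counts, “BRANDNME” is dropped because it appears only once, and “OTHER” is dropped because its clean Brand appears only once in total, leaving “BRANDNAME” and “BRAND” mapped to “BRANDNAME”.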

Outputs

Brand Dictionary

Contains each discovered raw Brand name and its most likely matching clean Brand name.