Brand Repair Node
The Brand Repair node is designed to look through an Input Product Array for raw Brand names and match them against a cleaned Brand Dictionary. If a match is found then the raw Brand name will be replaced by the clean Brand name found in the Brand Dictionary.
The Brand Repair node may sit directly downstream from a Brand Discover node. The Brand Discover node is designed to take a long list of Brand names and intelligently group them into a clean Brand Dictionary.
Market Model Brands consist of an upper-case English Part, a Chinese Part, and a Numeric Part. Each part, if it exists, is separated by a slash when output. For example, “BRANDNAME/品牌/1969”. If the Brand has no Chinese Part and no Numeric Part then just the English Part is used without any slashes. For example, “BRANDNAME”. The same applies if the Brand has no English Part or no Numeric Part: only the parts that exist are output. All spaces and punctuation are removed and all lower-case characters are converted to upper-case.
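The Brand format described above can be sketched in Python. This is only an illustration, not the node's actual implementation: the function name `format_market_model_brand` is hypothetical, and the sketch assumes that ASCII letters make up the English Part, CJK Unified Ideographs make up the Chinese Part, and ASCII digits make up the Numeric Part.

```python
import re


def format_market_model_brand(raw: str) -> str:
    """Hypothetical helper: build a Brand such as 'BRANDNAME/品牌/1969'."""
    # Assumption: ASCII letters form the English Part, CJK Unified
    # Ideographs form the Chinese Part, ASCII digits form the Numeric Part.
    # Spaces and punctuation match none of these patterns, so they are dropped.
    english = "".join(re.findall(r"[A-Za-z]", raw)).upper()
    chinese = "".join(re.findall(r"[\u4e00-\u9fff]", raw))
    numeric = "".join(re.findall(r"[0-9]", raw))
    # Only the parts that exist are joined, so no stray slashes appear.
    return "/".join(part for part in (english, chinese, numeric) if part)


print(format_market_model_brand("Brand-Name 品牌 1969"))
print(format_market_model_brand("brand name"))
```

Under these assumptions a digit embedded in an English name would be moved into the Numeric Part, so a real implementation may need a more careful rule for splitting the parts.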
The ‘If Brand’ column in the Input Brand Dictionary is sorted before it is matched against raw Brand names. The ‘If Brand’ values are sorted from longest to shortest string length and then in descending Z-A lexicographical order. This Z-A lexicographical sorting also places Chinese before English. More obscure characters (such as ‘z’ and Chinese characters with many strokes) are typically more distinctive and hence carry more information than common characters (such as ‘a’ and ‘一’). Hence this descending Z-A lexicographical order should lead to better Brand matches.
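The sort order described above can be sketched in Python. The function name `sort_if_brands` is hypothetical, and the sketch assumes that "lexicographical" means plain Unicode code point order, which is how Python compares strings. Under that assumption Chinese characters rank above ASCII letters, so they come first in descending order.

```python
def sort_if_brands(if_brands):
    """Hypothetical helper: order 'If Brand' values for matching."""
    # The key is (length, text); reverse=True gives longest first and,
    # within one length, descending (Z-A) code point order.
    return sorted(if_brands, key=lambda s: (len(s), s), reverse=True)


print(sort_if_brands(["AB", "ZZ", "品牌", "ABC", "A"]))
# prints ['ABC', '品牌', 'ZZ', 'AB', 'A']
```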
Both Chinese and English Brand names are currently supported.
Input Product Array
The set of Products that may be included in the Market, including their raw Brand names.
Input Brand Dictionary
The input dictionary containing each raw Brand name to match and the corresponding replacement clean Brand name. The Input Brand Dictionary may have been generated by the Brand Discover node.
The user specifies which column from the Input Product Array contains the raw Brand names that need to be cleaned by the Brand Dictionary.
After searching the Brand Column, the Brand Repair node can also search a Description column and an Attribute column from the Input Product Array.
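The search order described above (the Brand Column first, then the Description and Attribute columns) can be sketched in Python. This is a rough illustration under several assumptions rather than the node's actual logic: the function name `repair_brand`, the column names in the usage example, and the dictionary layout (raw ‘If Brand’ mapped to clean Brand) are all hypothetical. The documentation does not say whether matching is exact or by substring; the sketch assumes a case-insensitive substring match.

```python
def repair_brand(product, brand_dictionary, brand_column, fallback_columns=()):
    """Hypothetical helper: return the clean Brand for one product, or None."""
    # Try the longest 'If Brand' values first, then descending (Z-A) order.
    entries = sorted(
        brand_dictionary.items(),
        key=lambda kv: (len(kv[0]), kv[0]),
        reverse=True,
    )
    # Search the Brand Column first, then each fallback column in turn.
    for column in (brand_column, *fallback_columns):
        text = str(product.get(column) or "").upper()
        for if_brand, clean_brand in entries:
            # Assumption: a case-insensitive substring match counts as a hit.
            if if_brand and if_brand.upper() in text:
                return clean_brand
    return None


product = {"brand": "unknown", "description": "Acme 品牌 running shoes"}
dictionary = {"ACME": "ACME/品牌", "AC": "ACCO"}
print(repair_brand(product, dictionary, "brand", ("description", "attributes")))
# prints ACME/品牌
```

Because longer ‘If Brand’ values are tried first, “ACME” wins over the shorter “AC” in this example even though both occur in the description.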