# Node Description

# Product Similarity Node

The Product Similarity node generates an ‘Output Product Similarity Rankings’ table by ranking the similarity of each Product in the Market relative to each of the other Products. Downstream nodes can then use the Product Similarity Rankings table to generate a Product Correlation Matrix. For example, a Similarity Matrix node could be placed directly downstream of this Product Similarity node. Ultimately, the Product Correlation Matrix is used to generate a WTP Matrix containing the Willingness To Pay (WTP) of each Customer for each Product in a Market.

When Products are very similar, they become substitutable: if the Price of one Product increases, Customers may prefer to purchase its substitute instead. This node calculates the similarity between Products by comparing:

- Descriptive Attributes,
- Categorical Attributes, and
- Numerical Attributes.

Descriptive Attributes include the Product Description, but could also include the Product Name as well as other descriptive fields in which a partial text match is a helpful indicator of similarity.
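As an illustration of how a partial text match can be turned into a score, the sketch below uses Jaccard overlap of word tokens. This is an assumption chosen for simplicity, not the node's actual algorithm, and the function name `descriptive_similarity` is hypothetical.

```python
def descriptive_similarity(text_a: str, text_b: str) -> float:
    """Illustrative partial-match score in [0, 1] for two descriptive fields.

    Assumption: word-token Jaccard overlap stands in for whatever text
    comparison the node performs internally.
    """
    # Lower-case and split on whitespace so formatting differences are ignored.
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    union = tokens_a | tokens_b
    if not union:
        # Two empty descriptions carry no evidence of similarity.
        return 0.0
    return len(tokens_a & tokens_b) / len(union)


if __name__ == "__main__":
    print(descriptive_similarity("Organic Whole Milk 1L", "Organic Skim Milk 1L"))
    print(descriptive_similarity("Organic Whole Milk 1L", "Steel Claw Hammer"))
```

Two descriptions that share most of their words score close to 1, while descriptions with no words in common score 0.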

Categorical Attributes typically include the Brand, Store, Location, and Family of the Products. A perfect match between Categorical Attributes increases the likelihood that Products are similar, while a mismatch may either reduce the likelihood of similarity or indicate nothing at all. Missing Categorical Attributes are ignored.
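The exact-match behaviour described above can be sketched as follows. The function name `categorical_score` and the `mismatch_penalty` parameter are hypothetical; the penalty is one assumed way to model the choice between a mismatch reducing similarity and a mismatch indicating nothing.

```python
from typing import Mapping, Optional


def categorical_score(
    product_a: Mapping[str, Optional[str]],
    product_b: Mapping[str, Optional[str]],
    mismatch_penalty: float = 0.0,
) -> float:
    """Illustrative score from exact matches between Categorical Attributes.

    Assumption: each exact match adds 1.0 and each mismatch subtracts
    `mismatch_penalty` (0.0 means a mismatch indicates nothing).
    """
    score = 0.0
    # Only attributes present on both Products can be compared.
    for name in product_a.keys() & product_b.keys():
        value_a, value_b = product_a[name], product_b[name]
        if value_a is None or value_b is None:
            continue  # Missing Categorical Attributes are ignored.
        score += 1.0 if value_a == value_b else -mismatch_penalty
    return score


if __name__ == "__main__":
    a = {"Brand": "Acme", "Store": "North", "Family": None}
    b = {"Brand": "Acme", "Store": "South", "Family": "Dairy"}
    print(categorical_score(a, b))
    print(categorical_score(a, b, mismatch_penalty=0.5))
```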

Numerical Attributes may include Price, Cost, and Volume, but may also include the Dimensions of the Product or a Level of Quality. The distance between Numerical Attributes is an indication of Product similarity: the greater the numerical distance between two Products' Attributes, the lower their Similarity Score, and the score can be set to decrease exponentially with that distance.
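One common way to express a score that falls off exponentially with distance is $s = e^{-|a - b| / \text{scale}}$. The sketch below assumes that form; the function name `numerical_similarity` and the `scale` parameter are hypothetical and are not taken from the node's configuration.

```python
import math


def numerical_similarity(value_a: float, value_b: float, scale: float) -> float:
    """Illustrative exponentially decaying similarity in (0, 1].

    Assumption: similarity = exp(-|a - b| / scale), where `scale` is the
    distance at which the score drops to about 0.37.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    distance = abs(value_a - value_b)
    return math.exp(-distance / scale)


if __name__ == "__main__":
    # Identical Prices give the maximum score; larger gaps decay towards 0.
    for other_price in (10.0, 12.0, 15.0, 20.0):
        print(other_price, numerical_similarity(10.0, other_price, scale=5.0))
```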

Several Product Similarity nodes may be cascaded together. This gives the user greater control over the rankings, and allows the user to place greater importance on some Attributes than others. Custom algorithms can also be integrated upstream of Product Similarity nodes to complement the Similarity Scores and Weights calculated between Products.
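To illustrate the idea of cascading, the sketch below combines the scores arriving from an upstream stage with the scores of the current stage using a weighted sum. How the node actually merges upstream scores is not documented here, so the weighted sum, the function name `cascade_scores`, and both weight parameters are assumptions.

```python
from typing import Dict, Tuple

# A score is keyed by an ordered (if_product, then_product) pair.
Pair = Tuple[str, str]


def cascade_scores(
    upstream: Dict[Pair, float],
    current: Dict[Pair, float],
    upstream_weight: float = 1.0,
    current_weight: float = 1.0,
) -> Dict[Pair, float]:
    """Illustrative weighted sum of upstream and current Similarity Scores."""
    combined: Dict[Pair, float] = {}
    # A pair missing from one stage contributes 0.0 from that stage.
    for pair in upstream.keys() | current.keys():
        combined[pair] = (
            upstream_weight * upstream.get(pair, 0.0)
            + current_weight * current.get(pair, 0.0)
        )
    return combined


if __name__ == "__main__":
    brand_stage = {("A", "B"): 0.5}
    price_stage = {("A", "B"): 0.2, ("A", "C"): 0.4}
    # Doubling the weight of the second stage gives Price more importance.
    print(cascade_scores(brand_stage, price_stage, current_weight=2.0))
```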

Once all of the Product-to-Product Similarity Scores have been calculated, each Product can be Ranked relative to each of the others. A Ranking of #1 indicates that the ‘Then Product’ is the closest substitute for the ‘If Product’, while a Ranking of #100 indicates that the Price of the ‘If Product’ has almost no impact on the sales of the ‘Then Product’. A downstream node uses either the output ‘Similarity’ column (the default) or the output ‘Ranking’ column to calculate the Product Correlation Matrix and, ultimately, the WTP Matrix.
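The ranking step can be sketched as a per-Product sort of the Similarity Scores. The function name `rank_products`, the dictionary layout, and the alphabetical tie-breaking rule are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (if_product, then_product)


def rank_products(scores: Dict[Pair, float]) -> Dict[Pair, int]:
    """Illustrative ranking: for each 'If Product', rank 1 is the most similar."""
    candidates: Dict[str, List[Tuple[str, float]]] = defaultdict(list)
    for (if_product, then_product), score in scores.items():
        if if_product != then_product:  # A Product is not ranked against itself.
            candidates[if_product].append((then_product, score))

    rankings: Dict[Pair, int] = {}
    for if_product, others in candidates.items():
        # Highest score first; ties are broken by name (an assumption).
        ordered = sorted(others, key=lambda item: (-item[1], item[0]))
        for rank, (then_product, _) in enumerate(ordered, start=1):
            rankings[(if_product, then_product)] = rank
    return rankings


if __name__ == "__main__":
    scores = {("A", "B"): 0.9, ("A", "C"): 0.2, ("B", "A"): 0.9, ("B", "C"): 0.5}
    for pair, rank in sorted(rank_products(scores).items()):
        print(pair, rank)
```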

*This Premium Node is not available as part of the free Community Edition. Premium Nodes help clean and connect real-world data to Market Simulations, and provide advanced Market Science analysis. Note that these descriptions are often deliberately vague.*

# Downloads

# Attribute Similarity

The Product Similarity node generates an ‘Output Product Similarity Rankings’ table by ranking the similarity of each Product in the Market relative to the others by Descriptive, Categorical, and Numerical Attributes. This Similarity Rankings table can then be used by downstream nodes to generate a Product Correlation Matrix and ultimately a WTP Matrix.

This workflow estimates the degree of similarity between each of the Products using a comparison of their Descriptive, Categorical, and Numerical Attributes. The higher the score, the more likely the Products would be considered as substitutes by customers.

## Inputs

#### Input Product Array

The set of Products in a Market which are being compared and ranked by their mutual similarity. Each row corresponds to a unique Product.

#### Product Attributes

A list of Descriptive, Categorical, and Numerical Attributes that supplement those found in the Input Product Array. All Attributes found in this table are automatically added to the Descriptive, Categorical, and Numerical fields set in the Configuration Dialog.

#### Similarity Scores

A set of Similarity Scores generated by upstream Product Similarity nodes or by customized algorithms. Several Product Similarity nodes may be cascaded together to give the user greater control over Product Rankings.

## Node

#### Configuration

Categorical columns can include fields such as Store name, Brand, Location, Family, and any other string field. Unlike Descriptive columns, which are compared for partial text matches, Categorical columns are compared only for exact matches.

## Outputs

#### Output Product Array

The Output Product Array is the Input Product Array, passed through unchanged.

#### Similarity Scores

The list of raw Similarity Scores from Product comparisons of Descriptive Attributes, Categorical Attributes, and Numerical Attributes.

#### Similarity Rankings

The final list of Product Rankings from comparisons of their Descriptive Attributes, Categorical Attributes, and Numerical Attributes.