Node Description
Customer Distributions
The Customer Distributions node generates input data for a Market Simulation. It takes an optional Input Attribute List to create a set of Customer Distributions representing the Willingness To Pay (WTP) of Customers in the Market. Each row in the set of Output Customer Distributions corresponds to the partworth value of a Feature, or the WTP of a Product, for a Virtual Customer.
The Input Attribute List can define the Distribution Type and Input Parameters of each Output Customer Distribution. If the Input Attribute List does not define them for an Output Customer Distribution, then the Distribution Type and Input Parameters from the Configuration Dialog are used. Unlike the similar Matrix Distributions node, the Output Customer Distributions from this node will not be correlated.
For example, if the user wishes to create a Normal (Gaussian) Customer Distribution, then the Mean and Standard Deviation (SD) are set according to the Configuration Dialog unless they are overridden by the ‘A’ column (corresponding to the Mean) and the ‘B’ column (corresponding to the SD) in the Input Attribute List.
Similarly, if the user wishes to create a Uniform Customer Distribution, then the Minimum Value and the Maximum Value are set according to the Configuration Dialog unless they are overridden by the ‘A’ column (now corresponding to the Minimum Value) and the ‘B’ column (now corresponding to the Maximum Value) in the Input Attribute List.
The Output Customer Distributions from this Customer Distributions node can become part of a Customer Willingness To Pay Matrix (WTP Matrix) for a set of Products. The Input WTP Matrix can feed a downstream Market Simulation node or a Market Tuning node.
The Input Attribute List is optional. Missing values will be replaced by the defaults in the Configuration Dialog. If no input table is provided, then the Customer Distributions node will generate a single Customer Distribution with a Distribution Type and Input Parameters set according to the Configuration Dialog.
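The fallback rule described above can be sketched in a few lines of Python. This is only an illustration: the dictionary keys, the default values, and the function name `resolve_parameters` are hypothetical and are not taken from the node's implementation.

```python
from typing import Optional

# Hypothetical Configuration Dialog defaults (illustrative values only).
DIALOG_DEFAULTS = {"Type": "Normal", "A": 100.0, "B": 15.0}

def resolve_parameters(row: Optional[dict], defaults: dict = DIALOG_DEFAULTS) -> dict:
    """Return the settings for one distribution, filling gaps from the dialog defaults."""
    resolved = dict(defaults)
    if row:
        for key, value in row.items():
            if value is not None:  # None stands in for a missing cell
                resolved[key] = value
    return resolved

# A row that sets the type and A but leaves B missing falls back to the dialog's B.
print(resolve_parameters({"Type": "Linear", "A": 0.0, "B": None}))
# No input table at all: a single distribution built purely from the dialog defaults.
print(resolve_parameters(None))
```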
The available list of Distribution Types for the user to select from includes:
Normal (Gaussian): (Wikipedia) Generates a set of partworth values for each Virtual Customer in the shape of a Normal (Gaussian) Distribution. The partworth values can be drawn randomly or can have evenly changing gaps within a Normal Distribution of a given Mean and Standard Deviation (SD). The output values can be truncated by the Minimum and Maximum limits (if enabled). The Distribution can be sorted in Ascending, Descending, or Random order. Configuration parameters include:
 Mean (A): Any floating-point (double) value
 Standard Deviation (B): Any value greater than 0.0
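As a rough illustration of the two generation modes, the sketch below draws values randomly and, alternatively, takes evenly spaced quantiles of the same Normal Distribution. Reading "evenly changing gaps" as evenly spaced quantiles is an assumption, and the function names are hypothetical; the node itself may work differently.

```python
import random
from statistics import NormalDist

def normal_random(mean, sd, customers, seed=0):
    """One random draw per Virtual Customer."""
    rng = random.Random(seed)
    return [rng.gauss(mean, sd) for _ in range(customers)]

def normal_smooth(mean, sd, customers):
    """Evenly spaced quantiles; midpoints (i + 0.5) / n avoid the infinite tails."""
    dist = NormalDist(mean, sd)
    return [dist.inv_cdf((i + 0.5) / customers) for i in range(customers)]

smooth = normal_smooth(100.0, 15.0, 5)   # ascending, centred on the Mean
noisy = normal_random(100.0, 15.0, 5)    # unsorted random draws
```

The smooth variant is already in Ascending order; reversing or shuffling the list would give the Descending or Random ordering.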
Linear: (Wikipedia) Generates a set of partworth values for each Virtual Customer in the shape of a Uniform (Linear) Distribution. The partworth values can be drawn randomly or can be evenly spaced between the Starting Value and the Ending Value, optionally truncated by Minimum and Maximum limits. The Distribution can be sorted in Ascending, Descending, or Random order. Configuration parameters include:
 Starting Value (A): Any floating-point (double) value (inclusive)
 Ending Value (B): Any floating-point (double) value (inclusive)
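A minimal sketch of the two Linear modes follows, assuming that "evenly spaced" means equal steps with both endpoints included. The function names are hypothetical.

```python
import random

def linear_smooth(start, end, customers):
    """Evenly spaced values from start to end, both endpoints included."""
    if customers == 1:
        return [start]
    step = (end - start) / (customers - 1)
    return [start + i * step for i in range(customers)]

def linear_random(start, end, customers, seed=0):
    """Uniform random values between start and end."""
    rng = random.Random(seed)
    return [rng.uniform(start, end) for _ in range(customers)]

print(linear_smooth(10.0, 0.0, 5))  # [10.0, 7.5, 5.0, 2.5, 0.0]
```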
Asymptote End: (Wikipedia) Generates a set of partworth values from an Exponential Function of the form [a * EXP(b * CustomerID) + c]. The values selected from this Exponential Function will be between the Start value and 0.0 such that the beginning of the curve declines steeply, then rounds off and hugs the end value of 0.0. Configuration parameters include:
 Start (A): Any value greater than 0.0
 Curviness (B): The ‘Curviness’ of the Output Customer Distribution. Decreasing the Curviness will flatten the output curve, while increasing the Curviness will cause the output to be more curvy. A Curviness = 1.0 has been preset to provide a reasonable curve for about 10,000 Customer rows.
Asymptote Start: (Wikipedia) Generates a set of partworth values from an Exponential Function of the form [a * EXP(b * CustomerID) + c]. The values selected from this Exponential Function will be between the Start value and 0.0 such that the curve initially hugs the Start value and then declines steeply towards 0.0. Configuration parameters include:
 Start (A): Any value greater than 0.0
 Curviness (B): The ‘Curviness’ of the Output Customer Distribution. Decreasing the Curviness will flatten the output curve, while increasing the Curviness will cause the output to be more curvy. A Curviness = 1.0 has been preset to provide a reasonable curve for about 10,000 Customer rows.
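The two Asymptote shapes can be approximated with the exponential form given above. The mapping from Curviness to the exponent b is not documented here, so the constant `RATE_PER_CURVINESS` below is a hypothetical choice that merely gives a sensible curve for roughly 10,000 rows; treat the whole block as a sketch rather than the node's formula.

```python
import math

# Hypothetical scaling: Curviness 1.0 maps to a decay rate of 0.001 per row.
RATE_PER_CURVINESS = 1.0e-3

def asymptote_end(start, curviness, customers):
    """Steep decline from Start, then hugging 0.0: a = Start, b < 0, c = 0."""
    b = -curviness * RATE_PER_CURVINESS
    return [start * math.exp(b * row) for row in range(customers)]

def asymptote_start(start, curviness, customers):
    """Hugging Start, then a steep decline to 0.0 on the last row: b > 0, c = Start."""
    b = curviness * RATE_PER_CURVINESS
    a = -start * math.exp(-b * (customers - 1))
    return [a * math.exp(b * row) + start for row in range(customers)]

end_curve = asymptote_end(100.0, 1.0, 10_000)
start_curve = asymptote_start(100.0, 1.0, 10_000)
```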
Beta: (Wikipedia) Generates a set of random partworth values for each Virtual Customer in the shape of a Beta Distribution with a user-specified Alpha and Beta:
 Alpha (A): Any value greater than 0.0
 Beta (B): Any value greater than 0.0
Binomial: (Wikipedia) Generates a set of random integer partworth values for each Virtual Customer in the shape of a Binomial Distribution with a user-specified Number of Trials and Probability of Success. Note that the Bernoulli distribution is a special case of the Binomial distribution where just a single trial is conducted (Trials = 1). Configuration parameters include:
 Trials (A): Number of Trials is any integer value greater than 0
 Probability (B): Probability of Success is any value between 0.0 and 1.0 (exclusive)
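To make the relationship between the Binomial and Bernoulli cases concrete, the sketch below builds a Binomial value as a count of successful Bernoulli trials. It is a plain standard-library illustration with hypothetical names, not the sampling method used by the node.

```python
import random

def binomial_draw(trials, probability, rng):
    """Count the successes in `trials` independent Bernoulli trials."""
    return sum(1 for _ in range(trials) if rng.random() < probability)

rng = random.Random(42)
binomial_values = [binomial_draw(10, 0.3, rng) for _ in range(1000)]   # integers 0..10
bernoulli_values = [binomial_draw(1, 0.3, rng) for _ in range(1000)]   # Trials = 1: only 0 or 1
```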
Cauchy: (Wikipedia) Generates a set of random partworth values for each Virtual Customer in the shape of a Cauchy Distribution with a user-specified Median and Scale:
 Median (A): Any floating-point (double) value
 Scale (B): Any value greater than 0.0
Chi-Square: (Wikipedia) Generates a set of random partworth values for each Virtual Customer in the shape of a Chi-Square Distribution with a user-specified ‘Degrees of Freedom’. After the partworth value is calculated, the fixed value from ‘Input Parameter B’ is added to shift the result:
 Degrees of Freedom (A): Any value greater than 0.0
 Then Add Fixed Value (B): Any floating-point value added after the random value is calculated
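The "draw, then add a fixed value" pattern can be sketched as follows. The sketch relies on the standard identity that a Chi-Square Distribution with k Degrees of Freedom is a Gamma Distribution with shape k/2 and scale 2; the function name is hypothetical and the node's own sampler may differ.

```python
import random

def chi_square_shifted(degrees_of_freedom, shift, customers, seed=0):
    """Chi-Square draws (via Gamma(k/2, scale 2)) with Input Parameter B added afterwards."""
    rng = random.Random(seed)
    return [rng.gammavariate(degrees_of_freedom / 2.0, 2.0) + shift
            for _ in range(customers)]

shifted = chi_square_shifted(4.0, 10.0, 5000)
# The unshifted mean equals the Degrees of Freedom, so the sample mean lands near 4 + 10.
```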
Exponential: (Wikipedia) Generates a set of random partworth values for each Virtual Customer in the shape of an Exponential Distribution with a user-specified Mean. After the partworth value is calculated, the fixed value from ‘Input Parameter B’ is added to shift the result:
 Mean (A): Any value greater than 0.0
 Then Add Fixed Value (B): Any floating-point value added after the random value is calculated
F: (Wikipedia) Generates a set of random partworth values for each Virtual Customer in the shape of an F Distribution with a user-specified ‘Degrees of Freedom Numerator’ and ‘Degrees of Freedom Denominator’:
 Degrees of Freedom Numerator (A): Any value greater than 0.0
 Degrees of Freedom Denominator (B): Any value greater than 0.0
Gamma: (Wikipedia) Generates a set of random partworth values for each Virtual Customer in the shape of a Gamma Distribution with a user-specified Shape and Scale:
 Shape (A): Any value greater than 0.0
 Scale (B): Any value greater than 0.0
Inverse Gaussian: (Wikipedia) Generates a set of random partworth values for each Virtual Customer in the shape of an Inverse Gaussian Distribution with a user-specified Mu and Lambda. As Lambda tends to infinity, the Inverse Gaussian distribution becomes more like a Normal (Gaussian) distribution:
 Mu (A): The Mean having any value greater than 0.0
 Lambda (B): The Shape Parameter having any value greater than 0.0
Poisson: (Wikipedia) The Poisson Distribution can be used for modeling the number of times an event occurs in an interval of time or space. Generates a set of random partworth values for each Virtual Customer in the shape of a Poisson Distribution with a user-specified Lambda and Entropy:
 Lambda (A): The Poisson Mean having any value greater than 0.0
 Entropy (B): The Convergence criterion for cumulative probabilities (set to 0.0 by default)
Quadratic: (Wikipedia) The Quadratic Distribution starts at the y-intersect, decreases (or increases) to touch the x-intersect once, then increases (or decreases) again. The Distribution follows the equation [y = a ( x^2 – b )] with only one x-intersection occurring at the minimum (or maximum) of the y-value. The Quadratic Distribution can be used to model the ‘Cost To Make’ (CTM) a Product where the Marginal Cost initially falls with increased production, but then starts to increase again as resources become scarce and operational inefficiencies are magnified. As the minimum value is fixed at 0.0, it may be necessary to shift the values in this Distribution before using it in a Market Simulation model.
 X-Intersection (A): The CustomerID row in the Output Distribution where the curve touches the X-Axis once (the X-Intersection cannot equal 0.0)
 Y-Intersection (B): The starting value of the Output Distribution where the curve intersects the Y-Axis (the Y-Intersection cannot equal 0.0)
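The sketch below shows one parabola that matches the behaviour described in prose: it starts at the Y-Intersection on row 0 and touches zero exactly once, at the X-Intersection row. The form y = B * (1 - row / A)^2 is an assumption chosen to fit that description; it is not the bracketed equation quoted above and is not confirmed as the node's formula.

```python
def quadratic(x_intersection, y_intersection, customers):
    """Parabola through (0, B) that touches the X-Axis once at row A (assumed form)."""
    if x_intersection == 0 or y_intersection == 0:
        raise ValueError("X-Intersection and Y-Intersection cannot equal 0.0")
    return [y_intersection * (1.0 - row / x_intersection) ** 2
            for row in range(customers)]

cost_curve = quadratic(50, 200.0, 101)
# Falls from 200.0 on row 0 to 0.0 on row 50, then rises back to 200.0 on row 100.
```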
Sawtooth: (Wikipedia) The Sawtooth wave distribution looks like the teeth of a plain-toothed saw. The raw (unsorted) Distribution starts at zero and ramps upwards towards the Distribution’s Amplitude. It reaches the Amplitude after the Distribution’s Period, then drops to zero and starts again. Configuration parameters include:
 Amplitude (A): The maximum height of the wave
 Period (B): The number of Customer rows in the Output Distribution (greater than 0.0) before the wave repeats itself
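A minimal sketch of the raw (unsorted) Sawtooth wave, assuming the ramp is linear in the row index. The function name is hypothetical.

```python
def sawtooth(amplitude, period, customers):
    """Linear ramp from 0 towards the Amplitude, restarting every `period` rows."""
    return [amplitude * (row % period) / period for row in range(customers)]

print(sawtooth(10.0, 4, 9))  # [0.0, 2.5, 5.0, 7.5, 0.0, 2.5, 5.0, 7.5, 0.0]
```

In this discrete form the wave resets on the row where it would reach the Amplitude, so the largest sampled value sits one step below it.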
Sigmoid: (Wikipedia) Has the characteristic horizontal ‘S-shaped’ curve and is part of the family of Logistic Functions of the form [a / (1 + EXP(b * (row – Customers/2)))]. The values selected from this function will be between the Start value and 0.0 such that the beginning of the curve hugs the Start value, then steepens, then the end of the curve hugs the end value of 0.0. Configuration parameters include:
 Start (A): Any value greater than 0.0
 Curviness (B): The ‘Curviness’ of the Output Customer Distribution. Decreasing the Curviness will flatten the output curve, while increasing the Curviness will cause the output to be more curvy. A Curviness = 1.0 has been preset to provide a reasonable curve for about 10,000 Customer rows.
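The Logistic Function quoted above can be evaluated row by row as in the sketch below. How Curviness maps to the coefficient b is not stated, so `RATE_PER_CURVINESS` is a hypothetical scaling picked to give a visible S-shape over about 10,000 rows.

```python
import math

# Hypothetical scaling from Curviness to the coefficient b.
RATE_PER_CURVINESS = 1.0e-3

def sigmoid(start, curviness, customers):
    """a / (1 + EXP(b * (row - Customers/2))) with a = Start."""
    b = curviness * RATE_PER_CURVINESS
    return [start / (1.0 + math.exp(b * (row - customers / 2)))
            for row in range(customers)]

s_curve = sigmoid(100.0, 1.0, 10_000)
# Near 100 at the first row, exactly 50 at the midpoint row, near 0 at the last row.
```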
Simple Bimodal: (Wikipedia) Generates a simple Bimodal Distribution (a ‘two-humped’ Customer Distribution) from two Normal (Gaussian) Distributions. The user specifies the ‘First Mean’ and the ‘Second Mean’ with the Standard Deviation (SD) automatically calculated to be a quarter of the distance between the two Means. The user specifies:
 First Mean (A): Half of the Virtual Customers will be distributed around the ‘First Mean’
 Second Mean (B): Half of the Virtual Customers will be distributed around the ‘Second Mean’. The ‘First Mean’ cannot equal the ‘Second Mean’.
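The Simple Bimodal rule (two Normal Distributions, each with an SD of a quarter of the distance between the Means) can be sketched as below. The function name and the way an odd number of Customers is split are assumptions.

```python
import random

def simple_bimodal(first_mean, second_mean, customers, seed=0):
    """Half the Customers around each Mean; SD = |Second Mean - First Mean| / 4."""
    if first_mean == second_mean:
        raise ValueError("The First Mean cannot equal the Second Mean")
    sd = abs(second_mean - first_mean) / 4.0
    rng = random.Random(seed)
    half = customers // 2  # assumption: any odd Customer goes to the second hump
    values = [rng.gauss(first_mean, sd) for _ in range(half)]
    values += [rng.gauss(second_mean, sd) for _ in range(customers - half)]
    return values

humps = simple_bimodal(20.0, 60.0, 2000)  # SD is (60 - 20) / 4 = 10
```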
Sinusoidal: (Wikipedia) The smooth periodic oscillation generated from the sine function rising and falling between 0.0 and the Amplitude. The raw (unsorted) Distribution starts rising at half-Amplitude and reaches the Amplitude after a quarter-Period. It then curves downward and reaches 0.0 after three quarter-Periods. Configuration parameters include:
 Amplitude (A): The maximum height of the wave
 Period (B): The number of Customer rows in the Output Distribution (greater than 0.0) before the wave repeats itself
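A sine wave that oscillates between 0.0 and the Amplitude, starting at half-Amplitude, can be written as a shifted and scaled sine. The sketch below is one such form and uses a hypothetical function name.

```python
import math

def sinusoidal(amplitude, period, customers):
    """Sine wave lifted to the range 0..Amplitude, starting at half-Amplitude and rising."""
    return [amplitude / 2.0 * (1.0 + math.sin(2.0 * math.pi * row / period))
            for row in range(customers)]

wave = sinusoidal(10.0, 8, 9)
# Row 0 is at half-Amplitude, row 2 (a quarter-Period) is at the Amplitude,
# and row 6 (three quarter-Periods) is at zero, up to floating-point rounding.
```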
Spike: (Wikipedia) Is a vertical ‘S-shaped’ curve that looks similar to a rotated Sigmoid function but is generated from a pair of Exponential Functions of the form [a * EXP(b * CustomerID) + c]. The values selected from this Exponential Function will be between the Start value and 0.0 such that the beginning of the curve declines steeply, then rounds off, but then declines steeply again towards the end value of 0.0. Note that a sorted Normal Distribution will also generate a similar-looking vertical S-shaped curve. Configuration parameters include:
 Start (A): Any value greater than 0.0
 Curviness (B): The ‘Curviness’ of the Output Customer Distribution. Decreasing the Curviness will flatten the output curve, while increasing the Curviness will cause the output to be more curvy. A Curviness = 1.0 has been preset to provide a reasonable curve for about 10,000 Customer rows.
Square: (Wikipedia) The Square wave distribution alternates at a steady frequency between the Amplitude and 0.0. The raw (unsorted) Distribution starts at the Amplitude and drops to zero after a half-Period. After the Distribution’s Period, the wave is reset to its Amplitude and starts again. Configuration parameters include:
 Amplitude (A): The maximum height of the wave
 Period (B): The number of Customer rows in the Output Distribution (greater than 0.0) before the wave repeats itself
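The raw (unsorted) Square wave reduces to a test on the position within the current Period, as in this short sketch with a hypothetical function name.

```python
def square(amplitude, period, customers):
    """Amplitude for the first half of each Period, 0.0 for the second half."""
    return [amplitude if (row % period) < period / 2 else 0.0
            for row in range(customers)]

print(square(10.0, 4, 8))  # [10.0, 10.0, 0.0, 0.0, 10.0, 10.0, 0.0, 0.0]
```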
T: (Wikipedia) Generates a set of random partworth values for each Virtual Customer in the shape of a T Distribution with a user-specified Degrees of Freedom. After the partworth value is calculated, the fixed value from ‘Input Parameter B’ is added to shift the result:
 Degrees of Freedom (A): Any value greater than 0.0
 Then Add Fixed Value (B): Any floating-point value added after the random value is calculated
Triangle: (Wikipedia) The Triangle wave distribution rises and falls linearly between 0.0 and the Amplitude. The raw (unsorted) Distribution climbs steadily from half-Amplitude and reaches the Amplitude after a quarter-Period. It then falls steadily and reaches 0.0 after three quarter-Periods. Configuration parameters include:
 Amplitude (A): The maximum height of the wave
 Period (B): The number of Customer rows in the Output Distribution (greater than 0.0) before the wave repeats itself
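The Triangle wave described above can be produced by shifting the phase a quarter-Period forward and folding it, as in the sketch below. The formulation and the function name are illustrative choices.

```python
def triangle(amplitude, period, customers):
    """Linear rise and fall between 0.0 and the Amplitude, starting at half-Amplitude."""
    values = []
    for row in range(customers):
        phase = (row / period + 0.25) % 1.0  # quarter-Period shift so row 0 sits mid-rise
        values.append(amplitude * (1.0 - abs(2.0 * phase - 1.0)))
    return values

print(triangle(10.0, 8, 9))  # [5.0, 7.5, 10.0, 7.5, 5.0, 2.5, 0.0, 2.5, 5.0]
```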
Weibull: (Wikipedia) Generates a set of random partworth values for each Virtual Customer in the shape of a Weibull Distribution with a user-specified Shape and Scale:
 Shape (A): Any value greater than 0.0
 Scale (B): Any value greater than 0.0
Note: technical details concerning how the data generation is performed can be found by referring to the Apache Commons Math Library.
This Community Node documentation assumes you have already downloaded the open-source KNIME analytics platform and installed the free Market Simulation (Community Edition) plug-in. If not, start by returning to Getting Started.
Downloads
#1 Normal Distribution
Inputs
None
No inputs required.
Node
Configuration
For a ‘Normal’ Distribution Type, Input Parameter A = Mean and Input Parameter B = Standard Deviation (SD).
Outputs
Attribute List
The Output Attribute List is empty if the Input Attribute List is missing.
Customer Distributions
There will be a Customer Distribution column for each row in the Input Attribute List. But if the Input Attribute List is missing then the node will generate just a single Customer Distribution called ‘Distribution’.
#2 Inverse Gaussian
Inputs
None
No inputs required.
Node
Configuration
For an ‘Inverse Gaussian’ Distribution Type, Input Parameter A = Mu and Input Parameter B = Lambda. See the Wikipedia page for more details.
Outputs
Attribute List
The Output Attribute List is empty if the Input Attribute List is missing.
Customer Distributions
There will be a Customer Distribution column for each row in the Input Attribute List. But if the Input Attribute List is missing then the node will generate just a single Customer Distribution called ‘Distribution’.
#3 Simple Bimodal
Inputs
None
No inputs required.
Node
Configuration
For a ‘Simple Bimodal’ Distribution Type, Input Parameter A = First Mean and Input Parameter B = Second Mean. The Standard Deviation (SD) of both Normal Distributions is equal to a quarter of the distance between the two Means. See the Wikipedia page for more details.
Outputs
Attribute List
The Output Attribute List is empty if the Input Attribute List is missing.
Customer Distributions
There will be a Customer Distribution column for each row in the Input Attribute List. But if the Input Attribute List is missing then the node will generate just a single Customer Distribution called ‘Distribution’.
#4 Many Distributions
Inputs
Attribute List
To generate many Customer Distributions at the same time, the ‘Table Creator’ node can be used to define the Distribution Type as well as the Input Parameters in the ‘Input Attribute List’.
Node
Configuration
The Distribution Type and Input Parameters in the Configuration Dialog are used as default values in case they are not defined in the upstream Input Attribute List.
Outputs
Attribute List
The Output Attribute List adds the Mean and Standard Deviation (SD) columns to the Input Attribute List.
Customer Distributions
There will be a Customer Distribution column for each row in the Input Attribute List.
Update #1 – Chains
 Distribution Naming, and
 Chaining together multiple Customer Distribution nodes.
Distribution Naming: In the past, solo Customer Distributions would all be named ‘Distribution’. To change this name it was necessary to use the KNIME ‘Column Rename’ node. It was only possible to directly specify a Customer Distribution Name if an optional ‘Input Attribute List’ was connected to the top-port of the node. Now it is possible to set a user-defined name for each solo Customer Distribution within the node’s Configuration Dialog.
Chaining Customer Distributions: In the past, the KNIME ‘Column Appender’ node or the KNIME ‘Joiner’ node was required to collect together multiple Customer Distributions into a single table. Now a new (optional) ‘Input Customer Distributions’ port has been added to the bottom of the Customer Distributions node. Using this bottom-port allows the user to link upstream Customer Distributions so that they are automatically appended before the new Customer Distributions generated downstream.
Update #2 – Linear Types
Two Customer Distribution Types have been deprecated and replaced with a new ‘Linear’ type:
 Deprecated: Uniform
 Deprecated: Ordered
 New: Linear
Linear: (Wikipedia) Generates a set of partworth values for each Virtual Customer in the shape of a Uniform (Linear) Distribution. The partworth values can be drawn randomly or can be evenly spaced between the Starting Value and the Ending Value by setting the ‘Smooth’ parameter. The Distribution can optionally be truncated by Minimum and Maximum limits. The Distribution can be sorted in Ascending, Descending, or Random order. Configuration parameters include:
 Starting Value (A): Any floating-point (double) value (inclusive)
 Ending Value (B): Any floating-point (double) value (inclusive)
Update #3 – Min / Max

Maximum: If enabled, the data generated for the Customer Distribution will be capped at this ceiling Maximum. If a randomly generated data point is greater than this Maximum value, then a second randomly generated data point will be used instead. The final data point will only be set to this Maximum value after multiple attempts to generate an acceptable random data point have failed. This Configuration Dialog default can be overridden by a ‘Maximum’ column in the Input Attribute List.
Minimum: If enabled, the data generated for the Customer Distribution will be capped at this floor Minimum. If a randomly generated data point is less than this Minimum value, then a second randomly generated data point will be used instead. The final data point will only be set to this Minimum value after multiple attempts to generate an acceptable random data point have failed. This Configuration Dialog default can be overridden by a ‘Minimum’ column in the Input Attribute List.
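The redraw-then-cap behaviour of the Maximum and Minimum limits can be sketched as follows. The number of attempts is not documented beyond "multiple attempts", so `MAX_ATTEMPTS = 10` is a hypothetical value, and `bounded_draw` is an illustrative helper rather than the node's code.

```python
import random

MAX_ATTEMPTS = 10  # hypothetical; the documentation only says "multiple attempts"

def bounded_draw(draw, minimum=None, maximum=None, max_attempts=MAX_ATTEMPTS):
    """Redraw while the value is out of range; cap it only after the last attempt fails."""
    value = draw()
    for _ in range(max_attempts - 1):
        in_range = ((minimum is None or value >= minimum)
                    and (maximum is None or value <= maximum))
        if in_range:
            return value
        value = draw()  # try another random data point
    if minimum is not None and value < minimum:
        return minimum
    if maximum is not None and value > maximum:
        return maximum
    return value

rng = random.Random(1)
capped = [bounded_draw(lambda: rng.gauss(100.0, 15.0), 80.0, 120.0) for _ in range(1000)]
```

Redrawing before capping keeps most of the probability mass inside the range instead of piling values up exactly on the limits, which is why the cap is applied only as a last resort.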
In the example below, the first Distribution has been capped between a Maximum and Minimum range, while the second Distribution has not. The difference between the two outputs can be seen in the histogram.
Update #4 – New Types
A number of new Customer Distribution Types have been implemented, including:
 Exponential Functions:
 Asymptote Start
 Asymptote End
 Sigmoid
 Spike
 Periodic Functions:
 Square
 Triangle
 Sawtooth
 Sinusoidal
 Other Functions:
 Quadratic