Using Parametric Input Distributions

From Support

Jump to: navigation, search

Efficient algorithms have been developed for imitating the sampling from a large number of parametric families of probability models. Some SIGMA functions are presented for artificially generating samples that behave very much as though they were actually drawn from specific parametric distributions.

The values of parameters for these models determine the particular characteristics of the sample. This ability to easily change the nature of the input by changing a few parameter values is the primary advantage of using these models to drive a simulation. The variate generation algorithms in common use are fast and require very little memory. Furthermore, you can easily run replications, compress time, and generalize the results to other systems having the same structure. The major drawback to using parametric input distributions is that they can be difficult to explain and justify to people who have no background in probability and statistics.

Devroye (1986) provides a very complete reference on variate generation algorithms. The article by Leemis (1986) catalogs the relationships between dozens of probability laws.

There are several obvious classifications of probability models: finite or infinite range, continuous or discrete values. On a practical level we can also classify probability models as primarily providing good models for input processes or good models for output statistics.

Most of the common distributions that are used to model output statistics are derived from the normal (also called the Gaussian) distribution. In SIGMA the function NOR{M;S} will imitate sampling from a normally distributed population with a mean of M and a standard deviation of S. M can be any real-valued expression, and S can be any positive real-valued expression. Samples from other distributions such as the t, F, and χ-square can easily be derived from their relationship to the standard normal distribution (Leemis, 1986). A selection of parametric distributions that are good for input modeling are provided as SIGMA functions.

Standard Beta Density Shapes
Standard Beta Density Shapes

The function BET{P;Q} imitates sampling from a beta random variate with parameters given by the positive real-valued expressions P and Q. This is a standard beta on the interval from 0 to 1 and can be scaled to the interval (C,C+D) in the usual way as

C+D*BET{P;Q}. 

The beta distribution is one of the most useful in simulation input modeling because of the richness of shapes it can take with simple changes of its two parameters. The above figure provides a convenient "matrix" of beta distribution shapes for various values of its parameters.

ERL{M} will give a sample imitating an M-Erlang random variate. The parameter, M, can be any positive integer-valued expression. Multiplication by a real number, A, will move the mean from M to A*M. Since the M-Erlang is the sum of M independent exponentially distributed random variates with mean 1, A*ERL{1} will be an exponential random variate with mean, A.

If you want a sample that is even more highly skewed than one having an exponential, the function, GAM{A}, will imitate sampling from a gamma distribution with a fractional parameter. Here the shape parameter, A, is a real variable strictly between 0 and 1. The result can be multiplied by a scale parameter to give a variety of distribution shapes. For integer values of M, M-Erlang random variates have the same distribution as gamma variates.

The function, TRI{C}, will imitate sampling from a triangular shaped distribution over the range from 0 to 1. The mode (peak) of the distribution is at the value of the real-valued expression, C, which is between 0 and 1 inclusive. The linear function,

 A+(B-A)*TRI{(D-A)/(B-A)}

imitates a sample from a triangular distribution between A and B with a mode at D.

It is easy to code other probability models with SIGMA. We will illustrate this with two very useful models, the multinomial and the lambda. The multinomial probability law models independent sampling, with replacement, where there are only K possible mutually-exclusive outcomes. The simplest example of multinomial sampling is drawing a numbered ball from a jar where after each sample the chosen ball is placed back in the jar. While there are more efficient algorithms available, multinomial sampling is easily done in SIGMA using the DISK function. To illustrate, suppose that there are only three possible outcomes from our experiment. We will see a 1 with a probability of 1/2, a 2 with a probability of 1/3, and a 3 with a probability of 1/6. We simply set up a data file called, MULTINOM.DAT, which has the following entries

1  1  1  2  2  3  

The statement

X = DISK{MULTINOM.DAT;1+6*RND} 

will assign the values of X with the given probabilities. The index of the above DISK function is a randomly chosen integer from 1 to 6 (rounding the index down is automatic).

The lambda (more properly the "generalized" lambda) distribution is like the beta in that it can take on a wide variety of shapes. This distribution is discussed by Ramberg, et al. (1979). The major difference between the lambda and the beta is that the lambda can take on an infinite range of values whereas the beta is restricted to take on values only within a specified interval. There are four parameters to the lambda that can be estimated using subjective data. Generation of a lambda variate is very easy. Suppose that you have defined real-valued state variables, X and R, along with the four lambda parameters, L1, L2, L3, and L4. The statements

R=RND,
X=L1+(R^L3-(1-R)^L4)/L2 

will give values of X that imitate sampling from the lambda.

You should exercise caution when using Erlang, exponential, gamma, lambda, and normal distributions for input modeling. Variates from these families can take on very large values. You need to check and/or control for reasonableness. For instance, if you are using an exponential (ERL{1}) variate to model the service time at a store, it is not reasonable for the service time to exceed some upper limit. No one is going to wait years for service. Truncating these distributions at some upper bound is advised. To illustrate, a vertex with the state changes,

X=ERL{1},
X=X*(X<5)

would produce a sample from an exponential variate truncated to be strictly less than 5. The truncated probability (the likelihood that an exponential variate exceeds 5) is added to the probability of zero occurring.


Back to Inputs/Outputs

Personal tools
Navigation