# Discrete Event System Modeling

```
"A process cannot be understood by stopping it. Understanding must move with the flow of the process,
must join it and flow with it."
- First Law of Mentat
```

This chapter contains an introduction to the key concepts and terminology of discrete event simulation. The event graph, a method of concisely organizing the elements of a discrete event simulation, is introduced. Using a simple waiting line as an example, an elementary event graph is developed and explained. The future events list, which is the master scheduler of events in a discrete event simulation, is examined in detail. A verbal description of an event graph is introduced as a first step in developing a formal event graph.

## Background and Terminology for Systems Modeling

Here we will use computer simulation to study the dynamic behavior of systems, i.e., how systems change over time. Our focus will be on those systems where the status of the system changes at particular instants of time; such systems are called discrete event systems. Discrete event systems can be found in areas as diverse as manufacturing, transportation, computing, communications, finance, medicine, and agriculture. Engineers, scientists, managers, and planners use simulation methodologies to design and test new systems and to evaluate existing ones, thus avoiding the expense and risks of physical prototypes and pilot studies.

### Systems

It will be sufficient for our purposes to define a system as:

```
A collection of entities that interact with a common purpose according to sets of
laws and policies.
```

### Models

We will define a model simply as:

```
A system used as a surrogate for another system.
```

In typical computer simulation models, a system with mathematical entities is used as a surrogate for a system with physical entities. In this book when we use the word system, without qualification, we are referring to a real or hypothetical system that is the subject of the simulation study.

Mathematical models are conceptual abstractions of a particular aspect of a system. The mathematics we will be using include probability, statistics, and graphs. When we use the word model, without qualification, we will be referring to a graphical description of a system called an event graph. Finally, simulations will refer to computer programs developed from event graph models. Simulations will be our methodology for studying the model. A model serves as the interface between a system and a methodology for studying the system.

When evaluating a simulation, it is important to differentiate modeling a system from coding a model. Whether or not a simulation is "good" is more or less objective. A good simulation is a completely faithful rendition of a good model; nothing in the model is lost in the code. The process of testing if the simulation is good is called simulation verification, which is discussed in more detail later. This is often much more complicated than verification of other types of computer programs. However, there is no conceptual difficulty in defining a good program as being error free. Of course, the most one can honestly certify about any computer program is that it currently has no known errors.

Defining what constitutes a "good" model is much more subjective. A good model is based on good assumptions. Good assumptions make the simulation more efficient or the system easier to understand while costing little in terms of validity. Driving a good bargain between model simplicity and model validity is the essence of the art of modeling.

A purist view of validity is that a model will be valid as long as it is based on explicit assumptions and the implications of the assumptions are well understood. From this academic viewpoint, a modeling error is a failure to state and apply all model assumptions correctly. If we take the pragmatist's view of a good model as contributing to correct decisions, it is possible to test the effects of including a particular detail or making a certain assumption by comparing the behavior of the simulation with and without the detail or assumption. Simulation is one of the few methodologies that allows testing the robustness of models with different assumptions. Perhaps the biggest danger in simulation modeling is including too much detail in the model. An experienced consultant in the field once remarked that he could tell a novice at simulation by the excessive amount of detail in his or her models.

A technique for keeping model details at a reasonable level is to focus on the similarities among the entities in the system rather than the differences. If transient entities (customers, jobs, messages, etc.) can be treated as identical, you can develop a valid model that merely keeps track of the numbers of transient entities at various stages of their progress through a system. This makes it unnecessary to have a detailed record for each individual entity. Such a model would require updating relatively few integers (the counts) instead of creating and maintaining separate records for every transient entity in the system. Similarly, treating resident entities (servers, machines, buffers, etc.) as identical allows you to maintain counts of the numbers of resident entities in the various states of their process cycles rather than keeping a record of the status of each entity. In situations where there are a great many transient entities in the system at one time, it might be necessary to treat transient entities as identical. It is certainly more efficient to have a single integer variable that counts transient entities than it is to create and maintain thousands of complete records.
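As a rough illustration of this trade-off, the following Python sketch contrasts the two bookkeeping styles for a single waiting line. The names `Job`, `RecordQueue`, and `CountQueue` are hypothetical and chosen only for this example. Both versions report the same queue length, but the counting version stores a single integer.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """A detailed record for one transient entity (hypothetical fields)."""
    job_id: int
    arrival_time: float


class RecordQueue:
    """Keeps a separate record for every waiting job."""

    def __init__(self):
        self.jobs = []

    def arrive(self, job_id, now):
        self.jobs.append(Job(job_id, now))

    def start_service(self):
        self.jobs.pop(0)  # the first job in line leaves the queue

    def length(self):
        return len(self.jobs)


class CountQueue:
    """Treats waiting jobs as identical and keeps only their number."""

    def __init__(self):
        self.count = 0

    def arrive(self, job_id, now):
        self.count += 1  # job_id and now are ignored on purpose

    def start_service(self):
        self.count -= 1

    def length(self):
        return self.count


# Drive both queues with the same (made-up) sequence of events.
records, counts = RecordQueue(), CountQueue()
for queue in (records, counts):
    queue.arrive(1, 0.0)
    queue.arrive(2, 0.4)
    queue.start_service()
    queue.arrive(3, 1.1)

print(records.length(), counts.length())
```

The record-based version is worth its extra storage only when the study needs per-entity information, such as individual arrival times.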

It is natural to notice differences between entities in a system. However, a valuable modeling skill is the ability to recognize similarities. Differences are modeled only when they are essential to the validity of the study results. It is also natural to include detail unless there are solid reasons for assuming that it can be omitted. When building simulation models of complicated systems, it is good practice to require justification for including detail. Even when the differences in entities are thought to be important, a skilled modeler will be able to define groups of entities that can be treated as identical.

Sometimes the activity of developing a simulation model has as much value as the model itself. Building a model forces us to identify our objectives, determine constraints, quantify our knowledge, and expose our misconceptions. It could be argued that a study has merit even if its recommendations are never implemented and that a simulation model has value even if it never runs. Of course, this is of little comfort to the student who fails a homework assignment or an engineer who must try to find another job. It is vastly more satisfying to professors and employers if the simulation model runs and the recommendations from the study are adopted. The motivation for the development of event graphs was to make simulation models easier to build and verify. With event graphs, it is much easier to verify that your simulation program reflects the way you have modeled the system than it is to validate that the model actually can be used to imitate the relevant behavior of the real system.

### Model Verification

An absolutely valid simulation model with all the detail and behavior of real life is probably not attainable, or even desirable. However, every simulation model should do what its creator intended. Ensuring that the computer code for the simulation model does what you think it is doing is referred to as the process of model verification.

There is a trade-off between validation and verification of a simulation model. Adding detail to a model makes the code more complicated. If correctly implemented, this detail may improve model validity. However, adding complex details can make code verification more difficult, if not impossible. Deciding which substantive details belong in a simulation model is important and often subject to negotiation.

Gross errors in simulation code can be detected using standard statistical tests. For example, a classical paired t-test between the means of samples from the real world and those from simulation runs might be conducted. (A description of a t-test can be found in any introductory statistics textbook.) However, there may not be enough real-world data to reject a hypothesis that the model and real-world data have nearly the same parameters. Moreover, the data itself may not be valid. (See the "Five Dastardly D's of Data" in Chapter 9.)
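For readers who want to see the arithmetic, here is a minimal Python sketch of the paired t statistic using only the standard library. The function name `paired_t_statistic` and the sample values are hypothetical placeholders, not data from any study. Comparing the statistic with a critical value is left to a statistics table or package.

```python
import math
import statistics


def paired_t_statistic(real, simulated):
    """Return (t, degrees of freedom) for paired observations."""
    if len(real) != len(simulated) or len(real) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    differences = [r - s for r, s in zip(real, simulated)]
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)  # sample standard deviation (n - 1)
    standard_error = sd_diff / math.sqrt(len(differences))
    return mean_diff / standard_error, len(differences) - 1


# Made-up paired measurements, for illustration only.
real_world = [2.0, 4.0, 6.0]
simulation = [1.0, 2.0, 5.0]
t_value, dof = paired_t_statistic(real_world, simulation)
print(t_value, dof)
```

A large absolute value of the statistic relative to the critical value for the given degrees of freedom would suggest a gross discrepancy between model and system.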

Translating a model from computer code into clear language is a good exercise. It is interesting to contrast what two different people think a model is doing. You can also compare what you think your model is doing with what it thinks it is doing using the English translations generated by SIGMA.

An excellent tool for helping to verify a simulation model is an informal exercise called a "Turing test," named after Alan Turing, who conjectured on the possibility of not being able to distinguish computing machines from real people. For a Turing test, actual blank forms used in the day-to-day management of a system are filled in with either simulated or real data. Only the blanks that are relevant for the purposes of the study are different. Managers and other people familiar with the system are then asked to identify the real and simulated documents and tell how each form was identified. It is vital that everyone know in advance that this exercise is likely to be repeated several times.

People familiar with these forms are usually more comfortable doing this exercise than they are in reviewing a computer code, evaluating a statistical analysis, or even observing an animation. Just the process of determining what data on each form are relevant to the study can make this exercise worthwhile.

A typical experience with a Turing test has been that the manager can immediately identify most, if not all, of the bogus forms! This is actually a good outcome; it indicates that the manager is paying attention, and it can lay the foundation for effective communication. A non-technical manager "winning" the first round may also help defuse any antagonism that they may have developed toward the simulation project. What happens next is critical: when the manager tells how the simulated data was identified, changes to the simulation model should be made and the exercise repeated. It is the repetition of the exercise that is important, not the outcome of each iteration.

Statistical analysis to assess whether or not the outcome of the exercise is likely to result from guessing is presented in Schruben (1980). However, such formal analysis can actually be detrimental. It could easily inhibit communication and alienate the manager by moving the discussion to the unfamiliar ground of mathematical statistics.

## Discrete Event Systems and Simulations

As stated earlier, systems in which changes occur at particular instants of time are called discrete event systems. In a simulation of a discrete event system, time is advanced in discrete (variable and often random length) steps to the next interesting state change; uninteresting time intervals are skipped over. This coarse level of detail permits the modeling of very large systems such as airports and factories.

A description of the state of a discrete event system will include values for all of its numerical attributes as well as any schedule it might have for the future. Changes in the state are called events. In a production system, events might include the completion of a machining operation (the state of a machine would change from "busy" to "idle"), the failure of a machine (the machine state would change to "broken"), the arrival of a repair crew (the machine state would change to "under repair"), the arrival of a part at a machining center (the machine might again become "busy"), etc.

The ability to identify the events in a discrete event system is an important skill, one that takes practice to acquire. Initially, you might use the following simple steps as a guide to identify system events:

1. State the purpose of your system. Be aware that there might be several (conflicting) purposes.
2. State the objectives of your study.
3. Design, at least qualitatively, the experiments you might want to run with your simulation.
4. Identify the resident and transient entities in your system and their important attributes; assign names to the attributes.
5. Identify the dynamic attributes and the circumstances that cause their values to change; these will be the events.

The building blocks of a discrete event simulation program are event procedures. Each event procedure makes appropriate changes in the state of the system and, perhaps, may trigger a sequence of other events to be scheduled in the future. Event procedures might also cancel previously scheduled events. An example of event cancelling might occur when a busy computer breaks down. End-of-job events that might have been scheduled to occur in the future must now be cancelled (these jobs will not end in the normal manner as originally expected).

The event procedures describing a discrete event system are executed by a main control program that operates on a master appointment list of scheduled events. This list is called the future events list and contains all of the events that are scheduled to occur in the future. The main control program will advance the simulated time to the next scheduled event. The corresponding event procedure is executed, typically changing the system state and perhaps scheduling or cancelling further events. Once this event procedure has finished executing, the event is removed from the future events list. Then the control program will again advance time to the next scheduled event and execute the corresponding event procedure. The simulation operates in this way, successively calling and executing the next scheduled event procedure until some condition for stopping the simulation run is met. The operation of the main simulation event scheduling and execution loop is illustrated in Figure 2.2.

Figure 2.2: Main Event-Scheduling Algorithm
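The event scheduling and execution loop can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not SIGMA's implementation: the names `FutureEventsList`, `run`, and the `TICK` event are hypothetical, a binary heap stands in for the sorted list, and an event is removed from the list just before its procedure runs rather than just after, which simplifies the description in the text.

```python
import heapq
import itertools


class FutureEventsList:
    """Scheduled events ordered by time; ties keep their scheduling order."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()

    def schedule(self, time, event, attributes=()):
        heapq.heappush(self._heap, (time, next(self._order), event, attributes))

    def next_time(self):
        return self._heap[0][0]

    def pop_next(self):
        time, _, event, attributes = heapq.heappop(self._heap)
        return time, event, attributes

    def __len__(self):
        return len(self._heap)


def run(fel, procedures, stop_time):
    """Execute events in time order until the list is empty or stop_time is passed."""
    clock = 0.0
    log = []
    while len(fel) > 0 and fel.next_time() <= stop_time:
        clock, event, attributes = fel.pop_next()   # advance time to the next event
        procedures[event](clock, fel, *attributes)  # may schedule further events
        log.append((clock, event))
    return clock, log


def tick(now, fel):
    """Hypothetical event procedure: reschedule itself one minute later."""
    fel.schedule(now + 1.0, "TICK")


fel = FutureEventsList()
fel.schedule(0.0, "TICK")
clock, log = run(fel, {"TICK": tick}, stop_time=3.0)
print(clock, log)
```

Note that the clock jumps directly from one scheduled event to the next; no work is done for the intervals in between.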

### Extended Example

We will follow the changes in a typical future events list by examining a simulation of a machine center with three identical machines (numbered 0, 1, and 2) and two operators (operator 0 and operator 1).

The types of events that might occur in this example are the ARRIVAL of the next part at the machining center, a machine "STARTing" or "FINISHing" work on a part, a BREAKDOWN of a machine, and a broken machine being REPAIRED. An actual simulation model would, of course, have other types of events.

For each event that pertains to a specific machine and/or operator, the machine number followed by the operator number (if appropriate) are listed as event attributes. At a particular time during a simulation run, the future events list might look like the one pictured in Table 2.1. The future events are logically sorted according to times that events are scheduled to occur. Here time will be measured in minutes.

The current time in this example is 3.00, and the ARRIVAL of a part has just occurred at the center. This ARRIVAL event has "scheduled" the next ARRIVAL event to occur at time 3.37. We can determine the status of each machine by scanning down the future events list and checking what lies in the future for each machine. (Recall that the machine number is designated by the first event attribute.) Machine 0 is due to FINISH processing the part it is currently working on at time 3.20, so machine 0 must be busy. Likewise, machine 1 is busy and due to FINISH working on a part at time 3.40. Finally, machine 2 will be REPAIRED at time 3.43, so it is currently being fixed by the repair crew. We can see from the future events list in Table 2.1 that when the part arrived at time 3.00 none of the machines were available to start working on it. Thus, the part will join other parts in a queue waiting to be processed.

```
Table 2.1: Future Events List for a System with Three Machines (Time = 3.00)

Time    Event Type    Event Attributes
3.00    ARRIVAL
3.20    FINISH        0,1
3.35    BREAKDOWN     1
3.37    ARRIVAL
3.40    FINISH        1,0
3.43    REPAIRED      2
9.01    BREAKDOWN     0
```

Note that machine 0 is due to experience its next BREAKDOWN at time 9.01 and machine 1 is due for a BREAKDOWN at time 3.35, before it can FINISH its current operation. Therefore, when machine 1 breaks down at time 3.35, the FINISH event for this machine at time 3.40 will have to be cancelled.

To see how this machining center simulation might proceed, we will now advance the current time to 3.20 and execute the FINISH event on machine 0. Looking at the second attribute of this FINISH event, we see that operator 1 becomes idle. Since we know that there is at least one part waiting, we can immediately START processing the next part. A new START event for machine 0 has been scheduled to occur at the current time of 3.20 with operator 1. The future events list is now like Table 2.2.

```
Table 2.2: Future Events List for a System with Three Machines (Time = 3.20)

Time    Event Type    Event Attributes
3.20    FINISH        0,1
3.20    START         0,1
3.35    BREAKDOWN     1
3.37    ARRIVAL
3.40    FINISH        1,0
3.43    REPAIRED      2
9.01    BREAKDOWN     0
```

We next execute the START event for machine 0 at time 3.20 with operator 1. Suppose that a stored (or randomly generated) processing time for machine 0 is 1.20 minutes. The FINISH event for this machine will then be scheduled to occur 1.20 minutes from the current time of 3.20, that is, at time 4.40. The future events list after executing the START event at time 3.20 is shown in Table 2.3.

```
Table 2.3: Future Events List for a System with Three Machines (Time = 3.20)

Time    Event Type    Event Attributes
3.20    START         0,1
3.35    BREAKDOWN     1
3.37    ARRIVAL
3.40    FINISH        1,0
3.43    REPAIRED      2
4.40    FINISH        0,1
9.01    BREAKDOWN     0
```

Next, we advance the time to 3.35 and execute the BREAKDOWN event for machine 1. This BREAKDOWN event will cause the FINISH event for machine 1 scheduled at time 3.40 to be cancelled. We will assume here that the part is destroyed when the machine breaks down and that the operator becomes available for other work. If it takes five minutes for a repair crew to repair machine 1, the future events list after the BREAKDOWN occurs is as shown in Table 2.4. (Note the REPAIRED event for machine 1 has been scheduled at 8.35, five minutes beyond the current time of 3.35.) The simulation will now advance to time 3.37, when the next ARRIVAL event will occur.

```
Table 2.4: Future Events List for a System with Three Machines (Time = 3.35)

Time    Event Type    Event Attributes
3.35    BREAKDOWN     1
3.37    ARRIVAL
3.43    REPAIRED      2
4.40    FINISH        0,1
8.35    REPAIRED      1
9.01    BREAKDOWN     0
```
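The bookkeeping in Tables 2.1 through 2.4 can be replayed with a short Python sketch. It is only an illustration of the list manipulations described in the text: the helper names `schedule` and `cancel` are hypothetical, the list is kept sorted with the standard `bisect` module, and times are rounded to two decimals to sidestep floating-point noise.

```python
import bisect

# Table 2.1: (time, event type, attributes); attributes are (machine, operator).
fel = [
    (3.00, "ARRIVAL", ()),
    (3.20, "FINISH", (0, 1)),
    (3.35, "BREAKDOWN", (1,)),
    (3.37, "ARRIVAL", ()),
    (3.40, "FINISH", (1, 0)),
    (3.43, "REPAIRED", (2,)),
    (9.01, "BREAKDOWN", (0,)),
]


def schedule(fel, time, event, attributes=()):
    """Insert an event so the list stays sorted; ties go after earlier entries."""
    times = [entry[0] for entry in fel]
    fel.insert(bisect.bisect_right(times, time), (time, event, attributes))


def cancel(fel, event, machine):
    """Remove every scheduled `event` whose first attribute is `machine`."""
    fel[:] = [entry for entry in fel
              if not (entry[1] == event and entry[2][:1] == (machine,))]


fel.pop(0)                                       # ARRIVAL at 3.00 has finished

now, event, (machine, operator) = fel[0]         # FINISH on machine 0 at 3.20
schedule(fel, now, "START", (machine, operator))
table_2_2 = list(fel)

fel.pop(0)                                       # FINISH has finished
now, event, (machine, operator) = fel[0]         # START on machine 0 at 3.20
schedule(fel, round(now + 1.20, 2), "FINISH", (machine, operator))
table_2_3 = list(fel)

fel.pop(0)                                       # START has finished
now, event, (machine,) = fel[0]                  # BREAKDOWN of machine 1 at 3.35
cancel(fel, "FINISH", machine)                   # its FINISH at 3.40 is cancelled
schedule(fel, round(now + 5.00, 2), "REPAIRED", (machine,))
table_2_4 = list(fel)

for row in table_2_4:
    print(row)
```

Each snapshot (`table_2_2`, `table_2_3`, `table_2_4`) is taken while the event at the head of the list is still being executed, which is why that event still appears in the corresponding table.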

Do not worry if this all seems a bit mysterious for now. Discrete event simulation modeling is more than a simple exercise in computer programming. It is initially somewhat confusing for everyone. You will soon discover that it is relatively straightforward once you grasp the concept of an event and understand the relationships between events. For this, we will use event graphs.

## Event Graphs

The three elements of a discrete event system model are the state variables, the events that change the values of these state variables, and the relationships between the events (one event causing another to occur). An event graph organizes sets of these three objects into a simulation model. In the graph, events are represented as vertices (nodes) and the relationships between events are represented as edges (arrows) connecting pairs of event vertices. Time sometimes elapses between the occurrence of events.

The basic unit of an event graph is an edge connecting two vertices. Suppose the edge represented in Figure 2.3 is part of an event graph. We interpret the edge between A and B as follows:

```
Whenever event A occurs, it might cause event B to occur.
```

Basically, edges represent the conditions under which one event will cause another event to occur, perhaps after a time delay.

Figure 2.3: Simple Event Graph Edge

Using this notation, we can build a model that simulates a simple waiting line with one server (e.g., a ticket booth at a theater, the drive-in window at a fast-food restaurant, etc.). For our example, we will model an automatic carwash with one washing bay. The event graph of our carwash is represented in Figure 2.4.

We will begin our examination of this graph by discussing each vertex. The RUN vertex models the initialization of the simulation, the ENTER vertex models a car entering the carwash line, the START vertex models the start of service, and the LEAVE vertex models the end of service.

The state variables chosen to describe this system are:

```
SERVER = the status of the washing bay (busy or idle), initially set to idle.
QUEUE  = the number of cars waiting in line, initially set equal to zero.
```

To make our model more readable, we also define the constants IDLE=1 and BUSY=0.

Figure 2.4: Simple Event Graph of a Carwash

Next, we will focus on the changes in the state variables, shown in braces. The simulation RUN is started by making the washing bay at the carwash available for use {IDLE=1,BUSY=0,SERVER=IDLE}. Each time a car ENTERs the line, the length of the waiting line is incremented {QUEUE=QUEUE+1}. When service STARTs, the washing bay is made busy {SERVER=BUSY} and the length of the line is decremented {QUEUE=QUEUE-1}. Whenever a car has been washed and LEAVEs the washing bay, the washing bay is again made available {SERVER=IDLE} to wash other cars.

The dynamics of an event graph model are expressed in the edges of the graph. We read an event graph simply by describing the edges exiting each vertex (out-edges). In-edges take care of themselves. Continuing with our example, we look at each edge in Figure 2.4.

The simulation RUN is started by having the first car ENTER the carwash (edge from RUN to ENTER). If the ENTERing car finds the washing bay idle, service will START immediately (edge from ENTER to START). Each time a car ENTERs the carwash, the next car will be scheduled to ENTER sometime in the future (edge from ENTER to ENTER). The START service event will always schedule a car to LEAVE after that car has been washed (edge from START to LEAVE). Finally, if there are cars waiting in line when a car LEAVEs, the washing bay will START servicing the next car right away (edge from LEAVE to START).

The self-scheduling edge (the loop) on the ENTER event is the conventional way of perpetuating successive customer arrivals to the system. There will typically be some random time delay between customer arrivals.

After looking at the carwash model, you may have guessed that the state changes for an event vertex are typically very simple. Most of the action occurs on the edges of the graph. The conditions and delays associated with the edges of the event graph are very important; it is on the graph edges that the logical flow and dynamic behavior of the model are defined. For each edge in the graph we will need to define under what conditions and after how long one event might schedule another event to occur.

We will associate with each edge a set of conditions that must be true in order for an event to be scheduled. Also associated with each edge will be a delay time equal to the interval until the scheduled event occurs. Time will be measured in minutes for our examples. We have enriched the basic event graph to include edge conditions and edge delay times (see Figure 2.5). This edge is interpreted as follows:

```
If condition (i) is true at the instant event A occurs, then event B will be scheduled to
occur t minutes later.
```

Figure 2.5: Conditional Event Graph Edge with a Time Delay

If the condition is not true, nothing will happen, and the edge can be ignored until the next time event A occurs. You can think of an edge as nonexistent unless its edge condition is true. If the condition for an edge is always true (denoted as 1==1), the condition is left off the graph. We will call edges with conditions that are always true unconditional edges. Zero time delays for edges are not shown on the graph.
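One way to picture this edge interpretation is as a small data structure. The Python sketch below is an illustration only: the names `Edge`, `always`, and `schedule_from`, and the events B and C, are hypothetical. An edge whose condition is false simply contributes nothing, and the defaults model an unconditional edge with zero delay.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

State = Dict[str, int]


def always(state: State) -> bool:
    """Condition of an unconditional edge (written 1==1 in the text)."""
    return True


@dataclass
class Edge:
    target: str                                  # event B, the event to schedule
    delay: float = 0.0                           # t minutes; zero delays are not drawn
    condition: Callable[[State], bool] = always  # condition (i)


def schedule_from(edges: List[Edge], state: State, now: float) -> List[Tuple[float, str]]:
    """Return (time, event) pairs for every out-edge whose condition is true now."""
    return [(now + edge.delay, edge.target) for edge in edges if edge.condition(state)]


# Hypothetical out-edges of some event A.
out_edges_of_a = [
    Edge("B", delay=2.5, condition=lambda state: state["Q"] > 0),
    Edge("C"),  # unconditional, zero delay
]

print(schedule_from(out_edges_of_a, {"Q": 0}, now=10.0))
print(schedule_from(out_edges_of_a, {"Q": 3}, now=10.0))
```

With Q equal to zero only the unconditional edge fires; with Q positive, event B is also scheduled 2.5 minutes after the current time.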

While you are learning to read event graphs, it might be a good idea to use the edge interpretation in the previous paragraph as a template for describing each edge. Once the edges in the graph are correct, the state changes associated with each vertex are typically easy to check.

Our carwash model with edge conditions and delay times is shown in Figure 2.6. The state variables SERVER and QUEUE are now denoted by S and Q, respectively, and the status of S is indicated by 1 or 0 (IDLE=1, BUSY=0). In addition, the time between successive car arrivals (probably random) is denoted by ta and the service time required to wash a car is denoted by ts. When values of ta are actually needed, they might be obtained from a data file or generated by algorithms like those in Chapter 9.

In Figure 2.6, state changes associated with each vertex are enclosed in braces and edge conditions in parentheses. As you read the following description, identify a single edge with each sentence.

Figure 2.6: Carwash Model with Edge Conditions and Delay Times

```
At the start of the simulation RUN, the first car will ENTER the system. Successive cars ENTER the
system every ta minutes. If ENTERing cars find that the server is available (S>0), they can START
service. Once cars START service, ts minutes later they can LEAVE. Whenever a car LEAVEs, if the
queue is not empty (Q>0), the server will START with the next car.
```

Now re-read the above paragraph without looking at Figure 2.6. You will see that it is a concise description of the behavior of a queueing system. With practice, a system description can be read easily from the edges of an event graph. This is an excellent way to communicate the essential features of a simulation model and a good first step in model validation. With experience in reading event graphs, it becomes easier to detect modeling errors. This graph represents a completely defined simulation model. To run this model, only the starting and ending conditions for the run need to be specified.
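To make the carwash description concrete, here is a minimal Python sketch of the model, with one event procedure per vertex and one `schedule` call per edge. Several things are assumptions of this sketch rather than part of the event graph: the function name `simulate_carwash`, the fixed values used for ta and ts, the ending time, and the rule that a START scheduled with zero delay runs before other events at the same instant (which prevents two cars from starting service together when a LEAVE and an ENTER coincide).

```python
import heapq
import itertools


def simulate_carwash(ta, ts, end_time):
    """Run the carwash event graph; ta() and ts() return delays in minutes."""
    state = {"S": 1, "Q": 0}   # S: 1 = idle, 0 = busy; Q: cars waiting in line
    fel = []                   # future events list (a heap)
    order = itertools.count()  # keeps scheduling order among equal entries
    trace = []

    def schedule(now, delay, event, priority=1):
        # Assumption: lower priority numbers run first among simultaneous events.
        heapq.heappush(fel, (now + delay, priority, next(order), event))

    def run(now):
        state["S"], state["Q"] = 1, 0
        schedule(now, 0.0, "ENTER")                  # RUN -> ENTER

    def enter(now):
        state["Q"] += 1
        schedule(now, ta(), "ENTER")                 # ENTER -> ENTER after ta
        if state["S"] > 0:
            schedule(now, 0.0, "START", priority=0)  # ENTER -> START if (S>0)

    def start(now):
        state["S"] = 0
        state["Q"] -= 1
        schedule(now, ts(), "LEAVE")                 # START -> LEAVE after ts

    def leave(now):
        state["S"] = 1
        if state["Q"] > 0:
            schedule(now, 0.0, "START", priority=0)  # LEAVE -> START if (Q>0)

    procedures = {"RUN": run, "ENTER": enter, "START": start, "LEAVE": leave}
    schedule(0.0, 0.0, "RUN")
    while fel and fel[0][0] <= end_time:
        now, _, _, event = heapq.heappop(fel)
        procedures[event](now)
        trace.append((now, event, state["S"], state["Q"]))
    return trace


# Hypothetical constant delays; real studies would read or generate these values.
trace = simulate_carwash(ta=lambda: 1.0, ts=lambda: 2.5, end_time=5.0)
for row in trace:
    print(row)
```

Each row of the trace lists the time, the event executed, and the values of S and Q after the event's state changes. Because cars arrive faster than they can be washed in this made-up setting, the queue grows over the run.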

## Verbal Event Graphs

Before designing your own event graph model, it is vital that you develop a verbal description of your system. This description would include state changes associated with each vertex along with a verbal description of each edge condition and delay time on the graph. A verbal event graph for a generic single server queueing system is shown in Figure 2.7.

Developing a verbal description of your system is a necessary first step toward building a realistic and accurate simulation model. It will help you conceptualize the major components in the system, determine the key events and their interrelationships, and identify the state variables, edge conditions, and time delays necessary for the model. Note that state variables will need to be defined that permit testing of all edge conditions in your verbal event graph. Once you have constructed a detailed verbal description, the event graph model will be much easier to build.

Figure 2.7: Verbal Event Graph of a Single Server Queue

## Visual Power of Event Graphs

The visual modeling power of event graphs is most appreciated after one recognizes the complicated details involved in a discrete event simulation. The fundamental concept in event graph modeling is to use a directed graph as a picture of the relationships among the elements in sets of expressions characterizing the dynamics of the system. Each vertex of the graph is identified with a set of expressions for the state changes that result when the corresponding event occurs. Each edge in the graph identifies sets of logical and temporal relationships between a pair of events.

# SIGMA

SIGMA is based on the simple and intuitive Event Relationship Graph (sometimes called an ERG or Event Graph) approach to simulation modeling. The SIGMA project began as an effort to implement the notion of Event Relationship Graphs on personal computers and has evolved into a powerful and practical method for simulation modeling. SIGMA, the Simulation Graphical Modeling and Analysis system, is an integrated, interactive approach to building, testing, animating, and experimenting with discrete event simulations while they are running. SIGMA is specifically designed to make the fundamentals of simulation modeling and analysis easy. SIGMA can automatically translate a simulation model into fast C source code that can be compiled and linked with the sigmalib.lib library to run from a spreadsheet or web interface. SIGMA can also write a description of a simulation model in English. SIGMA was developed without external or university funding.