Literature Study

Home

Literature Study

On this page, some thoughts, ideas and conclusions will be listed with respect to a literature study of the Master's Thesis. This page therefore tries to answer the "Why?" and the "What?" of specific research areas. These are mere notes and gatherings of certain data. For a more detailed and comprehensive explanation, read the thesis instead. On this page, there are the following sections:

Random Number Generators
Gathering Data
Queues
Extend: Professional Simulation Tools v5
Deep dive into different tools for modelling

Random Number Generators

Random Numbers, Pierre L'Ecuyer, Université de Montréal, Int. Encyc. Social and Behavioral Sciences; June 2001 [link]

In this short article, Pierre L'Ecuyer gives a short and simplified outline regarding random number generators (RNG). The article is short, straight to the point and gives us a pretty good idea for our RNG engine.
Basically, computers have issues with "true randomness". To solve this, RNGs generate numbers that appear random, instead of being random (mainly because there is nothing that says a number is more random than any other). To do this, we will create a so-called sample space that is uniformly distributed, large enough and with a big enough period length. All results lie somewhere in between 0 and 1, which need to be send through a distribution function to get an RNG that follows distributions.
The best type of engine for this problem is the combined linear multiple recursive generator, which has been studied by many. With L'Ecuyers recommendation parameters (see below), we have: \[ x_{1, n} = (1403580 \cdot x_{1,n-2} - 810728 \cdot x_{1,n-3}) \mod (2^{32} - 209) \] \[ x_{2, n} = (527612 \cdot x_{1,n-1} - 1370589 \cdot x_{1,n-3}) \mod (2^{32} - 22853) \] \[ u_n = \left({x_{1, n} \over 2^{32} - 209} - {x_{2, n} \over 2^{32} - 22853}\right) \mod 1 \] Unfortunately, this engine requires knowledge of \(x_{1, 0}\), \(x_{1, 1}\), \(x_{2, 0}\) and \(x_{2, 2}\) for a full result.

We're not going to be looking at cryptology, because they have a totally different understanding on what RNGs must be.

Good parameters and implementations for combined multiple recursive random number generators, Pierre L'Ecuyer; Operations Research, 47(1), 159-164. [link]

When we combine the CMRG from the above paper with this one, we can take a look at L'Ecuyers pseudocode for the generator. Unfortunately, the unknowns are not implemented in the code, but their preconditions are identified in the paper:

The vectors (s10, s11, s12) and (s20, s21, s22) contain the values of (\(x_{1,0}\), \(x_{1,1}\), \(x_{1,2}\)) and (\(x_{2,0}\), \(x_{2,1}\), \(x_{2,2}\)), respectively. Their initial values constitute the seed. Before the procedure is called for the first time, one must initialize s10, s11, s12 to (exact) nonnegative integers less than m1 and not all zero, and s20, s21, s22 to nonnegative integers less than m2 and not all zero.

Thus, we get a pretty good RNG with a seed consisting of six parts, which is hardly user-friendly. Thus, we must find a solution for this issue, which may come in the form of a normal LCG (although they "should be discarded" as of 2001 [previous paper]).

Random Number Generation, Pierre L'Ecuyer, Université de Montréal, Département d'Informatique et de Recherche Opérationelle [link]

The chapter dives deeper into RNGs and how to look at them. It is important to denote:

RNGs typically refer to the engine used to generate a random number between 0 and 1, before a reverse distribution function is applied. This is why I personally believe it would be a good idea to implement the RNGs with a predefined set of statistical functions, or to add the transformations as additional blocks to our library, or both.
Practically, the uniformly distributed output space of an RNG lies between 0 and 1 (both exclusive), but for mathematical analysis, we often assume 0 is inclusive./li>
The RNG must have a long period, must be efficient, repeatable and portable. It needs an efficient jump-ahead and must provide sufficient randomness.
LCGs and MRGs cannot provide sufficient randomness (and have bad statistical properties), when comparing it to an engine with a lattice structure.
Section 3.5 on page 13, provides a good/sufficient explanation concerning the jump-ahead feature, which exploits some matrix properties.
All proposed RNGs are based upon linear sequences, which may be too regular. One of the methods I believe to be the most useful is to transform the output by using a non-linear function.
The inversion of the distribution function can be incredibly complex and may require a setup phase in which a dictionary is created for a reversal search.

TestU01: A C Library for Empirical Testing of Random Number Generators, Pierre L'Ecuyer and Richard Simard, Université de Montréal, ACM Transactions on Mathematical Software, Vol. 33, No. 4, Article 22; August 2007 [link]

As was clear from the previous articles, RNGs are incredibly useful and necessary within large systems (that make use of statistical data). But because there are so many to choose from, we must have solid requirements for these generators. It is important we can objectively test and check the performance of RNGs on our systems and choose the right fit within the scope of your project. Luckily, Pierre L'Ecuyer bundled all these tests within a single C-library, as described in this paper. Acknowledging the fact that the library was written in C and our library will be in Python, we have three options: either recreate the TestU01 library in Python (possibly as a standalone, officially published library, which will be referenced and used as a dependency in ours); skim the web for such a library; or create a mapper between our code and the C-code which does basically that. All of these options have some advantages and disadvantages as highlighted below:

Option	Advantages	Disadvantages
Recreate `TestU01`	Full control over source Better understanding of the RNGs used No additional compilation needs to be done	It must be fully recreated without flaws It hasn't been accepted by the general public (yet)
Skim the web for a `Python`-version	Accepted library for multiple users Possibly tested pretty well No additional work for this library	Uncertain if it will have sufficient documentation No control over the code
Create a mapper between `C` and `Python`	Uses the original accepted source Well defined and documented Uses code that has been worked on for over 15 years Easy to do	Needs additional source and compilation Linking can give difficulties Last update was in 2009 Not necessarily platform-independent

Now, disregarding the how to implement the library within ours, we must also take a look at the functionality of the library as discussed in the paper.

To summarize, empirical testing of RNGs is very important, and yet no comprehensive, flexible, state-of-the-art software is available for that, aside from [TestU01]. [...] It implements a larger variety of tests than any other available competing library we know. It is also more flexible, the implementations are more efficient, and it can deal with larger sample sizes and a wider range of test parameters than for the other libraries.

Usually, RNGs are over the interval \((0, 1)\), or over the binary set \(\{0, 1\}\).
There are two categories of RNGs: those that apply on sequences of real numbers and those that apply on bitstreams. It is clear that both categories are strongly related, the latter one being a special case of the first category, albeit very faintly.
RNGs take their values from a finite set \(\Psi_t\). This set must therefore be evenly distributed over the unit cube.
Within some systems (cryptology, gambling machines...), it is important there is a certain unpredictability
"No universal test or battery of tests can guarantee, when passed, that a given generator is fully reliable for all kinds of simulations." But we can be confident in these tests.
Contrary to most statistical problems, when testing RNGs, we will not be using a rejection area \(R\) (or \(\alpha\) in some cases), but in fact we will report our p-value \[ p = P[Y \geq y|H_0] \]
If this p-value is suspicious (small, but not too small), but not clear enough to indicate rejection or acceptance, we can try the test again until we obtain an acceptable value. ("such values should normally appear approximately \(2\%\) of the time")
Multiple authors advocate the usage of a two-level procedure for testing RNGs, in which N observations (or transformations thereof) will be tested with a goodness of fit (GOF) test.
The majority of tests in the library are distributed w.r.t. chi-square, normal, or Poisson.
"[...] when an RNG started from a given seed fails spectacularly a certain test, it often fails that test for most admissible seeds."

Let's take a look at the possible tests that are mentioned in the paper. Before I list them, I though it'd be good to mention statistics is not my strong suite. Never was, never will be. So excuse any and all error's I may have made in summarizing all the possible tests.

Tests for a single stream of \(n\) real numbers in \((0, 1)\)
- Measuring global uniformity: Compare the empirical distribution of the generator to the \(U(0, 1)\) distribution via a GOF. Another option is to compare the sample mean and sample variance to \({1 \over 2}\) and \({1 \over 12}\) respectively. This method mainly fails for bad RNGs.
- Measuring clustering: Sort the output of the generator in increasing order and compute a smoothing of the gaps in between them. If the results are grouped together, we can deduce it from this method.
- Run and gap tests: This test assumes that if we have an interval \([\alpha, \beta]\), there must be as many visits to this interval as any other interval of this size. This method rules out generators whose visits in certain intervals are clustered in time.
Tests based on \(n\) subsequences of length \(t\) for real numbers in \((0, 1)\)
- Partitioning the unit hypercube and counting the hits per piece: Since all outputs of an RNG must be evenly distributed over the unit hypercube, we can divide the cube into smaller parts and check that all parts have a similar enough amount of hits. This can be done in a few different ways, each with their own distributions.
- Other ways of partitioning the hypercube: While the previous method(s) distributed the unit cube equally and uniformly, we could also do this otherwise. These tests will be able to identify unpredictedness of RNGs.
- Alternatives to counting: Instead of counting the hits and using that metric to determine uniformity, one could also use the knowledge of collisions and gaps to determine uniformity.
- Close pairs of points in space: Instead of looking at where each point was located, we will be looking at the distances between all points to determine uniformity.
Tests that generate \(n\) subsequences of random length for real numbers in \((0, 1)\)
- We will also test RNGs by generating numbers until something happens.
One long binary stream of \(n\) bits
- We can analyze a sequence of bits and determine if it has some correlation or predictability about it via skimming the sequence.
- We may also find some mathematical equivalence, or check the compressibility (under LZW).
Tests based on \(n\) bit strings of length \(m\)
- Partitioning the set of \(m\) bit strings and counting hits in subsets: We will analyze a bit sequence by determining the most common (and least random) subsequences and checking them against the sequence.
- Close pairs: Again we will take a look at the distances in such a sequence.

Tools for RNG: TestU01 [link]

As discussed within the previous paper, it is rather important for RNG to be tested on the machine they're going to be used. Seeing as there are numerous tests (and categories thereof) that exist, it is a good idea to have a predefined set of tests that give a good and valid result on which your choice for an RNG can be based. TestU01 is a C-framework, written by P. L'Ecuyer for this sole purpose.

The only downside is the bad readability of some parts of the code. That coupled with the fact that this code hasn't been touched in ten years is enough to conclude it'd be better to write a Python wrapper around this framework. This way, we still have the original library, but also have a way of using it within our library.

It's important to denote Python integers work differently to C integers. Instead of using integers of 4 bytes, Python constantly allocates enough bytes to fit the value. Especially in ctypes (seeing as everything is represented by integers). This is an important issue we need to keep in mind when creating our RNG. We can solve this by converting all intermediary values to ctypes types in our generator (slower), or via modulo-dividing all in-between values with \(2^{32}\) (faster).

Also, my analysis of the wrapper shows it runs at least 30 times slower than the actual C-code when using a Python-implementation of an XOR-shift as RNG (compared to the same C-implementation). For me, this is a sign that it might be good to actually have C-code for the actual RNG, making sure there is as few overhead as possible.

Tools for RNG: RngStreams [link]

P. L'Ecuyer also wrote a small library in multiple languages that allows you to use long streams and substreams (of which also exists a wrapper). This is a clean and pretty good implementation of a RNG. Yet, while both are possible to be combined inside of Python, it might be good to actually write the RNG part, including functions for testing them within C, making sure there are the least amount of function calls that switch between Python and C.

Gathering Data

There are many ways in which data can be gathered, measured and obtained within a model. The simplest way of doing this is to have a block which keeps track of all data that has been passed in order to obtain all measure scores. But this is a slow approach and lacks terribly in performance (both on the topic of memory as time). Based upon Filip Claeys' thesis on HGPSS (section 6.2.1), we can deduce there are a mere 10 values we need for stochastic results (assume we get the list of \(X_1\), \(X_2\), ..., \(X_N\)):

Number of read values: \(N\)
Number of read values that differ from zero: \(N_0\)
Sum of all values: \(S = \sum_i^N X_i\)
Sum of all squared values: \(Sq = \sum_i^N X_i^2\)
Mean of all the values: \(\mu = {S \over N}\)
Mean of all the values that differ from zero: \(\mu_0 = {S \over N_0}\)
Smallest read value: \(min\)
Largest read value: \(max\)
Variance of the values: \(V = \sigma^2 = Sq / N + \mu^2 - \left(2\mu S\right) / N\)
Variance of the values that differ from zero: \(V_0 = \sigma_0^2 = Sq / N_0 + \mu_0^2 - \left(2\mu_0 S\right) / N_0\)

As you can clearly see from the math of all these values, we only have to keep track of \(N\), \(N_0\), \(min\), \(max\), \(S\) and \(Sq\) in order to obtain all the above-mentioned information. Of course, this assumes the input data was taken from a normal distribution. Do note it is also possible to keep track of the integral (or an approximation to be precise) if you keep it in memory and also store the previous time it was updated.

According to Feldman/Shavitt, it is also possible to compute the \(median\) on the fly, meaning that we can also obtain estimates for the \(Q1\) and \(Q3\) values, resulting in the possibility to draw boxplots from this small component (see also this source). "Estimates" being the operative word here; meaning it might be interesting to compare this information against the \(Q1\) and \(Q3\) values obtained from any other algorithm (Tukey, Minitab, Freund and Perles, Moore and McCabe, Mendenhall and Sincich) and draw conclusions from it.

We might also want to take a look at the \(P^2\) Algorithm for Dynamic Calculation of Quantiles and Histograms without Scoring Observations. This would also require the storage of the \(p/2\)-, \(p\)- and \((1+p)/2\)-quantile, which are updated in real-time. At this current point in time, this algorithm seems the most viable solution for our problem.

Queues

A queue is a common datatype in programming and is basically a datastructure which allows data to be stored and obtained. The two most common types are a normal Queue (FIFO) and a Stack (LIFO), but Priority Queues also exist. Therefore, I would like to identify a queue with an entry criteria (where the new item will be inserted) and an exit criteria (which item will exit first). But instead of needing to set two algorithms, one can easily join these two together in a queueing discipline (the order in which items are sorted).

Let's say we will always take the last item from the queue on exit (dequeue). This way, we only need to define an order in which the items will be sorted. For both FIFO and LIFO, this means ordering them by arrivals. For FIFO, we will insert an item to the front of the list, for LIFO we will add to the back. Unfortunately, as can be deduced, the FIFO algorithm will require a lot of additional computing power because all items need to be moved. A more efficient way of storing items would be via a tree instead of a list. A requirement for this tree is an efficient insert and delete.

In this tree, we need to give all items a certain priority on which the tree will do its insertion. For FIFO, this priority will be just the number of the item, starting from zero. For LIFO, we will make use of the existence of negative numbers and starting from 0, we will negate the item number for its priority. For priority queues, we require a function that, based on the item itself and its number (index), can identify a unique priority.

Which tree to use? According to the following table, any tree other than the binary tree is acceptable. Because 2-3 trees and 2-3-4 trees use different kind of nodes internally and can be represented with a red-black tree, I'd give preference over the red-black tree. Again, we require efficient inserts and deletes, which makes the red-black tree desirable over the AVL-tree (see this article).

	Seek / Find	Peek	Insert / Enqueue	Remove Any	Dequeue / Remove Highest	Space Complexity
List	Average Case: `O(n)` Worst Case: `O(n)`	Average Case: `O(n)` Worst Case: `O(n)`	Average Case: `O(n)` Worst Case: `O(n)`	Average Case: `O(n)` Worst Case: `O(n)`	Average Case: `O(n)` Worst Case: `O(n)`	`O(n)`
Linked List	Average Case: `O(n)` Worst Case: `O(n)`	Average Case: `O(n)` Worst Case: `O(n)`	Average Case: `O(n)` Worst Case: `O(n)`	Average Case: `O(n)` Worst Case: `O(n)`	Average Case: `O(n)` Worst Case: `O(n)`	`O(n)`
Circular Linked List	Average Case: `O(n)` Worst Case: `O(n)`	Average Case: `O(n)` Worst Case: `O(n)`	Average Case: `O(n)` Worst Case: `O(n)`	Average Case: `O(n)` Worst Case: `O(n)`	Average Case: `O(n)` Worst Case: `O(n)`	`O(n)`
Doubly Linked List	Average Case: `O(n)` Worst Case: `O(n)`	Average Case: `O(n)` Worst Case: `O(n)`	Average Case: `O(n)` Worst Case: `O(n)`	Average Case: `O(n)` Worst Case: `O(n)`	Average Case: `O(n)` Worst Case: `O(n)`	`O(n)`
Binary Tree	Average Case: `O(log n)` Worst Case: `O(n)`	Average Case: `O(log n)` Worst Case: `O(n)`	Average Case: `O(log n)` Worst Case: `O(n)`	Average Case: `O(log n)` Worst Case: `O(n)`	Average Case: `O(log n)` Worst Case: `O(log n)`	`O(n)`
2-3 Tree	Average Case: `O(log n)` Worst Case: `O(log n)`	Average Case: `O(log n)` Worst Case: `O(n)`	Average Case: `O(log n)` Worst Case: `O(log n)`	Average Case: `O(log n)` Worst Case: `O(log n)`	Average Case: `O(log n)` Worst Case: `O(log n)`	`O(n)`
2-3-4 Tree	Average Case: `O(log n)` Worst Case: `O(log n)`	Average Case: `O(log n)` Worst Case: `O(n)`	Average Case: `O(log n)` Worst Case: `O(log n)`	Average Case: `O(log n)` Worst Case: `O(log n)`	Average Case: `O(log n)` Worst Case: `O(log n)`	`O(n)`
Red-Black Tree	Average Case: `O(log n)` Worst Case: `O(log n)`	Average Case: `O(log n)` Worst Case: `O(log n)`	Average Case: `O(log n)` Worst Case: `O(log n)`	Average Case: `O(log n)` Worst Case: `O(log n)`	Average Case: `O(log n)` Worst Case: `O(log n)`	`O(n)`
AVL Tree	Average Case: `O(log n)` Worst Case: `O(log n)`	Average Case: `O(log n)` Worst Case: `O(log n)`	Average Case: `O(1)` Worst Case: `O(log n)`	Average Case: `O(log n)` Worst Case: `O(log n)`	Average Case: `O(log n)` Worst Case: `O(log n)`	`O(n)`
Binary Heap	Average Case: `O(n)` Worst Case: `O(n)`	Average Case: `O(1)` Worst Case: `O(1)`	Average Case: `O(1)` Worst Case: `O(log n)`	Average Case: `O(log n)` Worst Case: `O(log n)`	Average Case: `O(1)` Worst Case: `O(1)`	`O(n)`
Hashtable	Average Case: `O(1)` Worst Case: `O(n)`	Average Case: `O(1)` Worst Case: `O(n)`	Average Case: `O(1)` Worst Case: `O(n)`	Average Case: `O(1)` Worst Case: `O(n)`	Average Case: `O(1)` Worst Case: `O(n)`	`O(n)`

Then, there is also the principle of reneging (i.e. items that leave a queue if they have waited too long). In order to do that, we require each item to have the time instance from whence they have arrived. Instead of changing our pair into a triple, we will just assume that the arrived time is their index (seeing as time will always linearly increase). Now, if we have a reneging function that checks every single time if an item has exceeded its stay, we can determine which items to schedule for reneging. Within the context of DEVS, we do not necessarily need to check this every time. When a new item arrives, we can find the point in time the first item is to be reneged. Similarly, when an item leaves the queue (either via reneging or normally), we can identify the next to-renege time.

Balking can be done via a control operation before joining any queue, so we will not be looking at this principle. Now, all we have to do is take a look at jockeying.

Queueing Theory [link]

Queueing systems can be identified via 5 properties:

A: Probability distribution of the arrival process.
B: Probability distribution of the service process.
C: Number of channels/servers.
D: Maximum number of customers allowed in a queueing system.
E: Maximum number of customers in total.

Whereas common options for A and B are M (Poisson/Exponential distribution), D (Deterministic/Constant value) and G (general distribution with known mean and variance). If D and E are not specified, they are assumed to be infinite. For instance, a common (single queue) queueing system can be defined as M/M/1. With all this information given, we can obtain the following information from a queue:

Customer arrival rate (per minute)
Service rate per server (per minute)
Overall system effective arrival rate (per minute)
Overall system effective service rate (per minute)
Overall server utilization
Average number of customers in the system (L)
Average number of customers in the queue (Lq)
Average number of customers in the queue for a busy system (Lb)
Average time customers spend in the system (W)
Average time customers spend in the queue (Wq)
Average time customers spend in the queue for a busy system (Wb)
Probability that all servers are idle (Po)
Probability an arriving customer waits (Pw or Pb)

Additionally, we need to take a look at the following principles in queueing systems:

balking: Customers decide not to join a queue if it's too long.
reneging: Customers leave a queue if they have waited too long.
jockeying: Customers switch between queues if they believe another queue is faster.
capacity: Maximum amount of customers in a queue.

Extend: Professional Simulation Tools v5

User's Guide

This book is the user manual for an old DEVS GUI. It gives a solid understanding of a list of all the necessary low-level "building blocks" in the industry. Skimming and reading though the pages of this book, I was able to construct a list of all the components I'd like to include. Some paragraphs and/or sections also seeded the idea behind some other possible useful components.

When taking a closer look at Extend, I was able to deduce its parallels to PythonPDEVS. Both consist of blocks that are connected plex-wise in a specific configuration; in both tools it is possible to add/remove blocks and/or connections (E24-E25); both allow specific end times in the simulations (E55) and both can work with real-time simulations (E55, E208-E211). This is why it is, in my opinion, acceptable to look at Extend for making some deductions w.r.t. PythonPDEVS.

Part 1: Extend

The book is divided into a few different parts. This very first part goes into detail into everything that might be useful in Extend, without any industrial applications (these are explained and expanded on in the following parts). This is why all the low-level components in our library are inspired on sections from this part, whereas the domain-specific components can be extracted from other parts of the book.

Chapter 1: Running a Model

This chapter describes the main and most basic usages of Extend and forms a basis for the rest of the book. It is therefore not too far fetched that it gave the most ideas for blocks in our library.
A set of some mathematical components can be necessary within certain contexts. Not only the adder and the multiplier, as described on E21, but also all other main blocks that we know (and love) from Causal Block Diagrams can come in handy. This set includes: the subtractor, the divider, the negator, the inverter, the differentiator and the integrator. Thinking out loud, this set can be expanded with the modulo divider, the powerfunction and the function blocks as a main basis for mathematical functionality. As used in the examples in the book, it never hurts to be able to create equations in DEVS models.
The book also makes use of some generators to be able to use with the data. In this chapter, the Input Data block is used together with a stochastical random number generator. This makes it clear to me that the usage of a number of different generators can become useful. As for the RNG part, I'll leave the exact blocks to use for the corresponding section. The other generators include the input data generator and the constant generator. Two other blocks that (in my opinion) should join this set are the incrementor and the function generator. This way, normal generation of data can be represented, which may come in handy within an industrial context.
The Holding Tank is rather a domain-specific component, so for the time being, we will leave its functionality out of our library. In hindsight, this block can be represented by an accumulative sum block followed by a threshold block. This further implies we can ignore the Holding Tank without loss of generality, for now. This block may become part of a possible, higher-order, domain-specific part of the library, due to its excessive use in future examples.
E23-E24 shows that plotting is important in models. This is why I decided to add the lineplot, to plot the input over time. I can also deduce it is possible to plot multiple curves on the same plot (in Extend at most 4), which gave me the idea of using a variable input system. The number of inputs is to be initialized in the blocks constructor (at least 1). This system can be used in other blocks, but will mainly be added to the lineplot, the adder and the multiplier.
Page E29 and onwards talk about some sort of teller, which is technically a delayer, but prompted the idea of having a counter that counts the amount of messages that passed through it. This would allow users to keep track of the load of a system.
On E30, we can see a simple FIFO queue, but we know that there are more possibilities; introducing the stack and the priority queue. It is also said that the Bank Line simulation model, introduced on E28, has a queue that will automatically send the next item to the following available timer, if one is available. Besides it giving me the idea of adding a load balancer, I kept wondering "Is it possible in PythonPDEVS to know a next element is free before sending something to that element?"

Chapter 2: Building a Model

In general, the main takeaway from this chapter is that Extend works as any DEVS editor should work. It allows addition, changes and deletion of blocks and connections, it consists of a logical list of libraries with all components and simulations can be ran with a number of specifications. Nonetheless, there were a some ideas for new blocks, while reading through it.
Although E48 speaks of the Input Data block, its explanation and options made me believe that outputting the data only to a plot might not suffice. Which is why the table output became a thing. Its main purpose is to generate a table instead of a graph (possibly with Python Pandas).
Statistical data can be useful in a lot of situations. This is why there is a part of the library that focuses on stochastic data. The min, max, mean, median, variance, Q1 distance, Q3 distance, cumulative sum and cumulative product blocks form the basis for this sub-library and are joined by the above-mentioned counter. Of course, most of these components can be (as it is in the book) joined together into a single component.

Chapter 3: Enhancing your Model

This chapter goes, as the title suggests, more into detail on complex models. Whereas the chapter mainly talks about Extends functionality to make your models better looking (color picker, text boxes, drawing ovals, animation...), another part goes into detail on the practicalities concerning complex models (hierarchy, optimization...).
While the Equation block (E86-E88, E222-E224) looks interesting and certainly can add a lot to reduce the complexity of a model, I'm not quite sure yet how to do this, nor if this is a solid addition (over a bypass for the CBD-components). Either way, it is a good thing to keep in mind.
The three Control blocks Extend offers are a little bit UI-specific for my taste. Knowing that PythonPDEVS is mainly text-based, a Switch, a Slider and a Meter (E88-E91) don't have that much of an effect.
The message blocks described in this chapter do sound useful. A stop, a sound and a prompt block (E92-E94) can offer some much needed IO possibilities for end users. When joined with a message block (that prints certain text when it receives a message), debugging without the use of a debugger can really be improved, along with giving the user the possibility to output some values/messages. Of course a write block that writes some message to a file cannot lack. Yet, what happens when the data has to be formatted, or when it should follow a file reference?

Chapter 4: Fundemental Continuous and Discrete Event Modelling

While the previous chapters mainly list a number of blocks, chapter 4 details on how to use the blocks and on how to use them in a continuous manner. Since a computer cannot represent infinity (there is always an upper bound), we should turn continuous systems into discrete systems by simply taking a small enough timestep. As the note on E99 says, when talking about a continuous model, we simply mean "a model that has constant time intervals in between its events".
Using the description of the Generic Library of Extend (E100), we can learn why it is important to have certain basic components. Mainly, this is because they come in handy in a multitude of different domains and research areas.
Whereas E102-E105 give an excellent overview of the components in the Generic Library of Extend, it is also a good table to get some inspiration on the building blocks for our library. The decision, select input, select output, accumulative sum, accumulative product, read, AND, OR, NOT, NAND, NOR, XAND, XOR, T Flip-Flop, RS NOR Latch and threshold blocks came to be thanks to this table.
The example of E105-E112 in which we model a lynx and hare population is a solid and quite common example from the biological domain as to why many of these components are useful.
Furthermore, it is quite important to keep the pitfalls (E118-E119) in mind while creating a model. Alas, this is a problem for the user, that no library can solve. Except, if the library contains a function that checks for model safety.
The rest of the chapter, detailing the Discrete Event Library of Extend does not quite resonate its usefulness with me (except for the queues, which I've already listed). There are very few situations in which I believe any of the mentioned blocks should be used over a combination of already existing blocks in our library. Thus, I'm yet to see an acceptable example as to why to choose these blocks specifically.

Chapter 5: Using Libraries

This chapter goes into detail on specific creation, usage and maintenance of libraries for Extend. It is therefore not illogical that there is close to no information that will also apply on PythonPDEVS.
An unexpected good piece of information comes from the way a library may be organized (E146). All previously listed blocks mainly fall under some sort of general subdivision. If there were to come some domain-specific aspects, they should be placed under a subject-specific set of modules. Any blocks that are model-specific also get their own category, but seeing as we're trying to build a library for any model, it is doubtful we will have this subcategory. We will choose user-friendliness over an in-depth separated library. This is why there is no general and/or domain section in the library, but instead a grouping to a few general sections.

Chapter 6: Input and Output

The full scope of this chapter goes over how to get data into models and how to read them out. Previously, the lineplot has been listed, but the chapter has some more specific info on its usages.
The description of how to use the plotter, made me think it might be useful to note that matplotlib will probably be used for this in PythonPDEVS. Its GUI has some of the same tools and/or functionality, which is explained well on their site.
The tool explanation for the plotter also has some interesting functionality and options (E151-E158) that the lineplot may be extended to.
The listing of the different plotters in Extend (E158-E160) made me think of some new interesting types of blocks: namely the histogram, boxplot, FFT, scatterplot, barchart and the piechart.
It may be useful to be able to split a table into its different columns, each representing its own data stream (E174-E175), which can be done with a table input block.
As mentioned before, when reading from and writing to files, it may be useful to do this in some sort of structured manner. Either this can be done via some templated file (similar to Xtext and Xtend, as per my project for the MDE course), or a set of predefined output file types are hardcoded into this block. I personally prefer the first option. Either way, report files like on E188-E189 should be able to be outputted from our model.

Chapter 7: More Modeling Techniques

The main bulk of this chapter contains some good-to-know hints and tricks when modelling a system (E194-E195). Furthermore, it goes into detail on customization of the tool (memory, block icons, animation, aestetic connections...). Seeing as this is all very Extend-specific, this chapter has no other prudent information that can be used in our library. PythonPDEVS has a pretty good engine, so topics like tracing, DAG solving and optimization can be explicitly skipped.
Maybe a timer block can be useful to identify how long a submodel took to perform, but I don't quite see the appeal, nor how I'd do such a thing.

All chapters from this point forward will go into detail on domain-specific aspects, respectively manufacturing, business processes and industry.

Part 2: MFG (Manufacturing)

This part details the Extend+Manufacturing package, which can be used for modelling all types of industrial and commercial systems, as long as they involve items together with system resources; are a series of logically related activities that try to achieve a specified goal; and are organized around events that drive most businesses (M3-M4). Within this context it is clear as to why the elements that will join our library (as they are snatched from the pages of this book) may become part of some domain-specific sublibrary.

Manufacturing Chapter 1: Areas of Application

This chapter gives a subtle insight into manufacturing systems, services, material handling systems and communication systems. It gives some interesting ideas for blocks, but is still quite incomplete.
Although the book does not go into detail on these components, we can clearly identify the following blocks in the Assembly-Rework model on M12: a stock, a conveyor belt, a machine and a joiner. We can also identify a buffer, a labor pool and a batch (among others), but I need additional information on these blocks in order to identify if they are required to become part of our library. This information can be found in future chapters.
Most examples we see in the Services section (M16-M25) can be recreated with the basic building blocks that were mentioned before. Of course, a model that is applicable for a specific domain may require components with another name (e.g. in the Fast Food model (M19), the freezer is technically the same as the above-mentioned stock). Some aliasing of blocks may be in order. What was interesting though, was the reneging queue block in the Ferrari Agency model, where items are removed from a stack if it took too long for them to be cleared.
It is unfortunate that there is very few information on the Material Handling Systems, mainly because conveyor belts and other transportation blocks are the core of most models in factory-context ^{[citation needed]}. Seeing as this is the first chapter of this part, it is logical to conclude we will get back to this further on.
We already have a load balancer in our generic library, but as we see in the CSMA model (M31), network systems also require nodes and message buses. In fact, we may even add slaves (or clients) and a master (or server). We can even go as far as saying we also have a client-server that can be both a client and a server at the same time! These blocks should give plenty of possibilities when it comes to computer networks, but may be expanded with e.g. firewalls, switch nodes...

Manufacturing Chapter 2: Tutorial

As it is said in the name of the chapter, we follow a tutorial on how to create a certain assembly-like model from scratch.
Some leftout information from the previous chapter is filled on page M40, where the idea behind a buffer block is explained in detail. As far as I can see, it is the same as a normal FIFO queue, thus it does not require addition to the existing library.
At this point in time (M42), I have realized that the exit block is incredibly common in this types of models (based on all the examples I've encountered). Whereas I, at first, was quite reluctant to add such a block, I now have come to realize it is too useful to leave out of the library.
The batch and unbatch blocks get some additional information (M50-M51), making it clear what their uses are. They basically group and ungroup a predefined set of items respectively. Seeing as it is not too hard to image its usefulness in an industrial context, these two block can be added without concern to the library.
M58-M64 talks about analysis and confidence in results, based upon some statistical blocks that are added to the model without any connections. If this is possible in PythonPDEVS, these blocks can be a lifesaver in complex models. Of course, we may also implement a few functions that analyze our entire model and obtain its results this way. We might even add tracers to do this.

Manufacturing Chapter 3: Overview of the Libraries

This chapter is incredibly useful and handy when we look at the goal of this project. We get some sort of listing as to what components we can use in Extend and thus allows us to list many components we may add to our library.
M66 gives us an excellent overview of all different kinds of activity blocks in any factory assembly line, or (more general), in a system that includes transportation. We can add a conveyor belt, a conveyor carousel, a crane and a transporter if we only look at movement. The latter of these listed components does not imply movement by cart, as described on the page, but in fact general movement. This way, we have no need for a route block and, in fact, we can even use this for transportation through the air (drones, planes...), over sea (boats, ships, rafts, water transportation systems...), via land (carts, trucks, AGV/ASRs, mailman post delivery by bike...) and by rails (trains, carts...). The machine and station blocks can be substituted by a delayer, unless we also use custom items (classes) in our transportation that set certain attributes.
Now, if we were to implement any of the transportation blocks, we can say that their duration is defined via its distance and its speed, or by just its duration. For general purposes, we may take the latter one and implement some physics blocks, i.e. duration (= distance / speed), speed (= distance / duration), distance (= speed / duration)...
As said before, the reneging queue is an interesting design and purpose. When we look at M68, we can also identify a holding block that could have some usefulness to it.
As far as resources are concerned, I do not believe they thoroughly add new functionality to the already existing one.

Manufacturing Chapter 4: General Modeling Concepts

While running a model, it's important that it runs in such a way you intended it to. That's what this chapter is all about.
The internal workings of PythonPDEVS functions, is tested and is quite solid, therefore practically all information on the workings of Extend that are talked about in this chapter can be skipped.

Manufacturing Chapter 5: Items and Values

When creating DEVS, we will pass items through our model (especially in a more industrial context). This chapter in the book handles how it is done in Extend. Currently, in our library, we only use numbers to pass data through our system. Although, we can continue to use numbers, the main idea is that all blocks work with any kind of data. This idea yields an item generator and an item updater block that allow all kinds of blocks to be generated. As far as the mathematical blocks are concerned, unless you override the operators, Python will throw an error. As for the output blocks, it is required to be able to cast items to dictionaries for these blocks to work as wanted (this must be done via overriding the __dict__ method). All other blocks (that do not depend on the values of the items) should keep on functioning as before.
If we look at the overview of all blocks concerning custom items (M80) and their details (rest of the chapter), it does seem interesting to also have item attribute getter and item attribute setter blocks. Even though the latter one is a simpler version of the item updater block mentioned before, it can be useful to have a simplified block to do this. The program block from that same list does sound interesting, but seeing as there is no information on it in this chapter, I will refrain from adding it to our library.

Manufacturing Chapter 6: Queueing

https://discuss.fogcreek.com/joelonsoftware/default.asp?cmd=show&ixPost=22948

Previously, we've dabbled a little bit into queues and their different types that could be interesting to use in our library. This chapter actually goes way deeper into all different kinds of queues and allows us to update our previously made list.
The idea for the implementation of the priority queue block was to have a lambda function in its constructor that is called on all passed items, which should return the priority as a comparable value. The complexer this function, the higher the block's complexity (exponentially in worst case). Thus, this block can also be used for some of the other queues listed in the book.
The queue resource pool is an interesting kind of queue that has a multitude of possibilities from the top of my head (networks, workload estimators...).
Both blocking and balking (M97-M98) are interesting and realistic concepts in queueing systems. (Especially balking, which I wouldn't have thought about on my own.) Yet, I've listed all the blocks required to fulfill both concepts.
With reneging being covered with its corresponding queue, jockeying is not far away. Albeit a rare occurrence in models, allowing items to move from one queue to another that is faster can be quite an ingenious system. This is why we need our reneging queue in conjunction with a select output block. Now, our reneged item can rejoin the system and move to the shortest queue.

Manufacturing Chapter 7: Routing

The name of this chapter can somewhat give the wrong idea on what is going on here. Instead of it concerning about network routing, it specifies the DEVS routing; meaning it specifies some tips, tricks and best practices, while at the same time not giving too much additional information.
The prioritizer block explained on M114, is an innovative and refreshing addition to the system. I personally do believe it can be incredibly useful in numerous models.
A throw and catch block can be a good addition to the system, but unfortunately I can only see the catching happening in a hierarchical context.

Manufacturing Chapter 8: Processing

Returning to a more industrial context, this chapter discusses different ways to connect activity blocks and how to control processing time and availability of resources. This implicitly implies that the chapter will go into detain on different techniques in model creation that can be used to handle processing. While these are certainly useful for an end-user, there is not much this chapter contributes to the creation and design of our library.
The activity service block, introduced in the previous chapter is returning quite commonly, making me rethink my choice not to talk about it before. It can be a useful block in context of overflow prevention and allowing specific passthrough of items.

Manufacturing Chapter 9: Batching and Unbatching

Again, a quite non-informative chapter for the main idea of our library. While it is useful in describing what the possibilities are in terms of batching and unbatching, it is not so much when talking about useful block combinations. When we batch by simply joining all the items in a list and unbatch by dismantling this list, our system applies to most conditions and terms setup in this chapter.
In very few cases, the batching requires turning one item into another single item of the same type. For this, we may add a functional batch and a functional unbatch block that uses lambdas to join and split the items as required.

Manufacturing Chapter 10: Resources

While going through the previous chapters of the book, at times the usage of resources and resource pools are mentioned. Yet, they are not explained in detail, which is why it's good to have this chapter. If not only for a more detailed explanation, but possibly also to give some reasoning behind the "why" of certain (possible) blocks.
While it is perfectly possible to use items instead of resources, the resource system from Extend is quite ingenious. It keeps track of all items/messages currently in the model and caps it off at a certain point. Afterwards, items can be released or recycled in the system. If it is possible in PythonPDEVS to keep track of all messages that are being sent, this can be done perfectly. Otherwise, we might need to introduce a limiter block (which is technically a counter, a decision and an activity service block linked together validly) to our library.
I have come to realize it may be useful for items to be generated at a specific time, meaning we might need to add an input to our generators that defines the time interval until the next output. But, in order to prevent a massive overhead in all the components that need this, it may be better to introduce a scheduler block that queues inputs in FIFO and outputs them, depending on a schedule table.
The shift block from Extend adds some functionality to our model we do not yet have. It allows certain blocks only to do things at specific points in time. E.g. a worker station in a factory is only available from 9 a.m. to 5 p.m. (because it needs a supervisor). My first instinct was to make a coupled DEVS from which the root of our system should inherit and setting a tablized schedule for when certain blocks are "online". Seeing as I doubt we can do this simply in PythonPDEVS, I came up with another option. Let's say we have a simple block that discards all input (or outputs it differently) when it's "offline" and can turn itself "on" at predefined times. Now, we simple have to add this block in front of any block that has a shift (possibly while allowing n inputs and (2*)n outputs) and Bob's your uncle. Yet, when using shifts, the note on M175 is incredibly important/useful to keep in mind.

Manufacturing Chapter 11: Activity-based Costing

Extend allows its items to be created and manipulated with a specific cost. We might be interesting in adding this information, but theoretically we can already make use of this via the item updater or the item attribute setter. Although, this is a quite an obnoxious solution, I do not see how PythonPDEVS can store global information in specific blocks (unless if the models root inherits from a specialized coupled DEVS), which seems to become quite a recurring problem, the further we dive into specialized components.
If we are able to use costs as described above, we either do not have to have any additional blocks (because it will be stored in the library), or we have to add some new blocks that can fetch, alter, accumulate and create cost for messages. But what with passing through of numbers that need a cost? Extend seems to always work with a predefined item or message class, which allows this kind of functionality to be more flexible. So we have to outweigh global flexibility of all components over the usefulness of adding/allowing cost in the model.

Manufacturing Chapter 12: Statistics and Model Metrics

While a model itself is not necessary an optimization, nor is it an optimal solution, it is possible to obtain enough data and critical information from these models. That's what this chapter is all about.
I will purposefully skip M194-M203, which talks about random generators and their possible distributions. If required, all information about this kind of things is better taken from the research done on L'Ecuyers work on RNGs (see above) and other in-depth statistical and mathematical research concerning confidence intervals and function fitting.
The rest of the chapter is about obtaining data and how to do so. All used/necessary components for the type of data needed are currently available in our library, except for the timing of components (see above). Note that, as implicitly discussed on M208, most errors are the modelers fault, but it's easy to blame the components.

Part 3: BPR (Business Process Reengineering)

Besides industrial and mathematical models, there are also organizational and management models. That's what this part is about. To quote B5: "BPR is the analysis and redesign of business processes. It requires companies to look at their fundamental processes from a cross-functional perspective and ask 'Why?' and 'What if?'."
When we take a look at the BPR library blocks, though, we can clearly identify some of the previously mentioned blocks, giving us the clear vision that this part might not add new ideas to our library, but maybe give another way of looking at it. Now, if we take a look at the names of the chapters from this part and compare them to those from the previous part, it is clear that there will be a lot of repetition in this part, resulting in some less informative, useful and/or important information.

BPR Chapter 1: Areas of Application

We get a look at some example models and some typical areas within business models in this chapter. It gives us a clear look on how models can be used in yet another context and/or area of expertise, while also providing certain tips and tricks in modelling such models.
While the cycle time is yet another example as to why it can be useful to have some sort of timing in a model, I still have no idea how to do timing in a clean and DEVS-appropriate manner. We may do this by creating some sort of coupled DEVS that times and outputs the duration of an item in its contents, but it's rather inconvenient and unclean, in my opinion. What is clean, is a block that has two inputs and outputs a passthrough of the original message and the time difference between arrivals on input 1 and input 2.
While the update to allow custom technology was important in 2000 (the year the book was published), we've already come a long way concerning this topic. So far even that there is not a lot of useful information in this subsection. Workflow information, routing details and ABC can also be ignored, seeing as this is all repetition of previous chapters/parts.
Surprizingly, the planning cycles denoted on B28-B29 are quite agile and, in fact, do come close to our current DevOps-way of working. Of course, these are still basic models which need to be remodeled in DEVS.
If the Extend+BPR building blocks can facilitate many aspects of the ISO9000/EN29000 standard (B31); and knowing we have all building blocks in our library, it is safe to say that our library can also facilitate many of those aspects. Maybe we need to read up on this standard?
The Causal Loops section (B34-B36) is yet another important section for model makers, but may have some interesting concepts to add to some sort of safety analysis.

BPR Chapter 2: Tutorial

The same as with Extend+MFG, this second chapter will detail a tutorial on how to use the Extend+BPR functionality in the best way possible. Therefore, it is clear that the main bulk of this chapter is not useful, nor important for us. The BPR blocks already are part of our library, the creation in Extend is the same as before and window information about the program does not translate to PythonPDEVS, as mentioned before.
Although the model used, is continuously expanded, there is nothing that is added that is not representable with our current library, nor is there any new information for us.

BPR Chapter 3: Overview of blocks in the BPR and Statistics libraries

As the name of the chapter describes, this chapter will give an overview of the BPR and Statistics blocks, both of which already discussed in previous parts/chapters.
The explanation of the Measurement block on B61, makes it clear that this vaguely described block is technically an item attribute getter block. All other described blocks have a clearer correlation, just by looking at its name and use case.

BPR Chapter 4: General Modeling Concepts

As it's probably clear by now, this part has a lot of repetition and recapitulation of other parts/chapters of the book. I suppose they weren't counting on anyone actually using all parts of the book. Anyhow, this chapter is no different than the others as it will go about certain settings and validations that apply on a BPR model (and technically apply on any model).
Previously, I mentioned I would not go into detail on things concerning the PythonPDEVS source/functionality; and this chapter is no different.
The description on Model verification on page B69 contains an excellent workflow/pseudo-algorithm for actual verification. If we add functions for this type of things to our library, this workflow will be incredibly helpful in accurately verifying a model.

BPR Chapter 5: Items and Informational Values

Although the title of this chapter clearly indicates that we will look at Informational Values, the introduction of this chapter does not indicate how they're different from normal values and, in fact, ignores the "Informational" part. Therefore, it is abundantly clear there will be a lot of repetition when comparing this chapter to Manufacturing Chapter 5: Items and Values.
We can see that the scheduling of items (B75-B77) is done by linking a certain time at an output value in a table-wise fashion. This allowed me to think about how the scheduler block is going to work in our library. Either we must combine an item to a timestamp beforehand, or we must require a timestamp to be inputted at the same time as our item in the block. While both options are valid, they have some downsides. The first option is ugly and causes for some unnecessary overhead, while the second option has some undefined behaviour. However, if we have the possibility in PythonPDEVS to 'suck' or request an input from a connected port at the same time, a lot of issues will be solved.

BPR Chapter 6: Queueing

The six pages that make up this chapter only talk about how queueing works in Extend+BPR. More specifically, how its Stack block can be used in combination with priority setters. Ignoring the fact that in Extend+BPR, the Stack block is in fact a queue with a changeable type (FIFO, LIFO, Priority or Reneging), all good concepts w.r.t. queues are reiterated (blocking, balking, reneging, jockeying...).

BPR Chapter 7: Routing

As it was with manufacturing, this chapter describes how items can be routed in a lot of different ways throughout your model(s). And while it can certainly provide additional useful information, most of it is targeted at modelers.

BPR Chapter 8: Processing

Activity blocks are blocks that apply certain events on item streams. And even though a factory owner, or someone at a management level should keep in mind that most processes are pipelined, the contents of this chapter do not provide any additional information to be used.
Scheduling and bringing activities "online" can certainly be done via the scheduler and activity service blocks respectively, but combined with the shift block, there is no doubt we can schedule and manage processes as precise as wanted.

BPR Chapter 9: Batching and Unbatching

Yet another tiny chapter that provides no more information over the corresponding chapter in the manufacturing part.

BPR Chapter 10: Resources

While resource management may be even more important in the business-side of things, all blocks and information we can find in the Extend+MFG suffices to bring this to an acceptable end goal.
Currently, the only thing that is not yet being able to be done is the use of the same resource in multiple places (B133), which strongly depends on how resources can be obtained and managed in PythonPDEVS.

BPR Chapter 11: Activity-Based Costing

This chapter describes ABC similarly to chapter 11 from manufacturing, but I keep on having the same questions ans before: How do we apply ABC to our models in a DEVS- and PythonPDEVS-friendly way? Generally, ABC just adds a cost field to the passed through item, but not all items have fields that can be set (integers, strings...)

BPR Chapter 12: Statistics and Model Metrics

The first sentence of this chapter sounded familiar, so I decided to go back to Extend+MFG and compared this chapter to manufacturing chapter 12. Except for some minor differences to make this chapter stick to the business side of things, it is a word-for-word copy of the same chapter of the previous part.

Part 4: SDI Industry

After a rather annoyingly repetitive part, the Extend+Industry seems to be a refreshing and another way to look at certain models in industry. For as far as I understand it, SDI adds a database to a model, allowing easy manipulation of all model data. It can use real-world data such as time (S14,S19) and geographical data (see Wikipedia) in order to populate models with real-world events. And yet, it is vaguely explained in the book, detailing nothing more than database accessor blocks and vague examples. This makes it therefore unclear if SDI is necessary, or useful within our problem domain.

Starting from page S31, the idea of a flow architecture is also introduced. A flow considers a continuous flow of material, instead of the movement of items through our models. We can do this either by giving each item a predefined value it represents (making our system deterministic once more), but this forces us to do a lot of math. As mentioned on S36, this "aggregating" technique lacks precision. As described on S38, flow systems can be represented in a discrete way by simply communicating rate information between systems, requiring us to add a very specific sublibrary concerning flows. As per the example, it will contain at least a bucket and a pipe. These blocks will contain some mathematics in order to determine its volume and rate. In some, to be defined manner, these blocks should also provide some interaction to all other library blocks, allowing more complex models.

After taking a better look at the physics behind flow systems, it looks as if we have quite a complicated system. Volume Flow Rate \(Q\) can in fact be used (\(A_1 v_1 = A_2 v_2\)) for determining the flowing of liquids through filled pipes, but we need some sort of implementation for Bernoulli's Principle if we want to allow more complex systems. Also, what happens for a bucket? How do we indicate we want some liquid out of the bucket? Generally, the problem is "what messages are being communicated in a flow system?" If we take a look at some specific implementation for a bucket and a pipe, we get the following:

A pipe just passes its \(Q\) it gets in its inport and passes its on to its outport. Physically, speeds and such can change, as well as quite innovative pipe shapes, but let's ignore Bernoulli for now. Note that this assumes all pipes are filled with water. If this is not the case, there is going to be a delay, depending on the pipe's specs.
The bucket has a \(Q_{in}\) and a \(Q_{req}\) (requested output volume, works as a quite precise tap) as input ports and \(Q_{overflow}\) and \(Q_{out}\) as output ports. Internally it holds \(t_{prev} := 0\), \(Q_{ic}\), \(Q_{oc}\), \(V\) and \(V_{max}\). That way, we will always be able to compute its internal volume \(V\) at time \(t\) by using the following algorithm: \[V = V + \left(Q_{ic} - Q_{oc} - Q_{overflow}\right)\left(t - t_{prev}\right)\] \[Q_{ic} = Q_{in}, Q_{oc} = Q_{req}\] \[t_{prev} = t\] If \(V\) now exceeds \(V_{max}\), we can say that \(Q_{overflow}\) equals \(V - V_{max}\) if our next internal event is only 1 timestep away.

They could be joined by a flow splitter and a flow joiner that have \(n\) outputs and inputs respectively and do what they say on the tin: splitting and joining \(Q\) into single, equal flows (\(Q_{split} = {Q \over n}\)).

Appendices

Appendix A: Menu Command Reference

Seeing as this is Extend-specific, this appendix is not amula underpplicable to our project.

Appendix B: Upper Limits

In case you were interested in running Extend as it were in 2000, these are the specs and/or details that mattered. Of course, a model containing 2 billion blocks can be impressive on one side, but seems obnoxious to work with. Anyhow, these values do not matter for our project.

Appendix C: Cross-Platform Considerations

Seeing as Python is mainly platform independent, this appendix also does not apply. If there were to be a block or a function in our library that was made platform-specific, it is important to mention that information in the documentation of the block and/or the function.

Appendix D: Generic Library Blocks

While most blocks are used in examples in the book, there might be some that should be added to our library.

Even though, I haven't seen it used too often, the retain block might come in handy in few models.
While I split up the mean and the variance, Extend keeps the computation in the same block. I'm not sure if it is useful to join the two blocks, or if it's better to keep them separate.

Appendix E: Discrete Event Library Blocks

While most of the blocks in this library are, in my opinion, either overhead, useless, or possible with other blocks, it is good to have an overview of what should be done. Especially seeing as I disregarded this entire library previously.

The gate is interesting to say the least, but I unfortunately do not know how this can be achieved in PythonPDEVS.

Appendix F: Blocks in the Manufacturing Library

Most of these blocks are added to our library, if they are useful in modelling. Things like tools can be replaced by resource pools and such, or by using them in a closed system. Anyhow, besides the preemptive process block, there is not a lot that peaks my interest and even then, I do not know if it is useful/necessary.

Appendix G: Blocks in the Statistics Library

As previously stated, most blocks in this library do not have to be connected, but instead accumulate overall data which can be used for overall model analysis. In PythonPDEVS, I doubt this can be done in a similar manner, but instead requires additional functionality. This can be either via functions that check our model, by the use of hierarchy and inheritance of the root of the model, or via tracers that store overall results. The latter of which sounds like a way better option. Thus, should we expand our library to include few tracers for this purpose, or can we assume this to be out-of-scope for this project?

Appendix H: Blocks in the BPR Library

In true fashion to the entire Extend+BPR section, there are no new additions or changes to be added to our library, seeing as everything can be represented with already-existing building blocks.

Programmer's Reference

While the User's Guide focused more on the usage of Extend, the Programmer's Reference tries to dive deeper into the core of the program. From the very first page, it is clear that this book itself can be incredibly useful in creating a GUI for DEVS. Yet, seeing as PythonPDEVS does not have/need a GUI for now, it is intriguing to see what ideas/concepts it might spring for our library.

Chapter 1: Parts of a Block

Extend comes with a full modal builder, which allows the user to specify certain attributes. Even though, this is pretty useful within its context, this is not something we have any use for.
As was common in the early 2000-years, Extend is bundled with ModL, its own builtin programming language (comparable to C). As the book mentions, "If you are familiar with the C programming language, you are probably very comfortable with the ModL code. [...] When you build blocks in Extend using ModL, the block's program is compiled to native machine code. ModL is essentially C with a few extensions and changes." (P13)
Interesting enough, Extend provides 4 different types of connectors, while (as far as I've seen), PythonPDEVS only has 1, which corresponds to Extend's universal connector. Even though, it would be useful to be able to check the connections of a model, this would disregard the universal nature in which we want our library to work.

Chapter 2: ModL Code Overview and Block Creation

There are a lot of issues with the ModL language, in comparison to numerous other programming languages. Without going into depth on why I don't think certain choices should have been picked, suffice to say, I'm getting slightly annoyed by ModL.
The example of P38-P60 stated a block which technically converts distances from one unit to another, making me wonder how useful this is. Personally, I've had a similar idea of adding a converter to our library, which could convert between tons of different units (including from the imperial system to the metric system). The only issue with this would be variable conversions like with valuta.

Chapter 3: The ModL Language

Luckily, we will not have to leave the comfort zone of Python for our library, meaning the information of the functionality of ModL, which is represented in this chapter, does not fall within the scope of our project.
ModL does provide a set of predefined constants, but while I was tempted to add them to our library, it occurred to me that they (and more) already exist in Python.

Chapter 4: ModL Programming Techniques

Simulations can already be ran in PythonPDEVS, meaning most of this chapter will not apply. What is good to keep in mind is that all blocks must run as efficient as possible. This is why I made the decision to add the Big O-notation to the explanation of each block.
The pull mechanism P203 describes exactly what I am not sure of exists in PythonPDEVS. A solution to allow this in PythonPDEVS is to give all queues a request input, which can ask to release an item, and make the delayer send an item release request if it is empty.
Debugging of models can be handy, as described on P232 and in the following chapter. Yet, it lies outside the scope of our project. Additionally, we must guarantee that our library is tested sufficiently and is as bug-free as possible.

Chapter 5: Using the ModL Source Code Debugger

An explanation of what debugging is and how to do it in Extend make up this chapter, which is good to have, but lies outside the scope of our project.

Appendices

Appendix A: Upper Limits

It is good to know the limitations of your system, especially when programming. Anyhow, this appendix is exactly the same as Appendix B from the User's Guide.

Appendix B: ASCII Table

If you are unaware of what ASCIIis and how it works, this table should help. But then again, a quick Google search will provide more detailed and descriptive results than this appendix.

Appendix C: Menu Command Numbers

It is possible to call a GUI-functionality from ModL, all of which have been given numbers as listed in this appendix.

Appendix D: Cross-Platform Considerations and Equivalents

Admittedly, not everything is the same on each platform, but Python solves most of it (see also Appendix C from the User's Guide).

Appendix E: The SDI Database API

Even though I haven't seen any reference to the SDI database in this book, it is an appendix that complements Part 4 from the User's Guide. It might come in handy if we take a deep dive into the SDI itself.

Deep dive into different tools for modelling

I've been quite thoroughly going through the massive manual for Extend (currently, the software is renamed to ExtendSim 10 DE), but it's not the only tool supporting block-based modeling and simulation. There are numerous others (often used by major companies in massive projects), of which I will try a handful and compare them to one another (see this table). This will be done by taking a look at

which big companies use this software,
the software itself (its look and feel, user-friendliness, visual beauty...),
the formalisms used within the software (DEVS, GPSS, ...),
the possibility for complex models (Can you model an entire factory? How many building blocks are there?) and
the simplicity for creating and simulating a small (example) model.

Note that these values are purely based upon the trial/free versions of all these tools.

Simul8 [link]

Simul8 is a process simulation creation tool that allows for business process analysis and performance improvement. It's a web-based tool, meaning no installation is required and, theoretically, it works on any system. Simul8 is used by companies like Nokia, McDonalds, Coca-Cola, Cisco and many others. Mayor drawbacks for educational or personal usage are a short, limited trial version (mere 14 days, short modeling sessions...) and its expensiveness.

Anyhow, Simul8 is used by a lot of companies in industry, indicating that it must be an excellent tool, but personally I was thoroughly disappointed when I started experimenting with the trial version. Don't get me wrong, graphically Simul8 is incredibly appealing (due to its Microsoft Office 2007+-style user interface) and allows for users to create both beautiful looking models and simulate them instantaneously. There is also the possibility to show live graphs and add images and styling to the models.

For me, a major drawback was the finickiness of the editor. There are overall low response times for any setting or action (if they fire at all), giving of the vibe of Simul8 being rather slow and laggy. Admittedly, there are loads of features and possibilities that might allow for Simul8 to be this slow, but in my opinion, they want to support too much, resulting in bad software with a lot of features, yet lacking in an extensive building block library. Don't get me wrong, I know how difficult it is to built a tool like this, especially in a browser, but in my opinion, they should opt to do loads of things differently.

It could be well that this negative look I'm giving, could be the result of major throttling and downsizing due to it being a trial version, but if that's the case and I'm getting that feeling from the trial, I personally wouldn't be bothering in buying the tool (especially because of the price, which is not worth it, for me).

Another aspect to take into account is its steep learning curve. Yes, you may have a cash register queuing model within 5 minutes, but if you want to do even a little bit more, you really have to know what you want and where to find it. So, in my opinion, it has a steep learning curve in the same way that software from Adobe is said to have a steep learning curve.

While Simul8 not being my personal favorite, it does come with some interesting aspects, mainly being a Carbon Footprint attribute that can be set on any (functional) block. This works similarly to the cost that was/is builtin into Extend, but is more modular. This made me think we could add a tracer in our library that handles all this tracking.
When comparing our already defined building blocks with those from Simul8, I decided that my machine and station blocks were basically the same, so I merged them into a worker block.

FlexSim [link]

FlexSim is a 3D simulation modeling and analysis tool that allows users to improve any system or process, while also being able to transform data into predictions. Looking at its examples alone colored me intrigued. FlexSim is used by Ford, Coca-Cola, FedEx, Apple Toyota, IBM and many more. It has applications in industry, healthcare, warehousing and others, thus being incredibly wide. Unfortunately, FlexSim is only available for Windows

An immediate eyecatcher is the fact that FlexSim comes bundled with an entire 3D-engine with highly realistic, immersive graphics. As a plus, there are at least 30 building blocks, even more control units (including A* Navigators) and numerous libraries to be linked.

FlexSim's 3D view is clean, elegant, yet a little bit sensitive. As someone who has dabbled in 3D animation and modelling (read as: making 3D models), the interface feels familiar and is easy getting used to. I can certainly see myself fiddling about in this program and create my models this way. Following a tutorial, it became abundantly clear that creating factory models, or even warehouse storage rooms is incredibly simple and easy. AI person behavior is a little bit trickier, but then again, this makes for a more complex model.

In short, FlexSim is useful, visually appealing and clean, while supporting infinite different kinds of models, especially when there is no model limit (presumably in paid version, but not enough information to be found). I can see many major companies designing and checking their warehouses, factories... in this software.

As far as PythonPDEVS is concerned, FlexSim made me realize agents can be interesting to add. In hindsight, if we call an item an agent, we remain this functionality without the need of adding something extra to our library. Another idea is coupling our library to some sort of 3D-visualizer. Blender and Unity came to mind.

Enterprise Dynamics (INCONTROL) [link]

INCONTROL is a company that makes simulation software like Enterprise Dynamics. They claim Enterprise Dynamics is the leading simulation software platform to assist and support in the modeling and analyzing of virtually any problem. Applicable in warehousing, manufacturing, material handling, supply chain, railway and supply chain simulations, this tool is used by Philips, KNAPP AG, Volvo, Schiphol and others. It is, again, only available on Windows.

The software's user interface reminds me of some Adobe programs, as well as tools like Edius, Autodesk Maya, Inkscape and AToMPM. Similar to some Apple applications, the tool has only a toolbar and some floating windows, each with their own content and functionality, yet linked to other tools. The Model Layout makes sense at first glance and looks and feels similar to how I would make a tool like this.

By default, the channels are not enabled in Enetrprise Dynamics. This implies, one does not see the connections in the system, let alone be able to edit them. As soon as I found this, I was able to quickly create simple models. Downside of this tool are the Library Tree that also contains non-atoms, certain things working quite finicky and the overall feeling of incompleteness I'm getting from using this tool. Yet, Enetrprise Dynamics comes closest to representing the DEVS formalism we will be using in our library.

The lack of information on how to use this tool is bothersome, therefore making good documentation the main takeaway when looking towards our PythonPDEVS library. Another point of interest was their usage of channels (in DEVS called ports). While Enterprise Dynamics does not allow a single output port to be connected to multiple inputs, which is possible in DEVS, they solve this issue via simply allowing multiple input and output ports, with some function choosing to which port this information is sent. While we could add this functionality to all our building blocks in our library, the select output block we have does this already. We would just need an additional block.
Interestingly, they also toy with the fact that channels are open (green) or blocked (red). An item cannot be sent on a blocked port, while it can be sent on an open port. If we could add this idea to all the blocks in our library, we may be a step closer to implementing the pulling mechanic that was described above. But then again, a puller block could bypass this idea completely.

Arena [link]

As a part of Rockwell Automation, Arena is a company that makes software (called Arena, made for Windows) specifically for BPR, based upon DEVS, or so they say.
Unfortunately, I was not able to determine whether this was possible, seeing as my Windows distribution would not allow the installation of Arena, mainly due to an invalid binding of Microsoft Access.

Comparison Table

	Extend (ExtendSim 10 DE)	Simul8	FlexSim	Enterprise Dynamics
Used By	Stanford University, PacBell, Pitney Bowes, P&G, Northrop Grumman, Canadian Army, Boeing...	Nokia, Coca-Cola, McDonalds, Cisco, Unilever, Dell, American Red Cross...	Ford, Coca-Cola, FedEx, Apple, Toyota, IBM...	Philips, KNAPP AG, Volvo, Schiphol...
User Interface	Clean, logical for the most part. Icons make sense. Resized windows with multiple tabs don't resize the tabs, which is annoying.	Logical interface, clean layout. Icons make sense.	Logical interface, clean layout, beautiful graphics. Neat, logical icons.	Building blocks overview contains other functionalities, can look messy quite quickly. Icons are lacking.
Searchable Building Blocks	No	No, but too few blocks to matter	Yes	No
Adding Building Blocks	two clicks	two clicks	via dragging and dropping	via dragging and dropping
User Friendliness	Would much prefer additional information on all building blocks on hover or something, now it's more of a guessing game. Workspace is elegant and clean. Gives the sense of Eclipse and Logisim. Toolbars and option window system is presumably based upon most software from Adobe.	Some inconsistencies, loads of functionality, logically structured, workspace can be finicky. Looks like Microsoft Word 2007+.	Great look and feel, but requires getting used to (especially the controls). It looks amazing.	Logical workflow, easy to use. 3D-view has strange controls.
Loading Time	15 seconds	1-2 minutes, if it loads at all	10 seconds	5 seconds
Learning Curve	Quick to use. Components make sense and have logical images, which helps.	Quite high due to lots of functionality	High. If you are not used to 3D-tools, this can be even higher.	Low, quick to use.
Simple Model In:	2 minutes	5 minutes (including processing time of 'block has been placed' events)	7 minutes (including getting to know the controls and figuring out how to do connections)	5 minutes (including research time on why I could not see connections; i.e. until I found the 'channels')
Main Formalisms Note that GPSS in this context mainly stands for Activity Scanning or Process Interaction	DEVS with some GPSS influence (resource pools, workflow systems, pulling outputs), but not required	Mainly GPSS (but wen using a part of the tiny library, you can obtain pure DEVS) + resource pools and flow diagrams	DEVS with variable port numbers, possibility for process flows, scripting and Agent-Based Modeling	DEVS with push system (GPSS influence)
Amount of Different Building Blocks Bundled	~90	~20	~55	~60
Allows for custom addons/plugins/building blocks?	Yes, separated in different libraries	Not specified, assuming not	Yes	Yes
Model Possibilities	\(200000000000^{8000} = \inf\)	No limit on building blocks specified => \(\inf\)	\(55^{30} = 1.62\cdot 10^{52}\) in trial version, excluding the amount of possible connections	\(60^{500000} = \inf\)
Domain-Specific Models Possible?	Yes	Yes, but only factories	Yes	Yes
Debugger Available?	Yes	No	Yes	Yes
Ready On-The-Go?	No	Yes	No	No
Operating System	Windows, Mac OS	On browser, so theoretically all OS	Windows	Windows
Best Feature	Graphically: Speed slider is defined with a turtle (slow) and a hare (fast). Technically: Notebook for additional information about the models.	Carbon Footprint calculator	Complex AI (A*) functionalities available for agent-based-systems	Automatic component connection, but this is obviously not foolproof!