							\chapter{Implementation: The SCCD Runtime}
\label{chapt:Implementation}

This chapter discusses the main artifact of our work: our implementation of a statechart execution runtime with configurable semantics.

We first explain how our implementation differs from the original SCCD project, from which it was forked. We then introduce the basic design of our implementation, which consists, among other things, of a simple action language, the statechart language with semantic variability, and, on top of it, the SCCD language and its event-queueing execution, implemented in what we call ``the Controller''. We then explain the detailed design and implementation of each of those languages, giving a deep insight into our solution. Finally, we go back to discuss and motivate changes made from the original SCCD project, at a more detailed level.


\section{Current State of SCCD}

% \subsection{Original SCCD project}

The original implementation of SCCD~\cite{VanMierlo2016} combined a compiler for statecharts and class diagrams with an execution runtime library. It also supported a (smaller) subset of the BSML semantic configurations.

At the highest level, an SCCD model is a \emph{class diagram} where the behavior of the classes is defined as a statechart. The class diagram has a single \emph{default class}, which is instantiated when the model is instantiated. This instance may \emph{dynamically create} new instances of other classes (statecharts). The relationships between instances (e.g. multiplicities) are modeled in the class diagram and enforced during execution.

The original SCCD compiler generated executable code in one of several target languages. The supported target languages were Python, JavaScript and C\#~\cite{glenn2014statecharts}.

Semantic configurations were limited to Big-Step Maximality (\textsc{Take One}, \textsc{Take Many}), Internal+Input Event Lifeline (\textsc{Next Small Step}, \textsc{Next Combo Step}, \textsc{Present in Remainder}, \textsc{Queue}) and Priority (\textsc{Source-Child}, \textsc{Source-Parent}).

\begin{figure}
\centering
\includegraphics{original_sccd_features}
\caption{Feature diagram of original SCCD}
Figure taken from \cite{VanMierlo2016}
\label{fig:original_sccd_features}
\end{figure}

% \todo{illustration of original SCCD project, many-to-many compiler, feature tree?}

\subsection{SCCD in this thesis}

The SCCD discussed in this thesis is a fork from the original SCCD. The most important ``functional additions'' to original SCCD are the following:

\begin{description}
  \item[More semantic options]
    The main goal of this thesis. On top of the semantic options supported in original SCCD, the following options were added: Big-Step Maximality: \textsc{Syntactic}; Combo-Step Maximality: \textsc{Combo Take One}, \textsc{Combo Take Many}, \textsc{Combo Syntactic}; Memory Protocol: \textsc{Big Step}, \textsc{Combo Step}, \textsc{Small Step}; Priority: \textsc{Hierarchical} (\textsc{Arena-Parent}, \textsc{Arena-Child}), \textsc{Explicit Priority}, \textsc{Negation of Triggers}; Order of Small Steps: \textsc{None} and \textsc{Explicit Ordering}. See Table~\ref{tab:sccd_as_bsml} for a full overview.

  \item[Action language]
    In the original SCCD, action code had to be written in the same language as the target language of compilation, making models non-portable. This was only meant as a temporary solution; the plan was to add an action language eventually. Another reason for integrating an action language is to have precise control over Memory Protocol semantics. Therefore, SCCD now has a built-in (textual) statically type-checked action language.
\end{description}

\begin{table}
\footnotesize
\centering
\begin{minipage}{.4\linewidth}
\centering
\begin{tabular}{ | c | c |}
  \hline
  \textbf{Big-Step Maximality} & \\
  \hline
  \textsc{Take One} & \checkmark \\
  \textsc{Take Many} & \checkmark \\
  \textsc{Syntactic} & \checkmark \\
  \hline
  \hline
  \textbf{Combo-Step Maximality} & \\
  \hline
  \textsc{Combo Take One} & \checkmark \\
  \textsc{Combo Take Many} & \checkmark \\
  \textsc{Combo Syntactic} & \checkmark \\
  \hline
  \hline
  \textbf{Input/Internal Event Lifeline} & \\
  \hline
  \textsc{Present in Whole} & \\
  \textsc{Present in Remainder} & \checkmark \\
  \textsc{Present in Next Combo-Step} & \checkmark \\
  \textsc{Present in Next Small-Step} & \checkmark \\
  \textsc{Present in Same} & \\
  \hline
\end{tabular}
\end{minipage}%
\begin{minipage}{.6\linewidth}
\centering
\begin{tabular}{ | c | c | }
  \hline
  \textbf{Enabledness/Assignment Memory Protocol} & \\
  \hline
  \textsc{GC/RHS Big Step} & \checkmark \\
  \textsc{GC/RHS Combo Step} & \checkmark \\
  \textsc{GC/RHS Small Step} & \checkmark \\
  \hline
  \hline
  \textbf{Priority} & \\
  \hline
  \textsc{Hierarchical} & \checkmark \\
  \textsc{Explicit Priority} & \checkmark \\
  \textsc{Negation of Triggers} & \checkmark \\
  \hline
  \hline
  \textbf{Order Of Small Steps} & \\
  \hline
  \textsc{None} & \checkmark \\
  \textsc{Explicit Ordering} & \checkmark \\
  \textsc{Dataflow} &  \\
  \hline
  \hline
  \textbf{Concurrency} & \\
  \hline
  \textsc{Single} & \checkmark \\
  \textsc{Many} & \\
  \hline
\end{tabular}
\end{minipage}
\caption{Semantic options supported in SCCD}
\label{tab:sccd_as_bsml}
\end{table}

Under the hood, SCCD has changed so much that little resemblance with the original remains. Most importantly, SCCD is no longer a compiler (code generator), but an execution runtime, which loads models in an XML format and executes them. A detailed discussion of, and motivation for, the changes made follows at the end of this chapter, in Section~\ref{sec:changes}.

\section{Overview of Implementation}

The SCCD runtime consists of 3 languages: the action language, the statechart language, and the SCCD language. The action language is part of the statechart language, and the statechart language is part of the SCCD language. Apart from these 3 languages, a few libraries are included. The dependency graph between SCCD's packages reflects this (Figure \ref{fig:package_dependencies}).

\begin{figure}
\centering
\includegraphics[]{package_dependencies_nolabel}
\caption{Dependencies between packages (directories) of SCCD.}
The meaning of an arrow is: ``depends on''.
\label{fig:package_dependencies}
\end{figure}

An overview of the directory structure (packages) at the highest level:

\begin{description}
  \item[\texttt{action\_lang}] Implementation of the action language parser, syntax and semantics.
  \item[\texttt{statechart}] Implementation of the statechart parser, syntax and variable semantics. Naturally depends on \texttt{action\_lang} since the action language is part of the statechart language as well.
  \item[\texttt{cd}] Placeholder implementation of the ``class diagram'' (SCCD language) parser and syntax. A model in SCCD is always a class diagram, with the behavior of each class defined as a statechart. Each class diagram also has a \emph{default class} defined, which is initialized at startup, just like e.g. the ``main class'' in a Java program. In this thesis, all SCCD models consist of only a single default class, whose behavior is defined by a statechart.
  \item[\texttt{controller}] Implementation of the Controller, the primitive for executing SCCD models.
  \item[\texttt{realtime}] An optional library, wrapping around the Controller, for soft real-time (wall-clock synchronized) execution of models. Most ``real'' applications would want to use this. Includes support for event loop integration.
  \item[\texttt{test}] Parser and executor of SCCD tests. An SCCD test consists of an SCCD model, a set of timed input events, and a set of timed output events.
  \item[\texttt{util}] Utility library, used by most of the project. Contains classes like \texttt{Bitmap}, \texttt{Duration}, etc.
\end{description}

Every one of the 3 languages has a parser, a set of language constructs (the abstract syntax), and an implementation of its execution. These elements are implemented in the sub-packages \texttt{parser}, \texttt{static} and \texttt{dynamic} of each language. The parser and dynamic packages always depend on static: The parser package \emph{produces} instances declared in static; the dynamic package \emph{reads/interprets} instances from static (it does not modify them) in order to carry out execution.

Also note how the sub-packages of a language on the right depend on the corresponding sub-packages of the language on its left. For example, the statechart language's \texttt{parser} package depends on the action language's \texttt{parser} package: statechart models may include action language fragments, so parsing a statechart model also requires parsing action language code. It does not depend on the action language's \texttt{dynamic} package, however, because parsing a statechart model does not require executing action language code.

Figure \ref{fig:package_dependencies} shows the dependencies between the packages (directories) of SCCD. Note how the \texttt{action\_lang}, \texttt{statechart} and \texttt{cd} packages (directories) have the same sub-package structure of \texttt{static}, \texttt{dynamic} and \texttt{parser}. The sub-packages on the right depend on their equivalent on the left. The recurring sub-packages have the following meaning:

\begin{description}
  \item[\texttt{static}] Roughly, the \emph{syntactical constructs} of the language. A loaded model is a connected graph of instances of the classes (types) defined in this package. These instances are constructed by the parser, after which some semantic processing steps and checks (e.g. static analysis) may be performed. Those processing steps are also implemented in this package. Once the model is loaded, the model itself no longer changes.
  \item[\texttt{dynamic}] The types defined in this package are responsible for creating and executing instances of a loaded model. This package only \emph{reads} loaded models (i.e. instances from \texttt{static}), it does not modify them.
  \item[\texttt{parser}] The parser logic: a composition of rule-based algorithms for parsing textual (in the case of the action language) or XML (in the case of the statechart and class diagram languages) data into loaded models, producing instances of \texttt{static}.
\end{description}

The dependency graph among packages in Figure \ref{fig:package_dependencies} hints that the Controller is the ``\texttt{dynamic}'' part of the ``\texttt{cd}'' package. This is a correct observation, as the Controller \emph{reads} class diagram models, and creates and executes instances of them. Initially, the Controller was given its own package to make it prominent. At some point, it may be put in the \texttt{cd.dynamic} package.

In the following sections, we will introduce the detailed design of the action language (Section~\ref{sec:action_lang_impl}), the statechart language (Section~\ref{sec:statechart_impl}), and finally, the Controller (or better, the class diagram language top-level execution semantics) (Section~\ref{sec:controller_impl}).


\section{Action Language Implementation}\label{sec:action_lang_impl}

The action language is an important new feature in SCCD. In statecharts, action language code is mostly used in a transition's guard condition (an expression) and a transition's action (a block of statements). The integration of an action language makes the statechart language much more powerful, while remaining portable. By using a custom-developed language, we have complete control over its features.

The action language is a simple \textbf{procedural}, \textbf{statically typed} language with a syntax somewhat resembling Python (for expressions) and JavaScript (for function declarations and blocks). It is a language on its own, and can be used independently of other parts of the SCCD project. (An interactive prompt app is included in the project to demonstrate this.) Currently, the language is interpreted, but it would be trivial to add a code generator, since much of the work a compiler does has already been implemented in the static analysis step, such as assigning a memory layout to declared variables.

In this section, the action language itself is the focus, but we may mention statecharts now and then to motivate a design decision. Figure \ref{fig:parser_analysis_execution} shows how the action language is used. First, the parser constructs an AST of syntactical constructs. Next, a static analysis step ``initializes'' the AST (note the prime symbol). Then, using the initialized AST, execution happens as a function transforming memory.
We will now briefly discuss the parser and syntactical constructs, followed by static analysis and execution.

\begin{figure}
\centering
\includegraphics[width=1.0\textwidth]{parser_analysis_execution}
\caption{Usage of the action language}
\label{fig:parser_analysis_execution}
\end{figure}


\subsection{Parser and syntactical constructs}\label{sec:action_lang_constructs}

The Python library Lark~\cite{Erez2020Lark} is used to parse action language fragments. Lark constructs a parser from a grammar file. Lark can switch between parsing algorithms, including LALR(1) and Earley. The former offers better performance, while the latter is easier to write grammars for. We use LALR(1) since it was found to work perfectly well.

The grammar file used by the action language is listed in Appendix \ref{chapt:grammar}. Apart from the grammar file, the parsing step also uses a ``transformer'' class, written in Python, which translates encountered textual constructs to our own syntactical constructs. The transformer class also \emph{desugars} some syntax, e.g. \texttt{i += 1} becomes \texttt{i = i + 1}.
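
The desugaring step can be illustrated with a small Python sketch. It is independent of Lark and does not reproduce the actual transformer class: the node classes and the function \texttt{desugar\_compound\_assignment} are hypothetical names, chosen only for this illustration.

\begin{python}
from dataclasses import dataclass
from typing import Union

# Hypothetical, simplified AST node types (not the actual SCCD classes).
@dataclass(frozen=True)
class Identifier:
    name: str

@dataclass(frozen=True)
class IntLiteral:
    value: int

@dataclass(frozen=True)
class BinaryExpression:
    lhs: "Expr"
    operator: str
    rhs: "Expr"

Expr = Union[Identifier, IntLiteral, BinaryExpression]

@dataclass(frozen=True)
class Assignment:
    lhs: Identifier
    rhs: Expr

def desugar_compound_assignment(lhs: Identifier, operator: str, rhs: Expr) -> Assignment:
    """Rewrite 'lhs <op>= rhs' as 'lhs = lhs <op> rhs'."""
    if len(operator) < 2 or not operator.endswith("="):
        raise ValueError("not a compound assignment operator: " + operator)
    binary_operator = operator[:-1]  # "+=" becomes "+"
    return Assignment(lhs, BinaryExpression(lhs, binary_operator, rhs))

# 'i += 1' is turned into the AST of 'i = i + 1'.
stmt = desugar_compound_assignment(Identifier("i"), "+=", IntLiteral(1))
\end{python}

In this sketch, the compound operator disappears during parsing, so that only plain assignments remain in the resulting AST.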

Figures \ref{fig:cd_expression} and \ref{fig:cd_statement} show a class diagram of the syntactical constructs that make up the language. Some constructs are composed of other constructs, and can form tree structures, called ASTs (abstract syntax trees). The result of parsing a piece of action code is always an AST.

\textbf{Example:} Figure \ref{fig:cd_y=2x} shows the AST for the statement \texttt{y = 2 * x}. At the root of the AST is the assignment statement. The left-hand side of the assignment is the identifier \texttt{y}, which is treated as an LValue. At the right-hand side is the expression \texttt{2 * x}, where \texttt{x} is treated as an RValue.

\subsubsection{Expressions, statements and LValues}

Every construct implements either the \texttt{Expression} or the \texttt{Statement} interface. \texttt{Expression} is not a subtype of \texttt{Statement}, or vice versa. Expressions can be \emph{evaluated}, statements can be \emph{executed}, and both can have side-effects, such as writing a new value to a variable. Expression evaluation always yields a result, statement execution does not.

There is also the abstract class \texttt{LValue}, which is a bit special. \texttt{LValue} inherits \texttt{Expression} because all \texttt{LValue} instances can also be treated as expressions (i.e. RValues), but only if they occur in an expression context. Vice versa, an \texttt{LValue} instance occurring in an LValue context is never treated as an expression. The only LValue context currently existing in the action language is the left-hand side of an \texttt{Assignment} statement. When an \texttt{LValue} type is the root of an AST, it is always treated as an \texttt{Expression}.

All constructs have at least 2 methods (See Table \ref{tab:methods}):
\begin{enumerate}
  \item A \emph{static analysis} method, implementing the static analysis step, which must be executed on every AST before it can be executed.
  \item An \emph{execution} (or evaluation) method, implementing the \emph{actionable} part of the language.
\end{enumerate}
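
As a rough illustration of how these two kinds of methods could be laid out in Python, consider the following sketch. The method names follow Table \ref{tab:methods}; everything else is an assumption made for the example: plain strings stand in for \texttt{SCCDType} and \texttt{ReturnBehavior}, and \texttt{None} stands in for the scope, the memory and the return value.

\begin{python}
from abc import ABC, abstractmethod

class Expression(ABC):
    @abstractmethod
    def init_expr(self, scope):
        """Static analysis: determine and return the expression's type."""

    @abstractmethod
    def eval(self, memory):
        """Execution: evaluate the expression and return its value."""

class LValue(Expression):
    # An LValue is also an Expression, so it inherits both methods above.
    @abstractmethod
    def init_lvalue(self, scope, rhs_type):
        """Static analysis when occurring in an LValue context."""

    @abstractmethod
    def eval_lvalue(self):
        """Execution when occurring in an LValue context; yields an int."""

class Statement(ABC):
    @abstractmethod
    def init_stmt(self, scope):
        """Static analysis: determine and return the return behavior."""

    @abstractmethod
    def exec(self, memory):
        """Execution: run the statement."""

class IntLiteral(Expression):
    def __init__(self, value):
        self.value = value

    def init_expr(self, scope):
        return "int"  # stand-in for an SCCDType instance

    def eval(self, memory):
        return self.value

class ExpressionStatement(Statement):
    def __init__(self, expr):
        self.expr = expr

    def init_stmt(self, scope):
        self.expr.init_expr(scope)  # analysis recurses into the child
        return "NEVER"  # stand-in for a ReturnBehavior instance

    def exec(self, memory):
        self.expr.eval(memory)  # the result is discarded
        return None  # stand-in for a ReturnValue instance

literal = IntLiteral(42)
literal_type = literal.init_expr(scope=None)
literal_value = literal.eval(memory=None)
\end{python}

The sketch keeps \texttt{Expression} and \texttt{Statement} unrelated, and lets \texttt{LValue} inherit from \texttt{Expression}, mirroring the relationships described above.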

We will first discuss static analysis, execution follows later.

\begin{table}[h!]
\centering
\begin{footnotesize}
\begin{tabular}{ | r | c | c | }
\hline
&             \textbf{Static Analysis} &         \textbf{Execution} \\
\hline
Expression &  init\_expr(:Scope): SCCDType &                eval(:MemoryInterface): Any \\
LValue &      init\_lvalue(:Scope, rhs\_type: SCCDType) &   eval\_lvalue(): int \\
Statement &   init\_stmt(:Scope): ReturnBehavior &          exec(:MemoryInterface): ReturnValue \\
\hline
\end{tabular}
\end{footnotesize}
\caption[Methods for static analysis and execution]{Methods for static analysis and execution. The meaning of these methods is discussed in the subsections on Static Analysis (\ref{sec:action_static_analysis}) and Execution (\ref{sec:action_execution}).}
\label{tab:methods}
\end{table}

\begin{figure}
\centering
\includegraphics[width=1.0\textwidth]{CD_Expression2}
\caption{Syntactical constructs, Expression type.}
\label{fig:cd_expression}
\end{figure}

\begin{figure}
\centering
\includegraphics{CD_Statement2}
\caption{Syntactical constructs, Statement type.}
\label{fig:cd_statement}
\end{figure}

\begin{figure}
\centering
\includegraphics{CD_SCCDType}
\caption{Type constructs}
\label{fig:cd_sccdtype}
\end{figure}

\begin{figure}
\centering
\includegraphics{CD_y=2x}
\caption{AST for statement: \texttt{y = 2 * x}}
\label{fig:cd_y=2x}
\end{figure}

\subsection{Static analysis}\label{sec:action_static_analysis}

Static analysis is an important feature of the action language. It is responsible for 2 related things:
\begin{enumerate}
  \item Static typing with type inference: Determining the types of all expressions, while checking whether types are compatible in all expressions and statements.
  \item Assigning a memory layout to declared variables. Currently, only stack memory is supported; there is no heap. (The execution runtime allocates stack frames on Python's heap, though.)
\end{enumerate}

Static analysis must be done once, after the structure of the AST has been created, and before the AST can be ``executed'' (interpreted). Static analysis may \emph{fail} (yield an error) if the AST is found to be semantically invalid, e.g. when a type error is found. Static analysis is an (idempotent) operation on the AST, that only adds information to the tree, such as types, memory offsets, and scope objects (see later).

Every construct in the action language has a method implementing the analysis step. To run static analysis on an AST, one calls the static analysis method on the root node of the AST. Nodes that have children implement static analysis by also invoking it on their children. As such, the analysis step is performed on the entire AST.

Depending on whether a node is an \texttt{Expression}, \texttt{LValue} or \texttt{Statement}, the method for performing static analysis is as follows:

\begin{description}
  \item[Expression:] \texttt{init\_expr(scope: Scope): SCCDType}

  Initializes the expression as part of the given scope (more on scopes later), determines and returns the type of the expression. 

  Typically, expressions use the scope to look up types and memory offsets of encountered variable names.

  \item[LValue:] \texttt{init\_lvalue(scope: Scope, rhs\_type: SCCDType)}

  Initializes the LValue as part of the given scope, with the given type \texttt{rhs\_type}.

  An LValue may introduce a new variable to the scope, or if the variable already exists, it merely asserts that the types match (assignment-wise), and looks up the memory offset of the existing variable.

  \item[Statement:] \texttt{init\_stmt(scope: Scope): ReturnBehavior}

  Initializes the statement as part of the given scope, determines and returns the ``return behavior'' (see later section) of the statement.
\end{description}

\subsubsection{Determining expression types}

The static analyzer determines the type of every expression. Different strategies are used for different expression types:

\begin{description}
  \item[Literals] (e.g. \texttt{IntLiteral}) Trivial, the type is always the same.

  For \texttt{ArrayLiteral}, the type is Array-of-element-type, and mixed element types are not allowed.
  \item[BinaryExpression] (e.g. a sum) The type depends on the operator, the left-hand side and right-hand side expressions, e.g. the sum of two integers is an integer.
  \item[UnaryExpression] (e.g. unary minus) The type depends on the operator and the expression.
  \item[Identifier] If the identifier occurs in an RValue context, its type is discovered by looking up the variable name in the scope object received during static analysis.% If the identifier occurs in an LValue context, either a new variable is created (if the name does not exist in the scope) or a new value is assigned to an existing variable. If a new variable is created, the type of the rhs-expression is used
  \item[FunctionDeclaration] The type is ``function type'' (Figure \ref{fig:cd_sccdtype}). The formal parameters are type-annotated in the syntax. The return type is inferred (see later).
  \item[FunctionCall] First, it is asserted that the expression being called is of function type. Then the type is the return type of the function being called.
\end{description}
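
The strategies above can be sketched as a single recursive function. This is only an illustration under simplifying assumptions: types are represented as plain strings, the scope is a dictionary from variable names to types, the operator rules are reduced to integers, and all class and function names are hypothetical rather than taken from the implementation.

\begin{python}
from dataclasses import dataclass
from typing import Dict, List

class StaticTypeError(Exception):
    """Raised when static analysis finds a type error."""

@dataclass
class IntLiteral:
    value: int

@dataclass
class StringLiteral:
    value: str

@dataclass
class ArrayLiteral:
    elements: List[object]

@dataclass
class Identifier:
    name: str

@dataclass
class BinaryExpression:
    lhs: object
    operator: str
    rhs: object

ARITHMETIC = {"+", "-", "*"}  # assumed: int x int -> int
COMPARISON = {"<", ">", "=="}  # assumed: int x int -> bool

def infer_type(expr, scope: Dict[str, str]) -> str:
    if isinstance(expr, IntLiteral):
        return "int"  # literals always have the same type
    if isinstance(expr, StringLiteral):
        return "str"
    if isinstance(expr, ArrayLiteral):
        element_types = {infer_type(e, scope) for e in expr.elements}
        if len(element_types) != 1:
            raise StaticTypeError("array literal needs exactly one element type")
        return "[" + element_types.pop() + "]"
    if isinstance(expr, Identifier):
        if expr.name not in scope:
            raise StaticTypeError("unknown variable: " + expr.name)
        return scope[expr.name]  # RValue context: look up in scope
    if isinstance(expr, BinaryExpression):
        lhs = infer_type(expr.lhs, scope)
        rhs = infer_type(expr.rhs, scope)
        if lhs == rhs == "int" and expr.operator in ARITHMETIC:
            return "int"
        if lhs == rhs == "int" and expr.operator in COMPARISON:
            return "bool"
        raise StaticTypeError("operator " + expr.operator + " not defined for "
                              + lhs + " and " + rhs)
    raise StaticTypeError("unsupported expression")

# Type of '2 * x', given that x is an int.
product_type = infer_type(
    BinaryExpression(IntLiteral(2), "*", Identifier("x")), {"x": "int"})
\end{python}

Because every case either returns a type or raises an error, a successful run on the root expression implies that all sub-expressions were typed as well.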

\subsubsection{Variable declaration type inference}

A useful feature of the action language is its type inference for declared variables (this section), and return types of functions (next section). Type inference results in less verbose code, saving development and maintenance time.

In many statically typed languages such as C or Java, when declaring a variable, a type must be given. This is not the case for the action language, because of the following principles:

\begin{enumerate}
  \item The only way to declare a new variable is by \emph{assigning} a value to a name that does not yet exist in the current scope (or parent scopes).

  As a consequence, variables are always initialized with a value, which can prevent certain errors. 
  \item The type of a variable on the left-hand side of an assignment is inferred from the right-hand side of the assignment. The right-hand side is just an expression, and the type of expressions is statically known.
\end{enumerate}
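
These two principles can be sketched in a few lines of Python. The sketch assumes that the type of the right-hand side has already been determined, represents the scope as a dictionary from names to types, and ignores parent scopes; the function name \texttt{analyze\_assignment} is hypothetical.

\begin{python}
from typing import Dict

class StaticTypeError(Exception):
    """Raised when static analysis finds a type error."""

def analyze_assignment(scope: Dict[str, str], name: str, rhs_type: str) -> str:
    """Declare 'name' if it is new, otherwise check the assigned type."""
    declared_type = scope.get(name)
    if declared_type is None:
        # Unknown name: the assignment acts as a declaration, and the
        # variable takes the type of the right-hand side.
        scope[name] = rhs_type
        return "declared"
    if declared_type != rhs_type:
        raise StaticTypeError("cannot assign " + rhs_type + " to variable "
                              + name + " of type " + declared_type)
    return "assigned"

scope: Dict[str, str] = {}
first = analyze_assignment(scope, "x", "int")   # x = 5      (declares x)
second = analyze_assignment(scope, "x", "int")  # x = x + 1  (reuses x)
\end{python}

A later assignment of, say, a string to \texttt{x} would be rejected by this sketch, since the type of \texttt{x} was fixed by its first assignment.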

Type annotations are not merely optional (as they are in TypeScript or Haskell): they are simply not part of the language. The only place where type annotations occur is in the formal parameters of a function declaration. Unlike e.g. Haskell, the action language is not powerful enough to infer the possible types of a function's formal parameters.

\subsubsection{Function return type inference}

As mentioned before, the return type of a declared function is inferred:

\begin{python}
  inc = func(i: int) {
    return i + 1;
  };
\end{python}

In the above code fragment, the value assigned to \texttt{inc} will be of type \texttt{func(int) -> int}, meaning, a function taking an integer as parameter, and returning an integer.

This feature is more complex than one might think, as a piece of code may contain multiple branches. Consider the following fragment, which the static analyzer will reject:

\begin{python}
  inc = func(i: int) {
    if (i < 10)
      return i + 1;
    else
      return "too large"; # error: not all branches return the same type
  };
\end{python}

Different branches return different types, making the return type of the function vary depending on the input. This is not allowed in the action language. Detection of this is implemented by having static analysis come up with a static description of the \emph{return behavior} of every statement (just like static analysis determines the type of every expression). The return behavior is recorded in an object of type \texttt{ReturnBehavior} (Figure \ref{fig:cd_static_analysis}), returned from the \texttt{Statement.init\_stmt} method, which was already mentioned.

The return behavior can be one of:

\begin{description}
  \item[NEVER] The statement never returns, i.e. there are no return-statements in any of the branches.
  \item[SOME\_BRANCHES] The statement contains a conditional branch. One or more branches return a value of the same known type, other branches do not have return-statements.
  \item[ALWAYS] All of the statement's branches contain a return statement, and they all return the same known type.
\end{description}

For SOME\_BRANCHES and ALWAYS, the ``return type'' (e.g. ``int'') is included in the \texttt{ReturnBehavior} object.

For simple statements, such as \texttt{Assignment} or \texttt{ExpressionStatement} (return behavior: NEVER) and \texttt{ReturnStatement} (return behavior: ALWAYS), the return behavior is always the same. For complex types of statements, the return behavior depends on their sub-statements:

\begin{description}
  \item[\texttt{Block}] A sequence of statements. The return behavior of a block is calculated with an algorithm that walks over the sequence of statements: Initially, the return behavior is NEVER. For each next statement, the ``so-far''-calculated return behavior is \emph{sequenced} with the return behavior of that statement. Sequencing means: the return behavior can only go from NEVER to SOME\_BRANCHES or ALWAYS, and only from SOME\_BRANCHES to ALWAYS. While this happens, as soon as a return type has been established, later statements must return the same type, if they have non-NEVER return behavior, or we throw an error.

  This algorithm uses the static method \texttt{ReturnBehavior.sequence} (Figure \ref{fig:cd_static_analysis}), which takes the ``so-far''-calculated behavior and the behavior of the ``next''-statement to produce a new ``so-far''-return behavior.
  \item[\texttt{IfStatement}] A conditionally executed statement, with an optional else-branch. First of all, if there is no else-branch, the return behavior of the else-branch is considered NEVER. Next, the branches are \emph{combined} according to an algorithm: If both branches have the exact same return behavior, that will be the return behavior of the \texttt{IfStatement} as well. In all other cases, the return behavior will be SOME\_BRANCHES, as at least one of the branches does not return ALWAYS and at least one does not return NEVER. We also check if the returned types match. If not, we throw an error.

  This algorithm is implemented in the static method \texttt{ReturnBehavior.combine\_branches} (Figure \ref{fig:cd_static_analysis}).
\end{description}
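
The two algorithms can be sketched in Python as follows. The method names \texttt{sequence} and \texttt{combine\_branches} are the ones mentioned above; the enumeration \texttt{When}, the field names, the error class and the representation of types as strings are assumptions made for this sketch, which may differ from the actual implementation in its details.

\begin{python}
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional

class StaticTypeError(Exception):
    """Raised when return types do not match."""

class When(IntEnum):
    # Ordered, so that sequencing can only move "upwards".
    NEVER = 0
    SOME_BRANCHES = 1
    ALWAYS = 2

def _unify_return_type(a, b):
    """Return the common return type, or raise if both are set and differ."""
    if a.return_type is None:
        return b.return_type
    if b.return_type is None or a.return_type == b.return_type:
        return a.return_type
    raise StaticTypeError("return types differ: " + a.return_type
                          + " and " + b.return_type)

@dataclass(frozen=True)
class ReturnBehavior:
    when: When
    return_type: Optional[str] = None  # None if and only if NEVER

    @staticmethod
    def sequence(so_far: "ReturnBehavior", nxt: "ReturnBehavior") -> "ReturnBehavior":
        """Return behavior of 'so_far' followed by the statement 'nxt'."""
        return ReturnBehavior(max(so_far.when, nxt.when),
                              _unify_return_type(so_far, nxt))

    @staticmethod
    def combine_branches(a: "ReturnBehavior", b: "ReturnBehavior") -> "ReturnBehavior":
        """Return behavior of an if-statement with branches 'a' and 'b'."""
        return_type = _unify_return_type(a, b)
        if a.when == b.when:
            return ReturnBehavior(a.when, return_type)
        return ReturnBehavior(When.SOME_BRANCHES, return_type)

NEVER = ReturnBehavior(When.NEVER)
ALWAYS_INT = ReturnBehavior(When.ALWAYS, "int")

# An if-statement that returns an int and has no else-branch ...
if_stmt = ReturnBehavior.combine_branches(ALWAYS_INT, NEVER)
# ... followed, in the same block, by an unconditional 'return 0;'.
block = ReturnBehavior.sequence(ReturnBehavior.sequence(NEVER, if_stmt), ALWAYS_INT)
\end{python}

In this example, \texttt{if\_stmt} has return behavior SOME\_BRANCHES with return type \texttt{int}, and \texttt{block} has return behavior ALWAYS with return type \texttt{int}. Combining a branch that returns an integer with a branch that returns a string raises an error instead.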

It is clear that for a function body, only NEVER and ALWAYS are allowed, because whether a function returns something, and the type of what is returned, must always be the same.

The SOME\_BRANCHES option thus is illegal for function bodies, but it is allowed in other parts of the AST. For instance, look at the code fragment and its AST in Figure \ref{fig:cd_branches}. The \texttt{IfStatement} in the AST has return behavior SOME\_BRANCHES, meaning it may or may not return (an integer). However, it occurs in a \texttt{Block} (which is a sequence of statements), where it is followed by a \texttt{ReturnStatement}, which has return behavior ALWAYS, meaning it always returns (an integer). As a result, the \texttt{Block} itself also has return behavior ALWAYS, and as such, is a valid function body.

\begin{figure}
\centering
Code fragment:
\begin{python}
  inc = func(i: int) {
    if (i < 10)
      return i + 1;
    return 0;
  };
\end{python}
AST:

\includegraphics{CD_Branches}
\caption{Code fragment and its AST, with return behavior annotated for all statements}
\label{fig:cd_branches}
\end{figure}

\subsubsection{Scope object}

As we have seen before, the static analysis method of every construct expects a reference to a \texttt{Scope} object as parameter. A scope object primarily serves to look up and declare variables, and stores their names, types and const-ness. It is also a static description of the memory layout of a stack frame during execution, containing the memory offsets of variables in the frame. See Figure \ref{fig:cd_static_analysis} for a UML diagram of the Scope class.

During static analysis, most types of constructs (expressions, statements) simply pass the \texttt{Scope} object on to their children. This means the scope did not change.

\begin{figure}
\centering
\includegraphics{CD_static_analysis}
\caption{Classes involved in static analysis}
\label{fig:cd_static_analysis}
\end{figure}

\textbf{Example:} Figure \ref{fig:seq_y=2x} shows static analysis performed on the AST of Figure \ref{fig:cd_y=2x}. The identifier ``\texttt{x}'' is analyzed as an RValue (\texttt{init\_expr}), and uses the \texttt{Scope} object to look up the memory offset of its variable, and finds that it is $0$. The identifier ``\texttt{y}'' is analyzed as an LValue (\texttt{init\_lvalue}) and uses the \texttt{Scope} object to attempt to ``put'' a variable \texttt{y} of type \texttt{SCCDInt}. The operation succeeds, and \texttt{y} is given memory offset $1$.

\begin{figure}
\centering
\includegraphics[width=1.0\textwidth]{SEQ_y=2x_2}
\caption{Sequence diagram for static analysis of statement \texttt{y = 2 * x}}
\label{fig:seq_y=2x}
\end{figure}

In some cases, the scope does change: A function declaration introduces a new scope for the function's body. This means that variables declared in functions are not visible outside of that function declaration. Function bodies can access (read/write) variables of surrounding scopes, however. (A function declaration is just an expression, so function declarations are allowed almost everywhere, and can be nested arbitrarily.) In order to access surrounding scopes, when a new scope is introduced, that scope always has a \emph{parent}. The only scope without a parent is the \emph{global scope}, which is typically created by the invoker of the static analysis step on the root node of the AST.

Action language constructs can create new scopes during static analysis. Currently, the only construct doing this is the \texttt{FunctionDeclaration} expression. Created scope objects are stored in the AST, because they contain important information for execution (such as the size by which to grow the stack upon calling a declared function).

\textbf{Example:} Figure \ref{fig:cd_funcdecl_scope} shows a fragment of code and its AST after static analysis. Upon static analysis, the \texttt{FunctionDeclaration} object creates a new, nested scope for its function body, setting the parent to the scope object it received from the \texttt{Block} object at the top. As a result, the variable \texttt{x} can be successfully looked up from within the function body, while the variable \texttt{y} remains private to the function body. The nested scope is stored in the \texttt{FunctionDeclaration} object itself, because it will be required in order to evaluate the declaration, as seen in the later section on Execution.

\begin{figure}
\centering
Code fragment:
\begin{python}
x = 1;
func() {
  y = 2;
  x = y;
};
\end{python}
AST:

\includegraphics[width=1.0\textwidth]{cd_funcdecl_scope}
\caption{Code fragment with function declaration and its AST after static analysis, showing the hierarchy of \texttt{Scope} objects on the right}
\label{fig:cd_funcdecl_scope}
\end{figure}
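
The lookup through parent scopes can be sketched as follows. This is a simplified illustration, not the actual \texttt{Scope} class: the method names \texttt{declare} and \texttt{lookup} are hypothetical, every variable is assumed to occupy one memory slot, and the rule used for negative offsets (subtracting the sizes of the enclosing scopes) is an assumption that only holds if enclosing scopes do not grow after the lookup.

\begin{lstlisting}[language=Python,frame=single]
from typing import Dict, Optional, Tuple


class Scope:
    def __init__(self, name: str, parent: Optional["Scope"] = None):
        self.name = name
        self.parent = parent
        # variable name -> (offset within this scope's frame, type)
        self.variables: Dict[str, Tuple[int, str]] = {}

    def size(self) -> int:
        # Assumption: every variable takes exactly one slot.
        return len(self.variables)

    def declare(self, name: str, type_name: str) -> int:
        if name in self.variables:
            raise KeyError("%s already declared in %s" % (name, self.name))
        offset = self.size()
        self.variables[name] = (offset, type_name)
        return offset

    def lookup(self, name: str) -> Tuple[int, str]:
        scope = self
        shift = 0  # total size of the enclosing scopes walked so far
        while scope is not None:
            if name in scope.variables:
                offset, type_name = scope.variables[name]
                return offset - shift, type_name
            scope = scope.parent
            if scope is not None:
                shift += scope.size()
        raise KeyError("undeclared variable: " + name)


# Mirrors the code fragment: x lives in the outer scope, y in the function.
outer = Scope("global")
outer.declare("x", "int")
body = Scope("function body", parent=outer)
body.declare("y", "int")
x_offset, x_type = body.lookup("x")
y_offset, y_type = body.lookup("y")
\end{lstlisting}

Here \texttt{y} is found in the function's own scope at offset $0$, while \texttt{x} is found in the parent scope and therefore gets a negative offset. Looking up \texttt{y} from the outer scope fails, since \texttt{y} is private to the function body.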

There is one more use case for nested scopes, but not in the action language itself: In the statechart language, we also use many nested scopes, as will be discussed in Section \ref{sec:statechart_scopes}. For instance, every transition has its own scope, containing the names of the transition's event parameters as variables, so they can be accessed from the guard condition and transition's action code, but not from elsewhere.


\subsection{Execution}\label{sec:action_execution}

After static analysis has been performed on an AST, and has succeeded, one can execute the AST. The execution of an AST does not modify the AST; it only modifies \emph{memory}, as was shown in Figure \ref{fig:parser_analysis_execution}.

Similar to the static analysis step, to execute an AST, one invokes the execution method of the root node, as it will invoke execution on its children, etc. Depending on the type of the node, the execution method is as follows:

\begin{description}
  \item[Expression] \texttt{eval(memory: MemoryInterface): Any}

  Evaluates the expression, reading/writing from/to the memory object given, and yielding a result. (An expression may write to memory, because a function call is an expression.)

  \item[LValue] \texttt{eval\_lvalue(): int}

  Returns the offset of the LValue (variable) relative to the start of the current stack frame. This offset has already been computed in the static analysis step, and has been stored in the LValue object. It can be a positive or negative value, depending on whether the variable exists in the current stack frame, or one of its ancestors.

  \item[Statement] \texttt{exec(memory: MemoryInterface): Return}

  Executes the statement, reading/writing from/to the memory object given, and yields a \texttt{Return} object, indicating whether the statement execution caused a return-statement to be executed, and if so, what the returned value was.
\end{description}
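
To illustrate how these three methods cooperate, the following minimal sketch implements them for a handful of constructs. The class names, the fields of \texttt{Return} and the list-backed \texttt{ListMemory} are assumptions made for this example; they stand in for the real AST classes and the real \texttt{MemoryInterface}.

\begin{lstlisting}[language=Python,frame=single]
from dataclasses import dataclass
from typing import Any


@dataclass
class Return:
    ret: bool        # did a return-statement execute?
    val: Any = None  # the returned value, if any


DONT_RETURN = Return(False)


class ListMemory:
    """Stand-in for a MemoryInterface: one frame, no push/pop."""
    def __init__(self, size):
        self.cells = [None] * size

    def load(self, offset):
        return self.cells[offset]

    def store(self, offset, value):
        self.cells[offset] = value


class IntLiteral:
    def __init__(self, value):
        self.value = value

    def eval(self, memory):
        return self.value


class Identifier:
    def __init__(self, offset):
        self.offset = offset  # computed earlier, during static analysis

    def eval(self, memory):
        return memory.load(self.offset)

    def eval_lvalue(self):
        return self.offset


class Assignment:
    def __init__(self, lhs, rhs):
        self.lhs, self.rhs = lhs, rhs

    def exec(self, memory):
        memory.store(self.lhs.eval_lvalue(), self.rhs.eval(memory))
        return DONT_RETURN


class ReturnStatement:
    def __init__(self, expr):
        self.expr = expr

    def exec(self, memory):
        return Return(True, self.expr.eval(memory))


class Block:
    def __init__(self, stmts):
        self.stmts = stmts

    def exec(self, memory):
        for stmt in self.stmts:
            result = stmt.exec(memory)
            if result.ret:
                return result  # skip the remaining statements
        return DONT_RETURN


memory = ListMemory(1)
program = Block([
    Assignment(Identifier(0), IntLiteral(3)),
    ReturnStatement(Identifier(0)),
    Assignment(Identifier(0), IntLiteral(99)),  # never reached
])
result = program.exec(memory)
\end{lstlisting}

The block stops at the return statement, so \texttt{result} carries the value $3$ and the last assignment is never executed.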

The \texttt{MemoryInterface}-implementing object received by these methods as a parameter, is conceptually the \emph{stack memory} of the instance being executed. It defines operations for \emph{loading and storing} data from/to memory, as well as \emph{pushing/popping stack frames} (UML diagram in Figure \ref{fig:cd_evaluation_execution}).

\begin{figure}
\centering
\includegraphics{CD_evaluation_execution}
\caption{Classes involved in execution.}
\label{fig:cd_evaluation_execution}
\end{figure}

For most constructs, the implementation of the execution method is trivial. The not-so-trivial cases are function declarations and function calls. A function call should push a stack frame, and this is done by calling \texttt{MemoryInterface.push\_frame}. This method expects a \texttt{Scope} object. The scope object's primary function here is to describe the size by which the stack should grow. For every function declaration, a scope object has been statically computed during static analysis, and stored in the \texttt{FunctionDeclaration}-expression. Every time the \texttt{FunctionDeclaration} is evaluated, a function-value is returned with the scope object embedded in it. A \texttt{FunctionCall} therefore has access to this \texttt{Scope} object in order to grow the stack.


\subsubsection{MemoryInterface implementation}\label{sec:memory_interface}

The default implementation of \texttt{MemoryInterface}, called \texttt{Memory}, supports the following interesting features:
\begin{description}
  \item[Recursion] makes it easier to code certain algorithms, but in the general case (ignoring tail call optimization), has an unpredictable time and memory footprint.
  \item[Closures] are a feature in high-level, garbage-collected languages such as Python, Java and JavaScript. They allow functions to declare other functions, and return those functions as values. Those returned functions are allowed to refer to variables of the function in which they were declared, even if that function's stack frame has been popped. Closures only work in a garbage-collecting environment as stack frames must be allocated on the heap, and not be freed until functions that still have access to them go out of scope.
\end{description}

To support these features, the current implementation allocates stack frames on the heap, and maintains 2 singly-linked lists between stack frames:
\begin{itemize}
  \item List of parents: Every stack frame has a pointer to its parent, i.e. the stack frame of the function that invoked (called) the current function, and hence created the current stack frame. When the current stack frame is popped, its parent becomes the current stack frame (again). This alone suffices to support recursion.
  \item List of contexts: Every stack frame has a pointer to its context, i.e. the stack frame that was at the top of the stack when the function that created the stack frame was declared. If memory access to a negative offset relative to the current stack frame is requested, the list of contexts is used. This is to support closures.
\end{itemize}
Neither recursion nor closures are necessary or perhaps even desired for usage in statecharts. These are high-level features not encountered in e.g. an embedded environment. An upside is that with these features, it is impossible to write code that contains certain types of memory errors, such as accessing an invalid memory address. Support for these features can be dropped without changing the syntax of the language, but then the static analyzer \emph{should} detect and reject code making use of these features.

If support for recursion and closures is dropped, the maximum size of the call stack can be easily computed statically, allowing for a much simpler implementation of \texttt{MemoryInterface}, that could work with a pre-allocated, fixed amount of memory, suitable for use in an embedded environment.
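
The heap-allocated frames with their two linked lists can be sketched as follows. This is an illustration under simplifying assumptions, not the actual \texttt{Memory} class: \texttt{push\_frame} takes a frame size and a context frame directly instead of a \texttt{Scope} object and a function value, and a negative offset is resolved by moving to the context frame and adding that frame's size until the offset is non-negative.

\begin{lstlisting}[language=Python,frame=single]
class Frame:
    def __init__(self, size, parent, context):
        self.cells = [None] * size
        self.parent = parent    # caller's frame, restored on pop
        self.context = context  # frame current at declaration time


class Memory:
    def __init__(self, global_size):
        self.current = Frame(global_size, parent=None, context=None)

    def push_frame(self, size, context):
        self.current = Frame(size, parent=self.current, context=context)

    def pop_frame(self):
        # The popped frame is not freed explicitly: it stays alive
        # for as long as some function value still refers to it.
        self.current = self.current.parent

    def _resolve(self, offset):
        frame = self.current
        while offset < 0:
            frame = frame.context
            offset += len(frame.cells)
        return frame, offset

    def load(self, offset):
        frame, offset = self._resolve(offset)
        return frame.cells[offset]

    def store(self, offset, value):
        frame, offset = self._resolve(offset)
        frame.cells[offset] = value


mem = Memory(global_size=0)
global_frame = mem.current

mem.push_frame(size=1, context=global_frame)  # call "outer"
mem.store(0, 41)                              # outer's local variable
outer_frame = mem.current   # captured by a function declared in outer
mem.pop_frame()                               # outer returns

mem.push_frame(size=1, context=outer_frame)   # call the closure
mem.store(-1, mem.load(-1) + 1)  # update outer's variable via context
mem.store(0, mem.load(-1))       # copy it into the closure's local
result = mem.load(0)
mem.pop_frame()
\end{lstlisting}

Even though the frame of \texttt{outer} has been popped, the closure can still read and write its variable, because the frame remains reachable through the context pointer.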

\section{Statechart Language Implementation}\label{sec:statechart_impl}

\begin{figure}
\centering
\includegraphics[]{xml_init_execute}
\caption{Steps of loading and executing models in the statechart language}
\label{fig:xml_init_execute}
\end{figure}

The statechart language implementation is central to this thesis. Part of the language is a parser/loader function, which parses models in an XML format to construct an AST, and performs an initialization step on this tree, calculating certain static properties of its elements required in our execution implementation. A loaded statechart can then be instantiated and executed.

The execution part of the statechart language includes no queueing of input or output events. It consists only of the instance initialization function (entering the default configuration, must be called once) and the big-step function (responding to a set of input events, may be called repeatedly). Both these functions are only executed when explicitly called. The statechart language itself is therefore \emph{not autonomous}.

Autonomous behavior is implemented through an event loop in the Controller, which is discussed in Section \ref{sec:controller_impl}. Although the statechart language does not depend on the Controller, statechart instances assume some implementation of an event loop, which they use for scheduling and canceling of (future) events, for timed transitions, as we will see.

\subsection{Parser and Constructs}\label{sec:statechart_xml}

Although statecharts are a visual, topological language, our runtime parses statechart models in an XML format. The XML format was originally based on the SCXML standard \cite{W3C2015SCXML}, but ``liberties'' were taken to deviate as desired: no intention of syntactical compatibility was ever on the agenda.

In order to parse XML, we use the well-known Python library lxml \cite{lxml}, which uses the C-library libxml2 \cite{libxml2} under the hood.

As the statechart language can contain action language expressions and statements in various places, the statechart parsing logic will also regularly invoke the action language parser (and static analyzer). An error in action language code ``bubbles up'' as an error in the statechart model.

\subsubsection{Parsing logic and schema}
Apart from parsing our input files, we also want to validate them against a schema. However, no schema was explicitly written in an XML schema language such as XML Schema \cite{W3C2012XSD}, because of the burden of keeping the schema in sync with the parser logic, which, in a system-under-development, is easily forgotten, and can feel like a useless endeavor, since parser logic and schema contain much of the same information. Instead, the parser code was converted into 2 parts:

\begin{enumerate}
  \item A tree of nested declarations of expected XML elements, their order and multiplicities, and callbacks for handling those elements. Figure \ref{fig:cd_parse_rules} shows a class diagram of the structure of the parser rules.
  \item A generic parser function, built on top of lxml, using the tree of nested declarations and callbacks to carry out the parsing.
\end{enumerate}

\begin{figure}
\centering
\includegraphics[width=10cm]{cd_parse_rules}
\caption{Class diagram (conceptual, not actually implemented as these classes) for structure of parser rules}
\label{fig:cd_parse_rules}
\end{figure}

This way, the parsing logic \emph{is} the schema, with all the reusable functionality of checking multiplicities un-duplicated in a generic parser function. This function also shows very comprehensible error messages, printing out a fragment of the input file, with the error highlighted. The function is potentially reusable in other projects, and is also used in other parts of the SCCD project, such as the parsing of test files.

\subsubsection{Statechart XML format}

We'll introduce the XML format for statechart models. The root XML node is always \texttt{<statechart>}. It has the following children, in order:

\begin{description}
  \item[\texttt{<semantics>}] (0..1) The semantic options chosen for the model. For every semantic option that is not specified, SCCD's default value for that option is used.
  \item[\texttt{<datamodel>}] (0..1) The initialization code (in action language) of the statechart's \emph{data model}. The data model is the set of variables (and their types) that are readable and writable from everywhere in the statechart.
  \item[\texttt{<inport>}] (*) An input port. A statechart model can only receive input in the form of \emph{input events}. An input port defines a set of input events.
  \item[\texttt{<outport>}] (*) An output port. A statechart model can only produce output in the form of \emph{output events}. An output port defines a set of output events.
  \item[\texttt{<root>}] (1) The root state of the \emph{state tree} of the statechart model.
\end{description}
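
As an illustration of the ``parsing logic is the schema'' idea, the children listed above could be declared as rules for a small generic function that checks order and multiplicities. The sketch below is hypothetical: it uses Python's standard \texttt{xml.etree.ElementTree} instead of lxml to stay self-contained, it represents rules as flat tuples rather than a nested tree, and it omits the error highlighting of the real parser.

\begin{lstlisting}[language=Python,frame=single]
import xml.etree.ElementTree as ET

UNBOUNDED = None  # stands for the multiplicity "*"


def parse_children(element, rules):
    """rules: ordered list of (tag, min_count, max_count, handler)."""
    children = list(element)
    pos = 0
    for tag, min_count, max_count, handler in rules:
        count = 0
        while pos < len(children) and children[pos].tag == tag:
            if max_count is not UNBOUNDED and count == max_count:
                raise ValueError("<%s>: too many <%s> children"
                                 % (element.tag, tag))
            handler(children[pos])
            pos += 1
            count += 1
        if count < min_count:
            raise ValueError("<%s>: missing <%s> child"
                             % (element.tag, tag))
    if pos < len(children):
        # Either an unknown tag, or a known tag in the wrong place.
        raise ValueError("<%s>: unexpected <%s> child"
                         % (element.tag, children[pos].tag))


seen = []


def record(el):
    seen.append(el.tag)


STATECHART_RULES = [
    ("semantics", 0, 1, record),
    ("datamodel", 0, 1, record),
    ("inport", 0, UNBOUNDED, record),
    ("outport", 0, UNBOUNDED, record),
    ("root", 1, 1, record),
]

doc = ET.fromstring(
    "<statechart><datamodel/><inport/><inport/><root/></statechart>")
parse_children(doc, STATECHART_RULES)
\end{lstlisting}

A document without a \texttt{<root>} child, or with its children in a different order, is rejected with a \texttt{ValueError} by the same generic function, without any check being written specifically for \texttt{<statechart>}.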

\subsubsection{Syntactical constructs}

A \texttt{<statechart>} XML node at the highest level is parsed into a \texttt{Statechart} object in the runtime. Figure \ref{fig:cd_statechart} shows a UML diagram of this class, and the other classes it is composed of. It is trivial how the XML format maps to these constructs. The \texttt{Statechart} object represents a loaded model and can be executed by the execution runtime. The \texttt{Statechart} object consists of a \texttt{StateTree} object, which owns the root of the state tree structure, which is a \texttt{State}. The \texttt{State} class and the other constructs making up the tree structure are shown in Figure \ref{fig:cd_state}.

In the runtime, the \texttt{StateTree} object is created \emph{after} the state tree structure has been created (by the parser). Besides containing the root state, the \texttt{StateTree} object contains a lot of additional fields of information derived from the state tree structure. This information is used for efficiently executing the statechart model. 

\begin{figure}
\centering
\includegraphics[width=1.0\textwidth]{cd_statechart}
\caption{Syntactical constructs of the statechart language, part 1}
Dark-gray: Action language constructs, see Figure \ref{fig:cd_expression} and Figure \ref{fig:cd_statement}.

Light-gray: Statechart ``tree'' constructs, see Figure \ref{fig:cd_state}.
\label{fig:cd_statechart}
\end{figure}

We'll now introduce the statechart language constructs that make up the state tree structure (Figure \ref{fig:cd_state}). It is clear that the statechart language extends on the action language:  For the action language constructs, we refer to Section \ref{sec:action_lang_constructs}.

\begin{description}
  \item[\texttt{State}] The state class represents a basic state (if it has no children) or Or-state (if it has 1 or more children). %If a \texttt{State} is part of the current configuration, exactly one of its children is part of the current configuration. If a \texttt{State} has children, it also has a \emph{default state}.

  It is also the base class for other types of states. It can have a parent state and any number of children states, creating the state tree structure. The root of a state tree is always of this type (an Or-state).

  \item[\texttt{ParallelState}] represents an And state.%: If a \texttt{ParellelState} is part of the current configuration, all of its children are also part of the current configuration.

  \item[\texttt{HistoryState}] represents a history (pseudo-)state. Since this class is abstract, its subtypes \texttt{ShallowHistoryState} and \texttt{DeepHistoryState} are to be used instead.

  \item[\texttt{Transition}] represents a transition. A transition has a source and a destination state. Except for the root state, \emph{any} state can be the source and/or destination of a transition. The ``source'' is a bi-directional association: every transition is also listed in its source state's list of outgoing transitions.

  A transition can have a \emph{guard condition} (\texttt{guard}), which is just an \texttt{Expression} of our action language.

  A transition can have an \emph{event trigger} (\texttt{trigger}), which is a set of events that have to be present in order for the transition to be enabled.

  A transition can have \emph{actions}, such as raising events or executing action code.

  In the diagram, a transition is also shown to have a \texttt{Scope} object (see action language). This is the variable scope used for evaluating the guard condition, and executing actions. The scope contains the transition trigger's event parameters as variables, so they can be read.

  \item[\texttt{Trigger}] is a transition's trigger, a set of events required to be present for the transition to be enabled.

  It is also a base class for 2 other kinds of triggers: A \texttt{NegatedTrigger} also defines events \emph{not} allowed to be present; an \texttt{AfterTrigger} listens for an \emph{after-event}.

  \item[\texttt{Action}] is an \emph{action}. Actions may be executed when a transition fires. There are 2 types of places in the state tree where actions can be defined: (1) directly on a transition, or (2) as the enter- or exit-actions of a state, which are executed when any transition causes that state to be entered or exited, respectively.

  There are 3 types of actions:
  \begin{description}
    \item[\texttt{RaiseInternalEvent}] raises an internal event. If and when the internal event becomes visible to the statechart, depends on the semantic configuration chosen.
    \item[\texttt{RaiseOutputEvent}] raises an output event.

    The statechart language does not queue output events. Instead, they are ``delivered'' through a callback, synchronously called during transition execution. The callback's implementation may queue the output event, or immediately ``handle'' the event, performing some action (but caution is required since the statechart is in the middle of executing a transition).
    \item[\texttt{Code}] executes a piece of action language code.
  \end{description}
\end{description}

\begin{figure}
\centering
\includegraphics[width=1.0\textwidth]{cd_state}
\caption{Syntactical constructs of the statechart language, part 2}
Dark-gray: Action language constructs, see Figure \ref{fig:cd_expression} and Figure \ref{fig:cd_statement}.

Light-gray: ``Action'' class, see bottom of diagram.
\label{fig:cd_state}
\end{figure}

\subsubsection{Example state tree}

As an example, we will show the XML format, visual representation and state tree of a statechart. The statechart implements a very simple control panel of a 4-burner stove, with 2 buttons: A button for increasing the heat of a burner, and a button for selecting the next burner. Holding the button for increasing the heat for longer than 1 second will cause the heat to be increased every 200 ms. There is no way to decrease the heat of a burner.

\begin{figure}
\centering
\includegraphics{sc_stove}
\caption{Statechart for 4-burner stove example}
\label{fig:sc_stove}
\end{figure}

\begin{figure}
\centering
\includegraphics[width=\textwidth]{statetree_stove}
\caption{State tree for 4-burner stove example}
Detailed action language constructs omitted.
\label{fig:statetree_stove}
\end{figure}

The visual representation is shown in Figure \ref{fig:sc_stove}. The state tree is shown in Figure \ref{fig:statetree_stove}. The XML format is as follows:

\begin{lstlisting}[language=XML,frame=single]
<statechart>
  <datamodel>
    burners = [0, 0, 0, 0];
    selected = 0;

    min = func(a: int, b: int) {
      if (a &lt; b) return a;
      return b;
    };

    increase = func {
      burners[selected] = min(burners[selected] + 1, 9);
    };
  </datamodel>

  <inport name="in">
    <event name="pressed_increase"/>
    <event name="released_increase"/>
    <event name="select_next"/>
  </inport>

  <root>
    <parallel id="p">
      <!-- upper orthogonal region -->
      <state id="heat" initial="Released">
        <state id="Released">
          <transition event="pressed_increase" target="../Pushed"/>
        </state>

        <state id="Pushed" initial="Waiting">
          <onentry>
            <code> increase(); </code>
          </onentry>
          <transition event="released_increase" target="../Released"/>

          <state id="Waiting">
            <transition after="1 s" target="../Increasing"/>
          </state>
          <state id="Increasing">
            <transition after="200 ms" target=".">
              <code> increase(); </code>
            </transition>
          </state>
        </state>
      </state>

      <!-- lower orthogonal region -->
      <state id="burner_select">
        <state id="BurnerSelect">
          <transition event="select_next" target=".">
            <code> selected = (selected + 1) % 4; </code>
          </transition>
        </state>
      </state>
    </parallel>
  </root>
</statechart>
\end{lstlisting}
The targets of transitions are XPath-like paths. During parsing, the targets of transitions are only filled in after building the state tree, as an additional step.
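
A possible way to resolve such target paths, once the state tree exists, is sketched below. The path semantics are assumptions inferred from the example: \texttt{.} is the source state, \texttt{..} its parent, any other step names a child, and a leading \texttt{/} (not used in the example above) starts at the root. The \texttt{State} class here is a minimal stand-in.

\begin{lstlisting}[language=Python,frame=single]
class State:
    def __init__(self, short_id, parent=None):
        self.short_id = short_id
        self.parent = parent
        self.children = {}
        if parent is not None:
            parent.children[short_id] = self


def resolve_target(source, path):
    state = source
    if path.startswith("/"):
        while state.parent is not None:  # absolute path: go to root
            state = state.parent
    for step in path.split("/"):
        if step in ("", "."):
            continue
        if step == "..":
            if state.parent is None:
                raise ValueError("path leaves the state tree: " + path)
            state = state.parent
        elif step in state.children:
            state = state.children[step]
        else:
            raise ValueError("no state %r in path %r" % (step, path))
    return state


# Part of the stove example's state tree.
root = State("root")
p = State("p", root)
heat = State("heat", p)
released = State("Released", heat)
pushed = State("Pushed", heat)
waiting = State("Waiting", pushed)
increasing = State("Increasing", pushed)
\end{lstlisting}

With this tree, \texttt{resolve\_target(released, "../Pushed")} yields the \texttt{Pushed} state, and \texttt{resolve\_target(increasing, ".")} yields \texttt{Increasing} itself.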

% We will not give a detailed explanation of the XML format for specifying state trees.
% As an example, the following XML fragment results in the statechart and state tree shown in Figure \ref{fig:sc_simple}. This example statechart will also be used in the following section, where we explain transition execution.

% \begin{lstlisting}{caption={XML representation of simple statechart model}}
% <root initial="outer">
%   <state id="outer">
%     <transition target="/p/region2/s4"/>
%   </state>
%   <parallel id="p">
%     <state id="region1" initial="s1">
%       <state id="s1">
%         <onentry>
%           <raise event="s1"/>
%         </onentry>
%       </state>
%       <state id="s2">
%         <onentry>
%           <raise event="s2"/>
%         </onentry>
%       </state>
%     </state>
%     <state id="region2" initial="s3">
%       <state id="s3">
%         <onentry>
%           <raise event="s3"/>
%         </onentry>
%       </state>
%       <state id="s4">
%         <onentry>
%           <raise event="s4"/>
%         </onentry>
%       </state>
%     </state>
%   </parallel>
% </root>
% \end{lstlisting}


% \begin{figure}
% \centering
% \begin{subfigure}[b]{.5\textwidth}
%   \centering
%   \includegraphics[width=0.9\linewidth]{test_parallel}
%   \caption{Statechart}
%   % \label{fig:sub1}
% \end{subfigure}%
% \begin{subfigure}[b]{.5\textwidth}
%   \centering
%   \includegraphics[width=0.9\linewidth]{statetree_simple}
%   \caption{State tree, annotating the transition as a dotted line, and a thick line indicating a \texttt{State}'s default state.}
%   % \label{fig:sub2}
% \end{subfigure}
% \caption{Simple, highly synthetic statechart model and its state tree}
% \label{fig:sc_simple}
% \end{figure}

\subsection{Execution: overview}

% The execution of a statechart is a loop, where repeatedly, as a response to a set of input events, zero, one or more transitions are executed, generating a set of output events.

The rest of this chapter explains how statechart models are executed. In doing so, we will also mention regularly that certain properties of statechart constructs are \emph{statically computed}. This points to a static analysis step (or ``tree initialization'' in Figure \ref{fig:xml_init_execute}) that is performed by the parser, right after the state tree has been constructed. This step is not separately discussed.

We will first explain our transition-firing implementation, as it can be understood separately from other parts of the execution runtime. Next, we explain how action language code can be used in statechart models, and how statechart execution causes this code to be executed. Then, we ``zoom out'' to a broad view of a statechart's execution, i.e. how it is ``stepped'', and how semantic variation is possible in the implementation of this stepping. We then explain the stepping algorithm in detail, including our implementation of the semantic variation points described in BSMLs. Finally, we mention a number of implemented performance optimizations.

\subsection{Firing a transition}\label{sec:firing_transition}

In order to understand statechart execution in our runtime, a good place to start is the transition firing function, since it is independent of the various semantic options that can be chosen.

We assume the transition to fire has already been chosen, and the set of current events is known.

For now, the firing of a transition does 3 things, in the following order:
\begin{enumerate}
  \item Exit a subset of the current states, removing them from the \emph{statechart configuration} (= the set of current states). States are exited in child $\rightarrow$ parent order. For each exited state, execute its exit actions, if there are any.
  \item Execute the transition's actions (in the order specified).
  \item Enter a new set of states, adding them to the statechart configuration. States are entered in parent $\rightarrow$ child order. For each entered state, execute its enter actions, if there are any.
\end{enumerate}

The only non-triviality about firing a transition, is determining the sets of exited and entered states. Before we go into detail about how these sets are calculated, we must first explain how the statechart's configuration, and sets of states in general, are represented internally in the SCCD execution runtime.

\subsubsection{Configuration as a bitmap}
It is clear that, as a statechart executes, we have to keep track of its configuration (set of current states). There are many ways to represent the configuration. Conceptually, it is an unordered set. Originally, a Python list of \texttt{State} objects was used, ignoring its order. Simon Van Mierlo at some point introduced \emph{bitmaps} for representing configurations. In order to represent a statechart configuration as a bitmap, every state in the state tree is given a unique integer ID. The root state is given ID 0, and the other states are assigned incrementing numbers in a depth-first fashion, so children are always assigned larger IDs than their parent. Every state ID would be a position in the bitmap, 1 or 0 (in the current configuration or not). Because the resulting bitmaps are relatively small and \emph{dense}, we use Python integers to represent them.

This usage of bitmaps was further extended to all places in the code where \emph{sets of states} had to be represented, because of significant advantages:

\begin{itemize}
  \item Compact representation: For every state in the state tree, only a single bit is used.
  \item Efficient union and intersection operations: These operations are implemented using bitwise-OR and bitwise-AND operations. A single machine instruction can perform these operations for typical statecharts.
  \item When enumerating a bitmap's elements (by iteratively right-shifting the bitmap, and yielding when the least significant bit is 1), they are automatically sorted. This, combined with the depth-first fashion state IDs are assigned, means that when enumerating the ``current states'', they always appear in shallow $\rightarrow$ deep order.
  \item In Python, unlimited size: Our bitmap implementation uses an integer internally. There is no limitation on the maximum size of integers in Python 3, making our solution ``work'' with very large statecharts as well, although performance may take a hit.
  \item A bitmap can be used as key in a mapping. (Useful for caching of transition candidates, see later)
\end{itemize}
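
The following sketch illustrates these properties using plain Python integers. The helper names (\texttt{assign\_ids}, \texttt{bit}, \texttt{members}) are hypothetical, and the state tree is the one of the stove example, written as nested tuples.

\begin{lstlisting}[language=Python,frame=single]
def assign_ids(tree):
    """tree: nested (name, [children]). Returns names in ID order."""
    names = []

    def visit(node):
        name, children = node
        names.append(name)  # a parent is numbered before its children
        for child in children:
            visit(child)

    visit(tree)
    return names


def bit(state_id):
    return 1 << state_id


def members(bitmap):
    """Yield the state IDs in the bitmap, smallest ID first."""
    state_id = 0
    while bitmap:
        if bitmap & 1:
            yield state_id
        bitmap >>= 1
        state_id += 1


stove = ("root", [("p", [
    ("heat", [("Released", []),
              ("Pushed", [("Waiting", []), ("Increasing", [])])]),
    ("burner_select", [("BurnerSelect", [])]),
])])
names = assign_ids(stove)
ids = {name: i for i, name in enumerate(names)}

configuration = 0
for name in ("BurnerSelect", "root", "Released", "p",
             "burner_select", "heat"):
    configuration |= bit(ids[name])  # union, in arbitrary order

in_order = [names[i] for i in members(configuration)]
cache = {configuration: "usable as a dictionary key"}
\end{lstlisting}

Although the states were added in arbitrary order, \texttt{in\_order} lists every parent before its children, because IDs were assigned depth-first.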

To find a \texttt{State} object based on its state ID, an array of all \texttt{State}s, indexed by their ID, is kept in the \texttt{StateTree} object (Figure \ref{fig:cd_statechart}).

Now that we have seen how bitmaps are used to represent the statechart configuration, and can be used to represent sets of states in general, we can explain how the sets of exited and entered states of a transition can be efficiently calculated.

\subsubsection{Calculating set of exited states}

The set of exited states is always equal to the intersection of the set of the transition arena's descendants (not including the arena state itself), and the current configuration. The arena of a transition is the lowest common ancestor of its source and target that is an Or-state. For every transition, the arena's set of descendants can be statically computed. This statically computed set is stored as a bitmap in the \texttt{Transition} object. Since both the arena's set of descendants and the statechart configuration are bitmaps, the intersection is a simple bitwise-AND operation, yielding the set of exited states, as a bitmap. Enumerating the elements in this bitmap, in reverse, yields the IDs of the states-to-exit in the right order, i.e. from child to parent.

\begin{figure}
\centering
\includegraphics[]{enter_exit_set}
\caption{Statechart with a complex transition $t$, as an example for entered and exited sets of states.}
\label{fig:enter_exit_set}
\end{figure}

\textbf{Example:} Figure \ref{fig:enter_exit_set} shows a statechart and a transition $t$. First, we show that the set of exited states depends on the statechart configuration, and can therefore not be computed statically: As $t$ is fired, it is clear that we exit the AND-state $S$, and therefore all descendants of $S$ are exited as well. The transition $t$ can be fired regardless of $D$ or $E$ being current states, so either $D$ or $E$ should be exited depending on the configuration. Second, we give an example of what the exited states set could look like. Suppose the statechart configuration consists of the states $\{ root, S, U, C, V, D \}$. The arena of transition $t$ is the root state, so the arena's descendants set consists of all other states $\{ S, U, A, B, C, V, ... \}$. The intersection between these 2 sets is the exit-set $\{ S, U, C, V, D \}$. After these states have been exited, only the root state is a current state.
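
The same computation can be sketched in Python for the stove example of the previous section, taking the transition from \texttt{Pushed} to \texttt{Released}. State IDs are assumed to be assigned depth-first in document order, the arena is taken to be \texttt{heat} (our reading of the definition above), and the helper names are hypothetical.

\begin{lstlisting}[language=Python,frame=single]
NAMES = ["root", "p", "heat", "Released", "Pushed", "Waiting",
         "Increasing", "burner_select", "BurnerSelect"]
ID = {name: i for i, name in enumerate(NAMES)}


def bitmap_of(*names):
    result = 0
    for name in names:
        result |= 1 << ID[name]
    return result


def members_reversed(bitmap):
    """Yield the state IDs in the bitmap, largest ID first."""
    state_id = bitmap.bit_length() - 1
    while state_id >= 0:
        if (bitmap >> state_id) & 1:
            yield state_id
        state_id -= 1


# Static part: descendants of the arena "heat", excluding "heat".
arena_descendants = bitmap_of(
    "Released", "Pushed", "Waiting", "Increasing")

# Dynamic part: the configuration at the moment the transition fires.
configuration = bitmap_of("root", "p", "heat", "Pushed", "Waiting",
                          "burner_select", "BurnerSelect")

exit_set = arena_descendants & configuration
exit_order = [NAMES[i] for i in members_reversed(exit_set)]
configuration &= ~exit_set  # remove the exited states
\end{lstlisting}

The resulting \texttt{exit\_order} is \texttt{Waiting} followed by \texttt{Pushed}, i.e. child before parent, and the states of the other orthogonal region are left untouched.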

\subsubsection{Calculating set of entered states}

The set of entered states is more complex, but can be computed entirely statically for any transition, if the transition's target state is not a history state. (Our implementation supporting history states is an extension to the current algorithm and will be explained later.) As a start, we take the intersection between the target state's set of ancestors and the arena's set of descendants, and we call this set the \emph{enter path}. When enumerating the states of the enter path, a sequence of states is obtained, such that the next state is always a child of the current, from the arena to the target state (not including the arena or target state). All states on the enter path are in the entered states set. Naturally, the target state itself is also entered, and if the target state has children, some of those children should also be entered, depending on the type of the target state:

\begin{itemize}
  \item If the target state is an And-state, the And-state and all of its children should be considered targets, recursively.
  \item If the target state is an Or-state, the Or-state and its default state should be considered targets, recursively.
\end{itemize}

This behavior is implemented for all states in the polymorphic \texttt{\_effective\_targets} method. It returns a set of states to enter, if the state is the target of a transition. This includes the state itself, and a subset of the children, as described above. This method is called during calculation of the static set of a transition's entered states.

\textbf{Example:} In Figure \ref{fig:enter_exit_set}, the transition $t$'s enter path consists solely of the state $T$. The state $G$ is also entered, as it is the target of the transition, and calling the \texttt{\_effective\_targets} method on $G$ yields its default state $Z$, which is also added to the set of entered states.

But we are not done: in the previous example, the state $H$ should also be added to the enter-set. This is because on the enter path, there was an And-state $T$. And-states occurring on the enter path add their children that are not on the enter path as target states. The polymorphic method \texttt{\_additional\_effective\_targets} behaves this way for And-states, and is called for every state on the enter path.

We have now successfully calculated the set of entered states statically, and this set is stored as a bitmap in the \texttt{Transition} object, so that it can be quickly looked up during transition firing.
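
The static calculation can be sketched as follows. This is an illustrative sketch under assumptions, not the SCCD code: the method names follow the text (without the leading underscore), but their signatures and bodies are guesses, Python lists stand in for bitmaps, and the state tree contains only the states named in the example, with $H$ assumed to be a basic state.

```python
class State:
    """Basic state; also the base class for Or- and And-states."""
    def __init__(self, name, children=(), default=None):
        self.name = name
        self.parent = None
        self.children = list(children)
        self.default = default  # default child; only used by Or-states
        for child in self.children:
            child.parent = self

    def ancestors(self):
        """Ancestors from the direct parent up to the root."""
        result, state = [], self.parent
        while state is not None:
            result.append(state)
            state = state.parent
        return result

    def descendants(self):
        result = []
        for child in self.children:
            result.append(child)
            result.extend(child.descendants())
        return result

    def effective_targets(self):
        return [self]

    def additional_effective_targets(self, exclude):
        return []

class OrState(State):
    def effective_targets(self):
        # The Or-state itself, then its default child, recursively.
        return [self] + self.default.effective_targets()

class AndState(State):
    def effective_targets(self):
        # The And-state itself, then all of its children, recursively.
        result = [self]
        for child in self.children:
            result.extend(child.effective_targets())
        return result

    def additional_effective_targets(self, exclude):
        # Children that are not on the enter path must be entered as well.
        result = []
        for child in self.children:
            if child is not exclude:
                result.extend(child.effective_targets())
        return result

def enter_set(arena, target):
    below_arena = arena.descendants()
    # Enter path: ancestors of the target strictly below the arena, top-down.
    path = [s for s in reversed(target.ancestors()) if s in below_arena]
    entered = list(path)
    for state, next_state in zip(path, path[1:] + [target]):
        entered.extend(state.additional_effective_targets(exclude=next_state))
    entered.extend(target.effective_targets())
    return entered

# Assumed fragment of the example: And-state T contains Or-state G (default Z) and H.
Z = State("Z")
G = OrState("G", [Z], default=Z)
H = State("H")
T = AndState("T", [G, H])
root = State("root", [T])
entered_names = [s.name for s in enter_set(root, G)]
```

For the transition $t$ with arena $root$ and target $G$, the sketch yields the states $T$, $H$, $G$ and $Z$, matching the example.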

\subsubsection{History implementation}

As we have mentioned before, if the target of a transition is a history state, the entered states set of a transition cannot be computed entirely statically. However, our previous calculation is still correct, except that the target state (in this case, a history state) itself is not entered (history states are pseudo states, and cannot be entered!), but instead, the \emph{history value} for that state should be looked up. A history value is simply the set of states to be entered if its history state is the target of a transition, and is simply added to (union-ed with) the statically calculated entered states set. The history value is another bitmap that we keep for every history state in the statechart, making up the ``dynamic'' part of the statechart, together with the statechart configuration.

How to generate the history values? If we exit a state $A$, we have to save some history value if $A$ has a history state as a direct child. The value to save depends on the type of history:

\begin{itemize}
  \item If a \emph{deep history state}, the history value to be saved is the intersection of $A$'s descendants with the \emph{exited states set}, which was of course calculated as explained above, before any state was exited. The descendants bitmap of every state is pre-calculated, so this operation is another simple intersection of 2 bitmaps.
  \item If a \emph{shallow history state}, state $A$ must be an Or-state, since shallow history states make no sense for And-states. The history value to be saved is the result of calling the method \texttt{\_effective\_targets} on $A$'s exited child: When the history state is a transition target, $A$'s exited child should be treated as the target of the transition, also recursively entering $A$'s default state(s), if there are any. For a given state tree structure, the result of calling the recursive (and therefore slow) \texttt{\_effective\_targets} on any state never changes, so it is also computed statically.
\end{itemize}

So even for shallow history states, the saved history values are deep, i.e. \emph{partial configurations} (instead of recording a single state). With this approach, when restoring a history value, it is not necessary to treat shallow and deep history states separately.

\textbf{Example:} In Figure \ref{fig:history}, when the transition from $C$ to $D$ is made, the shallow history state records the history value $\{ Y, B \}$. If the history state were deep, the recorded history value would be $\{ Y, C \}$.
% \todo{Originally I intended to give an example here, but an example is already given in Background, Section \ref{sec:history} ...}

At the beginning of a statechart's execution, there would be no history values unless we explicitly initialize them. We do this to ensure \emph{sane} (non-crashing) behavior when a history state is entered before its history value has been saved. The default history values for all states are equal to the result of calling the method \texttt{\_effective\_targets} on the parent of the history state, i.e. we simply enter the ``default'' states.

In order to efficiently set and look up history values in our ``dynamic'' part of the statechart, every history state has a unique, statically-assigned \texttt{history\_id} (incrementing from 0, depth-first), and we store all history values in an array, indexed by history id. We cannot write history values directly to the \texttt{HistoryState} object itself, since it belongs to the ``static'' part of the statechart, which should never change during execution.
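
The saving of history values can be illustrated with a small sketch. The state tree below is hypothetical: it was chosen only to be consistent with the example above (the parent of the history state is called $A$ here), and the actual figure may differ. Python sets stand in for bitmaps, and the function names are assumptions.

```python
# Hypothetical tree: Or-state A contains Or-state Y (A's default) and a
# history state; Y contains B (Y's default) and C.
CHILDREN = {"A": ["Y"], "Y": ["B", "C"], "B": [], "C": []}
DEFAULT = {"A": "Y", "Y": "B"}

def descendants(state):
    result = set()
    for child in CHILDREN[state]:
        result |= {child} | descendants(child)
    return result

def effective_targets(state):
    """The state itself plus its default states, recursively (Or-states only here)."""
    result = {state}
    if state in DEFAULT:
        result |= effective_targets(DEFAULT[state])
    return result

def deep_history_value(parent, exit_set):
    # Deep history: everything below the parent that is being exited.
    return descendants(parent) & exit_set

def shallow_history_value(parent, exit_set):
    # Shallow history: treat the parent's exited child as a transition target.
    (exited_child,) = [c for c in CHILDREN[parent] if c in exit_set]
    return effective_targets(exited_child)

# One slot per history state, indexed by history_id, pre-filled with the default.
history_values = [effective_targets("A")]
initial_value = history_values[0]

exit_set = {"A", "Y", "C"}   # exited when leaving C for a state outside A
deep = deep_history_value("A", exit_set)
shallow = shallow_history_value("A", exit_set)
history_values[0] = shallow  # saved when A is exited
```

Both kinds of history produce a set of states, so restoring a history value is the same union operation in either case.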

\subsubsection{Timed transitions}

There is one more feature that is yet unsupported by the transition firing algorithm as explained so far: timed transitions, i.e. transitions with an after-trigger.

In visual syntaxes of statecharts, as well as in our statechart XML format, a timed transition is given no event trigger, but a \emph{time duration expression}, such as $100$\,ms, indicating that the transition becomes enabled once its source state has been part of the configuration for $100$\,ms.

% \todo{Likewise: this has also already been explained in Background... }

In the implementation, an after-trigger behaves much like a regular trigger, in the sense that it also responds to an event. This event, called an \emph{after-event}, is hidden from the modeler. It is \emph{scheduled} with its future timestamp upon entering the source state of the transition, and \emph{canceled} upon exiting it.

There are two types of events a statechart can respond to: \emph{Input events} come from the environment, and \emph{internal events} are raised by fired transitions in the statechart itself. At first intuition, it may seem like after-events are internal events, since they are scheduled by the statechart itself, but this is not true: internal events cannot have a time delay, they always occur (conceptually) at the same timestamp as they were raised, even if the transition that raised the internal event occurred \emph{logically} before the transition responding to it. Input events, by contrast, can be given any timestamp, and are put in a global \emph{event queue}, where they are sorted by timestamp, and popped when they become due. It has been mentioned before that the implementation of this event queue is not part of the statechart language, yet a statechart instance expects \emph{some implementation}, and must be given callbacks for scheduling and canceling events upon instantiation.

% For now, it suffices to know that upon entering a state with outgoing timed transitions, after-events have to be scheduled for those transitions, and when exiting such state, all of its scheduled after-events have to be canceled. Every statechart instance therefore has access to the global event queue, but only uses it for scheduling (future) events, and canceling scheduled events. When scheduling an event, it receives a scheduled ID, which it can use for canceling later on.

Just like the history values, we cannot store the scheduled IDs in the state tree (\texttt{State}, \texttt{Transition} or \texttt{AfterTrigger}, ... objects), as these objects shouldn't be altered by execution. We therefore statically assign every \texttt{AfterTrigger} a unique after-ID (incrementing from 0, depth-first), which serves as index in an array of scheduled IDs, stored in the ``dynamic'' part of the statechart.
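
To make the interplay between after-IDs and scheduled IDs concrete, the sketch below uses a toy event queue. The class \texttt{EventQueue}, its methods and the event names are assumptions made for illustration; the real queue is outside the statechart language, and only its schedule and cancel callbacks are visible to the statechart.

```python
import heapq
import itertools

class EventQueue:
    """Toy global event queue holding (timestamp, scheduled_id, event) entries."""
    def __init__(self):
        self._heap = []
        self._canceled = set()
        self._ids = itertools.count()

    def schedule(self, timestamp, event):
        scheduled_id = next(self._ids)
        heapq.heappush(self._heap, (timestamp, scheduled_id, event))
        return scheduled_id

    def cancel(self, scheduled_id):
        # Lazy cancellation: the entry is skipped when it is popped.
        self._canceled.add(scheduled_id)

    def pop_due(self, now):
        due = []
        while self._heap and self._heap[0][0] <= now:
            _, scheduled_id, event = heapq.heappop(self._heap)
            if scheduled_id not in self._canceled:
                due.append(event)
        return due

queue = EventQueue()
scheduled_ids = [None, None]   # dynamic part: one slot per AfterTrigger, by after-ID

# Two source states are entered at t=0; their after-events are scheduled.
scheduled_ids[0] = queue.schedule(100, "after0")
scheduled_ids[1] = queue.schedule(50, "after1")

# The source state of after-trigger 0 is exited early: cancel its after-event.
queue.cancel(scheduled_ids[0])
scheduled_ids[0] = None

due = queue.pop_due(100)
```

Only the after-event whose source state was never exited is delivered; the static \texttt{AfterTrigger} objects are not touched at any point.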

\subsubsection{Final algorithm}

To summarize, the transition firing algorithm looks as follows:
\begin{enumerate}
  \item Calculate the sets of entered and exited states as described above.
  \item Exit states in order child $\rightarrow$ parent. For every exited state $A$:
    \begin{enumerate}
      \item If $A$ has a history state as its direct child, save the history value for that history state, as described above.
      \item For all outgoing transitions of $A$ that have an after-trigger, cancel the scheduled after-event.
      \item Perform the exit-actions of $A$.
      \item Remove $A$ from the configuration.
    \end{enumerate}
  \item Perform the transition's actions.
  \item Enter states in order parent $\rightarrow$ child. For every entered state $B$:
    \begin{enumerate}
      \item Add $B$ to the configuration.
      \item Perform the enter-actions of $B$.
      \item For all outgoing transitions of $B$ that have an after-trigger, schedule their after-event.
    \end{enumerate}
\end{enumerate}

The dynamic part of a statechart consists of the following values:
\begin{itemize}
  \item The statechart configuration (set of current states), represented as a bitmap.
  \item The history values of all history states in the statechart, represented as a history-ID-indexed array of bitmaps.
  \item The scheduled IDs for current states that have timed outgoing transitions, represented as an after-ID-indexed array of ``None'' (no scheduled event) or a ``scheduled ID''.
\end{itemize}

These 3 values are fields in an object called \texttt{StatechartExecution}. This is also the object that implements the method for firing a transition (\texttt{fire\_transition}).
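
The order of operations in the algorithm can be sketched as below. This is a simplified illustration, not the actual \texttt{fire\_transition}: its signature is assumed, the history and after-event steps are only marked by comments, and state IDs are assumed to be assigned depth-first so that a parent always has a lower ID than its children.

```python
def bits(bitmap):
    """Yield the indices of the set bits, lowest first."""
    index = 0
    while bitmap:
        if bitmap & 1:
            yield index
        bitmap >>= 1
        index += 1

class StatechartExecution:
    def __init__(self, configuration):
        self.configuration = configuration   # bitmap of current states
        self.trace = []                      # records the order of operations

    def fire_transition(self, name, arena_descendants, enter_set):
        exit_set = self.configuration & arena_descendants
        # Exit child -> parent: highest state ID first.
        for state_id in sorted(bits(exit_set), reverse=True):
            # (saving history values and canceling after-events would go here)
            self.trace.append(("exit", state_id))
            self.configuration &= ~(1 << state_id)
        self.trace.append(("action", name))
        # Enter parent -> child: lowest state ID first.
        for state_id in bits(enter_set):
            self.configuration |= 1 << state_id
            self.trace.append(("enter", state_id))
            # (scheduling after-events would go here)

# Hypothetical IDs: root=0, A=1, A1=2 (child of A), B=3.
execution = StatechartExecution(configuration=0b0111)
execution.fire_transition("t", arena_descendants=0b1110, enter_set=0b1000)
```

After the call, the trace shows the exits of $A1$ and $A$, then the transition's action, then the entry of $B$.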

\subsection{Action language in a statechart}

Action language code can occur in various places in a statechart model:

\begin{itemize}
  \item The statechart's \emph{data model}, which is a \texttt{Block} (a sequence of \texttt{Statement}s), declaring and initializing the statechart's variables.
  \item The guard condition of transitions is an action language \texttt{Expression}.
  \item The \emph{delay} of an after-trigger is also an \texttt{Expression}.
  \item The various types of actions (enter state, exit state, or fire transition):
  \begin{itemize}
    \item Code-actions execute a \texttt{Block}.
    \item Raise event-actions have \texttt{Expression}s for the raised event's parameters.
  \end{itemize}
\end{itemize}

\begin{figure}
\centering
\includegraphics[]{sc_counter}
\caption{``Counter'' statechart model, with a datamodel and action language expressions and statements}
\label{fig:sc_counter}
\end{figure}

\textbf{Example:} Figure \ref{fig:sc_counter} shows a statechart model containing action code. When the statechart is initialized, the datamodel code is executed, initializing the \texttt{ctr} variable to 0. Both transitions have a guard condition reading the \texttt{ctr} variable. The transition from Counting to itself updates \texttt{ctr} by incrementing it by one.

The following desired features influenced the design of the action language, and its integration in the statechart language:

\begin{itemize}
  \item Variables and functions declared in the statechart's data model should be readable and writeable in all parts of the statechart.
  \item Variables and functions can be declared outside of the data model section, but they are temporary and local to the fragment (scope) where they appear.
  \item If a transition's trigger defines parameters on its events, those parameters should be readable from the transition's guard condition, and its actions.
  \item As part of the statechart language, builtin functions (and possibly variables) can be defined, which are readable (not writable) in all parts of the model. An example of a useful builtin function is the \emph{in\_state} function, which checks if a given state (or list of states) is part of the current configuration. Another useful builtin function, \emph{log}, logs text to the console.
\end{itemize}

\subsubsection{Action language static analysis (in a statechart)}\label{sec:statechart_scopes}
As explained in Section \ref{sec:action_lang_impl}, every parsed expression or statement in the action language has to be statically analyzed before it can be executed. For static analysis, a \texttt{Scope} object must always be given, in which variables are looked up, and to which declared variables are added.

The solution is to first define a special ``builtin'' scope, consisting of our builtin functions, which are declared as ``constant'' (read-only). Then an ``instance'' scope, with as parent the builtin scope, is created, and passed to the static analysis function of the statechart's datamodel, adding all of the variables declared in the datamodel to that scope. Finally, all other occurrences of action code are given their own scope, with the instance-scope as parent. To allow transitions' guard conditions and action code to read event parameters, transitions themselves also get their own scope.

\begin{figure}
\centering
\includegraphics[width=1.0\textwidth]{sc_scopes}
\caption{A statechart model (upper left) and its hierarchy of scopes}
\label{fig:sc_scopes}
\end{figure}

\textbf{Example:} Figure \ref{fig:sc_scopes} shows a statechart model and its scope hierarchy. The variable \texttt{y}, declared in the datamodel, is part of the instance-scope. Both transitions have their own scope, containing their event parameters, if there are any. Action and guard scopes have their transition's scope as parent, so they can read event parameters. Also note that the function $f$ declared in the datamodel also has its own scope, not explicitly created by the statechart parser, but by the action language itself.
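
The scope hierarchy can be sketched with a small class. Only the name \texttt{Scope} comes from the text; the methods and the event parameter name \texttt{param} are assumptions made for illustration.

```python
class Scope:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.names = {}   # declared name -> True if constant (read-only)

    def declare(self, name, constant=False):
        if name in self.names:
            raise NameError(f"{name} already declared in scope {self.name}")
        self.names[name] = constant

    def lookup(self, name):
        """Return the nearest enclosing scope that declares `name`."""
        scope = self
        while scope is not None:
            if name in scope.names:
                return scope
            scope = scope.parent
        raise NameError(f"{name} is not declared")

    def check_assignment(self, name):
        """Return the declaring scope, rejecting writes to constants."""
        scope = self.lookup(name)
        if scope.names[name]:
            raise TypeError(f"{name} is constant")
        return scope

builtin = Scope("builtin")
builtin.declare("in_state", constant=True)
builtin.declare("log", constant=True)

instance = Scope("instance", parent=builtin)
instance.declare("y")   # variable from the datamodel
instance.declare("f")   # function from the datamodel

transition = Scope("transition", parent=instance)
transition.declare("param")   # hypothetical event parameter
guard = Scope("guard", parent=transition)
```

A guard can thus resolve event parameters, datamodel variables and builtins through the same parent chain, while an assignment to a builtin is rejected during static analysis.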


\subsubsection{Action language execution (in a statechart)}
In the action language, the hierarchy of \texttt{Scope} objects is a static representation of stack memory during execution. If the action code contains no recursive function calls, an upper bound can be put on the maximum amount of stack memory required for a statechart's execution. The absence of recursive calls would also guarantee halting for every piece of action code, as the action language has no loop structures. However, neither the action language nor the statechart language currently checks for the presence of recursive calls, so neither of these properties can be claimed for a given model.

At the beginning of a statechart's execution, we start out with an empty (zero-length) stack memory object. Then, the stack frame for the builtin-scope is pushed, and the builtin values (function implementations) copied to that frame. Next, the stack frame for the instance-scope is pushed, and the action code of the statechart's datamodel is run, initializing that stack frame. Now the statechart's memory has been initialized, and default states can be entered (possibly triggering more action code) to complete the statechart's initialization. For transition execution and guard evaluation, stack frames are pushed and popped, always symmetrically. In between the execution of transitions, the stack frames of the builtin-scope and instance-scope are never popped. When e.g. evaluating the guard condition of the upper transition in Figure \ref{fig:sc_scopes}, the stack frames of the transition's event parameters and of the guard condition (the latter is an empty frame) are pushed, and popped when evaluation is complete.

\subsection{Broader picture: Stepping of a statechart}

We've looked at the transition execution algorithm and the way action language fragments in a statechart are statically analyzed and executed. We ``zoom out'' now, to look at the statechart interface ...

We treat statecharts as a subclass of Big-Step Modeling Languages \cite{sabzali2010deconstructing}, where at the highest level, a model's execution is a sequence of \emph{big-steps}. A big-step is a model's response to a set of input events, and consists of the execution of zero or more transitions, changing the model's state and producing output events.

As we have seen in Section \ref{sec:bsmls}, it is here that many semantic variations can occur, such as:
\begin{itemize}
  \item If multiple transitions are enabled, which subset and in what order to fire them?
  \item If the firing of transitions raises internal events, when and for how long can those enable transitions?
  \item If transitions make changes to the statechart's memory, at what point do they become visible to other transitions?
\end{itemize}

We will now cover our implementation of statechart stepping, which is where all of the semantic variation points are implemented.

\subsection{Rounds}\label{sec:rounds}

In BSMLs, a big-step consists of any sequence of \emph{small-steps}. Small-steps are sets of concurrently executed transitions, but since concurrency is not yet implemented in the runtime, a small-step always consists of a single transition. Depending on the semantic options chosen, sequences of small-steps may also be grouped into \emph{combo-steps}, with a sequence of combo-steps making up a big-step. \emph{Maximality}-semantic options exist for both big-steps and combo-steps, defining restrictions on the transitions that can be taken together in such a step.

\begin{figure}
\centering
\includegraphics[]{big_step}
\caption{Left: A big-step without combo-step semantics. Right: A big-step with combo-step semantics.}
\label{fig:big_step}
\end{figure}

In the implementation, this grouping (big-step $>$ combo-step $>$ small-step) was generalized into the \texttt{SuperRound} class (Figure \ref{fig:cd_round}). There is also a \texttt{SmallStep} class, which has the superclass \texttt{Round} in common with \texttt{SuperRound}. Big-steps and combo-steps are implemented as \texttt{SuperRound}s. A \texttt{SuperRound} has a fixed \emph{subround} object, of type \texttt{Round}, and can therefore be another \texttt{SuperRound}, or a \texttt{SmallStep} round. This way, arbitrary numbers of levels of groupings can be constructed.

\begin{figure}
\centering
\includegraphics[width=1.0\textwidth]{CD_Round}
\caption{Class Diagram for the runtime implementation of rounds and transition candidate generation}
\label{fig:cd_round}
\end{figure}

A \texttt{Round} can be executed, which may cause transitions to be fired. If the execution of a round caused no transitions to be fired, we say the round was \emph{empty}. When a \texttt{SuperRound} is executed, it will repeatedly execute its subround, until its subround becomes empty.

The round execution method is called \texttt{run\_and\_cycle\_events}, and returns a pair of bitmaps containing information about the transitions that were fired during the round:
\begin{description}
  \item[arenas\_changed] A bitmap containing the arenas of the transitions fired during round execution.
  \item[arenas\_stabilized] A bitmap containing the arenas of the transitions fired during round execution that have a target state syntactically marked as ``stable''.
\end{description}
If a round was empty, both bitmaps will be zero.

\subsubsection{Maximality implementation}
Every SuperRound has a \emph{maximality} setting. This setting enforces restrictions on the transitions that can be taken together during a round's execution. Possible maximality settings are:
\begin{description}
  \item[TakeMany] No restrictions
  \item[TakeOne] No 2 transitions with overlapping arenas can be taken together
  \item[Syntactic] No 2 transitions with overlapping arenas entering ``stable'' states can be taken together
\end{description}
These restrictions are implemented by maintaining for every SuperRound an ever-growing set of \emph{forbidden arenas}, i.e. arenas that new transition candidates are not allowed to overlap with. At the beginning of every round, the initial set of forbidden arenas is received as a parameter. For a big-step, this parameter will be 0. For other rounds, this parameter is the set of forbidden arenas mandated by its parent round. After every fired transition, the forbidden arenas-set grows according to the maximality option for the SuperRound.

The set of forbidden arenas is implemented as a bitmap. When an arena is ``in the set'', not just the state ID-bit of the arena is ``1'', but also the state ID-bits of its \emph{descendants}. This way, checking for overlapping arenas (binary-AND) when considering transition candidates can be efficiently done, as well as adding new arenas to the set (binary-OR).
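
A small sketch of the bitmap encoding follows. The state IDs are hypothetical, and the sketch assumes that the arena bitmap of a transition candidate also includes the bits of the arena's descendants.

```python
# Hypothetical state IDs: root=0, P=1, P1=2, P2=3, Q=4 (P1 and P2 are children of P).
ROOT, P, P1, P2, Q = (1 << i for i in range(5))

# Arena bitmaps: the arena's own bit plus the bits of all of its descendants.
ARENA = {
    "root": ROOT | P | P1 | P2 | Q,
    "P": P | P1 | P2,
    "P1": P1,
    "Q": Q,
}

def overlaps(forbidden_arenas, arena):
    """Overlap check for a transition candidate: a single binary AND."""
    return (forbidden_arenas & arena) != 0

forbidden = 0
forbidden |= ARENA["P"]   # a transition with arena P has fired: a single binary OR
```

With $P$ forbidden, candidates with arena $P1$ (a descendant) or $root$ (an ancestor) both overlap, while a candidate with arena $Q$ does not.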

\textbf{Example:} Figure \ref{fig:rounds_execution} shows a sequence diagram of the execution of a big-step. In the example, there are 3 levels of rounds: the superrounds ``big\_step'' and ``combo\_step'', and the round ``small\_step''. At the highest level, big-step execution is initiated by another part of the runtime, with the forbidden arenas set to 0, because at the beginning of a big-step, no transitions have yet been fired, and therefore no restrictions exist. The big-step executes the first combo-step, which in turn executes the first small-step. The forbidden arenas for the first small-step are still 0, as no transitions have yet been executed. When the first small-step has executed, a pair (\texttt{arenas\_changed1}, \texttt{arenas\_stable1}) is returned to the combo-step. The combo-step records these values, but in its next execution call to the small-step, it only passes \texttt{arenas\_changed1} as the set of forbidden arenas, since its maximality setting (TakeOne) does not care about stable states. The next small-step returns a pair like the first one did. The combo-step's 3rd attempt to execute a small-step results in an empty small-step, causing the combo-step to end. At the end of the combo-step, the unions of the \texttt{arenas\_changed}- and \texttt{arenas\_stabilized}-values of both small-steps are returned, for the big-step to deal with. The big-step's maximality setting (Syntactic) only looks at the \texttt{arenas\_stabilized}-value, and passes it on as the forbidden arenas-set in the execution of the 2nd combo-step. The second combo-step consists of only a single small-step. An attempt at a 3rd combo-step yields an empty step, causing the big-step to end.

\begin{figure}
\centering
\includegraphics[width=1.0\textwidth]{rounds_execution}
\caption[Sequence diagram for the execution of a big-step]{Top: Sequence diagram for the execution of a big-step.

Bottom: An example statechart producing the big-step sequence above, from its initial state, upon any input event. All transitions are triggerless. States with checkmark-symbol are stable states.

Note: the method \texttt{run} is actually called \texttt{run\_and\_cycle\_events} in the runtime.}
\label{fig:rounds_execution}
\end{figure}

Using our constructs of rounds and superround-maximality, all combinations of Big-Step Maximality and Combo-Step Maximality can be implemented. Figure \ref{fig:rounds_example} shows the rounds-implementations for all combinations of maximality options currently supported.

\begin{figure}
\centering
\includegraphics[width=1.0\textwidth]{rounds_example}
\caption[Implementations of currently supported maximality configurations]{Implementations of currently supported maximality configurations. Note that the parent of a small-step is always a \textsc{TakeOne}-maximality superround. This ensures fairness.}
\label{fig:rounds_example}
\end{figure}

\subsubsection{Detection of never-ending rounds}
\textsc{Take Many} and \textsc{Syntactic} allow for an infinite number of transitions to be taken. Similar to Rhapsody \cite{harel2004rhapsody}, this is dealt with by counting the number of sub-rounds in a \texttt{SuperRound}. When a certain limit is exceeded, a runtime error is thrown. This limit is currently hardcoded at 100, but could be a property of the model.

\subsubsection{Fairness under all circumstances}
It is important to note that in Figure \ref{fig:rounds_example}, the parent of a small-step is always a superround with TakeOne-maximality. Even if no TakeOne-semantic option has been chosen in the statechart's semantic options, a TakeOne-superround is secretly inserted, to ensure \emph{fairness}: Even if there never is any restriction on the transitions that can execute, we still demand them to be executed in a fair order, i.e. transitions with non-overlapping arenas are given a higher priority.
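
The round machinery described in this section (subround repetition, maximality, the sub-round limit and the hidden TakeOne-superround) can be sketched as follows. This is a simplified illustration under assumptions: the method is called \texttt{run} and does not cycle events, candidate selection simply takes the first eligible transition, and \texttt{ToyModel} with its flat transition tuples is hypothetical.

```python
TAKE_MANY, TAKE_ONE, SYNTACTIC = "TakeMany", "TakeOne", "Syntactic"

class ToyModel:
    """Flat toy model: each transition is (name, source, target, arena, stable)."""
    def __init__(self, configuration, transitions):
        self.configuration = configuration
        self.transitions = transitions
        self.fired = []

class SmallStep:
    def __init__(self, model):
        self.model = model

    def run(self, forbidden_arenas):
        m = self.model
        for name, source, target, arena, stable in m.transitions:
            if m.configuration & source and not (arena & forbidden_arenas):
                m.configuration = (m.configuration & ~source) | target
                m.fired.append(name)
                return arena, (arena if stable else 0)
        return 0, 0   # empty round: both bitmaps are zero

class SuperRound:
    def __init__(self, subround, maximality, limit=100):
        self.subround = subround
        self.maximality = maximality
        self.limit = limit

    def run(self, forbidden_arenas):
        arenas_changed = arenas_stabilized = 0
        subrounds = 0
        while True:
            changed, stabilized = self.subround.run(forbidden_arenas)
            if changed == 0:
                break                      # the subround was empty
            subrounds += 1
            if subrounds > self.limit:
                raise RuntimeError("sub-round limit exceeded")
            arenas_changed |= changed
            arenas_stabilized |= stabilized
            if self.maximality == TAKE_ONE:
                forbidden_arenas |= changed
            elif self.maximality == SYNTACTIC:
                forbidden_arenas |= stabilized
        return arenas_changed, arenas_stabilized

# Two orthogonal regions R1 = {a1, a2} and R2 = {b1, b2}, one bit per state.
R1, A1, A2, R2, B1, B2 = (1 << i for i in range(6))
ARENA_R1, ARENA_R2 = R1 | A1 | A2, R2 | B1 | B2
TRANSITIONS = [
    ("t1", A1, A2, ARENA_R1, False),
    ("t2", A2, A1, ARENA_R1, False),   # together with t1, an endless cycle
    ("t3", B1, B2, ARENA_R2, False),
]

model = ToyModel(R1 | A1 | R2 | B1, TRANSITIONS)
combo_step = SuperRound(SmallStep(model), TAKE_ONE)
result = combo_step.run(0)
```

With TakeOne-maximality, \texttt{t1} and \texttt{t3} fire once each and the round ends. Wrapping the same TakeOne-superround in a TakeMany-superround lets the cycle between \texttt{t1} and \texttt{t2} continue until the limit raises a runtime error.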

\subsubsection{Event Lifeline implementation}
The event lifeline semantic options of BSMLs specify when input- and internal events are present. They can all be considered relative to some round (big/combo/small-step), and fall into two categories according to their relationship to that round (with exceptions, discussed later): either the event becomes present in the \emph{current} round, or in the \emph{next} round. Table \ref{tab:event_lifeline} shows these categories.

\begin{table}
\centering
\begin{scriptsize}
\begin{tabular}{ | r | c | c | c | }
\hline
&             \textbf{Current round (remainder of)} &                         \textbf{Next round} & \textbf{Special}\\
\hline
Big Step &    input: \textsc{Whole}, internal: \textsc{Remainder} & - & internal: \textsc{Queue} \\
Combo Step &  input: \textsc{First Combo Step} &                    internal: \textsc{Next Combo Step} & \\
Small Step &  input: \textsc{First Small Step} &     internal: \textsc{Next Small Step} & internal: \textsc{Same} \\
\hline
\end{tabular}
\end{scriptsize}
\caption{Event Lifeline options in ``rounds'' implementation}
\label{tab:event_lifeline}
\end{table}

The implementation of the \texttt{Round} class offers methods to add an event to the current or next round. Depending on the semantic options chosen for the statechart, these methods, bound to the ``right'' \texttt{Round}-objects, serve as ``the'' callbacks for raising internal events and adding input events during execution.

At the beginning of every big-step, the big-step's input events are added to the right round, always in the category ``remainder''.

``Remainder'' and ``next'' events are added to the lists \texttt{Round.remainder\_events} and \texttt{Round.next\_events}, respectively (shown in Figure \ref{fig:cd_round}). At the end of every round, those lists are cycled:

\begin{python}
  # End of a round: the "next" events become present, and the next-list is reset.
  remainder_events = next_events
  next_events = []
\end{python}

At any point in time, the current set of enabled events is the union of all \texttt{remainder\_events} of all \texttt{Round}s. Retrieving this union is recursively implemented in the \texttt{Round.enabled\_events} method, merging the remainder-events with the enabled events of the parent round. It is always the \texttt{SmallStep} object initiating this recursive request, as it needs to know the current set of enabled events in order to generate transition candidates (see next section).
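
The cycling of event lists and the recursive lookup of enabled events can be sketched as follows. The attribute names \texttt{remainder\_events} and \texttt{next\_events} and the method \texttt{enabled\_events} come from the text; the other method names, the \texttt{parent} attribute and the event names are assumptions.

```python
class Round:
    def __init__(self, parent=None):
        self.parent = parent
        self.remainder_events = []
        self.next_events = []

    def add_remainder_event(self, event):
        self.remainder_events.append(event)

    def add_next_event(self, event):
        self.next_events.append(event)

    def cycle_events(self):
        # Called at the end of the round.
        self.remainder_events = self.next_events
        self.next_events = []

    def enabled_events(self):
        # Own remainder-events merged with the enabled events of the parent round.
        events = set(self.remainder_events)
        if self.parent is not None:
            events |= self.parent.enabled_events()
        return events

big_step = Round()
combo_step = Round(parent=big_step)
small_step = Round(parent=combo_step)

big_step.add_remainder_event("in")      # input event, lifeline Whole
combo_step.add_next_event("raised")     # internal event, lifeline Next Combo Step
before = small_step.enabled_events()
combo_step.cycle_events()               # the combo-step ends
after = small_step.enabled_events()
```

The raised event only becomes visible to the small-step after the combo-step has cycled its lists, while the input event stays present for the whole big-step.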

There are 2 exceptions to the categories ``remainder'' and ``next''. The first is the Internal Event Lifeline option \textsc{Queue}, which makes raised events present in a later big-step. This is not an option mentioned in the BSMLs paper, but it was the standard behavior of the original version of SCCD. It is included merely for completeness. A major drawback of this option is the unpredictability of when a raised event will be responded to: raised events are added to the global event queue, where they may be interleaved with input events.

The second exception is the Internal Event Lifeline option \textsc{Same}, which is only to be used with concurrency-semantics. Although SCCD currently does not support \emph{true} concurrency, there is a concurrency option, allowing more than one transition to be included in a small-step. When this option is chosen, an event raised by one transition may be responded to by another transition in the same small-step. There is however still a causality relation between the transition raising the event, and the transition responding to it. It is therefore not a ``true concurrency'' and the same semantics can be achieved by replacing the small-steps with combo-steps.

\subsection{Transition candidate generation}

In order to execute a small-step, conceptually, a set of transition candidates is generated, and from this set, a candidate is chosen, depending on Priority semantics. This candidate is fired, concluding the small-step.

The set of transition candidates can be understood as a \emph{series of filters} applied to the set of all transitions. Those filters are:
\begin{enumerate}
  \item Is the transition's source-state part of the statechart's \textbf{current configuration}?
  \item Is the transition's trigger satisfied with respect to the set of \textbf{currently enabled events}? No distinction is made here between input events and internal events.
  \item Is the transition's arena free of overlap with all of the \textbf{forbidden arenas}? This set is determined by the maximality-semantics chosen.
  \item Is the transition's \textbf{guard condition} satisfied with respect to the current state of the statechart's memory?
\end{enumerate}
We'll refer to these filters by their number in the enumeration above.

\subsubsection{Efficient candidate generation}
These filters could be applied in any order without changing the result, but not all orders have the same performance: Filter (4) is a potentially expensive operation (it may cause a billion decimals of $\pi$ to be calculated), so it is best to apply this filter at the end, when as many transition candidates as possible have already been eliminated.

The other filters, (1), (2) and (3), have the interesting property that for the same input (current configuration, currently enabled events, forbidden arenas), they always give the same output: a transition's source state, trigger and arena never change! This means that, theoretically, we can pre-calculate their results for all inputs. However, the number of possible inputs is too large to compute them all.

A better alternative is to \emph{cache} computed semi-filtered sets of candidates at runtime\footnote{Credit for this feature originally goes to Simon Van Mierlo}. This is possible since all filter inputs are representable as bitmaps (including the set of enabled events, if we assign a unique integer ID to every event), so they (and tuples of them) can serve as keys in a hashtable or search tree. We simply use Python's dictionary type, which uses a hashtable.

The rationale behind caching transition candidates is that a statechart will re-visit the same state configuration / enabled events / forbidden arenas during its lifetime. Some consideration needs to be made when choosing our caching key. We could cache the candidate sets resulting from applying filters (1), (2) and (3), but then our cache would be highly specific, making it grow very large, and \emph{cache hits} would be rare. The opposite choice of a non-specific filter, such as only (1), would make the cache smaller and cache hits more frequent, but would yield candidate sets that still have to be filtered further by the unapplied filters, possibly reducing performance.

The filter key used for caching can be swapped out in the SCCD runtime. The interchangeable implementations are called GeneratorStrategies (see Figure \ref{fig:cd_round}). Two strategies are included:
\begin{description}
  \item[CurrentConfigStrategy] Computes and caches candidates based on the tuple (current configuration, forbidden arenas), then filters based on enabled events, and finally filters on guard condition evaluation.
  \item[EnabledEventsStrategy] Computes and caches candidates based on the tuple (enabled events, forbidden arenas), then filters based on current configuration, and finally filters on guard condition evaluation.
\end{description}

A performance comparison of these strategies, plus additional ones, could be considered future work.
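
As an illustration of the caching idea, the sketch below follows the description of the EnabledEventsStrategy: filters (2) and (3) are cached under the key (enabled events, forbidden arenas), and filters (1) and (4) are applied afterwards. The class and field names are hypothetical, and the encoding of a trigger as a bitmap of required events is an assumption.

```python
from collections import namedtuple

# trigger: bitmap of the events the transition needs (0 = triggerless).
Transition = namedtuple("Transition", "name source trigger arena guard")

class EnabledEventsCache:
    def __init__(self, transitions):
        self.transitions = transitions
        self.cache = {}    # (enabled_events, forbidden_arenas) -> semi-filtered list
        self.misses = 0

    def candidates(self, configuration, enabled_events, forbidden_arenas):
        key = (enabled_events, forbidden_arenas)
        try:
            semi_filtered = self.cache[key]
        except KeyError:
            self.misses += 1
            semi_filtered = [
                t for t in self.transitions
                if (t.trigger & enabled_events) == t.trigger   # filter (2)
                and not (t.arena & forbidden_arenas)           # filter (3)
            ]
            self.cache[key] = semi_filtered
        return [
            t for t in semi_filtered
            if t.source & configuration   # filter (1)
            and t.guard()                 # filter (4), evaluated last
        ]

# Hypothetical bit assignments: states A and B, one event E.
A, B = 1 << 0, 1 << 1
E = 1 << 0
transitions = [
    Transition("t1", A, E, A, lambda: True),
    Transition("t2", B, 0, B, lambda: True),    # triggerless
    Transition("t3", A, E, A, lambda: False),   # guard never holds
]
strategy = EnabledEventsCache(transitions)
first = [t.name for t in strategy.candidates(A, E, 0)]
second = [t.name for t in strategy.candidates(B, E, 0)]   # same key: cache hit
misses_after_two = strategy.misses
third = [t.name for t in strategy.candidates(A | B, E, A)]
```

The second request differs only in the current configuration, which is not part of the key, so it is answered from the cache.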

\subsubsection{Cache initialization: Partial pre-computation of candidates}
We have seen that complete pre-computation of transition candidates is intractable. We can however initialize the candidate-cache with pre-computed values that are expected to be requested frequently. This would pass for partial pre-computation of candidates, and mixes nicely with caching.

Currently, pre-computation is only implemented for the EnabledEventsStrategy, computing candidate sets for the following values:
\begin{itemize}
  \item (0, 0): No current events and no forbidden arenas. (Containing all triggerless transitions)
  \item ($\{e\}$, 0): Where $\{e\}$ is the singleton-set of event $e$, $e$ being any event that serves as a trigger for a transition in the statechart. No forbidden arenas.
\end{itemize}
This pre-computation scheme significantly increases the number of cache hits at the beginning of a statechart's execution, since very frequently, the set of currently enabled events consists only of a single event, or of no events.
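
A sketch of how such an initial cache could be built is given below. It is an illustration only: transitions are reduced to hypothetical (name, trigger bitmap) pairs, and we assume that the candidate set for $\{e\}$ contains the triggerless transitions as well as those triggered by $e$.

```python
from typing import Dict, List, Tuple

Transition = Tuple[str, int]  # (name, trigger bitmap); 0 means triggerless


def filter_on_events(transitions: List[Transition], events: int) -> List[Transition]:
    """Keep triggerless transitions and those whose trigger is enabled."""
    return [t for t in transitions if t[1] == 0 or (t[1] & events)]


def initial_cache(transitions: List[Transition]) -> Dict[Tuple[int, int], List[Transition]]:
    """Pre-compute entries for (0, 0) and for ({e}, 0), for every trigger e."""
    cache = {(0, 0): filter_on_events(transitions, 0)}
    trigger_bits = {t[1] for t in transitions if t[1] != 0}
    for bit in trigger_bits:
        # Key: (singleton set of enabled events, no forbidden arenas).
        cache[(bit, 0)] = filter_on_events(transitions, bit)
    return cache
```

The number of pre-computed entries is one more than the number of distinct triggers, so it grows linearly with the model instead of exponentially.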

\subsubsection{Priority implementation}

In the BSML paper, Priority and Order of Small Steps (OOSS) are considered separate semantic aspects, but in our implementation they are considered one and the same option. We described in Section \ref{sec:ooss_priority} how they both produce partial orderings on the transitions in a statechart, and how their combination may produce a total ordering on all sets of potentially simultaneously enabled transitions in the statechart, in which case the model behaves deterministically, and in a way that the modeler has full control over.

In SCCD, priority is implemented by allowing the modeler to select and combine different priority options. These options are:

\begin{description}
  \item[Same source] \textsc{None} (disable priorities), \textsc{Explicit Priority} (takes XML document order for transitions that have the same source state)
  \item[Hierarchical] \textsc{None} (disable priorities), \textsc{Source-Parent}, \textsc{Source-Child}, \textsc{Arena-Parent}, \textsc{Arena-Child} (see Section \ref{sec:ooss_priority})
  \item[Orthogonal] \textsc{None} (disable priorities), \textsc{Explicit Ordering} (takes XML document order of orthogonal regions)
\end{description}

In the general case, the combination of these options yields a partial ordering among the transitions of the statechart, which is explicitly constructed as a directed graph (or more precisely, a list of graph edges). First, this graph is checked for cycles. A cycle represents a conflict in the priority options chosen, and is a static error. Next, a single global ordering of all transitions (as a list) is constructed from the partial ordering, by introducing missing orderings between transitions \emph{non-deterministically}. The eventual behavior is still deterministic, because orderings are only added between transitions that cannot be enabled simultaneously. The algorithm is as follows:

\begin{enumerate}
  \item Initialize our total ordering $T$ to be an empty list.
  \item In the priority graph, find the set of \emph{highest-priority} transitions $H$. This is the set of transitions that have no incoming edges. We know this set will be non-empty, because we have already ensured that there are no cycles in the graph.
  \item Check whether in $H$, there exists any pair of transitions $t1$ and $t2$ that can be enabled simultaneously. Concretely, this is the case if any of the following holds:
  \begin{itemize}
    \item $t1$ and $t2$ have the same source state
    \item $t1$'s source is an ancestor of $t2$'s source, or vice versa
    \item $t1$'s source is orthogonal to $t2$'s source, i.e. the lowest-common ancestor of the sources is an And-state.
  \end{itemize}
  If this is the case, the model is found to be non-deterministic, and an error is thrown.
  \item Add the transitions of $H$ to $T$ in \emph{any order}. This is the part where we non-deterministically introduce an ordering among the transitions in $H$.
  \item Remove the transitions of $H$, and all edges involving them, from the priority graph.
  \item Repeat from step (2) until the graph is empty (at which point $T$ contains all transitions in the statechart).
\end{enumerate}
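
The steps above can be sketched as follows. This is a simplified illustration, not the SCCD implementation: transitions are arbitrary hashable values, edges are (higher, lower) pairs, and the predicate \texttt{can\_be\_enabled\_together} is a hypothetical stand-in for the three source-state checks of step (3). For brevity, the sketch detects cycles inside the loop rather than in a separate pass.

```python
from itertools import combinations
from typing import Callable, Hashable, Iterable, List, Tuple


def total_ordering(transitions: List[Hashable],
                   edges: Iterable[Tuple[Hashable, Hashable]],
                   can_be_enabled_together: Callable[[Hashable, Hashable], bool]
                   ) -> List[Hashable]:
    remaining = set(transitions)
    edges = set(edges)              # (higher, lower) priority pairs
    ordering: List[Hashable] = []   # step 1: T starts out empty
    while remaining:
        # Step 2: H = remaining transitions without incoming edges.
        has_incoming = {lower for (_, lower) in edges}
        highest = [t for t in transitions
                   if t in remaining and t not in has_incoming]
        if not highest:
            raise ValueError("cycle in priority graph")
        # Step 3: unordered transitions must never be enabled together.
        for t1, t2 in combinations(highest, 2):
            if can_be_enabled_together(t1, t2):
                raise ValueError(f"non-deterministic model: {t1} and {t2} are unordered")
        # Step 4: any order will do; here we keep the input order.
        ordering.extend(highest)
        # Step 5: remove H and its outgoing edges from the graph.
        remaining -= set(highest)
        edges = {(hi, lo) for (hi, lo) in edges if hi not in highest}
    return ordering
```

The procedure is essentially a layered topological sort, with an additional determinism check on every layer.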

This total ordering of transitions is statically constructed as a global list (one list for the entire statechart). In order to generate transition candidates, a series of filters is applied as explained in the previous sections, in an order-preserving way, producing a new list of candidates, ordered by priority. If this candidate list is non-empty, its first element is always the selected candidate.

% Move this to Chapter "Evaluation"?
Note that the hierarchical arena-options do not cover all situations where one transition's source state is an ancestor of another transition's source state, producing a non-deterministic model. They may also conflict (a cycle in priority graph) with e.g. ``Same source''-priority. For instance, Figure \ref{fig:priority_arena_nondeterminism} shows two transitions that can be enabled simultaneously, have different source states, are not orthogonal, yet would not be assigned a priority with respect to each other, because they have the same arena. It is therefore our opinion that these options have little practical use. They were implemented nevertheless, because it was only a small task to do so.

Also note that every priority option can be disabled by choosing \textsc{None}. SCCD may be the only statechart tool that allows the modeler to do this. It is highly likely however, that the resulting model will be found non-deterministic, and hence rejected by the runtime.

The default semantics of SCCD are \textsc{Explicit Priority}, \textsc{Explicit Ordering} and \textsc{Source-Parent}. These options always yield a deterministic model.

\begin{figure}
\centering
\includegraphics[]{priority_arena_nondeterminism}
\caption{Transitions with different source states, but the same arena}
\label{fig:priority_arena_nondeterminism}
\end{figure}

\subsubsection{Lazy evaluation of transition candidates}

We explained how the list of transition candidates is the result of applying a sequence of filters on a totally ordered list of transitions. Every consecutive filter operation could produce a new (smaller) list, but we only fully construct such an (intermediate) list if the result is being cached. Otherwise, we \emph{lazily evaluate} the list, as we are only interested in its first element (= the next transition to fire), or whether the list is empty (= no more enabled transitions). For lazy evaluation, we use Python's generator expressions.
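
As an illustration of this technique (with hypothetical dictionary-based transition records, not SCCD's own data structures), chained generator expressions evaluate each filter only as far as needed to find the first candidate:

```python
from typing import Dict, Iterable, Iterator, Optional


def lazy_candidates(ordered: Iterable[Dict], config: int, events: int) -> Iterator[Dict]:
    """Chain three filters without building any intermediate list."""
    in_config = (t for t in ordered if t["source"] & config)
    triggered = (t for t in in_config
                 if t["trigger"] == 0 or (t["trigger"] & events))
    return (t for t in triggered if t["guard"]())


def first_candidate(ordered: Iterable[Dict], config: int, events: int) -> Optional[Dict]:
    """Return the highest-priority enabled transition, or None if there is none."""
    return next(lazy_candidates(ordered, config, events), None)
```

Guard conditions of transitions that come after the first enabled one are never evaluated, which matters when guards are expensive.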

\subsection{Memory snapshots}

SCCD supports the Enabledness Memory Protocol and Assignment Memory Protocol semantic aspects. (Almost always, one would choose the same option for both Enabledness and Assignment Memory Protocol.) For each aspect, the options \textsc{Big Step}, \textsc{Combo Step} and \textsc{Small Step} exist. These options determine the ``moment in time'' that action language expressions ``see'' when evaluating variables. This moment in time is always the beginning of some step: E.g. if the \textsc{Combo Step}-option is chosen, all guard conditions evaluated and transitions executed as part of the combo-step will read the statechart's variables as they were at the beginning of the combo-step.

This behavior is trivially implemented by creating a snapshot of a statechart's memory at the beginning of the step chosen in the Memory Protocol option. The snapshots are used for reading, while the actual memory is used for writing.

For any given statechart, the size of the snapshot is always the same. Snapshots are taken in between transitions, when the only stack frames in the statechart's memory are the ``builtin'' and ``instance'' scopes. The ``builtin''-scope is read-only, so does not have to be snapshotted. Therefore the size of every snapshot is equal to the size of the ``instance''-scope, which is statically computed from the statechart's datamodel section (see Section \ref{sec:statechart_scopes}).

The snapshotting is implemented in the \texttt{MemoryPartialSnapshot} class (Figure \ref{fig:cd_memory_snapshot}). It wraps around a normal \texttt{Memory} object (the default implementation of \texttt{MemoryInterface}, expected by action language constructs for execution), which it uses as a delegate for all write-operations, as well as read-operations outside of the snapshotted memory (i.e. builtin variables and temporary variables, local to the action code fragment being executed). At the end of every Memory Protocol-step, the snapshot is refreshed (\texttt{flush}). This is done by registering a callback on the correct \texttt{Round} object.

\begin{figure}
\centering
\includegraphics[]{cd_memory_snapshot}
\caption{Class diagram of MemoryPartialSnapshot class.}

Dark-gray classes defined in action language package.
\label{fig:cd_memory_snapshot}
\end{figure}

\subsubsection{Race condition detection}

If more than one transition writes a value to the same location in memory during the same Memory Protocol-step, a race condition exists. Race conditions are detected by maintaining a bitmap (\texttt{round\_dirty} in \texttt{MemoryPartialSnapshot}) of memory locations written during the current Memory Protocol-step. A memory location written twice causes a \emph{runtime error}. Race conditions cannot be detected statically. At the end of every Memory Protocol-step, when the snapshot is refreshed, the bitmap is reset to 0.

\subsubsection{Locally observable writes}\label{sec:locally_obs_writes}

It would be counter-intuitive if, due to the Memory Protocol option chosen (e.g. \textsc{Combo Step}), a transition writing to a statechart's variable $x$ would not be able to subsequently read the variable $x$ and retrieve the value it had just written. This is taken care of in \texttt{MemoryPartialSnapshot} by maintaining another bitmap, \texttt{trans\_dirty}, of memory locations written during the current transition. For these locations, subsequent read operations are performed on the actual memory, not the snapshot. Subsequent writes to these locations by the same transition are also not erroneously detected as race conditions.
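
The interplay between the snapshot and the two bitmaps can be sketched as follows. This is a strongly simplified, hypothetical model of the idea (a flat list of memory cells, no scopes, no delegation to a \texttt{Memory} object); only the names \texttt{round\_dirty}, \texttt{trans\_dirty} and \texttt{flush} are taken from the text, all other names are assumptions.

```python
from typing import List


class RaceConditionError(Exception):
    pass


class SnapshotMemory:
    """Hypothetical simplification of the snapshot idea described above."""

    def __init__(self, initial: List[int]):
        self.actual = list(initial)     # written to
        self.snapshot = list(initial)   # read from
        self.round_dirty = 0            # locations written in this Memory Protocol-step
        self.trans_dirty = 0            # locations written by the current transition

    def load(self, offset: int) -> int:
        if self.trans_dirty & (1 << offset):
            return self.actual[offset]  # locally observable write
        return self.snapshot[offset]

    def store(self, offset: int, value: int) -> None:
        bit = 1 << offset
        if (self.round_dirty & bit) and not (self.trans_dirty & bit):
            raise RaceConditionError(f"location {offset} written twice in one step")
        self.actual[offset] = value
        self.round_dirty |= bit
        self.trans_dirty |= bit

    def end_transition(self) -> None:
        self.trans_dirty = 0

    def flush(self) -> None:
        """Refresh the snapshot at the end of a Memory Protocol-step."""
        self.snapshot = list(self.actual)
        self.round_dirty = 0
        self.trans_dirty = 0
```

A transition may write the same location repeatedly and read back its own writes, while a second transition writing that location before the next \texttt{flush} triggers the error.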

\subsection{Various optimizations}

In the previous sections, the following performance optimizations were described:

\begin{itemize}
  \item Use of bitmaps to represent sets of states and sets of events.
  \item Static calculation of a transition's arena, and arena's descendants, allowing for efficient calculation of a transition's \emph{exited states} when firing a transition.
  \item Static calculation of a transition's \emph{entered states}.
  \item Swappable transition candidate generation strategies, with caching.
  \item Lazy evaluation of transition candidates.
\end{itemize}

In this section, we discuss a few optimizations that haven't been covered yet.

% \subsubsection{Sorting lists of enabled events}

% The following pieces of action code in a statechart have access to the event parameters of a transition:
% \begin{itemize}
%   \item The transition's guard condition, an expression.
%   \item The transition's action code, a block of statements.
% \end{itemize}
% Before executing them, the transition's event parameters must be copied to the stack, where they occupy the transition's stack frame. In order to copy event parameters, the set of current events has to be iterated

% If the guard conditions of many different transitions that have event parameters have to be evaluated, this potentially leads to the copying 

% The order in which they are expected on the stack is defined as ordered by of their event ID, followed by the order of the parameter within the event (parameters are ordered within events).

\subsubsection{No semantic-option-dependent conditional branches in execution loop}

The semantics of a statechart model never change during execution. Therefore, no execution loop should waste time checking them on every iteration.

Upon initialization of a statechart instance, based on the semantics defined in the statechart model, objects and relationships between them are constructed in a way that they will carry out statechart execution according to the semantics specified, but the objects themselves are unaware of the semantics they represent.

For instance, Figure \ref{fig:rounds_example} shows the construction of different hierarchical \texttt{Round}-structures, as constructed based on different maximality-semantics, but the \texttt{Round}-objects themselves do not have access to the semantic configuration of the model.
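
The principle can be illustrated with a small sketch. The \texttt{Round} class below is a hypothetical, heavily reduced stand-in for SCCD's \texttt{Round} classes (each round runs its body exactly once), and \texttt{build} is an assumed helper; the point is only that the semantic option is consulted once, at construction time.

```python
from typing import Callable, List


class Round:
    """Runs a body, then notifies its callbacks; knows nothing about semantics."""

    def __init__(self, name: str, body: Callable[[], None]):
        self.name = name
        self.body = body
        self.when_done: List[Callable[[], None]] = []

    def run(self) -> None:
        self.body()
        for callback in self.when_done:
            callback()


def build(memory_protocol: str, flush: Callable[[], None]) -> Round:
    """Wire up the rounds once; the option is never looked at again."""
    small = Round("small", lambda: None)
    combo = Round("combo", small.run)
    big = Round("big", combo.run)
    rounds = {"small": small, "combo": combo, "big": big}
    rounds[memory_protocol].when_done.append(flush)
    return big
```

After \texttt{build} returns, calling \texttt{run} on the resulting object involves no branching on the chosen option.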

\subsubsection{Converting all durations to the same unit}\label{sec:model_delta}

The action language has a builtin \emph{time duration} type. Duration-literals make time durations unambiguous by including a duration-suffix (e.g. \texttt{ms} for milliseconds). In a statechart, the duration type is expected for the delay-value of timed transitions.

Internally, a duration is an integer value with a \emph{unit}. Units reflect the duration-suffixes used, but upon construction, every duration is automatically converted to the largest possible unit, without losing information, called \emph{normalization}. For arithmetical operations between durations, such as addition or subtraction, they have to be converted to their \emph{greatest common divisor}-unit before the operation can be performed, again followed by normalization. Units have no absolute ``size'' (they are defined only relative to each other), offering great flexibility and an unlimited range of units to be defined, but also making these duration conversions expensive operations.

Because units can be arbitrarily mixed in a statechart, potentially requiring many unit conversions during execution and thus hurting performance, all durations are transparently converted to \emph{integer} values representing multiples of the same time unit (or more precisely: time duration), called the \emph{model delta}. The model delta is the smallest representable amount of time during model execution. Even if a model only contains durations in the order of seconds, it is wise to set the model delta much smaller, because the timestamps assigned to input events are also expressed in model delta units, and are rounded down if necessary. By default, a model delta of $100\,\mu s$ is used, allowing for maximum durations (and timestamps) of $5.8 \times 10^{7}$ years if 64-bit unsigned integers are used for timestamps. The model delta can be overridden in the model.
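
The conversion and the stated maximum can be checked with a few lines of Python. The unit table and the function name \texttt{to\_ticks} are assumptions made for this example; SCCD's \texttt{Duration} type is more general.

```python
MODEL_DELTA_US = 100  # default model delta: 100 microseconds

# Assumed unit table for this example, expressed in microseconds.
US_PER_UNIT = {"us": 1, "ms": 1_000, "s": 1_000_000}


def to_ticks(value: int, unit: str) -> int:
    """Convert a duration to an integer number of model deltas, rounding down."""
    return (value * US_PER_UNIT[unit]) // MODEL_DELTA_US


# Largest timestamp representable in an unsigned 64-bit integer, in years.
SECONDS_PER_YEAR = 365.25 * 24 * 3600
max_seconds = (2**64 - 1) * MODEL_DELTA_US / 1_000_000
max_years = max_seconds / SECONDS_PER_YEAR  # roughly 5.8e7
```

For instance, a duration of 250 microseconds is rounded down to 2 model deltas under the default setting.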

\subsubsection{Python-specific optimizations}

Our implementation contains some optimizations specific to Python, the language in which it is written:
\begin{description}
  \item[Native integer type vs. SCCD's Bitmap class] Bitmaps are used in many parts of the code. A \texttt{Bitmap} type was defined, inheriting Python's integer type, overriding its string-representation \texttt{\_\_str\_\_}-method to return a bitstring instead of a decimal number. Arithmetic operators between Bitmaps were also overridden to return Bitmaps, not integers. It was found however, that this had a significant negative impact on performance, simple arithmetic operations on bitmaps requiring function calls. The solution was to disable our custom bitmap implementation when the environment variable \texttt{SCCDDEBUG} was not set, falling back to native integers. The Python language is very flexible in this regard, the solution having the following form:
  \begin{python}
    if DEBUG:
      class Bitmap(int):
        ...
    else:
      Bitmap = int
  \end{python}
  With this modification alone, when executing all models from the test framework, the average time of executing a transition went from 0.2 ms to 0.16 ms, a 20\% reduction!

  \item[Slotted classes] By default, Python allows arbitrary attributes to be added to objects. This is implemented by storing the attributes of objects in (modifiable) dictionaries. This introduces a significant memory overhead and slows down attribute access. Python offers an alternative in the form of \emph{slotted classes}, that declare a sequence of fields to always be part of the object.

  Slotted classes have drawbacks, such as restrictions on multiple inheritance (a class cannot inherit from several bases that each declare non-empty slots). They also require additional syntax, for the declaration of the ``slots''. We \emph{make the common case fast} and use slotted classes in much of the \texttt{statechart.static} package defining syntactic constructs, whose attributes are constantly accessed during execution.
\end{description}
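
As a brief illustration of slotted classes, consider the following sketch. The \texttt{State} class and its fields are hypothetical and only loosely inspired by the syntactic constructs mentioned above.

```python
class State:
    # Declaring __slots__ replaces the per-object attribute dictionary
    # with a fixed set of fields.
    __slots__ = ("name", "state_id", "children")

    def __init__(self, name: str, state_id: int):
        self.name = name
        self.state_id = state_id
        self.children = []
```

Instances of such a class have no \texttt{\_\_dict\_\_}, and assigning to an undeclared attribute raises an \texttt{AttributeError}.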


\section{Controller Implementation}\label{sec:controller_impl}

The Controller is the main primitive for executing SCCD models.

\subsection{SCCD models}\label{sec:single_instance_xml}

An SCCD model is at the highest level a \emph{class diagram} (CD) of statechart instances that may be created and destroyed at runtime. In this thesis, the class diagram always consists of a single class, of which a single instance is created. To work with these models, we developed an ad-hoc XML format with an extremely simple structure:

\begin{lstlisting}
<single_instance_cd>
  <model_delta>100 us</model_delta>
  <statechart> ... </statechart>
</single_instance_cd>
\end{lstlisting}

The \texttt{<model\_delta>} node is optional and sets the ``model delta'', the smallest possible time duration that can be simulated in the model. The \texttt{<statechart>} node is the root node of a model in our statecharts XML format, described in Section \ref{sec:statechart_xml}.

This format is parsed into a \texttt{SingleInstanceCD} object (Figure \ref{fig:cd_controller}), implementing the \texttt{AbstractCD} interface for SCCD models. The \texttt{SingleInstanceCD} object can be passed to the Controller's constructor.

\subsection{Controller Interface}\label{sec:controller_interface}

The Controller's interface is a primitive for executing SCCD models. The Controller maintains a single global \emph{priority queue} of timestamped events. Events serve as input for one or more statechart instances. (An event with more than one target instance is a multicast or broadcast event.) Timestamps in the Controller are just integers, they have no physical meaning, only logical.

The Controller maintains an integer value of \emph{simulated time}, which is 0 right after the Controller's creation, and always equal to the timestamp of the most recently popped event from the queue. The simulated time can only increase (or stay the same).

These are the Controller's most important methods, towards its environment:
\begin{description}
  \item[\texttt{add\_input(timestamp: int, ..., event\_name: str, ...)}] Adds an input event to the controller's queue.
  \item[\texttt{run\_until(timestamp: int)}] Advances simulated time. In a loop, pops events from the queue and delivers them as input to their target instances. The target instances respond to the input by making a big-step. The call does not return until either (1) the queue is empty, or (2) the next event to be popped would advance simulated time further than the \texttt{timestamp} parameter of the call.

  The following fragment is almost literally taken from the source code:
  \begin{python}
  def run_until(self, now: int):
      for timestamp, entry in self.queue.due(now):
          self.simulated_time = timestamp
          for instance in entry.targets:
              instance.big_step([entry.event])
  \end{python}

  \item[\texttt{next\_wakeup(): int}] Getter. Returns the timestamp of the earliest event in the controller's queue.
\end{description}

All methods are synchronous (taking control of the current thread until they return). Using these 3 methods, the 3 different runtime \emph{platforms} foreseen in the original version of SCCD can be implemented. Section \ref{sec:platforms} contains a description of these platforms and a demonstration of their implementation using the Controller as a primitive.
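
To make the interplay of these three methods concrete, here is a self-contained sketch of a controller-like class built on Python's \texttt{heapq} module. It is not the SCCD \texttt{Controller}: the class name, the reduced signatures, the \texttt{deliver} callback (standing in for the big-step of the target instances) and the choice to return \texttt{None} from \texttt{next\_wakeup} on an empty queue are all assumptions.

```python
import heapq
import itertools
from typing import Callable, List, Optional, Tuple


class MiniController:
    def __init__(self, deliver: Callable[[int, str], None]):
        self.queue: List[Tuple[int, int, str]] = []
        self.counter = itertools.count()  # tie-breaker: FIFO among equal timestamps
        self.simulated_time = 0
        self.deliver = deliver

    def add_input(self, timestamp: int, event_name: str) -> None:
        heapq.heappush(self.queue, (timestamp, next(self.counter), event_name))

    def next_wakeup(self) -> Optional[int]:
        """Timestamp of the earliest queued event, or None if the queue is empty."""
        return self.queue[0][0] if self.queue else None

    def run_until(self, now: Optional[int]) -> None:
        """Handle all events with timestamp <= now (None means: until the queue is empty)."""
        while self.queue and (now is None or self.queue[0][0] <= now):
            timestamp, _, event_name = heapq.heappop(self.queue)
            self.simulated_time = timestamp
            self.deliver(timestamp, event_name)
```

Since events are popped in timestamp order, simulated time in this sketch can only increase or stay the same, as long as no event is added with a timestamp in the past.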

\subsubsection{Integer timestamps}

Figure \ref{fig:controller_int_timestamps} shows how the Controller only ``talks'' integer timestamps with its environment, as well as with its model. If meaning must be given to the Controller's timestamps (e.g. for real-time simulation), it is up to the environment to query the model for its ``model delta'', which is a time duration of which the controller's timestamps are multiples (see also Section \ref{sec:model_delta}).

These are the motivations for making the Controller work with timestamps only as logical entities:
\begin{itemize}
  \item The Controller's environment may not care about real-time simulation: E.g. when running a model test (a model + timed inputs + expected, timed outputs), we are only interested in the correctness of the model with respect to its output, and run the model as fast as possible.
  \item It allows the Controller's timestamps to be interpreted as a wide range of time units. This could also be done by using our \texttt{Duration} type (which also offers a wide range of units) instead of raw integers, but its performance is worse, because of possible unit-conversions.
  \item Following the single-responsibility principle of OO-design, the Controller's purpose is to queue incoming events, and to handle them in the right order, when simulated time is requested to advance, and nothing more.
\end{itemize}

\begin{figure}
\centering
\includegraphics[width=1.0\textwidth]{controller_int_timestamps}
\caption{The Controller only ``talks'' integer timestamps}
\label{fig:controller_int_timestamps}
\end{figure}

\subsubsection{Output}

It was mentioned before that statecharts can produce output in several ways:
\begin{enumerate}
  \item As \emph{output events}, produced as the result of a big-step.
  \item By assigning values to \emph{output variables}, that can be read at any time by the environment.
  \item By calling \emph{synchronous functions} from action code, i.e. during the firing of a transition.
\end{enumerate}
Of these, SCCD supports (1) and (3). Output variables are currently not supported.

In BSML, output events are unordered sets, produced at the end of every big-step. The Controller deviates here by delivering every output event immediately, when it is raised, via a callback function, i.e. during transition firing and from the controller's thread. This has the following benefits:
\begin{itemize}
  \item It saves the overhead of adding output events to an output queue.
  \item If desired, the environment can still queue output events.
\end{itemize}
The implementation of the callback function should take care not to block, as it will block the model's execution.

No attempt is made to add output events of a big-step together in a ``bag of events'', but a special event signals the end of a big-step, so these ``bags'' can still be constructed (e.g. the test framework does this).
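
A sketch of how an environment could reconstruct such bags from the stream of callbacks is shown below. The class and the name of the end-of-big-step marker are hypothetical; the actual representation of that special event in SCCD may differ.

```python
from typing import List


class BagCollector:
    """Output callback that groups output events per big-step."""

    END_OF_BIG_STEP = "big_step_completed"  # hypothetical marker name

    def __init__(self):
        self.bags: List[List[str]] = []
        self.current: List[str] = []

    def __call__(self, event: str) -> None:
        # Called synchronously from the controller's thread: must not block.
        if event == self.END_OF_BIG_STEP:
            self.bags.append(self.current)
            self.current = []
        else:
            self.current.append(event)
```

An instance of this class can be passed wherever an output callback is expected, since it is callable.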

\subsection{Design}

\begin{figure}
\centering
\includegraphics[width=1.0\textwidth]{CD_Controller}
\caption{Class diagram of Controller and other classes involved in SCCD model execution}
\label{fig:cd_controller}
\end{figure}

Figure \ref{fig:cd_controller} shows the class diagram of the \texttt{Controller} class and several other classes that play a role in SCCD model execution. The Controller has an \texttt{EventQueue} to which input events are added. The Controller also has an \texttt{ObjectManager}, which maintains a list of all created instances during execution (in our case, only a single instance), and checks adherence of these instances to the class diagram (not yet implemented in our branch). \texttt{Instance}s perform \emph{big-steps} in response to a set of input events.

\texttt{AbstractCD} is an interface that loaded SCCD models implement. The \texttt{SingleInstanceCD} class is an ad-hoc implementation of this interface for models that at all times only consist of a single statechart instance. The Controller receives an \texttt{AbstractCD} model as its constructor parameter, and passes it on to the ObjectManager, which uses it to instantiate the \emph{default class} (in our case, the single statechart the model consists of).

The classes below the dotted line, \texttt{EventLoop} and \texttt{EventLoopImplementation}, are not part of the ``core runtime'', and optionally wrap around the controller. They are mentioned in the next section.

\subsection{Runtime platforms}\label{sec:platforms}

\begin{figure}
\centering
\includegraphics[]{platforms}
\caption{Platforms supported in original SCCD.}
Figure taken from \cite{VanMierlo2016}.
\label{fig:platforms}
\end{figure}

An important feature in original SCCD was the ability to select a type of runtime ``platform'' for generated code. The available platforms were (Figure \ref{fig:platforms}):

\begin{description}
  \item[Event Loop] The event loop platform is intended for easy integration with existing 3rd party event loop implementations, as typically found in UI toolkits, like TkInter. This platform can be integrated with any existing event loop implementation that offers functions for
  \begin{enumerate}
    \item scheduling a future callback
    \item clearing a scheduled callback
  \end{enumerate}
  Using these callbacks, the platform will attempt to let the statechart simulation run in sync with wall-clock time, scheduling periods of statechart execution (i.e. responding to statechart input events) as tasks in the 3rd party event loop. The statechart execution would thus run from the same thread as the 3rd party event loop, interleaving it with e.g. UI events.
  The runtime library comes with implementations for TkInter and JavaScript (browser, NodeJS).

  \item[Threads] The threads platform has its own native event loop implementation, using only the target language's standard library. This event loop implementation will also attempt to run the simulation in real-time. When it is started, the threads platform takes over the current thread, so the user is required to write any input or output logic in separate threads.

  \item[Game Loop] The game loop platform is perhaps the simplest of all, as it only offers a function that advances the simulated time to ``now'' (wall-clock time). The function is blocking and runs as-fast-as-possible. Traditionally, with each iteration of a game loop, the function would be called to update ``the world'', followed by rendering the display.
\end{description}

\subsubsection{Implementation with Controller primitive}
In original SCCD \cite{VanMierlo2016}, these platforms were explicitly implemented side-by-side, but shared a lot of common (duplicated) logic. Now, with the Controller's interface as a primitive, all 3 ``platforms'' can be built, as we demonstrate.

The most trivial is \emph{game loop} integration, where with each iteration, the simulated time advances up to the current wall-clock time:

\begin{python}
start_time = now()
while True:
  controller.add_input( ... ) # e.g. keyboard or mouse events
  controller.run_until(now() - start_time)
  render()
\end{python}

The \emph{threads} platform is slightly more complicated, at least if we want to run the simulation in real time. We use a \texttt{threading.Condition} object instead of the sleep-function, in order to be woken up when there is an input:

\begin{python}
input_queue = queue.Queue() # thread-safe queue
condition = threading.Condition()

def controller_thread():
  while True:
    while not input_queue.empty(): # drain pending input (single consumer)
      controller.add_input(..., input_queue.get(), ...)
    controller.run_until(now()) # this may take a while
    sleep_duration = controller.next_wakeup() - now()
    if sleep_duration > 0:
      with condition:
        condition.wait(sleep_duration)

def add_input(event):
  input_queue.put(event)
  with condition:
    condition.notify() # wake up controller thread

thread = threading.Thread(target=controller_thread)
thread.start()
\end{python}

The \emph{event loop} platform is similar, but instead of sleeping, we schedule a future callback in the event loop we are integrating with. Also, we do not need a thread-safe input queue: since all code runs in the event loop's thread, it is safe to call \texttt{controller.add\_input} directly. The event loop platform is as follows:

\begin{python}
scheduled_id = None
def run():
  global scheduled_id
  controller.run_until(now()) # this may take a while
  sleep_duration = controller.next_wakeup() - now()
  # never schedule a negative delay: if we are late, run again as soon as possible
  scheduled_id = schedule(max(sleep_duration, 0), run)
scheduled_id = schedule(0, run) # start controller when event loop starts

def add_input(event):
  controller.add_input(..., event, ...)
  # "wake up":
  cancel(scheduled_id)
  run()
\end{python}

Finally, an example from our \emph{test framework}, where for the execution of a test case, all input events are known from the beginning, and simulation runs as fast as possible. The controller runs in its own thread, so that test output can be verified in parallel:

\begin{python}
for i in test_case.input:
  controller.add_input(..., i.event, ...)

pipe = queue.Queue() # thread-safe queue

def controller_thread():
  controller.run_until(None) # None here means: +infinity (return when event queue empty)
  pipe.put(None) # signal end of run

thread = threading.Thread(target=controller_thread)
thread.start()

while True:
  output = pipe.get()
  if output is None: # end-of-run signal from the controller thread
    break
  ... # verify test output
\end{python}

\subsubsection{Event loop library}

Part of the SCCD project is a library implementing the logic from the above \emph{event loop} example as a class, called \texttt{EventLoop}. Figure \ref{fig:cd_controller} shows how this class wraps around the Controller. In reality, the class does more than shown in the example, because:

\begin{enumerate}
  \item The schedule-callback of the chosen event loop implementation may expect a timeout in a different unit than the (Python) time function in use, which may in turn differ from the time unit expected by the model itself. See also Figure \ref{fig:controller_int_timestamps}.

  \item If the simulation continuously cannot keep up with the wall clock time (e.g. because the computer is too slow), with a naive implementation, invocations of \texttt{run\_until} will take longer with each invocation, increasingly starving other tasks (such as adding input events).
\end{enumerate}

To solve the first problem, event loop implementations are communicated to the \texttt{EventLoop} class in a declarative manner (see \texttt{EventLoopImplementation} in Figure \ref{fig:cd_controller}). Similarly, the time function and its unit are also declared (not pictured). Because time unit conversions have to happen all the time, the conversion ratios are pre-calculated for efficiency.

To solve the second problem, it is checked whether the \texttt{sleep\_duration} variable calculated is a negative number. This indicates wall-clock time running faster than our simulation. If this is the case, the negative number is added to the parameter of \texttt{run\_until} in the next round (reducing the amount of time-to-simulate), purposefully making the model run at a reduced speed. This solution was evaluated to effectively keep the model responsive to input under heavy load. The simulation would also recover (``catch up'') with wall-clock time when load was reduced.
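
The adjustment described above can be sketched as two small pure functions. The function names are hypothetical, and the sketch follows a literal reading of the paragraph; whether the actual \texttt{EventLoop} class accumulates the lag over several rounds is not covered here.

```python
from typing import Tuple


def plan_next_round(now: int, next_wakeup: int) -> Tuple[int, int]:
    """Return (delay before the next round, adjustment for its run_until target)."""
    sleep_duration = next_wakeup - now
    if sleep_duration >= 0:
        return sleep_duration, 0   # on schedule: sleep, no adjustment
    return 0, sleep_duration       # behind: run again at once, but simulate less


def next_target(now: int, adjustment: int) -> int:
    """Target passed to run_until in the next round (adjustment is <= 0)."""
    return now + adjustment
```

For example, if the earliest queued event is 20 time units overdue, the next round runs immediately and simulates up to 20 units before the current wall-clock time.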

% ^ these problems apply to all real-time types of execution


% TODO:
  %    - model I/O primitives (input: events, output: synchronous callbacks)
  % - overview: controller's "global" event loop


\section{Important changes from original SCCD}\label{sec:changes}

As mentioned in the beginning of this chapter, our fork of SCCD differs significantly from the original. The most drastic change is that SCCD is no longer a code generator. We'll first motivate our decision of abandoning the code generation approach. Next, we explain how the 3 different runtime platforms (each with their own interface) are no longer a fundamental part of the ``core'' runtime, and instead use a single primitive interface usable by all 3 platforms. Finally, we list a number of smaller improvements.


\subsection{No more code generation}
% Intermediate language
In order to support multiple target languages without having to replicate much of the compilation logic, the original SCCD compiler transformed input models to an intermediate procedural language. This language did not have a textual syntax, only abstract. It was developed to be the greatest common divisor of the target languages supported. All supported target languages (Python, JavaScript, C\#) were similar enough (procedural, object-oriented and garbage collected) to make this approach work.

% Can not support all languages
However, it would have been non-trivial to add a language like C, because of its manual memory management. This limitation contributed to abandoning this compilation strategy.

% Difficult to understand code:
Another, more important problem was that the part of the compiler that generated the intermediate language code (or better: the AST) was very hard to understand. The construction of the AST looked nothing like readable code. Some improvement was made by using a stateful ``writer'' object to build the syntax tree, but identifiers in the target language were still strings in the compiler code, making it impossible to statically check them for errors. Example:

\begin{python}[caption={A fragment of \texttt{generic\_generator.py} of original SCCD},captionpos=b]
  self.writer.beginConstructor()
  self.writer.addFormalParameter("controller")
  self.writer.beginMethodBody()
  self.writer.beginSuperClassConstructorCall("ObjectManagerBase")
  self.writer.addActualParameter("controller")
  self.writer.endSuperClassConstructorCall()
  self.writer.endMethodBody()
  self.writer.endConstructor()
\end{python}

Finally, perhaps the most important reason for abandoning the code generator was that the execution runtime library had already become the place where most of the complexity was located: it was where most of the semantic variability was implemented, as well as a few runtime optimizations (such as transition candidate caching).

In order to extend SCCD with additional semantic variability in a clean way, a decision had to be made: move all complexity to the code generator, or to the execution runtime. For the reasons just mentioned, the latter was chosen.

Our decision to go with the runtime approach does not rule out the future addition of a code generator to the SCCD project: the insights obtained from implementing statechart semantic variability as a runtime may be used as a foundation for a code generator.

\subsection{No more explicit runtime platforms}

In original SCCD, there were 3 types of controllers, each implementing one of the 3 runtime platforms described in Section \ref{sec:platforms}.

\subsubsection{Limitations}
The original, side-by-side implementation of the 3 platforms had some shortcomings:
\begin{itemize}
  \item Complex design: The available platforms were implemented in the SCCD runtime library, so, in principle, they should have been selectable independently of the compilation step. However, due to constraints in the original design, caused by heavy reliance on class inheritance (with many overrides), the platform had to be selected at compile time, and the generated code depended on a specific platform. This is shown in Figure \ref{fig:cd_old_controller}.

\begin{figure}
\centering
\includegraphics[width=1.0\textwidth]{CD_old_controller_platforms2}
\caption{Class diagram of original SCCD's implementation of the 3 platforms}
\label{fig:cd_old_controller}
\end{figure}

  \item Use of floating point numbers for timestamps: Older Python versions use floats for timestamps, so this choice seemed natural. However, floats have non-uniform precision. At some point, this led to a problem where a very small update of the simulated time was rounded down, unexpectedly leaving the simulated time unchanged and freezing the simulation. A workaround was developed that detected this issue and forced the result to be rounded up. However, integer timestamps are a much safer choice because of their uniform, predictable precision. Also, since Python 3.7, a more precise time function exists, returning time as an integer.

  \item Wall-clock time-aware: All \texttt{Controller} classes would query the current wall-clock time, to attempt to let the simulation run in (soft) real time. In some cases, however, it is desirable to run a simulation at lower or higher speed. For instance, when (non-interactively) testing a model, the simulation typically needs to run at the highest possible speed, to produce results (test passed/failed) as soon as possible. Simulations for scientific research would also be run at the highest possible speed, if one is only interested in the eventual outcome.
\end{itemize}
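The precision problem from the second item can be reproduced in a few lines. The sketch below is an illustration only: the magnitudes (a timestamp of about $1.7 \cdot 10^9$ seconds and an increment of 100 nanoseconds) are assumptions chosen to trigger the effect, not values taken from original SCCD. The function \texttt{time.perf\_counter\_ns} is one of the integer-returning time functions added in Python 3.7.

```python
import time

# Float seconds: near 1.7e9, adjacent doubles are about 2.4e-7 s apart,
# so adding 100 ns (1e-7 s) is rounded away and the time does not advance.
t_float = 1.7e9
float_is_stuck = (t_float + 100e-9 == t_float)

# Integer nanoseconds: precision is uniform, every increment is preserved.
t_int = 1_700_000_000 * 10**9
int_advanced = (t_int + 100 != t_int)

# Since Python 3.7, the standard library can return time as an integer.
now_ns = time.perf_counter_ns()

print(float_is_stuck, int_advanced, type(now_ns).__name__)
```

With integer timestamps, the smallest representable step is the same at every point in time, so no rounding workaround is needed.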

\subsubsection{Solution}
As explained in Section \ref{sec:platforms}, there is now a single Controller primitive that can be used to create any of the 3 runtime platforms, and can be used to run the simulation at any speed relative to wall-clock time, including as-fast-as-possible execution. The event loop platform is included as a library that wraps around the controller.

\subsection{Single event queue}

In the original version of SCCD, every statechart instance had its own event queue, and the Controller had a separate event queue as well. The vague intent was to allow parallel execution of instances, but this goal was never accomplished. The existence of multiple event queues made it complex to find the timestamp of the next event to be processed (all queues had to be checked).

Now, there is only a single event queue, in the Controller, containing the input events for all statechart instances in the runtime. Hypothetically, parallel execution is still possible by making the event queue thread-safe, and sharing it among ``worker threads''.
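To illustrate why a single queue simplifies finding the next timestamp, consider the following sketch based on Python's \texttt{heapq} module. It is not the queue implementation used in SCCD; the class \texttt{EventQueue} and its method names are hypothetical, and timestamps are assumed to be integers.

```python
import heapq
import itertools


class EventQueue:
    """Single time-ordered queue shared by all instances (illustrative sketch)."""

    def __init__(self):
        self._heap = []
        # Unique sequence numbers keep insertion order for equal timestamps
        # and prevent comparisons between targets or events.
        self._sequence = itertools.count()

    def add(self, timestamp, target, event):
        heapq.heappush(self._heap, (timestamp, next(self._sequence), target, event))

    def earliest_timestamp(self):
        # The next event to process is at the top of the heap: no scan needed.
        return self._heap[0][0] if self._heap else None

    def pop_due(self, now):
        # Yield all events with a timestamp up to and including 'now'.
        while self._heap and self._heap[0][0] <= now:
            timestamp, _, target, event = heapq.heappop(self._heap)
            yield timestamp, target, event
```

Because each entry carries its target instance, one queue can serve every instance, and the earliest timestamp is available without inspecting per-instance queues.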

\subsection{Duration units}
Treating time durations as numbers is confusing, because the unit may differ depending on the context. All time durations in after-transitions in original SCCD were expressed in seconds, as floating point values. Now, every time duration has a mandatory suffix, and the action language's static type system treats durations as their own type, which cannot simply be cast to or from a number. Suffixes are `fs' (femtoseconds), `ps' (picoseconds), `ns' (nanoseconds), `us' (microseconds), `ms' (milliseconds), `s' (seconds), `m' (minutes), `h' (hours) and `D' (days).
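The idea of durations as a distinct type can be sketched in Python as follows. This is an assumption-laden illustration, not the action language's implementation: the class \texttt{Duration}, its \texttt{parse} method and the choice of femtoseconds as internal unit are hypothetical; only the list of suffixes is taken from the text.

```python
from dataclasses import dataclass

# Number of femtoseconds per unit, keyed by the suffixes listed in the text.
FS_PER_UNIT = {
    "fs": 1, "ps": 10**3, "ns": 10**6, "us": 10**9, "ms": 10**12,
    "s": 10**15, "m": 60 * 10**15, "h": 3600 * 10**15, "D": 86400 * 10**15,
}


@dataclass(frozen=True)
class Duration:
    femtoseconds: int

    @classmethod
    def parse(cls, text):
        # Try two-letter suffixes first, so "ms" is not read as "m" or "s".
        for suffix in sorted(FS_PER_UNIT, key=len, reverse=True):
            if text.endswith(suffix):
                return cls(int(text[:-len(suffix)]) * FS_PER_UNIT[suffix])
        raise ValueError(f"duration without unit suffix: {text!r}")

    def __add__(self, other):
        # Mixing durations with plain numbers is rejected with a TypeError.
        if not isinstance(other, Duration):
            return NotImplemented
        return Duration(self.femtoseconds + other.femtoseconds)
```

For example, \texttt{Duration.parse("1ms") + Duration.parse("500us")} equals \texttt{Duration.parse("1500us")}, whereas adding a plain number to a duration raises a \texttt{TypeError}. In SCCD this check is performed statically; the sketch only mimics it at run time.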

\subsection{More powerful test framework}
Since we are interested in comparing the behavior of different semantic configurations, it is useful to see how the same statechart model behaves if we change the semantics. In original SCCD, the semantic configuration was inseparable from the model, which makes sense, because a model is usually developed with only one specific semantic configuration in mind. In our version of SCCD, a model still carries its semantic configuration, but it is now also possible to share statechart models between test cases and override the semantics, achieving our goal.

Another useful feature is that when defining a semantic configuration for a test case, for each semantic aspect, one can use the wildcard symbol ``$*$'' or a comma-separated list to express that multiple options should cause the test to succeed. If multiple options are chosen for multiple semantic aspects, all configurations making up the Cartesian product are tested.
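The expansion into a Cartesian product can be sketched with \texttt{itertools.product}. The function \texttt{expand\_configurations} and the dictionary-based representation of a configuration are assumptions made for illustration; the test framework's actual data structures may differ.

```python
from itertools import product


def expand_configurations(spec, available):
    """Expand wildcards and comma-separated lists into concrete configurations.

    spec:      maps each semantic aspect to "*", one option, or "opt1, opt2".
    available: maps each semantic aspect to the list of all its options.
    """
    aspects = list(spec)
    choices_per_aspect = []
    for aspect in aspects:
        value = spec[aspect].strip()
        if value == "*":
            # Wildcard: every option of this aspect is acceptable.
            choices_per_aspect.append(list(available[aspect]))
        else:
            choices_per_aspect.append([opt.strip() for opt in value.split(",")])
    # One concrete configuration per element of the Cartesian product.
    return [dict(zip(aspects, combo)) for combo in product(*choices_per_aspect)]
```

A specification with a wildcard over an aspect with two options, combined with a list of two options for another aspect, thus yields four configurations, each of which is run as a separate test.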

Finally, one can denote tests that are expected to fail with the filename prefix \texttt{fail\_} instead of the usual prefix \texttt{test\_}. This feature is mostly used to check whether syntactically invalid models are detected as such.