checking requirements:Sequence Diagrams, Regular Expressions and State Automata

Practical information

Due date: Wednesday 14 October before 23:55
Team size == 2 (pair design/programming) !
Each team submits only one full solution. Use the index.html template. Use exactly this format to specify names and IDs of the team members. The other team member must submit a single index.html file containing only the coordinates of both team members. This will allow us to put in grades for both team members in BlackBoard. Beware: after the submission deadline there is no way of adding the other team member's index.html file and thus no way of entering a grade !
Your submission must be in the form of a simple HTML file (index.html) with explicit references to all submitted files as well as inline inclusion of images.
The submission medium is BlackBoard.
Contact: Prof. Vangheluwe.

Requirements

Introduction

In this assignment, you will use UML Class Diagrams, UML Sequence Diagrams, Regular Expressions, and State Automata modelling languages to design and verify a communication protocol of a client-server system. In particular, the publish-subscribe software architecture messaging pattern is used. This pattern is also used in the Data Distribution System (DDS) protocol commonly used in industrial automation systems. The client part of the system has a user interface from which the user can input messages and send them. The client also receives messages from a chat room which it has subscribed to. A chat room is a subject (publisher) and a client is an observer (subscriber) in the Observer Pattern. It receives messages from its clients and broadcasts these messages to subscribed clients. A chat room can have 0 or more clients. A client can be connected to 0 or more chat rooms at any time. In your design, there will be no actual interaction with the human user. Rather, the clients simulate human operations (i.e., sending connection requests and sending messages) in a random way.

You are given an implementation chatProtocolSimulation.py along with the object-interaction trace (in a file) obtained from the existing implementation (which you can generate yourself using the implemenation). You need to model, based on the requirements given below, first a collection of Sequence Diagrams, visually representing the requirements. From this, you will derive a set of Regular Expressions, and derive from that, a set of FSAs, which you will subsequently encode, to automatically test whether the system implementation complies with the system (protocol) specification given in the requirements.

The following is a textual description of a simple protocol:

There are 6 clients and 2 chat rooms in the system. The clients must be connected to a chat room before they can randomly send messages.
The system works with discrete time steps: at each time step, exactly one random client will connect or send a message to a random chat room.
Initially, the clients are not connected. They try to connect to a random chat room in each round. The requested chat room immediately receives the request (no network delays).
A chat room can accept at most 4 clients. It accepts a connection request if and only if its capacity is not exceeded.
The requesting client receives acceptance or rejection from the chat room immediately (no network delays).
When connected, a client sends random messages to the chat room in each round. The chat room immediately receives the message. It processes the message and broadcasts it to all the other clients (except the sender) connected to it.
The clients immediately receive the broadcast messages.
For simplicity, disconnection and reconnection are not modelled.

Interaction behaviour to be verified (use cases):

When a client sends a connection request to a chat room, the chat room immediately responds by printing to the output.
On receiving a connection request, the chat room immediately makes a decision whether to accept the client or reject it.
When a chat room accepts a client, the client immediately receives the acceptance and dumps to the output.
When a chat room rejects a client, the client also immediately dumps the rejection.
When a client sends a message, the chat room immediately receives that same message and prints it to the output.
When a chat room receives a message, it broadcasts that same message to all connected clients except the sender.
The sender cannot receive its own message after it sends it.

Tasks you need to finish step by step:

Draw a class diagram (loosely) corresponding to the implementation chatProtocolSimulation.py. You can find inspiration in the Observer Pattern.
Optional(attempt this only after you've finished the rest of the assignment!): you may refactor the source code to more closely reflect the Observer Pattern (and hence, its Class Diagram :).
Design the dynamic interaction behaviour in UML Sequence Diagrams for ONLY use cases 3 and 7. Use the right classes and method calls, by looking at the given implementation, as you've also noted in your Class Diagram.
Write regular expressions (refer to the format of the given output trace) for verifying the above use cases (in this assignment, you only need to verify TWO use cases: use case 3 and use case 7); the following is an example:
```
Regular expression pattern for rule 1:
##[^\n]*\n$CL (\d+)$ RS (\d+)\.\n##[^\n]*\n$CR \2$ RR \1\.\n

The following output will match the above pattern (multiple lines thanks to the \n):
## (Client 2) A connection request is sent to chat room 1.
(CL 2) RS 1.
## (Chat room 1) Received connection request from client 2.
(CR 1) RR 2.
```
Clarification: the above uses a Regular Expression notation commonly used in UNIX Regular Expressions (as used in the stream editor sed for example) which differs from the examples given in class. In addition to the slightly different notation, the expressive power of these Regular Expressions is also higher as it allows for memory of matched expressions making the patterns context dependent.
- [eE] stands for e or E.
- [a-z] stands for one of the characters in the range a to z.
- ^ means "match at the beginning of a line/string".
- $ means "match at the end of the line/string".
- [^x] means not x, so ##[^\n]*\n matches a comment line: ## followed by 0 or more non-newline characters, followed by newline.
- . matches any single character.
- X? matches 0 or 1 repititions of X.
- X* matches 0 or more repititions of X.
- X+ matches 1 or more repititions of X.
- \ is used to escape meta-characters such as (. If you want to match the character (, you need the pattern \(.
- The ( and ) meta-characters are used to memorize a match for later use. They can be used around arbitrarily complex patterns. The first (\d+) in the above regular expression matches any non-empty sequence of digits (assuming d has been defined elsewhere as [0-9]). The matched pattern is memorized and can be referred to later by using \1. Following matched bracketed patterns are referred to by \2, \3, etc. Note that you will need to encode powerful features such as this one by adding appropriate actions (side-effects) to your automaton encoding the regular expression. This can easily be done by storing a matched pattern in a variable and later referring to it again.
You are welcome to use different variant notations (such as the one used in the Python Regular Expression module) as long as you explain your notation.
Design a FSA which encodes the regular expressions for verification;
Implement this FSA for verification in the provided code framework (see scanner.py);
Run this FSA implementation (which in turn implements the Regular Expressions which in turn encode the checking of interaction behaviour use cases which were modelled as Sequence Diagrams) on the given chatProtocolSimulation.trace.txt to verify the specification.
There is an intentional bug in the implementation (chatProtocolSimulation.py) which makes it fail to satisfy the system specification. You need to figure it out by verifying the output trace with your FSA implementations, and fix it (just one line).
Write a report that explains your solution of this assigment. Include your models and discuss them.
It is crucial that all your models are consistent and that you can hence trace back and forth between them: every part of a requirement corresponds to a particular sequence diagram which in turn corresponds to its encoding as a regular expression, which in turn corresponds to a particular FSA.

Starting Point/hints:

The program implementing the system (simulation): chatProtocolSimulation.py
The interaction trace generated from the above program: chatProtocolSimulation.trace.txt
To make your automata smaller, we use abbreviations to shorten the messages that you need to recognize. Here are the mappings:
```
CL := Client
CR := Chat room
RS := A connection request is sent to chat room
RR := Received connection request from client
AC := Accepted client
AB := Accepted by chat room
RC := Rejected client
RB := Rejected by chat room
SY := Says
RM := Received message from client
SM := Sent message to all connected clients except client
```
As you can see, any message in the output file which starts with "##" is some comments to make the output readable. You need to ignore these messages when you do verification.
Hint: to check a particular requirement, you can ignore lines in the trace pertaining to other requirements. You may, if that makes your life easier, pre-process the trace file: you may first filter out all lines not pertaining to the requirement(s) you're checking. You may for example do this with sed. Another option is to use re.sub() from the Python regular expression library. regex101.com allows you to write both match and substitute patterns, try them out, and generate Python code.
You can also leave the trace file as-is and do the filtering of non-relevant lines in your own regexp/FSA.
Hint: Your solution looking for errors needs to run over the entire output trace file. It may of course stop the instant an offending pattern is found, or when an expected pattern is only partially matched, depending on how you chose to do the checking.
Hint: Note that you may create one regexp/FSA/test per requirement, or you may combine them into one.
In principle, your tests should work on any output trace file that chatProtocolSimulation.py generates, including the given one. You may have noticed the use of random.randint() in the source code. As a consequence, every generated trace file is likely to be different. The bug in the given implementation does however occur in all traces (which makes it easier to find than intermittently occuring ones).
Code used to implement the FSA: the scanner is in scanner.py which requires an input stream class CharacterStream found in charStream.py.
Example automata evenBinaryAutomaton.py and numberAutomaton.py recognize even binary numbers (encoded as a string of bits) and floating point numbers respectively. These automata are used in test.py which produces test.output.txt.

Material (useful links)

Use cases: http://www.cs.mcgill.ca/~joerg/SEL/COMP-533_Handouts_files/COMP-533%204%20Use%20Cases.pdf
Class diagrams: http://www.uml-diagrams.org/class-diagrams-overview.html
Sequence diagrams: http://www.uml-diagrams.org/sequence-diagrams.html, UML2 Sequence Diagrams
Regular expressions: http://www.zytrax.com/tech/web/regex.htm
Test out regular expressions online: https://regex101.com/
Example: Regular Expression patterns recognition
Finite state automata: http://cs.stanford.edu/people/eroberts/courses/soco/projects/2004-05/automata-theory/basics.html
You may want to use plantUML as a lightweight tool to produce your UML diagrams.
diagrams.net allows online drawing of many different types of diagrams. Local installation of drawio is also possible.
umple is a more sophisticated visual/textual modelling/programming environment.

Maintained by Hans Vangheluwe .

Last Modified: 2021/09/28 22:28:58.