In this assignment, you will use the UML Class Diagrams, UML Sequence Diagrams,
Regular Expressions, and
State Automata modelling languages to design and verify a communication protocol of a client-server
system. The client part of the system has a user interface from which the user
can input messages and send them. The client also receives messages from
a chat room which it has subscribed to. A chat room is a server.
It receives messages from its clients and broadcasts these messages to subscribed
clients. A chat room can have 0 or more clients. A client can
be connected to 0 or more chat rooms at any time. In your design,
there will be no actual interaction with the human user.
Rather, the clients simulate human operations
(i.e., sending connection requests and sending messages) in a
random way.
You are given
an implementation, along with an object-interaction trace (in a file) produced by that implementation.
You first model a set of Regular Expressions and then a set of FSAs, which you
subsequently encode in order to verify automatically whether
the system implementation complies with the system (protocol)
specification given in the requirements below (and modelled visually in Sequence Diagram form).
The following is a textual description of a simple protocol:
There are 6 clients and 2 chat rooms in the system. The clients
must be connected to a chat room before they can randomly send
messages.
The system works in a round-robin fashion.
In each round, a random client will connect or send a message to
a random chat room.
Initially, the clients are not connected. They try to connect to
a random chat room in each round. The requested chat room immediately
receives the request (no network delays).
A chat room can accept at most 3 clients. It accepts a
connection request if and only if its capacity is not exceeded.
The requesting client receives acceptance or rejection from the
chat room immediately (no network delays).
When connected, a client sends random messages to the chat room
in each round. The chat room immediately receives the message.
It processes the message and broadcasts it to all the other clients
(except the sender) connected to it.
The clients immediately receive the broadcast messages.
For simplicity, disconnection and reconnection need not be modelled.
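The capacity and broadcast rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name ChatRoom, its methods, and the return values are hypothetical and are not taken from chatprotocol.py, and "capacity is not exceeded" is read as "fewer than 3 clients are currently connected".

```python
class ChatRoom:
    """Hypothetical chat room; NOT the class from chatprotocol.py."""

    CAPACITY = 3  # "A chat room can accept at most 3 clients."

    def __init__(self, room_id):
        self.room_id = room_id
        self.clients = []  # ids of connected clients, in connection order

    def request_connection(self, client_id):
        """Accept the client iff accepting it does not exceed the capacity."""
        if len(self.clients) < self.CAPACITY:
            self.clients.append(client_id)
            return True
        return False

    def broadcast(self, sender_id, message):
        """Return (receiver, message) pairs for every client but the sender."""
        return [(c, message) for c in self.clients if c != sender_id]


room = ChatRoom(1)
# Four clients ask to join; the fourth finds the room full.
decisions = [room.request_connection(c) for c in (1, 2, 3, 4)]
# Client 2 speaks; only clients 1 and 3 receive the message.
deliveries = room.broadcast(2, "hello")
```

Returning the deliveries as a list, rather than printing them, is a choice made here so that the behaviour can be checked directly.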
Interaction behaviour to be verified (use cases):
1. When a client sends a connection request to a chat room, the
chat room immediately responds by printing to the output.
2. On receiving a connection request, the chat room immediately
makes a decision whether to accept the client or reject it.
3. When a chat room accepts a client, the client immediately
receives the acceptance and dumps to the output.
4. When a chat room rejects a client, the client also immediately
dumps the rejection.
5. When a client sends a message, the chat room immediately
receives it and prints the message to the output.
6. When a chat room receives a message, it broadcasts it to all
connected clients except the sender.
7. The sender cannot receive its own message after it sends it.
Six tasks you need to finish step by step:
Design the dynamic interaction behaviour in UML Sequence Diagrams for ONLY use cases 2 and 7 (using, e.g., dia, the drawing tool Inkscape, or a piece of paper);
Write regular expressions (refer to the format of the given output trace)
for verifying the above use cases (in this assignment, you only need to verify TWO use cases: use case 2 and use case 7);
the following is an example:
Regular expression pattern for rule 1:
##[^\n]*\n\(CL (\d+)\) RS (\d+)\.\n##[^\n]*\n\(CR \2\) RR \1\.\n
The following output will match the above pattern (multiple lines!):
## (Client 2) A connection request is sent to chat room 1.
(CL 2) RS 1.
## (Chat room 1) Received connection request from client 2.
(CR 1) RR 2.
Clarification: the pattern above uses the Regular Expression notation common on UNIX
(as used in the stream editor sed, for example), which differs from the examples given in class.
In addition to the slightly different notation, these Regular Expressions are also more
expressive: they can remember matched sub-expressions, which makes the patterns context dependent.
[eE] stands for e or E.
[a-z] stands for one of the characters in the range a to z.
^ means "match at the beginning of a line/string".
$ means "match at the end of the line/string".
[^x] means not x, so
##[^\n]*\n matches a comment line: ## followed
by 0 or more non-newline characters, followed by newline.
. matches any single character.
X? matches 0 or 1 repetitions of X.
X* matches 0 or more repetitions of X.
X+ matches 1 or more repetitions of X.
\ is used to escape meta-characters such as (.
If you want to match the character (, you need the pattern \(.
The ( and ) meta-characters are used to memorize
a match for later use. They can be used around arbitrarily complex patterns.
The first (\d+) in the above regular expression matches any non-empty sequence of
digits (assuming d has been defined elsewhere as [0-9]).
The matched text is memorized and can be referred to later by using
\1. Subsequent parenthesized patterns are referred to by \2, \3, etc.
Note that you will need to encode powerful features such as this one by
adding appropriate actions (side-effects) to your automaton encoding the regular
expression. This can easily be done by storing a matched pattern in a variable
and later referring to it again.
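As a rough sketch of what "storing a matched pattern in a variable" can look like, the automaton below checks rule 1 with three states and a small memory dictionary that plays the role of \1 and \2. Everything in it is an assumption made for illustration: the function names, the state names, and the decision to step over whole non-comment lines instead of single characters. It does not use, and is not modelled on, the scanner.py framework.

```python
def parse_event(line):
    """Split a line such as "(CL 2) RS 1." into (kind, id, action, target)."""
    kind, ident, action, target = line.split()
    return kind.lstrip("("), ident.rstrip(")"), action, target.rstrip(".")


def check_rule1(lines):
    """Run a three-state automaton over non-comment trace lines."""
    state = "START"
    memory = {}  # plays the role of the back-references \1 and \2
    for line in lines:
        kind, ident, action, target = parse_event(line)
        if state == "START" and (kind, action) == ("CL", "RS"):
            # Side effect: remember who asked and which room was asked.
            memory["client"], memory["room"] = ident, target
            state = "SENT"
        elif (state == "SENT" and (kind, action) == ("CR", "RR")
              and ident == memory["room"] and target == memory["client"]):
            # The stored values are compared against the current line.
            state = "ACCEPT"
        else:
            return False  # no transition defined: reject
    return state == "ACCEPT"


ok = check_rule1(["(CL 2) RS 1.", "(CR 1) RR 2."])
bad = check_rule1(["(CL 2) RS 1.", "(CR 1) RR 3."])
```

Here ok is True and bad is False: the second trace names client 3 where the stored client id is 2, so no transition out of the SENT state applies.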
You are welcome to use different variant notations (such as the one used in the
Python Regular Expression module) as long as you explain your notation.
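For reference, the rule 1 pattern can be tried unchanged in Python's re module, which supports the same groups and back-references. The snippet below is a small sketch of this; the variable names are chosen here for illustration, and the trace text is the four-line example shown earlier.

```python
import re

# Rule 1 pattern from the text, written as raw strings so that "\n", "\d"
# and "\1" reach the re module unchanged.
RULE1 = re.compile(
    r"##[^\n]*\n\(CL (\d+)\) RS (\d+)\.\n"
    r"##[^\n]*\n\(CR \2\) RR \1\.\n"
)

trace = (
    "## (Client 2) A connection request is sent to chat room 1.\n"
    "(CL 2) RS 1.\n"
    "## (Chat room 1) Received connection request from client 2.\n"
    "(CR 1) RR 2.\n"
)

match = RULE1.fullmatch(trace)
client_id, room_id = match.groups()  # the two memorized (\d+) groups

# Changing the client id on the last line breaks the \1 back-reference.
mismatch = RULE1.fullmatch(trace.replace("RR 2", "RR 3"))
```

After running this, client_id is "2", room_id is "1", and mismatch is None.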
Design an FSA which encodes the regular expressions for verification;
Implement this FSA for verification in the provided code framework (see scanner.py);
Run this FSA implementation (which in turn implements the Regular Expressions
which in turn encode the checking of interaction behaviour use cases which were modelled as Sequence Diagrams)
on the given output trace to verify the specification.
There is an intentional bug in the implementation (chatprotocol.py)
which makes it fail to satisfy the system specification.
You need to figure it out by verifying the output trace with your FSA implementation, and fix it (just one line).
Write a report that explains your solution to this assignment. Include your models (scan them if they are drawings) and discuss them.
The output trace generated from the above program: output
To make life easier, we use abbreviations to shorten the messages that you need to recognize in your RegExp/FSA. Here are the mappings:
CL := Client
CR := Chat room
RS := A connection request is sent to chat room
RR := Received connection request from client
AC := Accepted client
AB := Accepted by chat room
RC := Rejected client
RB := Rejected by chat room
SY := Says
RM := Received message from client
SM := Sent message to all connected clients except client
As you can see, any line in the output file that starts with "##" is a comment, included to make the output readable.
You need to ignore these lines when you do the verification.
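A minimal Python sketch of these two conventions follows: dropping the "##" comment lines and expanding the abbreviations back into readable text. The helper names strip_comments and expand are hypothetical, and the expansion is only tried on the connection-request line shown earlier; the exact layout of the other line types (for example SY lines) is not given here, so it is not assumed.

```python
import re

# Abbreviation table copied from the mappings above.
ABBREVIATIONS = {
    "CL": "Client",
    "CR": "Chat room",
    "RS": "A connection request is sent to chat room",
    "RR": "Received connection request from client",
    "AC": "Accepted client",
    "AB": "Accepted by chat room",
    "RC": "Rejected client",
    "RB": "Rejected by chat room",
    "SY": "Says",
    "RM": "Received message from client",
    "SM": "Sent message to all connected clients except client",
}

# Matches any abbreviation that stands alone as a word.
ABBREV_RE = re.compile(r"\b(" + "|".join(ABBREVIATIONS) + r")\b")


def strip_comments(trace_text):
    """Return the non-empty trace lines that do not start with '##'."""
    return [line for line in trace_text.splitlines()
            if line and not line.startswith("##")]


def expand(line):
    """Replace each abbreviation in a trace line by its long form."""
    return ABBREV_RE.sub(lambda m: ABBREVIATIONS[m.group(1)], line)


sample = (
    "## (Client 2) A connection request is sent to chat room 1.\n"
    "(CL 2) RS 1.\n"
    "## (Chat room 1) Received connection request from client 2.\n"
    "(CR 1) RR 2.\n"
)
events = strip_comments(sample)
readable = expand(events[0])
```

On this sample, events holds the two abbreviated lines, and readable reproduces the text of the first comment line without its "## " prefix.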
Programs used to implement the FSA: the scanner is in scanner.py,
which requires the input stream class CharacterStream found in charstream.py.