CS304 Assignment 1
Solution
By Jean-Sébastien Bolduc, MSDL.
The assignment
See
assignment website.
NOTE: Python version 1.5.2 and PyUnit version 1.4.1 were used for this assignment.
* Download all files *
1. Class Diagram
2. Testing and implementation
Test Scripts
- TestCD.py : Testing for class CellData
- TestCC.py : Testing for class CellCoordinate
- TestSD.py : Testing for class SpreadsheetData (both list and dictionary implementations)
- TestSDlist.py calls the test on the list implementation (Prototype 1)
- TestSDdict.py calls the test on the dictionary implementation (Prototype 2)
Prototype 0
- CData.py : Class CellData implementation
- CCoord.py : Class CellCoordinate implementation
Prototype 1
- SSheetLIST.py : Class SpreadsheetData implementation, using LISTS
Prototype 2
- SSheetDICT.py : Class SpreadsheetData implementation, using DICTIONARY
3. Performance testing
result-PI-200.txt
- PTesting.py : Performance testing script used for performance analysis
- Results on a PII 200MHz platform.
- Results on a PIII 733MHz platform.
Performance for FULL systems
Performance for SPARSE systems
Analysis
- As SIZE is the width N of a square matrix, the number
of cells in the matrix is N2. All of these
are filled with data in the FULL case, only 10% in the SPARSE
case. All figures clearly show a quadratic relationship
between SIZE (N) and TIME. This is demonstrated by the
perfect fit of a quadratic function (in red) to the data.
As the number of (filled) cells is proportional to N2
and so is TIME, it must take a fixed amount of time to process
(create/fill/read) a single cell (on average).
It might actually make sense to plot TIME as a function of
the number of cells (this would lead to a straight line).
- The results described below and subsequent conclusions
depend on the particular implementation. If the LIST implementation
is "clever" enough, it is possible to obtain results similar
to those of a DICTIONARY implementation.
Note however that a sparse matrix of size 100000x100000
with only the four corner cells filled can be perfectly
handled in a DICTIONARY implementation whereas a LIST implementation
will run out of memory. This stress testing was not included
in this assignment (only time consumption, not memory consumption
was tested). The memory argument is sufficient (in addition
to the performance argument in some implementations) to favour
the DICTIONARY implementation in future developments.
Actually, this implementation is also more elegant.
- For a FULL system, the DICTIONARY implementation outperforms
the LIST one which indicates the (time) cost per cell is higher
for this LIST implementation.
- Obviously, the SPARSE results are better than the FULL system
results. Again, the DICTIONARY implementation outperforms
the LIST one.
- Our conclusion is to use the DICTIONARY implementation in
subsequent prototypes.