memoization.rst 5.6 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114
  1. ..
  2. Copyright 2014 Modelling, Simulation and Design Lab (MSDL) at
  3. McGill University and the University of Antwerp (http://msdl.cs.mcgill.ca/)
  4. Licensed under the Apache License, Version 2.0 (the "License");
  5. you may not use this file except in compliance with the License.
  6. You may obtain a copy of the License at
  7. http://www.apache.org/licenses/LICENSE-2.0
  8. Unless required by applicable law or agreed to in writing, software
  9. distributed under the License is distributed on an "AS IS" BASIS,
  10. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  11. See the License for the specific language governing permissions and
  12. limitations under the License.
  13. Memoization
  14. ===========
  15. PyPDEVS supports memoization for the most heavyweight functions that are called during simulation.
  16. What is memoization?
  17. --------------------
  18. Memoization simply means that the return values of a function call will be cached.
  19. As soon as the function is called again with exactly the same parameters, the cached value will be
  20. returned instead of the function being reevaluated again.
  21. The advantage is clearly that it has the potential to speed up computation in situations where
  22. the value is likely to be cached **and** if the function takes a relatively long time.
  23. How does it apply to PyPDEVS?
  24. -----------------------------
  25. The PyPDEVS code is significantly optimized, though a certain part of the code is inoptimizable by
  26. the simulator itself. This code is the *user code*, which defines e.g. the transition functions of
  27. the model. These transition functions have the potential to computationally intensive. For example,
  28. most distributed simulation benchmarks have a transition function which takes in the terms of milliseconds.
  29. The only remaining requirement is then that the value is *likely* to be cached. For this reason, memoization
  30. is only used in distributed simulation. In distributed simulation, a complete node might have to revert
  31. all of its computation due to another (unrelated) model requesting such a revertion. Most of the time,
  32. this model is not influenced by the change directly, therefore the input parameters of the function are
  33. likely to be identical.
  34. It is therefore possible to assume that distributed simulation is likely to profit from this optimization,
  35. certainly in the case of relocations. When a relocation happens, the node is reverted to the current GVT,
  36. even though no real causality violation happened. These transitions can then be recalculated immediately with
  37. the use of memoization.
  38. Why not enable it by default?
  39. -----------------------------
  40. Even though memoization seems a way to quickly increase performance, it also has several downsides. The most
  41. important downside is the high space complexity that it incurs. Time warp simulation is already extremely
  42. space consuming, so also caching the inputs and their response is not going to be of much help to that.
  43. This problem is partially mitigated by having time warp and memoization refer to the same state in memory,
  44. though this still requires additional lists, input dictionaries, ...
  45. Another problem is the datastructure management. As soon as a revertion happens, the list of old states is
  46. reverted and used to check for equality. Without memoization, this list would be discarded, freeing up lots
  47. of space. Therefore, this problem again relates to space complexity.
  48. A final problem is the requirement to check the states for equality. These checks can take arbitrarily long,
  49. depending on how the user defined the equality method. In the worst case, the user might not have defined such
  50. a method, causing every comparison to result in *False*. This is clearly problematic, as the memoization speedup
  51. will then never be visible. Furthermore, memoization is unlikely to have an impact in simulations where nearly
  52. no revertions happen.
  53. For these reasons, memoization is not enabled by default, but only when the user enables it explicitly.
  54. Implementation hints
  55. --------------------
  56. Due to the way memoization is implemented in PyPDEVS, some considerations apply:
  57. 1. As soon as an inequal state is found, memoization code is aborted because the chance of further equality becomes too small.
  58. 2. Memoization code is only triggered after a revertion happened.
  59. 3. Due to memoization, side-effects of the transition function are not performed. This includes printing, random number generation, ... Note that transition functions with side effects are already a bad idea in time warp simulationn.
  60. Requirements
  61. ------------
  62. Two requirements exist to use memoization. The first one is simply to enable it in the configuration, the second one
  63. requires a little more explanation.
  64. By default, Python provides equality methods on two objects, but they always return *False* if the objects are different
  65. (even though their content might be equal).
  66. The user should thus add the *__eq__(self, other)* and *__hash__(self)* function, to provide user-defined equality.
  67. Technically, it is required that the output is **exactly** the same when the current state (and input message, in case
  68. of *external* and *confluent transitions*) are equal according to these methods.
  69. Example
  70. -------
  71. Simply enabling memoization is not that difficult and is simply::
  72. sim = Simulator(MyModel())
  73. sim.setMemoization(True)
  74. sim.simulate()
  75. Defining the equality method on a state can be::
  76. class MyState(object):
  77. def __init__(self, var1, var2):
  78. self.var1 = var1
  79. self.var2 = var2
  80. def __eq__(self, other):
  81. return self.var1 == other.var1 and self.var2 == other.var2
  82. def __hash__(self):
  83. return hash(hash(self.var1) + hash(self.var2))