    z = x + y

where x = 2 and y = 3; computing z over many iterations, say 1000000, i.e.

    for iteration = 1 to 1000000
        z.signal[iteration] = x + y

is unnecessary. First, we replace x and y with their constant values and perform the addition. As discussed later, we only need to compute z once after replacing x with 2 and y with 3. An optimized execution is

    z.signal[0] = 2 + 3    # x = 2; y = 3
    for iteration = 1 to 1000000
        z.signal[iteration] = 5    # or simply print z for the current iteration

Obviously, computing z 1000000 - 1 times simplifies to printing or copying the value of z at time 0. This generally improves performance.
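
The constant-folding idea can be tried in plain Python. The following is a minimal sketch, not the tool's actual implementation: the function names `simulate_naive` and `simulate_folded` are hypothetical, and a Python list stands in for `z.signal`.

```python
# Hypothetical names throughout; a plain list stands in for z.signal.
N = 1_000_000  # number of iterations, as in the example


def simulate_naive(x, y, n):
    """Recompute x + y on every iteration."""
    signal = [0] * n
    for iteration in range(n):
        signal[iteration] = x + y
    return signal


def simulate_folded(x, y, n):
    """Compute x + y once, then copy the result into every slot."""
    z0 = x + y       # the only addition performed
    return [z0] * n  # copies of the value of z at time 0


naive = simulate_naive(2, 3, N)
folded = simulate_folded(2, 3, N)
print(naive == folded, folded[0])  # prints: True 5
```

Both functions produce the same signal; the folded version performs one addition instead of one per iteration.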
    for iteration = 1 to 1000000
        s.signal[iteration] = x + y + z + a + b

Here x, y, and z are constants; a and b are variables.
The sum separates into a constant part and a variable part:

    x + y + z + a + b = temp + var

where

    temp = x + y + z

and

    var = a + b

An optimized execution of the adder is
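    temp = x + y + z
    for iteration = 1 to 1000000
        s.signal[iteration] = temp + a + b

This saves 2 * 1000000 - 2 additions: the original loop performs four additions per iteration, whereas the optimized version performs two additions once to compute temp and two per iteration thereafter.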
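
The savings can be checked by counting additions directly. This is a minimal sketch under stated assumptions: `adder_naive` and `adder_hoisted` are hypothetical names, the variables a and b are modeled as per-iteration sequences, and a smaller iteration count is used so the example runs instantly.

```python
def adder_naive(x, y, z, a, b):
    """Sum all five operands on every iteration (4 additions each)."""
    signal, additions = [], 0
    for a_t, b_t in zip(a, b):
        signal.append(x + y + z + a_t + b_t)
        additions += 4
    return signal, additions


def adder_hoisted(x, y, z, a, b):
    """Precompute the constant part once, then add only the variables."""
    temp = x + y + z  # 2 additions, performed once
    additions = 2
    signal = []
    for a_t, b_t in zip(a, b):
        signal.append(temp + a_t + b_t)  # 2 additions per iteration
        additions += 2
    return signal, additions


n = 1000  # smaller than 1000000, chosen only for illustration
a = list(range(n))
b = [2 * t for t in range(n)]
naive, naive_adds = adder_naive(1, 2, 3, a, b)
hoisted, hoisted_adds = adder_hoisted(1, 2, 3, a, b)
print(naive == hoisted, naive_adds - hoisted_adds)  # prints: True 1998
```

With n = 1000 the difference is 2 * 1000 - 2 = 1998 additions, matching the formula for the general case.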
    Ax = b

where A is the coefficient matrix formed from the relationships between the variables given by the vector x, and b is the vector containing the right-hand sides of the equations. Because the dependency graph at time = 0 might differ from the dependency graph at time > 0, code generation occurs only during the first two iterations (time = 0 and time = 1). The generated C code includes functions for printing the results to the standard output device or to a file, and for collecting timing statistics.
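
To make the matrix form concrete, the earlier example z = x + y with x = 2 and y = 3 can be written as a small linear system. The sketch below is illustrative only: the source does not say how the system is assembled or solved, so the ordering of the unknowns, the layout of A and b, and the helper `solve_linear_system` (plain Gaussian elimination) are all assumptions.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    # Work on an augmented copy [A | b] so the inputs are left untouched.
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Choose the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if M[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution, from the last unknown to the first.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


# Assumed ordering of unknowns: [x, y, z].
# Equations: x = 2, y = 3, and z - x - y = 0.
A = [[1, 0, 0],
     [0, 1, 0],
     [-1, -1, 1]]
b = [2, 3, 0]
solution = solve_linear_system(A, b)
print(solution)  # prints: [2.0, 3.0, 5.0]
```

Each row of A holds the coefficients of one relationship and the matching entry of b holds its right-hand side; the solution recovers z = 5.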
Application | Python (time in seconds) | C (time in seconds) |
---|---|---|
Physbe (3,000 iterations) | 23.87 | 0.82 |
Circletest (60,000 iterations) | 22.33 | 0.94 |
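
The speedup implied by these timings is the ratio of the Python time to the C time. The short sketch below only restates the figures from the table; the dictionary layout and variable names are illustrative.

```python
# (Python seconds, C seconds), copied from the table.
timings = {
    "Physbe": (23.87, 0.82),
    "Circletest": (22.33, 0.94),
}

# Speedup = Python time / C time.
speedups = {name: py / c for name, (py, c) in timings.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x")
# prints:
# Physbe: 29.1x
# Circletest: 23.8x
```

By this measure the generated C code runs roughly 29 times faster for Physbe and roughly 24 times faster for Circletest.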