.. _trace_optimizer:

Trace Optimizer
===============

Traces of user programs are not directly translated into machine code.
The optimizer module implements several different semantics-preserving
transformations that either allow operations to be swept from the trace
or convert them into operations that need less time or space.

The optimizer is in `rpython/jit/metainterp/optimizeopt/`.
When you try to make sense of this module, this page might get you started.

Before some optimizations are explained in more detail, it is essential to
understand what traces look like.
The optimizer comes with a test suite. It contains many trace
examples and you might want to take a look at it
(in `rpython/jit/metainterp/optimizeopt/test/*.py`).
The allowed operations can be found in `rpython/jit/metainterp/resoperation.py`.
Here is an example of a trace::

    [p0, i0, i1]
    label(p0, i0, i1)
    i2 = getarrayitem_raw(p0, i0, descr=<Array Signed>)
    i3 = int_add(i1, i2)
    i4 = int_add(i0, 1)
    i5 = int_le(i4, 100) # less-or-equal
    guard_true(i5)
    jump(p0, i4, i3)

At the beginning it might be clumsy to read, but it makes sense once you start
to compare it with the Python code that constructed the trace::

    from array import array
    a = array('i', range(101))
    sum = 0; i = 0
    while i <= 100: # can be seen as label
        sum += a[i]
        i += 1
        # jumps back to the while header

There are better ways to compute the sum of ``[0..100]``, but it gives a better intuition of how
traces are constructed than ``sum(range(101))`` would.

Note that the trace syntax is the one used in the test suite. It is also very
similar to the traces printed at runtime by PYPYLOG_. The first line gives the input variables, the
second line is the ``label`` operation, and the last one is the backwards ``jump`` operation.

.. _PYPYLOG: logging.html

The instructions mentioned earlier are special:

* The input line defines the type and name of each parameter needed to enter the trace.
* ``label`` is the instruction a ``jump`` can target. Label instructions have
  a ``JitCellToken`` associated with them that uniquely identifies the label. Any jump
  carries the token of a label as its target.

The token is saved in a so-called `descriptor` of the instruction. It is
not written out in the example above because the tests do not write it explicitly either.
The test suite creates a dummy token for each trace and adds it as descriptor
to ``label`` and ``jump``. Of course the optimizer does the same at runtime,
but with real values.

The sample trace also includes a descriptor on ``getarrayitem_raw``. Here it
annotates the type of the array: a signed integer array.
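
Written out explicitly, the descriptors on ``label`` and ``jump`` of the example
trace would look roughly like this (``<looptoken>`` is just a placeholder for the
shared token object, not real syntax)::

    label(p0, i0, i1, descr=<looptoken>)
    ...
    jump(p0, i4, i3, descr=<looptoken>)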

High level overview
-------------------

Before the JIT backend transforms any trace into machine code, it tries to
transform the trace into an equivalent trace that executes faster. The method
`optimize_trace` in `rpython/jit/metainterp/optimizeopt/__init__.py` is the
main entry point.

Optimizations are applied in a sequence, one after another, and the base
sequence is as follows::

    intbounds:rewrite:virtualize:string:earlyforce:pure:heap:unroll

Each of the colon-separated names has a class attached, inheriting from
the `Optimization` class. The `Optimizer` class itself also
derives from the `Optimization` class and implements the control logic for
the optimization. Most of the optimizations only require a single forward pass.
The trace is 'propagated' through each optimization using the method
`propagate_forward`. Instruction by instruction, it flows from the
first optimization to the last optimization. The method `emit_operation`
is called for every operation that is passed to the next optimizer.
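
For intuition, the following is a minimal sketch of how such a chain of forward
passes can be wired together. The names are made up for illustration; they are
*not* the real classes from `rpython/jit/metainterp/optimizeopt/`::

    class ChainedOptimization(object):
        """Toy stand-in for the real Optimization class."""

        def __init__(self):
            self.next_opt = None       # next optimization in the chain

        def propagate_forward(self, op):
            # default behaviour: hand the operation onwards unchanged
            self.emit_operation(op)

        def emit_operation(self, op):
            if self.next_opt is not None:
                self.next_opt.propagate_forward(op)

    def build_chain(spec, registry):
        """Link up optimizations named in a spec like 'intbounds:rewrite:...:unroll'."""
        opts = [registry[name]() for name in spec.split(':')]
        for prev, succ in zip(opts, opts[1:]):
            prev.next_opt = succ
        return opts[0]

Feeding a trace through the chain then amounts to calling ``propagate_forward``
on the first optimization for every operation of the trace.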

A frequently encountered pattern
--------------------------------

To find potential optimization targets it is necessary to know the instruction
type. A simple solution is to switch on the operation number (= type)::

    for op in operations:
        if op.getopnum() == rop.INT_ADD:
            # handle this instruction
            pass
        elif op.getopnum() == rop.INT_FLOOR_DIV:
            pass
        # and many more

Things get worse if you start to match the arguments
(is the first argument constant and the second a variable, or vice versa?). The pattern to tackle
this code bloat is to move the handling into separate methods using
`make_dispatcher_method`. It associates methods with instruction types::

    class OptX(Optimization):
        def prefix_INT_ADD(self, op):
            pass # emit, transform, ...

    dispatch_opt = make_dispatcher_method(OptX, 'prefix_',
                                          default=OptX.emit_operation)
    OptX.propagate_forward = dispatch_opt

    optX = OptX()
    for op in operations:
        optX.propagate_forward(op)

``propagate_forward`` searches for the method that is able to handle the instruction
type. As an example, `INT_ADD` will invoke `prefix_INT_ADD`. If there is no method
for the instruction, it is routed to the default implementation (``emit_operation``
in this example).
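
To give a rough idea of what such a dispatcher does, here is a simplified,
hypothetical sketch; it is not the real ``make_dispatcher_method``, and the
``opnum_by_name`` mapping is assumed to exist only for the purpose of the
example::

    def toy_make_dispatcher_method(cls, prefix, default=None):
        # collect the handlers defined on the class, keyed by operation number
        table = {}
        for name, opnum in opnum_by_name.items():   # assumed: 'INT_ADD' -> rop.INT_ADD
            handler = getattr(cls, prefix + name, None)
            if handler is not None:
                table[opnum] = handler

        def dispatch(self, op):
            # fall back to the default (e.g. emit_operation) when no handler exists
            return table.get(op.getopnum(), default)(self, op)

        return dispatch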

Rewrite optimization
--------------------

The second optimization is called 'rewrite' and is commonly also known as
strength reduction. A simple example is that an integer multiplied
by 2 is equivalent to the same integer shifted left by one bit
(e.g. ``x * 2 == x << 1``). This optimization performs not only strength
reduction but also boolean and arithmetic simplifications. Other examples
are ``x & 0 == 0`` and ``x - 0 == x``.

Whenever such an operation is encountered (e.g. ``y = x & 0``), no operation is
emitted. Instead the variable ``y`` is made equal to 0
(= ``make_equal_to(op.result, 0)``). The variables found in a trace are
instances of Box classes that can be found in
`rpython/jit/metainterp/history.py`. `OptValue` wraps those variables again
and maps the boxes to the optimization values in the optimizer. When a
value is made equal to another, the boxes of the two variables are made to point to the same
`OptValue` instance.

**NOTE: this OptValue organization is currently being refactored in a branch.**
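
Using the dispatcher pattern shown earlier, such a rewrite rule could look
roughly like the sketch below. The helpers ``getvalue``, ``is_constant`` and
``box.getint`` are assumed names for this illustration; this is not the code of
the real rewrite optimization::

    class OptToyRewrite(Optimization):
        def optimize_INT_AND(self, op):
            v1 = self.getvalue(op.getarg(1))     # assumed helper
            if v1.is_constant() and v1.box.getint() == 0:
                # y = x & 0 is always 0: emit nothing, alias the result to 0
                self.make_equal_to(op.result, 0)
            else:
                self.emit_operation(op)

    dispatch_opt = make_dispatcher_method(OptToyRewrite, 'optimize_',
                                          default=OptToyRewrite.emit_operation)
    OptToyRewrite.propagate_forward = dispatch_opt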

Pure optimization
-----------------

The 'pure' optimization is interwoven into the basic optimizer. It records
operations, results and arguments that are known to have pure semantics.

"Pure" here means the same as the ``jit.elidable`` decorator:
free of "observable" side effects and referentially transparent
(the operation can be replaced with its result without changing the program
semantics). The operations marked as ALWAYS_PURE in `resoperation.py` are a
subset of the NOSIDEEFFECT operations. Operations such as new, new array, and
getfield_(raw/gc) are marked as NOSIDEEFFECT but not as ALWAYS_PURE.

Pure operations are optimized in two different ways. If their arguments
are constants, the operation is removed and the result is turned into a
constant. If not, we can still use a memoization technique: if, later,
we see the same operation on the same arguments again, we don't need to
recompute its result, but can simply reuse the previous operation's
result.
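
The memoization part can be pictured with the following minimal sketch, which
uses a plain dictionary and made-up names rather than the real data structures
of the pure optimization::

    class ToyPureCache(object):
        """Remember pure operations already seen earlier in the trace."""

        def __init__(self):
            self.seen = {}          # (opnum, args) -> result of the first occurrence

        def handle(self, op, emit):
            key = (op.getopnum(), tuple(op.getarglist()))
            if key in self.seen:
                # same pure operation on the same arguments: reuse the old
                # result instead of emitting the operation a second time
                return self.seen[key]
            self.seen[key] = op.result
            emit(op)
            return op.result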

Unroll optimization
-------------------

A detailed description can be found in the document
`Loop-Aware Optimizations in PyPy's Tracing JIT`__.

.. __: http://www2.maths.lth.se/matematiklth/vision/publdb/reports/pdf/ardo-bolz-etal-dls-12.pdf

This optimization does not fall into the traditional scheme of a single forward
pass. In a nutshell it unrolls the trace *once*, connects the two
traces (by inserting parameters into the jump and label of the peeled trace)
and uses the gathered information to remove allocations, propagate constants and
apply any other optimization currently present in the 'optimizeopt' module.

It is prepended to all other optimizations and thus extends the Optimizer class;
it unrolls the loop once before the other optimizations proceed.

Vectorization
-------------

- :doc:`Vectorization <vectorization>`

What is missing from this document
----------------------------------

* Guards are not explained
* Several optimizations are not explained

Further references
------------------

* `Allocation Removal by Partial Evaluation in a Tracing JIT`__
* `Loop-Aware Optimizations in PyPy's Tracing JIT`__

.. __: http://www.stups.uni-duesseldorf.de/mediawiki/images/b/b0/Pub-BoCuFiLePeRi2011.pdf
.. __: http://www2.maths.lth.se/matematiklth/vision/publdb/reports/pdf/ardo-bolz-etal-dls-12.pdf