/Misc/SpecialBuilds.txt

This file describes some special Python build types enabled via
compile-time preprocessor defines.

It is best to define these options in the EXTRA_CFLAGS make variable,
for example: ``make EXTRA_CFLAGS="-DPy_REF_DEBUG"``.

---------------------------------------------------------------------------
Py_WITH_INSTRUMENTATION                introduced in Unladen Swallow 2009Q2

If you pass --with-instrumentation to ./configure, Python will collect a
range of runtime data useful for optimizing Python itself. Data currently
collected:

- Stats on the size of generated LLVM IR.
- Stats on the size of emitted machine code.
- Which functions had their machine code disabled due to fatal guard
  failures, but continued to be called.
- How many machine code functions were invalidated by each change to
  globals/builtins dicts.
- Stats on how well LLVM was able to optimize CALL_FUNCTION opcodes.
- Stats on which functions were deemed hot, and how hot they were; cold
  functions will not be shown.
- How many times LLVM code bailed to the interpreter, why it bailed, and
  the opcodes it bailed from.
- Stats on how well LLVM was able to optimize conditional branches.
- Stats on how long LLVM-global gc took and how many globals it collected.
- Stats on how well the JIT was able to specialize/inline binary operators.
- How many feedback maps were created.
- Stats on how well the JIT was able to optimize import statements.
- Stats on how well the JIT was able to optimize certain builtins.
- Data on how long execution blocked for JIT compilation.

This data will be printed to stderr at interpreter shutdown.

This build currently requires --with-llvm.
---------------------------------------------------------------------------
Py_REF_DEBUG                                              introduced in 1.4
                                                 named REF_DEBUG before 1.4

Turn on aggregate reference counting. This makes the extern variable
_Py_RefTotal hold a count of all references: the sum of ob_refcnt across
all objects. In a debug-mode build, this is where the "8288" comes from
in

    >>> 23
    23
    [8288 refs]
    >>>

Note that if this count increases when you're not storing away new objects,
there's probably a leak. Remember, though, that in interactive mode the
special name "_" holds a reference to the last result displayed!

Py_REF_DEBUG also checks after every decref to verify that the refcount
hasn't gone negative, and causes an immediate fatal error if it has.

Special gimmicks:

sys.gettotalrefcount()
    Return current total of all refcounts.
    Available under Py_REF_DEBUG as of Python 2.3.
    Before 2.3, Py_TRACE_REFS was required to enable this function.
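
The leak heuristic described above (a total that keeps growing although
nothing new is being stored) can be sketched as a small helper. This is
only an illustration: the helper name total_refcount_delta and its
repeats parameter are made up here, and the function returns None on
interpreters that were not built with Py_REF_DEBUG.

```python
import sys


def total_refcount_delta(func, repeats=5):
    """Return the change in the total refcount over `repeats` calls of func.

    Returns None when sys.gettotalrefcount() is missing, i.e. when the
    interpreter was not built with Py_REF_DEBUG.
    """
    gettotal = getattr(sys, "gettotalrefcount", None)
    if gettotal is None:
        return None
    func()  # warm-up call, so one-time caches are not counted as a leak
    before = gettotal()
    for _ in range(repeats):
        func()
    return gettotal() - before


if __name__ == "__main__":
    # A callable that stores nothing should show a delta close to zero.
    print(total_refcount_delta(lambda: [x * 2 for x in range(100)]))
```

A delta that grows roughly in proportion to repeats is a hint worth
investigating, not proof of a leak.
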
---------------------------------------------------------------------------
Py_TRACE_REFS                                             introduced in 1.4
                                                named TRACE_REFS before 1.4

Turn on heavy reference debugging. This is major surgery. Every PyObject
grows two more pointers, to maintain a doubly-linked list of all live
heap-allocated objects. Most builtin type objects are not in this list,
as they're statically allocated. Starting in Python 2.3, if COUNT_ALLOCS
(see below) is also defined, a static type object T does appear in this
list if at least one object of type T has been created.

Note that because the fundamental PyObject layout changes, Python modules
compiled with Py_TRACE_REFS are incompatible with modules compiled without
it.

Py_TRACE_REFS implies Py_REF_DEBUG.

Special gimmicks:

sys.getobjects(max[, type])
    Return a list of at most max of the most recently allocated objects,
    most recently allocated first in the list, least recently allocated
    last in the list. max=0 means no limit on list length.
    If an optional type object is passed, the list is also restricted to
    objects of that type.
    The return list itself, and some temp objects created just to call
    sys.getobjects(), are excluded from the return list. Note that the
    list returned is just another object, though, so it may appear in the
    return list the next time you call getobjects(); note that every
    object in the list is kept alive too, simply by virtue of being in
    the list.

envar PYTHONDUMPREFS
    If this envar exists, Py_Finalize() arranges to print a list of
    all still-live heap objects. This is printed twice, in different
    formats, before and after Py_Finalize has cleaned up everything it
    can clean up. The first output block produces the repr() of each
    object so is more informative; however, a lot of stuff destined to
    die is still alive then. The second output block is much harder
    to work with (repr() can't be invoked anymore -- the interpreter
    has been torn down too far), but doesn't list any objects that will
    die. The tool script combinerefs.py can be run over this to combine
    the info from both output blocks. The second output block, and
    combinerefs.py, were new in Python 2.3b1.
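
As a rough illustration of sys.getobjects(), the sketch below builds a
per-type histogram of live heap objects. The helper name
live_objects_by_type is hypothetical, and the function returns None when
the interpreter was not built with Py_TRACE_REFS.

```python
import sys
from collections import Counter


def live_objects_by_type(limit=0, type_filter=None):
    """Count live heap objects per type name using sys.getobjects().

    limit=0 means "no limit", mirroring the max argument described above.
    Returns None when sys.getobjects() is unavailable.
    """
    getobjects = getattr(sys, "getobjects", None)
    if getobjects is None:
        return None
    if type_filter is None:
        objects = getobjects(limit)
    else:
        objects = getobjects(limit, type_filter)
    histogram = Counter(type(obj).__name__ for obj in objects)
    # Drop the list promptly: it keeps every object it contains alive.
    del objects
    return histogram


if __name__ == "__main__":
    print(live_objects_by_type())
```
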
---------------------------------------------------------------------------
PYMALLOC_DEBUG                                            introduced in 2.3

When pymalloc is enabled (WITH_PYMALLOC is defined), calls to the PyObject_
memory routines are handled by Python's own small-object allocator, while
calls to the PyMem_ memory routines are directed to the system
malloc/realloc/free. If PYMALLOC_DEBUG is also defined, calls to both
PyObject_ and PyMem_ memory routines are directed to a special debugging
mode of Python's small-object allocator.

This mode fills dynamically allocated memory blocks with special,
recognizable bit patterns, and adds debugging info on each end of
dynamically allocated memory blocks. The special bit patterns are:

#define CLEANBYTE     0xCB   /* clean (newly allocated) memory */
#define DEADBYTE      0xDB   /* dead (newly freed) memory */
#define FORBIDDENBYTE 0xFB   /* forbidden -- untouchable bytes */

Strings of these bytes are unlikely to be valid addresses, floats, or 7-bit
ASCII strings.

Let S = sizeof(size_t). 2*S bytes are added at each end of each block of N
bytes requested. The memory layout is like so, where p represents the
address returned by a malloc-like or realloc-like function (p[i:j] means
the slice of bytes from *(p+i) inclusive up to *(p+j) exclusive; note that
the treatment of negative indices differs from a Python slice):

p[-2*S:-S]
    Number of bytes originally asked for. This is a size_t, big-endian
    (easier to read in a memory dump).
p[-S:0]
    Copies of FORBIDDENBYTE. Used to catch writes and reads before the
    start of the block.
p[0:N]
    The requested memory, filled with copies of CLEANBYTE, used to catch
    reference to uninitialized memory.
    When a realloc-like function is called requesting a larger memory
    block, the new excess bytes are also filled with CLEANBYTE.
    When a free-like function is called, these are overwritten with
    DEADBYTE, to catch reference to freed memory. When a realloc-like
    function is called requesting a smaller memory block, the excess
    old bytes are also filled with DEADBYTE.
p[N:N+S]
    Copies of FORBIDDENBYTE. Used to catch writes and reads past the
    end of the block.
p[N+S:N+2*S]
    A serial number, incremented by 1 on each call to a malloc-like or
    realloc-like function.
    Big-endian size_t.
    If "bad memory" is detected later, the serial number gives an
    excellent way to set a breakpoint on the next run, to capture the
    instant at which this block was passed out. The static function
    bumpserialno() in obmalloc.c is the only place the serial number
    is incremented, and exists so you can set such a breakpoint easily.

A realloc-like or free-like function first checks that the FORBIDDENBYTEs
at each end are intact. If they've been altered, diagnostic output is
written to stderr, and the program is aborted via Py_FatalError(). The
other main failure mode is provoking a memory error when a program
reads one of the special bit patterns and tries to use it as an address.
If you get in a debugger then and look at the object, you're likely
to see that it's entirely filled with 0xDB (meaning freed memory is
getting used) or 0xCB (meaning uninitialized memory is getting used).
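
The block layout and the FORBIDDENBYTE check can be modelled in a few
lines of Python. This is a toy simulation, not the code in obmalloc.c:
it assumes S = 8 (a 64-bit size_t), and the function names
make_debug_block and check_debug_block are invented for the example.

```python
import struct

S = 8  # assumed sizeof(size_t); a 32-bit build would use 4 and ">I"
CLEANBYTE, DEADBYTE, FORBIDDENBYTE = 0xCB, 0xDB, 0xFB


def make_debug_block(nbytes, serialno):
    """Build a padded block; the user pointer p corresponds to offset 2*S."""
    block = bytearray(struct.pack(">Q", nbytes))    # p[-2*S:-S] requested size
    block += bytearray([FORBIDDENBYTE]) * S         # p[-S:0]    leading pad
    block += bytearray([CLEANBYTE]) * nbytes        # p[0:N]     user memory
    block += bytearray([FORBIDDENBYTE]) * S         # p[N:N+S]   trailing pad
    block += bytearray(struct.pack(">Q", serialno)) # p[N+S:N+2*S] serial no.
    return block


def check_debug_block(block):
    """Return True if both FORBIDDENBYTE pads are still intact."""
    nbytes = struct.unpack(">Q", bytes(block[:S]))[0]
    p = 2 * S
    pad = bytearray([FORBIDDENBYTE]) * S
    leading_ok = block[p - S:p] == pad
    trailing_ok = block[p + nbytes:p + nbytes + S] == pad
    return leading_ok and trailing_ok


if __name__ == "__main__":
    block = make_debug_block(5, serialno=1)
    print(len(block), check_debug_block(block))
    block[2 * S + 5] = 0  # simulate writing one byte past the end
    print(check_debug_block(block))
```

The total overhead per block in this model is 4*S bytes, 2*S at each end.
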

Note that PYMALLOC_DEBUG requires WITH_PYMALLOC.

Special gimmicks:

envar PYTHONMALLOCSTATS
    If this envar exists, a report of pymalloc summary statistics is
    printed to stderr whenever a new arena is allocated, and also
    by Py_Finalize().
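
One way to collect the PYTHONMALLOCSTATS report is to run a child
interpreter with the variable set and capture its stderr. The sketch
below assumes the child is the same interpreter binary as the parent
(sys.executable); the helper name run_with_mallocstats is hypothetical,
and the report format is not interpreted here, only captured.

```python
import os
import subprocess
import sys


def run_with_mallocstats(code="pass"):
    """Run `code` in a child interpreter with PYTHONMALLOCSTATS set.

    Returns (exit_status, stderr_text). The text is empty if the child
    interpreter does not honor the variable.
    """
    env = dict(os.environ, PYTHONMALLOCSTATS="1")
    child = subprocess.Popen(
        [sys.executable, "-c", code],
        env=env,
        stderr=subprocess.PIPE,
    )
    _, err = child.communicate()
    return child.returncode, err.decode("utf-8", "replace")


if __name__ == "__main__":
    status, report = run_with_mallocstats()
    print(status, len(report.splitlines()), "lines of stats")
```
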

Changed in 2.5: The number of extra bytes allocated is 4*sizeof(size_t).
Before it was 16 on all boxes, reflecting that Python couldn't make use of
allocations >= 2**32 bytes even on 64-bit boxes before 2.5.
---------------------------------------------------------------------------
Py_DEBUG                                                  introduced in 1.5
                                                     named DEBUG before 1.5

This is what is generally meant by "a debug build" of Python.

Py_DEBUG implies LLTRACE, Py_REF_DEBUG, Py_TRACE_REFS, and
PYMALLOC_DEBUG (if WITH_PYMALLOC is enabled). In addition, C
assert()s are enabled (via the C way: by not defining NDEBUG), and
some routines do additional sanity checks inside "#ifdef Py_DEBUG"
blocks.
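
Since several of the defines in this file add functions to the sys
module, a rough way to see what a given interpreter was built with is to
probe for those functions. The mapping below uses only names documented
in this file; the helper name special_build_features is made up, and a
missing function only suggests (it does not prove) that the define was
off.

```python
import sys


def special_build_features():
    """Map build defines to whether their sys function is present."""
    probes = {
        "Py_REF_DEBUG": "gettotalrefcount",
        "Py_TRACE_REFS": "getobjects",
        "COUNT_ALLOCS": "getcounts",
        "WITH_TSC": "settscdump",
    }
    return dict((define, hasattr(sys, attr)) for define, attr in probes.items())


if __name__ == "__main__":
    for define, present in sorted(special_build_features().items()):
        print("%-14s %s" % (define, "yes" if present else "no"))
```

On a Py_DEBUG build the first two entries should both be present, because
Py_DEBUG implies Py_REF_DEBUG and Py_TRACE_REFS.
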
---------------------------------------------------------------------------
COUNT_ALLOCS                                            introduced in 0.9.9
                                             partly broken in 2.2 and 2.2.1

Each type object grows three new members:

    /* Number of times an object of this type was allocated. */
    int tp_allocs;

    /* Number of times an object of this type was deallocated. */
    int tp_frees;

    /* Highwater mark: the maximum value of tp_allocs - tp_frees so
     * far; or, IOW, the largest number of objects of this type alive at
     * the same time.
     */
    int tp_maxalloc;

Allocation and deallocation code keeps these counts up to date.
Py_Finalize() displays a summary of the info returned by sys.getcounts()
(see below), along with assorted other special allocation counts (like
the number of tuple allocations satisfied by a tuple free-list, the number
of 1-character strings allocated, etc.).

Before Python 2.2, type objects were immortal, and the COUNT_ALLOCS
implementation relies on that. As of Python 2.2, heap-allocated type/
class objects can go away. COUNT_ALLOCS can blow up in 2.2 and 2.2.1
because of this; this was fixed in 2.2.2. Use of COUNT_ALLOCS makes
all heap-allocated type objects immortal, except for those for which no
object of that type is ever allocated.

Starting with Python 2.3, if Py_TRACE_REFS is also defined, COUNT_ALLOCS
arranges to ensure that the type object for each allocated object
appears in the doubly-linked list of all objects maintained by
Py_TRACE_REFS.

Special gimmicks:

sys.getcounts()
    Return a list of 4-tuples, one entry for each type object for which
    at least one object of that type was allocated. Each tuple is of
    the form:

        (tp_name, tp_allocs, tp_frees, tp_maxalloc)

    Each distinct type object gets a distinct entry in this list, even
    if two or more type objects have the same tp_name (in which case
    there's no way to distinguish them by looking at this list). The
    list is ordered by time of first object allocation: the type object
    for which the first allocation of an object of that type occurred
    most recently is at the front of the list.
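
The 4-tuples returned by sys.getcounts() are easy to post-process. The
sketch below ranks types by the number of objects currently alive
(tp_allocs - tp_frees). The helper name live_by_type is hypothetical and
the sample rows are invented numbers used only when the interpreter was
not built with COUNT_ALLOCS.

```python
import sys


def live_by_type(counts):
    """Turn (tp_name, tp_allocs, tp_frees, tp_maxalloc) rows into
    (tp_name, live, tp_maxalloc) rows, most live objects first."""
    rows = [(name, allocs - frees, maxalloc)
            for name, allocs, frees, maxalloc in counts]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    getcounts = getattr(sys, "getcounts", None)
    if getcounts is not None:
        counts = getcounts()
    else:
        # Invented sample data, for illustration only.
        counts = [("dict", 120, 100, 45), ("tuple", 900, 650, 300)]
    for name, live, highwater in live_by_type(counts):
        print("%-10s live=%d highwater=%d" % (name, live, highwater))
```
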
---------------------------------------------------------------------------
LLTRACE                                          introduced well before 1.0

Compile in support for Low Level TRACE-ing of the main interpreter loop.

When this preprocessor symbol is defined, before PyEval_EvalFrame
(eval_frame in 2.3 and 2.2, eval_code2 before that) executes a frame's code
it checks the frame's global namespace for a variable "__lltrace__". If
such a variable is found, mounds of information about what the interpreter
is doing are sprayed to stdout, such as every opcode and opcode argument
and values pushed onto and popped off the value stack.

Not useful very often, but very useful when needed.
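
Because the check looks at the frame's global namespace, tracing can be
limited to a single call by setting "__lltrace__" just before the call
and removing it afterwards. The helper call_with_lltrace below is a
hypothetical sketch; on an interpreter built without LLTRACE the variable
is simply ignored and the call behaves normally.

```python
def add(a, b):
    return a + b


def call_with_lltrace(func, *args):
    """Call func with __lltrace__ defined in its module globals.

    Frames entered while the variable exists are traced on an LLTRACE
    build; the variable is removed again when the call returns.
    """
    namespace = func.__globals__
    namespace["__lltrace__"] = 1
    try:
        return func(*args)
    finally:
        namespace.pop("__lltrace__", None)


if __name__ == "__main__":
    print(call_with_lltrace(add, 2, 3))
```
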
---------------------------------------------------------------------------
CALL_PROFILE                                      introduced for Python 2.3

Count the number of function calls executed.

When this symbol is defined, the ceval mainloop and helper functions
count the number of function calls made. It keeps detailed statistics
about what kind of object was called and whether the call hit any of
the special fast paths in the code.
---------------------------------------------------------------------------
WITH_TSC                                          introduced for Python 2.4

Super-lowlevel profiling of the interpreter. When enabled, the sys
module grows a new function:

settscdump(bool)
    If true, tell the Python interpreter to dump VM measurements to
    stderr. If false, turn off dump. The measurements are based on the
    processor's time-stamp counter.

This build option requires a small amount of platform-specific code.
Currently this code is present for linux/x86 or x86_64 and any PowerPC
platform that uses GCC (i.e. OS X and linux/ppc).

On the PowerPC the rate at which the time base register is incremented
is not defined by the architecture specification, so you'll need to
find the manual for your specific processor. For the 750CX, 750CXe
and 750FX (all sold as the G3) we find:

    The time base counter is clocked at a frequency that is
    one-fourth that of the bus clock.
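
For the G3 case quoted above, converting time base ticks to seconds is
simple arithmetic. The function name timebase_seconds and the 100 MHz
bus clock in the usage lines are assumptions made for this example; use
the divisor and bus frequency from your own processor's manual.

```python
def timebase_seconds(ticks, bus_clock_hz, divisor=4):
    """Convert time base ticks to seconds.

    The time base runs at bus_clock_hz / divisor; divisor=4 matches the
    "one-fourth that of the bus clock" rule quoted for the 750CX,
    750CXe and 750FX.
    """
    timebase_hz = bus_clock_hz / float(divisor)
    return ticks / timebase_hz


if __name__ == "__main__":
    # Assumed 100 MHz bus: the time base then ticks at 25 MHz.
    print(timebase_seconds(25000000, 100000000))
```
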

This build is enabled by the --with-tsc flag to configure. This build
currently requires --with-llvm.

To do something useful with the event timings, run the Misc/tsc_stats.py
script on the output.