/Misc/SpecialBuilds.txt

http://unladen-swallow.googlecode.com/
This file describes some special Python build types enabled via
compile-time preprocessor defines.

It is best to define these options in the EXTRA_CFLAGS make variable,
for example ``make EXTRA_CFLAGS="-DPy_REF_DEBUG"``.


---------------------------------------------------------------------------
Py_WITH_INSTRUMENTATION                introduced in Unladen Swallow 2009Q2

If you pass --with-instrumentation to ./configure, Python will collect a
range of runtime data useful for optimizing Python itself. Data currently
collected:
- Stats on the size of generated LLVM IR.
- Stats on the size of emitted machine code.
- Which functions had their machine code disabled due to fatal guard failures,
  but continued to be called.
- How many machine code functions were invalidated by each change to
  globals/builtins dicts.
- Stats on how well LLVM was able to optimize CALL_FUNCTION opcodes.
- Stats on which functions were deemed hot, and how hot they were; cold
  functions will not be shown.
- How many times LLVM code bailed to the interpreter, why it bailed, and the
  opcodes it bailed from.
- Stats on how well LLVM was able to optimize conditional branches.
- Stats on how long LLVM-global gc took and how many globals it collected.
- Stats on how well the JIT was able to specialize/inline binary operators.
- How many feedback maps were created.
- Stats on how well the JIT was able to optimize import statements.
- Stats on how well the JIT was able to optimize certain builtins.
- Data on how long execution blocked for JIT compilation.

This data will be printed to stderr at interpreter-shutdown.

This build currently requires --with-llvm.
---------------------------------------------------------------------------
Py_REF_DEBUG                                              introduced in 1.4
                                                 named REF_DEBUG before 1.4

Turn on aggregate reference counting.  This arranges that extern
_Py_RefTotal hold a count of all references, the sum of ob_refcnt across
all objects.  In a debug-mode build, this is where the "8288" comes from
in

    >>> 23
    23
    [8288 refs]
    >>>

Note that if this count increases when you're not storing away new objects,
there's probably a leak.  Remember, though, that in interactive mode the
special name "_" holds a reference to the last result displayed!

Py_REF_DEBUG also checks after every decref to verify that the refcount
hasn't gone negative, and causes an immediate fatal error if it has.

Special gimmicks:

sys.gettotalrefcount()
    Return current total of all refcounts.
    Available under Py_REF_DEBUG in Python 2.3.
    Before 2.3, Py_TRACE_REFS was required to enable this function.
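
The leak check described above can be sketched roughly as follows.  This is
a minimal sketch, not part of Python: refcount_growth() is a hypothetical
helper name, the code assumes Python 3 syntax, and it returns None on
interpreters that were not built with Py_REF_DEBUG (where
sys.gettotalrefcount() does not exist).

```python
import sys


def refcount_growth(func, warmup=3, runs=5):
    """Return the change in the total refcount across each call of func.

    Returns None when sys.gettotalrefcount() is unavailable, i.e. when the
    interpreter was not built with Py_REF_DEBUG.
    """
    get_total = getattr(sys, "gettotalrefcount", None)
    if get_total is None:
        return None
    # Let caches and lazily created objects settle before measuring.
    for _ in range(warmup):
        func()
    deltas = []
    for _ in range(runs):
        before = get_total()
        func()
        deltas.append(get_total() - before)
    return deltas


if __name__ == "__main__":
    print(refcount_growth(lambda: [1, 2, 3]))
```

Small nonzero deltas are usually bookkeeping noise (the deltas list itself
holds references); a delta that stays positive run after run is the pattern
that suggests a leak.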
---------------------------------------------------------------------------
Py_TRACE_REFS                                             introduced in 1.4
                                                named TRACE_REFS before 1.4

Turn on heavy reference debugging.  This is major surgery.  Every PyObject
grows two more pointers, to maintain a doubly-linked list of all live
heap-allocated objects.  Most builtin type objects are not in this list,
as they're statically allocated.  Starting in Python 2.3, if COUNT_ALLOCS
(see below) is also defined, a static type object T does appear in this
list if at least one object of type T has been created.

Note that because the fundamental PyObject layout changes, Python modules
compiled with Py_TRACE_REFS are incompatible with modules compiled without
it.

Py_TRACE_REFS implies Py_REF_DEBUG.

Special gimmicks:

sys.getobjects(max[, type])
    Return list of the (no more than) max most-recently allocated objects,
    most recently allocated first in the list, least-recently allocated
    last in the list.  max=0 means no limit on list length.
    If an optional type object is passed, the list is also restricted to
    objects of that type.
    The return list itself, and some temp objects created just to call
    sys.getobjects(), are excluded from the return list.  Note that the
    list returned is just another object, though, so may appear in the
    return list the next time you call getobjects(); note that every
    object in the list is kept alive too, simply by virtue of being in
    the list.

envar PYTHONDUMPREFS
    If this envar exists, Py_Finalize() arranges to print a list of
    all still-live heap objects.  This is printed twice, in different
    formats, before and after Py_Finalize has cleaned up everything it
    can clean up.  The first output block produces the repr() of each
    object so is more informative; however, a lot of stuff destined to
    die is still alive then.  The second output block is much harder
    to work with (repr() can't be invoked anymore -- the interpreter
    has been torn down too far), but doesn't list any objects that will
    die.  The tool script combinerefs.py can be run over this to combine
    the info from both output blocks.  The second output block, and
    combinerefs.py, were new in Python 2.3b1.
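
One way sys.getobjects() might be used is to summarize live objects by type.
The sketch below assumes Python 3 syntax; live_type_histogram() is a
hypothetical helper name, and it returns None on interpreters built without
Py_TRACE_REFS, where sys.getobjects() does not exist.

```python
import sys
from collections import Counter


def live_type_histogram(limit=0):
    """Count the most recently allocated live objects by type name.

    limit is passed through as the max argument of sys.getobjects();
    0 means no limit.  Returns None without Py_TRACE_REFS.
    """
    getobjects = getattr(sys, "getobjects", None)
    if getobjects is None:
        return None
    objs = getobjects(limit)
    try:
        return Counter(type(obj).__name__ for obj in objs)
    finally:
        # The returned list keeps every object in it alive, so drop it
        # as soon as the summary has been built.
        del objs


if __name__ == "__main__":
    histogram = live_type_histogram(1000)
    if histogram is None:
        print("sys.getobjects() is not available in this build")
    else:
        for name, count in histogram.most_common(10):
            print(name, count)
```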
---------------------------------------------------------------------------
PYMALLOC_DEBUG                                            introduced in 2.3

When pymalloc is enabled (WITH_PYMALLOC is defined), calls to the PyObject_
memory routines are handled by Python's own small-object allocator, while
calls to the PyMem_ memory routines are directed to the system malloc/
realloc/free.  If PYMALLOC_DEBUG is also defined, calls to both PyObject_
and PyMem_ memory routines are directed to a special debugging mode of
Python's small-object allocator.

This mode fills dynamically allocated memory blocks with special,
recognizable bit patterns, and adds debugging info on each end of
dynamically allocated memory blocks.  The special bit patterns are:

#define CLEANBYTE     0xCB   /* clean (newly allocated) memory */
#define DEADBYTE      0xDB   /* dead (newly freed) memory */
#define FORBIDDENBYTE 0xFB   /* forbidden -- untouchable bytes */

Strings of these bytes are unlikely to be valid addresses, floats, or 7-bit
ASCII strings.

Let S = sizeof(size_t). 2*S bytes are added at each end of each block of N
bytes requested.  The memory layout is like so, where p represents the
address returned by a malloc-like or realloc-like function (p[i:j] means
the slice of bytes from *(p+i) inclusive up to *(p+j) exclusive; note that
the treatment of negative indices differs from a Python slice):

p[-2*S:-S]
    Number of bytes originally asked for.  This is a size_t, big-endian
    (easier to read in a memory dump).
p[-S:0]
    Copies of FORBIDDENBYTE.  Used to catch underwrites and underreads.
p[0:N]
    The requested memory, filled with copies of CLEANBYTE, used to catch
    references to uninitialized memory.
    When a realloc-like function is called requesting a larger memory
    block, the new excess bytes are also filled with CLEANBYTE.
    When a free-like function is called, these are overwritten with
    DEADBYTE, to catch references to freed memory.  When a
    realloc-like function is called requesting a smaller memory block, the
    excess old bytes are also filled with DEADBYTE.
p[N:N+S]
    Copies of FORBIDDENBYTE.  Used to catch overwrites and overreads.
p[N+S:N+2*S]
    A serial number, incremented by 1 on each call to a malloc-like or
    realloc-like function.
    Big-endian size_t.
    If "bad memory" is detected later, the serial number gives an
    excellent way to set a breakpoint on the next run, to capture the
    instant at which this block was passed out.  The static function
    bumpserialno() in obmalloc.c is the only place the serial number
    is incremented, and exists so you can set such a breakpoint easily.
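
The layout above can be modelled in a few lines of Python.  This is only an
illustration of the byte layout, not the code in obmalloc.c: debug_block()
and pads_intact() are hypothetical names, and S is assumed to be 8 (a
64-bit size_t).

```python
CLEANBYTE = 0xCB
DEADBYTE = 0xDB
FORBIDDENBYTE = 0xFB


def debug_block(nbytes, serialno, S=8):
    """Return (raw, p): the raw bytes of a debug block and the offset of p.

    raw[p - 2*S : p - S]   requested size, big-endian
    raw[p - S : p]         FORBIDDENBYTE padding
    raw[p : p + N]         user memory, filled with CLEANBYTE
    raw[p + N : p + N + S] FORBIDDENBYTE padding
    raw[p + N + S :]       serial number, big-endian
    """
    pad = bytes([FORBIDDENBYTE]) * S
    head = nbytes.to_bytes(S, "big") + pad
    body = bytes([CLEANBYTE]) * nbytes
    tail = pad + serialno.to_bytes(S, "big")
    return bytearray(head + body + tail), 2 * S


def pads_intact(raw, S=8):
    """Check that the FORBIDDENBYTE padding on both sides is untouched."""
    n = int.from_bytes(raw[:S], "big")
    p = 2 * S
    pad = bytes([FORBIDDENBYTE]) * S
    return raw[p - S:p] == pad and raw[p + n:p + n + S] == pad


if __name__ == "__main__":
    raw, p = debug_block(4, serialno=7)
    print(raw.hex())
    raw[p + 4] = 0          # simulate writing one byte past the end
    print(pads_intact(raw))
```

Writing one byte past the requested N bytes lands in the trailing padding,
which is what the check on the next realloc-like or free-like call notices.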

A realloc-like or free-like function first checks that the FORBIDDENBYTEs
at each end are intact.  If they've been altered, diagnostic output is
written to stderr, and the program is aborted via Py_FatalError().  The
other main failure mode is provoking a memory error when a program
reads one of the special bit patterns and tries to use it as an address.
If you then look at the object in a debugger, you're likely
to see that it's entirely filled with 0xDB (meaning freed memory is
getting used) or 0xCB (meaning uninitialized memory is getting used).
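
When staring at a memory dump, the diagnosis in the paragraph above amounts
to a check like the following.  classify_fill() is a hypothetical helper
written for illustration; it takes the raw bytes of a suspect object as
copied out of a debugger.

```python
FILL_MEANINGS = {
    0xCB: "uninitialized memory (CLEANBYTE)",
    0xDB: "freed memory (DEADBYTE)",
    0xFB: "forbidden padding (FORBIDDENBYTE)",
}


def classify_fill(data):
    """Describe a byte string that consists of a single debug fill pattern.

    Returns None when the bytes are empty, mixed, or not a known pattern.
    """
    values = set(data)
    if len(values) != 1:
        return None
    return FILL_MEANINGS.get(values.pop())


if __name__ == "__main__":
    print(classify_fill(b"\xdb" * 16))
    print(classify_fill(b"\xcb" * 16))
    print(classify_fill(b"hello"))
```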

Note that PYMALLOC_DEBUG requires WITH_PYMALLOC.

Special gimmicks:

envar PYTHONMALLOCSTATS
    If this envar exists, a report of pymalloc summary statistics is
    printed to stderr whenever a new arena is allocated, and also
    by Py_Finalize().

Changed in 2.5:  The number of extra bytes allocated is 4*sizeof(size_t).
Before it was 16 on all boxes, reflecting that Python couldn't make use of
allocations >= 2**32 bytes even on 64-bit boxes before 2.5.
---------------------------------------------------------------------------
Py_DEBUG                                                  introduced in 1.5
                                                     named DEBUG before 1.5

This is what is generally meant by "a debug build" of Python.

Py_DEBUG implies LLTRACE, Py_REF_DEBUG, Py_TRACE_REFS, and
PYMALLOC_DEBUG (if WITH_PYMALLOC is enabled).  In addition, C
assert()s are enabled (via the C way: by not defining NDEBUG), and
some routines do additional sanity checks inside "#ifdef Py_DEBUG"
blocks.
---------------------------------------------------------------------------
COUNT_ALLOCS                                            introduced in 0.9.9
                                             partly broken in 2.2 and 2.2.1

Each type object grows three new members:

    /* Number of times an object of this type was allocated. */
    int tp_allocs;

    /* Number of times an object of this type was deallocated. */
    int tp_frees;

    /* Highwater mark:  the maximum value of tp_allocs - tp_frees so
     * far; or, IOW, the largest number of objects of this type alive at
     * the same time.
     */
    int tp_maxalloc;

Allocation and deallocation code keeps these counts up to date.
Py_Finalize() displays a summary of the info returned by sys.getcounts()
(see below), along with assorted other special allocation counts (like
the number of tuple allocations satisfied by a tuple free-list, the number
of 1-character strings allocated, etc).
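
The bookkeeping for the three members can be sketched as below.  This is a
Python model of the idea only, not the C code in the interpreter; the
TypeCounts class and its on_alloc()/on_free() methods are hypothetical
names.

```python
class TypeCounts:
    """Model of the per-type counters added by COUNT_ALLOCS."""

    def __init__(self, tp_name):
        self.tp_name = tp_name
        self.tp_allocs = 0    # objects of this type allocated so far
        self.tp_frees = 0     # objects of this type deallocated so far
        self.tp_maxalloc = 0  # high-water mark of tp_allocs - tp_frees

    def on_alloc(self):
        self.tp_allocs += 1
        alive = self.tp_allocs - self.tp_frees
        if alive > self.tp_maxalloc:
            self.tp_maxalloc = alive

    def on_free(self):
        self.tp_frees += 1


if __name__ == "__main__":
    counts = TypeCounts("example")
    counts.on_alloc()
    counts.on_alloc()
    counts.on_free()
    counts.on_alloc()
    print(counts.tp_allocs, counts.tp_frees, counts.tp_maxalloc)
```

Only allocation can raise the high-water mark, so on_free() never needs to
touch tp_maxalloc.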

Before Python 2.2, type objects were immortal, and the COUNT_ALLOCS
implementation relies on that.  As of Python 2.2, heap-allocated type/
class objects can go away.  COUNT_ALLOCS can blow up in 2.2 and 2.2.1
because of this; this was fixed in 2.2.2.  Use of COUNT_ALLOCS makes
all heap-allocated type objects immortal, except for those for which no
object of that type is ever allocated.

Starting with Python 2.3, if Py_TRACE_REFS is also defined, COUNT_ALLOCS
arranges to ensure that the type object for each allocated object
appears in the doubly-linked list of all objects maintained by
Py_TRACE_REFS.

Special gimmicks:

sys.getcounts()
    Return a list of 4-tuples, one entry for each type object for which
    at least one object of that type was allocated.  Each tuple is of
    the form:

        (tp_name, tp_allocs, tp_frees, tp_maxalloc)

    Each distinct type object gets a distinct entry in this list, even
    if two or more type objects have the same tp_name (in which case
    there's no way to distinguish them by looking at this list).  The
    list is ordered by time of first object allocation:  the type object
    for which the first allocation of an object of that type occurred
    most recently is at the front of the list.
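
The 4-tuples are easy to post-process.  The sketch below ranks types by how
many objects are still alive; still_alive() is a hypothetical helper, and
the sample rows used when sys.getcounts() is unavailable are made up purely
for illustration.

```python
import sys


def still_alive(counts):
    """Rank types by live object count.

    counts is a list of (tp_name, tp_allocs, tp_frees, tp_maxalloc)
    tuples.  Returns (tp_name, alive, tp_maxalloc) rows, most alive first,
    skipping types with nothing left alive.
    """
    rows = [
        (name, allocs - frees, maxalloc)
        for name, allocs, frees, maxalloc in counts
        if allocs > frees
    ]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


if __name__ == "__main__":
    getcounts = getattr(sys, "getcounts", None)
    if getcounts is not None:
        counts = getcounts()
    else:
        # Made-up sample data for builds without COUNT_ALLOCS.
        counts = [("str", 10, 4, 8), ("dict", 3, 3, 2), ("list", 5, 1, 4)]
    for name, alive, maxalloc in still_alive(counts):
        print(name, alive, maxalloc)
```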
---------------------------------------------------------------------------
LLTRACE                                          introduced well before 1.0

Compile in support for Low Level TRACE-ing of the main interpreter loop.

When this preprocessor symbol is defined, before PyEval_EvalFrame
(eval_frame in 2.3 and 2.2, eval_code2 before that) executes a frame's code
it checks the frame's global namespace for a variable "__lltrace__".  If
such a variable is found, mounds of information about what the interpreter
is doing are sprayed to stdout, such as every opcode and opcode argument
and values pushed onto and popped off the value stack.
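
A rough sketch of switching the trace on around one call follows.  It
assumes, as described above, that the variable is looked up in the frame's
globals when the frame starts executing; add_up() is just a hypothetical
function to trace.  On a build without LLTRACE the variable is simply
ignored.

```python
def add_up(n):
    """A small function whose execution we want traced."""
    total = 0
    for i in range(n):
        total += i
    return total


# The variable must exist in the module globals before the traced frame
# starts, so set it just before the call and remove it afterwards.
globals()["__lltrace__"] = 1
try:
    result = add_up(10)
finally:
    globals().pop("__lltrace__", None)

print(result)
```

Only frames whose globals contain the variable are traced, so putting it in
a single module keeps the output to that module's functions.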

Not useful very often, but very useful when needed.

---------------------------------------------------------------------------
CALL_PROFILE                                      introduced for Python 2.3

Count the number of function calls executed.

When this symbol is defined, the ceval mainloop and helper functions
count the number of function calls made.  It keeps detailed statistics
about what kind of object was called and whether the call hit any of
the special fast paths in the code.

---------------------------------------------------------------------------
WITH_TSC                                          introduced for Python 2.4

Super-lowlevel profiling of the interpreter.  When enabled, the sys
module grows a new function:

settscdump(bool)
    If true, tell the Python interpreter to dump VM measurements to
    stderr.  If false, turn off dump.  The measurements are based on the
    processor's time-stamp counter.
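
To limit the dump to one region of code, the call can be wrapped as in the
sketch below.  tsc_dump() is a hypothetical helper name; it does nothing
(and yields False) on interpreters built without WITH_TSC, where
sys.settscdump() does not exist.

```python
import sys
from contextlib import contextmanager


@contextmanager
def tsc_dump():
    """Turn the TSC dump on for the duration of a with-block.

    Yields True if sys.settscdump() is available, False otherwise.
    """
    settscdump = getattr(sys, "settscdump", None)
    if settscdump is not None:
        settscdump(True)
    try:
        yield settscdump is not None
    finally:
        if settscdump is not None:
            settscdump(False)


if __name__ == "__main__":
    with tsc_dump() as enabled:
        sum(range(1000))
    print("TSC dump available:", enabled)
```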

This build option requires a small amount of platform specific code.
Currently this code is present for linux/x86 or x86_64 and any PowerPC
platform that uses GCC (i.e. OS X and linux/ppc).

On the PowerPC the rate at which the time base register is incremented
is not defined by the architecture specification, so you'll need to
find the manual for your specific processor.  For the 750CX, 750CXe
and 750FX (all sold as the G3) we find:

    The time base counter is clocked at a frequency that is
    one-fourth that of the bus clock.
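
Converting time base ticks to wall-clock time then needs only the bus clock
frequency.  A minimal sketch, assuming the one-fourth ratio quoted above;
the function name and the 100 MHz bus clock in the example are hypothetical.

```python
def timebase_ticks_to_seconds(ticks, bus_clock_hz, divisor=4):
    """Convert time base ticks to seconds.

    The time base runs at bus_clock_hz / divisor, so one tick lasts
    divisor / bus_clock_hz seconds.
    """
    return ticks * divisor / bus_clock_hz


if __name__ == "__main__":
    # Hypothetical 100 MHz bus: the time base ticks at 25 MHz.
    print(timebase_ticks_to_seconds(25_000_000, 100_000_000))
```

For other processors the divisor has to come from that processor's manual.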

This build is enabled by the --with-tsc flag to configure, and currently
requires --with-llvm.

To do something useful with the event timings, run the Misc/tsc_stats.py
script on the output.