________________________________________________________________________
PYBENCH - A Python Benchmark Suite
________________________________________________________________________

Extendable suite of low-level benchmarks for measuring
the performance of the Python implementation
(interpreter, compiler or VM).

pybench is a collection of tests that provides a standardized way to
measure the performance of Python implementations. It takes a very
close look at different aspects of Python programs and lets you
decide which factors are more important to you than others, rather
than wrapping everything up in one number, as other performance
tests do (e.g. pystone, which is included in the Python Standard
Library).

pybench has been used in the past by several Python developers to
track down performance bottlenecks or to demonstrate the impact of
optimizations and new features in Python.

The command line interface for pybench is the file pybench.py. Run
this script with option '--help' to get a listing of the possible
options. Without options, pybench will simply execute the benchmark
and then print out a report to stdout.

Micro-Manual
------------

Run 'pybench.py -h' to see the help screen. Run 'pybench.py' to run
the benchmark suite using default settings and 'pybench.py -f <file>'
to have it store the results in a file too.

It is usually a good idea to run pybench.py multiple times to see
whether the environment, timers and benchmark run-times are suitable
for doing benchmark tests.

You can use the comparison feature of pybench.py ('pybench.py -c
<file>') to check how well the system behaves in comparison to a
reference run.
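
For example, to record a reference run and later compare a new run
against it (the file name 'ref.pybench' is just a placeholder):

  python pybench.py -f ref.pybench     (save a reference run)
  python pybench.py -c ref.pybench     (compare a new run against it)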

If the differences are well below 10% for each test, then you have a
system that is good for doing benchmark tests. If you get random
differences of more than 10% or significant differences between the
values for minimum and average time, then you likely have some
background processes running which cause the readings to become
inconsistent. Examples include: web-browsers, email clients, RSS
readers, music players, backup programs, etc.

If you are only interested in a few tests of the whole suite, you can
use the filtering option, e.g. 'pybench.py -t string' will only
run/show the tests that have 'string' in their name.

This is the current output of pybench.py --help:
  43. """
  44. ------------------------------------------------------------------------
  45. PYBENCH - a benchmark test suite for Python interpreters/compilers.
  46. ------------------------------------------------------------------------
  47. Synopsis:
  48. pybench.py [option] files...
  49. Options and default settings:
  50. -n arg number of rounds (10)
  51. -f arg save benchmark to file arg ()
  52. -c arg compare benchmark with the one in file arg ()
  53. -s arg show benchmark in file arg, then exit ()
  54. -w arg set warp factor to arg (10)
  55. -t arg run only tests with names matching arg ()
  56. -C arg set the number of calibration runs to arg (20)
  57. -d hide noise in comparisons (0)
  58. -v verbose output (not recommended) (0)
  59. --with-gc enable garbage collection (0)
  60. --with-syscheck use default sys check interval (0)
  61. --timer arg use given timer (time.time)
  62. -h show this help text
  63. --help show this help text
  64. --debug enable debugging
  65. --copyright show copyright
  66. --examples show examples of usage
  67. Version:
  68. 2.0
  69. The normal operation is to run the suite and display the
  70. results. Use -f to save them for later reuse or comparisons.
  71. Available timers:
  72. time.time
  73. time.clock
  74. systimes.processtime
  75. Examples:
  76. python2.1 pybench.py -f p21.pybench
  77. python2.5 pybench.py -f p25.pybench
  78. python pybench.py -s p25.pybench -c p21.pybench
  79. """

License
-------

See LICENSE file.

Sample output
-------------

"""
-------------------------------------------------------------------------------
PYBENCH 2.0
-------------------------------------------------------------------------------
* using Python 2.4.2
* disabled garbage collection
* system check interval set to maximum: 2147483647
* using timer: time.time

Calibrating tests. Please wait...

Running 10 round(s) of the suite at warp factor 10:

* Round 1 done in 6.388 seconds.
* Round 2 done in 6.485 seconds.
* Round 3 done in 6.786 seconds.
...
* Round 10 done in 6.546 seconds.

-------------------------------------------------------------------------------
Benchmark: 2006-06-12 12:09:25
-------------------------------------------------------------------------------

    Rounds: 10
    Warp:   10
    Timer:  time.time

    Machine Details:
       Platform ID:  Linux-2.6.8-24.19-default-x86_64-with-SuSE-9.2-x86-64
       Processor:    x86_64

    Python:
       Executable:   /usr/local/bin/python
       Version:      2.4.2
       Compiler:     GCC 3.3.4 (pre 3.3.5 20040809)
       Bits:         64bit
       Build:        Oct  1 2005 15:24:35 (#1)
       Unicode:      UCS2

Test                          minimum  average  operation  overhead
-------------------------------------------------------------------------------
        BuiltinFunctionCalls:   126ms   145ms   0.28us   0.274ms
         BuiltinMethodLookup:   124ms   130ms   0.12us   0.316ms
               CompareFloats:   109ms   110ms   0.09us   0.361ms
       CompareFloatsIntegers:   100ms   104ms   0.12us   0.271ms
             CompareIntegers:   137ms   138ms   0.08us   0.542ms
      CompareInternedStrings:   124ms   127ms   0.08us   1.367ms
                CompareLongs:   100ms   104ms   0.10us   0.316ms
              CompareStrings:   111ms   115ms   0.12us   0.929ms
              CompareUnicode:   108ms   128ms   0.17us   0.693ms
               ConcatStrings:   142ms   155ms   0.31us   0.562ms
               ConcatUnicode:   119ms   127ms   0.42us   0.384ms
             CreateInstances:   123ms   128ms   1.14us   0.367ms
          CreateNewInstances:   121ms   126ms   1.49us   0.335ms
     CreateStringsWithConcat:   130ms   135ms   0.14us   0.916ms
     CreateUnicodeWithConcat:   130ms   135ms   0.34us   0.361ms
                DictCreation:   108ms   109ms   0.27us   0.361ms
           DictWithFloatKeys:   149ms   153ms   0.17us   0.678ms
         DictWithIntegerKeys:   124ms   126ms   0.11us   0.915ms
          DictWithStringKeys:   114ms   117ms   0.10us   0.905ms
                    ForLoops:   110ms   111ms   4.46us   0.063ms
                  IfThenElse:   118ms   119ms   0.09us   0.685ms
                 ListSlicing:   116ms   120ms   8.59us   0.103ms
              NestedForLoops:   125ms   137ms   0.09us   0.019ms
        NormalClassAttribute:   124ms   136ms   0.11us   0.457ms
     NormalInstanceAttribute:   110ms   117ms   0.10us   0.454ms
         PythonFunctionCalls:   107ms   113ms   0.34us   0.271ms
           PythonMethodCalls:   140ms   149ms   0.66us   0.141ms
                   Recursion:   156ms   166ms   3.32us   0.452ms
                SecondImport:   112ms   118ms   1.18us   0.180ms
         SecondPackageImport:   118ms   127ms   1.27us   0.180ms
       SecondSubmoduleImport:   140ms   151ms   1.51us   0.180ms
     SimpleComplexArithmetic:   128ms   139ms   0.16us   0.361ms
      SimpleDictManipulation:   134ms   136ms   0.11us   0.452ms
       SimpleFloatArithmetic:   110ms   113ms   0.09us   0.571ms
    SimpleIntFloatArithmetic:   106ms   111ms   0.08us   0.548ms
     SimpleIntegerArithmetic:   106ms   109ms   0.08us   0.544ms
      SimpleListManipulation:   103ms   113ms   0.10us   0.587ms
        SimpleLongArithmetic:   112ms   118ms   0.18us   0.271ms
                  SmallLists:   105ms   116ms   0.17us   0.366ms
                 SmallTuples:   108ms   128ms   0.24us   0.406ms
       SpecialClassAttribute:   119ms   136ms   0.11us   0.453ms
    SpecialInstanceAttribute:   143ms   155ms   0.13us   0.454ms
              StringMappings:   115ms   121ms   0.48us   0.405ms
            StringPredicates:   120ms   129ms   0.18us   2.064ms
               StringSlicing:   111ms   127ms   0.23us   0.781ms
                   TryExcept:   125ms   126ms   0.06us   0.681ms
              TryRaiseExcept:   133ms   137ms   2.14us   0.361ms
                TupleSlicing:   117ms   120ms   0.46us   0.066ms
             UnicodeMappings:   156ms   160ms   4.44us   0.429ms
           UnicodePredicates:   117ms   121ms   0.22us   2.487ms
           UnicodeProperties:   115ms   153ms   0.38us   2.070ms
              UnicodeSlicing:   126ms   129ms   0.26us   0.689ms
-------------------------------------------------------------------------------
                      Totals:  6283ms  6673ms
"""

________________________________________________________________________
Writing New Tests
________________________________________________________________________

pybench tests are simple modules defining one or more pybench.Test
subclasses.

Writing a test essentially boils down to providing two methods:
.test(), which runs self.rounds rounds of self.operations operations
each, and .calibrate(), which does the same except that it doesn't
actually execute the operations.

Here's an example:
------------------

from pybench import Test

class IntegerCounting(Test):

    # Version number of the test as float (x.yy); this is important
    # for comparisons of benchmark runs - tests with unequal version
    # number will not get compared.
    version = 1.0

    # The number of abstract operations done in each round of the
    # test. An operation is the basic unit of what you want to
    # measure. The benchmark will output the amount of run-time per
    # operation. Note that in order to raise the measured timings
    # significantly above noise level, it is often required to repeat
    # sets of operations more than once per test round. The measured
    # overhead per test round should be less than 1 second.
    operations = 20

    # Number of rounds to execute per test run. This should be
    # adjusted to a figure that results in a test run-time of between
    # 1-2 seconds (at warp 1).
    rounds = 100000

    def test(self):

        """ Run the test.

            The test needs to run self.rounds executing
            self.operations number of operations each.
        """
        # Init the test
        a = 1

        # Run test rounds
        #
        # NOTE: Use xrange() for all test loops unless you want to face
        #       a 20MB process !
        #
        for i in xrange(self.rounds):

            # Repeat the operations per round to raise the run-time
            # per operation significantly above the noise level of the
            # for-loop overhead.

            # Execute 20 operations (a += 1):
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1

    def calibrate(self):

        """ Calibrate the test.

            This method should execute everything that is needed to
            setup and run the test - except for the actual operations
            that you intend to measure. pybench uses this method to
            measure the test implementation overhead.
        """
        # Init the test
        a = 1

        # Run test rounds (without actually doing any operation)
        for i in xrange(self.rounds):

            # Skip the actual execution of the operations, since we
            # only want to measure the test's administration overhead.
            pass

Registering a new test module
-----------------------------

To register a test module with pybench, the classes need to be
imported into the pybench.Setup module. pybench will then scan all the
symbols defined in that module for subclasses of pybench.Test and
automatically add them to the benchmark suite.
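
As a sketch (the module name MyTests.py is hypothetical; it would
live in the pybench directory next to the other test modules), the
registration amounts to a single star-import added to Setup.py:

  # In pybench's Setup.py -- MyTests is a hypothetical test module
  # containing pybench.Test subclasses such as IntegerCounting:
  from MyTests import *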

Breaking Comparability
----------------------

If a change is made to any individual test that means it is no
longer strictly comparable with previous runs, the '.version' class
variable should be updated. Thereafter, comparisons with previous
versions of the test will list as "n/a" to reflect the change.
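
For example, if the IntegerCounting test above were changed to do 30
operations per round instead of 20, its version should be bumped so
that results from the old variant are no longer compared to new ones:

  class IntegerCounting(Test):

      # Bumped from 1.0: the per-round operation count changed,
      # so timings are not comparable with earlier runs.
      version = 1.1
      operations = 30
      ...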

Version History
---------------

 2.0: rewrote parts of pybench which resulted in more repeatable
      timings:
      - made timer a parameter
      - changed the platform default timer to use high-resolution
        timers rather than process timers (which have a much lower
        resolution)
      - added option to select timer
      - added process time timer (using systimes.py)
      - changed to use min() as timing estimator (average
        is still taken as well to provide an idea of the difference)
      - garbage collection is turned off per default
      - sys check interval is set to the highest possible value
      - calibration is now a separate step and done using
        a different strategy that allows measuring the test
        overhead more accurately
      - modified the tests to each give a run-time of between
        100-200ms using warp 10
      - changed default warp factor to 10 (from 20)
      - compared results with timeit.py and confirmed measurements
      - bumped all test versions to 2.0
      - updated platform.py to the latest version
      - changed the output format a bit to make it look
        nicer
      - refactored the APIs somewhat

 1.3+: Steve Holden added the NewInstances test and the filtering
       option during the NeedForSpeed sprint; this also triggered a
       long discussion on how to improve benchmark timing and finally
       resulted in the release of 2.0

 1.3: initial checkin into the Python SVN repository

Have fun,
--
Marc-Andre Lemburg
mal@lemburg.com