/doc/tutorials/manual/parallel_processing.rst

.. include:: ../../global.inc
.. include:: chapter_numbers.inc

.. _manual.multiprocessing:

#######################################################################################
|manual.multiprocessing.chapter_num|: `Running Tasks and Jobs in parallel`
#######################################################################################

.. hlist::

   * :ref:`Manual overview <manual>`

=====================
Multi Processing
=====================

*Ruffus* uses Python `multiprocessing <http://docs.python.org/library/multiprocessing.html>`_ to run
each job in a separate process.

This means that jobs do *not* necessarily complete in the order of the defined parameters.
Task hierarchies are, of course, inviolate: upstream tasks run before downstream, dependent tasks.

Tasks that are independent (i.e. neither precedes the other) may be run in parallel as well.

The number of concurrent jobs can be set in :ref:`pipeline_run<pipeline_functions.pipeline_run>`:

::

    pipeline_run([parallel_task], multiprocess = 5)

If ``multiprocess`` is set to 1, then jobs will be run in a single process.

=====================
Data sharing
=====================

Running jobs in separate processes allows *Ruffus* to make full use of the multiple
processors in modern computers. However, some of the
`multiprocessing guidelines <http://docs.python.org/library/multiprocessing.html#multiprocessing-programming>`_
should be borne in mind when writing *Ruffus* pipelines. In particular:

* Try not to pass large amounts of data between jobs, or at least be aware that this data has to be
  marshalled across process boundaries.

* Only data which can be `pickled <http://docs.python.org/library/pickle.html>`_ can be passed as
  parameters to *Ruffus* task functions. Happily, that applies to almost any Python data type.
  Using one of the rare unpicklable objects will cause Python to fail loudly when *Ruffus* pipelines
  are run.
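
Both points can be checked ahead of time with the standard ``pickle`` module alone. The following is a minimal sketch, not part of *Ruffus* itself: the helper ``is_picklable`` and the file names are hypothetical and chosen only for illustration. It tests whether a set of job parameters can be pickled, and compares the pickled size of a file name with that of the data it might stand for.

```python
import pickle


def is_picklable(obj):
    """Return True if obj can be serialised with pickle (hypothetical helper)."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# Plain data (strings, numbers, lists, dicts) pickles without trouble.
# The file names here are made up for the example.
job_params = ["input.fastq", "output.bam", {"threads": 4}]

# A lambda is a common example of an object that cannot be pickled.
bad_params = ["input.fastq", lambda line: line.upper()]

# The pickled size in bytes is a rough guide to the marshalling cost:
# a file name is tiny compared with a large in-memory data set.
small = len(pickle.dumps("big_table.csv"))
large = len(pickle.dumps(list(range(100000))))

if __name__ == "__main__":
    print(is_picklable(job_params))
    print(is_picklable(bad_params))
    print(small, large)
```

Passing the name of a file rather than its contents keeps the data crossing the process boundary small, at the cost of each job reading the file itself.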