- .. include:: ../../global.inc
- .. _manual.introduction:
- .. index::
- pair: manual; introduction
-
- ####################################################################
- **Ruffus** Manual
- ####################################################################
- | The chapters of this manual go through each of the features of **Ruffus** in turn.
- | Some of these (especially those labelled **esoteric** or **deprecated**) may not
- be of interest to all users of **Ruffus**.
- If you are looking for a quick introduction to **Ruffus**, you may want to look at the
- :ref:`Simple Tutorial <Simple_Tutorial>` first. Some of its content is shared with,
- or elaborated on by, this manual.
- ***************************************
- **Ruffus** Manual: Table of Contents:
- ***************************************
- .. toctree::
- :maxdepth: 1
- follows.rst
- tasks_as_recipes.rst
- files.rst
- tasks_and_globs_in_inputs.rst
- tracing_pipeline_parameters.rst
- parallel_processing.rst
- split.rst
- transform.rst
- merge.rst
- posttask.rst
- jobs_limit.rst
- dependencies.rst
- onthefly.rst
- collate.rst
- advanced_transform.rst
- parallel.rst
- check_if_uptodate.rst
- exceptions.rst
- logging.rst
- files_re.rst
- ***************************************
- Introduction
- ***************************************
- The **Ruffus** module is a lightweight way to run computational pipelines.
-
- Computational pipelines often become quite simple
- if we break down the process into simple stages.
-
- .. note::
-
- Ruffus refers to each stage of your pipeline as a :term:`task`.
- | Let us start with the usual "Hello World".
- | We have the following two Python functions which
- we would like to turn into an automatic pipeline:
-
-
- .. image:: ../../images/simple_tutorial_hello_world.png
- .. ::
-
- ::
-
-     def first_task():
-         print("Hello ")
-
-     def second_task():
-         print("world")
-
- The simplest **Ruffus** pipeline would look like this:
-
- .. image:: ../../images/simple_tutorial_intro_follows.png
-
- .. ::
-
- ::
-
-     from ruffus import *
-
-     def first_task():
-         print("Hello ")
-
-     @follows(first_task)
-     def second_task():
-         print("world")
-
-     pipeline_run([second_task])
-
- The functions which do the actual work of each stage of the pipeline remain unchanged.
- The role of **Ruffus** is to make sure these functions are called in the right order,
- with the right parameters, running in parallel using multiprocessing if desired.
-
- There are three simple parts to building a **ruffus** pipeline:
-
- #. Importing ruffus
- #. "Decorating" functions which are part of the pipeline
- #. Running the pipeline!
-
- .. _manual.introduction.import:
- .. index::
- single: importing ruffus
- ****************************
- Importing ruffus
- ****************************
- The most convenient way to use ruffus is to import the various names directly:
-
- ::
-
- from ruffus import *
- This will allow **ruffus** terms to be used directly in your code. This is also
- the style we have adopted for this manual.
-
- .. csv-table::
- :header: "Category", "Terms"
- :stub-columns: 1
- "*Pipeline functions*", "
- ::
-
- pipeline_printout
- pipeline_printout_graph
- pipeline_run
- register_cleanup"
- "*Decorators*", "
- ::
-
- @follows
- @files
- @split
- @transform
- @merge
- @collate
- @posttask
- @jobs_limit
- @parallel
- @check_if_uptodate
- @files_re"
- "*Loggers*", "
- ::
- stderr_logger
- black_hole_logger"
- "*Parameter disambiguating Indicators*", "
- ::
-
- suffix
- regex
- inputs
- touch_file
- combine
- mkdir
- output_from"
-
- If any of these clash with names in your code, you can use qualified names instead:
- ::
-
- import ruffus
-
- ruffus.pipeline_printout("...")
-
- .. index::
- pair: decorators; Manual
- .. _manual.introduction.decorators:
- ****************************
- "Decorating" functions
- ****************************
- You need to tag or :term:`decorate <decorator>` existing functions to tell **Ruffus** that they are part
- of the pipeline.
-
- .. note::
-
- :term:`decorator`\ s are ways to tag or mark out functions.
- They start with an ``@`` prefix and take a number of parameters in parentheses.
- .. image:: ../../images/simple_tutorial_decorator_syntax.png
-
- The **ruffus** decorator :ref:`@follows <decorators.follows>` makes sure that
- ``second_task`` follows ``first_task``.
-
- | Multiple :term:`decorator`\ s can be used for each :term:`task` function to add functionality
- to *Ruffus* pipeline functions.
- | However, the decorated Python functions can still be
- called normally, outside of *Ruffus*.
- | *Ruffus* :term:`decorator`\ s can be added to (stacked on top of) any function in any order.
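
To make the idea of tagging concrete, here is a minimal sketch in plain Python of a decorator that records a dependency on a function without changing how the function is called. It is not how **Ruffus** implements ``@follows``; the names ``toy_follows`` and ``toy_dependencies`` are hypothetical and exist only for this illustration.

```python
def toy_follows(*earlier_tasks):
    """Hypothetical stand-in for a tagging decorator (not the Ruffus API)."""
    def tag(func):
        # Record the dependencies as an attribute. The function itself is
        # returned unchanged, so it can still be called normally.
        func.toy_dependencies = list(earlier_tasks)
        return func
    return tag


def first_task():
    return "Hello "


@toy_follows(first_task)
def second_task():
    return "world"


# The decorated function behaves exactly as before ...
greeting = first_task() + second_task()
# ... but now carries a tag that a pipeline runner could inspect.
tagged_names = [task.__name__ for task in second_task.toy_dependencies]
print(greeting)
print(tagged_names)
```

Because the tag is only an attribute on the function object, calling ``second_task()`` directly still works, which mirrors the point above that decorated functions remain usable outside the pipeline.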
- * :ref:`More on @follows in |manual.follows.chapter_num| <manual.follows>`
- * :ref:`@follows syntax in detail <decorators.follows>`
- .. index::
- pair: Running the pipeline; Manual
- pair: pipeline_run; Manual
- .. _manual.introduction.running_pipeline:
- ****************************
- Running the pipeline
- ****************************
- We run the pipeline by specifying the **last** stage (:term:`task` function) of the pipeline.
- **Ruffus** works out which other functions this depends on, following the appropriate chain of
- dependencies automatically, making sure that the entire pipeline is up-to-date.
- In our example above, because ``second_task`` depends on ``first_task``, both functions are executed in order.
- ::
-
- >>> pipeline_run([second_task], verbose = 1)
-
- By default, **Ruffus** prints out its ``verbose`` progress through your pipeline,
- interleaved with our ``Hello`` and ``world``.
-
- .. image:: ../../images/simple_tutorial_hello_world_output.png
- .. ::
-
- ::
-
- >>> pipeline_run([second_task], verbose = 1)
- Start Task = first_task
- Hello
- Job completed
- Completed Task = first_task
- Start Task = second_task
- world
- Job completed
- Completed Task = second_task
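
The ordering seen in this output, where naming only the last task causes the earlier one to run first, can be illustrated with a small depth-first traversal. This is a sketch under stated assumptions rather than the **Ruffus** scheduler: ``toy_run`` and the ``toy_dependencies`` table are hypothetical names, and the sketch ignores up-to-date checks, parallel execution and cycle detection.

```python
def toy_run(targets, dependencies):
    """Return task names in an order that places every dependency first.

    `dependencies` maps a task name to the names it must follow.
    Illustrative depth-first traversal only, not Ruffus code.
    """
    order = []
    visited = set()

    def visit(name):
        if name in visited:
            return
        visited.add(name)
        for earlier in dependencies.get(name, []):
            visit(earlier)
        # All prerequisites have been scheduled, so this task can follow them.
        order.append(name)

    for target in targets:
        visit(target)
    return order


# Hypothetical table mirroring the example: second_task follows first_task.
toy_dependencies = {"second_task": ["first_task"], "first_task": []}
run_order = toy_run(["second_task"], toy_dependencies)
print(run_order)
```

Asking only for ``second_task`` yields ``first_task`` before it, which is the same order as in the output above.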