.. include:: ../../global.inc
.. _manual.introduction:

.. index::
    pair: manual; introduction


####################################################################
**Ruffus** Manual
####################################################################

| The chapters of this manual go through each of the features of **Ruffus** in turn.
| Some of these (especially those labelled **esoteric** or **deprecated**) may not
  be of interest to all users of **Ruffus**.

If you are looking for a quick introduction to **Ruffus**, you may want to read the
:ref:`Simple Tutorial <Simple_Tutorial>` first. Some of its content is repeated,
or elaborated on, in this manual.


***************************************
**Ruffus** Manual: Table of Contents
***************************************

.. toctree::
    :maxdepth: 1

    follows.rst
    tasks_as_recipes.rst
    files.rst
    tasks_and_globs_in_inputs.rst
    tracing_pipeline_parameters.rst
    parallel_processing.rst
    split.rst
    transform.rst
    merge.rst
    posttask.rst
    jobs_limit.rst
    dependencies.rst
    onthefly.rst
    collate.rst
    advanced_transform.rst
    parallel.rst
    check_if_uptodate.rst
    exceptions.rst
    logging.rst
    files_re.rst



***************************************
Introduction
***************************************

    The **Ruffus** module is a lightweight way to run computational pipelines.

    Computational pipelines often become quite simple
    if we break down the process into simple stages.

    .. note::

        Ruffus refers to each stage of your pipeline as a :term:`task`.

    | Let us start with the usual "Hello World".
    | We have the following two Python functions which
      we would like to turn into an automatic pipeline:


        .. image:: ../../images/simple_tutorial_hello_world.png

    .. ::

        ::

            def first_task():
                print("Hello ")

            def second_task():
                print("world")


    The simplest **Ruffus** pipeline would look like this:

        .. image:: ../../images/simple_tutorial_intro_follows.png

    .. ::

        ::

            from ruffus import *

            def first_task():
                print("Hello ")

            @follows(first_task)
            def second_task():
                print("world")

            pipeline_run([second_task])


    The functions which do the actual work of each stage of the pipeline remain unchanged.
    The role of **Ruffus** is to make sure these functions are called in the right order,
    with the right parameters, running in parallel using multiprocessing if desired.

    There are three simple parts to building a **Ruffus** pipeline:

        #. Importing ruffus
        #. "Decorating" functions which are part of the pipeline
        #. Running the pipeline!
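
    To make "called in the right order" concrete, the following is a minimal plain-Python
    sketch of dependency-ordered execution. It is an illustration only, not how **Ruffus**
    is implemented: ``toy_follows``, ``toy_pipeline_run`` and the ``_upstream`` attribute
    are hypothetical names, and the sketch assumes the dependencies contain no cycles.

        ::

            def toy_follows(*upstream):
                """Hypothetical stand-in for @follows: record upstream tasks on the function."""
                def decorate(func):
                    func._upstream = list(upstream)
                    return func          # the function itself is returned unchanged
                return decorate

            def toy_pipeline_run(targets, _done=None):
                """Call each target after everything it depends on, each task at most once."""
                done = [] if _done is None else _done
                for task in targets:
                    # Run whatever this task depends on first.
                    toy_pipeline_run(getattr(task, "_upstream", []), done)
                    if task not in done:
                        task()
                        done.append(task)
                return [t.__name__ for t in done]

            calls = []

            def first_task():
                calls.append("Hello")

            @toy_follows(first_task)
            def second_task():
                calls.append("world")

            order = toy_pipeline_run([second_task])
            print(order)   # ['first_task', 'second_task']
            print(calls)   # ['Hello', 'world']

    Asking for ``second_task`` alone is enough: the sketch walks back to ``first_task``
    and calls it first.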

.. _manual.introduction.import:

.. index::
    single: importing ruffus


****************************
Importing ruffus
****************************

    The most convenient way to use ruffus is to import the various names directly:

        ::

            from ruffus import *

    This will allow **ruffus** terms to be used directly in your code. This is also
    the style we have adopted for this manual.

    .. csv-table::
       :header: "Category", "Terms"
       :stub-columns: 1

       "*Pipeline functions*", "
       ::

         pipeline_printout
         pipeline_printout_graph
         pipeline_run
         register_cleanup"
       "*Decorators*", "
       ::

        @follows
        @files
        @split
        @transform
        @merge
        @collate
        @posttask
        @jobs_limit
        @parallel
        @check_if_uptodate
        @files_re"
       "*Loggers*", "
       ::

         stderr_logger
         black_hole_logger"
       "*Parameter-disambiguating indicators*", "
       ::

         suffix
         regex
         inputs
         touch_file
         combine
         mkdir
         output_from"

    If any of these clash with names in your code, you can use qualified names instead:

        ::

            import ruffus

            ruffus.pipeline_printout("...")

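    The clash itself is ordinary Python name rebinding. The sketch below uses the standard
    library's ``re`` module purely as a stand-in for **ruffus** (both export a name
    ``split``), and imports that one name explicitly to mimic what a star import does for
    every exported name. The helper ``split`` and the variable ``our_split`` are
    hypothetical names chosen for this illustration.

        ::

            import re

            def split(text):
                """Our own helper: split a comma-separated string."""
                return text.split(",")

            our_split = split                  # keep a reference to our helper

            # Stand-in for a star import, which rebinds every exported name like this.
            from re import split

            shadowed = split is not our_split  # True: the name now refers to re.split

            # With only the qualified import, our own name stays ours.
            split = our_split
            parts = split("a,b")               # our helper
            pieces = re.split(",", "a,b")      # the library function, fully qualified

            print(shadowed, parts, pieces)     # True ['a', 'b'] ['a', 'b']
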

.. index::
    pair: decorators; Manual

.. _manual.introduction.decorators:

****************************
"Decorating" functions
****************************

    You need to tag (or :term:`decorate <decorator>`) existing functions to tell **Ruffus**
    that they are part of the pipeline.

    .. note::

        :term:`decorator`\ s are ways to tag or mark out functions.

        They start with an ``@`` prefix and take a number of parameters in parentheses.

        .. image:: ../../images/simple_tutorial_decorator_syntax.png

    The **Ruffus** decorator :ref:`@follows <decorators.follows>` makes sure that
    ``second_task`` follows ``first_task``.


    | Multiple :term:`decorator`\ s can be used for each :term:`task` function to add functionality
      to **Ruffus** pipeline functions.
    | However, the decorated Python functions can still be
      called normally, outside of **Ruffus**.
    | **Ruffus** :term:`decorator`\ s can be added to (stacked on top of) any function in any order.

    * :ref:`More on @follows in |manual.follows.chapter_num| <manual.follows>`
    * :ref:`@follows syntax in detail <decorators.follows>`
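
    These properties can be sketched in plain Python. The ``tag`` decorator below is
    hypothetical and not part of **Ruffus**: it records a label on the function and
    returns the function itself, which is one way a decorator can leave the function
    callable as usual and make the stacking order irrelevant.

        ::

            def tag(label):
                """Hypothetical decorator factory: attach a label, return the function unchanged."""
                def decorate(func):
                    func.tags = getattr(func, "tags", set()) | {label}
                    return func
                return decorate

            @tag("stage-1")
            @tag("slow")
            def first_task():
                return "Hello"

            @tag("slow")
            @tag("stage-1")
            def same_tags_other_order():
                return "Hello"

            # The decorated function is still an ordinary function.
            print(first_task())                                    # Hello

            # Stacking order makes no difference to the recorded tags.
            print(first_task.tags == same_tags_other_order.tags)   # True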

.. index::
    pair: Running the pipeline; Manual
    pair: pipeline_run; Manual

.. _manual.introduction.running_pipeline:

****************************
Running the pipeline
****************************

    We run the pipeline by specifying its **last** stage (:term:`task` function).
    **Ruffus** works out which other functions this task depends on and follows the chain of
    dependencies automatically, making sure that the entire pipeline is up-to-date.

    In our example above, because ``second_task`` depends on ``first_task``, both functions are executed in order.

        ::

            >>> pipeline_run([second_task], verbose = 1)

    **Ruffus** by default prints out its progress through your pipeline (controlled by ``verbose``),
    interleaved with our ``Hello`` and ``world``.

        .. image:: ../../images/simple_tutorial_hello_world_output.png

    .. ::

        ::

            >>> pipeline_run([second_task], verbose = 1)
            Start Task = first_task
            Hello
                Job completed
            Completed Task = first_task
            Start Task = second_task
            world
                Job completed
            Completed Task = second_task
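
    The idea of naming only the last stage can be sketched as a walk back along the
    dependency chain. The function ``tasks_needed``, the ``depends_on`` dictionary and the
    extra stage ``third_task`` are hypothetical, introduced only for this illustration;
    this is not how **Ruffus** itself resolves dependencies, and the sketch assumes there
    are no circular dependencies.

        ::

            def tasks_needed(target, depends_on):
                """Return the target and everything upstream of it, in run order."""
                ordered = []

                def visit(task):
                    if task in ordered:
                        return
                    # Anything this task depends on must come first.
                    for upstream in depends_on.get(task, []):
                        visit(upstream)
                    ordered.append(task)

                visit(target)
                return ordered

            depends_on = {
                "second_task": ["first_task"],
                "third_task": ["second_task"],   # hypothetical extra stage
            }

            print(tasks_needed("second_task", depends_on))
            # ['first_task', 'second_task']
            print(tasks_needed("third_task", depends_on))
            # ['first_task', 'second_task', 'third_task']

    Naming an earlier stage as the target selects only that stage and the stages upstream
    of it; later stages are left out.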