PageRenderTime 28ms CodeModel.GetById 16ms app.highlight 7ms RepoModel.GetById 0ms app.codeStats 0ms

/SQLAlchemy-0.7.8/doc/_sources/orm/loading.txt

#
Plain Text | 502 lines | 385 code | 117 blank | 0 comment | 0 complexity | eaa49ef33576f2072cb9c8fe8fee0a7b MD5 | raw file
  1.. _loading_toplevel:
  2
  3.. currentmodule:: sqlalchemy.orm
  4
  5Relationship Loading Techniques
  6===============================
  7
  8A big part of SQLAlchemy is providing a wide range of control over how related objects get loaded when querying.   This behavior
  9can be configured at mapper construction time using the ``lazy`` parameter to the :func:`.relationship` function,
 10as well as by using options with the :class:`.Query` object.
 11
 12Using Loader Strategies: Lazy Loading, Eager Loading
 13----------------------------------------------------
 14
 15By default, all inter-object relationships are **lazy loading**. The scalar or
 16collection attribute associated with a :func:`~sqlalchemy.orm.relationship`
 17contains a trigger which fires the first time the attribute is accessed.  This
 18trigger, in all but one case, issues a SQL call at the point of access
 19in order to load the related object or objects:
 20
 21.. sourcecode:: python+sql
 22
 23    {sql}>>> jack.addresses
 24    SELECT addresses.id AS addresses_id, addresses.email_address AS addresses_email_address, 
 25    addresses.user_id AS addresses_user_id
 26    FROM addresses
 27    WHERE ? = addresses.user_id
 28    [5]
 29    {stop}[<Address(u'jack@google.com')>, <Address(u'j25@yahoo.com')>]
 30
 31The one case where SQL is not emitted is for a simple many-to-one relationship, when 
 32the related object can be identified by its primary key alone and that object is already
 33present in the current :class:`.Session`.
 34
 35This default behavior of "load upon attribute access" is known as "lazy" or
 36"select" loading - the name "select" because a "SELECT" statement is typically emitted
 37when the attribute is first accessed.
 38
 39In the :ref:`ormtutorial_toplevel`, we introduced the concept of **Eager
 40Loading**. We used an ``option`` in conjunction with the
 41:class:`~sqlalchemy.orm.query.Query` object in order to indicate that a
 42relationship should be loaded at the same time as the parent, within a single
 43SQL query.   This option, known as :func:`.joinedload`, connects a JOIN (by default
 44a LEFT OUTER join) to the statement and populates the scalar/collection from the
 45same result set as that of the parent:
 46
 47.. sourcecode:: python+sql
 48
 49    {sql}>>> jack = session.query(User).\
 50    ... options(joinedload('addresses')).\
 51    ... filter_by(name='jack').all() #doctest: +NORMALIZE_WHITESPACE
 52    SELECT addresses_1.id AS addresses_1_id, addresses_1.email_address AS addresses_1_email_address,
 53    addresses_1.user_id AS addresses_1_user_id, users.id AS users_id, users.name AS users_name,
 54    users.fullname AS users_fullname, users.password AS users_password
 55    FROM users LEFT OUTER JOIN addresses AS addresses_1 ON users.id = addresses_1.user_id
 56    WHERE users.name = ?
 57    ['jack']
 58
 59
 60In addition to "joined eager loading", a second option for eager loading
 61exists, called "subquery eager loading". This kind of eager loading emits an
 62additional SQL statement for each collection requested, aggregated across all
 63parent objects:
 64
 65.. sourcecode:: python+sql
 66
 67    {sql}>>> jack = session.query(User).\
 68    ... options(subqueryload('addresses')).\
 69    ... filter_by(name='jack').all() 
 70    SELECT users.id AS users_id, users.name AS users_name, users.fullname AS users_fullname, 
 71    users.password AS users_password 
 72    FROM users 
 73    WHERE users.name = ?
 74    ('jack',)
 75    SELECT addresses.id AS addresses_id, addresses.email_address AS addresses_email_address, 
 76    addresses.user_id AS addresses_user_id, anon_1.users_id AS anon_1_users_id 
 77    FROM (SELECT users.id AS users_id 
 78    FROM users 
 79    WHERE users.name = ?) AS anon_1 JOIN addresses ON anon_1.users_id = addresses.user_id 
 80    ORDER BY anon_1.users_id, addresses.id
 81    ('jack',)
 82
 83The default **loader strategy** for any :func:`~sqlalchemy.orm.relationship`
 84is configured by the ``lazy`` keyword argument, which defaults to ``select`` - this indicates
 85a "select" statement .
 86Below we set it as ``joined`` so that the ``children`` relationship is eager
 87loading, using a join:
 88
 89.. sourcecode:: python+sql
 90
 91    # load the 'children' collection using LEFT OUTER JOIN
 92    mapper(Parent, parent_table, properties={
 93        'children': relationship(Child, lazy='joined')
 94    })
 95
 96We can also set it to eagerly load using a second query for all collections,
 97using ``subquery``:
 98
 99.. sourcecode:: python+sql
100
101    # load the 'children' attribute using a join to a subquery
102    mapper(Parent, parent_table, properties={
103        'children': relationship(Child, lazy='subquery')
104    })
105
106When querying, all three choices of loader strategy are available on a
107per-query basis, using the :func:`~sqlalchemy.orm.joinedload`,
108:func:`~sqlalchemy.orm.subqueryload` and :func:`~sqlalchemy.orm.lazyload`
109query options:
110
111.. sourcecode:: python+sql
112
113    # set children to load lazily
114    session.query(Parent).options(lazyload('children')).all()
115
116    # set children to load eagerly with a join
117    session.query(Parent).options(joinedload('children')).all()
118
119    # set children to load eagerly with a second statement
120    session.query(Parent).options(subqueryload('children')).all()
121
122To reference a relationship that is deeper than one level, separate the names by periods:
123
124.. sourcecode:: python+sql
125
126    session.query(Parent).options(joinedload('foo.bar.bat')).all()
127
128When using dot-separated names with :func:`~sqlalchemy.orm.joinedload` or
129:func:`~sqlalchemy.orm.subqueryload`, the option applies **only** to the actual
130attribute named, and **not** its ancestors. For example, suppose a mapping
131from ``A`` to ``B`` to ``C``, where the relationships, named ``atob`` and
132``btoc``, are both lazy-loading. A statement like the following:
133
134.. sourcecode:: python+sql
135
136    session.query(A).options(joinedload('atob.btoc')).all()
137
138will load only ``A`` objects to start. When the ``atob`` attribute on each
139``A`` is accessed, the returned ``B`` objects will *eagerly* load their ``C``
140objects.
141
142Therefore, to modify the eager load to load both ``atob`` as well as ``btoc``,
143place joinedloads for both:
144
145.. sourcecode:: python+sql
146
147    session.query(A).options(joinedload('atob'), joinedload('atob.btoc')).all()
148
149or more succinctly just use :func:`~sqlalchemy.orm.joinedload_all` or
150:func:`~sqlalchemy.orm.subqueryload_all`:
151
152.. sourcecode:: python+sql
153
154    session.query(A).options(joinedload_all('atob.btoc')).all()
155
156There are two other loader strategies available, **dynamic loading** and **no
157loading**; these are described in :ref:`largecollections`.
158
159Default Loading Strategies
160--------------------------
161
162.. versionadded:: 0.7.5
163    Default loader strategies as a new feature.
164
165Each of :func:`.joinedload`, :func:`.subqueryload`, :func:`.lazyload`, 
166and :func:`.noload` can be used to set the default style of
167:func:`.relationship` loading 
168for a particular query, affecting all :func:`.relationship` -mapped
169attributes not otherwise
170specified in the :class:`.Query`.   This feature is available by passing
171the string ``'*'`` as the argument to any of these options::
172
173    session.query(MyClass).options(lazyload('*'))
174
175Above, the ``lazyload('*')`` option will supercede the ``lazy`` setting
176of all :func:`.relationship` constructs in use for that query,
177except for those which use the ``'dynamic'`` style of loading.   
178If some relationships specify
179``lazy='joined'`` or ``lazy='subquery'``, for example,
180using ``default_strategy(lazy='select')`` will unilaterally
181cause all those relationships to use ``'select'`` loading.
182
183The option does not supercede loader options stated in the
184query, such as :func:`.eagerload`, 
185:func:`.subqueryload`, etc.  The query below will still use joined loading
186for the ``widget`` relationship::
187
188    session.query(MyClass).options(
189                                lazyload('*'), 
190                                joinedload(MyClass.widget)
191                            )
192
193If multiple ``'*'`` options are passed, the last one overrides
194those previously passed.
195
196.. _zen_of_eager_loading:
197
198The Zen of Eager Loading
199-------------------------
200
201The philosophy behind loader strategies is that any set of loading schemes can be
202applied to a particular query, and *the results don't change* - only the number 
203of SQL statements required to fully load related objects and collections changes. A particular
204query might start out using all lazy loads.   After using it in context, it might be revealed
205that particular attributes or collections are always accessed, and that it would be more
206efficient to change the loader strategy for these.   The strategy can be changed with no other
207modifications to the query, the results will remain identical, but fewer SQL statements would be emitted.
208In theory (and pretty much in practice), nothing you can do to the :class:`.Query` would make it load
209a different set of primary or related objects based on a change in loader strategy.
210
211How :func:`joinedload` in particular achieves this result of not impacting
212entity rows returned in any way is that it creates an anonymous alias of the joins it adds to your
213query, so that they can't be referenced by other parts of the query.   For example,
214the query below uses :func:`.joinedload` to create a LEFT OUTER JOIN from ``users``
215to ``addresses``, however the ``ORDER BY`` added against ``Address.email_address``
216is not valid - the ``Address`` entity is not named in the query:
217
218.. sourcecode:: python+sql
219
220    >>> jack = session.query(User).\
221    ... options(joinedload(User.addresses)).\
222    ... filter(User.name=='jack').\
223    ... order_by(Address.email_address).all() 
224    {opensql}SELECT addresses_1.id AS addresses_1_id, addresses_1.email_address AS addresses_1_email_address,
225    addresses_1.user_id AS addresses_1_user_id, users.id AS users_id, users.name AS users_name,
226    users.fullname AS users_fullname, users.password AS users_password
227    FROM users LEFT OUTER JOIN addresses AS addresses_1 ON users.id = addresses_1.user_id
228    WHERE users.name = ? ORDER BY addresses.email_address   <-- this part is wrong !
229    ['jack']
230
231Above, ``ORDER BY addresses.email_address`` is not valid since ``addresses`` is not in the 
232FROM list.   The correct way to load the ``User`` records and order by email
233address is to use :meth:`.Query.join`:
234
235.. sourcecode:: python+sql
236
237    >>> jack = session.query(User).\
238    ... join(User.addresses).\
239    ... filter(User.name=='jack').\
240    ... order_by(Address.email_address).all() 
241    {opensql}
242    SELECT users.id AS users_id, users.name AS users_name,
243    users.fullname AS users_fullname, users.password AS users_password
244    FROM users JOIN addresses ON users.id = addresses.user_id
245    WHERE users.name = ? ORDER BY addresses.email_address
246    ['jack']
247
248The statement above is of course not the same as the previous one, in that the columns from ``addresses``
249are not included in the result at all.   We can add :func:`.joinedload` back in, so that
250there are two joins - one is that which we are ordering on, the other is used anonymously to 
251load the contents of the ``User.addresses`` collection:
252
253.. sourcecode:: python+sql
254
255    >>> jack = session.query(User).\
256    ... join(User.addresses).\
257    ... options(joinedload(User.addresses)).\
258    ... filter(User.name=='jack').\
259    ... order_by(Address.email_address).all() 
260    {opensql}SELECT addresses_1.id AS addresses_1_id, addresses_1.email_address AS addresses_1_email_address,
261    addresses_1.user_id AS addresses_1_user_id, users.id AS users_id, users.name AS users_name,
262    users.fullname AS users_fullname, users.password AS users_password
263    FROM users JOIN addresses ON users.id = addresses.user_id
264    LEFT OUTER JOIN addresses AS addresses_1 ON users.id = addresses_1.user_id
265    WHERE users.name = ? ORDER BY addresses.email_address
266    ['jack']
267
268What we see above is that our usage of :meth:`.Query.join` is to supply JOIN clauses we'd like
269to use in subsequent query criterion, whereas our usage of :func:`.joinedload` only concerns
270itself with the loading of the ``User.addresses`` collection, for each ``User`` in the result.
271In this case, the two joins most probably appear redundant - which they are.  If we
272wanted to use just one JOIN for collection loading as well as ordering, we use the 
273:func:`.contains_eager` option, described in :ref:`contains_eager` below.   But 
274to see why :func:`joinedload` does what it does, consider if we were **filtering** on a
275particular ``Address``:
276
277.. sourcecode:: python+sql
278
279    >>> jack = session.query(User).\
280    ... join(User.addresses).\
281    ... options(joinedload(User.addresses)).\
282    ... filter(User.name=='jack').\
283    ... filter(Address.email_address=='someaddress@foo.com').\
284    ... all() 
285    {opensql}SELECT addresses_1.id AS addresses_1_id, addresses_1.email_address AS addresses_1_email_address,
286    addresses_1.user_id AS addresses_1_user_id, users.id AS users_id, users.name AS users_name,
287    users.fullname AS users_fullname, users.password AS users_password
288    FROM users JOIN addresses ON users.id = addresses.user_id
289    LEFT OUTER JOIN addresses AS addresses_1 ON users.id = addresses_1.user_id
290    WHERE users.name = ? AND addresses.email_address = ?
291    ['jack', 'someaddress@foo.com']
292
293Above, we can see that the two JOINs have very different roles.  One will match exactly
294one row, that of the join of ``User`` and ``Address`` where ``Address.email_address=='someaddress@foo.com'``.
295The other LEFT OUTER JOIN will match *all* ``Address`` rows related to ``User``,
296and is only used to populate the ``User.addresses`` collection, for those ``User`` objects
297that are returned.
298
299By changing the usage of :func:`.joinedload` to another style of loading, we can change
300how the collection is loaded completely independently of SQL used to retrieve
301the actual ``User`` rows we want.  Below we change :func:`.joinedload` into
302:func:`.subqueryload`:
303
304.. sourcecode:: python+sql
305
306    >>> jack = session.query(User).\
307    ... join(User.addresses).\
308    ... options(subqueryload(User.addresses)).\
309    ... filter(User.name=='jack').\
310    ... filter(Address.email_address=='someaddress@foo.com').\
311    ... all() 
312    {opensql}SELECT users.id AS users_id, users.name AS users_name,
313    users.fullname AS users_fullname, users.password AS users_password
314    FROM users JOIN addresses ON users.id = addresses.user_id
315    WHERE users.name = ? AND addresses.email_address = ?
316    ['jack', 'someaddress@foo.com']
317
318    # ... subqueryload() emits a SELECT in order 
319    # to load all address records ...
320
321When using joined eager loading, if the
322query contains a modifier that impacts the rows returned
323externally to the joins, such as when using DISTINCT, LIMIT, OFFSET
324or equivalent, the completed statement is first
325wrapped inside a subquery, and the joins used specifically for joined eager
326loading are applied to the subquery.   SQLAlchemy's 
327joined eager loading goes the extra mile, and then ten miles further, to 
328absolutely ensure that it does not affect the end result of the query, only
329the way collections and related objects are loaded, no matter what the format of the query is.
330
331What Kind of Loading to Use ?
332-----------------------------
333
334Which type of loading to use typically comes down to optimizing the tradeoff
335between number of SQL executions, complexity of SQL emitted, and amount of
336data fetched. Lets take two examples, a :func:`~sqlalchemy.orm.relationship`
337which references a collection, and a :func:`~sqlalchemy.orm.relationship` that
338references a scalar many-to-one reference.
339
340* One to Many Collection
341
342 * When using the default lazy loading, if you load 100 objects, and then access a collection on each of
343   them, a total of 101 SQL statements will be emitted, although each statement will typically be a
344   simple SELECT without any joins.
345
346 * When using joined loading, the load of 100 objects and their collections will emit only one SQL
347   statement.  However, the 
348   total number of rows fetched will be equal to the sum of the size of all the collections, plus one 
349   extra row for each parent object that has an empty collection.  Each row will also contain the full
350   set of columns represented by the parents, repeated for each collection item - SQLAlchemy does not
351   re-fetch these columns other than those of the primary key, however most DBAPIs (with some 
352   exceptions) will transmit the full data of each parent over the wire to the client connection in 
353   any case.  Therefore joined eager loading only makes sense when the size of the collections are 
354   relatively small.  The LEFT OUTER JOIN can also be performance intensive compared to an INNER join.
355
356 * When using subquery loading, the load of 100 objects will emit two SQL statements.  The second
357   statement will fetch a total number of rows equal to the sum of the size of all collections.  An
358   INNER JOIN is used, and a minimum of parent columns are requested, only the primary keys.  So a 
359   subquery load makes sense when the collections are larger.
360
361 * When multiple levels of depth are used with joined or subquery loading, loading collections-within-
362   collections will multiply the total number of rows fetched in a cartesian fashion.  Both forms
363   of eager loading always join from the original parent class.
364
365* Many to One Reference
366
367 * When using the default lazy loading, a load of 100 objects will like in the case of the collection
368   emit as many as 101 SQL statements.  However - there is a significant exception to this, in that
369   if the many-to-one reference is a simple foreign key reference to the target's primary key, each
370   reference will be checked first in the current identity map using :meth:`.Query.get`.  So here, 
371   if the collection of objects references a relatively small set of target objects, or the full set
372   of possible target objects have already been loaded into the session and are strongly referenced,
373   using the default of `lazy='select'` is by far the most efficient way to go.
374
375 * When using joined loading, the load of 100 objects will emit only one SQL statement.   The join
376   will be a LEFT OUTER JOIN, and the total number of rows will be equal to 100 in all cases.
377   If you know that each parent definitely has a child (i.e. the foreign
378   key reference is NOT NULL), the joined load can be configured with ``innerjoin=True``, which is
379   usually specified within the :func:`~sqlalchemy.orm.relationship`.   For a load of objects where
380   there are many possible target references which may have not been loaded already, joined loading
381   with an INNER JOIN is extremely efficient.
382
383 * Subquery loading will issue a second load for all the child objects, so for a load of 100 objects
384   there would be two SQL statements emitted.  There's probably not much advantage here over
385   joined loading, however, except perhaps that subquery loading can use an INNER JOIN in all cases
386   whereas joined loading requires that the foreign key is NOT NULL.
387
388.. _joinedload_and_join:
389
390.. _contains_eager:
391
392Routing Explicit Joins/Statements into Eagerly Loaded Collections
393------------------------------------------------------------------
394
395The behavior of :func:`~sqlalchemy.orm.joinedload()` is such that joins are
396created automatically, using anonymous aliases as targets, the results of which 
397are routed into collections and
398scalar references on loaded objects. It is often the case that a query already
399includes the necessary joins which represent a particular collection or scalar
400reference, and the joins added by the joinedload feature are redundant - yet
401you'd still like the collections/references to be populated.
402
403For this SQLAlchemy supplies the :func:`~sqlalchemy.orm.contains_eager()`
404option. This option is used in the same manner as the
405:func:`~sqlalchemy.orm.joinedload()` option except it is assumed that the
406:class:`~sqlalchemy.orm.query.Query` will specify the appropriate joins
407explicitly. Below it's used with a ``from_statement`` load::
408
409    # mapping is the users->addresses mapping
410    mapper(User, users_table, properties={
411        'addresses': relationship(Address, addresses_table)
412    })
413
414    # define a query on USERS with an outer join to ADDRESSES
415    statement = users_table.outerjoin(addresses_table).select().apply_labels()
416
417    # construct a Query object which expects the "addresses" results
418    query = session.query(User).options(contains_eager('addresses'))
419
420    # get results normally
421    r = query.from_statement(statement)
422
423It works just as well with an inline :meth:`.Query.join` or
424:meth:`.Query.outerjoin`::
425
426    session.query(User).outerjoin(User.addresses).options(contains_eager(User.addresses)).all()
427
428If the "eager" portion of the statement is "aliased", the ``alias`` keyword
429argument to :func:`~sqlalchemy.orm.contains_eager` may be used to indicate it.
430This is a string alias name or reference to an actual
431:class:`~sqlalchemy.sql.expression.Alias` (or other selectable) object:
432
433.. sourcecode:: python+sql
434
435    # use an alias of the Address entity
436    adalias = aliased(Address)
437
438    # construct a Query object which expects the "addresses" results
439    query = session.query(User).\
440        outerjoin(adalias, User.addresses).\
441        options(contains_eager(User.addresses, alias=adalias))
442
443    # get results normally
444    {sql}r = query.all()
445    SELECT users.user_id AS users_user_id, users.user_name AS users_user_name, adalias.address_id AS adalias_address_id,
446    adalias.user_id AS adalias_user_id, adalias.email_address AS adalias_email_address, (...other columns...)
447    FROM users LEFT OUTER JOIN email_addresses AS email_addresses_1 ON users.user_id = email_addresses_1.user_id
448
449The ``alias`` argument is used only as a source of columns to match up to the
450result set. You can use it to match up the result to arbitrary label
451names in a string SQL statement, by passing a :func:`.select` which links those
452labels to the mapped :class:`.Table`::
453
454    # label the columns of the addresses table
455    eager_columns = select([
456                        addresses.c.address_id.label('a1'),
457                        addresses.c.email_address.label('a2'),
458                        addresses.c.user_id.label('a3')])
459
460    # select from a raw SQL statement which uses those label names for the
461    # addresses table.  contains_eager() matches them up.
462    query = session.query(User).\
463        from_statement("select users.*, addresses.address_id as a1, "
464                "addresses.email_address as a2, addresses.user_id as a3 "
465                "from users left outer join addresses on users.user_id=addresses.user_id").\
466        options(contains_eager(User.addresses, alias=eager_columns))
467
468The path given as the argument to :func:`.contains_eager` needs
469to be a full path from the starting entity. For example if we were loading
470``Users->orders->Order->items->Item``, the string version would look like::
471
472    query(User).options(contains_eager('orders', 'items'))
473
474Or using the class-bound descriptor::
475
476    query(User).options(contains_eager(User.orders, Order.items))
477
478
479Relation Loader API
480--------------------
481
482.. autofunction:: contains_alias
483
484.. autofunction:: contains_eager
485
486.. autofunction:: eagerload
487
488.. autofunction:: eagerload_all
489
490.. autofunction:: immediateload
491
492.. autofunction:: joinedload
493
494.. autofunction:: joinedload_all
495
496.. autofunction:: lazyload
497
498.. autofunction:: noload
499
500.. autofunction:: subqueryload
501
502.. autofunction:: subqueryload_all