PageRenderTime 28ms CodeModel.GetById 16ms app.highlight 7ms RepoModel.GetById 0ms app.codeStats 0ms


Plain Text | 502 lines | 385 code | 117 blank | 0 comment | 0 complexity | eaa49ef33576f2072cb9c8fe8fee0a7b MD5 | raw file
  1.. _loading_toplevel:
  3.. currentmodule:: sqlalchemy.orm
  5Relationship Loading Techniques
  8A big part of SQLAlchemy is providing a wide range of control over how related objects get loaded when querying.   This behavior
  9can be configured at mapper construction time using the ``lazy`` parameter to the :func:`.relationship` function,
 10as well as by using options with the :class:`.Query` object.
 12Using Loader Strategies: Lazy Loading, Eager Loading
 15By default, all inter-object relationships are **lazy loading**. The scalar or
 16collection attribute associated with a :func:`~sqlalchemy.orm.relationship`
 17contains a trigger which fires the first time the attribute is accessed.  This
 18trigger, in all but one case, issues a SQL call at the point of access
 19in order to load the related object or objects:
 21.. sourcecode:: python+sql
 23    {sql}>>> jack.addresses
 24    SELECT AS addresses_id, addresses.email_address AS addresses_email_address, 
 25    addresses.user_id AS addresses_user_id
 26    FROM addresses
 27    WHERE ? = addresses.user_id
 28    [5]
 29    {stop}[<Address(u'')>, <Address(u'')>]
 31The one case where SQL is not emitted is for a simple many-to-one relationship, when 
 32the related object can be identified by its primary key alone and that object is already
 33present in the current :class:`.Session`.
 35This default behavior of "load upon attribute access" is known as "lazy" or
 36"select" loading - the name "select" because a "SELECT" statement is typically emitted
 37when the attribute is first accessed.
 39In the :ref:`ormtutorial_toplevel`, we introduced the concept of **Eager
 40Loading**. We used an ``option`` in conjunction with the
 41:class:`~sqlalchemy.orm.query.Query` object in order to indicate that a
 42relationship should be loaded at the same time as the parent, within a single
 43SQL query.   This option, known as :func:`.joinedload`, connects a JOIN (by default
 44a LEFT OUTER join) to the statement and populates the scalar/collection from the
 45same result set as that of the parent:
 47.. sourcecode:: python+sql
 49    {sql}>>> jack = session.query(User).\
 50    ... options(joinedload('addresses')).\
 51    ... filter_by(name='jack').all() #doctest: +NORMALIZE_WHITESPACE
 52    SELECT AS addresses_1_id, addresses_1.email_address AS addresses_1_email_address,
 53    addresses_1.user_id AS addresses_1_user_id, AS users_id, AS users_name,
 54    users.fullname AS users_fullname, users.password AS users_password
 55    FROM users LEFT OUTER JOIN addresses AS addresses_1 ON = addresses_1.user_id
 56    WHERE = ?
 57    ['jack']
 60In addition to "joined eager loading", a second option for eager loading
 61exists, called "subquery eager loading". This kind of eager loading emits an
 62additional SQL statement for each collection requested, aggregated across all
 63parent objects:
 65.. sourcecode:: python+sql
 67    {sql}>>> jack = session.query(User).\
 68    ... options(subqueryload('addresses')).\
 69    ... filter_by(name='jack').all() 
 70    SELECT AS users_id, AS users_name, users.fullname AS users_fullname, 
 71    users.password AS users_password 
 72    FROM users 
 73    WHERE = ?
 74    ('jack',)
 75    SELECT AS addresses_id, addresses.email_address AS addresses_email_address, 
 76    addresses.user_id AS addresses_user_id, anon_1.users_id AS anon_1_users_id 
 77    FROM (SELECT AS users_id 
 78    FROM users 
 79    WHERE = ?) AS anon_1 JOIN addresses ON anon_1.users_id = addresses.user_id 
 80    ORDER BY anon_1.users_id,
 81    ('jack',)
 83The default **loader strategy** for any :func:`~sqlalchemy.orm.relationship`
 84is configured by the ``lazy`` keyword argument, which defaults to ``select`` - this indicates
 85a "select" statement .
 86Below we set it as ``joined`` so that the ``children`` relationship is eager
 87loading, using a join:
 89.. sourcecode:: python+sql
 91    # load the 'children' collection using LEFT OUTER JOIN
 92    mapper(Parent, parent_table, properties={
 93        'children': relationship(Child, lazy='joined')
 94    })
 96We can also set it to eagerly load using a second query for all collections,
 97using ``subquery``:
 99.. sourcecode:: python+sql
101    # load the 'children' attribute using a join to a subquery
102    mapper(Parent, parent_table, properties={
103        'children': relationship(Child, lazy='subquery')
104    })
106When querying, all three choices of loader strategy are available on a
107per-query basis, using the :func:`~sqlalchemy.orm.joinedload`,
108:func:`~sqlalchemy.orm.subqueryload` and :func:`~sqlalchemy.orm.lazyload`
109query options:
111.. sourcecode:: python+sql
113    # set children to load lazily
114    session.query(Parent).options(lazyload('children')).all()
116    # set children to load eagerly with a join
117    session.query(Parent).options(joinedload('children')).all()
119    # set children to load eagerly with a second statement
120    session.query(Parent).options(subqueryload('children')).all()
122To reference a relationship that is deeper than one level, separate the names by periods:
124.. sourcecode:: python+sql
126    session.query(Parent).options(joinedload('')).all()
128When using dot-separated names with :func:`~sqlalchemy.orm.joinedload` or
129:func:`~sqlalchemy.orm.subqueryload`, the option applies **only** to the actual
130attribute named, and **not** its ancestors. For example, suppose a mapping
131from ``A`` to ``B`` to ``C``, where the relationships, named ``atob`` and
132``btoc``, are both lazy-loading. A statement like the following:
134.. sourcecode:: python+sql
136    session.query(A).options(joinedload('atob.btoc')).all()
138will load only ``A`` objects to start. When the ``atob`` attribute on each
139``A`` is accessed, the returned ``B`` objects will *eagerly* load their ``C``
142Therefore, to modify the eager load to load both ``atob`` as well as ``btoc``,
143place joinedloads for both:
145.. sourcecode:: python+sql
147    session.query(A).options(joinedload('atob'), joinedload('atob.btoc')).all()
149or more succinctly just use :func:`~sqlalchemy.orm.joinedload_all` or
152.. sourcecode:: python+sql
154    session.query(A).options(joinedload_all('atob.btoc')).all()
156There are two other loader strategies available, **dynamic loading** and **no
157loading**; these are described in :ref:`largecollections`.
159Default Loading Strategies
162.. versionadded:: 0.7.5
163    Default loader strategies as a new feature.
165Each of :func:`.joinedload`, :func:`.subqueryload`, :func:`.lazyload`, 
166and :func:`.noload` can be used to set the default style of
167:func:`.relationship` loading 
168for a particular query, affecting all :func:`.relationship` -mapped
169attributes not otherwise
170specified in the :class:`.Query`.   This feature is available by passing
171the string ``'*'`` as the argument to any of these options::
173    session.query(MyClass).options(lazyload('*'))
175Above, the ``lazyload('*')`` option will supercede the ``lazy`` setting
176of all :func:`.relationship` constructs in use for that query,
177except for those which use the ``'dynamic'`` style of loading.   
178If some relationships specify
179``lazy='joined'`` or ``lazy='subquery'``, for example,
180using ``default_strategy(lazy='select')`` will unilaterally
181cause all those relationships to use ``'select'`` loading.
183The option does not supercede loader options stated in the
184query, such as :func:`.eagerload`, 
185:func:`.subqueryload`, etc.  The query below will still use joined loading
186for the ``widget`` relationship::
188    session.query(MyClass).options(
189                                lazyload('*'), 
190                                joinedload(MyClass.widget)
191                            )
193If multiple ``'*'`` options are passed, the last one overrides
194those previously passed.
196.. _zen_of_eager_loading:
198The Zen of Eager Loading
201The philosophy behind loader strategies is that any set of loading schemes can be
202applied to a particular query, and *the results don't change* - only the number 
203of SQL statements required to fully load related objects and collections changes. A particular
204query might start out using all lazy loads.   After using it in context, it might be revealed
205that particular attributes or collections are always accessed, and that it would be more
206efficient to change the loader strategy for these.   The strategy can be changed with no other
207modifications to the query, the results will remain identical, but fewer SQL statements would be emitted.
208In theory (and pretty much in practice), nothing you can do to the :class:`.Query` would make it load
209a different set of primary or related objects based on a change in loader strategy.
211How :func:`joinedload` in particular achieves this result of not impacting
212entity rows returned in any way is that it creates an anonymous alias of the joins it adds to your
213query, so that they can't be referenced by other parts of the query.   For example,
214the query below uses :func:`.joinedload` to create a LEFT OUTER JOIN from ``users``
215to ``addresses``, however the ``ORDER BY`` added against ``Address.email_address``
216is not valid - the ``Address`` entity is not named in the query:
218.. sourcecode:: python+sql
220    >>> jack = session.query(User).\
221    ... options(joinedload(User.addresses)).\
222    ... filter('jack').\
223    ... order_by(Address.email_address).all() 
224    {opensql}SELECT AS addresses_1_id, addresses_1.email_address AS addresses_1_email_address,
225    addresses_1.user_id AS addresses_1_user_id, AS users_id, AS users_name,
226    users.fullname AS users_fullname, users.password AS users_password
227    FROM users LEFT OUTER JOIN addresses AS addresses_1 ON = addresses_1.user_id
228    WHERE = ? ORDER BY addresses.email_address   <-- this part is wrong !
229    ['jack']
231Above, ``ORDER BY addresses.email_address`` is not valid since ``addresses`` is not in the 
232FROM list.   The correct way to load the ``User`` records and order by email
233address is to use :meth:`.Query.join`:
235.. sourcecode:: python+sql
237    >>> jack = session.query(User).\
238    ... join(User.addresses).\
239    ... filter('jack').\
240    ... order_by(Address.email_address).all() 
241    {opensql}
242    SELECT AS users_id, AS users_name,
243    users.fullname AS users_fullname, users.password AS users_password
244    FROM users JOIN addresses ON = addresses.user_id
245    WHERE = ? ORDER BY addresses.email_address
246    ['jack']
248The statement above is of course not the same as the previous one, in that the columns from ``addresses``
249are not included in the result at all.   We can add :func:`.joinedload` back in, so that
250there are two joins - one is that which we are ordering on, the other is used anonymously to 
251load the contents of the ``User.addresses`` collection:
253.. sourcecode:: python+sql
255    >>> jack = session.query(User).\
256    ... join(User.addresses).\
257    ... options(joinedload(User.addresses)).\
258    ... filter('jack').\
259    ... order_by(Address.email_address).all() 
260    {opensql}SELECT AS addresses_1_id, addresses_1.email_address AS addresses_1_email_address,
261    addresses_1.user_id AS addresses_1_user_id, AS users_id, AS users_name,
262    users.fullname AS users_fullname, users.password AS users_password
263    FROM users JOIN addresses ON = addresses.user_id
264    LEFT OUTER JOIN addresses AS addresses_1 ON = addresses_1.user_id
265    WHERE = ? ORDER BY addresses.email_address
266    ['jack']
268What we see above is that our usage of :meth:`.Query.join` is to supply JOIN clauses we'd like
269to use in subsequent query criterion, whereas our usage of :func:`.joinedload` only concerns
270itself with the loading of the ``User.addresses`` collection, for each ``User`` in the result.
271In this case, the two joins most probably appear redundant - which they are.  If we
272wanted to use just one JOIN for collection loading as well as ordering, we use the 
273:func:`.contains_eager` option, described in :ref:`contains_eager` below.   But 
274to see why :func:`joinedload` does what it does, consider if we were **filtering** on a
275particular ``Address``:
277.. sourcecode:: python+sql
279    >>> jack = session.query(User).\
280    ... join(User.addresses).\
281    ... options(joinedload(User.addresses)).\
282    ... filter('jack').\
283    ... filter(Address.email_address=='').\
284    ... all() 
285    {opensql}SELECT AS addresses_1_id, addresses_1.email_address AS addresses_1_email_address,
286    addresses_1.user_id AS addresses_1_user_id, AS users_id, AS users_name,
287    users.fullname AS users_fullname, users.password AS users_password
288    FROM users JOIN addresses ON = addresses.user_id
289    LEFT OUTER JOIN addresses AS addresses_1 ON = addresses_1.user_id
290    WHERE = ? AND addresses.email_address = ?
291    ['jack', '']
293Above, we can see that the two JOINs have very different roles.  One will match exactly
294one row, that of the join of ``User`` and ``Address`` where ``Address.email_address==''``.
295The other LEFT OUTER JOIN will match *all* ``Address`` rows related to ``User``,
296and is only used to populate the ``User.addresses`` collection, for those ``User`` objects
297that are returned.
299By changing the usage of :func:`.joinedload` to another style of loading, we can change
300how the collection is loaded completely independently of SQL used to retrieve
301the actual ``User`` rows we want.  Below we change :func:`.joinedload` into
304.. sourcecode:: python+sql
306    >>> jack = session.query(User).\
307    ... join(User.addresses).\
308    ... options(subqueryload(User.addresses)).\
309    ... filter('jack').\
310    ... filter(Address.email_address=='').\
311    ... all() 
312    {opensql}SELECT AS users_id, AS users_name,
313    users.fullname AS users_fullname, users.password AS users_password
314    FROM users JOIN addresses ON = addresses.user_id
315    WHERE = ? AND addresses.email_address = ?
316    ['jack', '']
318    # ... subqueryload() emits a SELECT in order 
319    # to load all address records ...
321When using joined eager loading, if the
322query contains a modifier that impacts the rows returned
323externally to the joins, such as when using DISTINCT, LIMIT, OFFSET
324or equivalent, the completed statement is first
325wrapped inside a subquery, and the joins used specifically for joined eager
326loading are applied to the subquery.   SQLAlchemy's 
327joined eager loading goes the extra mile, and then ten miles further, to 
328absolutely ensure that it does not affect the end result of the query, only
329the way collections and related objects are loaded, no matter what the format of the query is.
331What Kind of Loading to Use ?
334Which type of loading to use typically comes down to optimizing the tradeoff
335between number of SQL executions, complexity of SQL emitted, and amount of
336data fetched. Lets take two examples, a :func:`~sqlalchemy.orm.relationship`
337which references a collection, and a :func:`~sqlalchemy.orm.relationship` that
338references a scalar many-to-one reference.
340* One to Many Collection
342 * When using the default lazy loading, if you load 100 objects, and then access a collection on each of
343   them, a total of 101 SQL statements will be emitted, although each statement will typically be a
344   simple SELECT without any joins.
346 * When using joined loading, the load of 100 objects and their collections will emit only one SQL
347   statement.  However, the 
348   total number of rows fetched will be equal to the sum of the size of all the collections, plus one 
349   extra row for each parent object that has an empty collection.  Each row will also contain the full
350   set of columns represented by the parents, repeated for each collection item - SQLAlchemy does not
351   re-fetch these columns other than those of the primary key, however most DBAPIs (with some 
352   exceptions) will transmit the full data of each parent over the wire to the client connection in 
353   any case.  Therefore joined eager loading only makes sense when the size of the collections are 
354   relatively small.  The LEFT OUTER JOIN can also be performance intensive compared to an INNER join.
356 * When using subquery loading, the load of 100 objects will emit two SQL statements.  The second
357   statement will fetch a total number of rows equal to the sum of the size of all collections.  An
358   INNER JOIN is used, and a minimum of parent columns are requested, only the primary keys.  So a 
359   subquery load makes sense when the collections are larger.
361 * When multiple levels of depth are used with joined or subquery loading, loading collections-within-
362   collections will multiply the total number of rows fetched in a cartesian fashion.  Both forms
363   of eager loading always join from the original parent class.
365* Many to One Reference
367 * When using the default lazy loading, a load of 100 objects will like in the case of the collection
368   emit as many as 101 SQL statements.  However - there is a significant exception to this, in that
369   if the many-to-one reference is a simple foreign key reference to the target's primary key, each
370   reference will be checked first in the current identity map using :meth:`.Query.get`.  So here, 
371   if the collection of objects references a relatively small set of target objects, or the full set
372   of possible target objects have already been loaded into the session and are strongly referenced,
373   using the default of `lazy='select'` is by far the most efficient way to go.
375 * When using joined loading, the load of 100 objects will emit only one SQL statement.   The join
376   will be a LEFT OUTER JOIN, and the total number of rows will be equal to 100 in all cases.
377   If you know that each parent definitely has a child (i.e. the foreign
378   key reference is NOT NULL), the joined load can be configured with ``innerjoin=True``, which is
379   usually specified within the :func:`~sqlalchemy.orm.relationship`.   For a load of objects where
380   there are many possible target references which may have not been loaded already, joined loading
381   with an INNER JOIN is extremely efficient.
383 * Subquery loading will issue a second load for all the child objects, so for a load of 100 objects
384   there would be two SQL statements emitted.  There's probably not much advantage here over
385   joined loading, however, except perhaps that subquery loading can use an INNER JOIN in all cases
386   whereas joined loading requires that the foreign key is NOT NULL.
388.. _joinedload_and_join:
390.. _contains_eager:
392Routing Explicit Joins/Statements into Eagerly Loaded Collections
395The behavior of :func:`~sqlalchemy.orm.joinedload()` is such that joins are
396created automatically, using anonymous aliases as targets, the results of which 
397are routed into collections and
398scalar references on loaded objects. It is often the case that a query already
399includes the necessary joins which represent a particular collection or scalar
400reference, and the joins added by the joinedload feature are redundant - yet
401you'd still like the collections/references to be populated.
403For this SQLAlchemy supplies the :func:`~sqlalchemy.orm.contains_eager()`
404option. This option is used in the same manner as the
405:func:`~sqlalchemy.orm.joinedload()` option except it is assumed that the
406:class:`~sqlalchemy.orm.query.Query` will specify the appropriate joins
407explicitly. Below it's used with a ``from_statement`` load::
409    # mapping is the users->addresses mapping
410    mapper(User, users_table, properties={
411        'addresses': relationship(Address, addresses_table)
412    })
414    # define a query on USERS with an outer join to ADDRESSES
415    statement = users_table.outerjoin(addresses_table).select().apply_labels()
417    # construct a Query object which expects the "addresses" results
418    query = session.query(User).options(contains_eager('addresses'))
420    # get results normally
421    r = query.from_statement(statement)
423It works just as well with an inline :meth:`.Query.join` or
426    session.query(User).outerjoin(User.addresses).options(contains_eager(User.addresses)).all()
428If the "eager" portion of the statement is "aliased", the ``alias`` keyword
429argument to :func:`~sqlalchemy.orm.contains_eager` may be used to indicate it.
430This is a string alias name or reference to an actual
431:class:`~sqlalchemy.sql.expression.Alias` (or other selectable) object:
433.. sourcecode:: python+sql
435    # use an alias of the Address entity
436    adalias = aliased(Address)
438    # construct a Query object which expects the "addresses" results
439    query = session.query(User).\
440        outerjoin(adalias, User.addresses).\
441        options(contains_eager(User.addresses, alias=adalias))
443    # get results normally
444    {sql}r = query.all()
445    SELECT users.user_id AS users_user_id, users.user_name AS users_user_name, adalias.address_id AS adalias_address_id,
446    adalias.user_id AS adalias_user_id, adalias.email_address AS adalias_email_address, (...other columns...)
447    FROM users LEFT OUTER JOIN email_addresses AS email_addresses_1 ON users.user_id = email_addresses_1.user_id
449The ``alias`` argument is used only as a source of columns to match up to the
450result set. You can use it to match up the result to arbitrary label
451names in a string SQL statement, by passing a :func:`.select` which links those
452labels to the mapped :class:`.Table`::
454    # label the columns of the addresses table
455    eager_columns = select([
456                        addresses.c.address_id.label('a1'),
457                        addresses.c.email_address.label('a2'),
458                        addresses.c.user_id.label('a3')])
460    # select from a raw SQL statement which uses those label names for the
461    # addresses table.  contains_eager() matches them up.
462    query = session.query(User).\
463        from_statement("select users.*, addresses.address_id as a1, "
464                "addresses.email_address as a2, addresses.user_id as a3 "
465                "from users left outer join addresses on users.user_id=addresses.user_id").\
466        options(contains_eager(User.addresses, alias=eager_columns))
468The path given as the argument to :func:`.contains_eager` needs
469to be a full path from the starting entity. For example if we were loading
470``Users->orders->Order->items->Item``, the string version would look like::
472    query(User).options(contains_eager('orders', 'items'))
474Or using the class-bound descriptor::
476    query(User).options(contains_eager(User.orders, Order.items))
479Relation Loader API
482.. autofunction:: contains_alias
484.. autofunction:: contains_eager
486.. autofunction:: eagerload
488.. autofunction:: eagerload_all
490.. autofunction:: immediateload
492.. autofunction:: joinedload
494.. autofunction:: joinedload_all
496.. autofunction:: lazyload
498.. autofunction:: noload
500.. autofunction:: subqueryload
502.. autofunction:: subqueryload_all