PageRenderTime 70ms CodeModel.GetById 21ms RepoModel.GetById 0ms app.codeStats 0ms


Plain Text | 502 lines | 385 code | 117 blank | 0 comment | 0 complexity | eaa49ef33576f2072cb9c8fe8fee0a7b MD5 | raw file
  1. .. _loading_toplevel:
  2. .. currentmodule:: sqlalchemy.orm
  3. Relationship Loading Techniques
  4. ===============================
  5. A big part of SQLAlchemy is providing a wide range of control over how related objects get loaded when querying. This behavior
  6. can be configured at mapper construction time using the ``lazy`` parameter to the :func:`.relationship` function,
  7. as well as by using options with the :class:`.Query` object.
  8. Using Loader Strategies: Lazy Loading, Eager Loading
  9. ----------------------------------------------------
  10. By default, all inter-object relationships are **lazy loading**. The scalar or
  11. collection attribute associated with a :func:`~sqlalchemy.orm.relationship`
  12. contains a trigger which fires the first time the attribute is accessed. This
  13. trigger, in all but one case, issues a SQL call at the point of access
  14. in order to load the related object or objects:
  15. .. sourcecode:: python+sql
  16. {sql}>>> jack.addresses
  17. SELECT AS addresses_id, addresses.email_address AS addresses_email_address,
  18. addresses.user_id AS addresses_user_id
  19. FROM addresses
  20. WHERE ? = addresses.user_id
  21. [5]
  22. {stop}[<Address(u'')>, <Address(u'')>]
  23. The one case where SQL is not emitted is for a simple many-to-one relationship, when
  24. the related object can be identified by its primary key alone and that object is already
  25. present in the current :class:`.Session`.
  26. This default behavior of "load upon attribute access" is known as "lazy" or
  27. "select" loading - the name "select" because a "SELECT" statement is typically emitted
  28. when the attribute is first accessed.
  29. In the :ref:`ormtutorial_toplevel`, we introduced the concept of **Eager
  30. Loading**. We used an ``option`` in conjunction with the
  31. :class:`~sqlalchemy.orm.query.Query` object in order to indicate that a
  32. relationship should be loaded at the same time as the parent, within a single
  33. SQL query. This option, known as :func:`.joinedload`, connects a JOIN (by default
  34. a LEFT OUTER join) to the statement and populates the scalar/collection from the
  35. same result set as that of the parent:
  36. .. sourcecode:: python+sql
  37. {sql}>>> jack = session.query(User).\
  38. ... options(joinedload('addresses')).\
  39. ... filter_by(name='jack').all() #doctest: +NORMALIZE_WHITESPACE
  40. SELECT AS addresses_1_id, addresses_1.email_address AS addresses_1_email_address,
  41. addresses_1.user_id AS addresses_1_user_id, AS users_id, AS users_name,
  42. users.fullname AS users_fullname, users.password AS users_password
  43. FROM users LEFT OUTER JOIN addresses AS addresses_1 ON = addresses_1.user_id
  44. WHERE = ?
  45. ['jack']
  46. In addition to "joined eager loading", a second option for eager loading
  47. exists, called "subquery eager loading". This kind of eager loading emits an
  48. additional SQL statement for each collection requested, aggregated across all
  49. parent objects:
  50. .. sourcecode:: python+sql
  51. {sql}>>> jack = session.query(User).\
  52. ... options(subqueryload('addresses')).\
  53. ... filter_by(name='jack').all()
  54. SELECT AS users_id, AS users_name, users.fullname AS users_fullname,
  55. users.password AS users_password
  56. FROM users
  57. WHERE = ?
  58. ('jack',)
  59. SELECT AS addresses_id, addresses.email_address AS addresses_email_address,
  60. addresses.user_id AS addresses_user_id, anon_1.users_id AS anon_1_users_id
  61. FROM (SELECT AS users_id
  62. FROM users
  63. WHERE = ?) AS anon_1 JOIN addresses ON anon_1.users_id = addresses.user_id
  64. ORDER BY anon_1.users_id,
  65. ('jack',)
  66. The default **loader strategy** for any :func:`~sqlalchemy.orm.relationship`
  67. is configured by the ``lazy`` keyword argument, which defaults to ``select`` - this indicates
  68. a "select" statement .
  69. Below we set it as ``joined`` so that the ``children`` relationship is eager
  70. loading, using a join:
  71. .. sourcecode:: python+sql
  72. # load the 'children' collection using LEFT OUTER JOIN
  73. mapper(Parent, parent_table, properties={
  74. 'children': relationship(Child, lazy='joined')
  75. })
  76. We can also set it to eagerly load using a second query for all collections,
  77. using ``subquery``:
  78. .. sourcecode:: python+sql
  79. # load the 'children' attribute using a join to a subquery
  80. mapper(Parent, parent_table, properties={
  81. 'children': relationship(Child, lazy='subquery')
  82. })
  83. When querying, all three choices of loader strategy are available on a
  84. per-query basis, using the :func:`~sqlalchemy.orm.joinedload`,
  85. :func:`~sqlalchemy.orm.subqueryload` and :func:`~sqlalchemy.orm.lazyload`
  86. query options:
  87. .. sourcecode:: python+sql
  88. # set children to load lazily
  89. session.query(Parent).options(lazyload('children')).all()
  90. # set children to load eagerly with a join
  91. session.query(Parent).options(joinedload('children')).all()
  92. # set children to load eagerly with a second statement
  93. session.query(Parent).options(subqueryload('children')).all()
  94. To reference a relationship that is deeper than one level, separate the names by periods:
  95. .. sourcecode:: python+sql
  96. session.query(Parent).options(joinedload('')).all()
  97. When using dot-separated names with :func:`~sqlalchemy.orm.joinedload` or
  98. :func:`~sqlalchemy.orm.subqueryload`, the option applies **only** to the actual
  99. attribute named, and **not** its ancestors. For example, suppose a mapping
  100. from ``A`` to ``B`` to ``C``, where the relationships, named ``atob`` and
  101. ``btoc``, are both lazy-loading. A statement like the following:
  102. .. sourcecode:: python+sql
  103. session.query(A).options(joinedload('atob.btoc')).all()
  104. will load only ``A`` objects to start. When the ``atob`` attribute on each
  105. ``A`` is accessed, the returned ``B`` objects will *eagerly* load their ``C``
  106. objects.
  107. Therefore, to modify the eager load to load both ``atob`` as well as ``btoc``,
  108. place joinedloads for both:
  109. .. sourcecode:: python+sql
  110. session.query(A).options(joinedload('atob'), joinedload('atob.btoc')).all()
  111. or more succinctly just use :func:`~sqlalchemy.orm.joinedload_all` or
  112. :func:`~sqlalchemy.orm.subqueryload_all`:
  113. .. sourcecode:: python+sql
  114. session.query(A).options(joinedload_all('atob.btoc')).all()
  115. There are two other loader strategies available, **dynamic loading** and **no
  116. loading**; these are described in :ref:`largecollections`.
  117. Default Loading Strategies
  118. --------------------------
  119. .. versionadded:: 0.7.5
  120. Default loader strategies as a new feature.
  121. Each of :func:`.joinedload`, :func:`.subqueryload`, :func:`.lazyload`,
  122. and :func:`.noload` can be used to set the default style of
  123. :func:`.relationship` loading
  124. for a particular query, affecting all :func:`.relationship` -mapped
  125. attributes not otherwise
  126. specified in the :class:`.Query`. This feature is available by passing
  127. the string ``'*'`` as the argument to any of these options::
  128. session.query(MyClass).options(lazyload('*'))
  129. Above, the ``lazyload('*')`` option will supercede the ``lazy`` setting
  130. of all :func:`.relationship` constructs in use for that query,
  131. except for those which use the ``'dynamic'`` style of loading.
  132. If some relationships specify
  133. ``lazy='joined'`` or ``lazy='subquery'``, for example,
  134. using ``default_strategy(lazy='select')`` will unilaterally
  135. cause all those relationships to use ``'select'`` loading.
  136. The option does not supercede loader options stated in the
  137. query, such as :func:`.eagerload`,
  138. :func:`.subqueryload`, etc. The query below will still use joined loading
  139. for the ``widget`` relationship::
  140. session.query(MyClass).options(
  141. lazyload('*'),
  142. joinedload(MyClass.widget)
  143. )
  144. If multiple ``'*'`` options are passed, the last one overrides
  145. those previously passed.
  146. .. _zen_of_eager_loading:
  147. The Zen of Eager Loading
  148. -------------------------
  149. The philosophy behind loader strategies is that any set of loading schemes can be
  150. applied to a particular query, and *the results don't change* - only the number
  151. of SQL statements required to fully load related objects and collections changes. A particular
  152. query might start out using all lazy loads. After using it in context, it might be revealed
  153. that particular attributes or collections are always accessed, and that it would be more
  154. efficient to change the loader strategy for these. The strategy can be changed with no other
  155. modifications to the query, the results will remain identical, but fewer SQL statements would be emitted.
  156. In theory (and pretty much in practice), nothing you can do to the :class:`.Query` would make it load
  157. a different set of primary or related objects based on a change in loader strategy.
  158. How :func:`joinedload` in particular achieves this result of not impacting
  159. entity rows returned in any way is that it creates an anonymous alias of the joins it adds to your
  160. query, so that they can't be referenced by other parts of the query. For example,
  161. the query below uses :func:`.joinedload` to create a LEFT OUTER JOIN from ``users``
  162. to ``addresses``, however the ``ORDER BY`` added against ``Address.email_address``
  163. is not valid - the ``Address`` entity is not named in the query:
  164. .. sourcecode:: python+sql
  165. >>> jack = session.query(User).\
  166. ... options(joinedload(User.addresses)).\
  167. ... filter('jack').\
  168. ... order_by(Address.email_address).all()
  169. {opensql}SELECT AS addresses_1_id, addresses_1.email_address AS addresses_1_email_address,
  170. addresses_1.user_id AS addresses_1_user_id, AS users_id, AS users_name,
  171. users.fullname AS users_fullname, users.password AS users_password
  172. FROM users LEFT OUTER JOIN addresses AS addresses_1 ON = addresses_1.user_id
  173. WHERE = ? ORDER BY addresses.email_address <-- this part is wrong !
  174. ['jack']
  175. Above, ``ORDER BY addresses.email_address`` is not valid since ``addresses`` is not in the
  176. FROM list. The correct way to load the ``User`` records and order by email
  177. address is to use :meth:`.Query.join`:
  178. .. sourcecode:: python+sql
  179. >>> jack = session.query(User).\
  180. ... join(User.addresses).\
  181. ... filter('jack').\
  182. ... order_by(Address.email_address).all()
  183. {opensql}
  184. SELECT AS users_id, AS users_name,
  185. users.fullname AS users_fullname, users.password AS users_password
  186. FROM users JOIN addresses ON = addresses.user_id
  187. WHERE = ? ORDER BY addresses.email_address
  188. ['jack']
  189. The statement above is of course not the same as the previous one, in that the columns from ``addresses``
  190. are not included in the result at all. We can add :func:`.joinedload` back in, so that
  191. there are two joins - one is that which we are ordering on, the other is used anonymously to
  192. load the contents of the ``User.addresses`` collection:
  193. .. sourcecode:: python+sql
  194. >>> jack = session.query(User).\
  195. ... join(User.addresses).\
  196. ... options(joinedload(User.addresses)).\
  197. ... filter('jack').\
  198. ... order_by(Address.email_address).all()
  199. {opensql}SELECT AS addresses_1_id, addresses_1.email_address AS addresses_1_email_address,
  200. addresses_1.user_id AS addresses_1_user_id, AS users_id, AS users_name,
  201. users.fullname AS users_fullname, users.password AS users_password
  202. FROM users JOIN addresses ON = addresses.user_id
  203. LEFT OUTER JOIN addresses AS addresses_1 ON = addresses_1.user_id
  204. WHERE = ? ORDER BY addresses.email_address
  205. ['jack']
  206. What we see above is that our usage of :meth:`.Query.join` is to supply JOIN clauses we'd like
  207. to use in subsequent query criterion, whereas our usage of :func:`.joinedload` only concerns
  208. itself with the loading of the ``User.addresses`` collection, for each ``User`` in the result.
  209. In this case, the two joins most probably appear redundant - which they are. If we
  210. wanted to use just one JOIN for collection loading as well as ordering, we use the
  211. :func:`.contains_eager` option, described in :ref:`contains_eager` below. But
  212. to see why :func:`joinedload` does what it does, consider if we were **filtering** on a
  213. particular ``Address``:
  214. .. sourcecode:: python+sql
  215. >>> jack = session.query(User).\
  216. ... join(User.addresses).\
  217. ... options(joinedload(User.addresses)).\
  218. ... filter('jack').\
  219. ... filter(Address.email_address=='').\
  220. ... all()
  221. {opensql}SELECT AS addresses_1_id, addresses_1.email_address AS addresses_1_email_address,
  222. addresses_1.user_id AS addresses_1_user_id, AS users_id, AS users_name,
  223. users.fullname AS users_fullname, users.password AS users_password
  224. FROM users JOIN addresses ON = addresses.user_id
  225. LEFT OUTER JOIN addresses AS addresses_1 ON = addresses_1.user_id
  226. WHERE = ? AND addresses.email_address = ?
  227. ['jack', '']
  228. Above, we can see that the two JOINs have very different roles. One will match exactly
  229. one row, that of the join of ``User`` and ``Address`` where ``Address.email_address==''``.
  230. The other LEFT OUTER JOIN will match *all* ``Address`` rows related to ``User``,
  231. and is only used to populate the ``User.addresses`` collection, for those ``User`` objects
  232. that are returned.
  233. By changing the usage of :func:`.joinedload` to another style of loading, we can change
  234. how the collection is loaded completely independently of SQL used to retrieve
  235. the actual ``User`` rows we want. Below we change :func:`.joinedload` into
  236. :func:`.subqueryload`:
  237. .. sourcecode:: python+sql
  238. >>> jack = session.query(User).\
  239. ... join(User.addresses).\
  240. ... options(subqueryload(User.addresses)).\
  241. ... filter('jack').\
  242. ... filter(Address.email_address=='').\
  243. ... all()
  244. {opensql}SELECT AS users_id, AS users_name,
  245. users.fullname AS users_fullname, users.password AS users_password
  246. FROM users JOIN addresses ON = addresses.user_id
  247. WHERE = ? AND addresses.email_address = ?
  248. ['jack', '']
  249. # ... subqueryload() emits a SELECT in order
  250. # to load all address records ...
  251. When using joined eager loading, if the
  252. query contains a modifier that impacts the rows returned
  253. externally to the joins, such as when using DISTINCT, LIMIT, OFFSET
  254. or equivalent, the completed statement is first
  255. wrapped inside a subquery, and the joins used specifically for joined eager
  256. loading are applied to the subquery. SQLAlchemy's
  257. joined eager loading goes the extra mile, and then ten miles further, to
  258. absolutely ensure that it does not affect the end result of the query, only
  259. the way collections and related objects are loaded, no matter what the format of the query is.
  260. What Kind of Loading to Use ?
  261. -----------------------------
  262. Which type of loading to use typically comes down to optimizing the tradeoff
  263. between number of SQL executions, complexity of SQL emitted, and amount of
  264. data fetched. Lets take two examples, a :func:`~sqlalchemy.orm.relationship`
  265. which references a collection, and a :func:`~sqlalchemy.orm.relationship` that
  266. references a scalar many-to-one reference.
  267. * One to Many Collection
  268. * When using the default lazy loading, if you load 100 objects, and then access a collection on each of
  269. them, a total of 101 SQL statements will be emitted, although each statement will typically be a
  270. simple SELECT without any joins.
  271. * When using joined loading, the load of 100 objects and their collections will emit only one SQL
  272. statement. However, the
  273. total number of rows fetched will be equal to the sum of the size of all the collections, plus one
  274. extra row for each parent object that has an empty collection. Each row will also contain the full
  275. set of columns represented by the parents, repeated for each collection item - SQLAlchemy does not
  276. re-fetch these columns other than those of the primary key, however most DBAPIs (with some
  277. exceptions) will transmit the full data of each parent over the wire to the client connection in
  278. any case. Therefore joined eager loading only makes sense when the size of the collections are
  279. relatively small. The LEFT OUTER JOIN can also be performance intensive compared to an INNER join.
  280. * When using subquery loading, the load of 100 objects will emit two SQL statements. The second
  281. statement will fetch a total number of rows equal to the sum of the size of all collections. An
  282. INNER JOIN is used, and a minimum of parent columns are requested, only the primary keys. So a
  283. subquery load makes sense when the collections are larger.
  284. * When multiple levels of depth are used with joined or subquery loading, loading collections-within-
  285. collections will multiply the total number of rows fetched in a cartesian fashion. Both forms
  286. of eager loading always join from the original parent class.
  287. * Many to One Reference
  288. * When using the default lazy loading, a load of 100 objects will like in the case of the collection
  289. emit as many as 101 SQL statements. However - there is a significant exception to this, in that
  290. if the many-to-one reference is a simple foreign key reference to the target's primary key, each
  291. reference will be checked first in the current identity map using :meth:`.Query.get`. So here,
  292. if the collection of objects references a relatively small set of target objects, or the full set
  293. of possible target objects have already been loaded into the session and are strongly referenced,
  294. using the default of `lazy='select'` is by far the most efficient way to go.
  295. * When using joined loading, the load of 100 objects will emit only one SQL statement. The join
  296. will be a LEFT OUTER JOIN, and the total number of rows will be equal to 100 in all cases.
  297. If you know that each parent definitely has a child (i.e. the foreign
  298. key reference is NOT NULL), the joined load can be configured with ``innerjoin=True``, which is
  299. usually specified within the :func:`~sqlalchemy.orm.relationship`. For a load of objects where
  300. there are many possible target references which may have not been loaded already, joined loading
  301. with an INNER JOIN is extremely efficient.
  302. * Subquery loading will issue a second load for all the child objects, so for a load of 100 objects
  303. there would be two SQL statements emitted. There's probably not much advantage here over
  304. joined loading, however, except perhaps that subquery loading can use an INNER JOIN in all cases
  305. whereas joined loading requires that the foreign key is NOT NULL.
  306. .. _joinedload_and_join:
  307. .. _contains_eager:
  308. Routing Explicit Joins/Statements into Eagerly Loaded Collections
  309. ------------------------------------------------------------------
  310. The behavior of :func:`~sqlalchemy.orm.joinedload()` is such that joins are
  311. created automatically, using anonymous aliases as targets, the results of which
  312. are routed into collections and
  313. scalar references on loaded objects. It is often the case that a query already
  314. includes the necessary joins which represent a particular collection or scalar
  315. reference, and the joins added by the joinedload feature are redundant - yet
  316. you'd still like the collections/references to be populated.
  317. For this SQLAlchemy supplies the :func:`~sqlalchemy.orm.contains_eager()`
  318. option. This option is used in the same manner as the
  319. :func:`~sqlalchemy.orm.joinedload()` option except it is assumed that the
  320. :class:`~sqlalchemy.orm.query.Query` will specify the appropriate joins
  321. explicitly. Below it's used with a ``from_statement`` load::
  322. # mapping is the users->addresses mapping
  323. mapper(User, users_table, properties={
  324. 'addresses': relationship(Address, addresses_table)
  325. })
  326. # define a query on USERS with an outer join to ADDRESSES
  327. statement = users_table.outerjoin(addresses_table).select().apply_labels()
  328. # construct a Query object which expects the "addresses" results
  329. query = session.query(User).options(contains_eager('addresses'))
  330. # get results normally
  331. r = query.from_statement(statement)
  332. It works just as well with an inline :meth:`.Query.join` or
  333. :meth:`.Query.outerjoin`::
  334. session.query(User).outerjoin(User.addresses).options(contains_eager(User.addresses)).all()
  335. If the "eager" portion of the statement is "aliased", the ``alias`` keyword
  336. argument to :func:`~sqlalchemy.orm.contains_eager` may be used to indicate it.
  337. This is a string alias name or reference to an actual
  338. :class:`~sqlalchemy.sql.expression.Alias` (or other selectable) object:
  339. .. sourcecode:: python+sql
  340. # use an alias of the Address entity
  341. adalias = aliased(Address)
  342. # construct a Query object which expects the "addresses" results
  343. query = session.query(User).\
  344. outerjoin(adalias, User.addresses).\
  345. options(contains_eager(User.addresses, alias=adalias))
  346. # get results normally
  347. {sql}r = query.all()
  348. SELECT users.user_id AS users_user_id, users.user_name AS users_user_name, adalias.address_id AS adalias_address_id,
  349. adalias.user_id AS adalias_user_id, adalias.email_address AS adalias_email_address, (...other columns...)
  350. FROM users LEFT OUTER JOIN email_addresses AS email_addresses_1 ON users.user_id = email_addresses_1.user_id
  351. The ``alias`` argument is used only as a source of columns to match up to the
  352. result set. You can use it to match up the result to arbitrary label
  353. names in a string SQL statement, by passing a :func:`.select` which links those
  354. labels to the mapped :class:`.Table`::
  355. # label the columns of the addresses table
  356. eager_columns = select([
  357. addresses.c.address_id.label('a1'),
  358. addresses.c.email_address.label('a2'),
  359. addresses.c.user_id.label('a3')])
  360. # select from a raw SQL statement which uses those label names for the
  361. # addresses table. contains_eager() matches them up.
  362. query = session.query(User).\
  363. from_statement("select users.*, addresses.address_id as a1, "
  364. "addresses.email_address as a2, addresses.user_id as a3 "
  365. "from users left outer join addresses on users.user_id=addresses.user_id").\
  366. options(contains_eager(User.addresses, alias=eager_columns))
  367. The path given as the argument to :func:`.contains_eager` needs
  368. to be a full path from the starting entity. For example if we were loading
  369. ``Users->orders->Order->items->Item``, the string version would look like::
  370. query(User).options(contains_eager('orders', 'items'))
  371. Or using the class-bound descriptor::
  372. query(User).options(contains_eager(User.orders, Order.items))
  373. Relation Loader API
  374. --------------------
  375. .. autofunction:: contains_alias
  376. .. autofunction:: contains_eager
  377. .. autofunction:: eagerload
  378. .. autofunction:: eagerload_all
  379. .. autofunction:: immediateload
  380. .. autofunction:: joinedload
  381. .. autofunction:: joinedload_all
  382. .. autofunction:: lazyload
  383. .. autofunction:: noload
  384. .. autofunction:: subqueryload
  385. .. autofunction:: subqueryload_all