/doc/src/xml-processing/xquery-introduction.qdoc

  1. /****************************************************************************
  2. **
  3. ** Copyright (C) 2012 Nokia Corporation and/or its subsidiary(-ies).
  4. ** All rights reserved.
  5. ** Contact: Nokia Corporation (qt-info@nokia.com)
  6. **
  7. ** This file is part of the documentation of the Qt Toolkit.
  8. **
  9. ** $QT_BEGIN_LICENSE:FDL$
  10. ** GNU Free Documentation License
  11. ** Alternatively, this file may be used under the terms of the GNU Free
  12. ** Documentation License version 1.3 as published by the Free Software
  13. ** Foundation and appearing in the file included in the packaging of
  14. ** this file.
  15. **
  16. ** Other Usage
  17. ** Alternatively, this file may be used in accordance with the terms
  18. ** and conditions contained in a signed written agreement between you
  19. ** and Nokia.
  20. **
  21. **
  22. **
  23. **
  24. ** $QT_END_LICENSE$
  25. **
  26. ****************************************************************************/
  27. /*!
  28. \page xquery-introduction.html
  29. \title A Short Path to XQuery
  30. \pagekeywords XPath XQuery
  31. \startpage XQuery
  32. \target XQuery-introduction
  33. XQuery is a language for querying XML data or non-XML data that can be
  34. modeled as XML. XQuery is specified by the \l{http://www.w3.org}{W3C}.
  35. \tableofcontents
  36. \section1 Introduction
  37. Where Java and C++ are \e{statement-based} languages, the XQuery
  38. language is \e{expression-based}. The simplest XQuery expression is an
  39. XML element constructor:
  40. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 20
  41. This \c{<recipe/>} element is an XQuery expression that forms a
  42. complete XQuery. In fact, this XQuery doesn't actually query
  43. anything. It just creates an empty \c{<recipe/>} element in the
  44. output. But \l{Constructing Elements} {constructing new elements in an
  45. XQuery} is often necessary.
  46. An XQuery expression can also be enclosed in curly braces and embedded
  47. in another XQuery expression. This XQuery has a document expression
  48. embedded in a node expression:
  49. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 21
  50. It creates a new \c{<html>} element in the output and sets its \c{id}
  51. attribute to be the \c{id} attribute from an \c{<html>} element in the
  52. \c{other.html} file.
  53. \section1 Using Path Expressions To Match And Select Items
  54. In C++ and Java, we write nested \c{for} loops and recursive functions
  55. to traverse XML trees in search of elements of interest. In XQuery, we
  56. write these iterative and recursive algorithms with \e{path
  57. expressions}.
  58. A path expression looks somewhat like a typical \e{file pathname} for
  59. locating a file in a hierarchical file system. It is a sequence of one
  60. or more \e{steps} separated by slash '/' or double slash '//'.
Although path expressions are used for traversing XML trees, not file
systems, QtXmlPatterns lets us model a file system as an XML tree, so
we can also use XQuery to traverse a file system. See the
\l {File System Example} {file system example}.
  65. Think of a path expression as an algorithm for traversing an XML tree
to find and collect items of interest. The algorithm is evaluated one
step at a time, moving from left to right through the sequence. A
  68. step is evaluated with a set of input items (nodes and atomic values),
  69. sometimes called the \e focus. The step is evaluated for each item in
  70. the focus. These evaluations produce a new set of items, called the \e
  71. result, which then becomes the focus that is passed to the next step.
  72. Evaluation of the final step produces the final result, which is the
  73. result of the XQuery. The items in the result set are presented in
  74. \l{http://www.w3.org/TR/xquery/#id-document-order} {document order}
  75. and without duplicates.
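The focus-to-result hand-off described above can be sketched outside XQuery. The following is a minimal illustration in Python using the standard \c{xml.etree.ElementTree} module, not a description of how QtXmlPatterns is implemented. The inline cookbook and the helper names \c{evaluate_path}, \c{descendants_named} and \c{children_named} are hypothetical, and because ElementTree has no document node, the root element stands in for the initial focus.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for cookbook.xml; the real file is not reproduced here.
COOKBOOK = """<cookbook>
  <recipe><title>Quick and Easy Mushroom Soup</title></recipe>
  <recipe><title>Hard-Boiled Eggs</title></recipe>
</cookbook>"""

def evaluate_path(root, steps):
    """Evaluate the steps left to right; each result becomes the next focus."""
    # Rank of every element in document order, used to sort each result.
    rank = {id(node): i for i, node in enumerate(root.iter())}
    focus = [root]
    for step in steps:
        found = {}
        for item in focus:                  # evaluate the step for each focus item
            for node in step(item):
                found[id(node)] = node      # keyed by identity: no duplicates
        focus = sorted(found.values(), key=lambda n: rank[id(n)])
    return focus

def descendants_named(name):
    """Step selecting the descendant elements with the given name."""
    return lambda node: [d for d in node.iter(name) if d is not node]

def children_named(name):
    """Step selecting the child elements with the given name."""
    return lambda node: node.findall(name)

root = ET.fromstring(COOKBOOK)
titles = evaluate_path(root, [descendants_named("recipe"), children_named("title")])
print([t.text for t in titles])
```

Each pass through the outer loop corresponds to one path step: the step runs once per focus item, and the de-duplicated, document-ordered result is handed to the next step.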
  76. With QtXmlPatterns, a standard way to present the initial focus to a
  77. query is to call QXmlQuery::setFocus(). Another common way is to let
  78. the XQuery itself create the initial focus by using the first step of
  79. the path expression to call the XQuery \c{doc()} function. The
  80. \c{doc()} function loads an XML document and returns the \e {document
  81. node}. Note that the document node is \e{not} the same as the
  82. \e{document element}. The \e{document node} is a node constructed in
  83. memory, when the document is loaded. It represents the entire XML
  84. document, not the document element. The \e{document element} is the
  85. single, top-level XML element in the file. The \c{doc()} function
  86. returns the document node, which becomes the singleton node in the
  87. initial focus set. The document node will have one child node, and
  88. that child node will represent the document element. Consider the
  89. following XQuery:
  90. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 18
  91. The \c{doc()} function loads the \c{cookbook.xml} file and returns the
  92. document node. The document node then becomes the focus for the next
  93. step \c{//recipe}. Here the double slash means select all \c{<recipe>}
  94. elements found below the document node, regardless of where they
  95. appear in the document tree. The query selects all \c{<recipe>}
  96. elements in the cookbook. See \l{Running The Cookbook Examples} for
  97. instructions on how to run this query (and most of the ones that
  98. follow) from the command line.
  99. Conceptually, evaluation of the steps of a path expression is similar
  100. to iterating through the same number of nested \e{for} loops. Consider
  101. the following XQuery, which builds on the previous one:
  102. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 19
  103. This XQuery is a single path expression composed of three steps. The
  104. first step creates the initial focus by calling the \c{doc()}
  105. function. We can paraphrase what the query engine does at each step:
  106. \list 1
  107. \o for each node in the initial focus (the document node)...
  108. \o for each descendant node that is a \c{<recipe>} element...
  109. \o collect the child nodes that are \c{<title>} elements.
  110. \endlist
  111. Again the double slash means select all the \c{<recipe>} elements in the
  112. document. The single slash before the \c{<title>} element means select
  113. only those \c{<title>} elements that are \e{child} elements of a
\c{<recipe>} element (i.e., not grandchildren, etc.). The XQuery evaluates
  115. to a final result set containing the \c{<title>} element of each
  116. \c{<recipe>} element in the cookbook.
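To make the nested-loop analogy concrete, here is a small Python sketch that writes the three steps as three \c{for} loops and then compares the result with a single path. It assumes a hypothetical inline cookbook and uses the limited XPath subset of Python's \c{xml.etree.ElementTree}, so it only illustrates the idea and is not XQuery itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical cookbook: one <title> is a grandchild of a <recipe>, and one
# <recipe> is nested more deeply than the other.
COOKBOOK = """<cookbook>
  <recipe><title>Quick and Easy Mushroom Soup</title>
    <note><title>Not a recipe title</title></note>
  </recipe>
  <section><recipe><title>Hard-Boiled Eggs</title></recipe></section>
</cookbook>"""

root = ET.fromstring(COOKBOOK)

looped = []
for start in [root]:                            # step 1: the initial focus
    for recipe in start.iter("recipe"):         # step 2: every descendant <recipe>
        for title in recipe.findall("title"):   # step 3: <title> children only
            looped.append(title.text)

# The same selection written as one path in ElementTree's XPath subset.
by_path = [t.text for t in root.findall(".//recipe/title")]

print(looped)
print(looped == by_path)
```

The \c{<title>} inside \c{<note>} is skipped by both versions because it is a grandchild of the \c{<recipe>}, not a child.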
  117. \section2 Axis Steps
  118. The most common kind of path step is called an \e{axis step}, which
  119. tells the query engine which way to navigate from the context node,
  120. and which test to perform when it encounters nodes along the way. An
  121. axis step has two parts, an \e{axis specifier}, and a \e{node test}.
  122. Conceptually, evaluation of an axis step proceeds as follows: For each
  123. node in the focus set, the query engine navigates out from the node
  124. along the specified axis and applies the node test to each node it
  125. encounters. The nodes selected by the node test are collected in the
  126. result set, which becomes the focus set for the next step.
  127. In the example XQuery above, the second and third steps are both axis
  128. steps. Both apply the \c{element(name)} node test to nodes encountered
  129. while traversing along some axis. But in this example, the two axis
  130. steps are written in a \l{Shorthand Form} {shorthand form}, where the
  131. axis specifier and the node test are not written explicitly but are
  132. implied. XQueries are normally written in this shorthand form, but
  133. they can also be written in the longhand form. If we rewrite the
  134. XQuery in the longhand form, it looks like this:
  135. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 22
  136. The two axis steps have been expanded. The first step (\c{//recipe})
  137. has been rewritten as \c{/descendant-or-self::element(recipe)}, where
  138. \c{descendant-or-self::} is the axis specifier and \c{element(recipe)}
  139. is the node test. The second step (\c{title}) has been rewritten as
  140. \c{/child::element(title)}, where \c{child::} is the axis specifier
  141. and \c{element(title)} is the node test. The output of the expanded
  142. XQuery will be exactly the same as the output of the shorthand form.
  143. To create an axis step, concatenate an axis specifier and a node
  144. test. The following sections list the axis specifiers and node tests
  145. that are available.
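As a rough model of how an axis specifier and a node test combine into one step, consider the Python sketch below. The \c{AXES} table, \c{element()} and \c{axis_step()} are hypothetical names invented for this illustration, and only three axes are modeled. ElementTree exposes only elements as nodes, so the node test here is limited to \c{element()} and \c{element(name)}.

```python
import xml.etree.ElementTree as ET

# Axes: functions from a context node to the nodes lying along that axis.
AXES = {
    "self": lambda node: [node],
    "child": lambda node: list(node),
    "descendant-or-self": lambda node: list(node.iter()),
}

def element(name=None):
    """Node test: any element, or only elements with the given name."""
    return lambda node: name is None or node.tag == name

def axis_step(axis, node_test):
    """Combine an axis specifier and a node test into a single step."""
    def step(focus):
        result = []
        for context in focus:                       # each node in the focus set
            for node in AXES[axis](context):        # navigate along the axis
                if node_test(node) and not any(node is r for r in result):
                    result.append(node)             # keep nodes passing the test
        return result
    return step

root = ET.fromstring(
    "<cookbook><recipe><title>Hard-Boiled Eggs</title></recipe></cookbook>")

find_recipes = axis_step("descendant-or-self", element("recipe"))
find_titles = axis_step("child", element("title"))
print([t.text for t in find_titles(find_recipes([root]))])
```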
  146. \section2 Axis Specifiers
  147. An axis specifier defines the direction you want the query engine to
  148. take, when it navigates away from the context node. QtXmlPatterns
  149. supports the following axes.
  150. \table
  151. \header
  152. \o Axis Specifier
  153. \o refers to the axis containing...
  154. \row
  155. \o \c{self::}
  156. \o the context node itself
  157. \row
  158. \o \c{attribute::}
  159. \o all attribute nodes of the context node
  160. \row
  161. \o \c{child::}
  162. \o all child nodes of the context node (not attributes)
  163. \row
  164. \o \c{descendant::}
  165. \o all descendants of the context node (children, grandchildren, etc)
  166. \row
  167. \o \c{descendant-or-self::}
  168. \o all nodes in \c{descendant} + \c{self}
  169. \row
  170. \o \c{parent::}
  171. \o the parent node of the context node, or empty if there is no parent
  172. \row
  173. \o \c{ancestor::}
  174. \o all ancestors of the context node (parent, grandparent, etc)
  175. \row
  176. \o \c{ancestor-or-self::}
  177. \o all nodes in \c{ancestor} + \c{self}
  178. \row
  179. \o \c{following::}
  180. \o all nodes in the tree containing the context node, \e not
  181. including \c{descendant}, \e and that follow the context node
  182. in the document
  183. \row
  184. \o \c{preceding::}
\o all nodes in the tree containing the context node, \e not
  186. including \c{ancestor}, \e and that precede the context node in
  187. the document
  188. \row
  189. \o \c{following-sibling::}
  190. \o all children of the context node's \c{parent} that follow the
  191. context node in the document
  192. \row
  193. \o \c{preceding-sibling::}
  194. \o all children of the context node's \c{parent} that precede the
  195. context node in the document
  196. \endtable
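Some of the axes in the table are easier to picture with a small model. The Python sketch below approximates \c{ancestor::}, \c{following-sibling::} and \c{preceding-sibling::} for elements only. The helper names and the sample document are hypothetical, and the parent lookup table is needed only because ElementTree elements do not record their parent.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<cookbook><recipe><title>A</title><ingredient/>"
    "<method><step/></method></recipe></cookbook>")

# ElementTree elements do not know their parent, so build the lookup once.
parent_of = {child: parent for parent in root.iter() for child in parent}

def ancestors(node):
    """ancestor:: axis, listed from the nearest ancestor outward."""
    found = []
    while node in parent_of:
        node = parent_of[node]
        found.append(node)
    return found

def following_siblings(node):
    """following-sibling:: axis (element siblings only)."""
    if node not in parent_of:
        return []
    siblings = list(parent_of[node])
    return siblings[siblings.index(node) + 1:]

def preceding_siblings(node):
    """preceding-sibling:: axis (element siblings only)."""
    if node not in parent_of:
        return []
    siblings = list(parent_of[node])
    return siblings[:siblings.index(node)]

step = root.find(".//step")
ingredient = root.find("recipe/ingredient")
print([a.tag for a in ancestors(step)])
print([s.tag for s in following_siblings(ingredient)])
print([s.tag for s in preceding_siblings(ingredient)])
```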
  197. \section2 Node Tests
  198. A node test is a conditional expression that must be true for a node
  199. if the node is to be selected by the axis step. The conditional
  200. expression can test just the \e kind of node, or it can test the \e
  201. kind of node and the \e name of the node. The XQuery specification for
  202. \l{http://www.w3.org/TR/xquery/#node-tests} {node tests} also defines
  203. a third condition, the node's \e {Schema Type}, but schema type tests
  204. are not supported in QtXmlPatterns.
  205. QtXmlPatterns supports the following node tests. The tests that have a
  206. \c{name} parameter test the node's name in addition to its \e{kind}
  207. and are often called the \l{Name Tests}.
  208. \table
  209. \header
  210. \o Node Test
  211. \o matches all...
  212. \row
  213. \o \c{node()}
  214. \o nodes of any kind
  215. \row
  216. \o \c{text()}
  217. \o text nodes
  218. \row
  219. \o \c{comment()}
  220. \o comment nodes
  221. \row
  222. \o \c{element()}
  223. \o element nodes (same as star: *)
  224. \row
  225. \o \c{element(name)}
  226. \o element nodes named \c{name}
  227. \row
  228. \o \c{attribute()}
  229. \o attribute nodes
  230. \row
  231. \o \c{attribute(name)}
  232. \o attribute nodes named \c{name}
  233. \row
  234. \o \c{processing-instruction()}
  235. \o processing-instructions
  236. \row
  237. \o \c{processing-instruction(name)}
  238. \o processing-instructions named \c{name}
  239. \row
  240. \o \c{document-node()}
  241. \o document nodes (there is only one)
  242. \row
  243. \o \c{document-node(element(name))}
  244. \o document node with document element \c{name}
  245. \endtable
  246. \target Shorthand Form
  247. \section2 Shorthand Form
  248. Writing axis steps using the longhand form with axis specifiers and
  249. node tests is semantically clear but syntactically verbose. The
  250. shorthand form is easy to learn and, once you learn it, just as easy
  251. to read. In the shorthand form, the axis specifier and node test are
  252. implied by the syntax. XQueries are normally written in the shorthand
  253. form. Here is a table of some frequently used shorthand forms:
  254. \table
  255. \header
  256. \o Shorthand syntax
  257. \o Short for...
  258. \o matches all...
  259. \row
  260. \o \c{name}
  261. \o \c{child::element(name)}
  262. \o child nodes that are \c{name} elements
  263. \row
  264. \o \c{*}
  265. \o \c{child::element()}
  266. \o child nodes that are elements (\c{node()} matches
  267. \e all child nodes)
  268. \row
  269. \o \c{..}
  270. \o \c{parent::node()}
  271. \o parent nodes (there is only one)
  272. \row
  273. \o \c{@*}
  274. \o \c{attribute::attribute()}
  275. \o attribute nodes
  276. \row
  277. \o \c{@name}
  278. \o \c{attribute::attribute(name)}
  279. \o \c{name} attributes
  280. \row
  281. \o \c{//}
  282. \o \c{descendant-or-self::node()}
\o descendant nodes (when used instead of '/')
  284. \endtable
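Several of these shorthand forms also exist in the XPath subset of Python's \c{xml.etree.ElementTree}, which makes them easy to try out. The sketch below uses a hypothetical one-recipe document. Note the differences from XQuery: ElementTree accepts \c{@name} only inside a predicate, and it exposes attributes as a dictionary, not as attribute nodes.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the attribute names are invented for the example.
root = ET.fromstring(
    '<cookbook><recipe id="r1" serves="4"><title>Soup</title>'
    '<method><step>Heat.</step></method></recipe></cookbook>')

recipe = root.find("recipe")                            # name : child named recipe
children = [c.tag for c in recipe.findall("*")]         # *    : all child elements
steps = [s.text for s in root.findall(".//step")]       # //   : at any depth
parents = [p.tag for p in root.findall(".//step/..")]   # ..   : parent of each step
with_id = [e.tag for e in root.findall(".//*[@id]")]    # @id  : as a filter only
attrs = dict(recipe.attrib)                             # @*   : a plain dictionary

print(children, steps, parents, with_id, attrs)
```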
  285. The \l{http://www.w3.org/TR/xquery/}{XQuery language specification}
  286. has a more detailed section on the shorthand form, which it calls the
  287. \l{http://www.w3.org/TR/xquery/#abbrev} {abbreviated syntax}. More
  288. examples of path expressions written in the shorthand form are found
  289. there. There is also a section listing examples of path expressions
  290. written in the \l{http://www.w3.org/TR/xquery/#unabbrev} {longhand
  291. form}.
  292. \target Name Tests
  293. \section2 Name Tests
  294. The name tests are the \l{Node Tests} that have the \c{name}
  295. parameter. A name test must match the node \e name in addition to the
  296. node \e kind. We have already seen name tests used:
  297. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 19
  298. In this path expression, both \c{recipe} and \c{title} are name tests
  299. written in the shorthand form. XQuery resolves these names
  300. (\l{http://www.w3.org/TR/xquery/#id-basics}{QNames}) to their expanded
  301. form using whatever
  302. \l{http://www.w3.org/TR/xquery/#dt-namespace-declaration} {namespace
  303. declarations} it knows about. Resolving a name to its expanded form
  304. means replacing its namespace prefix, if one is present (there aren't
  305. any present in the example), with a namespace URI. The expanded name
  306. then consists of the namespace URI and the local name.
  307. But the names in the example above don't have namespace prefixes,
  308. because we didn't include a namespace declaration in our
  309. \c{cookbook.xml} file. However, we will often use XQuery to query XML
  310. documents that use namespaces. Forgetting to declare the correct
  311. namespace(s) in an XQuery is a common cause of XQuery failures. Let's
  312. add a \e{default} namespace to \c{cookbook.xml} now. Change the
  313. \e{document element} in \c{cookbook.xml} from:
  314. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 23
  315. to...
  316. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 24
  317. This is called a \e{default namespace} declaration because it doesn't
  318. include a namespace prefix. By including this default namespace
  319. declaration in the document element, we mean that all unprefixed
  320. \e{element} names in the document, including the document element
  321. itself (\c{cookbook}), are automatically in the default namespace
  322. \c{http://cookbook/namespace}. Note that unprefixed \e{attribute}
  323. names are not affected by the default namespace declaration. They are
always considered to be in \e{no namespace}. Note also that the URL
we choose as our namespace URI need not refer to an actual location,
and doesn't refer to one in this case. Some namespace URIs do resolve,
however. One example is \l{http://www.w3.org/XML/1998/namespace}, the
namespace URI for elements and attributes prefixed with \c{xml:}.
  329. Now when we try to run the previous XQuery example, no output is
  330. produced! The path expression no longer matches anything in the
  331. cookbook file because our XQuery doesn't yet know about the namespace
  332. declaration we added to the cookbook document. There are two ways we
  333. can declare the namespace in the XQuery. We can give it a \e{namespace
  334. prefix} (e.g. \c{c} for cookbook) and prefix each name test with the
  335. namespace prefix:
  336. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 3
  337. Or we can declare the namespace to be the \e{default element
  338. namespace}, and then we can still run the original XQuery:
  339. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 4
  340. Both methods will work and produce the same output, all the
  341. \c{<title>} elements:
  342. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 5
  343. But note how the output is slightly different from the output we saw
  344. before we added the default namespace declaration to the cookbook file.
  345. QtXmlPatterns automatically includes the correct namespace attribute
  346. in each \c{<title>} element in the output. When QtXmlPatterns loads a
  347. document and expands a QName, it creates an instance of QXmlName,
  348. which retains the namespace prefix along with the namespace URI and
  349. the local name. See QXmlName for further details.
  350. One thing to keep in mind from this namespace discussion, whether you
  351. run XQueries in a Qt program using QtXmlPatterns, or you run them from
  352. the command line using xmlpatterns, is that if you don't get the
  353. output you expect, it might be because the data you are querying uses
  354. namespaces, but you didn't declare those namespaces in your XQuery.
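The same pitfall shows up in other XML tools, which makes it easy to reproduce. The Python sketch below, using \c{xml.etree.ElementTree} and a hypothetical one-recipe cookbook in the \c{http://cookbook/namespace} namespace, shows an unqualified path matching nothing and a prefixed path matching as expected. The prefix \c{c} is an arbitrary choice, as it is in XQuery.

```python
import xml.etree.ElementTree as ET

# Hypothetical cookbook with the default namespace declaration added.
COOKBOOK = """<cookbook xmlns="http://cookbook/namespace">
  <recipe><title>Hard-Boiled Eggs</title></recipe>
</cookbook>"""
root = ET.fromstring(COOKBOOK)

# The names in this path are in no namespace, so nothing matches.
unqualified = root.findall(".//recipe/title")

# Bind a prefix to the namespace URI and prefix each name test.
ns = {"c": "http://cookbook/namespace"}
qualified = root.findall(".//c:recipe/c:title", ns)

print(len(unqualified), [t.text for t in qualified])
print(qualified[0].tag)   # the expanded name: namespace URI plus local name
```

The last line prints the expanded name that the parser stores for the element, which is the pair of namespace URI and local name described above.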
  355. \section3 Wildcards in Name Tests
  356. The wildcard \c{'*'} can be used in a name test. To find all the
  357. attributes in the cookbook but select only the ones in the \c{xml}
  358. namespace, use the \c{xml:} namespace prefix but replace the
  359. \e{local name} (the attribute name) with the wildcard:
  360. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 7
  361. Oops! If you save this XQuery in \c{file.xq} and run it through
  362. \c{xmlpatterns}, it doesn't work. You get an error message instead,
  363. something like this: \e{Error SENR0001 in file:///...file.xq, at line
  364. 1, column 1: Attribute xml:id can't be serialized because it appears
  365. at the top level.} The XQuery actually ran correctly. It selected a
  366. bunch of \c{xml:id} attributes and put them in the result set. But
  367. then \c{xmlpatterns} sent the result set to a \l{QXmlSerializer}
  368. {serializer}, which tried to output it as well-formed XML. Since the
  369. result set contains only attributes and attributes alone are not
  370. well-formed XML, the \l{QXmlSerializer} {serializer} reports a
  371. \l{http://www.w3.org/TR/2005/WD-xslt-xquery-serialization-20050915/#id-errors}
  372. {serialization error}.
  373. Fear not. XQuery can do more than just find and select elements and
  374. attributes. It can \l{Constructing Elements} {construct new ones on
  375. the fly} as well, which is what we need to do here if we want
  376. \c{xmlpatterns} to let us see the attributes we selected. The example
  377. above and the ones below are revisited in the \l{Constructing
  378. Elements} section. You can jump ahead to see the modified examples
  379. now, and then come back, or you can press on from here.
  380. To find all the \c{name} attributes in the cookbook and select them
  381. all regardless of their namespace, replace the namespace prefix with
  382. the wildcard and write \c{name} (the attribute name) as the local
  383. name:
  384. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 8
  385. To find and select all the attributes of the \e{document element} in
  386. the cookbook, replace the entire name test with the wildcard:
  387. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 9
  388. \section1 Using Predicates In Path Expressions
  389. Predicates can be used to further filter the nodes selected by a path
expression. A predicate is an expression in square brackets ('[' and
']') that returns either a boolean value or a number. A predicate can
  392. appear at the end of any path step in a path expression. The predicate
  393. is applied to each node in the focus set. If a node passes the
  394. filter, the node is included in the result set. The query below
  395. selects the recipe element that has the \c{<title>} element
  396. \c{"Hard-Boiled Eggs"}.
  397. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 10
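For comparison, a similar predicate can be tried in Python's \c{xml.etree.ElementTree}, whose XPath subset supports the \c{[tag='text']} form. The cookbook and its \c{id} attributes below are hypothetical. The hand-written filter underneath spells out what the predicate does for each \c{<recipe>} in the focus.

```python
import xml.etree.ElementTree as ET

# Hypothetical cookbook; the id attributes exist only to label the results.
COOKBOOK = """<cookbook>
  <recipe id="soup"><title>Quick and Easy Mushroom Soup</title></recipe>
  <recipe id="eggs"><title>Hard-Boiled Eggs</title></recipe>
</cookbook>"""
root = ET.fromstring(COOKBOOK)

# Predicate form: keep only recipes with a matching <title> child.
matches = root.findall(".//recipe[title='Hard-Boiled Eggs']")
ids = [r.get("id") for r in matches]

# The same filter applied by hand to each node in the focus.
manual = [r.get("id") for r in root.iter("recipe")
          if any(t.text == "Hard-Boiled Eggs" for t in r.findall("title"))]

print(ids, manual)
```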
  398. The dot expression ('.') can be used in predicates and path
  399. expressions to refer to the current context node. The following query
  400. uses the dot expression to refer to the current \c{<method>} element.
  401. The query selects the empty \c{<method>} elements from the cookbook.
  402. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 11
  403. Note that passing the dot expression to the
  404. \l{http://www.w3.org/TR/xpath-functions/#func-string-length}
  405. {string-length()} function is optional. When
  406. \l{http://www.w3.org/TR/xpath-functions/#func-string-length}
  407. {string-length()} is called with no parameter, the context node is
  408. assumed:
  409. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 12
  410. Actually, selecting an empty \c{<method>} element might not be very
  411. useful by itself. It doesn't tell you which recipe has the empty
  412. method:
  413. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 31
  414. \target Empty Method Not Robust
  415. What you probably want to see instead are the \c{<recipe>} elements that
  416. have empty \c{<method>} elements:
  417. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 32
  418. The predicate uses the
  419. \l{http://www.w3.org/TR/xpath-functions/#func-string-length}
  420. {string-length()} function to test the length of each \c{<method>}
  421. element in each \c{<recipe>} element found by the node test. If a
  422. \c{<method>} contains no text, the predicate evaluates to \c{true} and
  423. the \c{<recipe>} element is selected. If the method contains some
  424. text, the predicate evaluates to \c{false}, and the \c{<recipe>}
  425. element is discarded. The output is the entire recipe that has no
  426. instructions for preparation:
  427. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 33
  428. The astute reader will have noticed that this use of
  429. \c{string-length()} to find an empty element is unreliable. It works
  430. in this case, because the method element is written as \c{<method/>},
  431. guaranteeing that its string length will be 0. It will still work if
  432. the method element is written as \c{<method></method>}, but it will
  433. fail if there is any whitespace between the opening and ending
  434. \c{<method>} tags. A more robust way to find the recipes with empty
  435. methods is presented in the section on \l{Boolean Predicates}.
  436. There are many more functions and operators defined for XQuery and
  437. XPath. They are all \l{http://www.w3.org/TR/xpath-functions}
  438. {documented in the specification}.
  439. \section2 Positional Predicates
  440. Predicates are often used to filter items based on their position in
  441. a sequence. For path expressions processing items loaded from XML
  442. documents, the normal sequence is
  443. \l{http://www.w3.org/TR/xquery/#id-document-order} {document order}.
  444. This query returns the second \c{<recipe>} element in the
  445. \c{cookbook.xml} file:
  446. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 13
  447. The other frequently used positional function is
  448. \l{http://www.w3.org/TR/xpath-functions/#func-last} {last()}, which
  449. returns the numeric position of the last item in the focus set. Stated
  450. another way, \l{http://www.w3.org/TR/xpath-functions/#func-last}
  451. {last()} returns the size of the focus set. This query returns the
  452. last recipe in the cookbook:
  453. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 16
  454. And this query returns the next to last \c{<recipe>}:
  455. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 17
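Positional predicates can be tried the same way in Python's \c{xml.etree.ElementTree}, which accepts an integer, \c{last()}, or \c{last()-N} inside the brackets. The four-recipe cookbook below is hypothetical. As in XQuery, positions are counted from 1.

```python
import xml.etree.ElementTree as ET

# Hypothetical cookbook with four recipes.
COOKBOOK = """<cookbook>
  <recipe><title>A</title></recipe>
  <recipe><title>B</title></recipe>
  <recipe><title>C</title></recipe>
  <recipe><title>D</title></recipe>
</cookbook>"""
root = ET.fromstring(COOKBOOK)

def titles(path):
    """Return the title text of each recipe selected by the path."""
    return [r.findtext("title") for r in root.findall(path)]

second = titles("recipe[2]")               # the second recipe
last = titles("recipe[last()]")            # the last recipe
next_to_last = titles("recipe[last()-1]")  # the one before the last

print(second, last, next_to_last)
```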
  456. \section2 Boolean Predicates
  457. The other kind of predicate evaluates to \e true or \e false. A
  458. boolean predicate takes the value of its expression and determines its
  459. \e{effective boolean value} according to the following rules:
  460. \list
  461. \o An expression that evaluates to a single node is \c{true}.
  462. \o An expression that evaluates to a string is \c{false} if the
  463. string is empty and \c{true} if the string is not empty.
  464. \o An expression that evaluates to a boolean value (i.e. type
  465. \c{xs:boolean}) is that value.
  466. \o If the expression evaluates to anything else, it's an error
  467. (e.g. type \c{xs:date}).
  468. \endlist
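The rules in the list can be written down as a small function. The Python sketch below is a simplified model and not the QtXmlPatterns implementation: Python values stand in for XQuery items, and \c{effective_boolean_value} is a hypothetical name. It also includes one rule taken from the XQuery specification that the list does not mention, namely that an empty sequence is \c{false}.

```python
import datetime
import xml.etree.ElementTree as ET

def effective_boolean_value(value):
    """Simplified model of the rules above, using Python stand-ins."""
    if isinstance(value, (list, tuple)) and len(value) == 0:
        return False                  # empty sequence (rule from the XQuery spec)
    if isinstance(value, ET.Element):
        return True                   # a single node is true
    if isinstance(value, str):
        return len(value) > 0         # a string is true unless it is empty
    if isinstance(value, bool):
        return value                  # a boolean is its own value
    raise TypeError("no effective boolean value for %r" % (value,))

node = ET.fromstring("<method/>")
print(effective_boolean_value(node))      # a node
print(effective_boolean_value(""))        # an empty string
print(effective_boolean_value("boil"))    # a non-empty string
print(effective_boolean_value(False))     # a boolean

# A date stands in for xs:date, which has no effective boolean value.
raised = False
try:
    effective_boolean_value(datetime.date(2012, 1, 1))
except TypeError:
    raised = True
print(raised)
```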
  469. We have already seen some boolean predicates in use. Earlier, we saw
  470. a \e{not so robust} way to find the \l{Empty Method Not Robust}
  471. {recipes that have no instructions}. \c{[string-length(method) = 0]}
  472. is a boolean predicate that would fail in the example if the empty
  473. method element was written with both opening and closing tags and
  474. there was whitespace between the tags. Here is a more robust way that
  475. uses a different boolean predicate.
  476. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 34
  477. This one uses the
\l{http://www.w3.org/TR/xpath-functions/#func-empty} {empty()}
function to test whether the method contains any steps. If the method
  480. contains no steps, then \c{empty(step)} will return \c{true}, and
  481. hence the predicate will evaluate to \c{true}.
  482. But even that version isn't foolproof. Suppose the method does contain
  483. steps, but all the steps themselves are empty. That's still a case of
  484. a recipe with no instructions that won't be detected. There is a
  485. better way:
  486. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 35
  487. This version uses the
  488. \l{http://www.w3.org/TR/xpath-functions/#func-not} {not} and
  489. \l{http://www.w3.org/TR/xpath-functions/#func-normalize-space}
{normalize-space()} functions. \c{normalize-space(method)} returns
the string value of the method element, that is, the concatenated
text of all its \c{<step>} elements, with leading and trailing
whitespace removed and each internal run of whitespace collapsed to a
single space. If that string is empty, then \c{not()} returns \c{true}
and the predicate is \c{true}.
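To see why the three tests differ, it helps to run them side by side on the awkward cases. The Python sketch below models each test with a hypothetical helper function over a hypothetical four-recipe cookbook. \c{str.split()} is used as an approximation of \c{normalize-space()}, and it treats a slightly wider set of characters as whitespace than XML does.

```python
import xml.etree.ElementTree as ET

# Hypothetical recipes: an empty method, a whitespace-only method, a method
# whose steps are all blank, and a method with real instructions.
COOKBOOK = """<cookbook>
  <recipe id="a"><method/></recipe>
  <recipe id="b"><method>
  </method></recipe>
  <recipe id="c"><method><step> </step><step/></method></recipe>
  <recipe id="d"><method><step>Boil for ten minutes.</step></method></recipe>
</cookbook>"""

def string_value(elem):
    """Concatenated text of the element and all of its descendants."""
    return "".join(elem.itertext())

def zero_length(method):              # models [string-length(method) = 0]
    return len(string_value(method)) == 0

def no_steps(method):                 # models empty(step)
    return method.find("step") is None

def blank_after_normalizing(method):  # models not(normalize-space(method))
    return " ".join(string_value(method).split()) == ""

root = ET.fromstring(COOKBOOK)
results = {}
for recipe in root.findall("recipe"):
    method = recipe.find("method")
    results[recipe.get("id")] = (
        zero_length(method), no_steps(method), blank_after_normalizing(method))

for recipe_id, flags in results.items():
    print(recipe_id, flags)
```

Only the third test reports recipes \c{a}, \c{b} and \c{c} as having no instructions while leaving recipe \c{d} alone.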
  496. We can also use the
  497. \l{http://www.w3.org/TR/xpath-functions/#func-position} {position()}
  498. function in a comparison to inspect positions with conditional logic. The
  499. \l{http://www.w3.org/TR/xpath-functions/#func-position} {position()}
  500. function returns the position index of the current context item in the
  501. sequence of items:
  502. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 14
  503. Note that the first position in the sequence is position 1, not 0. We
  504. can also select \e{all} the recipes after the first one:
  505. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 15
  506. \target Constructing Elements
  507. \section1 Constructing Elements
  508. In the section about \l{Wildcards in Name Tests} {using wildcards in
  509. name tests}, we saw three simple example XQueries, each of which
  510. selected a different list of XML attributes from the cookbook. We
  511. couldn't use \c{xmlpatterns} to run these queries, however, because
  512. \c{xmlpatterns} sends the XQuery results to a \l{QXmlSerializer}
  513. {serializer}, which expects to serialize the results as well-formed
  514. XML. Since a list of XML attributes by itself is not well-formed XML,
  515. the serializer reported an error for each XQuery.
  516. Since an attribute must appear in an element, for each attribute in
  517. the result set, we must create an XML element. We can do that using a
  518. \l{http://www.w3.org/TR/xquery/#id-for-let} {\e{for} clause} with a
  519. \l{http://www.w3.org/TR/xquery/#id-variables} {bound variable}, and a
  520. \l{http://www.w3.org/TR/xquery/#id-orderby-return} {\e{return}
  521. clause} with an element constructor:
  522. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 25
  523. The \e{for} clause produces a sequence of attribute nodes from the result
  524. of the path expression. Each attribute node in the sequence is bound
  525. to the variable \c{$i}. The \e{return} clause then constructs a \c{<p>}
  526. element around the attribute node. Here is the output:
  527. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 28
  528. The output contains one \c{<p>} element for each \c{xml:id} attribute
  529. in the cookbook. Note that XQuery puts each attribute in the right
  530. place in its \c{<p>} element, despite the fact that in the \e{return}
  531. clause, the \c{$i} variable is positioned as if it is meant to become
  532. \c{<p>} element content.
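The idea of wrapping each selected attribute in a new element can also be sketched in Python. The example below is only an analogy to the \e{for} and \e{return} clauses: ElementTree has no attribute nodes, so the loop copies each name and value pair into a newly constructed \c{<p>} element. The cookbook and its \c{xml:id} values are hypothetical.

```python
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical cookbook; the xml:id values are invented for the example.
COOKBOOK = """<cookbook>
  <recipe xml:id="MushroomSoup"><title>Quick and Easy Mushroom Soup</title></recipe>
  <recipe xml:id="HardBoiledEggs"><title>Hard-Boiled Eggs</title></recipe>
</cookbook>"""
root = ET.fromstring(COOKBOOK)

wrapped = []
for elem in root.iter():                         # every element, document order
    for name, value in elem.attrib.items():      # plays the role of "for $i in ..."
        if name.startswith("{" + XML_NS + "}"):  # keep the xml: namespace only
            # Plays the role of the return clause: one <p> per attribute.
            wrapped.append(ET.Element("p", {name: value}))

for p in wrapped:
    print(ET.tostring(p, encoding="unicode"))
```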
  533. The other two examples from the \l{Wildcards in Name Tests} {wildcard}
  534. section can be rewritten the same way. Here is the XQuery that selects
  535. all the \c{name} attributes, regardless of namespace:
  536. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 26
  537. And here is its output:
  538. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 29
  539. And here is the XQuery that selects all the attributes from the
  540. \e{document element}:
  541. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 27
  542. And here is its output:
  543. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 30
  544. \section2 Element Constructors are Expressions
  545. Because node constructors are expressions, they can be used in
  546. XQueries wherever expressions are allowed.
  547. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 40
  548. If \c{cookbook.xml} is loaded without error, a \c{<oppskrift>} element
  549. (Norwegian word for recipe) is constructed for each \c{<recipe>}
  550. element in the cookbook, and the child nodes of the \c{<recipe>} are
  551. copied into the \c{<oppskrift>} element. But if the cookbook document
  552. doesn't exist or does not contain well-formed XML, a single
  553. \c{<oppskrift>} element is constructed containing an error message.
  554. \section1 Constructing Atomic Values
  555. XQuery also has atomic values. An atomic value is a value in the value
  556. space of one of the built-in datatypes in the \l
  557. {http://www.w3.org/TR/xmlschema-2} {XML Schema language}. These
  558. \e{atomic types} have built-in operators for doing arithmetic,
  559. comparisons, and for converting values to other atomic types. See the
  560. \l {http://www.w3.org/TR/xmlschema-2/#built-in-datatypes} {Built-in
  561. Datatype Hierarchy} for the entire tree of built-in, primitive and
  562. derived atomic types. \note Click on a data type in the tree for its
  563. detailed specification.
  564. To construct an atomic value as element content, enclose an expression
  565. in curly braces and embed it in the element constructor:
  566. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 36
  567. Sending this XQuery through xmlpatterns produces:
  568. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 37
  569. To compute the value of an attribute, enclose the expression in
  570. curly braces and embed it in the attribute value:
  571. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 38
  572. Sending this XQuery through xmlpatterns produces:
  573. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 39
  581. \section1 Running The Cookbook Examples
  582. Most of the XQuery examples in this document refer to the
  583. \c{cookbook.xml} example file from the \l{Recipes Example}.
Copy \c{cookbook.xml} to your current directory, save one of the
cookbook XQuery examples in a \c{.xq} file (e.g., \c{file.xq}), and
run the XQuery using Qt's command line utility, \c{xmlpatterns}:
  587. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 6
  588. \section1 Further Reading
  589. There is much more to the XQuery language than we have presented in
  590. this short introduction. We will be adding more here in later
  591. releases. In the meantime, playing with the \c{xmlpatterns} utility
  592. and making modifications to the XQuery examples provided here will be
  593. quite informative. An XQuery textbook will be a good investment.
You can also ask questions on XQuery mailing lists:
  595. \list
  596. \o
  597. \l{http://qt.nokia.com/lists/qt-interest/}{qt-interest}
  598. \o
\l{http://www.x-query.com/mailman/listinfo/talk}{talk at x-query.com}
  600. \endlist
  601. \l{http://www.functx.com/functx/}{FunctX} has a collection of XQuery
  602. functions that can be both useful and educational.
This introduction contains many links to the specifications, which, of course,
are the ultimate source of information about XQuery. They can be a bit
difficult, though, so consider investing in a textbook. The two main
specifications are:
  606. \list
  607. \o \l{http://www.w3.org/TR/xquery/}{XQuery 1.0: An XML Query
  608. Language} - the main source for syntax and semantics.
\o \l{http://www.w3.org/TR/xpath-functions/}{XQuery 1.0 and XPath
2.0 Functions and Operators} - the built-in functions and operators.
  611. \endlist
  612. \section1 FAQ
  613. The answers to these frequently asked questions explain the causes of
  614. several common mistakes that most beginners make. Reading through the
  615. answers ahead of time might save you a lot of head scratching.
  616. \section2 Why didn't my path expression match anything?
  617. The most common cause of this bug is failure to declare one or more
  618. namespaces in your XQuery. Consider the following query for selecting
  619. all the examples in an XHTML document:
  620. \quotefile snippets/patternist/simpleHTML.xq
  621. It won't match anything because \c{index.html} is an XHTML file, and
  622. all XHTML files declare the default namespace
  623. \c{"http://www.w3.org/1999/xhtml"} in their top (\c{<html>}) element.
  624. But the query doesn't declare this namespace, so the path expression
  625. expands \c{html} to \c{{}html} and tries to match that expanded name.
  626. But the actual expanded name is
  627. \c{{http://www.w3.org/1999/xhtml}html}. One possible fix is to declare the
  628. correct default namespace in the XQuery:
  629. \quotefile snippets/patternist/simpleXHTML.xq
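Another possible fix is to bind the XHTML namespace to a prefix and use
that prefix in every element name test. The following is only a sketch: it
assumes that the examples are \c{<p>} elements marked with a
\c{class="example"} attribute somewhere below \c{<body>}, which may not
match your document, and the prefix \c{x} is arbitrary:

\code
declare namespace x = "http://www.w3.org/1999/xhtml";
(: attributes without a prefix are in no namespace, so @class needs none :)
doc("index.html")/x:html/x:body//x:p[@class = "example"]
\endcode

This leaves the default element namespace empty, which is convenient when
the same query also has to match elements that are in no namespace.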
  630. Another common cause of this bug is to confuse the \e{document node}
  631. with the top element node. They are different. This query won't match
  632. anything:
  633. \quotefile snippets/patternist/docPlainHTML.xq
  634. The \c{doc()} function returns the \e{document node}, not the top
  635. element node (\c{<html>}). Don't forget to match the top element node
  636. in the path expression:
  637. \quotefile snippets/patternist/docPlainHTML2.xq
  638. \section2 What if my input namespace is different from my output namespace?
  639. Just remember to declare both namespaces in your XQuery and use them
  640. properly. Consider the following query, which is meant to generate
  641. XHTML output from XML input:
  642. \quotefile snippets/patternist/embedDataInXHTML.xq
  643. We want the \c{<html>}, \c{<body>}, and \c{<p>} nodes we create in the
  644. output to be in the standard XHTML namespace, so we declare the
  645. default namespace to be \c{http://www.w3.org/1999/xhtml}. That's
  646. correct for the output, but that same default namespace will also be
  647. applied to the node names in the path expression we're trying to match
  648. in the input (\c{/tests/test[@status = "failure"]}), which is wrong,
because the elements in \c{testResult.xml} are in a different namespace,
perhaps the empty namespace. So we must declare that namespace too, with a
  651. namespace prefix, and then use the prefix with the node names in
  652. the path expression. This one will probably work better:
  653. \quotefile snippets/patternist/embedDataInXHTML2.xq
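The general pattern can be sketched as follows. The input namespace of
\c{testResult.xml} is not shown here, so this sketch assumes,
hypothetically, that the input elements are in the namespace
\c{http://example.com/tests} and that each \c{<test>} element has a
\c{name} attribute:

\code
(: output elements: unprefixed names are in the XHTML namespace :)
declare default element namespace "http://www.w3.org/1999/xhtml";
(: input elements: hypothetical namespace, bound to the prefix t :)
declare namespace t = "http://example.com/tests";

<html>
    <body>
    {
        for $test in doc("testResult.xml")/t:tests/t:test[@status = "failure"]
        return <p>{ string($test/@name) }</p>
    }
    </body>
</html>
\endcode

The attribute names need no prefix, because the default element namespace
does not apply to attributes.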
  654. \section2 Why doesn't my return clause work?
  655. Recall that XQuery is an \e{expression-based} language, not
  656. \e{statement-based}. Because an XQuery is a lot of expressions,
  657. understanding XQuery expression precedence is very important.
  658. Consider the following query:
  659. \quotefile snippets/patternist/forClause2.xq
It looks OK, but it isn't. It is supposed to be a FLWOR expression
  661. comprising a \e{for} clause and a \e{return} clause, but it isn't just
  662. that. It \e{has} a FLWOR expression, certainly (with the \e{for} and
  663. \e{return} clauses), but it \e{also} has an arithmetic expression
  664. (\e{+ $d}) dangling at the end because we didn't enclose the return
  665. expression in parentheses.
Using parentheses to establish precedence is more important in XQuery
than in other languages, because XQuery is \e{expression-based}.
In this case, without parentheses enclosing \c{$i + $d}, the return
clause only returns \c{$i}. The \c{+$d} will have the result of the
  670. FLWOR expression as its left operand. And, since the scope of variable
  671. \c{$d} ends at the end of the \e{return} clause, a variable out of
  672. scope error will be reported. Correct these problems by using
  673. parentheses.
  674. \quotefile snippets/patternist/forClause.xq
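The effect of the parentheses can be seen in a self-contained sketch. The
sequence \c{(1, 2, 3)} and the value \c{10} are made-up inputs, not the
contents of the quoted files:

\code
(: Without the parentheses, this would be parsed as
   (for ... return $i) + $d, and $d would be out of scope. :)
for $i in (1, 2, 3)
let $d := 10
return ($i + $d)
\endcode

With the parentheses in place, the query should evaluate to \e{11 12 13}.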
  675. \section2 Why didn't my expression get evaluated?
  676. You probably misplaced some curly braces. When you want an expression
  677. evaluated inside an element constructor, enclose the expression in
  678. curly braces. Without the curly braces, the expression will be
  679. interpreted as text. Here is a \c{sum()} expression used in an \c{<e>}
  680. element. The table shows cases where the curly braces are missing,
  681. misplaced, and placed correctly:
  682. \table
  683. \header
  684. \o element constructor with expression...
  685. \o evaluates to...
  686. \row
  687. \o <e>sum((1, 2, 3))</e>
  688. \o <e>sum((1, 2, 3))</e>
  689. \row
  690. \o <e>sum({(1, 2, 3)})</e>
  691. \o <e>sum(1 2 3)</e>
  692. \row
  693. \o <e>{sum((1, 2, 3))}</e>
  694. \o <e>6</e>
  695. \endtable
  696. \section2 My predicate is correct, so why doesn't it select the right stuff?
  697. Either you put your predicate in the wrong place in your path
  698. expression, or you forgot to add some parentheses. Consider this
  699. input file \c{doc.txt}:
  700. \quotefile snippets/patternist/doc.txt
  701. Suppose you want the first \c{<span>} element of every \c{<p>}
  702. element. Apply a position filter (\c{[1]}) to the \c{/span} path step:
  703. \quotefile snippets/patternist/filterOnStep.xq
  704. Applying the \c{[1]} filter to the \c{/span} step returns the first
  705. \c{<span>} element of each \c{<p>} element:
  706. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 41
\note You can write the same query this way:
  708. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 44
  709. Or you can reduce it right down to this:
  710. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 45
  711. On the other hand, suppose you really want only one \c{<span>}
  712. element, the first one in the document (i.e., you only want the first
  713. \c{<span>} element in the first \c{<p>} element). Then you have to do
  714. more filtering. There are two ways you can do it. You can apply the
  715. \c{[1]} filter in the same place as above but enclose the path
  716. expression in parentheses:
  717. \quotefile snippets/patternist/filterOnPath.xq
  718. Or you can apply a second position filter (\c{[1]} again) to the
  719. \c{/p} path step:
  720. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 43
  721. Either way the query will return only the first \c{<span>} element in
  722. the document:
  723. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 42
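The difference between the two filter placements can also be tried without
an input file. In this sketch the \c{<doc>} element and the wrapper
elements \c{<per-p>} and \c{<whole-doc>} are invented for illustration:

\code
let $doc :=
    <doc>
        <p><span>a</span><span>b</span></p>
        <p><span>c</span><span>d</span></p>
    </doc>
return (
    (: [1] filters the span step: the first span of each p :)
    <per-p>{ $doc/p/span[1] }</per-p>,
    (: [1] filters the whole path result: the first span overall :)
    <whole-doc>{ ($doc/p/span)[1] }</whole-doc>
)
\endcode

The first constructor should contain the \c{<span>} elements \e{a} and
\e{c}, and the second should contain only \e{a}.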
  724. \section2 Why doesn't my FLWOR behave as expected?
  725. The quick answer is you probably expected your XQuery FLWOR to behave
  726. just like a C++ \e{for} loop. But they aren't the same. Consider a
  727. simple example:
  728. \quotefile snippets/patternist/letOrderBy.xq
  729. This query evaluates to \e{4 -4 -2 2 -8 8}. The \e{for} clause does
  730. set up a \e{for} loop style iteration, which does evaluate the rest of
  731. the FLWOR multiple times, one time for each value returned by the
  732. \e{in} expression. That much is similar to the C++ \e{for} loop.
  733. But consider the \e{return} clause. In C++ if you hit a \e{return}
  734. statement, you break out of the \e{for} loop and return from the
  735. function with one value. Not so in XQuery. The \e{return} clause is
  736. the last clause of the FLWOR, and it means: \e{Append the return value
  737. to the result list and then begin the next iteration of the FLWOR}.
  738. When the \e{for} clause's \e{in} expression no longer returns a value,
  739. the entire result list is returned.
  740. Next, consider the \e{order by} clause. It doesn't do any sorting on
  741. each iteration of the FLWOR. It just evaluates its expression on each
  742. iteration (\c{$a} in this case) to get an ordering value to map to the
  743. result item from each iteration. These ordering values are kept in a
  744. parallel list. The result list is sorted at the end using the parallel
  745. list of ordering values.
  746. The last difference to note here is that the \e{let} clause does
  747. \e{not} set up an iteration through a sequence of values like the
  748. \e{for} clause does. The \e{let} clause isn't a sort of nested loop.
  749. It isn't a loop at all. It is just a variable binding. On each
  750. iteration, it binds the \e{entire} sequence of values on the right to
  751. the variable on the left. In the example above, it binds (4 -4) to
  752. \c{$b} on the first iteration, (-2 2) on the second iteration, and (-8
  753. 8) on the third iteration. So the following query doesn't iterate
  754. through anything, and doesn't do any ordering:
  755. \quotefile snippets/patternist/invalidLetOrderBy.xq
  756. It binds the entire sequence (2, 3, 1) to \c{$i} one time only; the
  757. \e{order by} clause only has one thing to order and hence does
  758. nothing, and the query evaluates to 2 3 1, the sequence assigned to
  759. \c{$i}.
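If the intent was to sort the three values, a \e{for} clause is needed, so
that \e{order by} gets one ordering value per iteration. A minimal sketch,
reusing the values from the text above:

\code
(: $i is bound to one item per iteration, so there is something to sort :)
for $i in (2, 3, 1)
order by $i
return $i
\endcode

This should evaluate to \e{1 2 3}.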
  760. \note We didn't include a \e{where} clause in the example. The
  761. \e{where} clause is for filtering results.
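For completeness, here is a sketch of a \e{where} clause, with a made-up
input sequence. The clause is evaluated once per iteration and discards
the iterations for which its expression is false:

\code
for $i in (4, 1, 3, 2)
where $i mod 2 = 0      (: keep only the even values :)
order by $i
return $i
\endcode

Only the even values survive the filter, so the query should evaluate to
\e{2 4}.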
  762. \section2 Why are my elements created in the wrong order?
The short answer is that your elements are \e{not} created in the wrong
order: when element constructors appear as operands of a path expression,
there is no correct order. Consider the following query,
  766. which again uses the input file \c{doc.txt}:
  767. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 46
  768. The query finds all the \c{<p>} elements in the file. For each \c{<p>}
  769. element, it builds a \c{<p>} element in the output containing the
  770. concatenated contents of all the \c{<p>} element's child \c{<span>}
  771. elements. Running the query through \c{xmlpatterns} might produce the
  772. following output, which is not sorted in the expected order.
  773. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 47
  774. You can use a \e{for} loop to ensure that the order of
  775. the result set corresponds to the order of the input sequence:
  776. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 48
  777. This version produces the same result set but in the expected order:
  778. \snippet snippets/code/doc_src_qtxmlpatterns.qdoc 49
  779. \section2 Why can't I use \c{true} and \c{false} in my XQuery?
  780. You can, but not by just using the names \c{true} and \c{false}
  781. directly, because they are \l{Name Tests} {name tests} although they look
  782. like boolean constants. The simple way to create the boolean values is
to use the built-in functions \c{true()} and \c{false()} wherever
  784. you want to use \c{true} and \c{false}. The other way is to invoke the
  785. boolean constructor:
  786. \quotefile snippets/patternist/xsBooleanTrue.xq
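The two approaches can be compared side by side in a short sketch. The
values are chosen only for illustration:

\code
(
    true(),               (: built-in function :)
    false(),
    xs:boolean("true"),   (: constructor applied to a string :)
    xs:boolean("0")       (: "0" and "false" both construct false :)
)
\endcode

This should evaluate to \e{true false true false}. A bare \c{true}, by
contrast, would be parsed as a path step that selects child elements named
\c{true}.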
  787. */