/trunk/contribs/saxon6_5_5/doc/extensibility.html

https://bitbucket.org/haris_peco/debrief
  1. <html>
  2. <head>
  3. <meta HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=windows-1252">
<title>Extensibility</title>
</head>
  5. <body leftmargin="150" bgcolor="#ddeeff"><font face="Arial, Helvetica, sans-serif">
  6. <div align=right><a href="index.html">SAXON home page</a></div>
  7. <p><font FACE="Arial, Helvetica, sans-serif" color="#FF0080" size="7">Extensibility</font></p>
  8. <hr>
<p>This page describes how to extend the capabilities of SAXON XSLT stylesheets.</p>
  10. <div><font face="Arial, Helvetica, sans-serif">
  11. <table width="100%">
  12. <tr>
  13. <td width="100%" bgcolor="#0000FF"><font color="#FFFFFF"><big><b>Contents</b></big></font></td>
  14. </tr>
  15. <tr>
  16. <td VALIGN="top" bgcolor="#00FFFF">
  17. <a href="#Writing-extension-functions">Writing extension functions</a><br>
  18. <a href="#Writing-extension-elements">Writing extension elements</a><br>
  19. <a href="#Writing-Java-node-handlers">Writing Java node handlers</a><br>
  20. <a href="#Writing-input-filters">Writing input filters</a><br>
  21. <a href="#Writing-output-filters">Writing output filters</a><br>
  22. <a href="#Implementing-a-collating-sequence">Implementing a collating sequence</a><br>
  23. <a href="#Implementing-a-numbering-sequence">Implementing a numbering sequence</a><br>
  24. <a href="#Adding-an-output-encoding">Adding an output encoding</a>
  25. </td>
  26. </tr>
  27. </table>
  28. </font></div>
  29. <a name="Writing-extension-functions"><h2>Writing extension functions</h2></a>
  30. <p>An extension function is invoked using a name such as <b>prefix:localname()</b>.
  31. The prefix must
  32. be the prefix associated with a namespace declaration that is in scope. </p>
  33. <p>Extension functions may be implemented in Java or in XSLT. For information on writing
  34. functions in XSLT, see <a href="extensions.html#saxon:function">the description of the saxon:function
element</a>. The following information applies to extension functions implemented in Java.</p>
  36. <p>Saxon supports the &lt;xsl:script&gt; element defined in the XSLT 1.1 working draft.
  37. It also supports &lt;saxon:script&gt; as a synonym (use this if you want other XSLT processors
  38. to ignore the element). This element defines a mapping between a namespace URI used in calls
  39. of extension functions, and a Java class that contains implementations of these functions.
  40. See <a href="xsl-elements.html#xsl:script">xsl:script</a> for details.</p>
  41. <p>You can also use a short-cut technique of binding external Java classes, by making the
  42. class name part of the namespace URI. In this case, you don't need an &lt;xsl:script&gt;
  43. element.</p>
  44. <p>With the short-cut technique, the URI for the
  45. namespace identifies the class where the external function will be found.
  46. The namespace URI must either be "java:" followed by the fully-qualified class name
  47. (for example xmlns:date="java:java.util.Date"),
or a string containing a "/", in which the fully-qualified class name appears after the final "/"
(for example xmlns:date="http://www.jclark.com/xt/java/java.util.Date"). The part of
  50. the URI before the final "/" is immaterial. The class must be on the classpath. For compatibility
  51. with previous releases, the format xmlns:date="java.util.Date" is also supported.</p>
  52. <p>The SAXON namespace URI "http://icl.com/saxon" is recognised as a special case, and causes the
  53. function to be loaded from the class com.icl.saxon.functions.Extensions. This class name can be
  54. specified explicitly if you prefer.</p>
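<p>The URI rules above can be sketched in a few lines of Java. This is only an illustration of the
rules as described on this page, not SAXON's own code: the class name <b>NamespaceClassNames</b> and
the method <b>classNameFor()</b> are hypothetical, and checking for the "java:" prefix before looking
for a "/" is an assumption.</p>

```java
/** Hypothetical helper (not part of SAXON): maps a namespace URI to a Java class name. */
final class NamespaceClassNames {
    private NamespaceClassNames() {}

    static String classNameFor(String namespaceUri) {
        // Special case described above: the SAXON namespace maps to the built-in extensions class.
        if (namespaceUri.equals("http://icl.com/saxon")) {
            return "com.icl.saxon.functions.Extensions";
        }
        // "java:" followed by the fully-qualified class name (assumed to be checked first).
        if (namespaceUri.startsWith("java:")) {
            return namespaceUri.substring("java:".length());
        }
        // Otherwise the class name is whatever follows the final "/".
        int lastSlash = namespaceUri.lastIndexOf('/');
        if (lastSlash >= 0) {
            return namespaceUri.substring(lastSlash + 1);
        }
        // Older format: the URI is the bare class name.
        return namespaceUri;
    }
}
```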
  55. <p>There are three cases to consider: static methods, constructors, and instance-level methods.</p>
  56. <p>Static methods can be called directly.
  57. The localname of the function must match the name of a public static method in this class. The names
  58. match if they contain the same characters, excluding hyphens and forcing any character that follows
  59. a hyphen to upper-case. For example the XPath function call "to-string()" matches the Java method
  60. "toString()"; but the function call can also be written as "toString()" if you prefer.
  61. If there are several methods in the class that match the localname, the system attempts
  62. to find the one that is the best fit to the types of the supplied arguments: for example if the
  63. call is f(1,2) then a method with two int arguments will be preferred to one with two String
  64. arguments. The detailed rules are quite complex, and are described in the XSLT 1.1 working draft.
  65. If there are several methods with the same name and the correct number of arguments, but none is
  66. preferable to the others under these rules, an error is reported.</p>
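<p>The name-matching rule (drop each hyphen and force the character that follows it to upper-case) can
be sketched as follows. The class <b>FunctionNames</b> and its method are hypothetical names used for
illustration only; the sketch does not attempt the best-fit selection among overloaded methods.</p>

```java
/** Hypothetical helper (not part of SAXON): converts an XPath local name to a Java method name. */
final class FunctionNames {
    private FunctionNames() {}

    static String toJavaMethodName(String localName) {
        StringBuilder sb = new StringBuilder(localName.length());
        boolean upperNext = false;
        for (int i = 0; i < localName.length(); i++) {
            char c = localName.charAt(i);
            if (c == '-') {
                // The hyphen itself is dropped; the next character is upper-cased.
                upperNext = true;
            } else if (upperNext) {
                sb.append(Character.toUpperCase(c));
                upperNext = false;
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```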
  67. <p>For example:</p>
  68. <pre><code>
  69. &lt;xsl:value-of select="math:sqrt($arg)"
  70. xmlns:math="java:java.lang.Math"/&gt;
  71. </code></pre>
  72. <p>This will invoke the static method java.lang.Math.sqrt(), applying it to the value of the variable
  73. $arg, and copying the value of the square root of $arg to the result tree.</p>
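<p>User-written static methods are called in the same way. The following is a minimal sketch of such
a class; <b>TextFunctions</b> and both of its methods are hypothetical. Because the sketch is in the
default package, it could be bound with xmlns:text="java:TextFunctions" and called as
text:reverse('abc') or text:word-count('a b c'). In practice the class itself should probably also be
declared public, in its own source file, so that SAXON can reach it.</p>

```java
/** Hypothetical extension-function class: every public static method can be called from XPath. */
final class TextFunctions {
    private TextFunctions() {}

    /** Returns a Java String, which becomes an XPath string. */
    public static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    /** Returns a Java int, which becomes an XPath number. Callable as word-count() or wordCount(). */
    public static int wordCount(String s) {
        String trimmed = s.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }
}
```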
  74. <p>Constructors are called by using the function named new(). If there are several constructors, then again
  75. the system tries to find the one that is the best fit, according to the types of the supplied arguments. The result
  76. of calling new() is an XPath value of type Java Object; the only things that can be done with a Java Object
  77. are to assign it to a variable, to pass it to an extension function, and to convert it to a string, number,
  78. or boolean, using the rules given below.</p>
  79. <p>Instance-level methods are called by supplying an extra first argument of type Java Object which is the
  80. object on which the method is to be invoked. A Java Object is usually created by calling an extension
  81. function (e.g. a constructor) that returns an object; it may also be passed to the style sheet as the
  82. value of a global parameter. Matching of method names is done as for static methods.
  83. If there are several methods in the class that match the localname, the system again tries to
  84. find the one that is the best fit, according to the types of the supplied arguments.</p>
<p>For example, the following stylesheet prints the date and time. This example is copied from the
  86. documentation of the xt product, and it works unchanged with SAXON, because SAXON
  87. does not care what the namespace URI for extension functions is, so long as it ends with
  88. the class name. (Extension functions are likely to be compatible between SAXON and xt
  89. provided they only use the data types string, number, and boolean).</p>
<pre><code>
&lt;xsl:stylesheet
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:date="http://www.jclark.com/xt/java/java.util.Date"&gt;
&lt;xsl:template match="/"&gt;
  &lt;html&gt;
    &lt;xsl:if test="function-available('date:to-string') and function-available('date:new')"&gt;
      &lt;p&gt;&lt;xsl:value-of select="date:to-string(date:new())"/&gt;&lt;/p&gt;
    &lt;/xsl:if&gt;
  &lt;/html&gt;
&lt;/xsl:template&gt;
&lt;/xsl:stylesheet&gt;
</code></pre>
<p>A wrapped Java object may be converted to another data type as follows.</p>
  105. <ul>
  106. <li>It is converted to a string by using its toString() method; if the object is null, the result is
  107. the empty string "".</li>
  108. <li>It is converted to a number by converting it first to a string, and then applying the
  109. XPath number() conversion. If it is null, the result is NaN.</li>
  110. <li>It is converted to a boolean as follows: if it is null, the result is false, otherwise
  111. it is converted to a string and the result is true if and only if the string is non-empty.</li>
  112. </ul>
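<p>The three conversions listed above can be sketched in Java as follows. This is an illustration of
the stated rules, not SAXON's implementation: the class <b>WrappedObjectConversions</b> is
hypothetical, and the regular expression is an approximation of the XPath 1.0 number() syntax
(optional whitespace, optional minus sign, decimal digits with an optional fraction).</p>

```java
import java.util.regex.Pattern;

/** Hypothetical sketch of the conversion rules for a wrapped Java object. */
final class WrappedObjectConversions {
    // XPath 1.0 numbers: no exponent, no leading "+", no "Infinity" or hexadecimal forms.
    private static final Pattern XPATH_NUMBER =
            Pattern.compile("[ \\t\\r\\n]*-?(\\d+(\\.\\d*)?|\\.\\d+)[ \\t\\r\\n]*");

    private WrappedObjectConversions() {}

    /** toString() of the object, or the empty string for null. */
    static String toXPathString(Object o) {
        return o == null ? "" : o.toString();
    }

    /** Converts via the string value; null or a non-numeric string gives NaN. */
    static double toXPathNumber(Object o) {
        if (o == null) {
            return Double.NaN;
        }
        String s = o.toString();
        return XPATH_NUMBER.matcher(s).matches() ? Double.parseDouble(s.trim()) : Double.NaN;
    }

    /** False for null; otherwise true if and only if the string value is non-empty. */
    static boolean toXPathBoolean(Object o) {
        return o != null && !o.toString().isEmpty();
    }
}
```

<p>Note that under these rules an object whose string value is "0" or "false" still converts to the
boolean true, because only the emptiness of the string is tested.</p>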
  113. <p>The method may have an extra first argument of class com.icl.saxon.Context or org.w3c.xsl.XSLTContext.
  114. This argument is not
  115. supplied by the calling XSL code, but by SAXON itself. The Context object provides methods to access many
  116. internal SAXON resources, the most useful being getContextNode() which returns the context node in the
  117. source document. The Context object is not available with constructors.</p>
  118. <p>If any exceptions are thrown by the method, or if a matching method cannot be found,
  119. processing of the stylesheet will be abandoned.</p>
  120. <p>The result type of the method is converted to an XPath value as follows.</p>
  121. <ul>
<li>If the method returns void, the XPath value is an empty node-set.</li>
  123. <li>If the method returns null, the XPath value is a wrapped null Object.</li>
  124. <li>If the method is a constructor, the XPath value is of type "wrapped Java object". The only way of
  125. using this is by passing it to another external function, or by converting it to one of the standard
  126. XPath data types as described above.</li>
  127. <li>If the returned value is a Java boolean or Boolean, the XPath result is a boolean.</li>
  128. <li>If the returned value is a Java int, short, long, double, or float, or one of their object wrapper
  129. equivalents, it is converted to a double using Java casting, and the XPath result is a number.</li>
  130. <li>If the returned value is a Java String, the XPath result is a string.</li>
  131. <li>If the returned value is of class com.icl.saxon.om.NodeInfo (a node in a Saxon tree), it is
  132. returned as a node-set containing a single node.</li>
  133. <li>If the returned value is a DOM NodeList, the list of nodes is returned as a Saxon node-set. However,
  134. all the nodes must be instances of class com.icl.saxon.om.NodeInfo, that is, they must use Saxon's tree
  135. implementation, not some third-party DOM. But any implementation of NodeList can be used. The nodes
  136. can come from the original source tree, or from a newly-constructed tree, so long as it is constructed
  137. using Saxon.</li>
  138. <li>If the returned value is a DOM Node that is not an instance of class com.icl.saxon.om.NodeInfo, it is
  139. rejected: the result must use Saxon's DOM implementation, not some third-party DOM.</li>
  140. <li>If the result is any other Java object (including null), it is returned as a "wrapped Java object".</li>
  141. </ul>
  142. <p>Note that Saxon's tree structure conforms to the DOM Core Level 2 interface. However, it is read-only:
  143. any attempt to modify the tree causes an exception. Saxon's trees can only be built using the Saxon
  144. subclasses of the com.icl.saxon.tree.Builder class, and they cannot be modified <i>in situ</i>.</p>
  145. <p>The system function <b>function-available(String name)</b> returns true if there appears
  146. to be a method available with the right name. It does not test whether this method has the appropriate
  147. number of arguments or whether the arguments are of appropriate types. If the function name is "new" it
  148. returns true so long as the class is not an abstract class or interface, and so long as it has at least
  149. one constructor.</p>
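<p>The test performed by function-available() can be approximated with Java reflection. The sketch
below is an assumption about how such a check might look, not SAXON's code: the class
<b>FunctionAvailability</b> is hypothetical, it expects the name already in Java form (for example
"toString" rather than "to-string"), and it counts only public methods and public constructors.</p>

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

/** Hypothetical sketch of the function-available() test for a Java class. */
final class FunctionAvailability {
    private FunctionAvailability() {}

    static boolean isAvailable(Class<?> target, String javaMethodName) {
        if (javaMethodName.equals("new")) {
            // Constructors: the class must be instantiable and have at least one public constructor.
            boolean instantiable = !target.isInterface() && !Modifier.isAbstract(target.getModifiers());
            return instantiable && target.getConstructors().length > 0;
        }
        // Methods: only the name is checked, not the number or types of the arguments.
        for (Method m : target.getMethods()) {
            if (m.getName().equals(javaMethodName)) {
                return true;
            }
        }
        return false;
    }
}
```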
  150. <p>There are a number of extension functions supplied with the SAXON product: for details, see
  151. <a href="extensions.html">extensions.html</a>. The source code of these methods, which
  152. in most cases is extremely simple, can be used as an example for writing
other user extension functions. It is found in class com.icl.saxon.functions.Extensions.</p>
  154. <a name="Writing-extension-elements"><h2>Writing extension elements</h2></a>
  155. <p>SAXON implements the element extensibility feature defined in section 14.1 of the standard.
  156. This feature allows you to define your own element types for use in the stylesheet. </p>
  157. <p>If a namespace prefix is to be used to denote extension elements, it must be declared in the
  158. <b>extension-element-prefixes</b> attribute on the xsl:stylesheet element, or the
  159. <b>xsl:extension-element-prefixes</b> attribute on any enclosing literal result element or
extension element.</p>
  161. <p>Note that SAXON itself provides a number of stylesheet elements beyond those defined in the
  162. XSLT specification, including saxon:assign, saxon:entity-ref, saxon:while,
  163. saxon:group, saxon:item. To enable these, use the standard XSL extension mechanism: define
  164. <b>extension-element-prefixes="saxon"</b> on the xsl:stylesheet element, or
  165. <b>xsl:extension-element-prefixes="saxon"</b> on any enclosing literal result element.</p>
  166. <p>To invoke a user-defined set of extension elements, include the prefix in this attribute as
  167. described, and associate it with a namespace URI that ends in "/" followed by the fully qualified
  168. class name of a Java class that implements the <b>com.icl.saxon.style.ExtensionElementFactory</b> interface.
  169. This interface defines a single method, <b>getExtensionClass()</b>, which takes the local name of the element
  170. (i.e., the name without its namespace prefix) as a parameter, and returns the Java class used to
  171. implement this extension element (for example, "return SQLConnect.class"). The class returned must
  172. be a subclass of com.icl.saxon.style.StyleElement.</p>
  173. <p>The best way to see how to implement an extension element is by looking at the example, for SQL
  174. extension elements, provided in package <b>com.icl.saxon.sql</b>, and at the sample stylesheet <b>books.sqlxsl</b>
which uses these extension elements. The main methods a StyleElement
class must provide are:</p>
  177. <table>
  178. <tr><td valign=top width="30%">prepareAttributes()</td>
  179. <td>This is called while the stylesheet tree is still being built, so it should not attempt
  180. to navigate the tree. Its task is to validate the attributes of the stylesheet element and
  181. perform any preprocessing necessary. For example, if the attribute is an attribute value template,
  182. this includes creating an Expression that can subsequently be evaluated to get the AVT's
  183. value.</td></tr>
  184. <tr><td valign=top>validate()</td>
  185. <td>This is called once the tree has been built, and its task is to check that the stylesheet
element appears in the right context within the tree, e.g. that it is within a template.</td></tr>
  187. <tr><td valign=top>process()</td>
  188. <td>This is called to process a particular node in the source document, which can be accessed
  189. by reference to the Context supplied as a parameter.</td></tr>
  190. <tr><td valign=top>isInstruction()</td>
  191. <td>This should return true, to ensure that the element is allowed to appear
  192. within a template body.</td></tr>
<tr><td valign=top>mayContainTemplateBody()</td>
<td>This should return true, to ensure that the element can contain instructions.
Even if it can't contain anything else, extension elements should allow an xsl:fallback
instruction to provide portability between processors.</td></tr>
  197. </table>
  198. <p>The StyleElement class has access to many services supplied either via its superclasses or via
  199. the Context object. For details, see the API documentation of the individual classes.</p>
  200. <p>Any element whose prefix matches a namespace listed in the extension-element-prefixes
  201. attribute of an enclosing element is treated as an extension element. If no class can be
  202. instantiated for the element (for example, because no ExtensionElementFactory can be loaded,
  203. or because the ExtensionElementFactory doesn't recognise the local name), then fallback
  204. action is taken as follows. If the element has one or more xsl:fallback children, they are
  205. processed. Otherwise, an error is reported. When xsl:fallback is used in any other context, it
  206. and its children are ignored.</p>
  207. <p>It is also possible to test whether an extension element is implemented by using the system
  208. function element-available(). This returns true if the namespace of the element identifies
  209. it as an extension element (or indeed as a standard XSL instruction) and if a class can be instantiated
  210. to represent it. If the namespace is not that of an extension element, or if no class can be
  211. instantiated, it returns false.</p>
  212. <a name="Writing-Java-node-handlers"><h2>Writing Java node handlers</h2></a>
  213. <p>A Java node handler can be used to process any node, in place of an XSL template. The handler is
  214. nominated by using a saxon:handler element with a handler attribute that names the node handler class. The
  215. handler itself is an implementation of com.icl.saxon.NodeHandler or one of its subclasses (the most usual being
  216. com.icl.saxon.ElementHandler). The saxon:handler element must be a top-level element, and must be
  217. empty. It takes the same attributes as xsl:template (match, mode, name, and priority) and is
  218. considered along with xsl:template elements to decide which template to execute when xsl:call-template
  219. or xsl:apply-templates is used.</p>
  220. <p>Java node handlers have full access to the source document and the current processing context (for example, the
values of parameters). They may also trigger processing of other nodes in the document by calling
  222. applyTemplates(): this works just like xsl:apply-templates, and the selected nodes may be processed either
  223. by XSL templates or by further Java node handlers.</p>
  224. <p>A Java node handler may also be registered with a name, and may thus be invoked using xsl:call-template. There
  225. is no direct mechanism for a Java node handler to call a named XSLT template, but the effect can be achieved
  226. by using a mode that identifies the called template uniquely.</p>
  227. <a name="Writing-input-filters"><h2>Writing input filters</h2></a>
  228. <p>SAXON takes its input from a SAX2 Parser reading from an InputSource. A very useful technique is to
  229. interpose a <i>filter</i> between the parser and SAXON. The filter will typically be an
  230. instance of the SAX2 <b>XMLFilter</b> class.
  231. </p>
  232. <p>See the TrAX examples for hints on using a Saxon Transformer as part of a chain of
  233. SAX Filters.</p>
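<p>As an illustration, the following sketch shows a SAX2 filter that removes every element with a
given local name, together with its content, before the events reach the consumer. It uses only the
standard org.xml.sax classes; the class names <b>DropElementFilter</b> and
<b>DropElementFilterDemo</b> are hypothetical, and the DefaultHandler in the demo merely stands in
for SAXON as the receiver of the filtered events.</p>

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical filter: drops elements with the given local name, including their content. */
class DropElementFilter extends XMLFilterImpl {
    private final String localNameToDrop;
    private int skipDepth = 0; // greater than zero while inside a dropped element

    DropElementFilter(XMLReader parent, String localNameToDrop) {
        super(parent);
        this.localNameToDrop = localNameToDrop;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        if (skipDepth > 0 || localName.equals(localNameToDrop)) {
            skipDepth++;
        } else {
            super.startElement(uri, localName, qName, atts);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (skipDepth > 0) {
            skipDepth--; // the matching start tag was suppressed, so the output stays well-balanced
        } else {
            super.endElement(uri, localName, qName);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (skipDepth == 0) {
            super.characters(ch, start, length);
        }
    }
}

/** Demo harness: collects the element names that survive the filter. */
final class DropElementFilterDemo {
    static List<String> elementNames(String xml, String drop) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed so that localName is populated
        XMLReader parser = factory.newSAXParser().getXMLReader();
        DropElementFilter filter = new DropElementFilter(parser, drop);
        final List<String> names = new ArrayList<>();
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                names.add(localName);
            }
        });
        filter.parse(new InputSource(new StringReader(xml)));
        return names;
    }
}
```

<p>Because the filter suppresses start and end tags in pairs, the event stream it passes on remains
well-balanced. With TrAX, a filter of this kind would typically be supplied to the transformer by
wrapping it in a javax.xml.transform.sax.SAXSource.</p>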
  234. <p>Note that SAXON relies on the application to supply a well-balanced sequence
  235. of SAX events; it doesn't need to be well-formed (the root node can have any number
  236. of element or text children), but if it isn't well-balanced,
  237. the consequences are unpredictable.</p>
  238. <a name="Writing-output-filters"><h2>Writing output filters</h2></a>
  239. <p>The output of a SAXON stylesheet can be directed to a user-defined output filter. This filter can be
defined as a standard SAX1 <b>DocumentHandler</b>, as a SAX2 <b>ContentHandler</b>, or
as a subclass of the SAXON class
  242. <b>com.icl.saxon.output.Emitter</b>. The advantage of using an Emitter is that more information is available
  243. from the stylesheet, for example the attributes of the xsl:output element.</p>
  244. <p>When a ContentHandler is used, Saxon will by default always supply a stream of events corresponding
  245. to a well-formed document. (The XSLT
  246. specification also allows the output to be an external general parsed entity.) If the result tree is not
  247. well-formed, Saxon will notify the content handler of the fact by sending a processing instruction
  248. with the name "saxon:warning" and the text "Output suppressed because it is not well-formed".
  249. If the content handler is happy to accept output that is not well-formed, it can respond to this
  250. processing instruction by throwing a SAXException whose message text is "continue"; in this case
  251. subsequent events will be notified whether or not they are well-formed.</p>
  252. <p>As specified in the JAXP 1.1 interface, requests to disable or re-enable output escaping
  253. are also notified to the content handler by means of special processing instructions. The
  254. names of these processing instructions are defined by the constants PI_DISABLE_OUTPUT_ESCAPING
  255. and PI_ENABLE_OUTPUT_ESCAPING defined in class javax.xml.transform.Result.</p>
  256. <p>If an Emitter is used, however, it will be informed of all events.</p>
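<p>A content handler that follows the conventions described above might look like the sketch below.
The class <b>LenientOutputHandler</b> is hypothetical; it accepts output that is not well-formed by
answering the "saxon:warning" processing instruction with a SAXException whose message is
"continue", and it honours the two JAXP processing instructions that switch output escaping off and
on. To be named in the method attribute, such a class would presumably need to be public and have a
public no-argument constructor.</p>

```java
import javax.xml.transform.Result;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical output filter: serialises character data, honouring the special PIs. */
class LenientOutputHandler extends DefaultHandler {
    private final StringBuilder out = new StringBuilder();
    private boolean escaping = true;

    @Override
    public void processingInstruction(String target, String data) throws SAXException {
        if ("saxon:warning".equals(target)) {
            // Tell Saxon that output which is not well-formed is acceptable.
            throw new SAXException("continue");
        } else if (Result.PI_DISABLE_OUTPUT_ESCAPING.equals(target)) {
            escaping = false;
        } else if (Result.PI_ENABLE_OUTPUT_ESCAPING.equals(target)) {
            escaping = true;
        } else {
            out.append("<?").append(target).append(' ').append(data).append("?>");
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        for (int i = start; i < start + length; i++) {
            char c = ch[i];
            if (!escaping) {
                out.append(c); // escaping disabled: copy the character unchanged
            } else if (c == '<') {
                out.append("&lt;");
            } else if (c == '>') {
                out.append("&gt;");
            } else if (c == '&') {
                out.append("&amp;");
            } else {
                out.append(c);
            }
        }
    }

    String getOutput() {
        return out.toString();
    }
}
```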
  257. <p>The Emitter or ContentHandler to be used is specified in the <b>method</b> attribute of the
  258. xsl:output or xsl:document
  259. element, as a fully-qualified class name; for example
  260. <b>method="prefix:com.acme.xml.SaxonOutputFilter"</b>. The namespace prefix is ignored, but
must be present to meet XSLT conformance rules.</p>
  262. <p>See the documentation of class com.icl.saxon.output.Emitter for details of the methods available, or
implementations such as HTMLEmitter, XMLEmitter, and TEXTEmitter for the standard output formats
  264. supported by SAXON.</p>
  265. <p>It can sometimes be useful to set up a chain of emitters working as a pipeline. To write a filter
  266. that participates in such a pipeline, the class <b>ProxyEmitter</b> is supplied. Use the class <b>Indenter</b>,
  267. which handles XML and HTML indentation, as an example of how to write a ProxyEmitter.</p>
  268. <p>Rather than writing an output filter in Java, SAXON also allows you to process the output through
  269. another XSL stylesheet. To do this, simply name the next stylesheet in the saxon:next-in-chain attribute
  270. of xsl:output or xsl:document. </p>
  271. <p>Any number of
  272. user-defined attributes may be defined on both xsl:output and xsl:document. These
  273. attributes must have names in a non-null namespace, which must not be either the XSLT
  274. or the Saxon namespace. These attributes are interpreted as attribute value templates.
  275. The value of the attribute is inserted into the Properties object made available to
  276. the Emitter handling the output; they will be ignored by the standard output methods,
  277. but can supply arbitrary information to a user-defined output method. The name of the
  278. property will be the expanded name of the attribute in JAXP format, for example
  279. "{http://my-namespace/uri}local-name", and the value will be the value as given,
  280. after evaluation as an attribute value template.</p>
<a name="Implementing-a-collating-sequence"><h2>Implementing a collating sequence</h2></a>
  282. <p>It is possible to define a collating sequence for use by xsl:sort. This is controlled through the
  283. data-type and lang attributes of the xsl:sort element.</p>
<p>To define language-dependent collating where the data-type attribute has its default value "text",
  285. you should supply a collator named com.icl.saxon.sort.Compare_<i>lang</i> where <i>lang</i> is the value
  286. of the xsl:sort lang attribute. For example, for German collating set lang="de" and supply a collator
  287. named com.icl.saxon.sort.Compare_de. Note that any hyphens in the language name are ignored
  288. in forming the class name, but case is significant.
  289. For example if you specify lang="en-GB", the TextComparer must be named
  290. "com.icl.saxon.sort.Compare_enGB".</p>
  291. <p>To define application-dependent collating, set the data-type attribute of xsl:sort to "xyz:class-name"
  292. where xyz is any namespace prefix, and class-name is the fully-qualified Java class name of your collator.
  293. For example if you want to collate
  294. the names of the months January, February, March, etc, in the conventional sequence you could do this by
writing &lt;xsl:sort data-type="type:month"/&gt; and providing a collator called "month".</p>
  296. <p>In either case the collator must be a subclass of the abstract class com.icl.saxon.sort.TextComparer.
  297. The main method you have to implement is compare() which takes two values and returns a number that is
  298. negative, zero, or positive, depending on whether the first value is less than, equal to, or greater
  299. than the second.</p>
  300. <p>The collator is also notified of the values of the <b>order</b> and <b>case-order</b> attributes, and
  301. can modify its strategy accordingly, either by remembering the current settings, or by returning a different
  302. collator to be used in place of the original.</p>
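<p>The comparison logic for the month example might be sketched as follows. The class
<b>MonthComparer</b> is hypothetical and is written as a stand-alone class so that it compiles
without SAXON; a real collator would extend com.icl.saxon.sort.TextComparer, and the
compare(Object, Object) signature shown is an assumption that should be checked against the API
documentation. The sketch ignores the order and case-order settings.</p>

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical collator logic: orders English month names in calendar sequence. */
class MonthComparer {
    private static final List<String> MONTHS = Arrays.asList(
            "January", "February", "March", "April", "May", "June",
            "July", "August", "September", "October", "November", "December");

    /** Negative, zero, or positive as the first value sorts before, with, or after the second. */
    public int compare(Object a, Object b) {
        return rank(a) - rank(b);
    }

    /** Position in the calendar; values that are not month names sort after December. */
    private static int rank(Object value) {
        int index = MONTHS.indexOf(String.valueOf(value).trim());
        return index >= 0 ? index : MONTHS.size();
    }
}
```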
  303. <a name="Implementing-a-numbering-sequence"><h2>Implementing a numbering sequence</h2></a>
  304. <p>It is possible to define a numbering sequence for use by xsl:number. This is controlled through the lang
  305. attribute of the xsl:number element. The feature is primarily intended to provide language-dependent numbering,
  306. but in fact it can be used to provide arbitrary numbering sequences: for example if you want to number items
as "one", "two", "three" etc, you could implement a numbering class to do this and invoke it with, say,
lang="alpha".</p>
  309. <p>To implement a numberer for language X, you need to define a class com.icl.saxon.number.Numberer_X,
for example <b>com.icl.saxon.number.Numberer_alpha</b>. This must implement the interface Numberer. A (not very
  311. useful) Numberer is supplied for lang="de" as a specimen, and you can use this as a prototype to write your
  312. own. A numbering sequence is also supplied for lang="en", and this is used by default if no other can be loaded.</p>
  313. <p>Note that any hyphens in the language name are ignored in forming the class name, but case is significant.
  314. For example if you specify lang="en-GB", the Numberer must be named "com.icl.saxon.number.Numberer_enGB".</p>
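<p>The class-naming rule, which is the same for numberers and for language-dependent collators, can
be sketched as follows. The helper class <b>LanguageClassNames</b> is hypothetical; it only builds
the expected class name and does not attempt to load the class.</p>

```java
/** Hypothetical helper: builds the class name SAXON is described as looking for. */
final class LanguageClassNames {
    private LanguageClassNames() {}

    /** Hyphens are dropped from the language code; case is left untouched. */
    static String numbererClassFor(String lang) {
        return "com.icl.saxon.number.Numberer_" + lang.replace("-", "");
    }

    static String comparerClassFor(String lang) {
        return "com.icl.saxon.sort.Compare_" + lang.replace("-", "");
    }
}
```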
  315. <a name="Adding-an-output-encoding"><h2>Adding an output encoding</h2></a>
  316. <p>If you want to use an output encoding that is not directly supported by Saxon
  317. (for a list of encodings that are supported, see <a href="conformance.html">conformance.html</a>)
  318. you can do this by writing a Java class that implements the interface
  319. <b>com.icl.saxon.charcode.PluggableCharacterSet</b>.
  320. You need to supply two methods: inCharSet() which tests whether a particular Unicode character
  321. is present in the character set, and getEncodingName() which returns the name given to the
  322. encoding by your Java VM. The encoding must be supported by the Java VM. To use this encoding,
  323. specify the fully-qualified class name as the value of the encoding attribute in xsl:output.</p>
  324. <p>Alternatively, it is possible to specify the CharacterSet class to be used for a named output
  325. encoding by setting the system property, e.g. -D"encoding.EUC-JP"="EUC_JP"; the value
  326. of the property should be the name of a class that implements the
  327. PluggableCharacterSet interface. This indicates the class to be used when the xsl:output
  328. element specifies encoding="EUC-JP".</p>
  329. <hr>
  330. <p align="center">Michael H. Kay<br>
  331. <a href="http://www.saxonica.com/">Saxonica Limited</a><br>
  332. 22 June 2005</p>
  333. </body>
  334. </html>