/trunk/contribs/saxon6_5_5/doc/extensibility.html
HTML | 422 lines | 334 code | 88 blank | 0 comment | 0 complexity | 657d64544ffb2517b81f982e12a3491c MD5 | raw file
- <html>
-
- <head>
- <meta HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=windows-1252">
- <title>Extensibility</title>
-
- <body leftmargin="150" bgcolor="#ddeeff"><font face="Arial, Helvetica, sans-serif">
-
- <div align=right><a href="index.html">SAXON home page</a></div>
-
- <p><font FACE="Arial, Helvetica, sans-serif" color="#FF0080" size="7">Extensibility</font></p>
-
-
- <hr>
-
- <p>This page describes how to extend the capability of SAXON XSLT Stylesheets</p>
-
- <div><font face="Arial, Helvetica, sans-serif">
- <table width="100%">
- <tr>
- <td width="100%" bgcolor="#0000FF"><font color="#FFFFFF"><big><b>Contents</b></big></font></td>
- </tr>
-
- <tr>
- <td VALIGN="top" bgcolor="#00FFFF">
- <a href="#Writing-extension-functions">Writing extension functions</a><br>
- <a href="#Writing-extension-elements">Writing extension elements</a><br>
- <a href="#Writing-Java-node-handlers">Writing Java node handlers</a><br>
- <a href="#Writing-input-filters">Writing input filters</a><br>
- <a href="#Writing-output-filters">Writing output filters</a><br>
- <a href="#Implementing-a-collating-sequence">Implementing a collating sequence</a><br>
- <a href="#Implementing-a-numbering-sequence">Implementing a numbering sequence</a><br>
- <a href="#Adding-an-output-encoding">Adding an output encoding</a>
- </td>
-
- </tr>
- </table>
- </font></div>
-
- <a name="Writing-extension-functions"><h2>Writing extension functions</h2></a>
-
- <p>An extension function is invoked using a name such as <b>prefix:localname()</b>.
- The prefix must
- be the prefix associated with a namespace declaration that is in scope. </p>
-
- <p>Extension functions may be implemented in Java or in XSLT. For information on writing
- functions in XSLT, see <a href="extensions.html#saxon:function">the description of the saxon:function
- element</a>. The following information applies to extension functions implemented in Java.
-
- <p>Saxon supports the <xsl:script> element defined in the XSLT 1.1 working draft.
- It also supports <saxon:script> as a synonym (use this if you want other XSLT processors
- to ignore the element). This element defines a mapping between a namespace URI used in calls
- of extension functions, and a Java class that contains implementations of these functions.
- See <a href="xsl-elements.html#xsl:script">xsl:script</a> for details.</p>
-
- <p>You can also use a short-cut technique of binding external Java classes, by making the
- class name part of the namespace URI. In this case, you don't need an <xsl:script>
- element.</p>
-
- <p>With the short-cut technique, the URI for the
- namespace identifies the class where the external function will be found.
- The namespace URI must either be "java:" followed by the fully-qualified class name
- (for example xmlns:date="java:java.util.Date"),
- or a string containing a "/", in which the fully-qualified class name appears after the final "/".
- (for example xmlns:date="http://www.jclark.com/xt/java/java.util.Date"). The part of
- the URI before the final "/" is immaterial. The class must be on the classpath. For compatibility
- with previous releases, the format xmlns:date="java.util.Date" is also supported.</p>
-
- <p>The SAXON namespace URI "http://icl.com/saxon" is recognised as a special case, and causes the
- function to be loaded from the class com.icl.saxon.functions.Extensions. This class name can be
- specified explicitly if you prefer.</p>
-
- <p>There are three cases to consider: static methods, constructors, and instance-level methods.</p>
-
- <p>Static methods can be called directly.
- The localname of the function must match the name of a public static method in this class. The names
- match if they contain the same characters, excluding hyphens and forcing any character that follows
- a hyphen to upper-case. For example the XPath function call "to-string()" matches the Java method
- "toString()"; but the function call can also be written as "toString()" if you prefer.
- If there are several methods in the class that match the localname, the system attempts
- to find the one that is the best fit to the types of the supplied arguments: for example if the
- call is f(1,2) then a method with two int arguments will be preferred to one with two String
- arguments. The detailed rules are quite complex, and are described in the XSLT 1.1 working draft.
- If there are several methods with the same name and the correct number of arguments, but none is
- preferable to the others under these rules, an error is reported.</p>
-
- <p>For example:</p>
-
- <pre><code>
- <xsl:value-of select="math:sqrt($arg)"
- xmlns:math="java:java.lang.Math"/>
- </code></pre>
-
- <p>This will invoke the static method java.lang.Math.sqrt(), applying it to the value of the variable
- $arg, and copying the value of the square root of $arg to the result tree.</p>
-
- <p>Constructors are called by using the function named new(). If there are several constructors, then again
- the system tries to find the one that is the best fit, according to the types of the supplied arguments. The result
- of calling new() is an XPath value of type Java Object; the only things that can be done with a Java Object
- are to assign it to a variable, to pass it to an extension function, and to convert it to a string, number,
- or boolean, using the rules given below.</p>
-
- <p>Instance-level methods are called by supplying an extra first argument of type Java Object which is the
- object on which the method is to be invoked. A Java Object is usually created by calling an extension
- function (e.g. a constructor) that returns an object; it may also be passed to the style sheet as the
- value of a global parameter. Matching of method names is done as for static methods.
- If there are several methods in the class that match the localname, the system again tries to
- find the one that is the best fit, according to the types of the supplied arguments.</p>
-
- For example, the following stylesheet prints the date and time. This example is copied from the
- documentation of the xt product, and it works unchanged with SAXON, because SAXON
- does not care what the namespace URI for extension functions is, so long as it ends with
- the class name. (Extension functions are likely to be compatible between SAXON and xt
- provided they only use the data types string, number, and boolean).</p>
-
- <code><pre>
-
- <xsl:stylesheet
- version="1.0"
- xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
- xmlns:date="http://www.jclark.com/xt/java/java.util.Date">
-
- <xsl:template match="/">
- <html>
- <xsl:if test="function-available('date:to-string') and function-available('date:new')">
- <p><xsl:value-of select="date:to-string(date:new())"/></p>
- </xsl:if>
- </html>
- </xsl:template>
-
- </xsl:stylesheet>
-
- </pre></code>
-
-
-
- <p>A wrapped Java object may be converted to another data type as follows.
-
- <ul>
- <li>It is converted to a string by using its toString() method; if the object is null, the result is
- the empty string "".</li>
- <li>It is converted to a number by converting it first to a string, and then applying the
- XPath number() conversion. If it is null, the result is NaN.</li>
- <li>It is converted to a boolean as follows: if it is null, the result is false, otherwise
- it is converted to a string and the result is true if and only if the string is non-empty.</li>
- </ul>
-
- <p>The method may have an extra first argument of class com.icl.saxon.Context or org.w3c.xsl.XSLTContext.
- This argument is not
- supplied by the calling XSL code, but by SAXON itself. The Context object provides methods to access many
- internal SAXON resources, the most useful being getContextNode() which returns the context node in the
- source document. The Context object is not available with constructors.</p>
-
- <p>If any exceptions are thrown by the method, or if a matching method cannot be found,
- processing of the stylesheet will be abandoned.</p>
-
- <p>The result type of the method is converted to an XPath value as follows.</p>
-
- <ul>
- <li>If the method returns void, the XPath value is an empty node-set</li>
- <li>If the method returns null, the XPath value is a wrapped null Object.</li>
- <li>If the method is a constructor, the XPath value is of type "wrapped Java object". The only way of
- using this is by passing it to another external function, or by converting it to one of the standard
- XPath data types as described above.</li>
- <li>If the returned value is a Java boolean or Boolean, the XPath result is a boolean.</li>
- <li>If the returned value is a Java int, short, long, double, or float, or one of their object wrapper
- equivalents, it is converted to a double using Java casting, and the XPath result is a number.</li>
- <li>If the returned value is a Java String, the XPath result is a string.</li>
- <li>If the returned value is of class com.icl.saxon.om.NodeInfo (a node in a Saxon tree), it is
- returned as a node-set containing a single node.</li>
- <li>If the returned value is a DOM NodeList, the list of nodes is returned as a Saxon node-set. However,
- all the nodes must be instances of class com.icl.saxon.om.NodeInfo, that is, they must use Saxon's tree
- implementation, not some third-party DOM. But any implementation of NodeList can be used. The nodes
- can come from the original source tree, or from a newly-constructed tree, so long as it is constructed
- using Saxon.</li>
- <li>If the returned value is a DOM Node that is not an instance of class com.icl.saxon.om.NodeInfo, it is
- rejected: the result must use Saxon's DOM implementation, not some third-party DOM.</li>
- <li>If the result is any other Java object (including null), it is returned as a "wrapped Java object".</li>
- </ul>
-
- <p>Note that Saxon's tree structure conforms to the DOM Core Level 2 interface. However, it is read-only:
- any attempt to modify the tree causes an exception. Saxon's trees can only be built using the Saxon
- subclasses of the com.icl.saxon.tree.Builder class, and they cannot be modified <i>in situ</i>.</p>
-
- <p>The system function <b>function-available(String name)</b> returns true if there appears
- to be a method available with the right name. It does not test whether this method has the appropriate
- number of arguments or whether the arguments are of appropriate types. If the function name is "new" it
- returns true so long as the class is not an abstract class or interface, and so long as it has at least
- one constructor.</p>
-
- <p>There are a number of extension functions supplied with the SAXON product: for details, see
- <a href="extensions.html">extensions.html</a>. The source code of these methods, which
- in most cases is extremely simple, can be used as an example for writing
- other user extension functions. It is found in class com.icl.saxon.functions.Extensions</p>
-
-
- <a name="Writing-extension-elements"><h2>Writing extension elements</h2></a>
-
- <p>SAXON implements the element extensibility feature defined in section 14.1 of the standard.
- This feature allows you to define your own element types for use in the stylesheet. </p>
-
- <p>If a namespace prefix is to be used to denote extension elements, it must be declared in the
- <b>extension-element-prefixes</b> attribute on the xsl:stylesheet element, or the
- <b>xsl:extension-element-prefixes</b> attribute on any enclosing literal result element or
- extension element.
-
- <p>Note that SAXON itself provides a number of stylesheet elements beyond those defined in the
- XSLT specification, including saxon:assign, saxon:entity-ref, saxon:while,
- saxon:group, saxon:item. To enable these, use the standard XSL extension mechanism: define
- <b>extension-element-prefixes="saxon"</b> on the xsl:stylesheet element, or
- <b>xsl:extension-element-prefixes="saxon"</b> on any enclosing literal result element.</p>
-
- <p>To invoke a user-defined set of extension elements, include the prefix in this attribute as
- described, and associate it with a namespace URI that ends in "/" followed by the fully qualified
- class name of a Java class that implements the <b>com.icl.saxon.style.ExtensionElementFactory</b> interface.
- This interface defines a single method, <b>getExtensionClass()</b>, which takes the local name of the element
- (i.e., the name without its namespace prefix) as a parameter, and returns the Java class used to
- implement this extension element (for example, "return SQLConnect.class"). The class returned must
- be a subclass of com.icl.saxon.style.StyleElement.</p>
-
- <p>The best way to see how to implement an extension element is by looking at the example, for SQL
- extension elements, provided in package <b>com.icl.saxon.sql</b>, and at the sample stylesheet <b>books.sqlxsl</b>
- which uses these extension elements. There are three main methods a StyleElement
- class must provide:</p>
-
- <table>
- <tr><td valign=top width="30%">prepareAttributes()</td>
- <td>This is called while the stylesheet tree is still being built, so it should not attempt
- to navigate the tree. Its task is to validate the attributes of the stylesheet element and
- perform any preprocessing necessary. For example, if the attribute is an attribute value template,
- this includes creating an Expression that can subsequently be evaluated to get the AVT's
- value.</td></tr>
- <tr><td valign=top>validate()</td>
- <td>This is called once the tree has been built, and its task is to check that the stylesheet
- element appears in the right context within the tree, e.g. that it is within a template</td></tr>
- <tr><td valign=top>process()</td>
- <td>This is called to process a particular node in the source document, which can be accessed
- by reference to the Context supplied as a parameter.</td></tr>
- <tr><td valign=top>isInstruction()</td>
- <td>This should return true, to ensure that the element is allowed to appear
- within a template body.</td></tr>
- <tr><td valign=top>mayContainTemplateBody(()</td>
- <td>This should return true, to ensure that the element can contain instructions.
- Even if it can't contain anything else, extension elements should allow an xsl:fallback
- instruction to provide portability between processors</td></tr>
- </table>
-
- <p>The StyleElement class has access to many services supplied either via its superclasses or via
- the Context object. For details, see the API documentation of the individual classes.</p>
-
- <p>Any element whose prefix matches a namespace listed in the extension-element-prefixes
- attribute of an enclosing element is treated as an extension element. If no class can be
- instantiated for the element (for example, because no ExtensionElementFactory can be loaded,
- or because the ExtensionElementFactory doesn't recognise the local name), then fallback
- action is taken as follows. If the element has one or more xsl:fallback children, they are
- processed. Otherwise, an error is reported. When xsl:fallback is used in any other context, it
- and its children are ignored.</p>
-
- <p>It is also possible to test whether an extension element is implemented by using the system
- function element-available(). This returns true if the namespace of the element identifies
- it as an extension element (or indeed as a standard XSL instruction) and if a class can be instantiated
- to represent it. If the namespace is not that of an extension element, or if no class can be
- instantiated, it returns false.</p>
-
- <a name="Writing-Java-node-handlers"><h2>Writing Java node handlers</h2></a>
-
- <p>A Java node handler can be used to process any node, in place of an XSL template. The handler is
- nominated by using a saxon:handler element with a handler attribute that names the node handler class. The
- handler itself is an implementation of com.icl.saxon.NodeHandler or one of its subclasses (the most usual being
- com.icl.saxon.ElementHandler). The saxon:handler element must be a top-level element, and must be
- empty. It takes the same attributes as xsl:template (match, mode, name, and priority) and is
- considered along with xsl:template elements to decide which template to execute when xsl:call-template
- or xsl:apply-templates is used.</p>
-
- <p>Java node handlers have full access to the source document and the current processing context (for example, the
- values of parameters). The may also trigger processing of other nodes in the document by calling
- applyTemplates(): this works just like xsl:apply-templates, and the selected nodes may be processed either
- by XSL templates or by further Java node handlers.</p>
-
- <p>A Java node handler may also be registered with a name, and may thus be invoked using xsl:call-template. There
- is no direct mechanism for a Java node handler to call a named XSLT template, but the effect can be achieved
- by using a mode that identifies the called template uniquely.</p>
-
- <a name="Writing-input-filters"><h2>Writing input filters</h2></a>
-
- <p>SAXON takes its input from a SAX2 Parser reading from an InputSource. A very useful technique is to
- interpose a <i>filter</i> between the parser and SAXON. The filter will typically be an
- instance of the SAX2 <b>XMLFilter</b> class.
- </p>
-
- <p>See the TrAX examples for hints on using a Saxon Transformer as part of a chain of
- SAX Filters.</p>
-
-
- <p>Note that SAXON relies on the application to supply a well-balanced sequence
- of SAX events; it doesn't need to be well-formed (the root node can have any number
- of element or text children), but if it isn't well-balanced,
- the consequences are unpredictable.</p>
-
- <a name="Writing-output-filters"><h2>Writing output filters</h2></a>
-
- <p>The output of a SAXON stylesheet can be directed to a user-defined output filter. This filter can be
- defined either as a standard SAX1 <b>DocumentHandler</b>, a SAX2 <b>ContentHandler</b>, or
- as a subclass of the SAXON class
- <b>com.icl.saxon.output.Emitter</b>. The advantage of using an Emitter is that more information is available
- from the stylesheet, for example the attributes of the xsl:output element.</p>
-
- <p>When a ContentHandler is used, Saxon will by default always supply a stream of events corresponding
- to a well-formed document. (The XSLT
- specification also allows the output to be an external general parsed entity.) If the result tree is not
- well-formed, Saxon will notify the content handler of the fact by sending a processing instruction
- with the name "saxon:warning" and the text "Output suppressed because it is not well-formed".
- If the content handler is happy to accept output that is not well-formed, it can respond to this
- processing instruction by throwing a SAXException whose message text is "continue"; in this case
- subsequent events will be notified whether or not they are well-formed.</p>
-
- <p>As specified in the JAXP 1.1 interface, requests to disable or re-enable output escaping
- are also notified to the content handler by means of special processing instructions. The
- names of these processing instructions are defined by the constants PI_DISABLE_OUTPUT_ESCAPING
- and PI_ENABLE_OUTPUT_ESCAPING defined in class javax.xml.transform.Result.</p>
-
- <p>If an Emitter is used, however, it will be informed of all events.</p>
-
- <p>The Emitter or ContentHandler to be used is specified in the <b>method</b> attribute of the
- xsl:output or xsl:document
- element, as a fully-qualified class name; for example
- <b>method="prefix:com.acme.xml.SaxonOutputFilter"</b>. The namespace prefix is ignored, but
- must be present to meet XSLT conformance rules.
-
- <p>See the documentation of class com.icl.saxon.output.Emitter for details of the methods available, or
- implementations such as HTMLEmitter and XMLEmitter and TEXTEmitter for the standard output formats
- supported by SAXON.</p>
-
- <p>It can sometimes be useful to set up a chain of emitters working as a pipeline. To write a filter
- that participates in such a pipeline, the class <b>ProxyEmitter</b> is supplied. Use the class <b>Indenter</b>,
- which handles XML and HTML indentation, as an example of how to write a ProxyEmitter.</p>
-
- <p>Rather than writing an output filter in Java, SAXON also allows you to process the output through
- another XSL stylesheet. To do this, simply name the next stylesheet in the saxon:next-in-chain attribute
- of xsl:output or xsl:document. </p>
-
- <p>Any number of
- user-defined attributes may be defined on both xsl:output and xsl:document. These
- attributes must have names in a non-null namespace, which must not be either the XSLT
- or the Saxon namespace. These attributes are interpreted as attribute value templates.
- The value of the attribute is inserted into the Properties object made available to
- the Emitter handling the output; they will be ignored by the standard output methods,
- but can supply arbitrary information to a user-defined output method. The name of the
- property will be the expanded name of the attribute in JAXP format, for example
- "{http://my-namespace/uri}local-name", and the value will be the value as given,
- after evaluation as an attribute value template.</p>
-
-
- <a name="Implementing a collating sequence"><h2>Implementing a collating sequence</h2></a>
-
- <p>It is possible to define a collating sequence for use by xsl:sort. This is controlled through the
- data-type and lang attributes of the xsl:sort element.</p>
-
- <p>To define language-dependent collating where the sort data-type has its default data type "text",
- you should supply a collator named com.icl.saxon.sort.Compare_<i>lang</i> where <i>lang</i> is the value
- of the xsl:sort lang attribute. For example, for German collating set lang="de" and supply a collator
- named com.icl.saxon.sort.Compare_de. Note that any hyphens in the language name are ignored
- in forming the class name, but case is significant.
- For example if you specify lang="en-GB", the TextComparer must be named
- "com.icl.saxon.sort.Compare_enGB".</p>
-
- <p>To define application-dependent collating, set the data-type attribute of xsl:sort to "xyz:class-name"
- where xyz is any namespace prefix, and class-name is the fully-qualified Java class name of your collator.
- For example if you want to collate
- the names of the months January, February, March, etc, in the conventional sequence you could do this by
- writing <xsl:sort data-type="type:month"/> and providing a collator called "month".</p>
-
- <p>In either case the collator must be a subclass of the abstract class com.icl.saxon.sort.TextComparer.
- The main method you have to implement is compare() which takes two values and returns a number that is
- negative, zero, or positive, depending on whether the first value is less than, equal to, or greater
- than the second.</p>
-
- <p>The collator is also notified of the values of the <b>order</b> and <b>case-order</b> attributes, and
- can modify its strategy accordingly, either by remembering the current settings, or by returning a different
- collator to be used in place of the original.</p>
-
-
- <a name="Implementing-a-numbering-sequence"><h2>Implementing a numbering sequence</h2></a>
-
- <p>It is possible to define a numbering sequence for use by xsl:number. This is controlled through the lang
- attribute of the xsl:number element. The feature is primarily intended to provide language-dependent numbering,
- but in fact it can be used to provide arbitrary numbering sequences: for example if you want to number items
- as "one", "two", "three" etc, you could implement a numbering class to do this and invoke it say with
- lang="alpha".</p>
-
- <p>To implement a numberer for language X, you need to define a class com.icl.saxon.number.Numberer_X,
- for example <b>com.icl.saxon.sort.Numberer_alpha</b>. This must implement the interface Numberer. A (not very
- useful) Numberer is supplied for lang="de" as a specimen, and you can use this as a prototype to write your
- own. A numbering sequence is also supplied for lang="en", and this is used by default if no other can be loaded.</p>
-
- <p>Note that any hyphens in the language name are ignored in forming the class name, but case is significant.
- For example if you specify lang="en-GB", the Numberer must be named "com.icl.saxon.number.Numberer_enGB".</p>
-
- <a name="Adding-an-output-encoding"><h2>Adding an output encoding</h2></a>
-
- <p>If you want to use an output encoding that is not directly supported by Saxon
- (for a list of encodings that are supported, see <a href="conformance.html">conformance.html</a>)
- you can do this by writing a Java class that implements the interface
- <b>com.icl.saxon.charcode.PluggableCharacterSet</b>.
- You need to supply two methods: inCharSet() which tests whether a particular Unicode character
- is present in the character set, and getEncodingName() which returns the name given to the
- encoding by your Java VM. The encoding must be supported by the Java VM. To use this encoding,
- specify the fully-qualified class name as the value of the encoding attribute in xsl:output.</p>
-
- <p>Alternatively, it is possible to specify the CharacterSet class to be used for a named output
- encoding by setting the system property, e.g. -D"encoding.EUC-JP"="EUC_JP"; the value
- of the property should be the name of a class that implements the
- PluggableCharacterSet interface. This indicates the class to be used when the xsl:output
- element specifies encoding="EUC-JP".</p>
-
-
- <hr>
- <p align="center">Michael H. Kay<br>
- <a href="http://www.saxonica.com/">Saxonica Limited</a><br>
- 22 June 2005</p>
- </body>
- </html>