<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Trang Manual</title>
</head>
<body>
<h1>Trang Manual</h1>
<p>Copyright &#169; 2002, 2003, 2008 Thai Open Source Software Center Ltd</p>
<p>See the file <a href="copying.txt">copying.txt</a> for copying
permission.</p>
<h3>Version 20090818</h3>
<h2>Contents</h2>
<ul>
<li><a href="#introduction">Introduction</a></li>
<li><a href="#running">Running Trang</a></li>
<li><a href="#arguments">Command-line arguments</a></li>
<li><a href="#input-modules">Input modules</a>
<ul>
<li><a href="#rng-input">RELAX NG (XML syntax)</a></li>
<li><a href="#rnc-input">RELAX NG (compact syntax)</a></li>
<li><a href="#dtd-input">XML DTD</a></li>
<li><a href="#xml-input">XML</a></li>
</ul>
</li>
<li><a href="#output-modules">Output modules</a>
<ul>
<li><a href="#rng-output">RELAX NG (XML syntax)</a></li>
<li><a href="#rnc-output">RELAX NG (compact syntax)</a></li>
<li><a href="#dtd-output">XML DTD</a></li>
<li><a href="#xsd-output">W3C XML Schema</a></li>
</ul>
</li>
<!--
<li><a href="#examples">Examples</a></li>
-->
</ul>
<h2><a name="introduction">Introduction</a></h2>
<p>Trang takes as input a schema written in any of the following formats:</p>
<ul>
<li>RELAX NG (XML syntax)</li>
<li>RELAX NG (compact syntax)</li>
<li>XML 1.0 DTD</li>
</ul>
<p>and produces as output a schema written in any of the following formats:</p>
<ul>
<li>RELAX NG (XML syntax)</li>
<li>RELAX NG (compact syntax)</li>
<li>XML 1.0 DTD</li>
<li>W3C XML Schema</li>
</ul>
<p>Trang can also infer a schema from one or more example XML
documents.</p>
<p>Trang uses an internal representation based on RELAX NG. For each
supported input format, there is an input module that converts a
schema in that input format into this internal representation. For
each supported output format, there is an output module that converts
the internal representation into a schema in that output format.
Thus, any supported input format can be translated to any supported
output format.</p>
<h2><a name="running">Running Trang</a></h2>
<p>The file <code>trang.jar</code> contains Trang packaged for use
with a Java runtime. It requires a Java runtime compatible with the
Java 2 Platform, Standard Edition (J2SE) version 5 (or any later
version), such as the Java Runtime Environment (JRE), which can be
downloaded <a href="http://java.sun.com/j2se/downloads.html">here</a>.</p>
<p>Once you have installed a suitable Java runtime, you can run Trang
by using the command:</p>
<pre>java -jar trang.jar <var>args</var></pre>
<p>where <code><var>args</var></code> are additional command-line
arguments described <a href="#arguments">below</a>.</p>
<h2><a name="arguments">Command-line arguments</a></h2>
<p>Trang requires two command-line arguments: the first is the URI or
filename of the schema to be translated; the second is the output
filename.</p>
<p>Trang infers the input and output modules to be used from the
extension of input and output filenames as follows:</p>
<dl>
<dt><code>.rng</code></dt>
<dd>RELAX NG (XML syntax)</dd>
<dt><code>.rnc</code></dt>
<dd>RELAX NG (compact syntax)</dd>
<dt><code>.dtd</code></dt>
<dd>XML 1.0 DTD</dd>
<dt><code>.xsd</code></dt>
<dd>W3C XML Schema</dd>
<dt><code>.xml</code></dt>
<dd>XML documents (used as examples from which to infer a schema)</dd>
</dl>
<p>This inference can be overridden using the <code>-I</code> and
<code>-O</code> options.</p>
<p>When the input is XML documents used as examples to infer a schema,
more than one input file may be specified as arguments. All the input
files are specified before the output file.</p>
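<p>As an illustration, the following invocations rely only on filename
extensions to select the modules. All filenames here are hypothetical;
the third command infers a schema from two example documents.</p>
<pre>java -jar trang.jar schema.rng schema.rnc
java -jar trang.jar doc.dtd doc.xsd
java -jar trang.jar sample1.xml sample2.xml inferred.rnc</pre>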
<p>The arguments specifying the input and output files can be preceded
by arguments specifying options. Trang accepts the following
options:</p>
<dl>
<dt><code>-I <a href="#rng-input">rng</a></code></dt>
<dt><code>-I <a href="#rnc-input">rnc</a></code></dt>
<dt><code>-I <a href="#dtd-input">dtd</a></code></dt>
<dt><code>-I <a href="#xml-input">xml</a></code></dt>
<dd>Specifies the input module.</dd>
<dt><code>-O <a href="#rng-output">rng</a></code></dt>
<dt><code>-O <a href="#rnc-output">rnc</a></code></dt>
<dt><code>-O <a href="#dtd-output">dtd</a></code></dt>
<dt><code>-O <a href="#xsd-output">xsd</a></code></dt>
<dd>Specifies the output module.</dd>
<dt><code>-i <var>param</var></code></dt>
<dt><code>-o <var>param</var></code></dt>
<dd>Specifies an additional parameter for an input (<code>-i</code>)
or output (<code>-o</code>) module. The <code>-i</code> and
<code>-o</code> options may be used multiple times in order to specify
multiple parameters. There are two kinds of parameter: boolean
parameters and string-valued parameters. A string-valued parameter is
specified using the form
<code><var>name</var>=<var>value</var></code>. A boolean parameter is
specified using the form <code><var>name</var></code> or
<code>no-<var>name</var></code>. The applicable parameters depend on
the particular input and output module and are described in the
documentation for the <a href="#input-modules">input</a> or <a
href="#output-modules">output</a> modules.</dd>
</dl>
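<p>For example, when the filenames do not carry a recognized extension,
the modules can be named explicitly and parameters passed to each. In
this sketch the filenames and the namespace URI are hypothetical;
<code>xmlns</code> is a string-valued parameter and
<code>no-generate-start</code> a boolean parameter of the DTD input
module, while <code>indent</code> is an output parameter.</p>
<pre>java -jar trang.jar -I dtd -O rng -i xmlns=http://www.example.com/ns -i no-generate-start -o indent=4 schema.txt schema.out</pre>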
<h2><a name="input-modules">Input modules</a></h2>
<h3><a name="rng-input">RELAX NG (XML syntax) input module</a></h3>
<p>This input module accepts RELAX NG schemas in XML syntax as defined
by the RELAX NG 1.0 <a
href="http://www.oasis-open.org/committees/relax-ng/spec.html"
>Committee Specification</a>.</p>
<p>It accepts the following parameters:</p>
<dl>
<dt><code>-i encoding=<var>name</var></code></dt>
<dd>Use an encoding of <var>name</var> rather than the encoding
specified in the encoding declaration of the XML document.</dd>
</dl>
<!-- XXX mention incomplete schemas -->
<h3><a name="rnc-input">RELAX NG Compact Syntax input module</a></h3>
<p>This input module accepts RELAX NG schemas using the compact syntax
as defined in the RELAX NG Compact Syntax <a
href="http://www.oasis-open.org/committees/relax-ng/compact-20021121.html"
>Committee Specification</a>.</p>
<p>It accepts the following parameters:</p>
<dl>
<dt><code>-i encoding=<var>name</var></code></dt>
<dd>Use an encoding of <var>name</var>. By default, Trang will
autodetect an encoding of UTF-8 or UTF-16.</dd>
</dl>
<h3><a name="dtd-input">DTD input module</a></h3>
<p>This input module accepts DTDs as defined by the XML 1.0 <a
href="http://www.w3.org/TR/REC-xml">Recommendation</a>.</p>
<!-- Say something about namespaces -->
<p>It accepts the following parameters:</p>
<dl>
<dt><code>-i xmlns=<var>uri</var></code></dt>
<dd>Specifies the default namespace, that is, the namespace used for
unqualified element names.</dd>
<dt><code>-i xmlns:<var>prefix</var>=<var>uri</var></code></dt>
<dd>Specifies the namespace for the element and attribute names using
<code><var>prefix</var></code>.</dd>
<dt><code>-i colon-replacement=<var>chars</var></code></dt>
<dd><a name="colon-replacement">Replaces colons in element names by
<code><var>chars</var></code> when constructing the names of
definitions used to represent the element declarations and attribute
list declarations in the DTD. Trang generates a definition for each
element declaration and attlist declaration in the DTD. The name of
the definition is based on the name of the element. In RELAX NG, the
names of definitions cannot contain colons. However, in the DTD, the
element name may contain a colon. By default, Trang will first try to
use the element names without prefixes. If this causes a conflict, it
will instead replace the colon by a legal name character (it first
tries to use a period).</a></dd>
<dt><code>-i element-define=<var>name-pattern</var></code></dt>
<dd>Specifies how to construct the name of the definition representing
an element declaration from the name of the element. The
<code><var>name-pattern</var></code> must contain exactly one percent
character. This percent character is replaced by the name of the element
(after <a href="#colon-replacement">colon replacement</a>) and the
result is used as the name of the definition.</dd>
<dt><a name="inline-attlist"><code>-i inline-attlist</code></a></dt>
<dd>Specifies not to generate definitions for attribute list
declarations and instead move attributes declared in attribute list
declarations into the definitions generated for element declarations.
This is the default behavior when the output module is
<code>xsd</code>. Otherwise, the default behavior is as described in
the <a href="#no-inline-attlist"><code>-i no-inline-attlist</code></a>
parameter.</dd>
<dt><a name="no-inline-attlist"><code>-i no-inline-attlist</code></a></dt>
<dd>Generates a distinct definition (with
<code>combine="interleave"</code>) for each attribute list declaration
in the DTD; the definition for each element declaration references the
definition for the corresponding attribute list declaration. This is
the default behavior, except when the output module is
<code>xsd</code>, for which the default behavior is as described in
the <a href="#inline-attlist"><code>-i inline-attlist</code></a>
parameter.</dd>
<dt><code>-i attlist-define=<var>name-pattern</var></code></dt>
<dd>Specifies how to construct the name of the definition
representing an attribute list declaration from the name of the
element. The <code><var>name-pattern</var></code> must contain exactly
one percent character. This percent character is replaced by the name
of the element (after <a href="#colon-replacement">colon replacement</a>)
and the result is used as the name of the definition.</dd>
<dt><code>-i any-name=<var>name</var></code></dt>
<dd>Specifies the name of the definition generated for the content of
elements declared in the DTD as having a content model of ANY.</dd>
<dt><code>-i strict-any</code></dt>
<dd>Preserves the exact semantics of ANY content models by using an
explicit choice of references to all declared elements. By default,
Trang uses a wildcard that allows any element.</dd>
<dt><code>-i annotation-prefix=<var>prefix</var></code></dt>
<dd>Default values are represented using an annotation attribute
<code><var>prefix</var>:defaultValue</code> where
<code><var>prefix</var></code> is bound to
<code>http://relaxng.org/ns/compatibility/annotations/1.0</code> as
defined by the RELAX NG DTD Compatibility <a
href="http://www.oasis-open.org/committees/relax-ng/compatibility.html"
>Committee Specification</a>. By default, Trang will use
<code>a</code> for <code><var>prefix</var></code> unless that
conflicts with a prefix used in the DTD.</dd>
<dt><code>-i generate-start</code></dt>
<dt><code>-i no-generate-start</code></dt>
<dd>Specifies whether Trang should generate a <code>start</code>
element. DTDs do not indicate what elements are allowed as document
elements. Trang assumes that all elements that are defined but never
referenced are allowed as document elements.</dd>
</dl>
<!-- Say something about limitations wrt marked sections -->
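<p>As a rough sketch of how the naming parameters interact, consider a
hypothetical DTD <code>doc.dtd</code> containing:</p>
<pre>&lt;!ELEMENT para (#PCDATA)&gt;
&lt;!ATTLIST para class CDATA #IMPLIED&gt;</pre>
<p>translated with explicit name patterns (the filenames and patterns
are chosen purely for illustration):</p>
<pre>java -jar trang.jar -i element-define=%.element -i attlist-define=%.attlist doc.dtd doc.rnc</pre>
<p>With the <code>no-inline-attlist</code> behavior, the definitions in
the result would be expected to have roughly the following shape, where
<code>&amp;=</code> is the compact-syntax form of
<code>combine="interleave"</code>. This is a fragment written by hand
from the descriptions above, not captured output, so details may
differ:</p>
<pre>para.element = element para { para.attlist, text }
para.attlist &amp;= attribute class { text }?</pre>
<p>With <code>-i inline-attlist</code>, the attribute would instead
appear directly inside the element definition and no separate
<code>para.attlist</code> definition would be generated.</p>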
<h3><a name="xml-input">XML input module</a></h3>
<p>This input module accepts one or more XML documents and infers a
schema. All the XML documents will be valid with respect to the
inferred schema.</p>
<p>It accepts the following parameters:</p>
<dl>
<dt><code>-i encoding=<var>name</var></code></dt>
<dd>Use an encoding of <var>name</var> rather than the encoding
specified in the encoding declaration of the XML document.</dd>
</dl>
<h2><a name="output-modules">Output modules</a></h2>
<p>All output modules accept the following parameters:</p>
<dl>
<dt><code>-o encoding=<var>name</var></code></dt>
<dd>Use an encoding of <code><var>name</var></code> for the output
files.</dd>
<!-- describe default -->
<dt><code>-o indent=<var>n</var></code></dt>
<dd>Indent by <code><var>n</var></code> spaces for each indentation
level.</dd>
</dl>
<h3><a name="rng-output">RELAX NG (XML syntax) output module</a></h3>
<p>This output module outputs RELAX NG schemas in XML syntax as
defined by the RELAX NG 1.0 <a
href="http://www.oasis-open.org/committees/relax-ng/spec.html">Committee
Specification</a>.</p>
<h3><a name="rnc-output">RELAX NG Compact Syntax output module</a></h3>
<p>This output module outputs RELAX NG schemas in compact syntax as
defined by the RELAX NG Compact Syntax <a
href="http://www.oasis-open.org/committees/relax-ng/compact-20021121.html"
>Committee Specification</a>.</p>
<h3><a name="dtd-output">DTD output module</a></h3>
<p>This output module outputs DTDs as defined by the XML 1.0 <a
href="http://www.w3.org/TR/REC-xml">Recommendation</a>.</p>
<p>It has many limitations. There are many RELAX NG features that it
cannot handle, including:</p>
<ul>
<li>Wildcards</li>
<li>Multiple <code>element</code> patterns with the same name</li>
<li><code>externalRef</code></li>
<li>overriding definitions (in an <code>include</code>)</li>
<li>combining definitions with <code>combine="choice"</code></li>
</ul>
<p>However, it can handle many RELAX NG features, including some
that go beyond the capabilities of DTDs. When some part of a RELAX NG
schema cannot be represented exactly in DTD, Trang will try to
<i>approximate</i> it. The approximation will always be more general,
that is, the DTD will allow everything that is allowed by the RELAX NG
schema, but there may be some things that are allowed by the DTD that
are not allowed by the RELAX NG schema. For example, if the RELAX NG
schema specifies that the content of an element is a string conforming
to some datatype, then Trang will make the content of the element be
<code>(#PCDATA)</code>; or if the RELAX NG schema specifies a choice
between two attributes <var>x</var> and <var>y</var>, then the DTD
will allow both <var>x</var> and <var>y</var> optionally. Whenever
Trang approximates, it will give a warning message.</p>
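<p>To make the attribute example concrete, here is a minimal RELAX NG
schema (the element name <code>e</code> is hypothetical) that requires
exactly one of <var>x</var> or <var>y</var>:</p>
<pre>&lt;element name="e" xmlns="http://relaxng.org/ns/structure/1.0"&gt;
  &lt;choice&gt;
    &lt;attribute name="x"/&gt;
    &lt;attribute name="y"/&gt;
  &lt;/choice&gt;
  &lt;empty/&gt;
&lt;/element&gt;</pre>
<p>A DTD approximating it, along the lines described above, would have
roughly this shape. It is hand-written for illustration rather than
captured Trang output:</p>
<pre>&lt;!ELEMENT e EMPTY&gt;
&lt;!ATTLIST e
  x CDATA #IMPLIED
  y CDATA #IMPLIED&gt;</pre>
<p>This DTD accepts every document the RELAX NG schema accepts, but it
also accepts <code>&lt;e/&gt;</code> and
<code>&lt;e x="1" y="2"/&gt;</code>, which the RELAX NG schema
rejects.</p>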
<p>If you want to be able to generate a DTD but need to use some
feature of RELAX NG that Trang is unable to convert into a DTD, then
you might try one of the following approaches:</p>
<ul>
<li>Create a RELAX NG schema including the features you need, and then
use XSLT (or some other XML transformation language) to transform the
schema into something that Trang can handle, perhaps making use of
annotations in the schema to guide the transformation.</li>
<li>Create a RELAX NG schema <var>S</var><sub>1</sub> which uses only
features that Trang can handle but which, consequently, does not
capture all the desired constraints; then create a second RELAX NG
schema <var>S</var><sub>2</sub> that <code>include</code>s
<var>S</var><sub>1</sub>, and overrides definitions in
<var>S</var><sub>1</sub> replacing them with definitions that make
unrestricted use of the features of RELAX NG.</li>
</ul>
<h3><a name="xsd-output">W3C XML Schema output module</a></h3>
<p>This output module outputs a W3C XML Schema as defined by the XML
Schema <a href="http://www.w3.org/TR/xmlschema-1/"
>Recommendation</a>.</p>
<p>It supports the following parameters:</p>
<dl>
<dt><code>-o disable-abstract-elements</code></dt>
<dd>Disables the use of abstract elements and substitution groups in
the generated XML Schema. This can also be controlled using an <a
href="#enable-abstract-elements">annotation attribute</a>.</dd>
<dt><code>-o any-process-contents=strict</code>|<code>lax</code>|<code>skip</code></dt>
<dd>Specifies the value for the <code>processContents</code> attribute
of <code>any</code> elements. The default is <code>skip</code>
(corresponding to RELAX NG semantics) unless the input format is
<code>dtd</code>, in which case the default is <code>strict</code>
(corresponding to DTD semantics).</dd>
<dt><code>-o any-attribute-process-contents=strict</code>|<code>lax</code>|<code>skip</code></dt>
<dd>Specifies the value for the <code>processContents</code> attribute
of <code>anyAttribute</code> elements. The default is
<code>skip</code> (corresponding to RELAX NG semantics).</dd>
</dl>
<p>It has the following limitations:</p>
<ul>
<li>it may generate schemas that violate W3C XML Schema's restrictions
on ambiguous content models;</li>
<li>it may generate schemas that violate W3C XML Schema's restrictions
on consistent element types;</li>
<li>when the RELAX NG schema cannot be represented by W3C XML Schema,
a generalization is generated; it should give a warning in this case,
but does not always do so.</li>
</ul>
<p>Annotations can be added to the RELAX NG schema to guide the
translation. These annotations have the namespace URI
<code>http://www.thaiopensource.com/ns/relaxng/xsd</code>. This document
will use the convention that the prefix <code>tx</code> refers to this
namespace URI; in other words, it will assume a namespace declaration
of</p>
<pre>xmlns:tx="http://www.thaiopensource.com/ns/relaxng/xsd"</pre>
<p><a name="enable-abstract-elements"/>Currently, only one annotation
is supported, an attribute <code>tx:enableAbstractElements</code>.
The value of this must be <code>true</code> or <code>false</code>. It
applies to RELAX NG <code>define</code> elements. Trang has the
ability to translate a <code>define</code> that contains a choice of
element patterns into an abstract element declaration, which will be
used as the head of a substitution group whose members are the
elements in the choice. Whether it does this is determined by the
value of the <code>tx:enableAbstractElements</code> annotation
attribute. If the value is <code>true</code>, it will attempt to use
an abstract element. If the value is <code>false</code>, it
will not, which means the <code>define</code> will typically be
translated into a group definition.</p>
<p>The <code>tx:enableAbstractElements</code> attribute is inherited
in a similar way to the <code>ns</code> attribute: it can be specified
on a <code>grammar</code>, <code>div</code> or <code>include</code>
element to enable or disable the use of abstract elements for all
descendant <code>define</code> elements. In the absence of any
inherited <code>tx:enableAbstractElements</code> attribute, the use of
abstract elements is enabled unless the <code>-o
disable-abstract-elements</code> option was specified.</p>
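<p>The following grammar is a minimal sketch of how the annotation
might be used; all element and definition names are hypothetical. The
<code>inline</code> definition carries no annotation, so abstract
elements remain enabled for it by default, whereas the
<code>div</code> disables them for the <code>block</code> definition
it contains.</p>
<pre>&lt;grammar xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:tx="http://www.thaiopensource.com/ns/relaxng/xsd"&gt;
  &lt;start&gt;
    &lt;element name="doc"&gt;
      &lt;oneOrMore&gt;&lt;ref name="block"/&gt;&lt;/oneOrMore&gt;
    &lt;/element&gt;
  &lt;/start&gt;
  &lt;!-- Candidate for an abstract element heading a substitution group --&gt;
  &lt;define name="inline"&gt;
    &lt;choice&gt;
      &lt;element name="em"&gt;&lt;text/&gt;&lt;/element&gt;
      &lt;element name="code"&gt;&lt;text/&gt;&lt;/element&gt;
    &lt;/choice&gt;
  &lt;/define&gt;
  &lt;!-- Abstract elements disabled for every define inside this div --&gt;
  &lt;div tx:enableAbstractElements="false"&gt;
    &lt;define name="block"&gt;
      &lt;choice&gt;
        &lt;element name="para"&gt;
          &lt;zeroOrMore&gt;
            &lt;choice&gt;&lt;text/&gt;&lt;ref name="inline"/&gt;&lt;/choice&gt;
          &lt;/zeroOrMore&gt;
        &lt;/element&gt;
        &lt;element name="note"&gt;&lt;text/&gt;&lt;/element&gt;
      &lt;/choice&gt;
    &lt;/define&gt;
  &lt;/div&gt;
&lt;/grammar&gt;</pre>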
<p>It can happen that the same element name occurs in a choice in more
than one <code>define</code> element; at most one of these
<code>define</code> elements can be translated to an abstract element.
In this case, Trang will not translate any of them to an abstract
element, unless the use of abstract elements has been disabled by
<code>tx:enableAbstractElements</code> for all except one of the
<code>define</code> elements.</p>
<p>In fact, the use of abstract elements is not restricted to the case
where the <code>define</code> consists of a <code>choice</code> that
contains only <code>element</code> patterns; the <code>choice</code>
may also contain <code>ref</code> patterns referring to definitions
that are to be translated into element declarations, whether abstract
or not. The <code>tx:enableAbstractElements</code> attribute applies
equally to these definitions.</p>
<!--
<h2><a name="examples">Examples</a></h2>
-->
</body>
</html>