- <!-- jEdit buffer-local properties: -->
- <!-- :indentSize=1:noTabs=true:folding=indent:collapseFolds=1: -->
- <chapter id="writing-modes"><title>Writing Edit Modes</title>
- <para>
- Edit modes are defined using XML, the <firstterm>extensible markup
- language</firstterm>; mode files have the extension
- <filename>.xml</filename>. XML is a very simple language, and as a result
- edit modes are easy to create and modify. This section will
- start with a short XML primer, followed by detailed information about
- each supported tag and highlighting rule.
- </para>
- <sidebar><title>Changes to modes take effect immediately</title>
- <para>
- Editing a mode file or a mode catalog file within jEdit causes the
- edit modes to be reloaded automatically as soon as the file is saved.
- </para>
- <para>
- <guimenu>Utilities</guimenu>&gt;<guimenuitem>Reload Edit Modes</guimenuitem>
- can be used to reload edit modes after changes to mode files are made
- outside jEdit.
- </para>
- </sidebar>
- <sect1 id="xml-primer"><title>An XML Primer</title>
- <para>
- A very simple edit mode looks like so:
- </para>
- <programlisting><![CDATA[<?xml version="1.0"?>
- <!DOCTYPE MODE SYSTEM "xmode.dtd">
- <MODE>
- <PROPS>
- <PROPERTY NAME="commentStart" VALUE="/*" />
- <PROPERTY NAME="commentEnd" VALUE="*/" />
- </PROPS>
- <RULES>
- <SPAN TYPE="COMMENT1">
- <BEGIN>/*</BEGIN>
- <END>*/</END>
- </SPAN>
- </RULES>
- </MODE>]]></programlisting>
- <para>
- Note that each opening tag must have a corresponding closing tag.
- If there is nothing between the opening and closing tags, for example
- <literal>&lt;TAG&gt;&lt;/TAG&gt;</literal>, the shorthand notation
- <literal>&lt;TAG /&gt;</literal> may be used. An example of this shorthand
- can be seen
- in the <literal>&lt;PROPERTY&gt;</literal> tags above.
- </para>
- <para>
- XML is case sensitive. <literal>Span</literal> or <literal>span</literal>
- is not the same as <literal>SPAN</literal>.
- </para>
- <para>
- To insert a special character such as &lt; or &gt; literally in XML
- (for example, inside an attribute value), you must write it as
- an <firstterm>entity</firstterm>. An
- entity consists of the character's symbolic name enclosed with
- <quote>&amp;</quote> and <quote>;</quote>. A full list of entities is out of
- the scope of this section, but the most important are:
- </para>
- <itemizedlist>
- <listitem><para><literal>&amp;lt;</literal> - The less-than (&lt;)
- character</para></listitem>
- <listitem><para><literal>&amp;gt;</literal> - The greater-than (&gt;)
- character</para></listitem>
- <listitem><para><literal>&amp;amp;</literal> - The ampersand (&amp;)
- character</para></listitem>
- </itemizedlist>
- <para>
- For example, the following will cause a syntax error:
- </para>
- <programlisting><![CDATA[<SEQ TYPE="OPERATOR">&</SEQ>]]></programlisting>
- <para>
- Instead, you must write:
- </para>
- <programlisting><![CDATA[<SEQ TYPE="OPERATOR">&amp;</SEQ>]]></programlisting>
- <para>
- Now that the basics of XML have been covered, the rest of this
- section will cover each construct in detail.
- </para>
- </sect1>
- <sect1 id="mode-preamble"><title>The Preamble and MODE tag</title>
- <para>
- Each mode definition must begin with the following:
- </para>
- <programlisting><![CDATA[<?xml version="1.0"?>
- <!DOCTYPE MODE SYSTEM "xmode.dtd">]]></programlisting>
- <para>
- Each mode definition must also contain exactly one <literal>MODE</literal>
- tag. All other tags (<literal>PROPS</literal>, <literal>RULES</literal>)
- must be placed inside the <literal>MODE</literal> tag.
- </para>
- </sect1>
- <sect1 id="mode-tag-props"><title>The PROPS Tag</title>
- <para>
- The <literal>PROPS</literal> tag and the <literal>PROPERTY</literal> tags
- inside it are used to define mode-specific
- properties. Each <literal>PROPERTY</literal> tag must have a
- <literal>NAME</literal> attribute set to the property's name, and a
- <literal>VALUE</literal> attribute with the property's value.
- </para>
- <para>
- All buffer-local properties listed in <xref linkend="buffer-local" />
- may be given values in edit modes. In addition, the following mode
- properties have no buffer-local equivalent:
- </para>
- <itemizedlist>
- <listitem><para><literal>indentCloseBrackets</literal> -
- A list of characters (usually brackets) that subtract indent from
- the <emphasis>current</emphasis> line. For example, in Java mode this
- property is set to <quote>}</quote>.</para></listitem>
- <listitem><para><literal>indentOpenBrackets</literal> -
- A list of characters (usually brackets) that add indent to
- the <emphasis>next</emphasis> line. For example, in Java mode this
- property is set to <quote>{</quote>.</para></listitem>
- <listitem><para><literal>indentPrevLine</literal> -
- When indenting a line, jEdit checks if the previous line matches
- the regular expression stored in this property. If it does, a level
- of indent is added. For example, in Java mode this regular expression
- matches language constructs such as
- <quote>if</quote>, <quote>else</quote>, <quote>while</quote>, etc.</para>
- </listitem>
- <listitem><para><literal>doubleBracketIndent</literal> -
- If a line matches the <literal>indentPrevLine</literal> regular
- expression and the next line contains an opening bracket,
- a level of indent will not be added to the next line, unless
- this property is set to <quote>true</quote>. For example, with this
- property set to <quote>false</quote>, Java code will be indented like so:
- </para>
- <programlisting>while(objects.hasMoreElements())
- {
-     ((Drawable)objects.nextElement()).draw();
- }</programlisting>
- <para>
- On the other hand, setting this property to <quote>true</quote> will
- give the following result:
- </para>
- <programlisting>while(objects.hasMoreElements())
-     {
-         ((Drawable)objects.nextElement()).draw();
-     }</programlisting></listitem>
- </itemizedlist>
- <para>
- Here is the complete <literal>&lt;PROPS&gt;</literal> tag for Java mode:
- </para>
- <programlisting><![CDATA[<PROPS>
- <PROPERTY NAME="indentOpenBrackets" VALUE="{" />
- <PROPERTY NAME="indentCloseBrackets" VALUE="}" />
- <PROPERTY NAME="indentPrevLine" VALUE="\s*(((if|while)
- \s*\(|else|case|default)[^;]*|for\s*\(.*)" />
- <PROPERTY NAME="doubleBracketIndent" VALUE="false" />
- <PROPERTY NAME="commentStart" VALUE="/*" />
- <PROPERTY NAME="commentEnd" VALUE="*/" />
- <PROPERTY NAME="blockComment" VALUE="//" />
- <PROPERTY NAME="wordBreakChars" VALUE=",+-=&lt;&gt;/?^&amp;*" />
- </PROPS>]]></programlisting>
- </sect1>
- <sect1 id="mode-tag-rules"><title>The RULES Tag</title>
- <para>
- <literal>RULES</literal> tags must be placed inside the
- <literal>MODE</literal> tag. Each <literal>RULES</literal> tag defines a
- <firstterm>ruleset</firstterm>. A ruleset consists of a number of
- <firstterm>parser rules</firstterm>, with each parser
- rule specifying how to highlight a specific syntax token. There must
- be at least one ruleset in each edit mode. There can also be more
- than one, with different rulesets being used to highlight different
- parts of a buffer (for example, in HTML mode, one ruleset
- highlights HTML tags, and another highlights inline JavaScript).
- For information about using more
- than one ruleset, see <xref linkend="mode-rule-span" />.
- </para>
- <para>
- The <literal>RULES</literal> tag supports the following attributes, all of
- which are optional:
- </para>
- <itemizedlist>
- <listitem><para><literal>SET</literal> - the name of this ruleset.
- All rulesets other than the first must have a name.
- </para></listitem>
- <listitem><para><literal>HIGHLIGHT_DIGITS</literal> - if set to
- <literal>TRUE</literal>, digits (0-9, as well as hexadecimal literals
- prefixed with <quote>0x</quote>) will be highlighted with the
- <classname>DIGIT</classname> token type. Default is <literal>FALSE</literal>.
- </para></listitem>
- <listitem><para><literal>IGNORE_CASE</literal> - if set to
- <literal>FALSE</literal>, matches will be case sensitive. Otherwise, case
- will not matter. Default is <literal>TRUE</literal>.
- </para></listitem>
- <listitem><para><literal>DEFAULT</literal> - the token type for
- text which doesn't match
- any specific rule. Default is <literal>NULL</literal>. See
- <xref linkend="mode-syntax-tokens" /> for a list of token types.
- </para></listitem>
- </itemizedlist>
- <para>
- Here is an example <literal>RULES</literal> tag:
- </para>
- <programlisting>&lt;RULES IGNORE_CASE="FALSE" HIGHLIGHT_DIGITS="TRUE"&gt;
- <replaceable>... parser rules go here ...</replaceable>
- &lt;/RULES&gt;</programlisting>
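- <para>
- As a further sketch, a second, named ruleset might be declared as
- follows. The name <quote>JAVADOC</quote> and the choice of
- <literal>COMMENT2</literal> as the default token type are illustrative
- assumptions, not taken from any particular mode file:
- </para>
- <programlisting><![CDATA[<RULES SET="JAVADOC" DEFAULT="COMMENT2" IGNORE_CASE="TRUE">
- <!-- parser rules for the inside of documentation comments go here -->
- </RULES>]]></programlisting>
- <para>
- With <literal>DEFAULT</literal> set this way, any text in the ruleset
- that no rule matches is highlighted as <literal>COMMENT2</literal> rather
- than <literal>NULL</literal>.
- </para>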
- <sidebar><title>Rule Ordering Requirements</title>
- <para>
- You might encounter this very common pitfall when writing your own modes.
- </para>
- <para>
- Since jEdit checks buffer text against parser rules in the order they appear
- in the ruleset, more specific rules must be placed before generalized ones,
- otherwise the generalized rules will catch everything.
- </para>
- <para>
- This is best demonstrated with an example. The following is incorrect rule
- ordering:
- </para>
- <programlisting><![CDATA[<SPAN TYPE="MARKUP">
- <BEGIN>[</BEGIN>
- <END>]</END>
- </SPAN>
- <SPAN TYPE="KEYWORD1">
- <BEGIN>[!</BEGIN>
- <END>]</END>
- </SPAN>]]></programlisting>
- <para>
- If you write the above in a ruleset, any occurrence of <quote>[</quote>
- (even things like <quote>[!DEFINE</quote>, etc.)
- will be highlighted using the first rule, because it will be the
- first to match. This is most likely not the intended behavior.
- </para>
- <para>
- The problem can be solved by placing the more specific rule before the
- general one:
- </para>
- <programlisting><![CDATA[<SPAN TYPE="KEYWORD1">
- <BEGIN>[!</BEGIN>
- <END>]</END>
- </SPAN>
- <SPAN TYPE="MARKUP">
- <BEGIN>[</BEGIN>
- <END>]</END>
- </SPAN>]]></programlisting>
- <para>
- Now, if the buffer contains the text <quote>[!SPECIAL]</quote>, the
- rules will be checked in order, and the first rule will be the first
- to match. However, if you write <quote>[FOO]</quote>, it will be highlighted
- using the second rule, which is exactly what you would expect.
- </para>
- </sidebar>
- <sect2 id="mode-rule-terminate"><title>The TERMINATE Rule</title>
- <para>
- The <literal>TERMINATE</literal> rule specifies that parsing should stop
- after the specified number of characters have been read from a line. The
- number of characters to terminate after should be specified with the
- <literal>AT_CHAR</literal> attribute. Here is an example:
- </para>
- <programlisting><![CDATA[<TERMINATE AT_CHAR="1" />]]></programlisting>
- <para>
- This rule is used in Patch mode, for example, because only the first
- character of each line affects highlighting.
- </para>
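- <para>
- A minimal sketch of how such a ruleset could look is shown below. It is
- an illustration only, not the actual Patch mode definition; the token
- types chosen for added and removed lines are assumptions:
- </para>
- <programlisting><![CDATA[<RULES HIGHLIGHT_DIGITS="FALSE">
- <!-- only the first character of each line is examined -->
- <TERMINATE AT_CHAR="1" />
- <!-- lines starting with + or - are highlighted to the end of the line -->
- <EOL_SPAN TYPE="KEYWORD1" AT_LINE_START="TRUE">+</EOL_SPAN>
- <EOL_SPAN TYPE="KEYWORD2" AT_LINE_START="TRUE">-</EOL_SPAN>
- </RULES>]]></programlisting>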
- </sect2>
- <sect2 id="mode-rule-whitespace"><title>The WHITESPACE Rule</title>
- <para>
- The <literal>WHITESPACE</literal> rule specifies characters which are to
- be treated as keyword delimiters.
- Most rulesets will have <literal>WHITESPACE</literal> tags for spaces and
- tabs. Here is an example; the first tag contains a space and the second
- a tab character:
- </para>
- <programlisting><![CDATA[<WHITESPACE> </WHITESPACE>
- <WHITESPACE>	</WHITESPACE>]]></programlisting>
- </sect2>
- <sect2 id="mode-rule-span"><title>The SPAN Rule</title>
- <para>
- The <literal>SPAN</literal> rule highlights text between a start
- and end string. The start and end strings are specified inside
- child elements of the <literal>SPAN</literal> tag.
- The following attributes are supported:
- </para>
- <itemizedlist>
- <listitem><para><literal>TYPE</literal> - The token type to highlight the
- span with. See <xref linkend="mode-syntax-tokens" /> for a list of token
- types</para></listitem>
- <listitem><para><literal>AT_LINE_START</literal> - If set to
- <literal>TRUE</literal>, the span will only be highlighted if the start
- sequence occurs at the beginning of a line</para></listitem>
- <listitem><para><literal>EXCLUDE_MATCH</literal> - If set to
- <literal>TRUE</literal>, the start and end sequences will not be highlighted,
- only the text between them will</para></listitem>
- <listitem><para><literal>NO_LINE_BREAK</literal> - If set to
- <literal>TRUE</literal>, the span will be highlighted with the
- <classname>INVALID</classname> token type if it spans more than one
- line</para></listitem>
- <listitem><para><literal>NO_WORD_BREAK</literal> - If set to
- <literal>TRUE</literal>, the span will be highlighted with the
- <classname>INVALID</classname> token type if it includes
- whitespace</para></listitem>
- <listitem><para><literal>DELEGATE</literal> - text inside the span will be
- highlighted with the specified ruleset. To delegate to a ruleset defined
- in the current mode, just specify its name. To delegate to a ruleset
- defined in another mode, specify a name of the form
- <literal><replaceable>mode</replaceable>::<replaceable>ruleset</replaceable></literal>.
- Note that the first (unnamed) ruleset in a mode is called
- <quote>MAIN</quote>.</para></listitem>
- </itemizedlist>
- <note>
- <para>
- Do not delegate to rulesets that define a <literal>TERMINATE</literal> rule
- (examples of such rulesets include <literal>text::MAIN</literal> and
- <literal>patch::MAIN</literal>). It won't work.
- </para>
- </note>
- <para>
- Here is a <literal>SPAN</literal> that highlights Java string literals,
- which cannot include line breaks:
- </para>
- <programlisting><![CDATA[<SPAN TYPE="LITERAL1" NO_LINE_BREAK="TRUE">
- <BEGIN>"</BEGIN>
- <END>"</END>
- </SPAN>]]></programlisting>
- <para>
- Here is a <literal>SPAN</literal> that highlights Java documentation
- comments by delegating to the <quote>JAVADOC</quote> ruleset defined
- elsewhere in the current mode:
- </para>
- <programlisting><![CDATA[<SPAN TYPE="COMMENT2" DELEGATE="JAVADOC">
- <BEGIN>/**</BEGIN>
- <END>*/</END>
- </SPAN>]]></programlisting>
- <para>
- Here is a <literal>SPAN</literal> that highlights HTML cascading stylesheets
- inside <literal>&lt;STYLE&gt;</literal> tags by delegating to the main
- ruleset in the CSS edit mode:
- </para>
- <programlisting><![CDATA[<SPAN TYPE="MARKUP" DELEGATE="css::MAIN">
- <BEGIN>&lt;style&gt;</BEGIN>
- <END>&lt;/style&gt;</END>
- </SPAN>]]></programlisting>
- <tip>
- <para>
- The <literal>&lt;END&gt;</literal> tag is optional. If it is not specified,
- any occurrence of the start string will cause the remainder of the buffer
- to be highlighted with this rule.
- </para>
- <para>
- This can be very useful when combined with delegation.
- </para>
- </tip>
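- <para>
- For instance, a mode for a scripting language might treat everything after
- a marker line as embedded data. The following sketch assumes a
- hypothetical <quote>__END__</quote> marker and a hypothetical ruleset
- named <quote>DATA</quote> defined elsewhere in the same mode:
- </para>
- <programlisting><![CDATA[<!-- no END tag: the span runs to the end of the buffer -->
- <SPAN TYPE="COMMENT1" AT_LINE_START="TRUE" DELEGATE="DATA">
- <BEGIN>__END__</BEGIN>
- </SPAN>]]></programlisting>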
- </sect2>
- <sect2 id="mode-rule-eol-span"><title>The EOL_SPAN Rule</title>
- <para>
- An <literal>EOL_SPAN</literal> is similar to a <literal>SPAN</literal>
- except that highlighting stops at the end of the line, not after the end
- sequence is found. The text to match is specified between the opening and
- closing <literal>EOL_SPAN</literal> tags.
- The following attributes are supported:
- </para>
- <itemizedlist>
- <listitem><para><literal>TYPE</literal> - The token type to highlight the span
- with. See <xref linkend="mode-syntax-tokens" /> for a list of token
- types</para></listitem>
- <listitem><para><literal>AT_LINE_START</literal> - If set to
- <literal>TRUE</literal>, the span will only be highlighted if the start
- sequence occurs at the beginning of a line</para></listitem>
- <listitem><para><literal>EXCLUDE_MATCH</literal> - If set to
- <literal>TRUE</literal>, the start sequence will not be highlighted,
- only the text after it will</para></listitem>
- </itemizedlist>
- <para>
- Here is an <literal>EOL_SPAN</literal> that highlights C++ comments:
- </para>
- <programlisting><![CDATA[<EOL_SPAN TYPE="COMMENT1">//</EOL_SPAN>]]></programlisting>
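- <para>
- The attributes can be combined. As a sketch, the following rule would
- highlight a line as a comment only when <quote>#</quote> is its first
- character; the use of <quote>#</quote> and of the
- <literal>COMMENT2</literal> token type is an assumption made for
- illustration:
- </para>
- <programlisting><![CDATA[<EOL_SPAN TYPE="COMMENT2" AT_LINE_START="TRUE">#</EOL_SPAN>]]></programlisting>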
- </sect2>
- <sect2 id="mode-rule-mark-prev"><title>The MARK_PREVIOUS Rule</title>
- <para>
- The <literal>MARK_PREVIOUS</literal> rule highlights from the end of the
- previous syntax token to the matched text. The text to match
- is specified between opening and closing <literal>MARK_PREVIOUS</literal>
- tags. The following attributes are supported:
- </para>
- <itemizedlist>
- <listitem><para><literal>TYPE</literal> - The token type to highlight the
- text with. See <xref linkend="mode-syntax-tokens" /> for a list of token
- types</para></listitem>
- <listitem><para><literal>AT_LINE_START</literal> - If set to
- <literal>TRUE</literal>,
- the text will only be highlighted if it occurs at the beginning of
- the line</para></listitem>
- <listitem><para><literal>EXCLUDE_MATCH</literal> - If set to
- <literal>TRUE</literal>, the match will not be highlighted,
- only the text before it will</para></listitem>
- </itemizedlist>
- <para>
- Here is a rule that highlights labels in Java mode (for example,
- <quote>XXX:</quote>):
- </para>
- <programlisting><![CDATA[<MARK_PREVIOUS TYPE="LABEL" AT_LINE_START="TRUE"
- EXCLUDE_MATCH="TRUE">:</MARK_PREVIOUS>]]></programlisting>
- </sect2>
- <sect2 id="mode-rule-mark-following"><title>The MARK_FOLLOWING Rule</title>
- <para>
- The <literal>MARK_FOLLOWING</literal> rule highlights from the start of the
- match to the next syntax token. The text to match
- is specified between opening and closing <literal>MARK_FOLLOWING</literal>
- tags. The following attributes are supported:
- </para>
- <itemizedlist>
- <listitem><para><literal>TYPE</literal> - The token type to highlight the
- text with. See <xref linkend="mode-syntax-tokens" /> for a list of token
- types</para></listitem>
- <listitem><para><literal>AT_LINE_START</literal> - If set to
- <literal>TRUE</literal>, the text will only be highlighted if the start
- sequence occurs at the beginning of a line</para></listitem>
- <listitem><para><literal>EXCLUDE_MATCH</literal> - If set to
- <literal>TRUE</literal>, the match will not be highlighted,
- only the text after it will</para></listitem>
- </itemizedlist>
- <para>
- Here is a rule that highlights variables in Unix shell scripts
- (<quote>$CLASSPATH</quote>, <quote>$IFS</quote>, etc):
- </para>
- <programlisting><![CDATA[<MARK_FOLLOWING TYPE="KEYWORD2">$</MARK_FOLLOWING>]]></programlisting>
- </sect2>
- <sect2 id="mode-rule-seq"><title>The SEQ Rule</title>
- <para>
- The <literal>SEQ</literal> rule highlights fixed sequences of text. The text
- to highlight is specified between opening and closing <literal>SEQ</literal>
- tags. The following attributes are supported:
- </para>
- <itemizedlist>
- <listitem><para><literal>TYPE</literal> - the token type to highlight the
- sequence with. See <xref linkend="mode-syntax-tokens" /> for a list of token
- types</para></listitem>
- <listitem><para><literal>AT_LINE_START</literal> - If set to
- <literal>TRUE</literal>, the sequence will only be highlighted if it occurs
- at the beginning of a line</para></listitem>
- </itemizedlist>
- <para>
- The following rules highlight a few Java operators:
- </para>
- <programlisting><![CDATA[<SEQ TYPE="OPERATOR">+</SEQ>
- <SEQ TYPE="OPERATOR">-</SEQ>
- <SEQ TYPE="OPERATOR">*</SEQ>
- <SEQ TYPE="OPERATOR">/</SEQ>]]></programlisting>
- </sect2>
- <sect2 id="mode-rule-keywords"><title>The KEYWORDS Rule</title>
- <para>
- There can only be one <literal>KEYWORDS</literal> tag per ruleset.
- The <literal>KEYWORDS</literal> rule defines keywords to highlight.
- Keywords are similar to <literal>SEQ</literal>s, except that
- <literal>SEQ</literal>s match anywhere in the text, whereas keywords only
- match whole words.
- </para>
- <para>
- The <literal>KEYWORDS</literal> tag supports only one attribute,
- <literal>IGNORE_CASE</literal>. If set to <literal>FALSE</literal>, keywords
- will be case sensitive. Otherwise, case will not matter. Default is
- <literal>TRUE</literal>.
- </para>
- <para>
- Each child element of the <literal>KEYWORDS</literal> tag should be named
- after the desired token type, with the keyword text between the start and
- end tags. For example, the following rule highlights a few common Java
- keywords:
- </para>
- <programlisting><![CDATA[<KEYWORDS IGNORE_CASE="FALSE">
- <KEYWORD1>if</KEYWORD1>
- <KEYWORD1>else</KEYWORD1>
- <KEYWORD3>int</KEYWORD3>
- <KEYWORD3>void</KEYWORD3>
- </KEYWORDS>]]></programlisting>
- </sect2>
- <sect2 id="mode-syntax-tokens"><title>Token Types</title>
- <para>
- Parser rules can highlight tokens using any of the following token
- types:
- </para>
- <itemizedlist>
- <listitem><para><literal>NULL</literal> - no special
- highlighting is performed on tokens of type <literal>NULL</literal>
- </para></listitem>
- <listitem><para><literal>COMMENT1</literal>
- </para></listitem>
- <listitem><para><literal>COMMENT2</literal>
- </para></listitem>
- <listitem><para><literal>FUNCTION</literal>
- </para></listitem>
- <listitem><para><literal>INVALID</literal> - tokens of this type are
- automatically added if a <literal>NO_WORD_BREAK</literal> or
- <literal>NO_LINE_BREAK</literal> <literal>SPAN</literal> spans more than
- one word or line, respectively.
- </para></listitem>
- <listitem><para><literal>KEYWORD1</literal>
- </para></listitem>
- <listitem><para><literal>KEYWORD2</literal>
- </para></listitem>
- <listitem><para><literal>KEYWORD3</literal>
- </para></listitem>
- <listitem><para><literal>LABEL</literal>
- </para></listitem>
- <listitem><para><literal>LITERAL1</literal>
- </para></listitem>
- <listitem><para><literal>LITERAL2</literal>
- </para></listitem>
- <listitem><para><literal>MARKUP</literal>
- </para></listitem>
- <listitem><para><literal>OPERATOR</literal>
- </para></listitem>
- </itemizedlist>
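- <para>
- To tie the rules and token types together, here is a sketch of a small
- but complete mode for a hypothetical C-like language. The choice of
- rules and token types is illustrative and does not correspond to any
- mode shipped with jEdit:
- </para>
- <programlisting><![CDATA[<?xml version="1.0"?>
- <!DOCTYPE MODE SYSTEM "xmode.dtd">
- <MODE>
- <PROPS>
- <PROPERTY NAME="commentStart" VALUE="/*" />
- <PROPERTY NAME="commentEnd" VALUE="*/" />
- <PROPERTY NAME="blockComment" VALUE="//" />
- </PROPS>
- <RULES IGNORE_CASE="FALSE" HIGHLIGHT_DIGITS="TRUE">
- <WHITESPACE> </WHITESPACE>
- <!-- block comments, then comments running to the end of the line -->
- <SPAN TYPE="COMMENT1">
- <BEGIN>/*</BEGIN>
- <END>*/</END>
- </SPAN>
- <EOL_SPAN TYPE="COMMENT1">//</EOL_SPAN>
- <!-- string literals may not contain line breaks -->
- <SPAN TYPE="LITERAL1" NO_LINE_BREAK="TRUE">
- <BEGIN>"</BEGIN>
- <END>"</END>
- </SPAN>
- <!-- the two-character operator comes before its one-character prefix -->
- <SEQ TYPE="OPERATOR">==</SEQ>
- <SEQ TYPE="OPERATOR">=</SEQ>
- <KEYWORDS IGNORE_CASE="FALSE">
- <KEYWORD1>if</KEYWORD1>
- <KEYWORD1>else</KEYWORD1>
- <KEYWORD3>int</KEYWORD3>
- </KEYWORDS>
- </RULES>
- </MODE>]]></programlisting>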
- </sect2>
- </sect1>
- </chapter>