- <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
- <html xmlns="http://www.w3.org/1999/xhtml">
- <!--
- Copyright (c) 2006, Sun Microsystems, Inc.
- All rights reserved.
- Redistribution and use in source and binary forms, with or without
- modification, are permitted provided that the following conditions are met:
- * Redistributions of source code must retain the above copyright notice,
- this list of conditions and the following disclaimer.
- * Redistributions in binary form must reproduce the above copyright
- notice, this list of conditions and the following disclaimer in the
- documentation and/or other materials provided with the distribution.
- * Neither the name of the Sun Microsystems, Inc. nor the names of its
- contributors may be used to endorse or promote products derived from
- this software without specific prior written permission.
- THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
- AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
- IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
- ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
- LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
- CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
- SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
- INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
- CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
- ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
- THE POSSIBILITY OF SUCH DAMAGE.
- -->
- <head>
- <title>JavaCC Grammar Files</title>
- <!-- Changed by: Michael Van De Vanter, 14-Jan-2003 -->
- </head>
- <body bgcolor="#FFFFFF" >
- <h1>JavaCC [tm]: Grammar Files</h1>
- <p>
- This page contains the complete syntax of Java Compiler Compiler [tm]
- grammar files with detailed explanations of each construct.
- </p>
- <p>
- Tokens in the grammar files follow the same conventions as for the Java programming language.
- Hence identifiers, strings, characters, etc. used in the grammars are
- the same as Java identifiers, Java strings, Java characters, etc.
- </p>
- <p>
- <em>White space</em> in the grammar files also follows the same conventions as
- for the Java programming language. This includes the syntax for comments. Most comments present in
- the grammar files are copied into the generated parser/lexical analyzer.
- </p>
- <p>
- Grammar files are preprocessed for Unicode escapes just as Java files
- are (i.e., occurrences of strings such as <code>\uxxxx</code> - where <code>xxxx</code> is a hex value -
- are converted to the corresponding Unicode character before lexical analysis).
- </p>
- <p>
- <em>Exceptions to the above rules:</em>
- The Java operators "<code>&lt;&lt;</code>", "<code>&gt;&gt;</code>", "<code>&gt;&gt;&gt;</code>", "<code>&lt;&lt;=</code>",
- "<code>&gt;&gt;=</code>", and "<code>&gt;&gt;&gt;=</code>" are left out of JavaCC's input token list
- in order to allow convenient nested use of token specifications.
- Finally, the following are the additional reserved words in the Java Compiler
- Compiler [tm] grammar files.
- </p>
- <table cellpadding="3">
- <tr>
- <td align="left"><strong>EOF</strong></td>
- <td align="left"><strong><a href="#IGNORE_CASE">IGNORE_CASE</a></strong></td>
- <td align="left"><strong><a href="#JAVACODE">JAVACODE</a></strong></td>
- <td align="left"><strong><a href="#LOOKAHEAD">LOOKAHEAD</a></strong></td>
- </tr>
- <tr>
- <td align="left"><strong><a href="#MORE">MORE</a></strong></td>
- <td align="left"><strong><a href="#PARSER_BEGIN">PARSER_BEGIN</a></strong></td>
- <td align="left"><strong><a href="#PARSER_END">PARSER_END</a></strong></td>
- <td align="left"><strong><a href="#SKIP">SKIP</a></strong></td>
- </tr>
- <tr>
- <td align="left"><strong><a href="#SPECIAL_TOKEN">SPECIAL_TOKEN</a></strong></td>
- <td align="left"><strong><a href="#TOKEN">TOKEN</a></strong></td>
- <td align="left"><strong><a href="#TOKEN_MGR_DECLS">TOKEN_MGR_DECLS</a></strong></td>
- </tr>
- </table>
- <p>
- Any Java entities used in the grammar rules that follow appear italicized
- with the prefix <em>java_</em> (<em>e.g.</em>, <em>java_compilation_unit</em>).
- </p>
- <hr />
- <a name="PARSER_BEGIN"></a><a name="PARSER_END"></a>
- <table>
- <tr>
- <td align="right" valign="baseline"><a name="prod1">javacc_input</a></td>
- <td align="center" valign="baseline">::=</td>
- <td align="left" valign="baseline"><a href="#prod2">javacc_options</a></td>
- </tr>
- <tr>
- <td></td><td></td>
- <td align="left" valign="baseline">"PARSER_BEGIN" "(" &lt;IDENTIFIER&gt; ")"</td>
- </tr>
- <tr>
- <td></td><td></td>
- <td align="left" valign="baseline"><em>java_compilation_unit</em></td>
- </tr>
- <tr>
- <td></td><td></td>
- <td align="left" valign="baseline">"PARSER_END" "(" &lt;IDENTIFIER&gt; ")"</td>
- </tr>
- <tr>
- <td></td><td></td>
- <td align="left" valign="baseline">( <a href="#prod5">production</a> )*</td>
- </tr>
- <tr>
- <td></td><td></td>
- <td align="left" valign="baseline">&lt;EOF&gt;</td>
- </tr>
- </table>
- <p>
- The grammar file starts with a list of options (which is optional).
- This is then followed by a Java compilation unit enclosed between
- "PARSER_BEGIN(name)" and "PARSER_END(name)". After this is a list
- of grammar productions. <a href="#prod2">Options</a> and
- <a href="#prod5">productions</a> are described later.
- </p>
- <p>
- The <em>name</em> that follows "PARSER_BEGIN" and "PARSER_END" must
- be the same and this identifies the name of the generated parser.
- For example, if <em>name</em> is "MyParser", then the following files
- are generated:
- </p>
- <p>
- <strong>MyParser.java:</strong>
- The generated parser.
- <br />
- <strong>MyParserTokenManager.java:</strong>
- The generated token manager (or scanner/lexical analyzer).
- <br />
- <strong>MyParserConstants.java:</strong>
- A bunch of useful constants.
- </p>
- <p>
- Other files such as "Token.java", "ParseException.java", etc. are also
- generated. However, these files contain boilerplate code and are
- the same for any grammar and may be reused across grammars (provided the
- grammars use compatible options).
- </p>
- <p>
- Between the PARSER_BEGIN and PARSER_END constructs is a regular
- Java compilation unit (a compilation unit in Java lingo is the entire
- contents of a Java file). This may be any arbitrary
- Java compilation unit so long as it contains a class declaration
- whose name is the same as the name of the generated parser ("MyParser"
- in the above example). Hence, in general, this part of the grammar
- file looks like:
- </p>
- <pre>
- PARSER_BEGIN(parser_name)
- . . .
- class parser_name . . . {
-   . . .
- }
- . . .
- PARSER_END(parser_name)
- </pre>
- <p>
- JavaCC does not perform detailed checks on the compilation unit, so
- it is possible for a grammar file to pass through JavaCC and generate
- Java files that produce errors when they are compiled.
- </p>
- <p>
- If the compilation unit includes a package declaration, this is
- included in all the generated files. If the compilation unit includes
- import declarations, these are included in the generated parser and
- token manager files.
- </p>
- <p>
- The generated parser file contains everything in the compilation unit
- and, in addition, contains the generated parser code that is included at
- the end of the parser class. For the above example, the generated
- parser will look like:
- </p>
- <pre>
- . . .
- class parser_name . . . {
-   . . .
-   // generated parser is inserted here.
- }
- . . .
- </pre>
- <p>
- The generated parser includes a public method declaration corresponding
- to each non-terminal (see <a href="#prod9">javacode_production</a> and
- <a href="#prod11">bnf_production</a>) in the grammar file. Parsing with
- respect to a non-terminal is achieved by calling the method corresponding
- to that non-terminal. Unlike yacc, there is no single start symbol in
- JavaCC - one can parse with respect to any non-terminal in the grammar.
- </p>
- <p>
- The generated token manager provides one public method:
- </p>
- <pre>
- Token getNextToken() throws TokenMgrError;
- </pre>
- <p>
- For more details on how this method may be used, please read
- <a href="apiroutines.html">the description of the Java Compiler Compiler
- API</a>.
- </p>
- <hr />
- <table>
- <tr>
- <td align="right" valign="baseline"><a name="prod2">javacc_options</a></td>
- <td align="center" valign="baseline">::=</td>
- <td align="left" valign="baseline">[ "<a name="options">options</a>" "{" ( <a href="#prod6">option_binding</a> )* "}" ]</td>
- </tr>
- </table>
- <p>
- The options section, if present, starts with the reserved word "options" followed
- by a list of option bindings within braces. Each option
- binding specifies the setting of one option. The same option may not be
- set multiple times.
- </p>
- <p>
- Options may be specified either here in the grammar file, or from
- <a href="commandline.html">the command line</a>. If the option is set
- from <a href="commandline.html">the command line</a>, that takes precedence.
- </p>
- <p>
- Option names are not case-sensitive.
- </p>
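- <p>
- For illustration, an options section that sets one option of each value type
- might look like the following sketch. The particular values, including the
- directory name, are arbitrary.
- </p>
- <pre>
- options {
-   STATIC = false;                  // boolean option
-   LOOKAHEAD = 2;                   // integer option
-   OUTPUT_DIRECTORY = "generated";  // string option (hypothetical directory name)
- }
- </pre>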
- <hr />
- <table>
- <tr>
- <td align="right" valign="baseline"><a name="prod6">option_binding</a></td>
- <td align="center" valign="baseline">::=</td>
- <td align="left" valign="baseline">"LOOKAHEAD" "=" <em>java_integer_literal</em> ";"</td>
- </tr>
- <tr>
- <td></td><td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"CHOICE_AMBIGUITY_CHECK" "=" <em>java_integer_literal</em> ";"</td>
- </tr>
- <tr>
- <td></td><td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"OTHER_AMBIGUITY_CHECK" "=" <em>java_integer_literal</em> ";"</td>
- </tr>
- <tr>
- <td></td><td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"STATIC" "=" <em>java_boolean_literal</em> ";"</td>
- </tr>
- <tr>
- <td></td><td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"SUPPORT_CLASS_VISIBILITY_PUBLIC" "=" <em>java_boolean_literal</em> ";"</td>
- </tr>
- <tr>
- <td></td><td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"DEBUG_PARSER" "=" <em>java_boolean_literal</em> ";"</td>
- </tr>
- <tr>
- <td></td><td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"DEBUG_LOOKAHEAD" "=" <em>java_boolean_literal</em> ";"</td>
- </tr>
- <tr>
- <td></td><td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"DEBUG_TOKEN_MANAGER" "=" <em>java_boolean_literal</em> ";"</td>
- </tr>
- <tr>
- <td></td><td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"ERROR_REPORTING" "=" <em>java_boolean_literal</em> ";"</td>
- </tr>
- <tr>
- <td></td><td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"JAVA_UNICODE_ESCAPE" "=" <em>java_boolean_literal</em> ";"</td>
- </tr>
- <tr>
- <td></td><td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"UNICODE_INPUT" "=" <em>java_boolean_literal</em> ";"</td>
- </tr>
- <tr>
- <td></td><td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"IGNORE_CASE" "=" <em>java_boolean_literal</em> ";"</td>
- </tr>
- <tr>
- <td></td><td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"USER_TOKEN_MANAGER" "=" <em>java_boolean_literal</em> ";"</td>
- </tr>
- <tr>
- <td></td><td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"USER_CHAR_STREAM" "=" <em>java_boolean_literal</em> ";"</td>
- </tr>
- <tr>
- <td></td><td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"BUILD_PARSER" "=" <em>java_boolean_literal</em> ";"</td>
- </tr>
- <tr>
- <td></td><td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"BUILD_TOKEN_MANAGER" "=" <em>java_boolean_literal</em> ";"</td>
- </tr>
- <tr>
- <td></td><td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"TOKEN_EXTENDS" "=" <em>java_string_literal</em> ";"</td>
- </tr>
- <tr>
- <td></td><td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"TOKEN_FACTORY" "=" <em>java_string_literal</em> ";"</td>
- </tr>
- <tr>
- <td></td><td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"TOKEN_MANAGER_USES_PARSER" "=" <em>java_boolean_literal</em> ";"</td>
- </tr>
- <tr>
- <td></td><td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"SANITY_CHECK" "=" <em>java_boolean_literal</em> ";"</td>
- </tr>
- <tr>
- <td></td><td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"FORCE_LA_CHECK" "=" <em>java_boolean_literal</em> ";"</td>
- </tr>
- <tr>
- <td></td><td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"COMMON_TOKEN_ACTION" "=" <em>java_boolean_literal</em> ";"</td>
- </tr>
- <tr>
- <td></td><td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"CACHE_TOKENS" "=" <em>java_boolean_literal</em> ";"</td>
- </tr>
- <tr>
- <td></td><td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"OUTPUT_DIRECTORY" "=" <em>java_string_literal</em> ";"</td>
- </tr>
- </table>
- <ul>
- <li>
- <strong><a name="LOOKAHEAD">LOOKAHEAD</a>:</strong>
- The number of tokens to look ahead before making a
- decision at a choice point during parsing. The default value is 1.
- The smaller this number, the faster the parser. This number may be
- overridden for specific productions within the grammar as described
- later. See the description of
- <a href="lookahead.html">the lookahead algorithm</a> for complete
- details on how lookahead works.
- </li>
- <li>
- <strong>CHOICE_AMBIGUITY_CHECK:</strong>
- This is an integer option whose default value is 2.
- This is the number of tokens considered in checking choices of the
- form "A | B | ..." for ambiguity. For example, if there is a common
- two token prefix for both A and B, but no common three token prefix,
- (assume this option is set to 3) then JavaCC can tell you to use a
- lookahead of 3 for disambiguation purposes. And if A and B have a
- common three token prefix, then JavaCC can only tell you that you need to
- have a lookahead of 3 <em>or more</em>. Increasing this can give you more
- comprehensive ambiguity information at the cost of more processing
- time. For large grammars such as the Java grammar, increasing this number
- any further causes the checking to take too much time.
- </li>
- <li>
- <strong>OTHER_AMBIGUITY_CHECK:</strong>
- This is an integer option whose default value is 1.
- This is the number of tokens considered in checking all other kinds of
- choices (i.e., of the forms "(A)*", "(A)+", and "(A)?") for ambiguity.
- This takes more time to do than the choice checking, and hence the
- default value is set to 1 rather than 2.
- </li>
- <li>
- <strong>STATIC:</strong>
- This is a boolean option whose default value is true. If
- true, all methods and class variables are specified as static in the
- generated parser and token manager. This allows only one parser object to be present,
- but it improves the performance of the parser. To perform multiple
- parses during one run of your Java program, you will have to call the
- <a href="apiroutines.html">ReInit()</a>
- method to reinitialize your parser if it is static.
- If the parser is non-static, you may use the "new" operator to
- construct as many parsers as you wish. These can all be used
- simultaneously from different threads.
- </li>
- <li>
- <strong>DEBUG_PARSER:</strong>
- This is a boolean option whose default value is false. This
- option is used to obtain debugging information from the generated
- parser. Setting this option to true causes the parser to generate
- a trace of its actions. Tracing may be disabled by
- calling the method <a href="apiroutines.html">disable_tracing()</a>
- in the generated parser class. Tracing may be subsequently enabled
- by calling the method <a href="apiroutines.html">enable_tracing()</a>
- in the generated parser class.
- </li>
- <li>
- <strong>DEBUG_LOOKAHEAD:</strong>
- This is a boolean option whose default value is false. Setting this
- option to true causes the parser to generate all the tracing information
- it does when the option DEBUG_PARSER is true, and in addition, also
- causes it to generate a trace of actions performed during
- <a href="lookahead.html">lookahead operation</a>.
- </li>
- <li>
- <strong>DEBUG_TOKEN_MANAGER:</strong>
- This is a boolean option whose default value is false. This
- option is used to obtain debugging information from the generated
- token manager. Setting this option to true causes the token manager to generate
- a trace of its actions. This trace is rather large and should only
- be used when you have a lexical error that has been reported to you
- and you cannot understand why. Typically, in this situation, you
- can determine the problem by looking at the last few lines of this trace.
- </li>
- <li>
- <strong>ERROR_REPORTING:</strong>
- This is a boolean option whose default value is
- true. Setting it to false causes parse errors to be
- reported in somewhat less detail. The only reason to set this
- option to false is to improve performance.
- </li>
- <li>
- <strong>JAVA_UNICODE_ESCAPE:</strong>
- This is a boolean option whose default value is
- false. When set to true, the generated parser uses
- an input stream object that processes Java Unicode escapes
- (\u...) before sending characters to the token manager. By
- default, Java Unicode escapes are not processed.
- <br />
- This option is ignored if either of options USER_TOKEN_MANAGER,
- USER_CHAR_STREAM is set to true.
- </li>
- <li>
- <strong>UNICODE_INPUT:</strong>
- This is a boolean option whose default value is
- false. When set to true, the generated parser uses
- an input stream object that reads Unicode files. By default,
- ASCII files are assumed.
- <br />
- This option is ignored if either of
- options USER_TOKEN_MANAGER, USER_CHAR_STREAM is set to true.
- </li>
- <li>
- <strong><a name="IGNORE_CASE">IGNORE_CASE:</a></strong>
- This is a boolean option whose default value is false.
- Setting this option to true causes the generated token manager to ignore
- case in the token specifications and the input files. This is useful
- for writing grammars for languages such as HTML. It is also possible
- to localize the effect of IGNORE_CASE by using
- <a href="#prod10">an alternate mechanism described later</a>.
- </li>
- <li>
- <strong>USER_TOKEN_MANAGER:</strong>
- This is a boolean option whose default value is
- false. The default action is to generate a token manager
- that works on the specified grammar tokens. If this
- option is set to true, then the parser is generated to accept tokens
- from any token manager of type "TokenManager" - this interface
- is generated into the generated parser directory.
- </li>
- <li>
- <strong>SUPPORT_CLASS_VISIBILITY_PUBLIC:</strong>
- This is a boolean option whose default value is
- true. The default action is to generate support classes (such as
- Token.java, ParseException.java, etc.) with <em>public</em> visibility. If
- set to false, the classes will be generated with package-private
- visibility.
- </li>
- <li>
- <strong>USER_CHAR_STREAM:</strong>
- This is a boolean option whose default value is
- false. The default action is to generate a character stream reader
- as specified by the options JAVA_UNICODE_ESCAPE and UNICODE_INPUT.
- The generated token manager receives characters
- from this stream reader. If this option is set to true, then the
- token manager is generated to read characters from any character
- stream reader of type "CharStream". The file "CharStream.java" is generated
- into the generated parser directory.
- <br />
- This option is ignored if USER_TOKEN_MANAGER is set to true.
- </li>
- <li>
- <strong>BUILD_PARSER:</strong>
- This is a boolean option whose default value is true.
- The default action is to generate the parser file ("MyParser.java"
- in the above example). When set to false, the parser file is
- not generated. Typically, this option is set to false when
- you wish to generate only the token manager and use it without
- the associated parser.
- </li>
- <li>
- <strong>BUILD_TOKEN_MANAGER:</strong>
- This is a boolean option whose default value is true.
- The default action is to generate the token manager file
- ("MyParserTokenManager.java" in the above example). When set to
- false, the token manager file is not generated. The only reason
- to set this option to false is to save some time during parser
- generation when you fix problems in the parser part of the grammar
- file and leave the lexical specifications untouched.
- </li>
- <li>
- <strong>TOKEN_MANAGER_USES_PARSER:</strong>
- This is a boolean option whose default value is false.
- When set to true, the generated token manager will include a field
- called <code>parser</code> that references the instantiating parser
- instance (of type <code>MyParser</code> in the above example).
- The main reason for having a parser in a token manager is using
- some of its logic in lexical actions.
- This option has no effect if the STATIC option is set to true.
- </li>
- <li>
- <strong>TOKEN_EXTENDS:</strong>
- This is a string option whose default value is
- "", meaning that the generated Token class will extend
- java.lang.Object. This option may be set to the name of a
- class that will be used as the base class for the generated
- <code>Token</code> class.
- </li>
- <li>
- <strong>TOKEN_FACTORY:</strong>
- This is a string option whose default value is
- "", meaning that Tokens will be created by calling
- <code>Token.newToken()</code>. If set, the option names a
- Token factory class containing a
- <code>public static Token newToken(int ofKind, String image)</code>
- method.
- </li>
- <li>
- <strong>SANITY_CHECK:</strong>
- This is a boolean option whose default value is true.
- JavaCC performs many syntactic and semantic checks on the grammar
- file during parser generation. Some checks such as detection of
- left recursion, detection of ambiguity, and bad usage of empty
- expansions may be suppressed for faster parser generation by
- setting this option to false. Note that the presence of these
- errors (even if they are not detected and reported by setting this
- option to false) can cause unexpected behavior from the generated
- parser.
- </li>
- <li>
- <strong>FORCE_LA_CHECK:</strong>
- This is a boolean option whose default value is false.
- This option setting controls lookahead ambiguity checking performed
- by JavaCC. By default (when this option is false), lookahead
- ambiguity checking is performed for all choice points where the
- default lookahead of 1 is used. Lookahead ambiguity checking is
- not performed at choice points where there is an
- <a href="lookahead.html">explicit lookahead specification</a>,
- or if the option LOOKAHEAD is set to something other than 1.
- Setting this option to true performs lookahead ambiguity checking
- at <em>all</em> choice points regardless of the lookahead specifications
- in the grammar file.
- </li>
- <li>
- <strong>COMMON_TOKEN_ACTION:</strong>
- This is a boolean option whose default value is false.
- When set to true, every call to the token manager's method
- "getNextToken" (<a href="apiroutines.html">see the description of the
- Java Compiler Compiler API</a>) will cause a call to a user-defined
- method "CommonTokenAction" after the token has been scanned in by the
- token manager. The user must define this method within the
- <a href="#prod12">TOKEN_MGR_DECLS</a> section.
- The signature of this method is:
- <pre>
- void CommonTokenAction(Token t)
- </pre>
- </li>
- <li>
- <strong>CACHE_TOKENS:</strong>
- This is a boolean option whose default value is false.
- Setting this option to true causes the generated parser to look ahead for
- extra tokens ahead of time. This facilitates some performance improvements.
- However, in this case (when the option is true), interactive
- applications may not work since the parser needs to work synchronously
- with the availability of tokens from the input stream. In such cases,
- it's best to leave this option at its default value.
- </li>
- <li>
- <strong>OUTPUT_DIRECTORY:</strong>
- This is a string valued option whose default value is the current
- directory. This controls where output files are generated.
- </li>
- </ul>
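- <p>
- As a sketch of how COMMON_TOKEN_ACTION might be used, the following fragment
- counts the tokens handed out by the token manager. The field name
- <code>tokenCount</code> is hypothetical, and STATIC is set to false here on the
- assumption that the declarations can then be ordinary instance members.
- </p>
- <pre>
- options {
-   STATIC = false;
-   COMMON_TOKEN_ACTION = true;
- }
- // ... PARSER_BEGIN/PARSER_END section and other productions omitted ...
- TOKEN_MGR_DECLS :
- {
-   int tokenCount = 0;  // hypothetical counter
-   // Called after each token has been scanned in by the token manager.
-   void CommonTokenAction(Token t) {
-     tokenCount++;
-   }
- }
- </pre>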
- <hr />
- <table>
- <tr>
- <td align="right" valign="baseline"><a name="prod5">production</a></td>
- <td align="center" valign="baseline">::=</td>
- <td align="left" valign="baseline"><a href="#prod9">javacode_production</a></td>
- </tr>
- <tr>
- <td align="right" valign="baseline"></td>
- <td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline"><a href="#prod10">regular_expr_production</a></td>
- </tr>
- <tr>
- <td align="right" valign="baseline"></td>
- <td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline"><a href="#prod11">bnf_production</a></td>
- </tr>
- <tr>
- <td align="right" valign="baseline"></td>
- <td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline"><a href="#prod12">token_manager_decls</a></td>
- </tr>
- </table>
- <p>
- There are four kinds of productions in JavaCC.
- <a href="#prod9">javacode_production</a> and <a href="#prod11">bnf_production</a>
- are used to define the grammar from which the parser is generated.
- <a href="#prod10">regular_expr_production</a> is used to define the grammar
- tokens - the token manager is generated from this information (as well as from
- inline token specifications in the parser grammar).
- <a href="#prod12">token_manager_decls</a> is used to introduce declarations
- that get inserted into the generated token manager.
- </p>
- <hr />
- <table>
- <tr>
- <td align="right" valign="baseline"><a name="prod9">javacode_production</a></td>
- <td align="center" valign="baseline">::=</td>
- <td align="left" valign="baseline">"<a name="JAVACODE">JAVACODE</a>"</td>
- </tr>
- <tr>
- <td></td><td></td>
- <td align="left" valign="baseline"><em>java_access_modifier</em> <em>java_return_type</em> <em>java_identifier</em> "(" <em>java_parameter_list</em> ")"</td>
- </tr>
- <tr>
- <td></td><td></td>
- <td align="left" valign="baseline"><em>java_block</em></td>
- </tr>
- </table>
- <p>
- The JAVACODE production is a way to write Java code for some
- productions instead of the usual EBNF expansion. This is useful when
- there is the need to recognize something that is not context-free
- or for whatever reason is very difficult to write a grammar for.
- An example of the use of JAVACODE is shown below. In this example,
- the non-terminal "skip_to_matching_brace" consumes tokens in the input
- stream all the way up to a matching closing brace (the opening brace
- is assumed to have been just scanned):
- </p>
- <pre>
- JAVACODE
- void skip_to_matching_brace() {
-   <a href="apiroutines.html">Token</a> tok;
-   int nesting = 1;  // one opening brace has already been scanned
-   while (true) {
-     tok = <a href="apiroutines.html">getToken</a>(1);  // peek at the next token
-     if (tok.kind == LBRACE) nesting++;
-     if (tok.kind == RBRACE) {
-       nesting--;
-       if (nesting == 0) break;  // matching brace found; it is left unconsumed
-     }
-     tok = <a href="apiroutines.html">getNextToken</a>();  // consume the token
-   }
- }
- </pre>
- <p>
- Care must be taken when using JAVACODE productions. While you can
- say pretty much what you want with these productions, JavaCC simply
- considers it a black box (that somehow performs its parsing task).
- This becomes a problem when JAVACODE productions appear at
- <a href="lookahead.html">choice points</a>. For example, if the
- above JAVACODE production was referred to from the following production:
- </p>
- <pre>
- void NT() :
- {}
- {
-   skip_to_matching_brace()
- |
-   some_other_production()
- }
- </pre>
- <p>
- Then JavaCC would not know how to choose between the two choices.
- On the other hand, if the JAVACODE production is used at a non-choice
- point as in the following example, there is no problem:
- </p>
- <pre>
- void NT() :
- {}
- {
-   "{" skip_to_matching_brace()
- |
-   "(" parameter_list() ")"
- }
- </pre>
- <p>
- JAVACODE productions at choice points may also be preceded by syntactic or
- semantic LOOKAHEAD, as in this example:
- </p>
- <pre>
- void NT() :
- {}
- {
-   LOOKAHEAD( {errorOccurred} ) skip_to_matching_brace()
- |
-   "(" parameter_list() ")"
- }
- </pre>
- <!-- JavaCC *should* print a warning message, but currently doesn't (see issue 166).
- <p>
- When this issue is fixed we should reinstate this paragraph.
- When JAVACODE productions are used at choice points, JavaCC will
- print a warning message stating this fact. You will then have to
- insert some explicit LOOKAHEAD specifications to help JavaCC. See
- <a href="lookahead.html">the minitutorial on LOOKAHEAD</a> for a
- detailed guide on such issues.
- </p>
- -->
- <p>
- The default access modifier for JAVACODE productions is package private.
- </p>
- <hr />
- <table>
- <tr>
- <td align="right" valign="baseline"><a name="prod11">bnf_production</a></td>
- <td align="center" valign="baseline">::=</td>
- <td align="left" valign="baseline"><em>java_access_modifier</em> <em>java_return_type</em> <em>java_identifier</em> "(" <em>java_parameter_list</em> ")" ":"</td>
- </tr>
- <tr>
- <td></td><td></td>
- <td align="left" valign="baseline"><em>java_block</em></td>
- </tr>
- <tr>
- <td></td><td></td>
- <td align="left" valign="baseline">"{" <a href="#prod16">expansion_choices</a> "}"</td>
- </tr>
- </table>
- <p>
- The BNF production is the standard production used
- in specifying JavaCC grammars. Each BNF production has a left hand
- side which is a non-terminal specification. The BNF production then
- defines this non-terminal in terms of BNF expansions on the right hand
- side. The non-terminal is written exactly like a declared Java method.
- Since each non-terminal is translated into a method
- in the generated parser, this style of writing the non-terminal makes
- this association obvious. The name of the non-terminal is the name of
- the method, and the parameters and return value declared are the means
- to pass values up and down the parse tree. As will be seen later,
- non-terminals on the right hand sides of productions are written as
- method calls, so the passing of values up and down the tree is done
- using exactly the same paradigm as method call and return.
- The default access modifier for BNF productions is public.
- </p>
- <p>
- There are two parts on the right hand side of a BNF production. The
- first part is a set of arbitrary Java declarations and code (the Java
- block). This code is generated at the beginning
- of the method generated for the Java non-terminal. Hence, every time
- this non-terminal is used in the parsing process, these declarations and
- code are executed. The declarations in this part are visible to all Java
- code in actions in the BNF expansions. JavaCC does not do any processing
- of these declarations and code, except to skip to the matching ending
- brace, collecting all text encountered on the way. Hence, a Java compiler
- may still detect errors in this code after JavaCC has processed it without complaint.
- </p>
- <p>
- The second part of the right hand side is the set of BNF expansions. These
- are described <a href="#prod16">later</a>.
- </p>
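- <p>
- A sketch of a BNF production that uses both parts might look like the following.
- It assumes a token named NUMBER has been defined elsewhere in the grammar; the
- non-terminal name "Sum" is hypothetical.
- </p>
- <pre>
- int Sum() :
- {
-   // First part: declarations executed each time Sum() is entered.
-   Token t;
-   int total = 0;
- }
- {
-   // Second part: the BNF expansion, with Java actions in braces.
-   ( t = &lt;NUMBER&gt; { total += Integer.parseInt(t.image); } )*
-   { return total; }
- }
- </pre>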
- <hr />
- <table>
- <tr>
- <td align="right" valign="baseline"><a name="prod10">regular_expr_production</a></td>
- <td align="center" valign="baseline">::=</td>
- <td align="left" valign="baseline">[ <a href="#newprod1">lexical_state_list</a> ]</td>
- </tr>
- <tr>
- <td></td><td></td>
- <td align="left" valign="baseline"><a href="#prod17">regexpr_kind</a> [ "[" "IGNORE_CASE" "]" ] ":"</td>
- </tr>
- <tr>
- <td></td><td></td>
- <td align="left" valign="baseline">"{" <a href="#prod18">regexpr_spec</a> ( "|" <a href="#prod18">regexpr_spec</a> )* "}"</td>
- </tr>
- </table>
- <p>
- A regular expression production is used to define lexical entities
- that get processed by the generated token manager. A detailed description
- of how the token manager works is provided in
- <a href="tokenmanager.html">the minitutorial on the token manager</a>. This
- page describes the syntactic aspects of specifying lexical entities,
- while <a href="tokenmanager.html">the minitutorial</a> describes how
- these syntactic constructs tie in with how the token manager actually
- works.
- </p>
- <p>
- A regular expression production starts with a specification of the
- lexical states to which it applies (the
- <a href="#newprod1">lexical state list</a>).
- There is a standard lexical state called "DEFAULT". If the
- <a href="#newprod1">lexical state list</a> is omitted, the regular
- expression production applies to the lexical state "DEFAULT".
- </p>
- <p>
- Following this is a description of what kind of regular expression
- production this is (<a href="#prod17">see below for what this means</a>).
- </p>
- <p>
- After this is an optional "[IGNORE_CASE]". If this is present, the
- regular expression production is case insensitive - it has the same
- effect as the
- <a href="#prod6">IGNORE_CASE</a>
- option, except that in this case it applies locally to this regular
- expression production.
- </p>
- <p>
- This is then followed by a list of regular expression specifications
- that describe in more detail the lexical entities of this regular
- expression production.
- </p>
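- <p>
- As an illustrative sketch (the token names "SELECT" and "FROM" are
- hypothetical), the following regular expression production applies to the
- lexical state "DEFAULT", is of kind TOKEN, and matches its keywords without
- regard to case:
- </p>
- <pre>
- &lt;DEFAULT&gt; TOKEN [IGNORE_CASE] :
- {
-   &lt; SELECT: "select" &gt;
- | &lt; FROM: "from" &gt;
- }
- </pre>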
- <hr />
- <table>
- <tr>
- <td align="right" valign="baseline"><a name="prod12">token_manager_decls</a></td>
- <td align="center" valign="baseline">::=</td>
- <td align="left" valign="baseline">"<a name="TOKEN_MGR_DECLS">TOKEN_MGR_DECLS</a>" ":" <em>java_block</em></td>
- </tr>
- </table>
- <p>
- The token manager declarations start with the reserved word
- "TOKEN_MGR_DECLS" followed by a ":" and then a set of Java declarations
- and statements (the Java block). These declarations and statements are
- written into the generated token manager and are accessible from within
- <a href="#prod18">lexical actions</a>. See
- <a href="tokenmanager.html">the minitutorial on the token manager</a>
- for more details.
- </p>
- <p>
- There can only be one token manager declaration in a JavaCC grammar file.
- </p>
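- <p>
- A small sketch of how such declarations might be used from a lexical action
- is shown below. The field "identifierCount" and the token "ID" are hypothetical
- names. The sketch assumes the STATIC option is set to false; with a static
- token manager the field would presumably need to be declared static as well.
- </p>
- <pre>
- TOKEN_MGR_DECLS :
- {
-   // Hypothetical counter, written into the generated token manager
-   int identifierCount = 0;
- }
- TOKEN :
- {
-   // The lexical action in braces can access the declaration above
-   &lt; ID: ["a"-"z"] ( ["a"-"z","0"-"9"] )* &gt; { identifierCount++; }
- }
- </pre>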
- <hr />
- <table>
- <tr>
- <td align="right" valign="baseline"><a name="newprod1">lexical_state_list</a></td>
- <td align="center" valign="baseline">::=</td>
- <td align="left" valign="baseline">"&lt;" "*" "&gt;"</td>
- </tr>
- <tr>
- <td></td><td>|</td>
- <td align="left" valign="baseline">"&lt;" <em>java_identifier</em> ( "," <em>java_identifier</em> )* "&gt;"</td>
- </tr>
- </table>
- <p>
- The lexical state list describes the set of lexical states to which
- the corresponding <a href="#prod10">regular expression production</a>
- applies. If this is written as "&lt;*&gt;", the regular expression production
- applies to all lexical states. Otherwise, it applies to all the lexical
- states in the identifier list within the angle brackets.
- </p>
- <hr />
- <table>
- <tr>
- <td align="right" valign="baseline"><a name="prod17">regexpr_kind</a></td>
- <td align="center" valign="baseline">::=</td>
- <td align="left" valign="baseline">"TOKEN"</td>
- </tr>
- <tr>
- <td align="right" valign="baseline"></td>
- <td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"SPECIAL_TOKEN"</td>
- </tr>
- <tr>
- <td align="right" valign="baseline"></td>
- <td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"SKIP"</td>
- </tr>
- <tr>
- <td align="right" valign="baseline"></td>
- <td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"MORE"</td>
- </tr>
- </table>
- <p>
- This specifies the kind of
- <a href="#prod10">regular expression production</a>.
- There are four kinds:
- </p>
- <ul>
- <li>
- <strong><a name="TOKEN">TOKEN</a></strong>:
- The regular expressions in this regular expression production describe
- <em>tokens</em> in the grammar. The token manager creates a
- <a href="apiroutines.html">Token</a> object for each match of such
- a regular expression and returns it to the parser.
- </li>
- <li>
- <strong><a name="SPECIAL_TOKEN">SPECIAL_TOKEN</a></strong>:
- The regular expressions in this regular expression production describe
- <em>special tokens</em>. Special tokens are like tokens, except that
- they do not have significance during parsing - that is, the BNF productions
- ignore them. Special tokens are, however, still passed on to the parser
- so that parser actions can access them. Special tokens are passed
- to the parser by linking them to neighboring real tokens using the
- field "specialToken" in the <a href="apiroutines.html">Token</a>
- class. Special tokens are useful in the processing of lexical entities
- such as comments which have no significance to parsing, but still
- are an important part of the input file. See
- <a href="tokenmanager.html">the minitutorial on the token manager</a>
- for more details of special token handling.
- </li>
- <li>
- <strong><a name="SKIP">SKIP</a></strong>:
- Matches to regular expressions in this regular expression production
- are simply skipped (ignored) by the token manager.
- </li>
- <li>
- <strong><a name="MORE">MORE</a></strong>:
- Sometimes it is useful to gradually build up a token to be passed on
- to the parser. Matches to this kind of regular expression are stored
- in a buffer until the next TOKEN or SPECIAL_TOKEN match. Then all
- the matches in the buffer and the final TOKEN/SPECIAL_TOKEN match
- are concatenated together to form one TOKEN/SPECIAL_TOKEN that is
- passed on to the parser. If a match to a SKIP regular expression
- follows a sequence of MORE matches, the contents of the buffer are
- discarded.
- </li>
- </ul>
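- <p>
- The kinds described above can be combined, as in the following sketch for
- handling block comments. The lexical state "IN_COMMENT" and the token name
- "COMMENT" are hypothetical. White space is skipped, the text of a comment is
- accumulated through MORE matches, and the closing delimiter completes a
- single special token that carries the whole comment.
- </p>
- <pre>
- SKIP :
- {
-   " " | "\t" | "\n" | "\r"
- }
- // Start of a comment: buffer the match and switch lexical state
- MORE :
- {
-   "/*" : IN_COMMENT
- }
- // Inside a comment: buffer any single character
- &lt;IN_COMMENT&gt; MORE :
- {
-   &lt; ~[] &gt;
- }
- // End of a comment: emit everything buffered so far as one special token
- &lt;IN_COMMENT&gt; SPECIAL_TOKEN :
- {
-   &lt; COMMENT: "*/" &gt; : DEFAULT
- }
- </pre>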
- <hr />
- <table>
- <tr>
- <td align="right" valign="baseline"><a name="prod18">regexpr_spec</a></td>
- <td align="center" valign="baseline">::=</td>
- <td align="left" valign="baseline"><a href="#prod19">regular_expression</a> [ <em>java_block</em> ] [ ":" <em>java_identifier</em> ]</td>
- </tr>
- </table>
- <p>
- The regular expression specification begins the actual description
- of the lexical entities that are part of this
- <a href="#prod10">regular expression production</a>.
- Each regular expression production may contain any number of
- regular expression specifications.
- </p>
- <p>
- Each regular expression specification contains a regular expression
- followed by a Java block (the lexical action) which is optional.
- This is then followed by an identifier of a lexical state (which
- is also optional). Whenever this regular expression is matched,
- the lexical action (if any) gets executed, followed by any
- <a href="#prod6">common token actions</a>. Then the action depending
- on the
- <a href="#prod17">regular expression production kind</a>
- is taken. Finally, if a lexical state is specified, the token
- manager moves to that lexical state for further processing (the
- token manager starts initially in the state "DEFAULT").
- </p>
- <hr />
- <table>
- <tr>
- <td align="right" valign="baseline"><a name="prod16">expansion_choices</a></td>
- <td align="center" valign="baseline">::=</td>
- <td align="left" valign="baseline"><a href="#prod20">expansion</a> ( "|" <a href="#prod20">expansion</a> )*</td>
- </tr>
- </table>
- <p>
- Expansion choices are written as a list of one or more expansions
- separated by "|"s. A legal parse of the expansion choices is a legal
- parse of any one of the contained expansions.
- </p>
- <hr />
- <table>
- <tr>
- <td align="right" valign="baseline"><a name="prod20">expansion</a></td>
- <td align="center" valign="baseline">::=</td>
- <td align="left" valign="baseline">( <a href="#prod22">expansion_unit</a> )*</td>
- </tr>
- </table>
- <p>
- An expansion is written as a sequence of expansion units.
- A concatenation of legal
- parses of the expansion units is a legal parse of the expansion.
- </p>
- <p>
- For example, the expansion "{" decls() "}" consists of three expansion
- units - "{", decls(), and "}". A match for the expansion is a concatenation
- of the matches for the individual expansion units - in this case, that would
- be any string that begins with a "{", ends with a "}", and contains a match
- for decls() in between.
- </p>
- <hr />
- <table>
- <tr>
- <td align="right" valign="baseline"><a name="prod22">expansion_unit</a></td>
- <td align="center" valign="baseline">::=</td>
- <td align="left" valign="baseline"><a href="#prod21">local_lookahead</a></td>
- </tr>
- <tr>
- <td align="right" valign="baseline"></td>
- <td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline"><em>java_block</em></td>
- </tr>
- <tr>
- <td align="right" valign="baseline"></td>
- <td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"(" <a href="#prod16">expansion_choices</a> ")" [ "+" | "*" | "?" ]</td>
- </tr>
- <tr>
- <td align="right" valign="baseline"></td>
- <td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"[" <a href="#prod16">expansion_choices</a> "]"</td>
- </tr>
- <tr>
- <td align="right" valign="baseline"></td>
- <td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">[ <em>java_assignment_lhs</em> "=" ] <a href="#prod19">regular_expression</a></td>
- </tr>
- <tr>
- <td align="right" valign="baseline"></td>
- <td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">[ <em>java_assignment_lhs</em> "=" ] <em>java_identifier</em> "(" <em>java_expression_list</em> ")"</td>
- </tr>
- </table>
- <p>
- An expansion unit can be a <a href="#prod21">local LOOKAHEAD specification</a>.
- This instructs the
- generated parser on how to make choices at choice points. For details
- on how LOOKAHEAD specifications work and how to write LOOKAHEAD specifications,
- <a href="lookahead.html">click here to visit the minitutorial on LOOKAHEAD</a>.
- </p>
- <p>
- An expansion unit can be a set of Java declarations and code enclosed
- within braces (the Java block). These are also called <em>parser
- actions</em>. This is generated into the method parsing the
- non-terminal at the appropriate location. This block is executed
- whenever the parsing process crosses this point successfully.
- When JavaCC processes the Java block, it does not perform any detailed
- syntax or semantic checking. Hence it is possible that the Java compiler
- will find errors in your actions that have been processed by JavaCC.
- <em>Actions are not executed during
- <a href="lookahead.html">lookahead evaluation</a>.</em>
- </p>
- <p>
- An expansion unit can be a parenthesized set of one or more
- <a href="#prod16">expansion choices</a>. In this case, a legal parse of the expansion
- unit is any legal parse of the nested expansion choices.
- The parenthesized set of expansion choices can be suffixed (optionally) by:
- </p>
- <ul>
- <li>
- <strong>"+":</strong>
- Then any legal parse of the expansion unit is one or more
- repetitions of a legal parse of the parenthesized set of
- expansion choices.
- </li>
- <li>
- <strong>"*":</strong>
- Then any legal parse of the expansion unit is zero or more
- repetitions of a legal parse of the parenthesized set of
- expansion choices.
- </li>
- <li>
- <strong>"?":</strong>
- Then a legal parse of the expansion unit is either the
- empty token sequence or any legal parse of the nested expansion choices.
- An alternate syntax for this construct is to enclose the
- expansion choices within brackets "[...]".
- </li>
- </ul>
- <p>
- An expansion unit can be a <a href="#prod19">regular expression</a>. Then a legal parse
- of the expansion unit is any token that matches this regular
- expression. When a regular expression is matched, it creates an
- object of type <a href="apiroutines.html">Token</a>. This object
- can be accessed by assigning it to a variable by prefixing the
- regular expression with "variable =". In general, you may have any
- valid Java assignment left-hand side to the left of the "=".
- <em>This assignment is not performed during
- <a href="lookahead.html">lookahead evaluation</a>.</em>
- </p>
- <p>
- An expansion unit can be a non-terminal (the last choice in the syntax
- above). In this case, it takes
- the form of a method call with the non-terminal name used as the
- name of the method. A successful parse of the non-terminal causes
- the parameters placed in the method call to be operated on and a
- value returned (in case the non-terminal was not declared to be
- of type "void"). The return value can be assigned (optionally) to
- a variable by prefixing the non-terminal call with "variable =".
- In general, you may have any
- valid Java assignment left-hand side to the left of the "=".
- <em>This assignment is not performed during
- <a href="lookahead.html">lookahead evaluation</a>.</em>
- Non-terminals may not be used in an expansion in a manner that introduces
- left-recursion. JavaCC checks this for you.
- </p>
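- <p>
- The following sketch combines several of the expansion unit forms described
- above in one production. All names in it ("Declaration", "TypeName",
- "Expression", and the token "ID") are hypothetical and assumed to be defined
- elsewhere in the grammar, with "TypeName" declared to return a String.
- </p>
- <pre>
- void Declaration() :
- {
-   String type;
-   Token name;
- }
- {
-   type = TypeName()          // non-terminal; its return value is assigned
-   name = &lt;ID&gt;                // regular expression; the matched Token is assigned
-   ( "," &lt;ID&gt; )*              // zero or more repetitions
-   [ "=" Expression() ]       // optional; same as ( "=" Expression() )?
-   ";"
-   { System.out.println(type + " " + name.image); }   // parser action
- }
- </pre>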
- <hr />
- <table>
- <tr>
- <td align="right" valign="baseline"><a name="prod21">local_lookahead</a></td>
- <td align="center" valign="baseline">::=</td>
- <td align="left" valign="baseline">"LOOKAHEAD" "(" [ <em>java_integer_literal</em> ] [ "," ] [ <a href="#prod16">expansion_choices</a> ] [ "," ] [ "{" <em>java_expression</em> "}" ] ")"</td>
- </tr>
- </table>
- <p>
- A local lookahead specification is used to influence the way the generated
- parser makes choices at the various
- <a href="lookahead.html">choice points</a>
- in the grammar. A local lookahead specification starts with the reserved
- word "LOOKAHEAD" followed by a set of lookahead constraints within parentheses.
- There are three different kinds of lookahead constraints - a lookahead limit
- (the integer literal), a syntactic lookahead (the expansion choices), and
- a semantic lookahead (the expression within braces). At least one lookahead
- constraint must be present. If more than one lookahead constraint is present,
- they must be separated by commas.
- </p>
- <p>
- For a detailed description of how lookahead works, please
- <a href="lookahead.html">click here to visit the minitutorial on LOOKAHEAD</a>.
- A brief description of each kind of lookahead constraint is given below:
- </p>
- <ul>
- <li>
- <strong>Lookahead Limit:</strong>
- This is the maximum number of tokens of lookahead that may be used for choice
- determination purposes. This overrides the default value which is specified
- by the <a href="#prod2">LOOKAHEAD option</a>. This lookahead limit applies
- only to the <a href="lookahead.html">choice point</a>
- at the location of the local lookahead specification.
- If the local lookahead specification is not at a choice point, the lookahead
- limit (if any) is ignored.
- </li>
- <li>
- <strong>Syntactic Lookahead:</strong>
- This is an expansion (or expansion choices) that is used for the purpose of
- determining whether or not the particular choice that this local lookahead
- specification applies to is to be taken. If it is not provided, the parser
- uses the expansion to be selected itself for lookahead determination.
- If the local lookahead specification is not at a
- <a href="lookahead.html">choice point</a>, the syntactic
- lookahead (if any) is ignored.
- </li>
- <li>
- <strong>Semantic Lookahead:</strong>
- This is a boolean expression that is evaluated whenever the parser crosses this
- point during parsing. If the expression evaluates to true, the parsing
- continues normally. If the expression evaluates to false and the local
- lookahead specification is at a <a href="lookahead.html">choice point</a>,
- the current choice is not taken and the next choice is considered.
- If the expression evaluates to false and the local lookahead specification
- is <em>not</em> at a choice point, then parsing aborts with a parse error.
- Unlike the other two lookahead constraints that are ignored at non-choice
- points, semantic lookahead is always evaluated. In fact, semantic lookahead
- is even evaluated if it is encountered during the evaluation of some other
- syntactic lookahead check (for more details
- <a href="lookahead.html">click here to visit the minitutorial on LOOKAHEAD</a>).
- </li>
- </ul>
- <p>
- <strong>Default values for lookahead constraints:</strong>
- If a local lookahead specification has been provided, but not all lookahead
- constraints have been included, then the missing ones are assigned default
- values as follows:
- </p>
- <ul>
- <li>
- If the lookahead limit is not provided and if the syntactic lookahead is
- provided, then the lookahead limit defaults to the largest integer value
- (2147483647). This essentially implements "infinite lookahead" - namely,
- look ahead as many tokens as necessary to match the syntactic lookahead that
- has been provided.
- </li>
- <li>
- If neither the lookahead limit nor the syntactic lookahead has been
- provided (which means the semantic lookahead is provided), the lookahead
- limit defaults to 0. This means that syntactic lookahead is not performed
- (it passes trivially), and only semantic lookahead is performed.
- </li>
- <li>
- If the syntactic lookahead is not provided, it defaults to the choice
- to which the local lookahead specification applies. If the local lookahead
- specification is not at a choice point, then the syntactic lookahead is
- ignored - hence a default value is not relevant.
- </li>
- <li>
- If the semantic lookahead is not provided, it defaults to the boolean
- expression "true". That is, it trivially passes.
- </li>
- </ul>
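- <p>
- The constraints and their defaults can be illustrated with the sketch below.
- Every non-terminal in it, the token "ID", the helper method "isMacroName", and
- the boolean field "allowCasts" are hypothetical and assumed to be defined
- elsewhere; "getToken" is assumed to be the method of the generated parser
- that returns the token at the given lookahead position.
- </p>
- <pre>
- void Statement() :
- {}
- {
-   // Lookahead limit only: use at most 2 tokens to decide on this choice.
-   LOOKAHEAD(2) LabeledStatement()
- |
-   // Syntactic lookahead only: the limit defaults to 2147483647.
-   LOOKAHEAD( Type() &lt;ID&gt; ) Declaration()
- |
-   // Semantic lookahead only: the limit defaults to 0.
-   LOOKAHEAD( { isMacroName(getToken(1).image) } ) MacroCall()
- |
-   // All three constraints, separated by commas.
-   LOOKAHEAD(3, Type() "(", { allowCasts }) CastStatement()
- |
-   Expression() ";"
- }
- </pre>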
- <hr />
- <table>
- <tr>
- <td align="right" valign="baseline"><a name="prod19">regular_expression</a></td>
- <td align="center" valign="baseline">::=</td>
- <td align="left" valign="baseline"><em>java_string_literal</em></td>
- </tr>
- <tr>
- <td align="right" valign="baseline"></td>
- <td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"&lt;" [ [ "#" ] <em>java_identifier</em> ":" ] <a href="#prod29">complex_regular_expression_choices</a> "&gt;"</td>
- </tr>
- <tr>
- <td align="right" valign="baseline"></td>
- <td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"&lt;" <em>java_identifier</em> "&gt;"</td>
- </tr>
- <tr>
- <td align="right" valign="baseline"></td>
- <td align="center" valign="baseline">|</td>
- <td align="left" valign="baseline">"&lt;" "EOF" "&gt;"</td>
- </tr>
- </table>
- <p>
- There are two places in a grammar file where regular expressions may be
- written:
- </p>
- <ul>
- <li>
- Within a <a href="#prod18">regular expression specification</a>
- (part of a <a href="#prod10">regular expression production</a>),
- </li>
- <li>
- As an <a href="#prod22">expansion unit</a> within an <a href="#prod20">expansion</a>.
- When a regular expression is used in this manner, it is as if the regular expression
- were defined in the following manner at this location and then referred to by its
- label from the expansion unit:
- <pre>
- &lt;DEFAULT&gt; TOKEN :
- {
- regular expression
- }
- </pre>
- That is, this usage of regular expression can be rewritten using the other
- kind of usage.
- </li>
- </ul>
- <p>
- The complete details of regular expression matching by the token manager are
- available in
- <a href="tokenmanager.html">the minitutorial on the token manager</a>. The
- description of the syntactic constructs follows.
- </p>
- <p>
- The first kind of regular expression is a string literal. The input being
- parsed matches this regular expression if the token manager is in a
- <a href="#prod10">lexical state</a> for which this regular expression applies
- and the next set of characters in the input stream is the same (possibly with
- case ignored) as this string literal.
- </p>
- <p>
- A regular expression may