/testability-explorer/src/main/antlr/java15.g

http://testability-explorer.googlecode.com/
Possible License(s): Apache-2.0
  1. /* Java 1.5 Recognizer
  2. *
  3. * Run 'java Main [-showtree] directory-full-of-java-files'
  4. *
  5. * [The -showtree option pops up a Swing frame that shows
  6. * the AST constructed from the parser.]
  7. *
  8. * Run 'java Main <directory full of java files>'
  9. *
  10. * Contributing authors:
  11. * John Mitchell johnm@non.net
  12. * Terence Parr parrt@magelang.com
  13. * John Lilley jlilley@empathy.com
  14. * Scott Stanchfield thetick@magelang.com
  15. * Markus Mohnen mohnen@informatik.rwth-aachen.de
  16. * Peter Williams pete.williams@sun.com
  17. * Allan Jacobs Allan.Jacobs@eng.sun.com
  18. * Steve Messick messick@redhills.com
  19. * John Pybus john@pybus.org
  20. *
  21. * Version 1.00 December 9, 1997 -- initial release
  22. * Version 1.01 December 10, 1997
  23. * fixed bug in octal def (0..7 not 0..8)
  24. * Version 1.10 August 1998 (parrt)
  25. * added tree construction
  26. * fixed definition of WS,comments for mac,pc,unix newlines
  27. * added unary plus
  28. * Version 1.11 (Nov 20, 1998)
  29. * Added "shutup" option to turn off last ambig warning.
  30. * Fixed inner class def to allow named class defs as statements
  31. * synchronized requires compound not simple statement
  32. * add [] after builtInType DOT class in primaryExpression
  33. * "const" is reserved but not valid..removed from modifiers
  34. * Version 1.12 (Feb 2, 1999)
  35. * Changed LITERAL_xxx to xxx in tree grammar.
  36. * Updated java.g to use tokens {...} now for 2.6.0 (new feature).
  37. *
  38. * Version 1.13 (Apr 23, 1999)
  39. * Didn't have (stat)? for else clause in tree parser.
  40. * Didn't gen ASTs for interface extends. Updated tree parser too.
  41. * Updated to 2.6.0.
  42. * Version 1.14 (Jun 20, 1999)
  43. * Allowed final/abstract on local classes.
  44. * Removed local interfaces from methods
  45. * Put instanceof precedence where it belongs...in relationalExpr
  46. * It also had expr not type as arg; fixed it.
  47. * Missing ! on SEMI in classBlock
  48. * fixed: (expr) + "string" was parsed incorrectly (+ as unary plus).
  49. * fixed: didn't like Object[].class in parser or tree parser
  50. * Version 1.15 (Jun 26, 1999)
  51. * Screwed up rule with instanceof in it. :( Fixed.
  52. * Tree parser didn't like (expr).something; fixed.
  53. * Allowed multiple inheritance in tree grammar. oops.
  54. * Version 1.16 (August 22, 1999)
  55. * Extending an interface built a wacky tree: had extra EXTENDS.
  56. * Tree grammar didn't allow multiple superinterfaces.
  57. * Tree grammar didn't allow empty var initializer: {}
  58. * Version 1.17 (October 12, 1999)
  59. * ESC lexer rule allowed 399 max not 377 max.
  60. * java.tree.g didn't handle the expression of synchronized
  61. * statements.
  62. * Version 1.18 (August 12, 2001)
  63. * Terence updated to Java 2 Version 1.3 by
  64. * observing/combining work of Allan Jacobs and Steve
  65. * Messick. Handles 1.3 src. Summary:
  66. * * primary didn't include boolean.class kind of thing
  67. * * constructor calls parsed explicitly now:
  68. * see explicitConstructorInvocation
  69. * * add strictfp modifier
  70. * * missing objBlock after new expression in tree grammar
  71. * * merged local class definition alternatives, moved after declaration
  72. * * fixed problem with ClassName.super.field
  73. * * reordered some alternatives to make things more efficient
  74. * * long and double constants were not differentiated from int/float
  75. * * whitespace rule was inefficient: matched only one char
  76. * * add an examples directory with some nasty 1.3 cases
  77. * * made Main.java use buffered IO and a Reader for Unicode support
  78. * * supports UNICODE?
  79. * Using Unicode charVocabulary makes code file big, but only
  80. * in the bitsets at the end. I need to make ANTLR generate
  81. * unicode bitsets more efficiently.
  82. * Version 1.19 (April 25, 2002)
  83. * Terence added in nice fixes by John Pybus concerning floating
  84. * constants and problems with super() calls. John did a nice
  85. * reorg of the primary/postfix expression stuff to read better
  86. * and makes f.g.super() parse properly (it was METHOD_CALL not
  87. * a SUPER_CTOR_CALL). Also:
  88. *
  89. * * "finally" clause was a root...made it a child of "try"
  90. * * Added stuff for asserts too for Java 1.4, but *commented out*
  91. * as it is not backward compatible.
  92. *
  93. * Version 1.20 (October 27, 2002)
  94. *
  95. * Terence ended up reorging John Pybus' stuff to
  96. * remove some nondeterminisms and some syntactic predicates.
  97. * Note that the grammar is stricter now; e.g., this(...) must
  98. * be the first statement.
  99. *
  100. * Trinary ?: operator wasn't working as array name:
  101. * (isBig ? bigDigits : digits)[i];
  102. *
  103. * Checked parser/tree parser on source for
  104. * Resin-2.0.5, jive-2.1.1, jdk 1.3.1, Lucene, antlr 2.7.2a4,
  105. * and the 110k-line jGuru server source.
  106. *
  107. * Version 1.21 (October 17, 2003)
  108. * Fixed lots of problems including:
  109. * Ray Waldin: add typeDefinition to interfaceBlock in java.tree.g
  110. * He found a problem/fix with floating point that start with 0
  111. * Ray also fixed problem that (int.class) was not recognized.
  112. * Thorsten van Ellen noticed that \n are allowed incorrectly in strings.
  113. * TJP fixed CHAR_LITERAL analogously.
  114. *
  115. * Version 1.21.2 (March, 2003)
  116. * Changes by Matt Quail to support generics (as per JDK1.5/JSR14)
  117. * Notes:
  118. * * We only allow the "extends" keyword and not the "implements"
  119. * keyword, since that's what JSR14 seems to imply.
  120. * * Thanks to Monty Zukowski for his help on the antlr-interest
  121. * mail list.
  122. * * Thanks to Alan Eliasen for testing the grammar over his
  123. * Fink source base
  124. *
  125. * Version 1.22 (July, 2004)
  126. * Changes by Michael Studman to support Java 1.5 language extensions
  127. * Notes:
  128. * * Added support for annotation types
  129. * * Finished off Matt Quail's generics enhancements to support bound type arguments
  130. * * Added support for new for statement syntax
  131. * * Added support for static import syntax
  132. * * Added support for enum types
  133. * * Tested against JDK 1.5 source base and source base of jdigraph project
  134. * * Thanks to Matt Quail for doing the hard part by doing most of the generics work
  135. *
  136. * Version 1.22.1 (July 28, 2004)
  137. * Bug/omission fixes for Java 1.5 language support
  138. * * Fixed tree structure bug with classOrInterface - thanks to Pieter Vangorpto for
  139. * spotting this
  140. * * Fixed bug where incorrect handling of SR and BSR tokens would cause type
  141. * parameters to be recognised as type arguments.
  142. * * Enabled type parameters on constructors, annotations on enum constants
  143. * and package definitions
  144. * * Fixed problems when parsing if ((char.class.equals(c))) {} - solution by Matt Quail at Cenqua
  145. *
  146. * Version 1.22.2 (July 28, 2004)
  147. * Slight refactoring of Java 1.5 language support
  148. * * Refactored for/"foreach" productions so that original literal "for" literal
  149. * is still used but the for sub-clauses vary by token type
  150. * * Fixed bug where type parameter was not included in generic constructor's branch of AST
  151. *
  152. * Version 1.22.3 (August 26, 2004)
  153. * Bug fixes as identified by Michael Stahl; clean up of tabs/spaces
  154. * and other refactorings
  155. * * Fixed typeParameters omission in identPrimary and newStatement
  156. * * Replaced GT reconciliation code with simple semantic predicate
  157. * * Adapted enum/assert keyword checking support from Michael Stahl's java15 grammar
  158. * * Refactored typeDefinition production and field productions to reduce duplication
  159. *
  160. * Version 1.22.4 (October 21, 2004)
  161. * Small bug fixes
  162. * * Added typeArguments to explicitConstructorInvocation, e.g. new <String>MyParameterised()
  163. * * Added typeArguments to postfixExpression productions for anonymous inner class super
  164. * constructor invocation, e.g. new Outer().<String>super()
  165. * * Fixed bug in array declarations identified by Geoff Roy
  166. *
  167. * Version 1.22.5 (January 03, 2005)
  168. * Small change to tree structure
  169. * * Flattened classOrInterfaceType tree so IDENT no longer has children. TYPE_ARGUMENTS are now
  170. * always siblings of IDENT rather than children. Fully.qualified.names trees now
  171. * look a little less clean when TYPE_ARGUMENTS are present though.
  172. *
  173. * This grammar is in the PUBLIC DOMAIN
  174. */
  175. header {
  176. package com.google.test.metric.javasrc;
  177. }
  178. class JavaRecognizer extends Parser;
  179. options {
  180. k = 2; // two token lookahead
  181. exportVocab=Java; // Call its vocabulary "Java"
  182. codeGenMakeSwitchThreshold = 2; // Some optimizations
  183. codeGenBitsetTestThreshold = 3;
  184. defaultErrorHandler = false; // Don't generate parser error handlers
  185. buildAST = true;
  186. }
  187. tokens {
  188. BLOCK; MODIFIERS; OBJBLOCK; SLIST; CTOR_DEF; METHOD_DEF; VARIABLE_DEF;
  189. INSTANCE_INIT; STATIC_INIT; TYPE; CLASS_DEF; INTERFACE_DEF;
  190. PACKAGE_DEF; ARRAY_DECLARATOR; EXTENDS_CLAUSE; IMPLEMENTS_CLAUSE;
  191. PARAMETERS; PARAMETER_DEF; LABELED_STAT; TYPECAST; INDEX_OP;
  192. POST_INC; POST_DEC; METHOD_CALL; EXPR; ARRAY_INIT;
  193. IMPORT; UNARY_MINUS; UNARY_PLUS; CASE_GROUP; ELIST; FOR_INIT; FOR_CONDITION;
  194. FOR_ITERATOR; EMPTY_STAT; FINAL="final"; ABSTRACT="abstract";
  195. STRICTFP="strictfp"; SUPER_CTOR_CALL; CTOR_CALL; VARIABLE_PARAMETER_DEF;
  196. STATIC_IMPORT; ENUM_DEF; ENUM_CONSTANT_DEF; FOR_EACH_CLAUSE; ANNOTATION_DEF; ANNOTATIONS;
  197. ANNOTATION; ANNOTATION_MEMBER_VALUE_PAIR; ANNOTATION_FIELD_DEF; ANNOTATION_ARRAY_INIT;
  198. TYPE_ARGUMENTS; TYPE_ARGUMENT; TYPE_PARAMETERS; TYPE_PARAMETER; WILDCARD_TYPE;
  199. TYPE_UPPER_BOUNDS; TYPE_LOWER_BOUNDS;
  200. }
  201. {
  202. /**
  203. * Counts the number of LT seen in the typeArguments production.
  204. * It is used in semantic predicates to ensure we have seen
  205. * enough closing '>' characters; which actually may have been
  206. * either GT, SR or BSR tokens.
  207. */
  208. private int ltCounter = 0;
  209. }
  210. // Compilation Unit: In Java, this is a single file. This is the start
  211. // rule for this parser
  212. compilationUnit
  213. : // A compilation unit starts with an optional package definition
  214. ( (annotations "package")=> packageDefinition
  215. | /* nothing */
  216. )
  217. // Next we have a series of zero or more import statements
  218. ( importDefinition )*
  219. // Wrapping things up with any number of class or interface
  220. // definitions
  221. ( typeDefinition )*
  222. EOF!
  223. ;
  224. // Package statement: optional annotations followed by "package" then the package identifier.
  225. packageDefinition
  226. options {defaultErrorHandler = true;} // let ANTLR handle errors
  227. : annotations p:"package"^ {#p.setType(PACKAGE_DEF);} identifier SEMI!
  228. ;
  229. // Import statement: import followed by a package or class name
  230. importDefinition
  231. options {defaultErrorHandler = true;}
  232. { boolean isStatic = false; }
  233. : i:"import"^ {#i.setType(IMPORT);} ( "static"! {#i.setType(STATIC_IMPORT);} )? identifierStar SEMI!
  234. ;
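// Illustration (a hedged sketch, not part of the original grammar): assuming the
// usual ANTLR 2 behaviour of the ^ root operator, the two import forms should
// produce trees shaped roughly like this:
//   import java.util.List;           =>  #(IMPORT #(DOT #(DOT java util) List))
//   import static java.lang.Math.*;  =>  #(STATIC_IMPORT #(DOT #(DOT #(DOT java lang) Math) *))
// The "static" keyword is suffixed with ! and so is dropped from the tree; only
// the token type of the root records that the import was static.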
  235. // A type definition is either a class, interface, enum or annotation with possible additional semis.
  236. typeDefinition
  237. options {defaultErrorHandler = true;}
  238. : m:modifiers!
  239. typeDefinitionInternal[#m]
  240. | SEMI!
  241. ;
  242. // Protected type definitions production for reuse in other productions
  243. protected typeDefinitionInternal[AST mods]
  244. : classDefinition[#mods] // inner class
  245. | interfaceDefinition[#mods] // inner interface
  246. | enumDefinition[#mods] // inner enum
  247. | annotationDefinition[#mods] // inner annotation
  248. ;
  249. // A declaration is the creation of a reference or primitive-type variable
  250. // Create a separate Type/Var tree for each var in the var list.
  251. declaration!
  252. : m:modifiers t:typeSpec[false] v:variableDefinitions[#m,#t]
  253. {#declaration = #v;}
  254. ;
  255. // A type specification is a type name with possible brackets afterwards
  256. // (which would make it an array type).
  257. typeSpec[boolean addImagNode]
  258. : classTypeSpec[addImagNode]
  259. | builtInTypeSpec[addImagNode]
  260. ;
  261. // A class type specification is a class type with either:
  262. // - possible brackets afterwards
  263. // (which would make it an array type).
  264. // - generic type arguments after
  265. classTypeSpec[boolean addImagNode]
  266. : classOrInterfaceType[false]
  267. (options{greedy=true;}: // match as many as possible
  268. lb:LBRACK^ {#lb.setType(ARRAY_DECLARATOR);} RBRACK!
  269. )*
  270. {
  271. if ( addImagNode ) {
  272. #classTypeSpec = #(#[TYPE,"TYPE"], #classTypeSpec);
  273. }
  274. }
  275. ;
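// Illustration (a hedged sketch): each LBRACK is made the new root and retyped
// to ARRAY_DECLARATOR, so array dimensions nest from the inside out. Assuming
// standard ANTLR 2 tree construction, the expected shapes are:
//   String       =>  String
//   String[][]   =>  #(ARRAY_DECLARATOR #(ARRAY_DECLARATOR String))
// When addImagNode is true the result is additionally wrapped in a TYPE node:
//   String[]     =>  #(TYPE #(ARRAY_DECLARATOR String))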
  276. // A non-built in type name, with possible type parameters
  277. classOrInterfaceType[boolean addImagNode]
  278. : IDENT (typeArguments)?
  279. (options{greedy=true;}: // match as many as possible
  280. DOT^
  281. IDENT (typeArguments)?
  282. )*
  283. {
  284. if ( addImagNode ) {
  285. #classOrInterfaceType = #(#[TYPE,"TYPE"], #classOrInterfaceType);
  286. }
  287. }
  288. ;
  289. // A specialised form of typeSpec where built in types must be arrays
  290. typeArgumentSpec
  291. : classTypeSpec[true]
  292. | builtInTypeArraySpec[true]
  293. ;
  294. // A generic type argument is a class type, a possibly bounded wildcard type or a built-in type array
  295. typeArgument
  296. : ( typeArgumentSpec
  297. | wildcardType
  298. )
  299. {#typeArgument = #(#[TYPE_ARGUMENT,"TYPE_ARGUMENT"], #typeArgument);}
  300. ;
  301. // Wildcard type indicating all types (with possible constraint)
  302. wildcardType
  303. : q:QUESTION^ {#q.setType(WILDCARD_TYPE);}
  304. (("extends" | "super")=> typeArgumentBounds)?
  305. ;
  306. // Type arguments to a class or interface type
  307. typeArguments
  308. {int currentLtLevel = 0;}
  309. :
  310. {currentLtLevel = ltCounter;}
  311. LT! {ltCounter++;}
  312. typeArgument
  313. (options{greedy=true;}: // match as many as possible
  314. {inputState.guessing !=0 || ltCounter == currentLtLevel + 1}?
  315. COMMA! typeArgument
  316. )*
  317. ( // turn warning off since Antlr generates the right code,
  318. // plus we have our semantic predicate below
  319. options{generateAmbigWarnings=false;}:
  320. typeArgumentsOrParametersEnd
  321. )?
  322. // make sure we have gobbled up enough '>' characters
  323. // if we are at the "top level" of nested typeArgument productions
  324. {(currentLtLevel != 0) || ltCounter == currentLtLevel}?
  325. {#typeArguments = #(#[TYPE_ARGUMENTS, "TYPE_ARGUMENTS"], #typeArguments);}
  326. ;
  327. // this gobbles up *some* amount of '>' characters, and counts how many
  328. // it gobbled.
  329. protected typeArgumentsOrParametersEnd
  330. : GT! {ltCounter-=1;}
  331. | SR! {ltCounter-=2;}
  332. | BSR! {ltCounter-=3;}
  333. ;
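// Worked trace (a hedged sketch) of how ltCounter balances nested type
// arguments. It assumes the lexer delivers ">>" as a single SR token, as the
// comment on ltCounter implies. Input:  Map<String, List<Integer>> m;
//   outer typeArguments: currentLtLevel = 0; LT        -> ltCounter = 1
//     typeArgument String
//     COMMA allowed because ltCounter == currentLtLevel + 1   (1 == 1)
//     typeArgument List ...
//       inner typeArguments: currentLtLevel = 1; LT    -> ltCounter = 2
//         typeArgument Integer
//         typeArgumentsOrParametersEnd matches SR      -> ltCounter = 0
//         final predicate passes because currentLtLevel != 0
//   back in outer: the next token is IDENT, so the optional end is skipped
//     final predicate passes because ltCounter == currentLtLevel   (0 == 0)
// The inner production therefore consumes the closing '>' of the outer one, and
// only the top-level production checks that the counter is back at its
// starting value.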
  334. // Restriction on wildcard types based on super class or derived class
  335. typeArgumentBounds
  336. {boolean isUpperBounds = false;}
  337. :
  338. ( "extends"! {isUpperBounds=true;} | "super"! ) classOrInterfaceType[false]
  339. {
  340. if (isUpperBounds)
  341. {
  342. #typeArgumentBounds = #(#[TYPE_UPPER_BOUNDS,"TYPE_UPPER_BOUNDS"], #typeArgumentBounds);
  343. }
  344. else
  345. {
  346. #typeArgumentBounds = #(#[TYPE_LOWER_BOUNDS,"TYPE_LOWER_BOUNDS"], #typeArgumentBounds);
  347. }
  348. }
  349. ;
  350. // A builtin type array specification is a builtin type with brackets afterwards
  351. builtInTypeArraySpec[boolean addImagNode]
  352. : builtInType
  353. (options{greedy=true;}: // match as many as possible
  354. lb:LBRACK^ {#lb.setType(ARRAY_DECLARATOR);} RBRACK!
  355. )+
  356. {
  357. if ( addImagNode ) {
  358. #builtInTypeArraySpec = #(#[TYPE,"TYPE"], #builtInTypeArraySpec);
  359. }
  360. }
  361. ;
  362. // A builtin type specification is a builtin type with possible brackets
  363. // afterwards (which would make it an array type).
  364. builtInTypeSpec[boolean addImagNode]
  365. : builtInType
  366. (options{greedy=true;}: // match as many as possible
  367. lb:LBRACK^ {#lb.setType(ARRAY_DECLARATOR);} RBRACK!
  368. )*
  369. {
  370. if ( addImagNode ) {
  371. #builtInTypeSpec = #(#[TYPE,"TYPE"], #builtInTypeSpec);
  372. }
  373. }
  374. ;
  375. // A type name, which is either a (possibly qualified and parameterized)
  376. // class name or a primitive (builtin) type
  377. type
  378. : classOrInterfaceType[false]
  379. | builtInType
  380. ;
  381. // The primitive types.
  382. builtInType
  383. : "void"
  384. | "boolean"
  385. | "byte"
  386. | "char"
  387. | "short"
  388. | "int"
  389. | "float"
  390. | "long"
  391. | "double"
  392. ;
  393. // A (possibly-qualified) java identifier. We start with the first IDENT
  394. // and expand its name by adding dots and following IDENTS
  395. identifier
  396. : IDENT ( DOT^ IDENT )*
  397. ;
  398. identifierStar
  399. : IDENT
  400. ( DOT^ IDENT )*
  401. ( DOT^ STAR )?
  402. ;
  403. // A list of zero or more modifiers. We could have used (modifier)* in
  404. // place of a call to modifiers, but I thought it was a good idea to keep
  405. // this rule separate so they can easily be collected in a Vector if
  406. // someone so desires
  407. modifiers
  408. :
  409. (
  410. //hush warnings since the semantic check for "@interface" solves the non-determinism
  411. options{generateAmbigWarnings=false;}:
  412. modifier
  413. |
  414. //Semantic check that we aren't matching @interface as this is not an annotation
  415. //A cleaner way to do this would be nice
  416. {LA(1)==AT && !LT(2).getText().equals("interface")}? annotation
  417. )*
  418. {#modifiers = #([MODIFIERS, "MODIFIERS"], #modifiers);}
  419. ;
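// Illustration (a hedged sketch) of the "@interface" check described above:
//   @Deprecated public class A {}
//       modifiers consumes "@Deprecated" (as an annotation) and "public".
//   public @interface B {}
//       modifiers consumes "public" only. At AT the second lookahead token is
//       "interface", so the predicate is false, the loop ends, and AT
//       "interface" is left for annotationDefinition to match.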
  420. // modifiers for Java classes, interfaces, class/instance vars and methods
  421. modifier
  422. : "private"
  423. | "public"
  424. | "protected"
  425. | "static"
  426. | "transient"
  427. | "final"
  428. | "abstract"
  429. | "native"
  430. | "threadsafe"
  431. | "synchronized"
  432. | "volatile"
  433. | "strictfp"
  434. ;
  435. annotation!
  436. : AT! i:identifier ( LPAREN! ( args:annotationArguments )? RPAREN! )?
  437. {#annotation = #(#[ANNOTATION,"ANNOTATION"], i, args);}
  438. ;
  439. annotations
  440. : (annotation)*
  441. {#annotations = #([ANNOTATIONS, "ANNOTATIONS"], #annotations);}
  442. ;
  443. annotationArguments
  444. : annotationMemberValueInitializer | anntotationMemberValuePairs
  445. ;
  446. anntotationMemberValuePairs
  447. : annotationMemberValuePair ( COMMA! annotationMemberValuePair )*
  448. ;
  449. annotationMemberValuePair!
  450. : i:IDENT ASSIGN! v:annotationMemberValueInitializer
  451. {#annotationMemberValuePair = #(#[ANNOTATION_MEMBER_VALUE_PAIR,"ANNOTATION_MEMBER_VALUE_PAIR"], i, v);}
  452. ;
  453. annotationMemberValueInitializer
  454. :
  455. conditionalExpression | annotation | annotationMemberArrayInitializer
  456. ;
  457. // This is an initializer used to set up an annotation member array.
  458. annotationMemberArrayInitializer
  459. : lc:LCURLY^ {#lc.setType(ANNOTATION_ARRAY_INIT);}
  460. ( annotationMemberArrayValueInitializer
  461. (
  462. // CONFLICT: does a COMMA after an initializer start a new
  463. // initializer or start the option ',' at end?
  464. // ANTLR generates proper code by matching
  465. // the comma as soon as possible.
  466. options {
  467. warnWhenFollowAmbig = false;
  468. }
  469. :
  470. COMMA! annotationMemberArrayValueInitializer
  471. )*
  472. (COMMA!)?
  473. )?
  474. RCURLY!
  475. ;
  476. // The two things that can initialize an annotation array element are a conditional expression
  477. // and an annotation (nested annotation array initialisers are not valid)
  478. annotationMemberArrayValueInitializer
  479. : conditionalExpression
  480. | annotation
  481. ;
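// Illustration (a hedged sketch): annotation argument forms that the rules
// above are meant to accept. conditionalExpression is defined further down in
// the grammar and is assumed here to cover literals and qualified names.
//   @SuppressWarnings("unchecked")                  single value
//   @Retention(value = RetentionPolicy.RUNTIME)     member/value pair
//   @Target({ElementType.TYPE, ElementType.METHOD}) array initializer
//   @SuppressWarnings({"unchecked", "rawtypes",})   trailing comma is accepted
//   @Outer(@Inner)                                  nested annotation as the value
// An array initializer nested directly inside another array initializer is not
// accepted, because annotationMemberArrayValueInitializer has no alternative
// for it.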
  482. superClassClause!
  483. : ( "extends" c:classOrInterfaceType[false] )?
  484. {#superClassClause = #(#[EXTENDS_CLAUSE,"EXTENDS_CLAUSE"],c);}
  485. ;
  486. // Definition of a Java class
  487. classDefinition![AST modifiers]
  488. : "class" IDENT
  489. // it _might_ have type parameters
  490. (tp:typeParameters)?
  491. // it _might_ have a superclass...
  492. sc:superClassClause
  493. // it might implement some interfaces...
  494. ic:implementsClause
  495. // now parse the body of the class
  496. cb:classBlock
  497. {#classDefinition = #(#[CLASS_DEF,"CLASS_DEF"],
  498. modifiers,IDENT,tp,sc,ic,cb);}
  499. ;
  500. // Definition of a Java Interface
  501. interfaceDefinition![AST modifiers]
  502. : "interface" IDENT
  503. // it _might_ have type parameters
  504. (tp:typeParameters)?
  505. // it might extend some other interfaces
  506. ie:interfaceExtends
  507. // now parse the body of the interface (looks like a class...)
  508. ib:interfaceBlock
  509. {#interfaceDefinition = #(#[INTERFACE_DEF,"INTERFACE_DEF"],
  510. modifiers,IDENT,tp,ie,ib);}
  511. ;
  512. enumDefinition![AST modifiers]
  513. : "enum" IDENT
  514. // it might implement some interfaces...
  515. ic:implementsClause
  516. // now parse the body of the enum
  517. eb:enumBlock
  518. {#enumDefinition = #(#[ENUM_DEF,"ENUM_DEF"],
  519. modifiers,IDENT,ic,eb);}
  520. ;
  521. annotationDefinition![AST modifiers]
  522. : AT "interface" IDENT
  523. // now parse the body of the annotation
  524. ab:annotationBlock
  525. {#annotationDefinition = #(#[ANNOTATION_DEF,"ANNOTATION_DEF"],
  526. modifiers,IDENT,ab);}
  527. ;
  528. typeParameters
  529. {int currentLtLevel = 0;}
  530. :
  531. {currentLtLevel = ltCounter;}
  532. LT! {ltCounter++;}
  533. typeParameter (COMMA! typeParameter)*
  534. (typeArgumentsOrParametersEnd)?
  535. // make sure we have gobbled up enough '>' characters
  536. // if we are at the "top level" of nested typeArgument productions
  537. {(currentLtLevel != 0) || ltCounter == currentLtLevel}?
  538. {#typeParameters = #(#[TYPE_PARAMETERS, "TYPE_PARAMETERS"], #typeParameters);}
  539. ;
  540. typeParameter
  541. :
  542. // I'm pretty sure Antlr generates the right thing here:
  543. (id:IDENT) ( options{generateAmbigWarnings=false;}: typeParameterBounds )?
  544. {#typeParameter = #(#[TYPE_PARAMETER,"TYPE_PARAMETER"], #typeParameter);}
  545. ;
  546. typeParameterBounds
  547. :
  548. "extends"! classOrInterfaceType[false]
  549. (BAND! classOrInterfaceType[false])*
  550. {#typeParameterBounds = #(#[TYPE_UPPER_BOUNDS,"TYPE_UPPER_BOUNDS"], #typeParameterBounds);}
  551. ;
  552. // This is the body of a class. You can have classFields and extra semicolons.
  553. classBlock
  554. : LCURLY!
  555. ( classField | SEMI! )*
  556. RCURLY!
  557. {#classBlock = #([OBJBLOCK, "OBJBLOCK"], #classBlock);}
  558. ;
  559. // This is the body of an interface. You can have interfaceField and extra semicolons.
  560. interfaceBlock
  561. : LCURLY!
  562. ( interfaceField | SEMI! )*
  563. RCURLY!
  564. {#interfaceBlock = #([OBJBLOCK, "OBJBLOCK"], #interfaceBlock);}
  565. ;
  566. // This is the body of an annotation. You can have annotation fields and extra semicolons,
  567. // That's about it (until you see what an annotation field is...)
  568. annotationBlock
  569. : LCURLY!
  570. ( annotationField | SEMI! )*
  571. RCURLY!
  572. {#annotationBlock = #([OBJBLOCK, "OBJBLOCK"], #annotationBlock);}
  573. ;
  574. // This is the body of an enum. You can have zero or more enum constants
  575. // followed by any number of fields like a regular class
  576. enumBlock
  577. : LCURLY!
  578. ( enumConstant ( options{greedy=true;}: COMMA! enumConstant )* ( COMMA! )? )?
  579. ( SEMI! ( classField | SEMI! )* )?
  580. RCURLY!
  581. {#enumBlock = #([OBJBLOCK, "OBJBLOCK"], #enumBlock);}
  582. ;
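// Illustration (a hedged sketch): enum bodies that enumBlock is written to accept.
//   enum A { }                           no constants, no members
//   enum B { X, Y, Z }                   constants only
//   enum C { X, Y, }                     trailing comma is accepted
//   enum D { X(1), Y(2); D(int n) {} }   constants, then SEMI, then class fields
//   enum E { ; int f; }                  members without constants still need the SEMI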
  583. // An annotation field
  584. annotationField!
  585. : mods:modifiers
  586. ( td:typeDefinitionInternal[#mods]
  587. {#annotationField = #td;}
  588. | t:typeSpec[false] // annotation field
  589. ( i:IDENT // the name of the field
  590. LPAREN! RPAREN!
  591. rt:declaratorBrackets[#t]
  592. ( "default" amvi:annotationMemberValueInitializer )?
  593. SEMI
  594. {#annotationField =
  595. #(#[ANNOTATION_FIELD_DEF,"ANNOTATION_FIELD_DEF"],
  596. mods,
  597. #(#[TYPE,"TYPE"],rt),
  598. i,amvi
  599. );}
  600. | v:variableDefinitions[#mods,#t] SEMI // variable
  601. {#annotationField = #v;}
  602. )
  603. )
  604. ;
  605. //An enum constant may have optional parameters and may have
  606. //a class body
  607. enumConstant!
  608. : an:annotations
  609. i:IDENT
  610. ( LPAREN!
  611. a:argList
  612. RPAREN!
  613. )?
  614. ( b:enumConstantBlock )?
  615. {#enumConstant = #([ENUM_CONSTANT_DEF, "ENUM_CONSTANT_DEF"], an, i, a, b);}
  616. ;
  617. //The class-like body of an enum constant
  618. enumConstantBlock
  619. : LCURLY!
  620. ( enumConstantField | SEMI! )*
  621. RCURLY!
  622. {#enumConstantBlock = #([OBJBLOCK, "OBJBLOCK"], #enumConstantBlock);}
  623. ;
  624. //An enum constant field is just like a class field but without
  625. //the possibility of a constructor definition or a static initializer
  626. enumConstantField!
  627. : mods:modifiers
  628. ( td:typeDefinitionInternal[#mods]
  629. {#enumConstantField = #td;}
  630. | // A generic method has the typeParameters before the return type.
  631. // This is not allowed for variable definitions, but this production
  632. // allows it, a semantic check could be used if you wanted.
  633. (tp:typeParameters)? t:typeSpec[false] // method or variable declaration(s)
  634. ( IDENT // the name of the method
  635. // parse the formal parameter declarations.
  636. LPAREN! param:parameterDeclarationList RPAREN!
  637. rt:declaratorBrackets[#t]
  638. // get the list of exceptions that this method is
  639. // declared to throw
  640. (tc:throwsClause)?
  641. ( s2:compoundStatement | SEMI )
  642. {#enumConstantField = #(#[METHOD_DEF,"METHOD_DEF"],
  643. mods,
  644. tp,
  645. #(#[TYPE,"TYPE"],rt),
  646. IDENT,
  647. param,
  648. tc,
  649. s2);}
  650. | v:variableDefinitions[#mods,#t] SEMI
  651. {#enumConstantField = #v;}
  652. )
  653. )
  654. // "{ ... }" instance initializer
  655. | s4:compoundStatement
  656. {#enumConstantField = #(#[INSTANCE_INIT,"INSTANCE_INIT"], s4);}
  657. ;
  658. // An interface can extend several other interfaces...
  659. interfaceExtends
  660. : (
  661. e:"extends"!
  662. classOrInterfaceType[false] ( COMMA! classOrInterfaceType[false] )*
  663. )?
  664. {#interfaceExtends = #(#[EXTENDS_CLAUSE,"EXTENDS_CLAUSE"],
  665. #interfaceExtends);}
  666. ;
  667. // A class can implement several interfaces...
  668. implementsClause
  669. : (
  670. i:"implements"! classOrInterfaceType[false] ( COMMA! classOrInterfaceType[false] )*
  671. )?
  672. {#implementsClause = #(#[IMPLEMENTS_CLAUSE,"IMPLEMENTS_CLAUSE"],
  673. #implementsClause);}
  674. ;
  675. // Now the various things that can be defined inside a class
  676. classField!
  677. : // method, constructor, or variable declaration
  678. mods:modifiers
  679. ( td:typeDefinitionInternal[#mods]
  680. {#classField = #td;}
  681. | (tp:typeParameters)?
  682. (
  683. h:ctorHead s:constructorBody // constructor
  684. {#classField = #(#[CTOR_DEF,"CTOR_DEF"], mods, tp, h, s);}
  685. | // A generic method/ctor has the typeParameters before the return type.
  686. // This is not allowed for variable definitions, but this production
  687. // allows it, a semantic check could be used if you wanted.
  688. t:typeSpec[false] // method or variable declaration(s)
  689. ( IDENT // the name of the method
  690. // parse the formal parameter declarations.
  691. LPAREN! param:parameterDeclarationList RPAREN!
  692. rt:declaratorBrackets[#t]
  693. // get the list of exceptions that this method is
  694. // declared to throw
  695. (tc:throwsClause)?
  696. ( s2:compoundStatement | SEMI )
  697. {#classField = #(#[METHOD_DEF,"METHOD_DEF"],
  698. mods,
  699. tp,
  700. #(#[TYPE,"TYPE"],rt),
  701. IDENT,
  702. param,
  703. tc,
  704. s2);}
  705. | v:variableDefinitions[#mods,#t] SEMI
  706. {#classField = #v;}
  707. )
  708. )
  709. )
  710. // "static { ... }" class initializer
  711. | "static" s3:compoundStatement
  712. {#classField = #(#[STATIC_INIT,"STATIC_INIT"], s3);}
  713. // "{ ... }" instance initializer
  714. | s4:compoundStatement
  715. {#classField = #(#[INSTANCE_INIT,"INSTANCE_INIT"], s4);}
  716. ;
  717. // Now the various things that can be defined inside an interface
  718. interfaceField!
  719. : // method or variable declaration (interfaces have no constructors)
  720. mods:modifiers
  721. ( td:typeDefinitionInternal[#mods]
  722. {#interfaceField = #td;}
  723. | (tp:typeParameters)?
  724. // A generic method has the typeParameters before the return type.
  725. // This is not allowed for variable definitions, but this production
  726. // allows it, a semantic check could be used if you want a more strict
  727. // grammar.
  728. t:typeSpec[false] // method or variable declaration(s)
  729. ( IDENT // the name of the method
  730. // parse the formal parameter declarations.
  731. LPAREN! param:parameterDeclarationList RPAREN!
  732. rt:declaratorBrackets[#t]
  733. // get the list of exceptions that this method is
  734. // declared to throw
  735. (tc:throwsClause)?
  736. SEMI
  737. {#interfaceField = #(#[METHOD_DEF,"METHOD_DEF"],
  738. mods,
  739. tp,
  740. #(#[TYPE,"TYPE"],rt),
  741. IDENT,
  742. param,
  743. tc);}
  744. | v:variableDefinitions[#mods,#t] SEMI
  745. {#interfaceField = #v;}
  746. )
  747. )
  748. ;
  749. constructorBody
  750. : lc:LCURLY^ {#lc.setType(SLIST);}
  751. ( options { greedy=true; } : explicitConstructorInvocation)?
  752. (statement)*
  753. RCURLY!
  754. ;
  755. /** Catch obvious constructor calls, but not the expr.super(...) calls */
  756. explicitConstructorInvocation
  757. : (typeArguments)?
  758. ( "this"! lp1:LPAREN^ argList RPAREN! SEMI!
  759. {#lp1.setType(CTOR_CALL);}
  760. | "super"! lp2:LPAREN^ argList RPAREN! SEMI!
  761. {#lp2.setType(SUPER_CTOR_CALL);}
  762. )
  763. ;
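// Illustration (a hedged sketch): the LPAREN token becomes the root and is
// retyped, so the expected shapes are roughly:
//   this(1, 2);   =>  root of type CTOR_CALL with the argList subtree as child
//   super(x);     =>  root of type SUPER_CTOR_CALL with the argList subtree as child
// "this", "super", RPAREN and SEMI are suffixed with ! and do not appear in the
// tree. Qualified calls such as outer.super(x) are not matched by this rule, as
// the doc comment above notes.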
  764. variableDefinitions[AST mods, AST t]
  765. : variableDeclarator[getASTFactory().dupTree(mods),
  766. getASTFactory().dupList(t)] //dupList as this also copies siblings (like TYPE_ARGUMENTS)
  767. ( COMMA!
  768. variableDeclarator[getASTFactory().dupTree(mods),
  769. getASTFactory().dupList(t)] //dupList as this also copies siblings (like TYPE_ARGUMENTS)
  770. )*
  771. ;
  772. /** Declaration of a variable. This can be a class/instance variable,
  773. * or a local variable in a method
  774. * It can also include possible initialization.
  775. */
  776. variableDeclarator![AST mods, AST t]
  777. : id:IDENT d:declaratorBrackets[t] v:varInitializer
  778. {#variableDeclarator = #(#[VARIABLE_DEF,"VARIABLE_DEF"], mods, #(#[TYPE,"TYPE"],d), id, v);}
  779. ;
  780. declaratorBrackets[AST typ]
  781. : {#declaratorBrackets=typ;}
  782. (lb:LBRACK^ {#lb.setType(ARRAY_DECLARATOR);} RBRACK!)*
  783. ;
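// Illustration (a hedged sketch): variableDefinitions hands every declarator its
// own copy of the modifiers and type, and declaratorBrackets wraps that copy once
// per trailing "[]". For the declaration
//   int a[], b;
// the expected result is two sibling trees shaped roughly like:
//   #(VARIABLE_DEF MODIFIERS #(TYPE #(ARRAY_DECLARATOR int)) a)
//   #(VARIABLE_DEF MODIFIERS #(TYPE int) b)
// An initializer, when present, follows the identifier as an ASSIGN subtree.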
varInitializer
    :   ( ASSIGN^ initializer )?
    ;

// This is an initializer used to set up an array.
arrayInitializer
    :   lc:LCURLY^ {#lc.setType(ARRAY_INIT);}
        (   initializer
            (
                // CONFLICT: does a COMMA after an initializer start a new
                // initializer or start the optional ',' at end?
                // ANTLR generates proper code by matching
                // the comma as soon as possible.
                options {
                    warnWhenFollowAmbig = false;
                }
            :
                COMMA! initializer
            )*
            (COMMA!)?
        )?
        RCURLY!
    ;
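// Illustration (added examples, not an exhaustive list) of inputs accepted
// by arrayInitializer:
//     {}               empty initializer
//     {1, 2, 3}        three initializers
//     {1, 2, 3,}       trailing comma, matched by the final (COMMA!)?
//     {{1}, {2, 3}}    nested, via initializer : arrayInitializer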
// The two "things" that can initialize an array element are an expression
// and another (nested) array initializer.
initializer
    :   expression
    |   arrayInitializer
    ;

// This is the header of a method. It includes the name and parameters
// for the method.
// This also watches for a list of exception classes in a "throws" clause.
ctorHead
    :   IDENT // the name of the method
        // parse the formal parameter declarations.
        LPAREN! parameterDeclarationList RPAREN!
        // get the list of exceptions that this method is declared to throw
        (throwsClause)?
    ;

// This is a list of exception classes that the method is declared to throw
throwsClause
    :   "throws"^ identifier ( COMMA! identifier )*
    ;
// A list of formal parameters
// Zero or more parameters
// If a parameter is variable length (e.g. String... myArg) it is the right-most parameter
parameterDeclarationList
    // The predicate in the ( ... )* block is flagged as superfluous, and seems superfluous, but
    // it is the only way I could make this work. If my understanding is correct, this is a known bug.
    :   (   ( parameterDeclaration )=> parameterDeclaration
            ( options {warnWhenFollowAmbig=false;} : ( COMMA! parameterDeclaration ) => COMMA! parameterDeclaration )*
            ( COMMA! variableLengthParameterDeclaration )?
        |
            variableLengthParameterDeclaration
        )?
        {#parameterDeclarationList = #(#[PARAMETERS,"PARAMETERS"],
                                       #parameterDeclarationList);}
    ;
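// Illustration (added examples; parameter names are hypothetical) of how
// parameterDeclarationList groups its input under PARAMETERS:
//     ()                        PARAMETERS with no children
//     (int a, final String b)   two PARAMETER_DEF children
//     (String... rest)          one VARIABLE_PARAMETER_DEF child
//     (int a, String... rest)   PARAMETER_DEF, then VARIABLE_PARAMETER_DEF
// In the last case the loop predicate fails at TRIPLE_DOT, so the comma is
// left for the optional variableLengthParameterDeclaration that follows.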
// A formal parameter.
parameterDeclaration!
    :   pm:parameterModifier t:typeSpec[false] id:IDENT
        pd:declaratorBrackets[#t]
        {#parameterDeclaration = #(#[PARAMETER_DEF,"PARAMETER_DEF"],
                                   pm, #([TYPE,"TYPE"],pd), id);}
    ;

variableLengthParameterDeclaration!
    :   pm:parameterModifier t:typeSpec[false] TRIPLE_DOT! id:IDENT
        pd:declaratorBrackets[#t]
        {#variableLengthParameterDeclaration = #(#[VARIABLE_PARAMETER_DEF,"VARIABLE_PARAMETER_DEF"],
                                                 pm, #([TYPE,"TYPE"],pd), id);}
    ;

parameterModifier
    // final can appear amongst annotations in any order - greedily consume any preceding
    // annotations to shut nondeterminism warnings off
    :   (options{greedy=true;} : annotation)* (f:"final")? (annotation)*
        {#parameterModifier = #(#[MODIFIERS,"MODIFIERS"], #parameterModifier);}
    ;
// Compound statement. This is used in many contexts:
//   Inside a class definition prefixed with "static":
//      it is a class initializer
//   Inside a class definition without "static":
//      it is an instance initializer
//   As the body of a method
//   As a completely independent braced block of code inside a method
//      it starts a new scope for variable definitions
compoundStatement
    :   lc:LCURLY^ {#lc.setType(SLIST);}
        // include the (possibly-empty) list of statements
        (statement)*
        RCURLY!
    ;
statement
    // A list of statements in curly braces -- start a new scope!
    :   compoundStatement
    // declarations are ambiguous with "ID DOT" relative to expression
    // statements. Must backtrack to be sure. Could use a semantic
    // predicate to test symbol table to see what the type was coming
    // up, but that's pretty hard without a symbol table ;)
    |   (declaration)=> declaration SEMI!
    // An expression statement. This could be a method call,
    // assignment statement, or any other expression evaluated for
    // side-effects.
    |   expression SEMI!
    //TODO: what about interfaces, enums and annotations?
    // class definition
    |   m:modifiers! classDefinition[#m]
    // Attach a label to the front of a statement
    |   IDENT c:COLON^ {#c.setType(LABELED_STAT);} statement
    // If-else statement
    |   "if"^ LPAREN! expression RPAREN! statement
        (
            // CONFLICT: the old "dangling-else" problem...
            // ANTLR generates proper code matching
            // as soon as possible. Hush warning.
            options {
                warnWhenFollowAmbig = false;
            }
        :
            "else"! statement
        )?
    // For statement
    |   forStatement
    // While statement
    |   "while"^ LPAREN! expression RPAREN! statement
    // do-while statement
    |   "do"^ statement "while"! LPAREN! expression RPAREN! SEMI!
    // get out of a loop (or switch)
    |   "break"^ (IDENT)? SEMI!
    // do next iteration of a loop
    |   "continue"^ (IDENT)? SEMI!
    // Return an expression
    |   "return"^ (expression)? SEMI!
    // switch/case statement
    |   "switch"^ LPAREN! expression RPAREN! LCURLY!
            ( casesGroup )*
        RCURLY!
    // exception try-catch block
    |   tryBlock
    // throw an exception
    |   "throw"^ expression SEMI!
    // synchronize a statement
    |   "synchronized"^ LPAREN! expression RPAREN! compoundStatement
    // asserts (uncomment if you want 1.4 compatibility)
    |   "assert"^ expression ( COLON! expression )? SEMI!
    // empty statement
    |   s:SEMI {#s.setType(EMPTY_STAT);}
    ;
forStatement
    :   f:"for"^
        LPAREN!
        (   (forInit SEMI)=>traditionalForClause
        |   forEachClause
        )
        RPAREN!
        statement // statement to loop over
    ;

traditionalForClause
    :
        forInit SEMI! // initializer
        forCond SEMI! // condition test
        forIter       // updater
    ;

forEachClause
    :
        p:parameterDeclaration COLON! expression
        {#forEachClause = #(#[FOR_EACH_CLAUSE,"FOR_EACH_CLAUSE"], #forEachClause);}
    ;
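// Illustration (added examples; variable names are hypothetical) of how the
// (forInit SEMI)=> predicate in forStatement chooses between the two forms:
//     for (int i = 0; i < n; i++) ...   forInit then SEMI: traditionalForClause
//     for (;;) ...                      empty forInit then SEMI: traditionalForClause
//     for (String s : names) ...        COLON where SEMI was expected, so the
//                                       predicate fails: forEachClause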
casesGroup
    :   (   // CONFLICT: to which case group do the statements bind?
            // ANTLR generates proper code: it groups the
            // many "case"/"default" labels together then
            // follows them with the statements
            options {
                greedy = true;
            }
        :
            aCase
        )+
        caseSList
        {#casesGroup = #([CASE_GROUP, "CASE_GROUP"], #casesGroup);}
    ;

aCase
    :   ("case"^ expression | "default") COLON!
    ;

caseSList
    :   (statement)*
        {#caseSList = #(#[SLIST,"SLIST"],#caseSList);}
    ;

// The initializer for a for loop
forInit
    // if it looks like a declaration, it is
    :   (   (declaration)=> declaration
        // otherwise it could be an expression list...
        |   expressionList
        )?
        {#forInit = #(#[FOR_INIT,"FOR_INIT"],#forInit);}
    ;

forCond
    :   (expression)?
        {#forCond = #(#[FOR_CONDITION,"FOR_CONDITION"],#forCond);}
    ;

forIter
    :   (expressionList)?
        {#forIter = #(#[FOR_ITERATOR,"FOR_ITERATOR"],#forIter);}
    ;

// an exception handler try/catch block
tryBlock
    :   "try"^ compoundStatement
        (handler)*
        ( finallyClause )?
    ;

finallyClause
    :   "finally"^ compoundStatement
    ;

// an exception handler
handler
    :   "catch"^ LPAREN! parameterDeclaration RPAREN! compoundStatement
    ;
// expressions
// Note that most of these expressions follow the pattern
//   thisLevelExpression :
//       nextHigherPrecedenceExpression
//           (OPERATOR nextHigherPrecedenceExpression)*
// which is a standard recursive definition for parsing an expression.
// The operators in java have the following precedences:
//    lowest  (13)  = *= /= %= += -= <<= >>= >>>= &= ^= |=
//            (12)  ?:
//            (11)  ||
//            (10)  &&
//            ( 9)  |
//            ( 8)  ^
//            ( 7)  &
//            ( 6)  == !=
//            ( 5)  < <= > >=
//            ( 4)  << >> >>>
//            ( 3)  +(binary) -(binary)
//            ( 2)  * / %
//            ( 1)  ++ -- +(unary) -(unary)  ~  !  (type)
//                  []   () (method call)  . (dot -- identifier qualification)
//                  new   ()  (explicit parenthesis)
//
// the last two are not usually on a precedence chart; I put them in
// to point out that new has a higher precedence than '.', so you
// can validly use
//     new Frame().show()
//
// Note that the above precedence levels map to the rules below...
// Once you have a precedence chart, writing the appropriate rules as below
// is usually very straightforward
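//
// Worked example (added illustration, not part of the original notes):
// for the input
//     a = b + c * d
// STAR (level 2) binds tighter than PLUS (level 3), which binds tighter
// than ASSIGN (level 13), so with each operator marked as a root (^) the
// rules below should build
//     #(EXPR #(ASSIGN a #(PLUS b #(STAR c d))))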
// the mother of all expressions
expression
    :   assignmentExpression
        {#expression = #(#[EXPR,"EXPR"],#expression);}
    ;

// This is a list of expressions.
expressionList
    :   expression (COMMA! expression)*
        {#expressionList = #(#[ELIST,"ELIST"], expressionList);}
    ;

// assignment expression (level 13)
assignmentExpression
    :   conditionalExpression
        (   (   ASSIGN^
            |   PLUS_ASSIGN^
            |   MINUS_ASSIGN^
            |   STAR_ASSIGN^
            |   DIV_ASSIGN^
            |   MOD_ASSIGN^
            |   SR_ASSIGN^
            |   BSR_ASSIGN^
            |   SL_ASSIGN^
            |   BAND_ASSIGN^
            |   BXOR_ASSIGN^
            |   BOR_ASSIGN^
            )
            assignmentExpression
        )?
    ;

// conditional test (level 12)
conditionalExpression
    :   logicalOrExpression
        ( QUESTION^ assignmentExpression COLON! conditionalExpression )?
    ;
// logical or (||) (level 11)
logicalOrExpression
    :   logicalAndExpression (LOR^ logicalAndExpression)*
    ;

// logical and (&&) (level 10)
logicalAndExpression
    :   inclusiveOrExpression (LAND^ inclusiveOrExpression)*
    ;

// bitwise or non-short-circuiting or (|) (level 9)
inclusiveOrExpression
    :   exclusiveOrExpression (BOR^ exclusiveOrExpression)*
    ;

// exclusive or (^) (level 8)
exclusiveOrExpression
    :   andExpression (BXOR^ andExpression)*
    ;

// bitwise or non-short-circuiting and (&) (level 7)
andExpression
    :   equalityExpression (BAND^ equalityExpression)*
    ;

// equality/inequality (==/!=) (level 6)
equalityExpression
    :   relationalExpression ((NOT_EQUAL^ | EQUAL^) relationalExpression)*
    ;

// boolean relational expressions (level 5)
relationalExpression
    :   shiftExpression
        (   (   (   LT^
                |   GT^
                |   LE^
                |   GE^
                )
                shiftExpression
            )*
        |   "instanceof"^ typeSpec[true]
        )
    ;

// bit shift expressions (level 4)
shiftExpression
    :   additiveExpression ((SL^ | SR^ | BSR^) additiveExpression)*
    ;

// binary addition/subtraction (level 3)
additiveExpression
    :   multiplicativeExpression ((PLUS^ | MINUS^) multiplicativeExpression)*
    ;

// multiplication/division/modulo (level 2)
multiplicativeExpression
    :   unaryExpression ((STAR^ | DIV^ | MOD^ ) unaryExpression)*
    ;
unaryExpression
    :   INC^ unaryExpression
    |   DEC^ unaryExpression
    |   MINUS^ {#MINUS.setType(UNARY_MINUS);} unaryExpression
    |   PLUS^ {#PLUS.setType(UNARY_PLUS);} unaryExpression
    |   unaryExpressionNotPlusMinus
    ;

unaryExpressionNotPlusMinus
    :   BNOT^ unaryExpression
    |   LNOT^ unaryExpression
    |   (   // subrule allows option to shut off warnings
            options {
                // "(int" ambig with postfixExpr due to lack of sequence
                // info in linear approximate LL(k). It's ok. Shut up.
                generateAmbigWarnings=false;
            }
        :   // If typecast is built in type, must be numeric operand
            // Have to backtrack to see if operator follows
            (LPAREN builtInTypeSpec[true] RPAREN unaryExpression)=>
            lpb:LPAREN^ {#lpb.setType(TYPECAST);} builtInTypeSpec[true] RPAREN!
            unaryExpression
            // Have to backtrack to see if operator follows. If no operator
            // follows, it's a typecast. No semantic checking needed to parse.
            // if it _looks_ like a cast, it _is_ a cast; else it's a "(expr)"
        |   (LPAREN classTypeSpec[true] RPAREN unaryExpressionNotPlusMinus)=>
            lp:LPAREN^ {#lp.setType(TYPECAST);} classTypeSpec[true] RPAREN!
            unaryExpressionNotPlusMinus
        |   postfixExpression
        )
    ;
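// Illustration (added examples; identifiers are hypothetical) of how the
// two cast predicates in unaryExpressionNotPlusMinus behave:
//     (int) -x     built-in type cast; the operand may be any unaryExpression
//     (Foo) ~x     class type cast; the operand must be a
//                  unaryExpressionNotPlusMinus, which ~x is
//     (foo) - x    not a cast: MINUS cannot start unaryExpressionNotPlusMinus,
//                  so the predicate fails and this parses as the
//                  parenthesized expression (foo) minus x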
// qualified names, array expressions, method invocation, post inc/dec
postfixExpression
    :
        primaryExpression
        (
            /*
            options {
                // the use of postfixExpression in SUPER_CTOR_CALL adds DOT
                // to the lookahead set, and gives loads of false non-det
                // warnings.
                // shut them off.
                generateAmbigWarnings=false;
            }
        :   */
            // type arguments are only appropriate for parameterized method/ctor invocations
            // semantic check may be needed here to ensure that this is the case
            DOT^ (typeArguments)?
            (   IDENT
                (   lp:LPAREN^ {#lp.setType(METHOD_CALL);}
                    argList
                    RPAREN!
                )?
            |   "super"
                (   // (new Outer()).super() (create enclosing instance)
                    lp3:LPAREN^ argList RPAREN!
                    {#lp3.setType(SUPER_CTOR_CALL);}
                |   DOT^ (typeArguments)? IDENT
                    (   lps:LPAREN^ {#lps.setType(METHOD_CALL);}
                        argList
                        RPAREN!
                    )?
                )
            )
        |   DOT^ "this"
        |   DOT^ newExpression
        |   lb:LBRACK^ {#lb.setType(INDEX_OP);} expression RBRACK!
        )*
        (   // possibly add on a post-increment or post-decrement.
            // allows INC/DEC on too much, but semantics can check
            in:INC^ {#in.setType(POST_INC);}
        |   de:DEC^ {#de.setType(POST_DEC);}
        )?
    ;
// the basic element of an expression
primaryExpression
    :   identPrimary ( options {greedy=true;} : DOT^ "class" )?
    |   constant
    |   "true"
    |   "false"
    |   "null"
    |   newExpression
    |   "this"
    |   "super"
    |   LPAREN! assignmentExpression RPAREN!
    // look for int.class and int[].class
    |   builtInType
        ( lbt:LBRACK^ {#lbt.setType(ARRAY_DECLARATOR);} RBRACK! )*
        DOT^ "class"
    ;

/** Match a, a.b.c refs, a.b.c(...) refs, a.b.c[], a.b.c[].class,
 *  and a.b.c.class refs. Also this(...) and super(...). Match
 *  this or super.
 */
identPrimary
    :   (ta1:typeArguments!)?
        IDENT
        // Syntax for method invocation with type arguments is
        // <String>foo("blah")
        (
            options {
                // .ident could match here or in postfixExpression.
                // We do want to match here. Turn off warning.
                greedy=true;
                // This turns the ambiguity warning of the second alternative
                // off. See below. (The "false" predicate makes it non-issue)
                warnWhenFollowAmbig=false;
            }
            // we have a new nondeterminism because of
            // typeArguments... only a syntactic predicate will help...
            // The problem is that this loop here conflicts with
            // DOT typeArguments "super" in postfixExpression (k=2)
            // A proper solution would require a lot of refactoring...
        :   (DOT (typeArguments)? IDENT) =>
            DOT^ (ta2:typeArguments!)? IDENT
        |   {false}? // FIXME: this is very ugly but it seems to work...
                     // this will also produce an ANTLR warning!
            // Unfortunately a syntactic predicate can only select one of
            // multiple alternatives on the same level, not break out of
            // an enclosing loop, which is why this ugly hack (a fake
            // empty alternative with always-false semantic predicate)
            // is necessary.
        )*
        (
            options {
                // ARRAY_DECLARATOR here conflicts with INDEX_OP in
                // postfixExpression on LBRACK RBRACK.
                // We want to match [] here, so greedy. This overcomes
                // limitation of linear approximate lookahead.
                greedy=true;
            }
        :   (   lp:LPAREN^ {#lp.setType(METHOD_CALL);}
                // if the input is valid, only the last IDENT may
                // have preceding typeArguments... rather hacky, this is...
                {if (#ta2 != null) astFactory.addASTChild(currentAST, #ta2);}
                {if (#ta2 == null) astFactory.addASTChild(currentAST, #ta1);}
                argList RPAREN!
            )
        |   (   options {greedy=true;} :
                lbc:LBRACK^ {#lbc.setType(ARRAY_DECLARATOR);} RBRACK!
            )+
        )?
    ;
/** object instantiation.
 *  Trees are built as illustrated by the following input/tree pairs:
 *
 *  new T()
 *
 *  new
 *   |
 *   T --  ELIST
 *           |
 *          arg1 -- arg2 -- .. -- argn
 *
 *  new int[]
 *
 *  new
 *   |
 *  int -- ARRAY_DECLARATOR
 *
 *  new int[] {1,2}
 *
 *  new
 *   |
 *  int -- ARRAY_DECLARATOR -- ARRAY_INIT
 *                                  |
 *                                EXPR -- EXPR
 *                                  |      |
 *                                  1      2
 *
 *  new int[3]
 *
 *  new
 *   |
 *  int -- ARRAY_DECLARATOR
 *                |
 *              EXPR
 *                |
 *                3
 *
 *  new int[1][2]
 *
 *  new
 *   |
 *  int -- ARRAY_DECLARATOR
 *               |
 *         ARRAY_DECLARATOR -- EXPR
 *               |              |
 *             EXPR             1
 *               |
 *               2
 *
 */
newExpression
    :   "new"^ (typeArguments)? type
        (   LPAREN! argList RPAREN! (classBlock)?
            //java 1.1
            // Note: This will allow bad constructs like
            //    new int[4][][3] {exp,exp}.
            // There needs to be a semantic check here...
            // to make sure:
            //   a) [ expr ] and [ ] are not mixed
            //   b) [ expr ] and an init are not used together
        |   newArrayDeclarator (arrayInitializer)?
        )
    ;

argList
    :   (   expressionList
        |   /*nothing*/
            {#argList = #[ELIST,"ELIST"];}
        )
    ;

newArrayDeclarator
    :   (
            // CONFLICT:
            // newExpression is a primaryExpression which can be
            // followed by an array index reference. This is ok,
            // as the generated code will stay in this loop as
            // long as it sees an LBRACK (proper behavior)
            options {
                warnWhenFollowAmbig = false;
            }
        :
            lb:LBRACK^ {#lb.setType(ARRAY_DECLARATOR);}
            (expression)?
            RBRACK!
        )+
    ;

constant
    :   NUM_INT
    |   CHAR_LITERAL
    |   STRING_LITERAL
    |   NUM_FLOAT
    |   NUM_LONG
    |   NUM_DOUBLE
    ;
//----------------------------------------------------------------------------
// The Java scanner
//----------------------------------------------------------------------------
class JavaLexer extends Lexer;

options {
    exportVocab=Java;     // call the vocabulary "Java"
    testLiterals=false;   // don't automatically test for literals
    k=4;                  // four characters of lookahead
    charVocabulary='\u0003'..'\uFFFF';
    // without inlining some bitset tests, couldn't do unicode;
    // I need to make ANTLR generate smaller bitsets; see
    // bottom of JavaLexer.java
    codeGenBitsetTestThreshold=20;
}

{
    /** flag for enabling the "assert" keyword */
    private boolean assertEnabled = true;
    /** flag for enabling the "enum" keyword */
    private boolean enumEnabled = true;

    /** Enable the "assert" keyword */
    public void enableAssert(boolean shouldEnable) { assertEnabled = shouldEnable; }
    /** Query the "assert" keyword state */
    public boolean isAssertEnabled() { return assertEnabled; }
    /** Enable the "enum" keyword */
    public void enableEnum(boolean shouldEnable) { enumEnabled = shouldEnable; }
    /** Query the "enum" keyword state */
    public boolean isEnumEnabled() { return enumEnabled; }
}
// OPERATORS
QUESTION        :   '?'     ;
LPAREN          :   '('     ;
RPAREN          :   ')'     ;
LBRACK          :   '['     ;
RBRACK          :   ']'     ;
LCURLY          :   '{'     ;
RCURLY          :   '}'     ;
COLON           :   ':'     ;
COMMA           :   ','     ;
//DOT           :   '.'     ;
ASSIGN          :   '='     ;
EQUAL           :   "=="    ;
LNOT            :   '!'     ;
BNOT            :   '~'     ;
NOT_EQUAL       :   "!="    ;
DIV             :   '/'     ;
DIV_ASSIGN      :   "/="    ;
PLUS            :   '+'     ;
PLUS_ASSIGN     :   "+="    ;
INC             :   "++"    ;
MINUS           :   '-'     ;
MINUS_ASSIGN    :   "-="    ;
DEC             :   "--"    ;
STAR            :   '*'     ;
STAR_ASSIGN     :   "*="    ;
MOD             :   '%'     ;
MOD_ASSIGN      :   "%="    ;
SR              :   ">>"    ;
SR_ASSIGN       :   ">>="   ;
BSR             :   ">>>"   ;
BSR_ASSIGN      :   ">>>="  ;
GE              :   ">="    ;
GT              :   ">"     ;
SL              :   "<<"    ;
SL_ASSIGN       :   "<<="   ;
LE              :   "<="    ;
LT              :   '<'     ;
BXOR            :   '^'     ;
BXOR_ASSIGN     :   "^="    ;
BOR             :   '|'     ;
BOR_ASSIGN      :   "|="    ;
LOR             :   "||"    ;
BAND            :   '&'     ;
BAND_ASSIGN     :   "&="    ;
LAND            :   "&&"    ;
SEMI            :   ';'     ;
// Whitespace -- ignored
WS  :   (   ' '
        |   '\t'
        |   '\f'
            // handle newlines
        |   (   options {generateAmbigWarnings=false;}
            :   "\r\n"  // Evil DOS
            |   '\r'    // Macintosh
            |   '\n'    // Unix (the right way)
            )
            { newline(); }
        )+
        { _ttype = Token.SKIP; }
    ;

// Single-line comments
SL_COMMENT
    :   "//"
        (~('\n'|'\r'))* ('\n'|'\r'('\n')?)
        {$setType(Token.SKIP); newline();}
    ;

// multiple-line comments
ML_COMMENT
    :   "/*"
        (   /* '\r' '\n' can be matched in one alternative or by matching
               '\r' in one iteration and '\n' in another. I am trying to
               handle any flavor of newline that comes in, but the language
               that allows both "\r\n" and "\r" and "\n" to all be valid
               newline is ambiguous. Consequently, the resulting grammar
               must be ambiguous. I'm shutting this warning off.
             */
            options {
                generateAmbigWarnings=false;
            }
        :
            { LA(2)!='/' }? '*'
        |   '\r' '\n'       {newline();}
        |   '\r'            {newline();}
        |   '\n'            {newline();}
        |   ~('*'|'\n'|'\r')
        )*
        "*/"
        {$setType(Token.SKIP);}
    ;
// character literals
CHAR_LITERAL
    :   '\'' ( ESC | ~('\''|'\n'|'\r'|'\\') ) '\''
    ;

// string literals
STRING_LITERAL
    :   '"' (ESC|~('"'|'\\'|'\n'|'\r'))* '"'
    ;

// escape sequence -- note that this is protected; it can only be called
// from another lexer rule -- it will not ever directly return a token to
// the parser
// There are various ambiguities hushed in this rule. The optional
// '0'...'9' digit matches should be matched here rather than letting
// them go back to STRING_LITERAL to be matched. ANTLR does the
// right thing by matching immediately; hence, it's ok to shut off
// the FOLLOW ambig warnings.
protected
ESC
    :   '\\'
        (   'n'
        |   'r'
        |   't'
        |   'b'
        |   'f'
        |   '"'
        |   '\''
        |   '\\'
        |   ('u')+ HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT
        |   '0'..'3'
            (
                options {
                    warnWhenFollowAmbig = false;
                }
            :   '0'..'7'
                (
                    options {
                        warnWhenFollowAmbig = false;
                    }
                :   '0'..'7'
                )?
            )?
        |   '4'..'7'
            (
                options {
                    warnWhenFollowAmbig = false;
                }
            :   '0'..'7'
            )?
        )
    ;
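// Illustration (added examples) of how the octal and unicode alternatives
// of ESC divide their input inside a string or character literal:
//     \7        one octal digit
//     \47       first digit in 4..7: at most one more octal digit
//     \377      first digit in 0..3: up to two more octal digits
//     \477      matched as \47, followed by the ordinary character '7'
//     \u0041    unicode escape; ('u')+ also accepts \uu0041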
// hexadecimal digit (again, note it's protected!)
protected
HEX_DIGIT
    :   ('0'..'9'|'A'..'F'|'a'..'f')
    ;

// a dummy rule to force vocabulary to be all characters (except special
// ones that ANTLR uses internally (0 to 2))
protected
VOCAB
    :   '\3'..'\377'
    ;

// an identifier. Note that testLiterals is set to true! This means
// that after we match the rule, we look in the literals table to see
// if it's a literal or really an identifier
IDENT
    options {testLiterals=true;}
    :   ('a'..'z'|'A'..'Z'|'_'|'$') ('a'..'z'|'A'..'Z'|'_'|'0'..'9'|'$')*
        {
            // check if "assert" keyword is enabled
            if (assertEnabled && "assert".equals($getText)) {
                $setType(LITERAL_assert); // set token type for the rule in the parser
            }
            // check if "enum" keyword is enabled
            if (enumEnabled && "enum".equals($getText)) {
                $setType(LITERAL_enum); // set token type for the rule in the parser
            }
        }
    ;
// a numeric literal
NUM_INT
    {boolean isDecimal=false; Token t=null;}
    :   '.' {_ttype = DOT;}
        (
            (   ('0'..'9')+ (EXPONENT)? (f1:FLOAT_SUFFIX {t=f1;})?
                {
                    if (t != null && t.getText().toUpperCase().indexOf('F')>=0) {
                        _ttype = NUM_FLOAT;
                    }
                    else {
                        _ttype = NUM_DOUBLE; // assume double
                    }
                }
            )
        |
            // JDK 1.5 token for variable length arguments
            (".." {_ttype = TRIPLE_DOT;})
        )?
    |   (   '0' {isDecimal = true;} // special case for just '0'
            (   ('x'|'X')
                (   // hex
                    // the 'e'|'E' and float suffix stuff look
                    // like hex digits, hence the (...)+ doesn't
                    // know when to stop: ambig. ANTLR resolves
                    // it correctly by matching immediately. It
                    // is therefore ok to hush warning.
                    options {
                        warnWhenFollowAmbig=false;
                    }
                :   HEX_DIGIT
                )+
            |   //float or double with leading zero
                (('0'..'9')+ ('.'|EXPONENT|FLOAT_SUFFIX)) => ('0'..'9')+
            |   ('0'..'7')+ // octal
            )?
        |   ('1'..'9') ('0'..'9')* {isDecimal=true;} // non-zero decimal
        )
        (   ('l'|'L') { _ttype = NUM_LONG; }
        // only check to see if it's a float if looks like decimal so far
        |   {isDecimal}?
            (   '.' ('0'..'9')* (EXPONENT)? (f2:FLOAT_SUFFIX {t=f2;})?
            |   EXPONENT (f3:FLOAT_SUFFIX {t=f3;})?
            |   f4:FLOAT_SUFFIX {t=f4;}
            )
            {
                if (t != null && t.getText().toUpperCase().indexOf('F') >= 0) {
                    _ttype = NUM_FLOAT;
                }
                else {
                    _ttype = NUM_DOUBLE; // assume double
                }
            }
        )?
    ;
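// Illustration (added examples, traced by hand through the NUM_INT rule
// above rather than taken from the original notes): input text and the
// token type the rule should end up with.
//     .        DOT
//     ...      TRIPLE_DOT
//     .5       NUM_DOUBLE
//     .5f      NUM_FLOAT
//     0        NUM_INT
//     017      NUM_INT      (octal)
//     0x1F     NUM_INT      (hex; 'F' is consumed as a hex digit)
//     42L      NUM_LONG
//     1e3      NUM_DOUBLE
//     3.14     NUM_DOUBLE
//     3.14f    NUM_FLOAT
//     2d       NUM_DOUBLE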
// JDK 1.5 token for annotations and their declarations
AT
    :   '@'
    ;

// a couple protected methods to assist in matching floating point numbers
protected
EXPONENT
    :   ('e'|'E') ('+'|'-')? ('0'..'9')+
    ;

protected
FLOAT_SUFFIX
    :   'f'|'F'|'d'|'D'
    ;