  1. <!DOCTYPE html>
  2. <html>
  3. <head>
  4. <title>lexer.coffee</title>
  5. <meta http-equiv="content-type" content="text/html; charset=UTF-8">
  6. <meta name="viewport" content="width=device-width, target-densitydpi=160dpi, initial-scale=1.0; maximum-scale=1.0; user-scalable=0;">
  7. <link rel="stylesheet" media="all" href="docco.css" />
  8. </head>
  9. <body>
  10. <div id="container">
  11. <div id="background"></div>
  12. <ul id="jump_to">
  13. <li>
  14. <a class="large" href="javascript:void(0);">Jump To &hellip;</a>
  15. <a class="small" href="javascript:void(0);">+</a>
  16. <div id="jump_wrapper">
  17. <div id="jump_page_wrapper">
  18. <div id="jump_page">
  19. <a class="source" href="browser.html">
  20. browser.coffee
  21. </a>
  22. <a class="source" href="cake.html">
  23. cake.coffee
  24. </a>
  25. <a class="source" href="coffee-script.html">
  26. coffee-script.coffee
  27. </a>
  28. <a class="source" href="command.html">
  29. command.coffee
  30. </a>
  31. <a class="source" href="grammar.html">
  32. grammar.coffee
  33. </a>
  34. <a class="source" href="helpers.html">
  35. helpers.coffee
  36. </a>
  37. <a class="source" href="index.html">
  38. index.coffee
  39. </a>
  40. <a class="source" href="lexer.html">
  41. lexer.coffee
  42. </a>
  43. <a class="source" href="nodes.html">
  44. nodes.coffee
  45. </a>
  46. <a class="source" href="optparse.html">
  47. optparse.coffee
  48. </a>
  49. <a class="source" href="register.html">
  50. register.coffee
  51. </a>
  52. <a class="source" href="repl.html">
  53. repl.coffee
  54. </a>
  55. <a class="source" href="rewriter.html">
  56. rewriter.coffee
  57. </a>
  58. <a class="source" href="scope.html">
  59. scope.litcoffee
  60. </a>
  61. <a class="source" href="sourcemap.html">
  62. sourcemap.litcoffee
  63. </a>
  64. </div>
  65. </div>
  66. </li>
  67. </ul>
  68. <ul class="sections">
  69. <li id="title">
  70. <div class="annotation">
  71. <h1>lexer.coffee</h1>
  72. </div>
  73. </li>
  74. <li id="section-1">
  75. <div class="annotation">
  76. <div class="pilwrap ">
  77. <a class="pilcrow" href="#section-1">&#182;</a>
  78. </div>
  79. <p>The CoffeeScript Lexer. Uses a series of token-matching regexes to attempt
  80. matches against the beginning of the source code. When a match is found,
  81. a token is produced, we consume the match, and start again. Tokens are in the
  82. form:</p>
  83. <pre><code>[tag, value, locationData]
  84. </code></pre><p>where locationData is {first_line, first_column, last_line, last_column}, which is a
  85. format that can be fed directly into <a href="http://github.com/zaach/jison">Jison</a>. These
  86. are read by Jison in the <code>parser.lexer</code> function defined in coffee-script.coffee.</p>
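<p>As a rough illustration of the token shape described above, here is a minimal TypeScript sketch. The names <code>LocationData</code>, <code>LexToken</code> and <code>makeToken</code> are hypothetical, and the sketch assumes a single-line value whose <code>last_column</code> is inclusive; the lexer's own token-building code is not shown in this excerpt.</p>

```typescript
// Hypothetical names; the source only gives the shape [tag, value, locationData].
interface LocationData {
  first_line: number;
  first_column: number;
  last_line: number;
  last_column: number;
}

type LexToken = [string, string, LocationData];

// Build a token for a value that starts at (line, column).
// Assumption: the value contains no newline and last_column is inclusive.
function makeToken(tag: string, value: string, line: number, column: number): LexToken {
  return [tag, value, {
    first_line: line,
    first_column: column,
    last_line: line,
    last_column: column + Math.max(value.length - 1, 0),
  }];
}

const token = makeToken("IDENTIFIER", "answer", 0, 4);
```

<p>Here <code>token[2]</code> holds the location record, so a six-character identifier starting at column 4 ends at column 9.</p>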
  87. </div>
  88. <div class="content"><div class='highlight'><pre>
  89. {Rewriter, INVERSES} = <span class="hljs-built_in">require</span> <span class="hljs-string">'./rewriter'</span></pre></div></div>
  90. </li>
  91. <li id="section-2">
  92. <div class="annotation">
  93. <div class="pilwrap ">
  94. <a class="pilcrow" href="#section-2">&#182;</a>
  95. </div>
  96. <p>Import the helpers we need.</p>
  97. </div>
  98. <div class="content"><div class='highlight'><pre>{count, starts, compact, repeat, invertLiterate,
  99. locationDataToString, throwSyntaxError} = <span class="hljs-built_in">require</span> <span class="hljs-string">'./helpers'</span></pre></div></div>
  100. </li>
  101. <li id="section-3">
  102. <div class="annotation">
  103. <div class="pilwrap ">
  104. <a class="pilcrow" href="#section-3">&#182;</a>
  105. </div>
  106. <h2 id="the-lexer-class">The Lexer Class</h2>
  107. </div>
  108. </li>
  109. <li id="section-4">
  110. <div class="annotation">
  111. <div class="pilwrap ">
  112. <a class="pilcrow" href="#section-4">&#182;</a>
  113. </div>
  114. </div>
  115. </li>
  116. <li id="section-5">
  117. <div class="annotation">
  118. <div class="pilwrap ">
  119. <a class="pilcrow" href="#section-5">&#182;</a>
  120. </div>
  121. <p>The Lexer class reads a stream of CoffeeScript and divvies it up into tagged
  122. tokens. Some potential ambiguity in the grammar has been avoided by
  123. pushing some extra smarts into the Lexer.</p>
  124. </div>
  125. <div class="content"><div class='highlight'><pre><span class="hljs-built_in">exports</span>.Lexer = <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Lexer</span></span></pre></div></div>
  126. </li>
  127. <li id="section-6">
  128. <div class="annotation">
  129. <div class="pilwrap ">
  130. <a class="pilcrow" href="#section-6">&#182;</a>
  131. </div>
  132. <p><strong>tokenize</strong> is the Lexer's main method. Scan by attempting to match tokens
  133. one at a time, using a regular expression anchored at the start of the
  134. remaining code, or a custom recursive token-matching method
  135. (for interpolations). When the next token has been recorded, we move forward
  136. within the code past the token, and begin again.</p>
  137. <p>Each tokenizing method is responsible for returning the number of characters
  138. it has consumed.</p>
  139. <p>Before returning the token stream, run it through the <a href="rewriter.html">Rewriter</a>.</p>
  140. </div>
  141. <div class="content"><div class='highlight'><pre> <span class="hljs-attribute">tokenize</span>: <span class="hljs-function"><span class="hljs-params">(code, opts = {})</span> -&gt;</span>
  142. <span class="hljs-property">@literate</span> = opts.literate <span class="hljs-comment"># Are we lexing literate CoffeeScript?</span>
  143. <span class="hljs-property">@indent</span> = <span class="hljs-number">0</span> <span class="hljs-comment"># The current indentation level.</span>
  144. <span class="hljs-property">@baseIndent</span> = <span class="hljs-number">0</span> <span class="hljs-comment"># The overall minimum indentation level</span>
  145. <span class="hljs-property">@indebt</span> = <span class="hljs-number">0</span> <span class="hljs-comment"># The over-indentation at the current level.</span>
  146. <span class="hljs-property">@outdebt</span> = <span class="hljs-number">0</span> <span class="hljs-comment"># The under-outdentation at the current level.</span>
  147. <span class="hljs-property">@indents</span> = [] <span class="hljs-comment"># The stack of all current indentation levels.</span>
  148. <span class="hljs-property">@ends</span> = [] <span class="hljs-comment"># The stack for pairing up tokens.</span>
  149. <span class="hljs-property">@tokens</span> = [] <span class="hljs-comment"># Stream of parsed tokens in the form `['TYPE', value, location data]`.</span>
  150. <span class="hljs-property">@seenFor</span> = <span class="hljs-literal">no</span> <span class="hljs-comment"># Used to recognize FORIN and FOROF tokens.</span>
  151. <span class="hljs-property">@chunkLine</span> =
  152. opts.line <span class="hljs-keyword">or</span> <span class="hljs-number">0</span> <span class="hljs-comment"># The start line for the current @chunk.</span>
  153. <span class="hljs-property">@chunkColumn</span> =
  154. opts.column <span class="hljs-keyword">or</span> <span class="hljs-number">0</span> <span class="hljs-comment"># The start column of the current @chunk.</span>
  155. code = <span class="hljs-property">@clean</span> code <span class="hljs-comment"># The stripped, cleaned original source code.</span></pre></div></div>
  156. </li>
  157. <li id="section-7">
  158. <div class="annotation">
  159. <div class="pilwrap ">
  160. <a class="pilcrow" href="#section-7">&#182;</a>
  161. </div>
  162. <p>At every position, run through this list of attempted matches,
  163. short-circuiting if any of them succeed. Their order determines precedence:
  164. <code>@literalToken</code> is the fallback catch-all.</p>
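<p>The scanning loop can be sketched in isolation. The following TypeScript is a simplified illustration, not the lexer's code: the matcher names and their regexes are assumptions, and only four matchers are modelled. Each matcher returns the number of characters it consumed (0 for no match), the first success wins, and the last matcher always consumes one character so the loop terminates.</p>

```typescript
// Each matcher returns the number of characters consumed, or 0 on no match.
type Matcher = (chunk: string, out: string[][]) => number;

const identifier: Matcher = (chunk, out) => {
  const m = /^[A-Za-z_$][\w$]*/.exec(chunk);
  if (m === null) return 0;
  out.push(["IDENTIFIER", m[0]]);
  return m[0].length;
};

// Whitespace is consumed but produces no token.
const whitespace: Matcher = (chunk) => {
  const m = /^[ \t]+/.exec(chunk);
  return m === null ? 0 : m[0].length;
};

const numberLiteral: Matcher = (chunk, out) => {
  const m = /^\d+/.exec(chunk);
  if (m === null) return 0;
  out.push(["NUMBER", m[0]]);
  return m[0].length;
};

// Fallback catch-all: always consumes exactly one character.
const literal: Matcher = (chunk, out) => {
  const ch = chunk.charAt(0);
  out.push([ch, ch]);
  return 1;
};

// Order determines precedence.
const matchers: Matcher[] = [identifier, whitespace, numberLiteral, literal];

function scan(code: string): string[][] {
  const out: string[][] = [];
  let i = 0;
  while (i < code.length) {
    const chunk = code.slice(i); // the remaining, unconsumed source
    let consumed = 0;
    for (const match of matchers) {
      consumed = match(chunk, out);
      if (consumed > 0) break; // short-circuit on the first success
    }
    i += consumed;
  }
  return out;
}

const scanned = scan("x = 42");
```

<p>Scanning <code>x = 42</code> this way yields an identifier token, a literal <code>=</code> token from the fallback, and a number token.</p>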
  165. </div>
  166. <div class="content"><div class='highlight'><pre> i = <span class="hljs-number">0</span>
  167. <span class="hljs-keyword">while</span> <span class="hljs-property">@chunk</span> = code[i..]
  168. consumed = \
  169. <span class="hljs-property">@identifierToken</span>() <span class="hljs-keyword">or</span>
  170. <span class="hljs-property">@commentToken</span>() <span class="hljs-keyword">or</span>
  171. <span class="hljs-property">@whitespaceToken</span>() <span class="hljs-keyword">or</span>
  172. <span class="hljs-property">@lineToken</span>() <span class="hljs-keyword">or</span>
  173. <span class="hljs-property">@stringToken</span>() <span class="hljs-keyword">or</span>
  174. <span class="hljs-property">@numberToken</span>() <span class="hljs-keyword">or</span>
  175. <span class="hljs-property">@regexToken</span>() <span class="hljs-keyword">or</span>
  176. <span class="hljs-property">@jsToken</span>() <span class="hljs-keyword">or</span>
  177. <span class="hljs-property">@literalToken</span>()</pre></div></div>
  178. </li>
  179. <li id="section-8">
  180. <div class="annotation">
  181. <div class="pilwrap ">
  182. <a class="pilcrow" href="#section-8">&#182;</a>
  183. </div>
  184. <p>Advance the position by the number of characters consumed.</p>
  185. </div>
  186. <div class="content"><div class='highlight'><pre> [<span class="hljs-property">@chunkLine</span>, <span class="hljs-property">@chunkColumn</span>] = <span class="hljs-property">@getLineAndColumnFromChunk</span> consumed
  187. i += consumed
  188. <span class="hljs-keyword">return</span> {<span class="hljs-property">@tokens</span>, <span class="hljs-attribute">index</span>: i} <span class="hljs-keyword">if</span> opts.untilBalanced <span class="hljs-keyword">and</span> <span class="hljs-property">@ends</span>.length <span class="hljs-keyword">is</span> <span class="hljs-number">0</span>
  189. <span class="hljs-property">@closeIndentation</span>()
  190. <span class="hljs-property">@error</span> <span class="hljs-string">"missing <span class="hljs-subst">#{end.tag}</span>"</span>, end.origin[<span class="hljs-number">2</span>] <span class="hljs-keyword">if</span> end = <span class="hljs-property">@ends</span>.pop()
  191. <span class="hljs-keyword">return</span> <span class="hljs-property">@tokens</span> <span class="hljs-keyword">if</span> opts.rewrite <span class="hljs-keyword">is</span> <span class="hljs-literal">off</span>
  192. (<span class="hljs-keyword">new</span> Rewriter).rewrite <span class="hljs-property">@tokens</span></pre></div></div>
  193. </li>
  194. <li id="section-9">
  195. <div class="annotation">
  196. <div class="pilwrap ">
  197. <a class="pilcrow" href="#section-9">&#182;</a>
  198. </div>
  199. <p>Preprocess the code to remove leading and trailing whitespace, carriage
  200. returns, etc. If we're lexing literate CoffeeScript, strip external Markdown
  201. by removing all lines that aren't indented by at least four spaces or a tab.</p>
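<p>A hedged TypeScript sketch of this preprocessing follows. It assumes that trailing whitespace is matched by <code>/\s+$/</code> (the <code>TRAILING_SPACES</code> constant is not shown in this excerpt). The literate step follows the description above rather than the real <code>invertLiterate</code> helper, whose body is not shown here. Both function names are hypothetical, and the choice to blank out Markdown lines (so that line numbers stay stable) is an assumption of the sketch.</p>

```typescript
const BOM = 0xfeff; // code unit of the byte order mark

// Strip a leading BOM, all carriage returns, and trailing whitespace.
// Assumption: /\s+$/ stands in for the TRAILING_SPACES constant.
function cleanSketch(code: string): string {
  if (code.charCodeAt(0) === BOM) code = code.slice(1);
  return code.replace(/\r/g, "").replace(/\s+$/, "");
}

// Keep only lines indented by at least four spaces or a tab.
// Assumption: other lines are blanked rather than deleted.
function stripMarkdownSketch(code: string): string {
  return code
    .split("\n")
    .map((line) => (/^( {4}|\t)/.test(line) ? line : ""))
    .join("\n");
}

const cleaned = cleanSketch("\uFEFFa = 1\r\nb = 2  \r\n");
const literate = stripMarkdownSketch("Title\n\n    x = 1\n\ty = 2");
```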
  202. </div>
  203. <div class="content"><div class='highlight'><pre> <span class="hljs-attribute">clean</span>: <span class="hljs-function"><span class="hljs-params">(code)</span> -&gt;</span>
  204. code = code.slice(<span class="hljs-number">1</span>) <span class="hljs-keyword">if</span> code.charCodeAt(<span class="hljs-number">0</span>) <span class="hljs-keyword">is</span> BOM
  205. code = code.replace(<span class="hljs-regexp">/\r/g</span>, <span class="hljs-string">''</span>).replace TRAILING_SPACES, <span class="hljs-string">''</span>
  206. <span class="hljs-keyword">if</span> WHITESPACE.test code
  207. code = <span class="hljs-string">"\n<span class="hljs-subst">#{code}</span>"</span>
  208. <span class="hljs-property">@chunkLine</span>--
  209. code = invertLiterate code <span class="hljs-keyword">if</span> <span class="hljs-property">@literate</span>
  210. code</pre></div></div>
  211. </li>
  212. <li id="section-10">
  213. <div class="annotation">
  214. <div class="pilwrap ">
  215. <a class="pilcrow" href="#section-10">&#182;</a>
  216. </div>
  217. <h2 id="tokenizers">Tokenizers</h2>
  218. </div>
  219. </li>
  220. <li id="section-11">
  221. <div class="annotation">
  222. <div class="pilwrap ">
  223. <a class="pilcrow" href="#section-11">&#182;</a>
  224. </div>
  225. </div>
  226. </li>
  227. <li id="section-12">
  228. <div class="annotation">
  229. <div class="pilwrap ">
  230. <a class="pilcrow" href="#section-12">&#182;</a>
  231. </div>
  232. <p>Matches identifying literals: variables, keywords, method names, etc.
  233. Check to ensure that JavaScript reserved words aren't being used as
  234. identifiers. Because CoffeeScript reserves a handful of keywords that are
  235. allowed in JavaScript, we're careful not to tag them as keywords when
  236. referenced as property names here, so you can still do <code>jQuery.is()</code> even
  237. though <code>is</code> means <code>===</code> otherwise.</p>
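<p>The property-name exception can be sketched on its own. The TypeScript below mirrors the <code>forcedIdentifier</code> test from the method that follows: a word is treated as a plain identifier when it is followed by a colon, when it follows a property accessor, or when it directly follows <code>@</code>. The <code>PrevToken</code> shape, the function names, and the three-word keyword list are simplifications assumed for illustration.</p>

```typescript
// Simplified view of the previous token (hypothetical shape).
interface PrevToken {
  tag: string;
  spaced: boolean; // true when whitespace followed the token
}

const ACCESSORS = [".", "?.", "::", "?::"];

function isForcedIdentifier(prev: PrevToken | undefined, followedByColon: boolean): boolean {
  if (followedByColon) return true; // object key, as in `if: 1`
  if (prev === undefined) return false;
  if (ACCESSORS.indexOf(prev.tag) >= 0) return true; // property access, as in `obj.if`
  if (prev.spaced) return false;
  return prev.tag === "@"; // `@if`, with no space after the `@`
}

// Tiny stand-in for the real keyword lists (assumption).
const KEYWORDS = ["if", "for", "while"];

function tagFor(id: string, prev: PrevToken | undefined, followedByColon: boolean): string {
  if (isForcedIdentifier(prev, followedByColon)) return "IDENTIFIER";
  return KEYWORDS.indexOf(id) >= 0 ? id.toUpperCase() : "IDENTIFIER";
}

const asProperty = tagFor("if", { tag: ".", spaced: false }, false);
const asKeyword = tagFor("if", undefined, false);
```

<p>In this sketch <code>asProperty</code> is tagged as an identifier while <code>asKeyword</code> is tagged as <code>IF</code>. The real method goes on to handle aliases, reserved words and relation operators, which are left out here.</p>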
  238. </div>
  239. <div class="content"><div class='highlight'><pre> <span class="hljs-attribute">identifierToken</span>:<span class="hljs-function"> -&gt;</span>
  240. <span class="hljs-keyword">return</span> <span class="hljs-number">0</span> <span class="hljs-keyword">unless</span> match = IDENTIFIER.exec <span class="hljs-property">@chunk</span>
  241. [input, id, colon] = match</pre></div></div>
  242. </li>
  243. <li id="section-13">
  244. <div class="annotation">
  245. <div class="pilwrap ">
  246. <a class="pilcrow" href="#section-13">&#182;</a>
  247. </div>
  248. <p>Preserve length of id for location data</p>
  249. </div>
  250. <div class="content"><div class='highlight'><pre> idLength = id.length
  251. poppedToken = <span class="hljs-literal">undefined</span>
  252. <span class="hljs-keyword">if</span> id <span class="hljs-keyword">is</span> <span class="hljs-string">'own'</span> <span class="hljs-keyword">and</span> <span class="hljs-property">@tag</span>() <span class="hljs-keyword">is</span> <span class="hljs-string">'FOR'</span>
  253. <span class="hljs-property">@token</span> <span class="hljs-string">'OWN'</span>, id
  254. <span class="hljs-keyword">return</span> id.length
  255. <span class="hljs-keyword">if</span> id <span class="hljs-keyword">is</span> <span class="hljs-string">'from'</span> <span class="hljs-keyword">and</span> <span class="hljs-property">@tag</span>() <span class="hljs-keyword">is</span> <span class="hljs-string">'YIELD'</span>
  256. <span class="hljs-property">@token</span> <span class="hljs-string">'FROM'</span>, id
  257. <span class="hljs-keyword">return</span> id.length
  258. [..., prev] = <span class="hljs-property">@tokens</span>
  259. forcedIdentifier = colon <span class="hljs-keyword">or</span> prev? <span class="hljs-keyword">and</span>
  260. (prev[<span class="hljs-number">0</span>] <span class="hljs-keyword">in</span> [<span class="hljs-string">'.'</span>, <span class="hljs-string">'?.'</span>, <span class="hljs-string">'::'</span>, <span class="hljs-string">'?::'</span>] <span class="hljs-keyword">or</span>
  261. <span class="hljs-keyword">not</span> prev.spaced <span class="hljs-keyword">and</span> prev[<span class="hljs-number">0</span>] <span class="hljs-keyword">is</span> <span class="hljs-string">'@'</span>)
  262. tag = <span class="hljs-string">'IDENTIFIER'</span>
  263. <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> forcedIdentifier <span class="hljs-keyword">and</span> (id <span class="hljs-keyword">in</span> JS_KEYWORDS <span class="hljs-keyword">or</span> id <span class="hljs-keyword">in</span> COFFEE_KEYWORDS)
  264. tag = id.toUpperCase()
  265. <span class="hljs-keyword">if</span> tag <span class="hljs-keyword">is</span> <span class="hljs-string">'WHEN'</span> <span class="hljs-keyword">and</span> <span class="hljs-property">@tag</span>() <span class="hljs-keyword">in</span> LINE_BREAK
  266. tag = <span class="hljs-string">'LEADING_WHEN'</span>
  267. <span class="hljs-keyword">else</span> <span class="hljs-keyword">if</span> tag <span class="hljs-keyword">is</span> <span class="hljs-string">'FOR'</span>
  268. <span class="hljs-property">@seenFor</span> = <span class="hljs-literal">yes</span>
  269. <span class="hljs-keyword">else</span> <span class="hljs-keyword">if</span> tag <span class="hljs-keyword">is</span> <span class="hljs-string">'UNLESS'</span>
  270. tag = <span class="hljs-string">'IF'</span>
  271. <span class="hljs-keyword">else</span> <span class="hljs-keyword">if</span> tag <span class="hljs-keyword">in</span> UNARY
  272. tag = <span class="hljs-string">'UNARY'</span>
  273. <span class="hljs-keyword">else</span> <span class="hljs-keyword">if</span> tag <span class="hljs-keyword">in</span> RELATION
  274. <span class="hljs-keyword">if</span> tag <span class="hljs-keyword">isnt</span> <span class="hljs-string">'INSTANCEOF'</span> <span class="hljs-keyword">and</span> <span class="hljs-property">@seenFor</span>
  275. tag = <span class="hljs-string">'FOR'</span> + tag
  276. <span class="hljs-property">@seenFor</span> = <span class="hljs-literal">no</span>
  277. <span class="hljs-keyword">else</span>
  278. tag = <span class="hljs-string">'RELATION'</span>
  279. <span class="hljs-keyword">if</span> <span class="hljs-property">@value</span>() <span class="hljs-keyword">is</span> <span class="hljs-string">'!'</span>
  280. poppedToken = <span class="hljs-property">@tokens</span>.pop()
  281. id = <span class="hljs-string">'!'</span> + id
  282. <span class="hljs-keyword">if</span> id <span class="hljs-keyword">in</span> JS_FORBIDDEN
  283. <span class="hljs-keyword">if</span> forcedIdentifier
  284. tag = <span class="hljs-string">'IDENTIFIER'</span>
  285. id = <span class="hljs-keyword">new</span> String id
  286. id.reserved = <span class="hljs-literal">yes</span>
  287. <span class="hljs-keyword">else</span> <span class="hljs-keyword">if</span> id <span class="hljs-keyword">in</span> RESERVED
  288. <span class="hljs-property">@error</span> <span class="hljs-string">"reserved word '<span class="hljs-subst">#{id}</span>'"</span>, <span class="hljs-attribute">length</span>: id.length
  289. <span class="hljs-keyword">unless</span> forcedIdentifier
  290. <span class="hljs-keyword">if</span> id <span class="hljs-keyword">in</span> COFFEE_ALIASES
  291. alias = id
  292. id = COFFEE_ALIAS_MAP[id]
  293. tag = <span class="hljs-keyword">switch</span> id
  294. <span class="hljs-keyword">when</span> <span class="hljs-string">'!'</span> <span class="hljs-keyword">then</span> <span class="hljs-string">'UNARY'</span>
  295. <span class="hljs-keyword">when</span> <span class="hljs-string">'=='</span>, <span class="hljs-string">'!='</span> <span class="hljs-keyword">then</span> <span class="hljs-string">'COMPARE'</span>
  296. <span class="hljs-keyword">when</span> <span class="hljs-string">'&amp;&amp;'</span>, <span class="hljs-string">'||'</span> <span class="hljs-keyword">then</span> <span class="hljs-string">'LOGIC'</span>
  297. <span class="hljs-keyword">when</span> <span class="hljs-string">'true'</span>, <span class="hljs-string">'false'</span> <span class="hljs-keyword">then</span> <span class="hljs-string">'BOOL'</span>
  298. <span class="hljs-keyword">when</span> <span class="hljs-string">'break'</span>, <span class="hljs-string">'continue'</span> <span class="hljs-keyword">then</span> <span class="hljs-string">'STATEMENT'</span>
  299. <span class="hljs-keyword">else</span> tag
  300. tagToken = <span class="hljs-property">@token</span> tag, id, <span class="hljs-number">0</span>, idLength
  301. tagToken.origin = [tag, alias, tagToken[<span class="hljs-number">2</span>]] <span class="hljs-keyword">if</span> alias
  302. tagToken.variable = <span class="hljs-keyword">not</span> forcedIdentifier
  303. <span class="hljs-keyword">if</span> poppedToken
  304. [tagToken[<span class="hljs-number">2</span>].first_line, tagToken[<span class="hljs-number">2</span>].first_column] =
  305. [poppedToken[<span class="hljs-number">2</span>].first_line, poppedToken[<span class="hljs-number">2</span>].first_column]
  306. <span class="hljs-keyword">if</span> colon
  307. colonOffset = input.lastIndexOf <span class="hljs-string">':'</span>
  308. <span class="hljs-property">@token</span> <span class="hljs-string">':'</span>, <span class="hljs-string">':'</span>, colonOffset, colon.length
  309. input.length</pre></div></div>
  310. </li>
  311. <li id="section-14">
  312. <div class="annotation">
  313. <div class="pilwrap ">
  314. <a class="pilcrow" href="#section-14">&#182;</a>
  315. </div>
  316. <p>Matches numbers, including decimals, hex, and exponential notation.
  317. Be careful not to interfere with ranges-in-progress.</p>
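<p>The validation and normalization steps in the method below can be tried out separately. This TypeScript sketch reuses the regular expressions that appear in the method; the function names and the shortened error messages are hypothetical. Octal (<code>0o</code>) and binary (<code>0b</code>) literals are rewritten as hexadecimal, as in the source.</p>

```typescript
// Return a short error message for a rejected literal, or null if it is accepted.
function numberError(literal: string): string | null {
  if (/^0[BOX]/.test(literal)) return "radix prefix must be lowercase";
  if (/E/.test(literal)) {
    // An uppercase E is only legal as a hex digit.
    if (!/^0x/.test(literal)) return "exponent must use a lowercase 'e'";
  }
  if (/^0\d*[89]/.test(literal)) return "decimal literal must not be prefixed with '0'";
  if (/^0\d+/.test(literal)) return "octal literal must be prefixed with '0o'";
  return null;
}

// Rewrite 0o... and 0b... literals as hexadecimal; leave others untouched.
function normalizeNumber(literal: string): string {
  const octal = /^0o([0-7]+)/.exec(literal);
  if (octal !== null) return "0x" + parseInt(octal[1], 8).toString(16);
  const binary = /^0b([01]+)/.exec(literal);
  if (binary !== null) return "0x" + parseInt(binary[1], 2).toString(16);
  return literal;
}

const fromOctal = normalizeNumber("0o777");
const fromBinary = normalizeNumber("0b1010");
```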
  318. </div>
  319. <div class="content"><div class='highlight'><pre> <span class="hljs-attribute">numberToken</span>:<span class="hljs-function"> -&gt;</span>
  320. <span class="hljs-keyword">return</span> <span class="hljs-number">0</span> <span class="hljs-keyword">unless</span> match = NUMBER.exec <span class="hljs-property">@chunk</span>
  321. number = match[<span class="hljs-number">0</span>]
  322. lexedLength = number.length
  323. <span class="hljs-keyword">if</span> <span class="hljs-regexp">/^0[BOX]/</span>.test number
  324. <span class="hljs-property">@error</span> <span class="hljs-string">"radix prefix in '<span class="hljs-subst">#{number}</span>' must be lowercase"</span>, <span class="hljs-attribute">offset</span>: <span class="hljs-number">1</span>
  325. <span class="hljs-keyword">else</span> <span class="hljs-keyword">if</span> <span class="hljs-regexp">/E/</span>.test(number) <span class="hljs-keyword">and</span> <span class="hljs-keyword">not</span> <span class="hljs-regexp">/^0x/</span>.test number
  326. <span class="hljs-property">@error</span> <span class="hljs-string">"exponential notation in '<span class="hljs-subst">#{number}</span>' must be indicated with a lowercase 'e'"</span>,
  327. <span class="hljs-attribute">offset</span>: number.indexOf(<span class="hljs-string">'E'</span>)
  328. <span class="hljs-keyword">else</span> <span class="hljs-keyword">if</span> <span class="hljs-regexp">/^0\d*[89]/</span>.test number
  329. <span class="hljs-property">@error</span> <span class="hljs-string">"decimal literal '<span class="hljs-subst">#{number}</span>' must not be prefixed with '0'"</span>, <span class="hljs-attribute">length</span>: lexedLength
  330. <span class="hljs-keyword">else</span> <span class="hljs-keyword">if</span> <span class="hljs-regexp">/^0\d+/</span>.test number
  331. <span class="hljs-property">@error</span> <span class="hljs-string">"octal literal '<span class="hljs-subst">#{number}</span>' must be prefixed with '0o'"</span>, <span class="hljs-attribute">length</span>: lexedLength
  332. <span class="hljs-keyword">if</span> octalLiteral = <span class="hljs-regexp">/^0o([0-7]+)/</span>.exec number
  333. number = <span class="hljs-string">'0x'</span> + parseInt(octalLiteral[<span class="hljs-number">1</span>], <span class="hljs-number">8</span>).toString <span class="hljs-number">16</span>
  334. <span class="hljs-keyword">if</span> binaryLiteral = <span class="hljs-regexp">/^0b([01]+)/</span>.exec number
  335. number = <span class="hljs-string">'0x'</span> + parseInt(binaryLiteral[<span class="hljs-number">1</span>], <span class="hljs-number">2</span>).toString <span class="hljs-number">16</span>
  336. <span class="hljs-property">@token</span> <span class="hljs-string">'NUMBER'</span>, number, <span class="hljs-number">0</span>, lexedLength
  337. lexedLength</pre></div></div>
  338. </li>
  339. <li id="section-15">
  340. <div class="annotation">
  341. <div class="pilwrap ">
  342. <a class="pilcrow" href="#section-15">&#182;</a>
  343. </div>
  344. <p>Matches strings, including multi-line strings, as well as heredocs, with or without
  345. interpolation.</p>
  346. </div>
  347. <div class="content"><div class='highlight'><pre> <span class="hljs-attribute">stringToken</span>:<span class="hljs-function"> -&gt;</span>
  348. [quote] = STRING_START.exec(<span class="hljs-property">@chunk</span>) || []
  349. <span class="hljs-keyword">return</span> <span class="hljs-number">0</span> <span class="hljs-keyword">unless</span> quote
  350. regex = <span class="hljs-keyword">switch</span> quote
  351. <span class="hljs-keyword">when</span> <span class="hljs-string">"'"</span> <span class="hljs-keyword">then</span> STRING_SINGLE
  352. <span class="hljs-keyword">when</span> <span class="hljs-string">'"'</span> <span class="hljs-keyword">then</span> STRING_DOUBLE
  353. <span class="hljs-keyword">when</span> <span class="hljs-string">"'''"</span> <span class="hljs-keyword">then</span> HEREDOC_SINGLE
  354. <span class="hljs-keyword">when</span> <span class="hljs-string">'"""'</span> <span class="hljs-keyword">then</span> HEREDOC_DOUBLE
  355. heredoc = quote.length <span class="hljs-keyword">is</span> <span class="hljs-number">3</span>
  356. {tokens, <span class="hljs-attribute">index</span>: end} = <span class="hljs-property">@matchWithInterpolations</span> regex, quote
  357. $ = tokens.length - <span class="hljs-number">1</span>
  358. delimiter = quote.charAt(<span class="hljs-number">0</span>)
  359. <span class="hljs-keyword">if</span> heredoc</pre></div></div>
  360. </li>
  361. <li id="section-16">
  362. <div class="annotation">
  363. <div class="pilwrap ">
  364. <a class="pilcrow" href="#section-16">&#182;</a>
  365. </div>
  366. <p>Find the smallest indentation. It will be removed from all lines later.</p>
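<p>A minimal TypeScript sketch of this indentation step follows. It assumes a pattern for <code>HEREDOC_INDENT</code> (the constant is not shown in this excerpt) that captures the leading whitespace of each non-blank line after a newline. The function names are hypothetical.</p>

```typescript
// Find the shortest non-empty indentation among the lines of a heredoc body.
function smallestIndent(doc: string): string | null {
  // Assumed stand-in for HEREDOC_INDENT.
  const pattern = /\n+([^\n\S]*)(?=\S)/g;
  let indent: string | null = null;
  let match = pattern.exec(doc);
  while (match !== null) {
    const attempt = match[1];
    if (indent === null) {
      indent = attempt;
    } else if (attempt.length > 0) {
      if (indent.length > attempt.length) indent = attempt;
    }
    match = pattern.exec(doc);
  }
  return indent;
}

// Remove that indentation from the start of every line.
function dedent(doc: string): string {
  const indent = smallestIndent(doc);
  if (indent === null || indent === "") return doc;
  // The indent consists only of spaces and tabs, so it is safe inside a RegExp.
  return doc.replace(new RegExp("^" + indent, "gm"), "");
}

const dedented = dedent("\n    a\n      b\n    c\n");
```

<p>Only the common four-space prefix is removed, so the deeper line keeps its two extra spaces of relative indentation.</p>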
  367. </div>
  368. <div class="content"><div class='highlight'><pre> indent = <span class="hljs-literal">null</span>
  369. doc = (token[<span class="hljs-number">1</span>] <span class="hljs-keyword">for</span> token, i <span class="hljs-keyword">in</span> tokens <span class="hljs-keyword">when</span> token[<span class="hljs-number">0</span>] <span class="hljs-keyword">is</span> <span class="hljs-string">'NEOSTRING'</span>).join <span class="hljs-string">'#{}'</span>
  370. <span class="hljs-keyword">while</span> match = HEREDOC_INDENT.exec doc
  371. attempt = match[<span class="hljs-number">1</span>]
  372. indent = attempt <span class="hljs-keyword">if</span> indent <span class="hljs-keyword">is</span> <span class="hljs-literal">null</span> <span class="hljs-keyword">or</span> <span class="hljs-number">0</span> &lt; attempt.length &lt; indent.length
  373. indentRegex = <span class="hljs-regexp">/// ^<span class="hljs-subst">#{indent}</span> ///</span>gm <span class="hljs-keyword">if</span> indent
  374. <span class="hljs-property">@mergeInterpolationTokens</span> tokens, {delimiter}, <span class="hljs-function"><span class="hljs-params">(value, i)</span> =&gt;</span>
  375. value = <span class="hljs-property">@formatString</span> value
  376. value = value.replace LEADING_BLANK_LINE, <span class="hljs-string">''</span> <span class="hljs-keyword">if</span> i <span class="hljs-keyword">is</span> <span class="hljs-number">0</span>
  377. value = value.replace TRAILING_BLANK_LINE, <span class="hljs-string">''</span> <span class="hljs-keyword">if</span> i <span class="hljs-keyword">is</span> $
  378. value = value.replace indentRegex, <span class="hljs-string">''</span> <span class="hljs-keyword">if</span> indentRegex
  379. value
  380. <span class="hljs-keyword">else</span>
  381. <span class="hljs-property">@mergeInterpolationTokens</span> tokens, {delimiter}, <span class="hljs-function"><span class="hljs-params">(value, i)</span> =&gt;</span>
  382. value = <span class="hljs-property">@formatString</span> value
  383. value = value.replace SIMPLE_STRING_OMIT, <span class="hljs-function"><span class="hljs-params">(match, offset)</span> -&gt;</span>
  384. <span class="hljs-keyword">if</span> (i <span class="hljs-keyword">is</span> <span class="hljs-number">0</span> <span class="hljs-keyword">and</span> offset <span class="hljs-keyword">is</span> <span class="hljs-number">0</span>) <span class="hljs-keyword">or</span>
  385. (i <span class="hljs-keyword">is</span> $ <span class="hljs-keyword">and</span> offset + match.length <span class="hljs-keyword">is</span> value.length)
  386. <span class="hljs-string">''</span>
  387. <span class="hljs-keyword">else</span>
  388. <span class="hljs-string">' '</span>
  389. value
  390. end</pre></div></div>
  391. </li>
  392. <li id="section-17">
  393. <div class="annotation">
  394. <div class="pilwrap ">
  395. <a class="pilcrow" href="#section-17">&#182;</a>
  396. </div>
  397. <p>Matches and consumes comments.</p>
  398. </div>
  399. <div class="content"><div class='highlight'><pre> <span class="hljs-attribute">commentToken</span>:<span class="hljs-function"> -&gt;</span>
  400. <span class="hljs-keyword">return</span> <span class="hljs-number">0</span> <span class="hljs-keyword">unless</span> match = <span class="hljs-property">@chunk</span>.match COMMENT
  401. [comment, here] = match
  402. <span class="hljs-keyword">if</span> here
  403. <span class="hljs-keyword">if</span> match = HERECOMMENT_ILLEGAL.exec comment
  404. <span class="hljs-property">@error</span> <span class="hljs-string">"block comments cannot contain <span class="hljs-subst">#{match[<span class="hljs-number">0</span>]}</span>"</span>,
  405. <span class="hljs-attribute">offset</span>: match.index, <span class="hljs-attribute">length</span>: match[<span class="hljs-number">0</span>].length
  406. <span class="hljs-keyword">if</span> here.indexOf(<span class="hljs-string">'\n'</span>) &gt;= <span class="hljs-number">0</span>
  407. here = here.replace <span class="hljs-regexp">/// \n <span class="hljs-subst">#{repeat <span class="hljs-string">' '</span>, <span class="hljs-property">@indent</span>}</span> ///</span>g, <span class="hljs-string">'\n'</span>
  408. <span class="hljs-property">@token</span> <span class="hljs-string">'HERECOMMENT'</span>, here, <span class="hljs-number">0</span>, comment.length
  409. comment.length</pre></div></div>
  410. </li>
  411. <li id="section-18">
  412. <div class="annotation">
  413. <div class="pilwrap ">
  414. <a class="pilcrow" href="#section-18">&#182;</a>
  415. </div>
  416. <p>Matches JavaScript interpolated directly into the source via backticks.</p>
  417. </div>
  418. <div class="content"><div class='highlight'><pre> <span class="hljs-attribute">jsToken</span>:<span class="hljs-function"> -&gt;</span>
  419. <span class="hljs-keyword">return</span> <span class="hljs-number">0</span> <span class="hljs-keyword">unless</span> <span class="hljs-property">@chunk</span>.charAt(<span class="hljs-number">0</span>) <span class="hljs-keyword">is</span> <span class="hljs-string">'`'</span> <span class="hljs-keyword">and</span> match = JSTOKEN.exec <span class="hljs-property">@chunk</span>
  420. <span class="hljs-property">@token</span> <span class="hljs-string">'JS'</span>, (script = match[<span class="hljs-number">0</span>])[<span class="hljs-number">1.</span>..-<span class="hljs-number">1</span>], <span class="hljs-number">0</span>, script.length
  421. script.length</pre></div></div>
  422. </li>
  423. <li id="section-19">
  424. <div class="annotation">
  425. <div class="pilwrap ">
  426. <a class="pilcrow" href="#section-19">&#182;</a>
  427. </div>
  428. <p>Matches regular expression literals, as well as multiline extended ones.
  429. Regular expression literals are difficult to distinguish from division, so we
  430. borrow some basic heuristics from JavaScript and Ruby.</p>
  431. </div>
  432. <div class="content"><div class='highlight'><pre> <span class="hljs-attribute">regexToken</span>:<span class="hljs-function"> -&gt;</span>
  433. <span class="hljs-keyword">switch</span>
  434. <span class="hljs-keyword">when</span> match = REGEX_ILLEGAL.exec <span class="hljs-property">@chunk</span>
  435. <span class="hljs-property">@error</span> <span class="hljs-string">"regular expressions cannot begin with <span class="hljs-subst">#{match[<span class="hljs-number">2</span>]}</span>"</span>,
  436. <span class="hljs-attribute">offset</span>: match.index + match[<span class="hljs-number">1</span>].length
  437. <span class="hljs-keyword">when</span> match = <span class="hljs-property">@matchWithInterpolations</span> HEREGEX, <span class="hljs-string">'///'</span>
  438. {tokens, index} = match
  439. <span class="hljs-keyword">when</span> match = REGEX.exec <span class="hljs-property">@chunk</span>
  440. [regex, body, closed] = match
  441. <span class="hljs-property">@validateEscapes</span> body, <span class="hljs-attribute">isRegex</span>: <span class="hljs-literal">yes</span>, <span class="hljs-attribute">offsetInChunk</span>: <span class="hljs-number">1</span>
  442. index = regex.length
  443. [..., prev] = <span class="hljs-property">@tokens</span>
  444. <span class="hljs-keyword">if</span> prev
  445. <span class="hljs-keyword">if</span> prev.spaced <span class="hljs-keyword">and</span> prev[<span class="hljs-number">0</span>] <span class="hljs-keyword">in</span> CALLABLE
  446. <span class="hljs-keyword">return</span> <span class="hljs-number">0</span> <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> closed <span class="hljs-keyword">or</span> POSSIBLY_DIVISION.test regex
  447. <span class="hljs-keyword">else</span> <span class="hljs-keyword">if</span> prev[<span class="hljs-number">0</span>] <span class="hljs-keyword">in</span> NOT_REGEX
  448. <span class="hljs-keyword">return</span> <span class="hljs-number">0</span>
  449. <span class="hljs-property">@error</span> <span class="hljs-string">'missing / (unclosed regex)'</span> <span class="hljs-keyword">unless</span> closed
  450. <span class="hljs-keyword">else</span>
  451. <span class="hljs-keyword">return</span> <span class="hljs-number">0</span>
  452. [flags] = REGEX_FLAGS.exec <span class="hljs-property">@chunk</span>[index..]
  453. end = index + flags.length
  454. origin = <span class="hljs-property">@makeToken</span> <span class="hljs-string">'REGEX'</span>, <span class="hljs-literal">null</span>, <span class="hljs-number">0</span>, end
  455. <span class="hljs-keyword">switch</span>
  456. <span class="hljs-keyword">when</span> <span class="hljs-keyword">not</span> VALID_FLAGS.test flags
  457. <span class="hljs-property">@error</span> <span class="hljs-string">"invalid regular expression flags <span class="hljs-subst">#{flags}</span>"</span>, <span class="hljs-attribute">offset</span>: index, <span class="hljs-attribute">length</span>: flags.length
  458. <span class="hljs-keyword">when</span> regex <span class="hljs-keyword">or</span> tokens.length <span class="hljs-keyword">is</span> <span class="hljs-number">1</span>
  459. body ?= <span class="hljs-property">@formatHeregex</span> tokens[<span class="hljs-number">0</span>][<span class="hljs-number">1</span>]
  460. <span class="hljs-property">@token</span> <span class="hljs-string">'REGEX'</span>, <span class="hljs-string">"<span class="hljs-subst">#{<span class="hljs-property">@makeDelimitedLiteral</span> body, delimiter: <span class="hljs-string">'/'</span>}</span><span class="hljs-subst">#{flags}</span>"</span>, <span class="hljs-number">0</span>, end, origin
  461. <span class="hljs-keyword">else</span>
  462. <span class="hljs-property">@token</span> <span class="hljs-string">'REGEX_START'</span>, <span class="hljs-string">'('</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, origin
  463. <span class="hljs-property">@token</span> <span class="hljs-string">'IDENTIFIER'</span>, <span class="hljs-string">'RegExp'</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>
  464. <span class="hljs-property">@token</span> <span class="hljs-string">'CALL_START'</span>, <span class="hljs-string">'('</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>
  465. <span class="hljs-property">@mergeInterpolationTokens</span> tokens, {<span class="hljs-attribute">delimiter</span>: <span class="hljs-string">'"'</span>, <span class="hljs-attribute">double</span>: <span class="hljs-literal">yes</span>}, <span class="hljs-property">@formatHeregex</span>
  466. <span class="hljs-keyword">if</span> flags
  467. <span class="hljs-property">@token</span> <span class="hljs-string">','</span>, <span class="hljs-string">','</span>, index, <span class="hljs-number">0</span>
  468. <span class="hljs-property">@token</span> <span class="hljs-string">'STRING'</span>, <span class="hljs-string">'"'</span> + flags + <span class="hljs-string">'"'</span>, index, flags.length
  469. <span class="hljs-property">@token</span> <span class="hljs-string">')'</span>, <span class="hljs-string">')'</span>, end, <span class="hljs-number">0</span>
  470. <span class="hljs-property">@token</span> <span class="hljs-string">'REGEX_END'</span>, <span class="hljs-string">')'</span>, end, <span class="hljs-number">0</span>
  471. end</pre></div></div>
  472. </li>
  473. <li id="section-20">
  474. <div class="annotation">
  475. <div class="pilwrap ">
  476. <a class="pilcrow" href="#section-20">&#182;</a>
  477. </div>
  478. <p>Matches newlines, indents, and outdents, and determines which is which.
  479. If we can detect that the current line is continued onto the next line,
  480. then the newline is suppressed:</p>
  481. <pre><code>elements
  482. .each( ... )
  483. .map( ... )
  484. </code></pre><p>Keeps track of the level of indentation, because a single outdent token
  485. can close multiple indents, so we need to know how far in we happen to be.</p>
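<p>The need for an indent stack can be seen in a small sketch. The helper <code>dentEvents</code> is hypothetical and deliberately ignores line continuation, <code>@indebt</code>, <code>@outdebt</code> and partial outdents; it only shows how one drop in indentation can close several recorded indents.</p>

```coffeescript
# Hypothetical helper: turns a list of per-line indentation widths into
# INDENT/OUTDENT events, using a stack of recorded indent sizes.
dentEvents = (widths) ->
  indents = []          # sizes of the currently open indents
  current = 0           # current indentation, in columns
  events  = []
  for size in widths
    if size > current
      indents.push size - current
      events.push "INDENT #{size - current}"
    else
      # A single drop may close several indents, so pop until we are back.
      while current > size and indents.length
        closed = indents.pop()
        current -= closed
        events.push "OUTDENT #{closed}"
    current = size
  events
```

<p>With widths <code>[0, 2, 4, 0]</code>, the final line closes both open indents, so the sketch yields two <code>INDENT 2</code> events followed by two <code>OUTDENT 2</code> events.</p>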
  486. </div>
  487. <div class="content"><div class='highlight'><pre> <span class="hljs-attribute">lineToken</span>:<span class="hljs-function"> -&gt;</span>
  488. <span class="hljs-keyword">return</span> <span class="hljs-number">0</span> <span class="hljs-keyword">unless</span> match = MULTI_DENT.exec <span class="hljs-property">@chunk</span>
  489. indent = match[<span class="hljs-number">0</span>]
  490. <span class="hljs-property">@seenFor</span> = <span class="hljs-literal">no</span>
  491. size = indent.length - <span class="hljs-number">1</span> - indent.lastIndexOf <span class="hljs-string">'\n'</span>
  492. noNewlines = <span class="hljs-property">@unfinished</span>()
  493. <span class="hljs-keyword">if</span> size - <span class="hljs-property">@indebt</span> <span class="hljs-keyword">is</span> <span class="hljs-property">@indent</span>
  494. <span class="hljs-keyword">if</span> noNewlines <span class="hljs-keyword">then</span> <span class="hljs-property">@suppressNewlines</span>() <span class="hljs-keyword">else</span> <span class="hljs-property">@newlineToken</span> <span class="hljs-number">0</span>
  495. <span class="hljs-keyword">return</span> indent.length
  496. <span class="hljs-keyword">if</span> size &gt; <span class="hljs-property">@indent</span>
  497. <span class="hljs-keyword">if</span> noNewlines
  498. <span class="hljs-property">@indebt</span> = size - <span class="hljs-property">@indent</span>
  499. <span class="hljs-property">@suppressNewlines</span>()
  500. <span class="hljs-keyword">return</span> indent.length
  501. <span class="hljs-keyword">unless</span> <span class="hljs-property">@tokens</span>.length
  502. <span class="hljs-property">@baseIndent</span> = <span class="hljs-property">@indent</span> = size
  503. <span class="hljs-keyword">return</span> indent.length
  504. diff = size - <span class="hljs-property">@indent</span> + <span class="hljs-property">@outdebt</span>
  505. <span class="hljs-property">@token</span> <span class="hljs-string">'INDENT'</span>, diff, indent.length - size, size
  506. <span class="hljs-property">@indents</span>.push diff
  507. <span class="hljs-property">@ends</span>.push {<span class="hljs-attribute">tag</span>: <span class="hljs-string">'OUTDENT'</span>}
  508. <span class="hljs-property">@outdebt</span> = <span class="hljs-property">@indebt</span> = <span class="hljs-number">0</span>
  509. <span class="hljs-property">@indent</span> = size
  510. <span class="hljs-keyword">else</span> <span class="hljs-keyword">if</span> size &lt; <span class="hljs-property">@baseIndent</span>
  511. <span class="hljs-property">@error</span> <span class="hljs-string">'missing indentation'</span>, <span class="hljs-attribute">offset</span>: indent.length
  512. <span class="hljs-keyword">else</span>
  513. <span class="hljs-property">@indebt</span> = <span class="hljs-number">0</span>
  514. <span class="hljs-property">@outdentToken</span> <span class="hljs-property">@indent</span> - size, noNewlines, indent.length
  515. indent.length</pre></div></div>
  516. </li>
  517. <li id="section-21">
  518. <div class="annotation">
  519. <div class="pilwrap ">
  520. <a class="pilcrow" href="#section-21">&#182;</a>
  521. </div>
  522. <p>Records an outdent token, or several tokens if we are moving back
  523. out past multiple recorded indents. Sets the new <code>@indent</code> value.</p>
  524. </div>
  525. <div class="content"><div class='highlight'><pre> <span class="hljs-attribute">outdentToken</span>: <span class="hljs-function"><span class="hljs-params">(moveOut, noNewlines, outdentLength)</span> -&gt;</span>
  526. decreasedIndent = <span class="hljs-property">@indent</span> - moveOut
  527. <span class="hljs-keyword">while</span> moveOut &gt; <span class="hljs-number">0</span>
  528. lastIndent = <span class="hljs-property">@indents</span>[<span class="hljs-property">@indents</span>.length - <span class="hljs-number">1</span>]
  529. <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> lastIndent
  530. moveOut = <span class="hljs-number">0</span>
  531. <span class="hljs-keyword">else</span> <span class="hljs-keyword">if</span> lastIndent <span class="hljs-keyword">is</span> <span class="hljs-property">@outdebt</span>
  532. moveOut -= <span class="hljs-property">@outdebt</span>
  533. <span class="hljs-property">@outdebt</span> = <span class="hljs-number">0</span>
  534. <span class="hljs-keyword">else</span> <span class="hljs-keyword">if</span> lastIndent &lt; <span class="hljs-property">@outdebt</span>
  535. <span class="hljs-property">@outdebt</span> -= lastIndent
  536. moveOut -= lastIndent
  537. <span class="hljs-keyword">else</span>
  538. dent = <span class="hljs-property">@indents</span>.pop() + <span class="hljs-property">@outdebt</span>
  539. <span class="hljs-keyword">if</span> outdentLength <span class="hljs-keyword">and</span> <span class="hljs-property">@chunk</span>[outdentLength] <span class="hljs-keyword">in</span> INDENTABLE_CLOSERS
  540. decreasedIndent -= dent - moveOut
  541. moveOut = dent
  542. <span class="hljs-property">@outdebt</span> = <span class="hljs-number">0</span></pre></div></div>
  543. </li>
  544. <li id="section-22">
  545. <div class="annotation">
  546. <div class="pilwrap ">
  547. <a class="pilcrow" href="#section-22">&#182;</a>
  548. </div>
  549. <p><code>@pair</code> might call <code>outdentToken</code>, so preserve <code>decreasedIndent</code>.</p>
  550. </div>
  551. <div class="content"><div class='highlight'><pre> <span class="hljs-property">@pair</span> <span class="hljs-string">'OUTDENT'</span>
  552. <span class="hljs-property">@token</span> <span class="hljs-string">'OUTDENT'</span>, moveOut, <span class="hljs-number">0</span>, outdentLength
  553. moveOut -= dent
  554. <span class="hljs-property">@outdebt</span> -= moveOut <span class="hljs-keyword">if</span> dent
  555. <span class="hljs-property">@tokens</span>.pop() <span class="hljs-keyword">while</span> <span class="hljs-property">@value</span>() <span class="hljs-keyword">is</span> <span class="hljs-string">';'</span>
  556. <span class="hljs-property">@token</span> <span class="hljs-string">'TERMINATOR'</span>, <span class="hljs-string">'\n'</span>, outdentLength, <span class="hljs-number">0</span> <span class="hljs-keyword">unless</span> <span class="hljs-property">@tag</span>() <span class="hljs-keyword">is</span> <span class="hljs-string">'TERMINATOR'</span> <span class="hljs-keyword">or</span> noNewlines
  557. <span class="hljs-property">@indent</span> = decreasedIndent
  558. <span class="hljs-keyword">this</span></pre></div></div>
  559. </li>
  560. <li id="section-23">
  561. <div class="annotation">
  562. <div class="pilwrap ">
  563. <a class="pilcrow" href="#section-23">&#182;</a>
  564. </div>
  565. <p>Matches and consumes non-meaningful whitespace. Tags the previous token
  566. as spaced, because spacing changes how some following tokens are lexed (for example, whether a <code>/</code> begins a regex or a division).</p>
  567. </div>
  568. <div class="content"><div class='highlight'><pre> <span class="hljs-attribute">whitespaceToken</span>:<span class="hljs-function"> -&gt;</span>
  569. <span class="hljs-keyword">return</span> <span class="hljs-number">0</span> <span class="hljs-keyword">unless</span> (match = WHITESPACE.exec <span class="hljs-property">@chunk</span>) <span class="hljs-keyword">or</span>
  570. (nline = <span class="hljs-property">@chunk</span>.charAt(<span class="hljs-number">0</span>) <span class="hljs-keyword">is</span> <span class="hljs-string">'\n'</span>)
  571. [..., prev] = <span class="hljs-property">@tokens</span>
  572. prev[<span class="hljs-keyword">if</span> match <span class="hljs-keyword">then</span> <span class="hljs-string">'spaced'</span> <span class="hljs-keyword">else</span> <span class="hljs-string">'newLine'</span>] = <span class="hljs-literal">true</span> <span class="hljs-keyword">if</span> prev
  573. <span class="hljs-keyword">if</span> match <span class="hljs-keyword">then</span> match[<span class="hljs-number">0</span>].length <span class="hljs-keyword">else</span> <span class="hljs-number">0</span></pre></div></div>
  574. </li>
  575. <li id="section-24">
  576. <div class="annotation">
  577. <div class="pilwrap ">
  578. <a class="pilcrow" href="#section-24">&#182;</a>
  579. </div>
  580. <p>Generate a newline token. Consecutive newlines get merged together.</p>
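<p>The merging behaviour can be sketched on a plain array of <code>[tag, value]</code> pairs. The helper <code>pushTerminator</code> is hypothetical and omits the location data that the real <code>@token</code> call records.</p>

```coffeescript
# Hypothetical helper operating on an array of [tag, value] pairs.
pushTerminator = (tokens) ->
  # Trailing semicolons are redundant once a newline follows, so drop them.
  tokens.pop() while tokens.length and tokens[tokens.length - 1][1] is ';'
  last = tokens[tokens.length - 1]
  # Only add a TERMINATOR if the stream does not already end with one.
  tokens.push ['TERMINATOR', '\n'] unless last and last[0] is 'TERMINATOR'
  tokens
```

<p>Calling the helper twice in a row on the same token list leaves a single <code>TERMINATOR</code> at the end, which is the merging described above.</p>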
  581. </div>
  582. <div class="content"><div class='highlight'><pre> <span class="hljs-attribute">newlineToken</span>: <span class="hljs-function"><span class="hljs-params">(offset)</span> -&gt;</span>
  583. <span class="hljs-property">@tokens</span>.pop() <span class="hljs-keyword">while</span> <span class="hljs-property">@value</span>() <span class="hljs-keyword">is</span> <span class="hljs-string">';'</span>
  584. <span class="hljs-property">@token</span> <span class="hljs-string">'TERMINATOR'</span>, <span class="hljs-string">'\n'</span>, offset, <span class="hljs-number">0</span> <span class="hljs-keyword">unless</span> <s