<!DOCTYPE HTML>
<html>
<head>
<title>Regular Expression Callouts</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<link href="../static/theme.css" rel="stylesheet" type="text/css" />
<script src="../static/content.js" type="text/javascript"></script>
</head>
<body>
<h1>Regular Expression Callouts <span class="ver">[AHK_L 14+]</span></h1>
<p>Callouts provide a means of temporarily passing control to the script in the middle of regular expression pattern matching. For detailed information about the PCRE-standard callout feature, see <a href="http://www.pcre.org/pcre.txt">pcre.txt</a>.</p>
<p>Callouts are currently supported only by <a href="../commands/RegExMatch.htm">RegExMatch</a> and <a href="../commands/RegExReplace.htm">RegExReplace</a>.</p>
<h3>Syntax</h3>
<p>The syntax for a callout in AutoHotkey is <span class="Syntax">(?C<em>Number</em>:<em>Function</em>)</span>, where both <em>Number</em> and <em>Function</em> are optional. The colon (<code>:</code>) is allowed only if <em>Function</em> is specified, and may be left out if <em>Number</em> is omitted. If <em>Function</em> is specified but is not the name of a user-defined function, a compile error occurs and pattern-matching does not begin.</p>
<p>If <em>Function</em> is omitted, the function name must be specified in a variable named <b>pcre_callout</b>. If both a global variable and a local variable exist with this name, the local variable takes precedence. If <em>pcre_callout</em> does not contain the name of a user-defined function, callouts which omit <em>Function</em> are ignored.</p>
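<p>As a rough sketch of the forms described above, the following patterns should all pass control to the same function. The function name <em>LogCallout</em> and the variable <em>CalloutLog</em> are hypothetical and exist only for this illustration:</p>
<pre><em>; Default function for callouts which omit Function.</em>
pcre_callout = LogCallout
RegExMatch("abc", "a(?C1:LogCallout)bc")  <em>; Number and Function.</em>
RegExMatch("abc", "a(?CLogCallout)bc")    <em>; Function only, colon omitted.</em>
RegExMatch("abc", "a(?C:LogCallout)bc")   <em>; Function only, with colon.</em>
RegExMatch("abc", "a(?C2)bc")             <em>; Number only; relies on pcre_callout.</em>
MsgBox Callout numbers seen: %CalloutLog%

<em>; Hypothetical callout function: records each CalloutNumber it receives.</em>
LogCallout(Match, CalloutNumber)
{
    global CalloutLog
    CalloutLog .= CalloutNumber " "
    return 0  <em>; Let matching proceed as normal.</em>
}</pre>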
<h3>Callout Functions</h3>
<pre class="Syntax">Function(Match, CalloutNumber, FoundPos, Haystack, NeedleRegEx)
{
    ...
}</pre>
<p>Callout functions may define up to 5 parameters:</p>
<ul>
<li><b>Match</b>: Equivalent to the <em>UnquotedOutputVar</em> of RegExMatch, including the creation of array variables if appropriate.</li>
<li><b>CalloutNumber</b>: Receives the <em>Number</em> of the callout.</li>
<li><b>FoundPos</b>: Receives the position of the current potential match.</li>
<li><b>Haystack</b>: Receives the <em>Haystack</em> passed to RegExMatch or RegExReplace.</li>
<li><b>NeedleRegEx</b>: Receives the <em>NeedleRegEx</em> passed to RegExMatch or RegExReplace.</li>
</ul>
<p>These names are suggestive only. Actual names may vary.</p>
<p>Pattern-matching may proceed or fail depending on the return value of the callout function:</p>
<ul>
<li>If the function returns <b>0</b> or does not return a numeric value, matching proceeds as normal.</li>
<li>If the function returns <b>1</b> or greater, matching fails at the current point, but the testing of other matching possibilities goes ahead.</li>
<li>If the function returns <b>-1</b>, matching is abandoned.</li>
<li>If the function returns a value less than -1, it is treated as a PCRE error code and matching is abandoned. RegExMatch returns a blank string, while RegExReplace returns the original <em>Haystack</em>. In either case, ErrorLevel contains the error code.</li>
</ul>
<p>For example:</p>
<pre>Haystack = The quick brown fox jumps over the lazy dog.
RegExMatch(Haystack, "i)(The) (\w+)\b(?CCallout)")

Callout(m) {
    MsgBox m=%m%`nm1=%m1%`nm2=%m2%
    return 1
}</pre>
<p>In the above example, <em>Callout</em> is called once for each substring which matches the part of the pattern preceding the callout. <span class="Syntax">\b</span> is used to exclude incomplete words in matches such as <em>The quic</em>, <em>The qui</em>, <em>The qu</em>, etc.</p>
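<p>A return value can also be used to filter candidates rather than merely observe them. The following is a minimal sketch, assuming a hypothetical callout function named <em>MinLength</em>: it returns 1 to reject any candidate word shorter than five characters and 0 to let matching proceed, so the overall match is intended to land on the first sufficiently long word.</p>
<pre>Haystack = The quick brown fox jumps over the lazy dog.
FoundPos := RegExMatch(Haystack, "(\w+)\b(?CMinLength)", Word)
MsgBox FoundPos=%FoundPos%`nWord=%Word%

<em>; Hypothetical callout: reject candidate words shorter than five characters.</em>
MinLength(Match)
{
    <em>; Match1 holds the first subpattern captured so far.</em>
    return StrLen(Match1) &lt; 5  <em>; 1 = reject this candidate, 0 = proceed.</em>
}</pre>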
<h3 id="EventInfo">EventInfo</h3>
<p>Additional information is available by accessing the pcre_callout_block structure via <b>A_EventInfo</b>.</p>
<pre>version := NumGet(A_EventInfo, 0, "Int")
callout_number := NumGet(A_EventInfo, 4, "Int")
offset_vector := NumGet(A_EventInfo, 8)
subject := NumGet(A_EventInfo, 8 + A_PtrSize)
subject_length := NumGet(A_EventInfo, 8 + A_PtrSize*2, "Int")
start_match := NumGet(A_EventInfo, 12 + A_PtrSize*2, "Int")
current_position := NumGet(A_EventInfo, 16 + A_PtrSize*2, "Int")
capture_top := NumGet(A_EventInfo, 20 + A_PtrSize*2, "Int")
capture_last := NumGet(A_EventInfo, 24 + A_PtrSize*2, "Int")
pad := A_PtrSize=8 ? 4 : 0 <em>; Compensate for 64-bit data alignment.</em>
callout_data := NumGet(A_EventInfo, 28 + pad + A_PtrSize*2)
pattern_position := NumGet(A_EventInfo, 28 + pad + A_PtrSize*3, "Int")
next_item_length := NumGet(A_EventInfo, 32 + pad + A_PtrSize*3, "Int")
if version >= 2
    <em>; mark is a pointer, so it is read at pointer size like offset_vector and subject.</em>
    mark := StrGet(NumGet(A_EventInfo, 36 + pad + A_PtrSize*3), "UTF-8")
</pre>
<p>For more information, see <a href="http://www.pcre.org/pcre.txt">pcre.txt</a>, <a href="../commands/NumGet.htm">NumGet</a> and <a href="../Variables.htm#PtrSize">A_PtrSize</a>.</p>
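<p>As a small illustration of how these fields might be used, the sketch below reads only <em>start_match</em> and <em>current_position</em> to show the text matched so far at the point of the callout. The function name <em>ShowProgress</em> is hypothetical, and the offsets are assumed to be usable directly as character offsets, which holds for a plain ASCII haystack such as this one:</p>
<pre>Haystack = xxxabc123xyz
RegExMatch(Haystack, "abc\d+(?CShowProgress)xyz")

<em>; Hypothetical callout function: displays the portion matched so far.</em>
ShowProgress(Match, CalloutNumber, FoundPos, Haystack)
{
    <em>; Both fields are zero-based offsets into the subject string.</em>
    start_match := NumGet(A_EventInfo, 12 + A_PtrSize*2, "Int")
    current_position := NumGet(A_EventInfo, 16 + A_PtrSize*2, "Int")
    SoFar := SubStr(Haystack, start_match + 1, current_position - start_match)
    MsgBox Matched so far: %SoFar%
    return 0  <em>; Let matching proceed as normal.</em>
}</pre>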
<h3 id="auto">Auto-Callout</h3>
<p>Including <span class="Syntax">C</span> in the options of the pattern enables the auto-callout mode. In this mode, callouts equivalent to <span class="Syntax">(?C255)</span> are inserted before each item in the pattern. For example, the following template may be used to debug regular expressions:</p>
<pre><em>; Set the default callout function.</em>
pcre_callout = DebugRegEx

<em>; Call RegExMatch with auto-callout option C.</em>
RegExMatch("xxxabc123xyz", "C)abc.*xyz")

DebugRegEx(Match, CalloutNumber, FoundPos, Haystack, NeedleRegEx)
{
    <em>; See pcre.txt for descriptions of these fields.</em>
    start_match := NumGet(A_EventInfo, 12 + A_PtrSize*2, "Int")
    current_position := NumGet(A_EventInfo, 16 + A_PtrSize*2, "Int")
    pad := A_PtrSize=8 ? 4 : 0
    pattern_position := NumGet(A_EventInfo, 28 + pad + A_PtrSize*3, "Int")
    next_item_length := NumGet(A_EventInfo, 32 + pad + A_PtrSize*3, "Int")

    <em>; Point out &gt;&gt;current match&lt;&lt;.</em>
    _HAYSTACK:=SubStr(Haystack, 1, start_match)
        . "&gt;&gt;" SubStr(Haystack, start_match + 1, current_position - start_match)
        . "&lt;&lt;" SubStr(Haystack, current_position + 1)

    <em>; Point out &gt;&gt;next item to be evaluated&lt;&lt;.</em>
    _NEEDLE:= SubStr(NeedleRegEx, 1, pattern_position)
        . "&gt;&gt;" SubStr(NeedleRegEx, pattern_position + 1, next_item_length)
        . "&lt;&lt;" SubStr(NeedleRegEx, pattern_position + 1 + next_item_length)

    ListVars
    <em>; Press Pause to continue.</em>
    Pause
}</pre>
<h3>Remarks</h3>
<p>Callouts are executed on the current quasi-thread, but the previous value of A_EventInfo will be restored after the callout function returns. ErrorLevel is not set until immediately before RegExMatch or RegExReplace returns.</p>
<p>PCRE is optimized to abort early in some cases if it can determine that a match is not possible. For all callouts to be called in such cases, it may be necessary to disable these optimizations by specifying <code>(*NO_START_OPT)</code> at the start of the pattern. This requires v1.1.05 or later.</p>
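<p>One way to observe this effect is to count callouts with and without the option. The sketch below uses a hypothetical function <em>CountCalls</em> and a pattern that cannot match its haystack; whether the first count actually comes out lower depends on which optimizations PCRE applies, so the numbers are not guaranteed:</p>
<pre>pcre_callout = CountCalls

Calls := 0
RegExMatch("aaa", "a+(?C)b")
Optimized := Calls

Calls := 0
RegExMatch("aaa", "(*NO_START_OPT)a+(?C)b")
MsgBox Default: %Optimized% call(s)`nWith NO_START_OPT: %Calls% call(s)

<em>; Hypothetical callout function: counts how often it is invoked.</em>
CountCalls(Match)
{
    global Calls
    Calls += 1
    return 0  <em>; Let matching proceed as normal.</em>
}</pre>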
</body>
</html>