- <!DOCTYPE HTML>
- <html>
- <head>
- <title>Regular Expression Callouts</title>
- <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
- <meta http-equiv="X-UA-Compatible" content="IE=edge">
- <link href="../static/theme.css" rel="stylesheet" type="text/css" />
- <script src="../static/content.js" type="text/javascript"></script>
- </head>
- <body>
- <h1>Regular Expression Callouts <span class="ver">[AHK_L 14+]</span></h1>
- <p>Callouts provide a means of temporarily passing control to the script in the middle of regular expression pattern matching. For detailed information about the PCRE-standard callout feature, see <a href="http://www.pcre.org/pcre.txt">pcre.txt</a>.</p>
- <p>Callouts are currently supported only by <a href="../commands/RegExMatch.htm">RegExMatch</a> and <a href="../commands/RegExReplace.htm">RegExReplace</a>.</p>
- <h3>Syntax</h3>
- <p>The syntax for a callout in AutoHotkey is <span class="Syntax">(?C<em>Number</em>:<em>Function</em>)</span>, where both <em>Number</em> and <em>Function</em> are optional. The colon ':' is allowed only if <em>Function</em> is specified, and may be left out if <em>Number</em> is omitted. If <em>Function</em> is specified but is not the name of a user-defined function, a compile error occurs and pattern-matching does not begin.</p>
- <p>If <em>Function</em> is omitted, the function name must be specified in a variable named <b>pcre_callout</b>. If both a global variable and local variable exist with this name, the local variable takes precedence. If <em>pcre_callout</em> does not contain the name of a user-defined function, callouts which omit <em>Function</em> are ignored.</p>
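- <p>As a rough sketch of the forms described above, the following script uses each variant once. The function name <em>LogCallout</em> is hypothetical and chosen only for illustration:</p>
- <pre><em>; Default function for callouts which omit Function.</em>
- pcre_callout = LogCallout
- <em>; (?C1) relies on pcre_callout, (?C2:LogCallout) specifies both parts,</em>
- <em>; and (?CLogCallout) omits both Number and the colon.</em>
- RegExMatch("abc", "a(?C1)b(?C2:LogCallout)c(?CLogCallout)")
-
- LogCallout(Match, CalloutNumber) {
-     MsgBox Callout number: %CalloutNumber%
- }</pre>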
- <h3>Callout Functions</h3>
- <pre class="Syntax">Function(Match, CalloutNumber, FoundPos, Haystack, NeedleRegEx)
- {
-     ...
- }</pre>
- <p>Callout functions may define up to 5 parameters:</p>
- <ul>
- <li><b>Match</b>: Equivalent to the <em>UnquotedOutputVar</em> of RegExMatch, including the creation of array variables if appropriate.</li>
- <li><b>CalloutNumber</b>: Receives the <em>Number</em> of the callout.</li>
- <li><b>FoundPos</b>: Receives the position of the current potential match.</li>
- <li><b>Haystack</b>: Receives the <em>Haystack</em> passed to RegExMatch or RegExReplace.</li>
- <li><b>NeedleRegEx</b>: Receives the <em>NeedleRegEx</em> passed to RegExMatch or RegExReplace.</li>
- </ul>
- <p>These parameter names are suggestions only; any valid names may be used.</p>
- <p>Pattern-matching may proceed or fail depending on the return value of the callout function:</p>
- <ul>
- <li>If the function returns <b>0</b> or does not return a numeric value, matching proceeds as normal.</li>
- <li>If the function returns <b>1</b> or greater, matching fails at the current point, but other matching possibilities are still tested.</li>
- <li>If the function returns <b>-1</b>, matching is abandoned.</li>
- <li>If the function returns a value less than -1, it is treated as a PCRE error code and matching is abandoned. RegExMatch returns a blank string, while RegExReplace returns the original <em>Haystack</em>. In either case, ErrorLevel contains the error code.</li>
- </ul>
- <p>For example:</p>
- <pre>Haystack = The quick brown fox jumps over the lazy dog.
- RegExMatch(Haystack, "i)(The) (\w+)\b(?CCallout)")
- Callout(m) {
-     MsgBox m=%m%`nm1=%m1%`nm2=%m2%
-     return 1
- }</pre>
- <p>In the above example, <em>Callout</em> is called once for each substring which matches the part of the pattern preceding the callout. Because the function returns 1, each candidate is rejected and the next possibility is tried. <span class="Syntax">\b</span> is used to exclude incomplete words in matches such as <em>The quic</em>, <em>The qui</em>, <em>The qu</em>, etc.</p>
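- <p>A callout can also act as a filter by returning 0 only for acceptable candidates. The following is a minimal sketch; the function name <em>MinLength</em> and the five-letter threshold are assumptions made for illustration:</p>
- <pre>Haystack = The quick brown fox jumps over the lazy dog.
- <em>; Search for the first whole word accepted by the callout.</em>
- RegExMatch(Haystack, "(\w+)\b(?CMinLength)", Found)
- MsgBox Found=%Found%
-
- MinLength(m) {
-     <em>; Return 0 to accept the candidate, or 1 to reject it and try other possibilities.</em>
-     return StrLen(m1) >= 5 ? 0 : 1
- }</pre>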
- <h3 id="EventInfo">EventInfo</h3>
- <p>Additional information is available by accessing the pcre_callout_block structure via <b>A_EventInfo</b>.</p>
- <pre>version := NumGet(A_EventInfo, 0, "Int")
- callout_number := NumGet(A_EventInfo, 4, "Int")
- offset_vector := NumGet(A_EventInfo, 8)
- subject := NumGet(A_EventInfo, 8 + A_PtrSize)
- subject_length := NumGet(A_EventInfo, 8 + A_PtrSize*2, "Int")
- start_match := NumGet(A_EventInfo, 12 + A_PtrSize*2, "Int")
- current_position := NumGet(A_EventInfo, 16 + A_PtrSize*2, "Int")
- capture_top := NumGet(A_EventInfo, 20 + A_PtrSize*2, "Int")
- capture_last := NumGet(A_EventInfo, 24 + A_PtrSize*2, "Int")
- pad := A_PtrSize=8 ? 4 : 0 <em>; Compensate for 64-bit data alignment.</em>
- callout_data := NumGet(A_EventInfo, 28 + pad + A_PtrSize*2)
- pattern_position := NumGet(A_EventInfo, 28 + pad + A_PtrSize*3, "Int")
- next_item_length := NumGet(A_EventInfo, 32 + pad + A_PtrSize*3, "Int")
- if version >= 2
-     mark := StrGet(NumGet(A_EventInfo, 36 + pad + A_PtrSize*3, "Ptr"), "UTF-8") <em>; mark is a pointer, so it is read as a pointer-sized value.</em>
- </pre>
- <p>For more information, see <a href="http://www.pcre.org/pcre.txt">pcre.txt</a>, <a href="../commands/NumGet.htm">NumGet</a> and <a href="../Variables.htm#PtrSize">A_PtrSize</a>.</p>
- <h3 id="auto">Auto-Callout</h3>
- <p>Including <span class="Syntax">C</span> in the options of the pattern enables the auto-callout mode. In this mode, callouts equivalent to <span class="Syntax">(?C255)</span> are inserted before each item in the pattern. For example, the following template may be used to debug regular expressions:</p>
- <pre><em>; Set the default callout function.</em>
- pcre_callout = DebugRegEx
- <em>; Call RegExMatch with auto-callout option C.</em>
- RegExMatch("xxxabc123xyz", "C)abc.*xyz")
- DebugRegEx(Match, CalloutNumber, FoundPos, Haystack, NeedleRegEx)
- {
-     <em>; See pcre.txt for descriptions of these fields.</em>
-     start_match := NumGet(A_EventInfo, 12 + A_PtrSize*2, "Int")
-     current_position := NumGet(A_EventInfo, 16 + A_PtrSize*2, "Int")
-     pad := A_PtrSize=8 ? 4 : 0
-     pattern_position := NumGet(A_EventInfo, 28 + pad + A_PtrSize*3, "Int")
-     next_item_length := NumGet(A_EventInfo, 32 + pad + A_PtrSize*3, "Int")
-     <em>; Point out >>current match<<.</em>
-     _HAYSTACK := SubStr(Haystack, 1, start_match)
-         . ">>" SubStr(Haystack, start_match + 1, current_position - start_match)
-         . "<<" SubStr(Haystack, current_position + 1)
-
-     <em>; Point out >>next item to be evaluated<<.</em>
-     _NEEDLE := SubStr(NeedleRegEx, 1, pattern_position)
-         . ">>" SubStr(NeedleRegEx, pattern_position + 1, next_item_length)
-         . "<<" SubStr(NeedleRegEx, pattern_position + 1 + next_item_length)
-
-     ListVars
-     <em>; Press Pause to continue.</em>
-     Pause
- }</pre>
- <h3>Remarks</h3>
- <p>Callouts are executed on the current quasi-thread, but the previous value of A_EventInfo will be restored after the callout function returns. ErrorLevel is not set until immediately before RegExMatch or RegExReplace returns.</p>
- <p>PCRE is optimized to abort early in some cases if it can determine that a match is not possible. For all callouts to be called in such cases, it may be necessary to disable these optimizations by specifying <code>(*NO_START_OPT)</code> at the start of the pattern. This requires v1.1.05 or later.</p>
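- <p>The effect can be observed by counting calls, as in the sketch below. The function name <em>Trace</em> and the variable names are hypothetical, and the exact counts may depend on the PCRE version in use:</p>
- <pre>Calls := 0
- RegExMatch("abc", "(?CTrace)xyz") <em>; PCRE may give up before any callout runs.</em>
- Optimized := Calls
- Calls := 0
- RegExMatch("abc", "(*NO_START_OPT)(?CTrace)xyz")
- MsgBox With optimizations: %Optimized%`nWithout optimizations: %Calls%
-
- Trace(Match) {
-     global Calls
-     Calls++
- }</pre>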
- </body>
- </html>