README file for PCRE (Perl-compatible regular expression library)
-----------------------------------------------------------------

The latest release of PCRE is always available in three alternative formats
from:

  ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-xxx.tar.gz
  ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-xxx.tar.bz2
  ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-xxx.zip

There is a mailing list for discussion about the development of PCRE at

  pcre-dev@exim.org

Please read the NEWS file if you are upgrading from a previous release.
The contents of this README file are:

  The PCRE APIs
  Documentation for PCRE
  Contributions by users of PCRE
  Building PCRE on non-Unix systems
  Building PCRE on Unix-like systems
  Retrieving configuration information on Unix-like systems
  Shared libraries on Unix-like systems
  Cross-compiling on Unix-like systems
  Using HP's ANSI C++ compiler (aCC)
  Making new tarballs
  Testing PCRE
  Character tables
  File manifest


The PCRE APIs
-------------

PCRE is written in C, and it has its own API. The distribution also includes a
set of C++ wrapper functions (see the pcrecpp man page for details), courtesy
of Google Inc.

In addition, there is a set of C wrapper functions that are based on the POSIX
regular expression API (see the pcreposix man page). These end up in the
library called libpcreposix. Note that this just provides a POSIX calling
interface to PCRE; the regular expressions themselves still follow Perl syntax
and semantics. The POSIX API is restricted, and does not give full access to
all of PCRE's facilities.

The header file for the POSIX-style functions is called pcreposix.h. The
official POSIX name is regex.h, but I did not want to risk possible problems
with existing files of that name by distributing it that way. To use PCRE with
an existing program that uses the POSIX API, pcreposix.h will have to be
renamed or pointed at by a link.

If you are using the POSIX interface to PCRE and there is already a POSIX regex
library installed on your system, as well as worrying about the regex.h header
file (as mentioned above), you must also take care when linking programs to
ensure that they link with PCRE's libpcreposix library. Otherwise they may pick
up the POSIX functions of the same name from the other library.

One way of avoiding this confusion is to compile PCRE with the addition of
-Dregcomp=PCREregcomp (and similarly for the other POSIX functions) to the
compiler flags (CFLAGS if you are using "configure" -- see below). This has the
effect of renaming the functions so that the names no longer clash. Of course,
you have to do the same thing for your applications, or write them using the
new names.
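
As a rough illustration of what such a -D flag does, the fragment below applies
the same renaming with a #define instead of a compiler flag. The one-argument
function is a hypothetical stand-in (the real regcomp() takes different
arguments), and caller_built_with_flag() is an invented name; only the renaming
mechanism is the point.

```c
#include <assert.h>
#include <string.h>

/* Same effect as passing -Dregcomp=PCREregcomp on the compiler command line. */
#define regcomp PCREregcomp

/* Hypothetical stand-in for a wrapper function. After preprocessing, the
   symbol that is actually defined here is PCREregcomp(), not regcomp(). */
int regcomp(const char *pattern)
{
    assert(pattern != NULL);
    return (int)strlen(pattern);
}

/* An application compiled with the same flag: this call resolves to
   PCREregcomp(), so it cannot pick up another library's regcomp(). */
int caller_built_with_flag(void)
{
    return regcomp("a+b");
}

#undef regcomp
```

Because both the library and the caller see the same macro, neither object file
refers to a symbol called regcomp, which is why the clash goes away.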


Documentation for PCRE
----------------------

If you install PCRE in the normal way on a Unix-like system, you will end up
with a set of man pages whose names all start with "pcre". The one that is just
called "pcre" lists all the others. In addition to these man pages, the PCRE
documentation is supplied in two other forms:

  1. There are files called doc/pcre.txt, doc/pcregrep.txt, and
     doc/pcretest.txt in the source distribution. The first of these is a
     concatenation of the text forms of all the section 3 man pages except
     those that summarize individual functions. The other two are the text
     forms of the section 1 man pages for the pcregrep and pcretest commands.
     These text forms are provided for ease of scanning with text editors or
     similar tools. They are installed in <prefix>/share/doc/pcre, where
     <prefix> is the installation prefix (defaulting to /usr/local).

  2. A set of files containing all the documentation in HTML form, hyperlinked
     in various ways, and rooted in a file called index.html, is distributed in
     doc/html and installed in <prefix>/share/doc/pcre/html.


Contributions by users of PCRE
------------------------------

You can find contributions from PCRE users in the directory

  ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/Contrib

There is a README file giving brief descriptions of what they are. Some are
complete in themselves; others are pointers to URLs containing relevant files.
Some of this material is likely to be well out-of-date. Several of the earlier
contributions provided support for compiling PCRE on various flavours of
Windows (I myself do not use Windows). Nowadays there is more Windows support
in the standard distribution, so these contributions have been archived.


Building PCRE on non-Unix systems
---------------------------------

For a non-Unix system, please read the comments in the file NON-UNIX-USE,
though if your system supports the use of "configure" and "make" you may be
able to build PCRE in the same way as for Unix-like systems. PCRE can also be
configured in many platform environments using the GUI facility of CMake's
CMakeSetup. It creates Makefiles, solution files, etc.

PCRE has been compiled on many different operating systems. It should be
straightforward to build PCRE on any system that has a Standard C compiler and
library, because it uses only Standard C functions.


Building PCRE on Unix-like systems
----------------------------------

If you are using HP's ANSI C++ compiler (aCC), please see the special note
in the section entitled "Using HP's ANSI C++ compiler (aCC)" below.

The following instructions assume the use of the widely used "configure, make,
make install" process. There is also support for CMake in the PCRE
distribution; there are some comments about using CMake in the NON-UNIX-USE
file, though it can also be used in Unix-like systems.

To build PCRE on a Unix-like system, first run the "configure" command from the
PCRE distribution directory, with your current directory set to the directory
where you want the files to be created. This command is a standard GNU
"autoconf" configuration script, for which generic instructions are supplied in
the file INSTALL.

Most commonly, people build PCRE within its own distribution directory, and in
this case, on many systems, just running "./configure" is sufficient. However,
the usual methods of changing standard defaults are available. For example:

CFLAGS='-O2 -Wall' ./configure --prefix=/opt/local

specifies that the C compiler should be run with the flags '-O2 -Wall' instead
of the default, and that "make install" should install PCRE under /opt/local
instead of the default /usr/local.

If you want to build in a different directory, just run "configure" with that
directory as current. For example, suppose you have unpacked the PCRE source
into /source/pcre/pcre-xxx, but you want to build it in /build/pcre/pcre-xxx:

cd /build/pcre/pcre-xxx
/source/pcre/pcre-xxx/configure

PCRE is written in C and is normally compiled as a C library. However, it is
possible to build it as a C++ library, though the provided building apparatus
does not have any features to support this.

There are some optional features that can be included or omitted from the PCRE
library. You can read more about them in the pcrebuild man page.

. If you want to suppress the building of the C++ wrapper library, you can add
  --disable-cpp to the "configure" command. Otherwise, when "configure" is run,
  it will try to find a C++ compiler and C++ header files, and if it succeeds,
  it will try to build the C++ wrapper.

. If you want to make use of the support for UTF-8 character strings in PCRE,
  you must add --enable-utf8 to the "configure" command. Without it, the code
  for handling UTF-8 is not included in the library. (Even when included, it
  still has to be enabled by an option at run time.)

. If, in addition to support for UTF-8 character strings, you want to include
  support for the \P, \p, and \X sequences that recognize Unicode character
  properties, you must add --enable-unicode-properties to the "configure"
  command. This adds about 30K to the size of the library (in the form of a
  property table); only the basic two-letter properties such as Lu are
  supported.

. You can build PCRE to recognize either CR or LF or the sequence CRLF or any
  of the preceding, or any of the Unicode newline sequences as indicating the
  end of a line. Whatever you specify at build time is the default; the caller
  of PCRE can change the selection at run time. The default newline indicator
  is a single LF character (the Unix standard). You can specify the default
  newline indicator by adding --enable-newline-is-cr or --enable-newline-is-lf
  or --enable-newline-is-crlf or --enable-newline-is-anycrlf or
  --enable-newline-is-any to the "configure" command, respectively.

  If you specify --enable-newline-is-cr or --enable-newline-is-crlf, some of
  the standard tests will fail, because the lines in the test files end with
  LF. Even if the files are edited to change the line endings, there are likely
  to be some failures. With --enable-newline-is-anycrlf or
  --enable-newline-is-any, many tests should succeed, but there may be some
  failures.

. By default, the sequence \R in a pattern matches any Unicode line ending
  sequence. This is independent of the option specifying what PCRE considers to
  be the end of a line (see above). However, the caller of PCRE can restrict \R
  to match only CR, LF, or CRLF. You can make this the default by adding
  --enable-bsr-anycrlf to the "configure" command (bsr = "backslash R").

. When called via the POSIX interface, PCRE uses malloc() to get additional
  storage for processing capturing parentheses if there are more than 10 of
  them in a pattern. You can increase this threshold by setting, for example,

  --with-posix-malloc-threshold=20

  on the "configure" command.

. PCRE has a counter that can be set to limit the amount of resources it uses.
  If the limit is exceeded during a match, the match fails. The default is ten
  million. You can change the default by setting, for example,

  --with-match-limit=500000

  on the "configure" command. This is just the default; individual calls to
  pcre_exec() can supply their own value. There is more discussion on the
  pcreapi man page.

. There is a separate counter that limits the depth of recursive function calls
  during a matching process. This also has a default of ten million, which is
  essentially "unlimited". You can change the default by setting, for example,

  --with-match-limit-recursion=500000

  Recursive function calls use up the runtime stack; running out of stack can
  cause programs to crash in strange ways. There is a discussion about stack
  sizes in the pcrestack man page.

. The default maximum compiled pattern size is around 64K. You can increase
  this by adding --with-link-size=3 to the "configure" command. You can
  increase it even more by setting --with-link-size=4, but this is unlikely
  ever to be necessary. Increasing the internal link size will reduce
  performance.

. You can build PCRE so that its internal match() function that is called from
  pcre_exec() does not call itself recursively. Instead, it uses memory blocks
  obtained from the heap via the special functions pcre_stack_malloc() and
  pcre_stack_free() to save data that would otherwise be saved on the stack. To
  build PCRE like this, use

  --disable-stack-for-recursion

  on the "configure" command. PCRE runs more slowly in this mode, but it may be
  necessary in environments with limited stack sizes. This applies only to the
  pcre_exec() function; it does not apply to pcre_dfa_exec(), which does not
  use deeply nested recursion. There is a discussion about stack sizes in the
  pcrestack man page.

. For speed, PCRE uses four tables for manipulating and identifying characters
  whose code point values are less than 256. By default, it uses a set of
  tables for ASCII encoding that is part of the distribution. If you specify

  --enable-rebuild-chartables

  a program called dftables is compiled and run in the default C locale when
  you obey "make". It builds a source file called pcre_chartables.c. If you do
  not specify this option, pcre_chartables.c is created as a copy of
  pcre_chartables.c.dist. See "Character tables" below for further information.

. It is possible to compile PCRE for use on systems that use EBCDIC as their
  default character code (as opposed to ASCII) by specifying

  --enable-ebcdic

  This automatically implies --enable-rebuild-chartables (see above).

. It is possible to compile pcregrep to use libz and/or libbz2, in order to
  read .gz and .bz2 files (respectively), by specifying one or both of

  --enable-pcregrep-libz
  --enable-pcregrep-libbz2

  Of course, the relevant libraries must be installed on your system.

. It is possible to compile pcretest so that it links with the libreadline
  library, by specifying

  --enable-pcretest-libreadline

  If this is done, when pcretest's input is from a terminal, it reads it using
  the readline() function. This provides line-editing and history facilities.
  Note that libreadline is GPL-licenced, so if you distribute a binary of
  pcretest linked in this way, there may be licensing issues.

  Setting this option causes the -lreadline option to be added to the pcretest
  build. In many operating environments with a system-installed readline
  library this is sufficient. However, in some environments (e.g. if an
  unmodified distribution version of readline is in use), it may be necessary
  to specify something like LIBS="-lncurses" as well. This is because, to quote
  the readline INSTALL, "Readline uses the termcap functions, but does not link
  with the termcap or curses library itself, allowing applications which link
  with readline to choose an appropriate library."
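
The match limit counter mentioned in the options above can be pictured with a
toy matcher. What follows is only a sketch, not PCRE's code: it handles just
literal characters, "." and "*", and all of its names (toy_match, match_here,
match_star) are invented for the illustration. Every call of match_here() bumps
a counter, and matching is abandoned once the counter passes the limit.

```c
#include <assert.h>
#include <stddef.h>

/* Bookkeeping for one match attempt (hypothetical, not a PCRE structure). */
struct toy_state {
    unsigned long calls;   /* calls of match_here() so far */
    unsigned long limit;   /* give up when calls exceeds this */
};

static int match_here(struct toy_state *st, const char *re, const char *s);

/* Match c* followed by the rest of the pattern, trying the shortest
   repetition first and backtracking through longer ones. */
static int match_star(struct toy_state *st, int c, const char *re,
                      const char *s)
{
    do {
        int r = match_here(st, re, s);
        if (r != 0) return r;          /* matched, or limit exceeded */
    } while (*s != '\0' && (*s++ == c || c == '.'));
    return 0;
}

/* Returns 1 for a match, 0 for no match, -1 if the limit was exceeded. */
static int match_here(struct toy_state *st, const char *re, const char *s)
{
    if (++st->calls > st->limit) return -1;
    if (re[0] == '\0') return 1;
    if (re[1] == '*') return match_star(st, re[0], re + 2, s);
    if (*s != '\0' && (re[0] == '.' || re[0] == *s))
        return match_here(st, re + 1, s + 1);
    return 0;
}

/* Anchored match of pattern re at the start of s, with a call limit. */
int toy_match(const char *re, const char *s, unsigned long limit)
{
    struct toy_state st = { 0, limit };
    assert(re != NULL && s != NULL);
    return match_here(&st, re, s);
}
```

With the pattern "a*b" and the subject "aaab" this sketch needs six calls of
match_here(), so a limit of 6 lets the match succeed and a limit of 5 stops it.
A pattern such as "a*a*a*b" against a long run of "a" characters shows why a
limit is useful: the number of calls grows quickly although no match exists.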

The "configure" script builds the following files for the basic C library:

. Makefile is the makefile that builds the library
. config.h contains build-time configuration options for the library
. pcre.h is the public PCRE header file
. pcre-config is a script that shows the settings of "configure" options
. libpcre.pc is data for the pkg-config command
. libtool is a script that builds shared and/or static libraries
. RunTest is a script for running tests on the basic C library
. RunGrepTest is a script for running tests on the pcregrep command

Versions of config.h and pcre.h are distributed in the PCRE tarballs under
the names config.h.generic and pcre.h.generic. These are provided for the
benefit of those who have to build PCRE without the benefit of "configure". If
you use "configure", the .generic versions are not used.

If a C++ compiler is found, the following files are also built:

. libpcrecpp.pc is data for the pkg-config command
. pcrecpparg.h is a header file for programs that call PCRE via the C++ wrapper
. pcre_stringpiece.h is the header for the C++ "stringpiece" functions

The "configure" script also creates config.status, which is an executable
script that can be run to recreate the configuration, and config.log, which
contains compiler output from tests that "configure" runs.

Once "configure" has run, you can run "make". It builds two libraries, called
libpcre and libpcreposix, a test program called pcretest, and the pcregrep
command. If a C++ compiler was found on your system, "make" also builds the C++
wrapper library, which is called libpcrecpp, and some test programs called
pcrecpp_unittest, pcre_scanner_unittest, and pcre_stringpiece_unittest.
Building the C++ wrapper can be disabled by adding --disable-cpp to the
"configure" command.

The command "make check" runs all the appropriate tests. Details of the PCRE
tests are given below in a separate section of this document.

You can use "make install" to install PCRE into live directories on your
system. The following are installed (file names are all relative to the
<prefix> that is set when "configure" is run):

  Commands (bin):
    pcretest
    pcregrep
    pcre-config

  Libraries (lib):
    libpcre
    libpcreposix
    libpcrecpp (if C++ support is enabled)

  Configuration information (lib/pkgconfig):
    libpcre.pc
    libpcrecpp.pc (if C++ support is enabled)

  Header files (include):
    pcre.h
    pcreposix.h
    pcre_scanner.h      )
    pcre_stringpiece.h  ) if C++ support is enabled
    pcrecpp.h           )
    pcrecpparg.h        )

  Man pages (share/man/man{1,3}):
    pcregrep.1
    pcretest.1
    pcre.3
    pcre*.3 (lots more pages, all starting "pcre")

  HTML documentation (share/doc/pcre/html):
    index.html
    *.html (lots more pages, hyperlinked from index.html)

  Text file documentation (share/doc/pcre):
    AUTHORS
    COPYING
    ChangeLog
    LICENCE
    NEWS
    README
    pcre.txt       (a concatenation of the man(3) pages)
    pcretest.txt   the pcretest man page
    pcregrep.txt   the pcregrep man page

If you want to remove PCRE from your system, you can run "make uninstall".
This removes all the files that "make install" installed. However, it does not
remove any directories, because these are often shared with other programs.


Retrieving configuration information on Unix-like systems
---------------------------------------------------------

Running "make install" installs the command pcre-config, which can be used to
recall information about the PCRE configuration and installation. For example:

  pcre-config --version

prints the version number, and

  pcre-config --libs

outputs information about where the library is installed. This command can be
included in makefiles for programs that use PCRE, saving the programmer from
having to remember too many details.

The pkg-config command is another system for saving and retrieving information
about installed libraries. Instead of separate commands for each library, a
single command is used. For example:

  pkg-config --cflags pcre

The data is held in *.pc files that are installed in a directory called
<prefix>/lib/pkgconfig.


Shared libraries on Unix-like systems
-------------------------------------

The default distribution builds PCRE as shared libraries and static libraries,
as long as the operating system supports shared libraries. Shared library
support relies on the "libtool" script which is built as part of the
"configure" process.

The libtool script is used to compile and link both shared and static
libraries. They are placed in a subdirectory called .libs when they are newly
built. The programs pcretest and pcregrep are built to use these uninstalled
libraries (by means of wrapper scripts in the case of shared libraries). When
you use "make install" to install shared libraries, pcregrep and pcretest are
automatically re-built to use the newly installed shared libraries before being
installed themselves. However, the versions left in the build directory still
use the uninstalled libraries.

To build PCRE using static libraries only you must use --disable-shared when
configuring it. For example:

./configure --prefix=/usr/gnu --disable-shared

Then run "make" in the usual way. Similarly, you can use --disable-static to
build only shared libraries.


Cross-compiling on Unix-like systems
------------------------------------

You can specify CC and CFLAGS in the normal way to the "configure" command, in
order to cross-compile PCRE for some other host. However, you should NOT
specify --enable-rebuild-chartables, because if you do, the dftables.c source
file is compiled and run on the local host, in order to generate the inbuilt
character tables (the pcre_chartables.c file). This will probably not work,
because dftables.c needs to be compiled with the local compiler, not the cross
compiler.

When --enable-rebuild-chartables is not specified, pcre_chartables.c is created
by making a copy of pcre_chartables.c.dist, which is a default set of tables
that assumes ASCII code. Cross-compiling with the default tables should not be
a problem.

If you need to modify the character tables when cross-compiling, you should
move pcre_chartables.c.dist out of the way, then compile dftables.c by hand and
run it on the local host to make a new version of pcre_chartables.c.dist.
Then when you cross-compile PCRE this new version of the tables will be used.


Using HP's ANSI C++ compiler (aCC)
----------------------------------

Unless C++ support is disabled by specifying the "--disable-cpp" option of the
"configure" script, you must include the "-AA" option in the CXXFLAGS
environment variable in order for the C++ components to compile correctly.

Also, note that the aCC compiler on PA-RISC platforms may have a defect whereby
needed libraries fail to get included when specifying the "-AA" compiler
option. If you experience unresolved symbols when linking the C++ programs,
use the workaround of specifying the following environment variable prior to
running the "configure" script:

  CXXLDFLAGS="-lstd_v2 -lCsup_v2"


Making new tarballs
-------------------

The command "make dist" creates three PCRE tarballs, in tar.gz, tar.bz2, and
zip formats. The command "make distcheck" does the same, but then does a trial
build of the new distribution to ensure that it works.

If you have modified any of the man page sources in the doc directory, you
should first run the PrepareRelease script before making a distribution. This
script creates the .txt and HTML forms of the documentation from the man pages.


Testing PCRE
------------

To test the basic PCRE library on a Unix system, run the RunTest script that is
created by the configuring process. There is also a script called RunGrepTest
that tests the options of the pcregrep command. If the C++ wrapper library is
built, three test programs called pcrecpp_unittest, pcre_scanner_unittest, and
pcre_stringpiece_unittest are also built.

Both the scripts and all the program tests are run if you obey "make check" or
"make test". For other systems, see the instructions in NON-UNIX-USE.

The RunTest script runs the pcretest test program (which is documented in its
own man page) on each of the testinput files in the testdata directory in
turn, and compares the output with the contents of the corresponding testoutput
files. A file called testtry is used to hold the main output from pcretest
(testsavedregex is also used as a working file). To run pcretest on just one of
the test files, give its number as an argument to RunTest, for example:

  RunTest 2

The first test file can also be fed directly into the perltest.pl script to
check that Perl gives the same results. The only difference you should see is
in the first few lines, where the Perl version is given instead of the PCRE
version.

The second set of tests checks pcre_fullinfo(), pcre_info(), pcre_study(),
pcre_copy_substring(), pcre_get_substring(), pcre_get_substring_list(), error
detection, and run-time flags that are specific to PCRE, as well as the POSIX
wrapper API. It also uses the debugging flags to check some of the internals of
pcre_compile().

If you build PCRE with a locale setting that is not the standard C locale, the
character tables may be different (see next paragraph). In some cases, this may
cause failures in the second set of tests. For example, in a locale where the
isprint() function yields TRUE for characters in the range 128-255, the use of
[:isascii:] inside a character class defines a different set of characters, and
this shows up in this test as a difference in the compiled code, which is being
listed for checking. Where the comparison test output contains [\x00-\x7f] the
test will contain [\x00-\xff], and similarly in some other cases. This is not a
bug in PCRE.

The third set of tests checks pcre_maketables(), the facility for building a
set of character tables for a specific locale and using them instead of the
default tables. The tests make use of the "fr_FR" (French) locale. Before
running the test, the script checks for the presence of this locale by running
the "locale" command. If that command fails, or if it doesn't include "fr_FR"
in the list of available locales, the third test cannot be run, and a comment
is output to say why. If running this test produces instances of the error

  ** Failed to set locale "fr_FR"

in the comparison output, it means that locale is not available on your system,
despite being listed by "locale". This does not mean that PCRE is broken.

[If you are trying to run this test on Windows, you may be able to get it to
work by changing "fr_FR" to "french" everywhere it occurs. Alternatively, use
RunTest.bat. The version of RunTest.bat included with PCRE 7.4 and above uses
Windows versions of test 2. More info on using RunTest.bat is included in the
document entitled NON-UNIX-USE.]

The fourth test checks the UTF-8 support. It is not run automatically unless
PCRE is built with UTF-8 support. To do this you must set --enable-utf8 when
running "configure". This file can also be fed directly to the perltest script,
provided you are running Perl 5.8 or higher. (For Perl 5.6, a small patch,
commented in the script, can be used.)

The fifth test checks error handling with UTF-8 encoding, and internal UTF-8
features of PCRE that are not relevant to Perl.

The sixth test checks the support for Unicode character properties. It is not
run automatically unless PCRE is built with Unicode property support. To do
this you must set --enable-unicode-properties when running "configure".

The seventh, eighth, and ninth tests check the pcre_dfa_exec() alternative
matching function, in non-UTF-8 mode, UTF-8 mode, and UTF-8 mode with Unicode
property support, respectively. The eighth and ninth tests are not run
automatically unless PCRE is built with the relevant support.


Character tables
----------------

For speed, PCRE uses four tables for manipulating and identifying characters
whose code point values are less than 256. The final argument of the
pcre_compile() function is a pointer to a block of memory containing the
concatenated tables. A call to pcre_maketables() can be used to generate a set
of tables in the current locale. If the final argument for pcre_compile() is
passed as NULL, a set of default tables that is built into the binary is used.

The source file called pcre_chartables.c contains the default set of tables. By
default, this is created as a copy of pcre_chartables.c.dist, which contains
tables for ASCII coding. However, if --enable-rebuild-chartables is specified
for ./configure, a different version of pcre_chartables.c is built by the
program dftables (compiled from dftables.c), which uses the ANSI C character
handling functions such as isalnum(), isalpha(), isupper(), islower(), etc. to
build the table sources. This means that the default C locale which is set for
your system will control the contents of these default tables. You can change
the default tables by editing pcre_chartables.c and then re-building PCRE. If
you do this, you should take care to ensure that the file does not get
automatically re-generated. The best way to do this is to move
pcre_chartables.c.dist out of the way and replace it with your customized
tables.

When the dftables program is run as a result of --enable-rebuild-chartables,
it uses the default C locale that is set on your system. It does not pay
attention to the LC_xxx environment variables. In other words, it uses the
system's default locale rather than whatever the compiling user happens to have
set. If you really do want to build a source set of character tables in a
locale that is specified by the LC_xxx variables, you can run the dftables
program by hand with the -L option. For example:

  ./dftables -L pcre_chartables.c.special
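
The difference between the two modes comes down to standard C behaviour: a C
program starts in the "C" locale and only looks at the LC_xxx variables if it
asks to. The sketch below shows the two calls involved. It is an assumption
that the -L option is implemented along these lines; the function names are
made up for the example.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>

/* Name of the locale currently used by isalpha(), tolower(), and friends.
   Passing NULL only queries the setting; it does not change it. */
const char *ctype_locale_name(void)
{
    const char *name = setlocale(LC_CTYPE, NULL);
    assert(name != NULL);
    return name;
}

/* Adopt whatever locale the LC_xxx (and LANG) environment variables name.
   Returns 1 on success, 0 if that locale is not available on this system. */
int use_environment_locale(void)
{
    return setlocale(LC_ALL, "") != NULL;
}
```

Until use_environment_locale() is called, ctype_locale_name() reports "C", so
tables built from the ctype functions at that point do not depend on the
user's environment.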

The first two 256-byte tables provide lower casing and case flipping functions,
respectively. The next table consists of three 32-byte bit maps which identify
digits, "word" characters, and white space, respectively. These are used when
building 32-byte bit maps that represent character classes for code points less
than 256.
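
To make the layout concrete, here is a minimal sketch of how such tables could
be filled in from the standard ctype functions. The function names, and the
choice of putting code point i in bit (i & 7) of byte i/8, are assumptions made
for the illustration rather than a description of dftables.c.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

enum { TABLE_SIZE = 256, BITMAP_SIZE = 32 };   /* 32 bytes * 8 bits = 256 */

/* Fill a lower-casing table and a case-flipping table for code points 0-255. */
void build_case_tables(unsigned char lower[TABLE_SIZE],
                       unsigned char flip[TABLE_SIZE])
{
    assert(lower != NULL && flip != NULL);
    for (int i = 0; i < TABLE_SIZE; i++) {
        lower[i] = (unsigned char)tolower(i);
        flip[i] = (unsigned char)(islower(i) ? toupper(i) : tolower(i));
    }
}

/* Fill a 32-byte bit map with one bit set for every decimal digit. */
void build_digit_bitmap(unsigned char map[BITMAP_SIZE])
{
    assert(map != NULL);
    memset(map, 0, BITMAP_SIZE);
    for (int i = 0; i < TABLE_SIZE; i++) {
        if (isdigit(i))
            map[i / 8] |= (unsigned char)(1u << (i & 7));
    }
}

/* Test whether code point c is a member of a 32-byte bit map. */
int bitmap_contains(const unsigned char map[BITMAP_SIZE], unsigned char c)
{
    return (map[c / 8] >> (c & 7)) & 1;
}
```

A character class for code points below 256 can then be represented as one more
32-byte map, built by OR-ing together maps like the one for digits.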

The final 256-byte table has bits indicating various character types, as
follows:

    1   white space character
    2   letter
    4   decimal digit
    8   hexadecimal digit
   16   alphanumeric or '_'
  128   regular expression metacharacter or binary zero

You should not alter the set of characters that contain the 128 bit, as that
will cause PCRE to malfunction.
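
The bit values listed above can be combined as in the following sketch. The
constant names, the use of plain isspace(), and the exact list of
metacharacters are assumptions for the example; the values of the bits are the
ones given in the table.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

enum {
    CT_SPACE  = 1,     /* white space character */
    CT_LETTER = 2,     /* letter */
    CT_DIGIT  = 4,     /* decimal digit */
    CT_XDIGIT = 8,     /* hexadecimal digit */
    CT_WORD   = 16,    /* alphanumeric or '_' */
    CT_META   = 128    /* regular expression metacharacter or binary zero */
};

/* Fill the 256-byte character type table from the ctype functions. */
void build_type_table(unsigned char types[256])
{
    /* Assumed metacharacter set, chosen for illustration only. */
    static const char meta[] = "\\*+?{^.$|()[";

    assert(types != NULL);
    for (int i = 0; i < 256; i++) {
        unsigned char x = 0;
        if (isspace(i)) x |= CT_SPACE;
        if (isalpha(i)) x |= CT_LETTER;
        if (isdigit(i)) x |= CT_DIGIT;
        if (isxdigit(i)) x |= CT_XDIGIT;
        if (isalnum(i) || i == '_') x |= CT_WORD;
        if (i == 0 || strchr(meta, i) != NULL) x |= CT_META;
        types[i] = x;
    }
}
```

In the C locale this gives 'a' the value 26 (letter, hexadecimal digit, and
word character) and '5' the value 28, while '*' and binary zero carry only the
128 bit.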


File manifest
-------------

The distribution should contain the following files:

(A) Source files of the PCRE library functions and their headers:

  dftables.c              auxiliary program for building pcre_chartables.c
                            when --enable-rebuild-chartables is specified

  pcre_chartables.c.dist  a default set of character tables that assume ASCII
                            coding; used, unless --enable-rebuild-chartables is
                            specified, by copying to pcre_chartables.c

  pcreposix.c             )
  pcre_compile.c          )
  pcre_config.c           )
  pcre_dfa_exec.c         )
  pcre_exec.c             )
  pcre_fullinfo.c         )
  pcre_get.c              ) sources for the functions in the library,
  pcre_globals.c          )   and some internal functions that they use
  pcre_info.c             )
  pcre_maketables.c       )
  pcre_newline.c          )
  pcre_ord2utf8.c         )
  pcre_refcount.c         )
  pcre_study.c            )
  pcre_tables.c           )
  pcre_try_flipped.c      )
  pcre_ucd.c              )
  pcre_valid_utf8.c       )
  pcre_version.c          )
  pcre_xclass.c           )
  pcre_printint.src       ) debugging function that is #included in pcretest,
                          )   and can also be #included in pcre_compile()
  pcre.h.in               template for pcre.h when built by "configure"
  pcreposix.h             header for the external POSIX wrapper API
  pcre_internal.h         header for internal use
  ucp.h                   header for Unicode property handling

  config.h.in             template for config.h, which is built by "configure"

  pcrecpp.h               public header file for the C++ wrapper
  pcrecpparg.h.in         template for another C++ header file
  pcre_scanner.h          public header file for C++ scanner functions
  pcrecpp.cc              )
  pcre_scanner.cc         ) source for the C++ wrapper library

  pcre_stringpiece.h.in   template for pcre_stringpiece.h, the header for the
                            C++ stringpiece functions
  pcre_stringpiece.cc     source for the C++ stringpiece functions

(B) Source files for programs that use PCRE:

  pcredemo.c              simple demonstration of coding calls to PCRE
  pcregrep.c              source of a grep utility that uses PCRE
  pcretest.c              comprehensive test program

(C) Auxiliary files:

  132html                 script to turn "man" pages into HTML
  AUTHORS                 information about the author of PCRE
  ChangeLog               log of changes to the code
  CleanTxt                script to clean nroff output for txt man pages
  Detrail                 script to remove trailing spaces
  HACKING                 some notes about the internals of PCRE
  INSTALL                 generic installation instructions
  LICENCE                 conditions for the use of PCRE
  COPYING                 the same, using GNU's standard name
  Makefile.in             ) template for Unix Makefile, which is built by
                          )   "configure"
  Makefile.am             ) the automake input that was used to create
                          )   Makefile.in
  NEWS                    important changes in this release
  NON-UNIX-USE            notes on building PCRE on non-Unix systems
  PrepareRelease          script to make preparations for "make dist"
  README                  this file
  RunTest                 a Unix shell script for running tests
  RunGrepTest             a Unix shell script for pcregrep tests
  aclocal.m4              m4 macros (generated by "aclocal")
  config.guess            ) files used by libtool,
  config.sub              )   used only when building a shared library
  configure               a configuring shell script (built by autoconf)
  configure.ac            ) the autoconf input that was used to build
                          )   "configure" and config.h
  depcomp                 ) script to find program dependencies, generated by
                          )   automake
  doc/*.3                 man page sources for the PCRE functions
  doc/*.1                 man page sources for pcregrep and pcretest
  doc/index.html.src      the base HTML page
  doc/html/*              HTML documentation
  doc/pcre.txt            plain text version of the man pages
  doc/pcretest.txt        plain text documentation of test program
  doc/perltest.txt        plain text documentation of Perl test program
  install-sh              a shell script for installing files
  libpcre.pc.in           template for libpcre.pc for pkg-config
  libpcrecpp.pc.in        template for libpcrecpp.pc for pkg-config
  ltmain.sh               file used to build a libtool script
  missing                 ) common stub for a few missing GNU programs while
                          )   installing, generated by automake
  mkinstalldirs           script for making install directories
  perltest.pl             Perl test program
  pcre-config.in          source of script which retains PCRE information
  pcrecpp_unittest.cc          )
  pcre_scanner_unittest.cc     ) test programs for the C++ wrapper
  pcre_stringpiece_unittest.cc )
  testdata/testinput*     test data for main library tests
  testdata/testoutput*    expected test results
  testdata/grep*          input and output for pcregrep tests

(D) Auxiliary files for cmake support

  cmake/COPYING-CMAKE-SCRIPTS
  cmake/FindPackageHandleStandardArgs.cmake
  cmake/FindReadline.cmake
  CMakeLists.txt
  config-cmake.h.in

(E) Auxiliary files for VPASCAL

  makevp.bat
  makevp_c.txt
  makevp_l.txt
  pcregexp.pas

(F) Auxiliary files for building PCRE "by hand"

  pcre.h.generic          ) a version of the public PCRE header file
                          )   for use in non-"configure" environments
  config.h.generic        ) a version of config.h for use in non-"configure"
                          )   environments

(G) Miscellaneous

  RunTest.bat             a script for running tests under Windows

Philip Hazel
Email local part: ph10
Email domain: cam.ac.uk
Last updated: 05 September 2008