- <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
- <html>
- <head>
- <title>Extending SWIG to support new languages</title>
- <link rel="stylesheet" type="text/css" href="style.css">
- </head>
- <body bgcolor="#ffffff">
- <H1><a name="Extending"></a>35 Extending SWIG to support new languages</H1>
- <!-- INDEX -->
- <div class="sectiontoc">
- <ul>
- <li><a href="#Extending_nn2">Introduction</a>
- <li><a href="#Extending_nn3">Prerequisites</a>
- <li><a href="#Extending_nn4">The Big Picture</a>
- <li><a href="#Extending_nn5">Execution Model</a>
- <ul>
- <li><a href="#Extending_nn6">Preprocessing</a>
- <li><a href="#Extending_nn7">Parsing</a>
- <li><a href="#Extending_nn8">Parse Trees</a>
- <li><a href="#Extending_nn9">Attribute namespaces</a>
- <li><a href="#Extending_nn10">Symbol Tables</a>
- <li><a href="#Extending_nn11">The %feature directive</a>
- <li><a href="#Extending_nn12">Code Generation</a>
- <li><a href="#Extending_nn13">SWIG and XML</a>
- </ul>
- <li><a href="#Extending_nn14">Primitive Data Structures</a>
- <ul>
- <li><a href="#Extending_nn15">Strings</a>
- <li><a href="#Extending_nn16">Hashes</a>
- <li><a href="#Extending_nn17">Lists</a>
- <li><a href="#Extending_nn18">Common operations</a>
- <li><a href="#Extending_nn19">Iterating over Lists and Hashes</a>
- <li><a href="#Extending_nn20">I/O</a>
- </ul>
- <li><a href="#Extending_nn21">Navigating and manipulating parse trees</a>
- <li><a href="#Extending_nn22">Working with attributes</a>
- <li><a href="#Extending_nn23">Type system</a>
- <ul>
- <li><a href="#Extending_nn24">String encoding of types</a>
- <li><a href="#Extending_nn25">Type construction</a>
- <li><a href="#Extending_nn26">Type tests</a>
- <li><a href="#Extending_nn27">Typedef and inheritance</a>
- <li><a href="#Extending_nn28">Lvalues</a>
- <li><a href="#Extending_nn29">Output functions</a>
- </ul>
- <li><a href="#Extending_nn30">Parameters</a>
- <li><a href="#Extending_nn31">Writing a Language Module</a>
- <ul>
- <li><a href="#Extending_nn32">Execution model</a>
- <li><a href="#Extending_starting_out">Starting out</a>
- <li><a href="#Extending_nn34">Command line options</a>
- <li><a href="#Extending_nn35">Configuration and preprocessing</a>
- <li><a href="#Extending_nn36">Entry point to code generation</a>
- <li><a href="#Extending_nn37">Module I/O and wrapper skeleton</a>
- <li><a href="#Extending_nn38">Low-level code generators</a>
- <li><a href="#Extending_nn39">Configuration files</a>
- <li><a href="#Extending_nn40">Runtime support</a>
- <li><a href="#Extending_nn41">Standard library files</a>
- <li><a href="#Extending_nn42">User examples</a>
- <li><a href="#Extending_test_suite">Test driven development and the test-suite</a>
- <ul>
- <li><a href="#Extending_running_test_suite">Running the test-suite</a>
- </ul>
- <li><a href="#Extending_nn43">Documentation</a>
- <li><a href="#Extending_prerequisites">Prerequisites for adding a new language module to the SWIG distribution</a>
- <li><a href="#Extending_coding_style_guidelines">Coding style guidelines</a>
- </ul>
- <li><a href="#Extending_debugging_options">Debugging Options</a>
- <li><a href="#Extending_nn46">Guide to parse tree nodes</a>
- <li><a href="#Extending_further_info">Further Development Information</a>
- </ul>
- </div>
- <!-- INDEX -->
- <H2><a name="Extending_nn2"></a>35.1 Introduction</H2>
- <p>
- This chapter describes SWIG's internal organization and the process by which
- new target languages can be developed. First, a brief word of warning: SWIG
- is continually evolving.
- The information in this chapter is mostly up to
- date, but changes are ongoing. Expect a few inconsistencies.
- </p>
- <p>
- Also, this chapter is not meant to be a hand-holding tutorial. As a starting point,
- you should probably look at one of SWIG's existing modules.
- </p>
- <H2><a name="Extending_nn3"></a>35.2 Prerequisites</H2>
- <p>
- In order to extend SWIG, it is useful to have the following background:
- </p>
- <ul>
- <li>An understanding of the C API for the target language.
- <li>A good grasp of the C++ type system.
- <li>An understanding of typemaps and some of SWIG's advanced features.
- <li>Some familiarity with writing C++ (language modules are currently written in C++).
- </ul>
- <p>
- Since SWIG is essentially a specialized C++ compiler, it may be useful
- to have some prior experience with compiler design (perhaps even a
- compilers course) to better understand certain parts of the system. A
- number of books will also be useful. For example, "The C Programming
- Language" by Kernighan and Ritchie (a.k.a. "K&amp;R") and the C++ standard,
- "ISO/IEC 14882 Programming Languages - C++" will be of great use.
- </p>
- <p>
- Also, it is useful to keep in mind that SWIG primarily operates as an
- extension of the C++ <em>type</em> system. At first glance, this might not be
- obvious, but almost all SWIG directives as well as the low-level generation of
- wrapper code are driven by C++ datatypes.
- </p>
- <H2><a name="Extending_nn4"></a>35.3 The Big Picture</H2>
- <p>
- SWIG is a special purpose compiler that parses C++ declarations to
- generate wrapper code. To make this conversion possible, SWIG makes
- three fundamental extensions to the C++ language:
- </p>
- <ul>
- <li><b>Typemaps</b>. Typemaps are used to define the
- conversion/marshalling behavior of specific C++ datatypes. All type conversion in SWIG is
- based on typemaps. Furthermore, the association of typemaps to datatypes utilizes an advanced pattern matching
- mechanism that is fully integrated with the C++ type system.
- </li>
- <li><b>Declaration Annotation</b>. To customize wrapper code
- generation, most declarations can be annotated with special features.
- For example, you can make a variable read-only, you can ignore a
- declaration, you can rename a member function, you can add exception
- handling, and so forth. Virtually all of these customizations are built on top of a low-level
- declaration annotator that can attach arbitrary attributes to any declaration.
- Code generation modules can look for these attributes to guide the wrapping process.
- </li>
- <li><b>Class extension</b>. SWIG allows classes and structures to be extended with new
- methods and attributes (the <tt>%extend</tt> directive). This has the effect of altering
- the API in the target language and can be used to generate OO interfaces to C libraries.
- </ul>
- <p>
- It is important to emphasize that virtually all SWIG features reduce to one of these three
- fundamental concepts. The type system and pattern matching rules also play a critical
- role in making the system work. For example, both typemaps and declaration annotation are
- based on pattern matching and interact heavily with the underlying type system.
- </p>
- <H2><a name="Extending_nn5"></a>35.4 Execution Model</H2>
- <p>
- When you run SWIG on an interface, processing is handled in stages by a series of system components:
- </p>
- <ul>
- <li>An integrated C preprocessor reads a collection of configuration
- files and the specified interface file into memory. The preprocessor
- performs the usual functions including macro expansion and file
- inclusion. However, the preprocessor also performs some transformations of the
- interface. For instance, <tt>#define</tt> statements are sometimes transformed into
- <tt>%constant</tt> declarations. In addition, information related to file/line number
- tracking is inserted.
- </li>
- <li>A C/C++ parser reads the preprocessed input and generates a full
- parse tree of all of the SWIG directives and C declarations found.
- The parser is responsible for many aspects of the system including
- renaming, declaration annotation, and template expansion. However, the parser
- does not produce any output nor does it interact with the target
- language module as it runs. SWIG is not a one-pass compiler.
- </li>
- <li>A type-checking pass is made. This adjusts all of the C++ typenames to properly
- handle namespaces, typedefs, nested classes, and other issues related to type scoping.
- </li>
- <li>A semantic pass is made on the parse tree to collect information
- related to properties of the C++ interface. For example, this pass
- would determine whether or not a class allows a default constructor.
- </li>
- <li>A code generation pass is made using a specific target language
- module. This phase is responsible for generating the actual wrapper
- code. All of SWIG's user-defined modules are invoked during this
- latter stage of compilation.
- </li>
- </ul>
- <p>
- The next few sections briefly describe some of these stages.
- </p>
- <H3><a name="Extending_nn6"></a>35.4.1 Preprocessing</H3>
- <p>
- The preprocessor plays a critical role in the SWIG implementation. This is because a lot
- of SWIG's processing and internal configuration is managed not by code written in C, but
- by configuration files in the SWIG library. In fact, when you
- run SWIG, parsing starts with a small interface file like this (note: this explains
- the cryptic error messages that new users sometimes get when SWIG is misconfigured or installed
- incorrectly):
- </p>
- <div class="code">
- <pre>
- %include "swig.swg" // Global SWIG configuration
- %include "<em>langconfig.swg</em>" // Language specific configuration
- %include "yourinterface.i" // Your interface file
- </pre>
- </div>
- <p>
- The <tt>swig.swg</tt> file contains global configuration information. In addition, this file
- defines many of SWIG's standard directives as macros. For instance, part of
- <tt>swig.swg</tt> looks like this:
- </p>
- <div class="code">
- <pre>
- ...
- /* Code insertion directives such as %wrapper %{ ... %} */
- #define %begin %insert("begin")
- #define %runtime %insert("runtime")
- #define %header %insert("header")
- #define %wrapper %insert("wrapper")
- #define %init %insert("init")
- /* Access control directives */
- #define %immutable %feature("immutable","1")
- #define %mutable %feature("immutable")
- /* Directives for callback functions */
- #define %callback(x) %feature("callback") `x`;
- #define %nocallback %feature("callback");
- /* %ignore directive */
- #define %ignore %rename($ignore)
- #define %ignorewarn(x) %rename("$ignore:" x)
- ...
- </pre>
- </div>
- <p>
- The fact that most of the standard SWIG directives are macros is
- intended to simplify the implementation of the internals. For instance,
- rather than having to support dozens of special directives, it is
- easier to have a few basic primitives such as <tt>%feature</tt> or
- <tt>%insert</tt>.
- </p>
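- <p>
- The idea that a convenience directive is only a macro over a primitive can be
- illustrated with a small C++ sketch. This is not SWIG's preprocessor: the table
- <tt>kDirectiveMacros</tt> and the function <tt>expandDirective</tt> are hypothetical
- names, and only object-like macros taken from the excerpt above are modelled.
- </p>

```cpp
#include <map>
#include <string>

// Hypothetical table modelled on the swig.swg excerpt above: each
// convenience directive is a macro for an %insert, %feature or %rename primitive.
static const std::map<std::string, std::string> kDirectiveMacros = {
    {"%begin",     "%insert(\"begin\")"},
    {"%runtime",   "%insert(\"runtime\")"},
    {"%header",    "%insert(\"header\")"},
    {"%wrapper",   "%insert(\"wrapper\")"},
    {"%init",      "%insert(\"init\")"},
    {"%immutable", "%feature(\"immutable\",\"1\")"},
    {"%mutable",   "%feature(\"immutable\")"},
    {"%ignore",    "%rename($ignore)"},
};

// Expand the first token of a line if it names a known directive macro.
// Lines that start with anything else are returned unchanged.
std::string expandDirective(const std::string &line) {
  std::string::size_type end = line.find_first_of(" \t;");
  std::string head = line.substr(0, end);
  auto it = kDirectiveMacros.find(head);
  if (it == kDirectiveMacros.end())
    return line;
  std::string rest = (end == std::string::npos) ? "" : line.substr(end);
  return it->second + rest;
}
```

- <p>
- Under these assumptions, <tt>expandDirective("%immutable;")</tt> yields
- <tt>%feature("immutable","1");</tt>, so a later stage only has to understand
- the <tt>%feature</tt> primitive.
- </p>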
- <p>
- The <em><tt>langconfig.swg</tt></em> file is supplied by the target
- language. This file contains language-specific configuration
- information. More often than not, this file provides run-time wrapper
- support code (e.g., the type-checker) as well as a collection of
- typemaps that define the default wrapping behavior. Note: the name of this
- file depends on the target language and is usually something like <tt>python.swg</tt>
- or <tt>perl5.swg</tt>.
- </p>
- <p>
- As a debugging aid, the text that SWIG feeds to its C++ parser can be
- obtained by running <tt>swig -E interface.i</tt>. This output
- probably isn't too useful in general, but it will show how macros have
- been expanded as well as everything else that goes into the low-level
- construction of the wrapper code.
- </p>
- <H3><a name="Extending_nn7"></a>35.4.2 Parsing</H3>
- <p>
- The current C++ parser handles a subset of C++. Most incompatibilities with C are due to
- subtle aspects of how SWIG parses declarations. Specifically, SWIG expects all C/C++ declarations to follow this general form:
- </p>
- <div class="diagram">
- <pre>
- <em>storage</em> <em>type</em> <em>declarator</em> <em>initializer</em>;
- </pre>
- </div>
- <p>
- <tt><em>storage</em></tt> is a keyword such as <tt>extern</tt>,
- <tt>static</tt>, <tt>typedef</tt>, or <tt>virtual</tt>. <tt><em>type</em></tt> is a primitive
- datatype such as <tt>int</tt> or <tt>void</tt>. <tt><em>type</em></tt> may be optionally
- qualified with a qualifier such as <tt>const</tt> or <tt>volatile</tt>. <tt><em>declarator</em></tt>
- is a name with additional type-construction modifiers attached to it (pointers, arrays, references,
- functions, etc.). Examples of declarators include <tt>*x</tt>, <tt>**x</tt>, <tt>x[20]</tt>, and
- <tt>(*x)(int,double)</tt>. The <tt><em>initializer</em></tt> may be a value assigned using <tt>=</tt> or
- a body of code enclosed in braces <tt>{ ... }</tt>.
- </p>
- <p>
- This declaration format covers most common C++ declarations. However, the C++ standard
- is somewhat more flexible in the placement of the parts. For example, it is technically legal, although
- uncommon, to write something like <tt>int typedef const a</tt> in your program. SWIG simply
- doesn't bother to deal with this case.
- </p>
- <p>
- The other significant difference between C++ and SWIG is in the
- treatment of typenames. In C++, if you have a declaration like this,
- </p>
- <div class="code">
- <pre>
- int blah(Foo *x, Bar *y);
- </pre>
- </div>
- <p>
- it won't parse correctly unless <tt>Foo</tt> and <tt>Bar</tt> have
- been previously defined as types either using a <tt>class</tt>
- definition or a <tt>typedef</tt>. The reasons for this are subtle,
- but this treatment of typenames is normally integrated at the level of the C
- tokenizer: when a typename appears, a different token is returned to the parser
- instead of an identifier.
- </p>
- <p>
- SWIG does not operate in this manner: any legal identifier can be used
- as a type name. This is primarily motivated by the use
- of SWIG with partially defined data. Specifically,
- SWIG is supposed to be easy to use on interfaces with missing type information.
- </p>
- <p>
- Because of the different treatment of typenames, the most serious
- limitation of the SWIG parser is that it can't process type declarations where
- an extra (and unnecessary) grouping operator is used. For example:
- </p>
- <div class="code">
- <pre>
- int (x); /* A variable x */
- int (y)(int); /* A function y */
- </pre>
- </div>
- <p>
- The placing of extra parentheses in type declarations like this is
- already recognized by the C++ community as a potential source of
- strange programming errors. For example, Scott Meyers' "Effective STL"
- discusses this problem in a section on avoiding C++'s "most vexing
- parse."
- </p>
- <p>
- The parser is also unable to handle declarations with no return type or bare argument names.
- For example, in an old C program, you might see things like this:
- </p>
- <div class="code">
- <pre>
- foo(a,b) {
- ...
- }
- </pre>
- </div>
- <p>
- In this case, the return type as well as the types of the arguments
- are taken by the C compiler to be an <tt>int</tt>. However, SWIG
- interprets the above code as an abstract declarator for a function
- returning a <tt>foo</tt> and taking types <tt>a</tt> and <tt>b</tt> as
- arguments.
- </p>
- <H3><a name="Extending_nn8"></a>35.4.3 Parse Trees</H3>
- <p>
- The SWIG parser produces a complete parse tree of the input file before any wrapper code
- is actually generated. Each item in the tree is known as a "Node". Each node is identified
- by a symbolic tag. Furthermore, a node may have an arbitrary number of children.
- The parse tree structure and tag names of an interface can be displayed using <tt>swig -debug-tags</tt>.
- For example:
- </p>
- <div class="shell">
- <pre>
- $ <b>swig -c++ -python -debug-tags example.i</b>
- . top (example.i:1)
- . top . include (example.i:1)
- . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/swig.swg:71)
- . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/swig.swg:71)
- . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/swig.swg:83)
- . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/swig.swg:83)
- . top . include (example.i:4)
- . top . include . insert (/r0/beazley/Projects/lib/swig1.3/python/python.swg:7)
- . top . include . insert (/r0/beazley/Projects/lib/swig1.3/python/python.swg:8)
- . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:19)
- ...
- . top . include (example.i:6)
- . top . include . module (example.i:2)
- . top . include . insert (example.i:6)
- . top . include . include (example.i:9)
- . top . include . include . class (example.h:3)
- . top . include . include . class . access (example.h:4)
- . top . include . include . class . constructor (example.h:7)
- . top . include . include . class . destructor (example.h:10)
- . top . include . include . class . cdecl (example.h:11)
- . top . include . include . class . cdecl (example.h:11)
- . top . include . include . class . cdecl (example.h:12)
- . top . include . include . class . cdecl (example.h:13)
- . top . include . include . class . cdecl (example.h:14)
- . top . include . include . class . cdecl (example.h:15)
- . top . include . include . class (example.h:18)
- . top . include . include . class . access (example.h:19)
- . top . include . include . class . cdecl (example.h:20)
- . top . include . include . class . access (example.h:21)
- . top . include . include . class . constructor (example.h:22)
- . top . include . include . class . cdecl (example.h:23)
- . top . include . include . class . cdecl (example.h:24)
- . top . include . include . class (example.h:27)
- . top . include . include . class . access (example.h:28)
- . top . include . include . class . cdecl (example.h:29)
- . top . include . include . class . access (example.h:30)
- . top . include . include . class . constructor (example.h:31)
- . top . include . include . class . cdecl (example.h:32)
- . top . include . include . class . cdecl (example.h:33)
- </pre>
- </div>
- <p>
- Even for the simplest interface, the parse tree structure is larger than you might expect. For example, in the
- above output, a substantial number of nodes are actually generated by the <tt>python.swg</tt> configuration file
- which defines typemaps and other directives. The contents of the user-supplied input file don't appear until the end
- of the output.
- </p>
- <p>
- The contents of each parse tree node consist of a collection of attribute/value
- pairs. Internally, the nodes are simply represented by hash tables. A display of
- the entire parse-tree structure can be obtained using <tt>swig -debug-top &lt;n&gt;</tt>, where <tt>n</tt> is
- the stage being processed.
- There are a number of other parse tree display options. For example, <tt>swig -debug-module &lt;n&gt;</tt> will
- avoid displaying system parse information and only display the parse tree pertaining to the user's module at
- stage <tt>n</tt> of processing.
- </p>
- <div class="shell">
- <pre>
- $ swig -c++ -python -debug-module 4 example.i
- +++ include ----------------------------------------
- | name - "example.i"
- +++ module ----------------------------------------
- | name - "example"
- |
- +++ insert ----------------------------------------
- | code - "\n#include \"example.h\"\n"
- |
- +++ include ----------------------------------------
- | name - "example.h"
- +++ class ----------------------------------------
- | abstract - "1"
- | sym:name - "Shape"
- | name - "Shape"
- | kind - "class"
- | symtab - 0x40194140
- | sym:symtab - 0x40191078
- +++ access ----------------------------------------
- | kind - "public"
- |
- +++ constructor ----------------------------------------
- | sym:name - "Shape"
- | name - "Shape"
- | decl - "f()."
- | code - "{\n nshapes++;\n }"
- | sym:symtab - 0x40194140
- |
- +++ destructor ----------------------------------------
- | sym:name - "~Shape"
- | name - "~Shape"
- | storage - "virtual"
- | code - "{\n nshapes--;\n }"
- | sym:symtab - 0x40194140
- |
- +++ cdecl ----------------------------------------
- | sym:name - "x"
- | name - "x"
- | decl - ""
- | type - "double"
- | sym:symtab - 0x40194140
- |
- +++ cdecl ----------------------------------------
- | sym:name - "y"
- | name - "y"
- | decl - ""
- | type - "double"
- | sym:symtab - 0x40194140
- |
- +++ cdecl ----------------------------------------
- | sym:name - "move"
- | name - "move"
- | decl - "f(double,double)."
- | parms - double ,double
- | type - "void"
- | sym:symtab - 0x40194140
- |
- +++ cdecl ----------------------------------------
- | sym:name - "area"
- | name - "area"
- | decl - "f(void)."
- | parms - void
- | storage - "virtual"
- | value - "0"
- | type - "double"
- | sym:symtab - 0x40194140
- |
- +++ cdecl ----------------------------------------
- | sym:name - "perimeter"
- | name - "perimeter"
- | decl - "f(void)."
- | parms - void
- | storage - "virtual"
- | value - "0"
- | type - "double"
- | sym:symtab - 0x40194140
- |
- +++ cdecl ----------------------------------------
- | sym:name - "nshapes"
- | name - "nshapes"
- | decl - ""
- | storage - "static"
- | type - "int"
- | sym:symtab - 0x40194140
- |
- +++ class ----------------------------------------
- | sym:name - "Circle"
- | name - "Circle"
- | kind - "class"
- | bases - 0x40194510
- | symtab - 0x40194538
- | sym:symtab - 0x40191078
- +++ access ----------------------------------------
- | kind - "private"
- |
- +++ cdecl ----------------------------------------
- | name - "radius"
- | decl - ""
- | type - "double"
- |
- +++ access ----------------------------------------
- | kind - "public"
- |
- +++ constructor ----------------------------------------
- | sym:name - "Circle"
- | name - "Circle"
- | parms - double
- | decl - "f(double)."
- | code - "{ }"
- | sym:symtab - 0x40194538
- |
- +++ cdecl ----------------------------------------
- | sym:name - "area"
- | name - "area"
- | decl - "f(void)."
- | parms - void
- | storage - "virtual"
- | type - "double"
- | sym:symtab - 0x40194538
- |
- +++ cdecl ----------------------------------------
- | sym:name - "perimeter"
- | name - "perimeter"
- | decl - "f(void)."
- | parms - void
- | storage - "virtual"
- | type - "double"
- | sym:symtab - 0x40194538
- |
- +++ class ----------------------------------------
- | sym:name - "Square"
- | name - "Square"
- | kind - "class"
- | bases - 0x40194760
- | symtab - 0x40194788
- | sym:symtab - 0x40191078
- +++ access ----------------------------------------
- | kind - "private"
- |
- +++ cdecl ----------------------------------------
- | name - "width"
- | decl - ""
- | type - "double"
- |
- +++ access ----------------------------------------
- | kind - "public"
- |
- +++ constructor ----------------------------------------
- | sym:name - "Square"
- | name - "Square"
- | parms - double
- | decl - "f(double)."
- | code - "{ }"
- | sym:symtab - 0x40194788
- |
- +++ cdecl ----------------------------------------
- | sym:name - "area"
- | name - "area"
- | decl - "f(void)."
- | parms - void
- | storage - "virtual"
- | type - "double"
- | sym:symtab - 0x40194788
- |
- +++ cdecl ----------------------------------------
- | sym:name - "perimeter"
- | name - "perimeter"
- | decl - "f(void)."
- | parms - void
- | storage - "virtual"
- | type - "double"
- | sym:symtab - 0x40194788
- </pre>
- </div>
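- <p>
- The structure described in this section (a tagged node holding attribute/value
- pairs plus any number of children) can be modelled in a few lines of C++. The
- sketch below is only an illustration: real SWIG nodes are hash objects, and the
- names <tt>Node</tt> (as a struct) and <tt>collectTags</tt> are assumptions made
- for this example. It produces tag paths in the spirit of <tt>-debug-tags</tt>,
- without the file and line information.
- </p>

```cpp
#include <map>
#include <string>
#include <vector>

// Simplified stand-in for a parse tree node: a tag, attribute/value pairs,
// and child nodes. Real SWIG nodes are hash objects; this is only a model.
struct Node {
  std::string tag;
  std::map<std::string, std::string> attrs;
  std::vector<Node> children;
};

// Collect one line per node, each showing the path of tags from the root.
void collectTags(const Node &n, const std::string &prefix,
                 std::vector<std::string> &out) {
  std::string path = prefix + " . " + n.tag;
  out.push_back(path);
  for (const Node &child : n.children)
    collectTags(child, path, out);
}
```

- <p>
- For a <tt>top</tt> node containing one <tt>class</tt> node with a single
- <tt>cdecl</tt> child, <tt>collectTags</tt> collects three paths, the last being
- <tt> . top . class . cdecl</tt>.
- </p>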
- <H3><a name="Extending_nn9"></a>35.4.4 Attribute namespaces</H3>
- <p>
- Attributes of parse tree nodes are often prepended with a namespace qualifier.
- For example, the attributes
- <tt>sym:name</tt> and <tt>sym:symtab</tt> are attributes related to
- symbol table management and are prefixed with <tt>sym:</tt>. As a
- general rule, only those attributes which are directly related to the raw declaration
- appear without a prefix (type, name, declarator, etc.).
- </p>
- <p>
- Target language modules may add additional attributes to nodes to assist the generation
- of wrapper code. The convention for doing this is to place these attributes in a namespace
- that matches the name of the target language. For example, <tt>python:foo</tt> or
- <tt>perl:foo</tt>.
- </p>
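- <p>
- As a rough illustration of the prefix convention, the following C++ sketch splits
- attribute names on the first colon and selects the attributes that belong to one
- namespace. The helper names <tt>attributeNamespace</tt> and <tt>attributesIn</tt>
- are hypothetical, and the node is modelled as a plain <tt>std::map</tt> rather than
- a SWIG hash.
- </p>

```cpp
#include <map>
#include <string>
#include <vector>

// Return the namespace qualifier of an attribute name ("sym" for "sym:name"),
// or an empty string for raw declaration attributes such as "name" or "type".
std::string attributeNamespace(const std::string &attr) {
  std::string::size_type colon = attr.find(':');
  return colon == std::string::npos ? std::string() : attr.substr(0, colon);
}

// Collect the names of all attributes that live in the given namespace.
// Passing an empty namespace selects the unprefixed attributes.
std::vector<std::string> attributesIn(
    const std::map<std::string, std::string> &attrs, const std::string &ns) {
  std::vector<std::string> names;
  for (const auto &kv : attrs)
    if (attributeNamespace(kv.first) == ns)
      names.push_back(kv.first);
  return names;
}
```

- <p>
- A language module following the convention above would look only at its own
- namespace, for example <tt>attributesIn(attrs, "python")</tt>, and leave the
- <tt>sym:</tt> attributes to the symbol table code.
- </p>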
- <H3><a name="Extending_nn10"></a>35.4.5 Symbol Tables</H3>
- <p>
- During parsing, all symbols are managed in the space of the target
- language. The <tt>sym:name</tt> attribute of each node contains the symbol name
- selected by the parser. Normally, <tt>sym:name</tt> and <tt>name</tt>
- are the same. However, the <tt>%rename</tt> directive can be used to
- change the value of <tt>sym:name</tt>. You can see the effect of
- <tt>%rename</tt> by trying it on a simple interface and dumping the
- parse tree. For example:
- </p>
- <div class="code">
- <pre>
- %rename(foo_i) foo(int);
- %rename(foo_d) foo(double);
- void foo(int);
- void foo(double);
- void foo(Bar *b);
- </pre>
- </div>
- <p>
- There are various <tt>-debug-</tt> options that can be useful for debugging and analysing the parse tree.
- For example, the <tt>-debug-top &lt;n&gt;</tt> or <tt>-debug-module &lt;n&gt;</tt> options will
- dump the entire parse tree or the module subtree at one of the four <tt>n</tt> stages of processing.
- The parse tree can be viewed after the final stage of processing by running SWIG:
- </p>
- <div class="shell">
- <pre>
- $ swig -debug-top 4 example.i
- ...
- +++ cdecl ----------------------------------------
- | sym:name - "foo_i"
- | name - "foo"
- | decl - "f(int)."
- | parms - int
- | type - "void"
- | sym:symtab - 0x40165078
- |
- +++ cdecl ----------------------------------------
- | sym:name - "foo_d"
- | name - "foo"
- | decl - "f(double)."
- | parms - double
- | type - "void"
- | sym:symtab - 0x40165078
- |
- +++ cdecl ----------------------------------------
- | sym:name - "foo"
- | name - "foo"
- | decl - "f(p.Bar)."
- | parms - Bar *
- | type - "void"
- | sym:symtab - 0x40165078
- </pre>
- </div>
- <p>
- All symbol-related conflicts and complaints about overloading are based on <tt>sym:name</tt> values.
- For instance, the following example uses <tt>%rename</tt> in reverse to generate a name clash.
- </p>
- <div class="code">
- <pre>
- %rename(foo) foo_i(int);
- %rename(foo) foo_d(double);
- void foo_i(int);
- void foo_d(double);
- void foo(Bar *b);
- </pre>
- </div>
- <p>
- When you run SWIG on this you now get:
- </p>
- <div class="shell">
- <pre>
- $ ./swig example.i
- example.i:6. Overloaded declaration ignored. foo_d(double )
- example.i:5. Previous declaration is foo_i(int )
- example.i:7. Overloaded declaration ignored. foo(Bar *)
- example.i:5. Previous declaration is foo_i(int )
- </pre>
- </div>
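- <p>
- The rule that conflicts are decided on <tt>sym:name</tt> rather than <tt>name</tt>
- can be sketched with a toy symbol table. This is an assumption-laden model, not
- SWIG's symbol table code: <tt>Decl</tt> and <tt>addSymbols</tt> are made-up names,
- and the sketch ignores C++ overloading, matching the plain C run shown above.
- </p>

```cpp
#include <map>
#include <string>
#include <vector>

// A declaration reduced to the two attributes that matter here.
struct Decl {
  std::string name;     // name as declared in C/C++
  std::string symname;  // name after %rename (the sym:name attribute)
};

// Add declarations to a symbol table keyed by sym:name. A declaration whose
// sym:name is already taken is reported and skipped, in the spirit of the
// "Overloaded declaration ignored" diagnostics shown above.
std::vector<std::string> addSymbols(const std::vector<Decl> &decls,
                                    std::map<std::string, std::string> &symtab) {
  std::vector<std::string> ignored;
  for (const Decl &d : decls) {
    auto result = symtab.emplace(d.symname, d.name);
    if (!result.second)
      ignored.push_back(d.name + " (previous declaration is " +
                        result.first->second + ")");
  }
  return ignored;
}
```

- <p>
- Feeding in <tt>foo_i</tt>, <tt>foo_d</tt> and <tt>foo</tt>, all with the
- <tt>sym:name</tt> value <tt>foo</tt>, keeps only <tt>foo_i</tt> and reports the
- other two as clashing with it.
- </p>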
- <H3><a name="Extending_nn11"></a>35.4.6 The %feature directive</H3>
- <p>
- A number of SWIG directives such as <tt>%exception</tt> are implemented using the
- low-level <tt>%feature</tt> directive. For example:
- </p>
- <div class="code">
- <pre>
- %feature("except") getitem(int) {
- try {
- $action
- } catch (badindex) {
- ...
- }
- }
- ...
- class Foo {
- public:
- Object *getitem(int index) throw(badindex);
- ...
- };
- </pre>
- </div>
- <p>
- The behavior of <tt>%feature</tt> is very easy to describe: it simply
- attaches a new attribute to any parse tree node that matches the
- given prototype. When a feature is added, it shows up as an attribute in the <tt>feature:</tt> namespace.
- You can see this when running with the <tt>-debug-top 4</tt> option. For example:
- </p>
- <div class="shell">
- <pre>
- +++ cdecl ----------------------------------------
- | sym:name - "getitem"
- | name - "getitem"
- | decl - "f(int).p."
- | parms - int
- | type - "Object"
- | feature:except - "{\n try {\n $action\n } catc..."
- | sym:symtab - 0x40168ac8
- |
- </pre>
- </div>
- <p>
- Feature names are completely arbitrary and a target language module can be
- programmed to respond to any feature name that it wants to recognize. The
- data stored in a feature attribute is usually just a raw unparsed string.
- For example, the exception code above is simply
- stored without any modifications.
- </p>
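- <p>
- A minimal C++ sketch of this behavior follows. It assumes a heavily simplified
- matching rule (equal <tt>name</tt> and <tt>parms</tt> attributes), whereas SWIG's
- real matching works on declarators and the type system. <tt>Attrs</tt> and
- <tt>applyFeature</tt> are hypothetical names used only for this illustration.
- </p>

```cpp
#include <map>
#include <string>
#include <vector>

// A node modelled as nothing more than its attribute/value pairs.
using Attrs = std::map<std::string, std::string>;

// Attach "feature:<featureName>" to every node whose "name" and "parms"
// attributes match the given prototype. The value is stored as a raw,
// unparsed string. Returns the number of nodes that were annotated.
int applyFeature(std::vector<Attrs> &nodes, const std::string &featureName,
                 const std::string &name, const std::string &parms,
                 const std::string &value) {
  int matched = 0;
  for (Attrs &node : nodes) {
    auto n = node.find("name");
    auto p = node.find("parms");
    if (n != node.end() && p != node.end() &&
        n->second == name && p->second == parms) {
      node["feature:" + featureName] = value;
      ++matched;
    }
  }
  return matched;
}
```

- <p>
- A language module written against this model would then check for the
- <tt>feature:except</tt> key when it generates the wrapper for a node, and ignore
- feature names it does not recognize.
- </p>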
- <H3><a name="Extending_nn12"></a>35.4.7 Code Generation</H3>
- <p>
- Language modules work by defining handler functions that know how to respond to
- different types of parse-tree nodes. These handlers simply look at the
- attributes of each node in order to produce low-level code.
- </p>
- <p>
- In reality, the generation of code is somewhat more subtle than simply
- invoking handler functions. This is because parse-tree nodes might be
- transformed. For example, suppose you are wrapping a class like this:
- </p>
- <div class="code">
- <pre>
- class Foo {
- public:
- virtual int *bar(int x);
- };
- </pre>
- </div>
- <p>
- When the parser constructs a node for the member <tt>bar</tt>, it creates a raw "cdecl" node with the following
- attributes:
- </p>
- <div class="diagram">
- <pre>
- nodeType : cdecl
- name : bar
- type : int
- decl : f(int).p
- parms : int x
- storage : virtual
- sym:name : bar
- </pre>
- </div>
- <p>
- To produce wrapper code, this "cdecl" node undergoes a number of transformations. First, the node is recognized as a function declaration. This adjusts some of the type information. Specifically, the declarator is joined with the base datatype to produce this:
- </p>
- <div class="diagram">
- <pre>
- nodeType : cdecl
- name : bar
- type : p.int &lt;-- Notice change in return type
- decl : f(int).p
- parms : int x
- storage : virtual
- sym:name : bar
- </pre>
- </div>
- <p>
- Next, the context of the node indicates that the node is really a
- member function. This produces a transformation to a low-level
- accessor function like this:
- </p>
- <div class="diagram">
- <pre>
- nodeType : cdecl
- name : bar
- type : p.int
- decl : f(int).p
- parms : Foo *self, int x &lt;-- Added parameter
- storage : virtual
- wrap:action : result = (arg1)-&gt;bar(arg2) &lt;-- Action code added
- sym:name : Foo_bar &lt;-- Symbol name changed
- </pre>
- </div>
- <p>
- In this transformation, notice how an additional parameter was added
- to the parameter list and how the symbol name of the node has suddenly
- changed into an accessor using the naming scheme described in the
- "SWIG Basics" chapter. A small fragment of "action" code has also
- been generated. Notice how the <tt>wrap:action</tt> attribute defines
- the access to the underlying method. The data in this transformed
- node is then used to generate a wrapper.
- </p>
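- <p>
- The member function transformation can be approximated by the C++ sketch below.
- It is a model under stated assumptions, not SWIG's implementation: the node is a
- plain map, <tt>parms</tt> is treated as a comma-separated string, and the helper
- names <tt>countParms</tt> and <tt>toMemberAccessor</tt> are invented for this example.
- </p>

```cpp
#include <map>
#include <string>

using Attrs = std::map<std::string, std::string>;

// Count comma-separated parameters in a "parms" string ("" means none).
static int countParms(const std::string &parms) {
  if (parms.empty())
    return 0;
  int n = 1;
  for (char c : parms)
    if (c == ',')
      ++n;
  return n;
}

// Turn a member function node into a low-level accessor: prepend a self
// parameter, add the action code, and prefix the symbol name with the class.
Attrs toMemberAccessor(Attrs node, const std::string &className) {
  const std::string parms = node["parms"];
  const int nparms = countParms(parms);
  std::string args;
  for (int i = 0; i < nparms; ++i) {
    if (i > 0)
      args += ",";
    args += "arg" + std::to_string(i + 2);  // arg1 is reserved for self
  }
  const std::string self = className + " *self";
  node["parms"] = nparms ? self + ", " + parms : self;
  node["wrap:action"] = "result = (arg1)->" + node["name"] + "(" + args + ")";
  node["sym:name"] = className + "_" + node["sym:name"];
  return node;
}
```

- <p>
- Applied to the <tt>bar</tt> node with class name <tt>Foo</tt>, the sketch yields
- the same <tt>parms</tt>, <tt>wrap:action</tt> and <tt>sym:name</tt> values as the
- transformed node shown above.
- </p>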
- <p>
- Language modules work by registering handler functions for dealing with
- various types of nodes at different stages of transformation. This is done by
- inheriting from a special <tt>Language</tt> class and defining a collection
- of virtual methods. For example, the Python module defines a class as
- follows:
- </p>
- <div class="code">
- <pre>
- class PYTHON : public Language {
- protected:
- public :
- virtual void main(int, char *argv[]);
- virtual int top(Node *);
- virtual int functionWrapper(Node *);
- virtual int constantWrapper(Node *);
- virtual int variableWrapper(Node *);
- virtual int nativeWrapper(Node *);
- virtual int membervariableHandler(Node *);
- virtual int memberconstantHandler(Node *);
- virtual int memberfunctionHandler(Node *);
- virtual int constructorHandler(Node *);
- virtual int destructorHandler(Node *);
- virtual int classHandler(Node *);
- virtual int classforwardDeclaration(Node *);
- virtual int insertDirective(Node *);
- virtual int importDirective(Node *);
- };
- </pre>
- </div>
- <p>
- The role of these functions is described shortly.
- </p>
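- <p>
- To make the handler idea concrete before the details, here is a self-contained
- toy in C++. It deliberately avoids SWIG's headers: <tt>ToyNode</tt>,
- <tt>ToyLanguage</tt>, <tt>ToyModule</tt> and <tt>emitAll</tt> are hypothetical
- names, and the dispatch on the node tag is a simplification of what SWIG does.
- </p>

```cpp
#include <string>
#include <vector>

// Minimal stand-in for a parse tree node: only a tag and a name.
struct ToyNode {
  std::string tag;
  std::string name;
};

// Toy base class in the spirit of the Language class described above.
// Default handlers do nothing and report that the node was not handled.
class ToyLanguage {
public:
  virtual ~ToyLanguage() {}
  virtual int functionWrapper(const ToyNode &) { return 0; }
  virtual int classHandler(const ToyNode &) { return 0; }

  // Walk the nodes and dispatch on the tag; returns how many were handled.
  int emitAll(const std::vector<ToyNode> &nodes) {
    int handled = 0;
    for (const ToyNode &n : nodes) {
      if (n.tag == "cdecl")
        handled += functionWrapper(n);
      else if (n.tag == "class")
        handled += classHandler(n);
    }
    return handled;
  }
};

// A hypothetical target language module overrides only the handlers it needs.
class ToyModule : public ToyLanguage {
public:
  std::vector<std::string> output;
  int functionWrapper(const ToyNode &n) override {
    output.push_back("wrapper for function " + n.name);
    return 1;
  }
  int classHandler(const ToyNode &n) override {
    output.push_back("proxy for class " + n.name);
    return 1;
  }
};
```

- <p>
- The design point carried over from the real class is that the base class owns the
- traversal, while a module customizes code generation purely by overriding virtual
- handlers. Nodes with tags the toy does not know, such as <tt>insert</tt>, are skipped.
- </p>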
- <H3><a name="Extending_nn13"></a>35.4.8 SWIG and XML</H3>
- <p>
- Much of SWIG's current parser design was originally motivated by
- interest in using XML to represent SWIG parse trees. Although XML is
- not currently used in any direct manner, the parse tree structure, use
- of node tags, attributes, and attribute namespaces are all influenced
- by aspects of XML parsing. Therefore, in trying to understand SWIG's
- internal data structures, it may be useful to keep XML in the back of
- your mind as a model.
- </p>
- <H2><a name="Extending_nn14"></a>35.5 Primitive Data Structures</H2>
- <p>
- Most of SWIG is constructed using three basic data structures:
- strings, hashes, and lists. These data structures are dynamic in the same way as
- similar structures found in many scripting languages. For instance,
- you can have containers (lists and hash tables) of mixed types and
- certain operations are polymorphic.
- </p>
- <p>
- This section briefly describes the basic structures so that later
- sections of this chapter make more sense.
- </p>
- <p>
- When describing the low-level API, the following type name conventions are
- used:
- </p>
- <ul>
- <li><tt>String</tt>. A string object.
- <li><tt>Hash</tt>. A hash object.
- <li><tt>List</tt>. A list object.
- <li><tt>String_or_char</tt>. A string object or a <tt>char *</tt>.
- <li><tt>Object_or_char</tt>. An object or a <tt>char *</tt>.
- <li><tt>Object</tt>. Any object (string, hash, list, etc.)
- </ul>
- <p>
- In most cases, other typenames in the source are aliases for one of these
- primitive types. Specifically:
- </p>
- <div class="code">
- <pre>
- typedef String SwigType;
- typedef Hash Parm;
- typedef Hash ParmList;
- typedef Hash Node;
- typedef Hash Symtab;
- typedef Hash Typetab;
- </pre>
- </div>
- <H3><a name="Extending_nn15"></a>35.5.1 Strings</H3>
- <p>
- <b><tt>String *NewString(const String_or_char *val)</tt></b>
- </p>
- <div class="indent">
- Creates a new string with initial value <tt>val</tt>. <tt>val</tt> may
- be a <tt>char *</tt> or another <tt>String</tt> object. If you want
- to create an empty string, use <tt>""</tt> for <tt>val</tt>.
- </div>
- <p>
- <b><tt>String *NewStringf(const char *fmt, ...)</tt></b>
- </p>
- <div class="indent">
- Creates a new string whose initial value is set according to a C <tt>printf</tt> style
- format string in <tt>fmt</tt>. Additional arguments follow depending
- on <tt>fmt</tt>.
- </div>
- <p>
- <b><tt>String *Copy(String *s)</tt></b>
- </p>
- <div class="indent">
- Make a copy of the string <tt>s</tt>.
- </div>
- <p>
- <b><tt>void Delete(String *s)</tt></b>
- </p>
- <div class="indent">
- Deletes <tt>s</tt>.
- </div>
- <p>
- <b><tt>int Len(const String_or_char *s)</tt></b>
- </p>
- <div class="indent">
- Returns the length of the string.
- </div>
- <p>
- <b><tt>char *Char(const String_or_char *s)</tt></b>
- </p>
- <div class="indent">
- Returns a pointer to the first character in a string.
- </div>
- <p>
- <b><tt>void Append(String *s, const String_or_char *t)</tt></b>
- </p>
- <div class="indent">
- Appends <tt>t</tt> to the end of string <tt>s</tt>.
- </div>
- <p>
- <b><tt>void Insert(String *s, int pos, const String_or_char *t)</tt></b>
- </p>
- <div class="indent">
- Inserts <tt>t</tt> into <tt>s</tt> at position <tt>pos</tt>. The contents
- of <tt>s</tt> are shifted accordingly. The special value <tt>DOH_END</tt>
- can be used for <tt>pos</tt> to indicate insertion at the end of the string (appending).
- </div>
- <p>
- <b><tt>int Strcmp(const String_or_char *s, const String_or_char *t)</tt></b>
- </p>
- <div class="indent">
- Compare strings <tt>s</tt> and <tt>t</tt>. Same as the C <tt>strcmp()</tt>
- function.
- </div>
- <p>
- <b><tt>int Strncmp(const String_or_char *s, const String_or_char *t, int len)</tt></b>
- </p>
- <div class="indent">
- Compare the first <tt>len</tt> characters of strings <tt>s</tt> and <tt>t</tt>. Same as the C <tt>strncmp()</tt>
- function.
- </div>
- <p>
- <b><tt>char *Strstr(const String_or_char *s, const String_or_char *pat)</tt></b>
- </p>
- <div class="indent">
- Returns a pointer to the first occurrence of <tt>pat</tt> in <tt>s</tt>.
- Same as the C <tt>strstr()</tt> function.
- </div>
- <p>
- <b><tt>char *Strchr(const String_or_char *s, char ch)</tt></b>
- </p>
- <div class="indent">
- Returns a pointer to the first occurrence of character <tt>ch</tt> in <tt>s</tt>.
- Same as the C <tt>strchr()</tt> function.
- </div>
- <p>
- <b><tt>void Chop(String *s)</tt></b>
- </p>
- <div class="indent">
- Chops trailing whitespace off the end of <tt>s</tt>.
- </div>
- <p>
- <b><tt>int Replace(String *s, const String_or_char *pat, const String_or_char *rep, int flags)</tt></b>
- </p>
- <div class="indent">
- <p>
- Replaces the pattern <tt>pat</tt> with <tt>rep</tt> in string <tt>s</tt>.
- <tt>flags</tt> is a combination of the following flags:</p>
- <div class="code">
- <pre>
- DOH_REPLACE_ANY - Replace all occurrences
- DOH_REPLACE_ID - Valid C identifiers only
- DOH_REPLACE_NOQUOTE - Don't replace in quoted strings
- DOH_REPLACE_FIRST - Replace first occurrence only.
- </pre>
- </div>
- <p>
- Returns the number of replacements made (if any).
- </p>
- </div>
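- <p>
- To make the "valid C identifiers only" flag more concrete, here is a small standalone sketch of
- one plausible reading of it: a match is replaced only when it stands alone as a whole identifier.
- It uses <tt>std::string</tt> rather than DOH strings, and <tt>replace_identifier()</tt> is a
- hypothetical helper written for this illustration, not the DOH implementation.
- </p>

```cpp
#include <cassert>
#include <cctype>
#include <string>

// True if c can appear inside a C identifier.
bool is_ident_char(char c) {
  return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Hypothetical helper (not part of DOH): replace pat with rep in s, but only
// where pat is not adjacent to other identifier characters.
// Returns the number of replacements made.
int replace_identifier(std::string &s, const std::string &pat, const std::string &rep) {
  assert(!pat.empty());
  int count = 0;
  std::string::size_type pos = 0;
  while ((pos = s.find(pat, pos)) != std::string::npos) {
    std::string::size_type end = pos + pat.size();
    bool left_ok = (pos == 0) || !is_ident_char(s[pos - 1]);
    bool right_ok = (end == s.size()) || !is_ident_char(s[end]);
    if (left_ok && right_ok) {
      s.replace(pos, pat.size(), rep);
      pos += rep.size();  // continue after the inserted text
      ++count;
    } else {
      pos += 1;  // part of a longer identifier, keep searching
    }
  }
  return count;
}
```

- <p>
- With this helper, replacing <tt>x</tt> in <tt>x + max + x1 + x</tt> touches only the two standalone
- occurrences and leaves <tt>max</tt> and <tt>x1</tt> alone.
- </p>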
- <H3><a name="Extending_nn16"></a>35.5.2 Hashes</H3>
- <p>
- <b><tt>Hash *NewHash()</tt></b>
- </p>
- <div class="indent">
- Creates a new empty hash table.
- </div>
- <p>
- <b><tt>Hash *Copy(Hash *h)</tt></b>
- </p>
- <div class="indent">
- Make a shallow copy of the hash <tt>h</tt>.
- </div>
- <p>
- <b><tt>void Delete(Hash *h)</tt></b>
- </p>
- <div class="indent">
- Deletes <tt>h</tt>.
- </div>
- <p>
- <b><tt>int Len(Hash *h)</tt></b>
- </p>
- <div class="indent">
- Returns the number of items in <tt>h</tt>.
- </div>
- <p>
- <b><tt>Object *Getattr(Hash *h, const String_or_char *key)</tt></b>
- </p>
- <div class="indent">
- Gets an object from <tt>h</tt>. <tt>key</tt> may be a string or
- a simple <tt>char *</tt> string. Returns NULL if not found.
- </div>
- <p>
- <b><tt>int Setattr(Hash *h, const String_or_char *key, const Object_or_char *val)</tt></b>
- </p>
- <div class="indent">
- Stores <tt>val</tt> in <tt>h</tt>. <tt>key</tt> may be a string or
- a simple <tt>char *</tt>. If <tt>val</tt> is not a standard
- object (String, Hash, or List) it is assumed to be a <tt>char *</tt> in which
- case it is used to construct a <tt>String</tt> that is stored in the hash.
- If <tt>val</tt> is NULL, the object is deleted. Increases the reference count
- of <tt>val</tt>. Returns 1 if this operation replaced an existing hash entry,
- 0 otherwise.
- </div>
- <p>
- <b><tt>int Delattr(Hash *h, const String_or_char *key)</tt></b>
- </p>
- <div class="indent">
- Deletes the hash item referenced by <tt>key</tt>. Decreases the
- reference count on the corresponding object (if any). Returns 1
- if an object was removed, 0 otherwise.
- </div>
- <p>
- <b><tt>List *Keys(Hash *h)</tt></b>
- </p>
- <div class="indent">
- Returns the list of hash table keys.
- </div>
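- <p>
- The return-value conventions described above for <tt>Getattr</tt>, <tt>Setattr</tt> and
- <tt>Delattr</tt> can be mimicked with a standard container. This is only a sketch under the
- assumption that values are plain strings; <tt>ToyHash</tt> and the <tt>toy_*</tt> functions are
- hypothetical names, and reference counting is not modeled.
- </p>

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a Hash that maps string keys to string values.
typedef std::map<std::string, std::string> ToyHash;

// Store val under key. Returns 1 if an existing entry was replaced, 0 otherwise.
int toy_setattr(ToyHash &h, const std::string &key, const std::string &val) {
  ToyHash::iterator it = h.find(key);
  if (it != h.end()) {
    it->second = val;
    return 1;
  }
  h[key] = val;
  return 0;
}

// Look up key. Returns a pointer to the value, or a null pointer if not found.
const std::string *toy_getattr(const ToyHash &h, const std::string &key) {
  ToyHash::const_iterator it = h.find(key);
  if (it == h.end()) return nullptr;
  return &it->second;
}

// Remove key. Returns 1 if an entry was removed, 0 otherwise.
int toy_delattr(ToyHash &h, const std::string &key) {
  return h.erase(key) ? 1 : 0;
}
```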
- <H3><a name="Extending_nn17"></a>35.5.3 Lists</H3>
- <p>
- <b><tt>List *NewList()</tt></b>
- </p>
- <div class="indent">
- Creates a new empty list.
- </div>
- <p>
- <b><tt>List *Copy(List *x)</tt></b>
- </p>
- <div class="indent">
- Make a shallow copy of the List <tt>x</tt>.
- </div>
- <p>
- <b><tt>void Delete(List *x)</tt></b>
- </p>
- <div class="indent">
- Deletes <tt>x</tt>.
- </div>
- <p>
- <b><tt>int Len(List *x)</tt></b>
- </p>
- <div class="indent">
- Returns the number of items in <tt>x</tt>.
- </div>
- <p>
- <b><tt>Object *Getitem(List *x, int n)</tt></b>
- </p>
- <div class="indent">
- Returns an object from <tt>x</tt> with index <tt>n</tt>. If <tt>n</tt> is
- beyond the end of the list, the last item is returned. If <tt>n</tt> is
- negative, the first item is returned.
- </div>
- <p>
- <b><tt>int Setitem(List *x, int n, const Object_or_char *val)</tt></b>
- </p>
- <div class="indent">
- Stores <tt>val</tt> in <tt>x</tt>.
- If <tt>val</tt> is not a standard
- object (String, Hash, or List) it is assumed to be a <tt>char *</tt> in which
- case it is used to construct a <tt>String</tt> that is stored in the list.
- <tt>n</tt> must be in range. Otherwise, an assertion will be raised.
- </div>
- <p>
- <b><tt>int Delitem(List *x, int n)</tt></b>
- </p>
- <div class="indent">
- Deletes item <tt>n</tt> from the list, shifting items down if necessary.
- To delete the last item in the list, use the special value <tt>DOH_END</tt>
- for <tt>n</tt>.
- </div>
- <p>
- <b><tt>void Append(List *x, const Object_or_char *t)</tt></b>
- </p>
- <div class="indent">
- Appends <tt>t</tt> to the end of <tt>x</tt>. If <tt>t</tt> is not
- a standard object, it is assumed to be a <tt>char *</tt> and is
- used to create a String object.
- </div>
- <p>
- <b><tt>void Insert(List *x, int pos, const Object_or_char *t)</tt></b>
- </p>
- <div class="indent">
- Inserts <tt>t</tt> into <tt>x</tt> at position <tt>pos</tt>. The contents
- of <tt>x</tt> are shifted accordingly. The special value <tt>DOH_END</tt>
- can be used for <tt>pos</tt> to indicate insertion at the end of the list (appending).
- If <tt>t</tt> is not a standard object, it is assumed to be a <tt>char *</tt>
- and is used to create a String object.
- </div>
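- <p>
- The index handling described for <tt>Getitem</tt> (out-of-range indices are clamped) and the
- end-of-list sentinel accepted by <tt>Insert</tt> can be illustrated with a standalone sketch.
- <tt>ToyList</tt>, <tt>TOY_END</tt> and the <tt>toy_*</tt> functions are hypothetical; in particular,
- the value chosen for <tt>TOY_END</tt> and the clamping of other out-of-range insert positions are
- assumptions of this sketch, not statements about <tt>DOH_END</tt> or DOH.
- </p>

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a List of strings.
typedef std::vector<std::string> ToyList;

// Hypothetical sentinel playing the role of DOH_END (value chosen arbitrarily).
const int TOY_END = -2;

// Return item n. An index past the end yields the last item and a negative
// index yields the first item, as the Getitem description states.
const std::string &toy_getitem(const ToyList &x, int n) {
  assert(!x.empty());
  int last = static_cast<int>(x.size()) - 1;
  if (n < 0) n = 0;
  if (n > last) n = last;
  return x[n];
}

// Insert t before position pos, or at the end when pos is TOY_END.
// Other out-of-range positions are clamped (an assumption of this sketch).
void toy_insert(ToyList &x, int pos, const std::string &t) {
  int size = static_cast<int>(x.size());
  if (pos == TOY_END || pos > size) pos = size;
  if (pos < 0) pos = 0;
  x.insert(x.begin() + pos, t);
}
```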
- <H3><a name="Extending_nn18"></a>35.5.4 Common operations</H3>
- The following operations are applicable to all datatypes.
- <p>
- <b><tt>Object *Copy(Object *x)</tt></b>
- </p>
- <div class="indent">
- Make a copy of the object <tt>x</tt>.
- </div>
- <p>
- <b><tt>void Delete(Object *x)</tt></b>
- </p>
- <div class="indent">
- Deletes <tt>x</tt>.
- </div>
- <p>
- <b><tt>void Setfile(Object *x, String_or_char *f)</tt></b>
- </p>
- <div class="indent">
- Sets the filename associated with <tt>x</tt>. Used to track
- objects and report errors.
- </div>
- <p>
- <b><tt>String *Getfile(Object *x)</tt></b>
- </p>
- <div class="indent">
- Gets the filename associated with <tt>x</tt>.
- </div>
- <p>
- <b><tt>void Setline(Object *x, int n)</tt></b>
- </p>
- <div class="indent">
- Sets the line number associated with <tt>x</tt>. Used to track
- objects and report errors.
- </div>
- <p>
- <b><tt>int Getline(Object *x)</tt></b>
- </p>
- <div class="indent">
- Gets the line number associated with <tt>x</tt>.
- </div>
- <H3><a name="Extending_nn19"></a>35.5.5 Iterating over Lists and Hashes</H3>
- To iterate over the elements of a list or a hash table, the following functions are used:
- <p>
- <b><tt>Iterator First(Object *x)</tt></b>
- </p>
- <div class="indent">
- Returns an iterator object that points to the first item in a list or hash table. The
- <tt>item</tt> attribute of the Iterator object is a pointer to the item. For hash tables, the <tt>key</tt> attribute
- of the Iterator object additionally points to the corresponding Hash table key. The <tt>item</tt> and <tt>key</tt> attributes
- are NULL if the object contains no items or if there are no more items.
- </div>
- <p>
- <b><tt>Iterator Next(Iterator i)</tt></b>
- </p>
- <div class="indent">
- <p>Returns an iterator that points to the next item in a list or hash table.
- Here are two examples of iteration:</p>
- <div class="code">
- <pre>
- List *l = (some list);
- Iterator i;
- for (i = First(l); i.item; i = Next(i)) {
- Printf(stdout,"%s\n", i.item);
- }
- Hash *h = (some hash);
- Iterator j;
- for (j = First(h); j.item; j = Next(j)) {
- Printf(stdout,"%s : %s\n", j.key, j.item);
- }
- </pre>
- </div>
- </div>
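- <p>
- The iteration protocol above relies on an iterator value whose <tt>item</tt> field becomes NULL
- when nothing is left. A minimal standalone sketch of that pattern follows; <tt>ToySeq</tt>,
- <tt>ToyIterator</tt>, <tt>toy_first()</tt>, <tt>toy_next()</tt> and <tt>toy_join()</tt> are
- hypothetical names, and the sketch covers only a list of strings, not hash tables.
- </p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a List of strings.
typedef std::vector<std::string> ToySeq;

// Hypothetical iterator value: item is null once there is nothing left.
struct ToyIterator {
  const std::string *item;
  const ToySeq *seq;
  std::size_t index;
};

// Iterator positioned at the first element (item is null for an empty list).
ToyIterator toy_first(const ToySeq &l) {
  ToyIterator i;
  i.seq = &l;
  i.index = 0;
  i.item = l.empty() ? nullptr : &l[0];
  return i;
}

// Iterator positioned at the element after i (item is null past the end).
ToyIterator toy_next(ToyIterator i) {
  ++i.index;
  i.item = (i.index < i.seq->size()) ? &(*i.seq)[i.index] : nullptr;
  return i;
}

// Same loop shape as the list example: stop when item becomes null.
std::string toy_join(const ToySeq &l) {
  std::string out;
  for (ToyIterator i = toy_first(l); i.item; i = toy_next(i)) {
    out += *i.item;
    out += '\n';
  }
  return out;
}
```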
- <H3><a name="Extending_nn20"></a>35.5.6 I/O</H3>
- Special I/O functions are used for all internal I/O. These operations
- work on C <tt>FILE *</tt> objects, String objects, and special <tt>File</tt> objects
- (which are merely a wrapper around <tt>FILE *</tt>).
- <p>
- <b><tt>int Printf(String_or_FILE *f, const char *fmt, ...)</tt></b>
- </p>
- <div class="indent">
- Formatted I/O. Same as the C <tt>fprintf()</tt> function except that output
- can also be directed to a string object. Note: the <tt>%s</tt> format
- specifier works with both strings and <tt>char *</tt>. All other format
- operators have the same meaning.
- </div>
- <p>
- <b><tt>int Printv(String_or_FILE *f, String_or_char *arg1,..., NULL)</tt></b>
- </p>
- <div class="indent">
- Prints a variable number of string arguments to the output. The last
- argument to this function must be NULL. The other arguments can either
- be <tt>char *</tt> or string objects.
- </div>
- <p>
- <b><tt>int Putc(int ch, String_or_FILE *f)</tt></b>
- </p>
- <div class="indent">
- Same as the C <tt>fputc()</tt> function.
- </div>
- <p>
- <b><tt>int Write(String_or_FILE *f, void *buf, int len)</tt></b>
- </p>
- <div class="indent">
- Same as the C <tt>write()</tt> function.
- </div>
- <p>
- <b><tt>int Read(String_or_FILE *f, void *buf, int maxlen)</tt></b>
- </p>
- <div class="indent">
- Same as the C <tt>read()</tt> function.
- </div>
- <p>
- <b><tt>int Getc(String_or_FILE *f)</tt></b>
- </p>
- <div class="indent">
- Same as the C <tt>fgetc()</tt> function.
- </div>
- <p>
- <b><tt>int Ungetc(int ch, String_or_FILE *f)</tt></b>
- </p>
- <div class="indent">
- Same as the C <tt>ungetc()</tt> function.
- </div>
- <p>
- <b><tt>int Seek(String_or_FILE *f, int offset, int whence)</tt></b>
- </p>
- <div class="indent">
- Same as the C <tt>fseek()</tt> function. <tt>offset</tt> is the number
- of bytes. <tt>whence</tt> is one of <tt>SEEK_SET</tt>, <tt>SEEK_CUR</tt>,
- or <tt>SEEK_END</tt>.
- </div>
- <p>
- <b><tt>long Tell(String_or_FILE *f)</tt></b>
- </p>
- <div class="indent">
- Same as the C <tt>ftell()</tt> function.
- </div>
- <p>
- <b><tt>File *NewFile(const char *filename, const char *mode, List *newfiles)</tt></b>
- </p>
- <div class="indent">
- Create a File object using the <tt>fopen()</tt> library call. This
- file differs from <tt>FILE *</tt> in that it can be placed in the standard
- SWIG containers (lists, hashes, etc.). The <tt>filename</tt> is added to the
- <tt>newfiles</tt> list if <tt>newfiles</tt> is non-zero and the file was created successfully.
- </div>
- <p>
- <b><tt>File *NewFileFromFile(FILE *f)</tt></b>
- </p>
- <div class="indent">
- Create a File object wrapper around an existing <tt>FILE *</tt> object.
- </div>
- <p>
- <b><tt>int Close(String_or_FILE *f)</tt></b>
- </p>
- <div class="indent">
- <p>Closes a file. Has no effect on strings.</p>
- <p>
- The I/O functions above, together with strings, play a critical role in SWIG. It is
- common to see small fragments of code generated like this:
- </p>
- <div class="code">
- <pre>
- /* Print into a string */
- String *s = NewString("");
- Printf(s,"Hello\n");
- for (i = 0; i &lt; 10; i++) {
- Printf(s,"%d\n", i);
- }
- ...
- /* Print string into a file */
- Printf(f, "%s\n", s);
- </pre>
- </div>
- <p>
- Similarly, the preprocessor and parser both operate on string-files.
- </p>
- </div>
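- <p>
- The "print into a string, then write it out" idiom can be approximated in standard C++ for readers
- who want to experiment without the SWIG sources. This is a sketch only: <tt>append_printf()</tt> and
- <tt>build_fragment()</tt> are hypothetical helpers built on <tt>vsnprintf()</tt>, and unlike
- <tt>Printf</tt> they accept only ordinary C format arguments.
- </p>

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: append printf-style formatted text to a std::string.
// Returns the number of characters appended, or -1 on a formatting error.
int append_printf(std::string &s, const char *fmt, ...) {
  va_list args;
  va_start(args, fmt);
  va_list args_copy;
  va_copy(args_copy, args);

  // First pass: measure how many characters the formatted text needs.
  int needed = std::vsnprintf(nullptr, 0, fmt, args);
  va_end(args);
  if (needed < 0) {
    va_end(args_copy);
    return -1;
  }

  // Second pass: format into a buffer with room for the terminating NUL.
  std::vector<char> buf(static_cast<std::size_t>(needed) + 1);
  std::vsnprintf(buf.data(), buf.size(), fmt, args_copy);
  va_end(args_copy);

  s.append(buf.data(), static_cast<std::size_t>(needed));
  return needed;
}

// Build a small fragment in memory, in the spirit of the example above.
std::string build_fragment() {
  std::string s;
  append_printf(s, "Hello\n");
  for (int i = 0; i < 3; i++) {
    append_printf(s, "%d\n", i);
  }
  return s;
}
```

- <p>
- Accumulating output in a string first means the fragment can be edited or reordered before it is
- finally written to a file.
- </p>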
- <H2><a name="Extending_nn21"></a>35.6 Navigating and manipulating parse trees</H2>
- Parse trees are built as collections of hash tables. Each node is a hash table in which
- arbitrary attributes can be stored. Certain attributes in the hash table provide links to
- other parse tree nodes. The following macros can be used to move around the parse tree.
- <p>
- <b><tt>String *nodeType(Node *n)</tt></b>
- </p>
- <div class="indent">
- Returns the node type tag as a string. The returned string indicates the type of parse
- tree node.
- </div>
- <p>
- <b><tt>Node *nextSibling(Node *n)</tt></b>
- </p>
- <div class="indent">
- Returns the next node in the parse tree. For example, the next C declaration.
- </div>
- <p>
- <b><tt>Node *previousSibling(Node *n)</tt…