%template for producing IEEE-format articles using LaTeX.
%written by Matthew Ward, CS Department, Worcester Polytechnic Institute.
%use at your own risk.  Complaints to /dev/null.
%make two column with no page numbering, default is 10 point
%\documentstyle{article}
\documentstyle[twocolumn,times]{article}
\pagestyle{empty}

%set dimensions of columns, gap between columns, and space between paragraphs
%\setlength{\textheight}{8.75in}
\setlength{\textheight}{9.0in}
\setlength{\columnsep}{0.25in}
\setlength{\textwidth}{6.45in}
\setlength{\footheight}{0.0in}
\setlength{\topmargin}{0.0in}
\setlength{\headheight}{0.0in}
\setlength{\headsep}{0.0in}
\setlength{\oddsidemargin}{0in}
%\setlength{\oddsidemargin}{-.065in}
%\setlength{\oddsidemargin}{-.17in}
%\setlength{\parindent}{0pc}

%I copied stuff out of art10.sty and modified them to conform to IEEE format

\makeatletter
%as Latex considers descenders in its calculation of interline spacing,
%to get 12 point spacing for normalsize text, must set it to 10 points
\def\@normalsize{\@setsize\normalsize{12pt}\xpt\@xpt
\abovedisplayskip 10pt plus2pt minus5pt\belowdisplayskip \abovedisplayskip
\abovedisplayshortskip \z@ plus3pt\belowdisplayshortskip 6pt plus3pt
minus3pt\let\@listi\@listI} 

%need an 11 pt font size for subsection and abstract headings
\def\subsize{\@setsize\subsize{12pt}\xipt\@xipt}

%make section titles bold and 12 point, 2 blank lines before, 1 after
\def\section{\@startsection {section}{1}{\z@}{24pt plus 2pt minus 2pt}
{12pt plus 2pt minus 2pt}{\large\bf}}

%make subsection titles bold and 11 point, 1 blank line before, 1 after
\def\subsection{\@startsection {subsection}{2}{\z@}{12pt plus 2pt minus 2pt}
{12pt plus 2pt minus 2pt}{\subsize\bf}}
\makeatother

\newcommand{\ignore}[1]{}
%\renewcommand{\thesubsection}{\arabic{subsection}.}

\begin{document}

%don't want date printed
\date{}

%make title bold and 14 pt font (Latex default is non-bold, 16 pt)
\title{\Large \bf   An Embedded Error Recovery and Debugging Mechanism for Scripting Language Extensions}

%for single author (just remove % characters)
\author{{David M.\ Beazley} \\
{\em Department of Computer Science} \\
{\em University of Chicago }\\
{\em Chicago, Illinois 60637 }\\
{\em beazley@cs.uchicago.edu }}

%  My Department \\
%  My Institute \\
%  My City, ST, zip}
 
%for two authors (this is what is printed)
%\author{\begin{tabular}[t]{c@{\extracolsep{8em}}c}
%  Roscoe Giles	                        & Pablo Tamayo \\
% \\
%  Department of Electrical, Computer,   & Thinking Machines Corp. \\
%  and Systems Engineering               & Cambridge, MA~~02142.  \\
%  and                                   & \\
%  Center for Computational Science      & \\
%  Boston University, Boston, MA~~02215. & 
%\end{tabular}}

\maketitle

%I don't know why I have to reset thispagestyle, but otherwise get page numbers
\thispagestyle{empty}


\subsection*{Abstract}
{\em
In recent years, scripting languages such as Perl, Python, and Tcl
have become popular development tools for the creation of
sophisticated application software.  One of the most useful features
of these languages is their ability to easily interact with compiled
languages such as C and C++.  Although this mixed language approach
has many benefits, one of the greatest drawbacks is the complexity of
debugging that results from using interpreted and compiled code in the
same application.  In part, this is due to the fact that scripting
language interpreters are unable to recover from catastrophic errors
in compiled extension code. Moreover, traditional C/C++ debuggers
do not provide a satisfactory degree of integration with interpreted
languages.  This paper describes an experimental system in which fatal
extension errors such as segmentation faults, bus errors, and failed
assertions are handled as scripting language exceptions.  This system,
which has been implemented as a general purpose shared library,
requires no modifications to the target scripting language, introduces
no performance penalty, and simplifies the debugging of mixed
interpreted-compiled application software.
}

\section{Introduction}

Slightly more than ten years have passed since John Ousterhout
introduced the Tcl scripting language at the 1990 USENIX technical
conference \cite{ousterhout}.  Since then, scripting languages have
been gaining in popularity as evidenced by the widespread use of
systems such as Tcl, Perl, Python, Guile, PHP, and Ruby
\cite{ousterhout,perl,python,guile,php,ruby}.

In part, the success of modern scripting languages is due to their
ability to be easily integrated with software written in compiled
languages such as C, C++, and Fortran.  In addition, a wide variety of wrapper
generation tools can be used
to automatically produce bindings between existing code and a
variety of scripting language environments
\cite{swig,sip,pyfort,f2py,advperl,heidrich,vtk,gwrap,wrappy}.  As a result, a large number of
programmers are now using scripting languages to control
complex C/C++ programs or as a tool for re-engineering legacy
software.  This approach is attractive because it allows programmers
to benefit from the flexibility and rapid development of
scripting while retaining the best features of compiled code such as high
performance \cite{ouster1}.

A critical aspect of scripting-compiled code integration is the way in
which it departs from traditional C/C++ development and shell
scripting.  Rather than building stand-alone applications that run as
separate processes, extension programming encourages a style of
programming in which components are tightly integrated within 
an interpreter that is responsible for high-level control.
Because of this, scripted software tends to rely heavily
upon shared libraries, dynamic loading, scripts, and
third-party extensions. In this sense, one might argue that the
benefits of scripting are achieved at the expense of creating a
more complicated development environment.

A consequence of this complexity is an increased degree of difficulty
associated with debugging programs that utilize multiple languages,
dynamically loadable modules, and a sophisticated runtime environment.
To address this problem, this paper describes an experimental system
known as WAD (Wrapped Application Debugger) in which an embedded error
reporting and debugging mechanism is added to common scripting
languages.  This system converts catastrophic signals such as
segmentation faults and failed assertions to exceptions that can be
handled by the scripting language interpreter.  In doing so, it
provides more seamless integration between error handling in
scripting language interpreters and compiled extensions. 

\section{The Debugging Problem}

Normally, a programming error in a scripted application 
results in an exception that describes the problem and the context in
which it occurred.  For example, an error in a Python script might
produce a traceback similar to the following:

\begin{verbatim}
% python foo.py
Traceback (innermost last):
  File "foo.py", line 11, in ?
    foo()
  File "foo.py", line 8, in foo
    bar()
  File "foo.py", line 5, in bar
    spam()
  File "foo.py", line 2, in spam
    doh()
NameError: doh
\end{verbatim}

In this case, a programmer might be able to apply a fix simply based
on information in the traceback.  Alternatively, if the problem is
more complicated, a script-level debugger can be used to provide more
information.  In contrast, a failure in compiled extension code might
produce the following result:

\begin{verbatim}
% python foo.py
Segmentation Fault (core dumped)
\end{verbatim}

In this case, the user has no idea what has happened other than that it
appears to be ``very bad.''  Furthermore, script-level debuggers are
unable to identify the problem since they also crash when the error
occurs (they run in the same process as the interpreter).  This means
that the only way for a user to narrow the source of the problem
within a script is through trial-and-error techniques such as
inserting print statements, commenting out sections of scripts, or
relying on a deep intuition of the underlying implementation. Obviously,
none of these techniques are particularly elegant.

An alternative approach is to run the application under the control of
a traditional debugger such as gdb \cite{gdb}.  Although this provides
some information about the error, the debugger mostly provides
detailed information about the internal implementation of the
scripting language interpreter instead of the script-level code that
was running at the time of the error.  Needless to say, this information 
isn't very useful to most programmers.
A related problem is that
the structure of a scripted application tends to be much more complex
than a traditional stand-alone program.  As a result, a user may not
have a good sense of how to actually attach an external debugger to their
script.  In addition, execution may occur within a
complex run-time environment involving events, threads, and network
connections.  Because of this, it can be difficult for the user to reproduce
and identify certain types of catastrophic errors if they depend on
timing or unusual event sequences. Finally, this approach
requires a programmer to have a C development environment installed on
their machine.  Unfortunately, this may not hold in practice because
scripting languages are often used to provide programmability to
applications where end-users write scripts but do not write low-level C code.

Even if a traditional debugger such as gdb were modified to provide
better integration with scripting languages, it is not clear that this
would be the most natural solution to the problem.  For one, 
having to run a separate debugging process to debug
extension code is unnatural when no such requirement exists for
scripts.  Moreover, even if such a debugger existed, an
inexperienced user may not have the expertise or inclination to use
it.  Finally, obscure fatal errors may occur long after an application
has been deployed.  Unless the debugger is distributed along with the
application in some manner, it will be extraordinarily difficult to
obtain useful diagnostics when such errors occur.

\begin{figure*}[t]
{\small
\begin{verbatim}
% python foo.py
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "foo.py", line 16, in ?
    foo()
  File "foo.py", line 13, in foo
    bar()
  File "foo.py", line 10, in bar
    spam()
  File "foo.py", line 7, in spam
    doh.doh(a,b,c)

SegFault: [ C stack trace ]

#2 0x00027774 in call_builtin(func=0x1c74f0,arg=0x1a1ccc,kw=0x0) in 'ceval.c',line 2650
#1 0xff083544 in _wrap_doh(self=0x0,args=0x1a1ccc) in 'foo_wrap.c',line 745
#0 0xfe7e0568 in doh(a=3,b=4,c=0x0) in 'foo.c',line 28

/u0/beazley/Projects/WAD/Python/foo.c, line 28

    int doh(int a, int b, int *c) {
 =>   *c = a + b;
      return *c;
    }
\end{verbatim}
}
\caption{Cross-language traceback generated by WAD for a segmentation fault in a Python extension}
\end{figure*}

The current state of the art in extension debugging is to simply add
as much error checking as possible to extension modules. This is never
a bad thing to do, but in practice it's usually not enough to
eliminate every possible problem.  For one, scripting languages are
sometimes used to control hundreds of thousands to millions of lines
of compiled code.  In this case, it is improbable that a programmer will
foresee every conceivable error.  In addition, scripting languages are
often used to put new user interfaces on legacy software. In this
case, scripting may introduce new modes of execution that cause a
formerly ``bug-free'' application to fail in an unexpected manner.
Finally, certain types of errors such as floating-point exceptions can
be particularly difficult to eliminate because they might be generated
algorithmically (e.g., as the result of instability in a numerical
method). Therefore, even if a programmer has worked hard to eliminate
crashes, there is usually a small probability that an application may
fail under unusual circumstances.

\section{Embedded Error Reporting}

Rather than modifying an existing debugger to support scripting
languages, an alternative approach is to add a more powerful error
handling and reporting mechanism to the scripting language
interpreter.  We have implemented this approach in the form of an
experimental system known as WAD.  WAD is packaged as a dynamically
loadable shared library that can either be loaded as a scripting
language extension module or linked to existing extension modules as a
library.  The core of the system is generic and requires no
modifications to the scripting interpreter or existing extension
modules.  Furthermore, the system does not introduce a performance
penalty as it does not rely upon program instrumentation or tracing.

WAD works by converting fatal signals such as SIGSEGV,
SIGBUS, SIGFPE, and SIGABRT into scripting language exceptions that contain
debugging information collected from the call-stack of compiled
extension code.  By handling errors in this manner, the scripting
language interpreter is able to produce a cross-language stack trace that
contains information from both the script code and extension code as
shown for Python and Tcl/Tk in Figures 1 and 2.  In this case, the user
is given a very clear idea of what has happened without having
to launch a separate debugger. 

The advantage to this approach is that it provides more seamless
integration between error handling in scripts and error handling in
extensions.  In addition, it eliminates the most common debugging step
that a developer is likely to perform in the event of a fatal
error--running a separate debugger on a core file and typing {\tt where}
to get a stack trace.  Finally, this allows end-users to provide
extension writers with useful debugging information since they can
supply a stack trace as opposed to a vague complaint that the program
``crashed.''

\begin{figure*}[t]
\begin{picture}(400,250)(0,0)
\put(50,-110){\special{psfile = tcl.ps hscale = 60 vscale = 60}}
\end{picture}
\caption{Dialog box with WAD-generated traceback information for a failed assertion in a Tcl/Tk extension}
\end{figure*}

\section{Scripting Language Internals}

In order to provide embedded error recovery, it is critical to understand how
scripting language interpreters interface with extension code.  Despite the wide variety
of scripting languages, essentially every implementation uses a similar
technique for accessing foreign code.  

Virtually all scripting languages provide an extension mechanism in the form of a foreign function
interface in which compiled procedures can be called from the scripting language
interpreter. This is accomplished by writing a collection of wrapper functions that conform
to a specified calling convention. The primary purpose of the wrappers is to
marshal arguments and return values between the two languages and to handle errors.
For example, in Tcl, every wrapper
function must conform to the following prototype:

\begin{verbatim}
int 
wrap_foo(ClientData clientData,
         Tcl_Interp *interp,
         int objc,
         Tcl_Obj *CONST objv[])
{
    /* Convert arguments */
    ...
    /* Call a function */

    result = foo(args);
    /* Set result */
    ...
    if (success) {
        return TCL_OK;
    } else {
        return TCL_ERROR;
    }
}
\end{verbatim}

Another common extension mechanism is an object/type interface that allows programmers to create new
kinds of fundamental types or attach special properties to objects in
the interpreter.  For example, both Tcl and Python provide an API for creating new 
``built-in'' objects that behave like numbers, strings, lists, etc.  
In most cases, this involves setting up tables of function
pointers that define various properties of an object.  For example, if
you wanted to add complex numbers to an interpreter, you might fill in a special
data structure with pointers to methods that implement various numerical operations like this:

\begin{verbatim}
NumberMethods ComplexMethods = {
    complex_add,
    complex_sub,
    complex_mul,
    complex_div,
    ...
};
\end{verbatim}

\noindent
Once registered with the interpreter, the methods in this structure
would be invoked by various interpreter operators such as $+$,
$-$, $*$, and $/$.
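
As a rough sketch of how such a table might be defined and used, the
following self-contained C fragment fills in a method table for complex
numbers and dispatches operators through it.  The names {\tt Complex},
{\tt binop}, and {\tt apply} are hypothetical and chosen only for
illustration; real interpreters define considerably larger tables with
their own types and calling conventions.

{\small
\begin{verbatim}
/* Hypothetical types, for illustration. */
typedef struct { double re, im; } Complex;
typedef Complex (*binop)(Complex, Complex);

typedef struct {
  binop add, sub, mul, div;
} NumberMethods;

static Complex complex_add(Complex a,
                           Complex b) {
  Complex r = { a.re + b.re, a.im + b.im };
  return r;
}

static Complex complex_sub(Complex a,
                           Complex b) {
  Complex r = { a.re - b.re, a.im - b.im };
  return r;
}

static Complex complex_mul(Complex a,
                           Complex b) {
  Complex r = { a.re * b.re - a.im * b.im,
                a.re * b.im + a.im * b.re };
  return r;
}

static Complex complex_div(Complex a,
                           Complex b) {
  double d = b.re * b.re + b.im * b.im;
  Complex r = {
    (a.re * b.re + a.im * b.im) / d,
    (a.im * b.re - a.re * b.im) / d
  };
  return r;
}

static const NumberMethods ComplexMethods = {
  complex_add, complex_sub,
  complex_mul, complex_div
};

/* What an interpreter might do to evaluate
   "a op b"; unknown operators return a. */
Complex apply(const NumberMethods *m, char op,
              Complex a, Complex b) {
  switch (op) {
  case '+': return m->add(a, b);
  case '-': return m->sub(a, b);
  case '*': return m->mul(a, b);
  case '/': return m->div(a, b);
  default:  return a;
  }
}
\end{verbatim}
}

\noindent
The interpreter never needs to know the names of the individual
functions; it only follows the pointers stored in the table.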

Most interpreters handle errors as a two-step process in which
detailed error information is first registered with the interpreter
and then a special error code is returned. For example, in Tcl, errors
are handled by setting error information in the interpreter and
returning a value of TCL\_ERROR.  Similarly in Python, errors are
handled by calling a special function to raise an exception and returning NULL.  In both cases,
this triggers the interpreter's error handler---possibly resulting in
a stack trace of the running script.  In some cases, an interpreter
might handle errors using a form of the C {\tt longjmp} function. 
For example, Perl provides a special function {\tt die} that jumps back
to the interpreter with a fatal error \cite{advperl}.

The precise implementation details of these mechanisms aren't so
important for our discussion.  The critical point is that scripting
languages always access extension code through a well-defined interface
that precisely defines how arguments are to be passed, values are to be
returned, and errors are to be handled.
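
To make the two-step convention concrete, the following toy model may
be helpful.  It is only a sketch: {\tt ToyInterp}, {\tt set\_error},
{\tt wrap\_recip}, and {\tt toy\_eval} are invented for this example and
are not part of the Tcl or Python APIs.  The wrapper first records
error details in the interpreter object and then returns a special
error code that the calling interpreter checks.

{\small
\begin{verbatim}
#include <stdio.h>
#include <string.h>

/* Toy stand-ins; not the Tcl or Python API. */
enum { TOY_OK = 0, TOY_ERROR = 1 };

typedef struct {
  char   errmsg[128];  /* error details */
  double result;       /* last result   */
} ToyInterp;

static void set_error(ToyInterp *ip,
                      const char *msg) {
  snprintf(ip->errmsg, sizeof ip->errmsg,
           "%s", msg);
}

/* Wrapper: validate, call, report status. */
int wrap_recip(ToyInterp *ip, double x) {
  if (x == 0.0) {
    /* Step 1: register the details. */
    set_error(ip, "recip: division by zero");
    /* Step 2: return the error code. */
    return TOY_ERROR;
  }
  ip->result = 1.0 / x;
  return TOY_OK;
}

/* Interpreter side: inspect the status.  A
   real interpreter would begin unwinding the
   script here and build a traceback. */
const char *toy_eval(ToyInterp *ip, double x) {
  ip->errmsg[0] = '\0';
  if (wrap_recip(ip, x) == TOY_ERROR)
    return ip->errmsg;
  return "ok";
}
\end{verbatim}
}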

\section{Scripting Languages and Signals}

Under normal circumstances, errors in extension code are handled
through the error-handling API provided by the scripting language
interpreter.  For example, if an invalid function parameter is passed,
a program can simply set an error message and return to the
interpreter.  Similarly, automatic wrapper generators such as SWIG can produce
code to convert C++ exceptions and other C-related error handling
schemes to scripting language errors \cite{swigexcept}. On the other
hand, segmentation faults, failed assertions, and similar problems
produce signals that cause the interpreter to abort execution.

Most scripting languages provide limited support for Unix signal
handling \cite{stevens}.  However, this support is not sufficiently advanced to
recover from fatal signals produced by extension code.
Unlike signals generated for asynchronous events such as I/O,
execution can {\em not} be resumed at the point of a fatal signal.
Therefore, even if such a signal could be caught and handled by a script,
there isn't much that it can do except to print a diagnostic
message and abort before the signal handler returns.  In addition,
some interpreters block signal delivery while executing
extension code--opting to handle signals at a time when it is more convenient.
In this case, a signal such as SIGSEGV would simply cause the whole application
to freeze since there is no way for execution to continue to a point where
the signal could be delivered.  Thus, scripting languages tend to 
either ignore the problem or label it as a ``limitation.''

\section{Overview of WAD}

WAD installs a signal handler for SIGSEGV, SIGBUS, SIGABRT, SIGILL,
and SIGFPE using the {\tt sigaction} function
\cite{stevens}. Furthermore, it uses a special option (SA\_SIGINFO) of
signal handling that passes process context information to the signal
handler when a signal occurs. Since none of these signals are normally used in the
implementation of the scripting interpreter or by user scripts,
this does not usually override any previous signal handling.
Afterwards, when one of these signals occurs, a two-phase recovery
process executes. First, information is collected about the execution
context including a full stack-trace, symbol table entries, and
debugging information.  Then, the current stream of execution is
aborted and an error is returned to the interpreter.  This process is
illustrated in Figure~3.
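
The installation step can be sketched with standard POSIX calls as
follows.  This is a minimal illustration rather than WAD's actual
source: the names {\tt on\_fatal}, {\tt install\_handlers}, and
{\tt last\_signal} are assumptions, and the handler merely records the
signal number.  Returning normally from the handler, as this sketch
does, is only safe for a signal sent with {\tt raise}; after a genuine
fault, control has to be redirected elsewhere.

{\small
\begin{verbatim}
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t last_signo = 0;

/* Three-argument form selected by SA_SIGINFO. */
static void on_fatal(int signo, siginfo_t *si,
                     void *uctx) {
  (void)si;    /* si->si_addr: fault address */
  (void)uctx;  /* ucontext_t *: CPU context  */
  last_signo = signo;
}

/* Returns 0 on success, -1 on failure. */
int install_handlers(void) {
  static const int sigs[] = {
    SIGSEGV, SIGBUS, SIGABRT, SIGILL, SIGFPE
  };
  struct sigaction sa;
  size_t i;

  memset(&sa, 0, sizeof sa);
  sa.sa_sigaction = on_fatal;
  sa.sa_flags = SA_SIGINFO;
  sigemptyset(&sa.sa_mask);
  for (i = 0;
       i < sizeof sigs / sizeof sigs[0];
       i++) {
    if (sigaction(sigs[i], &sa, NULL) != 0)
      return -1;
  }
  return 0;
}

/* Last signal seen by on_fatal(), or 0. */
int last_signal(void) {
  return (int)last_signo;
}
\end{verbatim}
}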

The collection of context and debugging information involves the
following steps:

\begin{itemize}
\item The program counter and stack pointer are obtained from 
context information passed to the signal handler.

\item The virtual memory map of the process is obtained from /proc
and used to associate virtual memory addresses with executable files,
shared libraries, and dynamically loaded extension modules \cite{proc}.

\item The call stack is unwound to collect traceback information.
At each step of the stack traceback, symbol table and debugging
information is gathered and stored in a generic data structure for later use
in the recovery process.  This data is obtained by memory-mapping
the object files associated with the process and extracting
symbol table and debugging information. 
\end{itemize}
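
As an illustration of the second step, the sketch below looks up the
file that backs a given address.  It assumes the Linux text interface
{\tt /proc/self/maps}; other systems expose the memory map through
/proc in a different format.  The function name {\tt find\_mapping} is
hypothetical, and path names containing spaces are truncated at the
first space.

{\small
\begin{verbatim}
#include <stdio.h>
#include <string.h>

/* Linux-specific sketch.  Finds the mapping
   that contains addr in this process.
   Returns 1 and fills path ("" for anonymous
   memory) on a hit, 0 if addr is unmapped,
   and -1 if /proc cannot be read. */
int find_mapping(unsigned long addr,
                 char *path, size_t len) {
  FILE *f = fopen("/proc/self/maps", "r");
  char line[512];
  int found = 0;

  if (!f)
    return -1;
  while (!found &&
         fgets(line, sizeof line, f)) {
    unsigned long lo, hi;
    char name[256] = "";
    /* start-end perms offset dev inode path */
    int n = sscanf(line,
              "%lx-%lx %*s %*s %*s %*s %255s",
              &lo, &hi, name);
    if (n >= 2 && addr >= lo && addr < hi) {
      snprintf(path, len, "%s", name);
      found = 1;
    }
  }
  fclose(f);
  return found;
}
\end{verbatim}
}

\noindent
Given a program counter taken from the call stack, a lookup of this
kind identifies the executable or shared library whose symbol table
should be consulted.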

Once debugging information has been collected, the signal handler
enters an error-recovery phase that
attempts to raise a scripting exception and return to a suitable location in the 
interpreter.  To do this, the following steps are performed:

\begin{itemize}

\item The stack trace is examined to see if there are any locations in the interpreter
to which control can be returned. 

\item If a suitable return location is found, the CPU context is modified in
a manner that makes the signal handler return to the interpreter
with an error.  This return process is assisted by a small
trampoline function (partially written in assembly language) that arranges a proper
return to the interpreter after the signal handler returns.
\end{itemize}

\noindent
Of the two phases, the first is the more straightforward to implement
because it involves standard Unix API functions and common file formats such
as ELF and stabs \cite{elf,stabs}.   On the other hand, the recovery phase in
which control is returned to the interpreter is of greater interest.  Therefore, 
it is now described in greater detail.

\begin{figure*}[t]
\begin{picture}(480,340)(5,60)

\put(50,330){\framebox(200,70){}}
\put(60,388){\small \tt >>> {\bf foo()}}
\put(60,376){\small \tt Traceback (most recent call last):}
\put(70,364){\small \tt   File "<stdin>", line 1, in ?}
\put(60,352){\small \tt SegFault: [ C stack trace ]}
\put(60,340){\small \tt ...}

\put(55,392){\line(-1,0){25}}
\put(30,392){\line(0,-1){80}}
\put(30,312){\line(1,0){95}}
\put(125,312){\vector(0,-1){10}}
\put(175,302){\line(0,1){10}}
\put(175,312){\line(1,0){95}}
\put(270,312){\line(0,1){65}}
\put(270,377){\vector(-1,0){30}}

\put(50,285){\framebox(200,15)[c]{[Python internals]}}
\put(125,285){\vector(0,-1){10}}
\put(175,275){\vector(0,1){10}}
\put(50,260){\framebox(200,15)[c]{call\_builtin()}}
\put(125,260){\vector(0,-1){10}}
%\put(175,250){\vector(0,1){10}}
\put(50,235){\framebox(200,15)[c]{wrap\_foo()}}
\put(125,235){\vector(0,-1){10}}
\put(50,210){\framebox(200,15)[c]{foo()}}
\put(125,210){\vector(0,-1){10}}
\put(50,185){\framebox(200,15)[c]{doh()}}
\put(125,185){\vector(0,-1){20}}
\put(110,148){SIGSEGV}
\put(160,152){\vector(1,0){100}}
\put(260,70){\framebox(200,100){}}
\put(310,155){WAD signal handler}
\put(265,140){1. Unwind C stack}
\put(265,125){2. Gather symbols and debugging info}
\put(265,110){3. Find safe return location}
\put(265,95){4. Raise Python exception}
\put(265,80){5. Modify CPU context and return}

\put(260,185){\framebox(200,15)[c]{return assist}}
\put(365,174){Return from signal}
\put(360,170){\vector(0,1){15}}
\put(360,200){\line(0,1){65}}

%\put(360,70){\line(0,-1){10}}
%\put(360,60){\line(1,0){110}}
%\put(470,60){\line(0,1){130}}
%\put(470,190){\vector(-1,0){10}}

\put(360,265){\vector(-1,0){105}}
\put(255,250){NULL}
\put(255,270){Return to interpreter}

\end{picture}

\caption{Control Flow of the Error Recovery Mechanism for Python}
\label{wad}
\end{figure*}

\section{Returning to the Interpreter}

To return to the interpreter, WAD maintains a table of symbolic names
that correspond to locations within the interpreter
responsible for invoking wrapper functions and object/type methods.
For example, Table 1 shows a partial list of return locations used in
the Python implementation.  When an error occurs, the call stack is
scanned for the first occurrence of any symbol in this table.  If a
match is found, control is returned to that location by emulating the
return of a wrapper function with the error code from the table. If no
match is found, the error handler simply prints a stack trace to
standard output and aborts.
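
The table lookup described above can be sketched as follows.  The
structure {\tt ReturnPoint}, the function {\tt find\_return\_point}, and
the representation of the call stack as an array of symbol names are
assumptions made for this example; only the symbol names and error
values are taken from Table 1.

{\small
\begin{verbatim}
#include <stddef.h>
#include <string.h>

/* One row of a return-point table. */
typedef struct {
  const char *symbol; /* interpreter function */
  long        retval; /* 0 stands for NULL    */
} ReturnPoint;

static const ReturnPoint python_points[] = {
  { "call_builtin",         0 },
  { "PyObject_Print",      -1 },
  { "PyObject_CallObject",  0 },
  { "PyObject_Cmp",        -1 }
};

/* frames[0] is the innermost frame.  Returns
   the row for the first frame whose symbol is
   in the table and stores its index in *depth,
   or returns NULL if nothing matches. */
const ReturnPoint *find_return_point(
    const char *const *frames, size_t nframes,
    size_t *depth) {
  size_t n = sizeof python_points /
             sizeof python_points[0];
  size_t i, j;

  for (i = 0; i < nframes; i++) {
    for (j = 0; j < n; j++) {
      if (strcmp(frames[i],
                 python_points[j].symbol) == 0) {
        *depth = i;
        return &python_points[j];
      }
    }
  }
  return NULL;
}
\end{verbatim}
}

\noindent
A {\tt NULL} result corresponds to the case in which no return location
exists and the handler can only print the stack trace and abort.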

When a symbolic match is found, WAD invokes a special user-defined
handler function that is written for a specific scripting language.
The primary role of this handler is to take debugging information
gathered from the call stack and generate an appropriate scripting
language error.  One peculiar problem of this step is that the
generation of an error may require the use of parameters passed to a
wrapper function.  For example, in the Tcl wrapper shown earlier, one
of the arguments was an object of type ``{\tt Tcl\_Interp *}''.  This
object contains information specific to the state of the interpreter
(and multiple interpreter objects may exist in a single application).
Unfortunately, no reference to the interpreter object is available in the
signal handler nor is a reference to the interpreter guaranteed to exist in
the context of a function that generated the error.

To work around this problem, WAD implements a feature
known as argument stealing.  When examining the call-stack, the signal
handler has full access to all function arguments and local variables of each function
on the stack.
Therefore, if the handler knows that an error was generated while
calling a wrapper function (as determined by looking at the symbol names),
it can grab the interpreter object from the stack frame of the wrapper and
use it to set an appropriate error code before returning to the interpreter.
Currently, this is managed by allowing the signal handler to steal
arguments from the caller using positional information.
For example, to grab the {\tt Tcl\_Interp *} object from a Tcl wrapper function,
code similar to the following is written:

\begin{verbatim}
Tcl_Interp *interp;
int         err;

interp = (Tcl_Interp *)
  wad_steal_outarg(
           stack,                
           "TclExecuteByteCode",
           1,
           &err
  );
  ...
if (!err) {
  Tcl_SetResult(interp,errtype,...);
  Tcl_AddErrorInfo(interp,errdetails);
}
\end{verbatim}

In this case, the Tcl interpreter argument passed to a wrapper function 
is stolen and used to generate an error.  Note that the name {\tt TclExecuteByteCode}
refers to the calling function, not the wrapper function itself.
At this time, argument stealing is only applicable to simple types
such as integers and pointers.  However, this appears to be adequate for generating
scripting language errors.


\begin{table}[t]
\begin{center}
\begin{tabular}{ll}
Python symbol                 &   Error return value \\ \hline
call\_builtin                 &   NULL \\
PyObject\_Print               & -1 \\
PyObject\_CallFunction        & NULL \\
PyObject\_CallMethod          & NULL \\
PyObject\_CallObject          & NULL \\
PyObject\_Cmp                 & -1 \\
PyObject\_DelAttrString       & -1 \\
PyObject\_DelItem             & -1 \\
PyObject\_GetAttrString       & NULL \\
\end{tabular}
\end{center}

\caption{A partial list of symbolic return locations in the Python interpreter}
\label{returnpoints}
\end{table}

\section{Register Management}

A final issue concerning the return mechanism has to do with the
behavior of the non-local return to the interpreter.  Roughly
speaking, this emulates the C {\tt longjmp}
library call.  However, this is done without the use of a matching
{\tt setjmp} in the interpreter.  

The primary problem with aborting execution and returning to the
interpreter in this manner is that most compilers use a register
management technique known as callee-save \cite{prag}.  In this case,
it is the responsibility of the called function to save the state of
the registers and to restore them before returning to the caller. By
making a non-local jump, registers may be left in an inconsistent
state due to the fact that they are not restored to their original
values.  The {\tt longjmp} function in the C library avoids this
problem by relying upon {\tt setjmp} to save the registers.  Unfortunately,
WAD does not have this luxury.   As a result, a return from the signal
handler may produce a corrupted set of registers at the point of return
in the interpreter.

The severity of this problem depends greatly on the architecture and
compiler.  For example, on the SPARC, register windows effectively
solve the callee-save problem \cite{sparc}.  In this case, each stack
frame has its own register window and the windows are flushed to the
stack whenever a signal occurs.  Therefore, the recovery mechanism can
simply examine the stack and arrange to restore the registers to their
proper values when control is returned.  Furthermore, certain
conventions of the SPARC ABI resolve several related issues. For
example, floating point registers are caller-saved and the contents of
the SPARC global registers are not guaranteed to be preserved across
procedure calls (in fact, they are not even saved by {\tt setjmp}).

On other platforms, the problem of register management becomes 
more interesting.  In this case, a heuristic approach that examines
the machine code for each function on the call stack can be used to
determine where the registers might have been saved.  This approach is
used by gdb and other debuggers when they allow users to inspect
register values within arbitrary stack frames \cite{gdb}.  Even though
this sounds complicated to implement, the algorithm is greatly
simplified by the fact that compilers typically generate code to store
the callee-save registers immediately upon the entry to each function.
In addition, this code is highly regular and easy to examine.  For
instance, on i386-Linux, the callee-save registers can be restored by
simply examining the first few bytes of the machine code for each
function on the call stack to figure out where values have been saved.
The following code shows a typical sequence of machine instructions
used to store callee-save registers on i386-Linux:

\begin{verbatim}
foo:
55                 pushl %ebp
89 e5              movl  %esp,%ebp
81 ec a0 00 00 00  subl  $0xa0,%esp
56                 pushl %esi
57                 pushl %edi
...
\end{verbatim}
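
A simplified version of such a heuristic might look like the following
sketch.  It only recognizes the frame-pointer prologue and reports
which callee-save registers are pushed; a usable implementation would
also have to compute the stack offset of each saved value and cope with
other code layouts.  The function {\tt scan\_prologue} and the
{\tt SAVED\_} flags are hypothetical names introduced for this example.

{\small
\begin{verbatim}
#include <stddef.h>

/* Flags for i386 callee-save registers. */
enum { SAVED_EBX = 1, SAVED_ESI = 2,
       SAVED_EDI = 4 };

/* Recognize the prologue
     push %ebp; mov %esp,%ebp; [sub $n,%esp]
   followed by pushes of callee-save
   registers.  Returns a mask of the registers
   pushed, or -1 if the prologue is not
   recognized.  *frame receives n (0 if there
   is no sub instruction). */
int scan_prologue(const unsigned char *code,
                  size_t len,
                  unsigned long *frame) {
  size_t i = 3;
  int mask = 0;

  *frame = 0;
  if (len < 3 || code[0] != 0x55 ||
      code[1] != 0x89 || code[2] != 0xe5)
    return -1;

  if (len - i >= 3 && code[i] == 0x83 &&
      code[i + 1] == 0xec) {
    /* sub with an 8-bit immediate */
    *frame = code[i + 2];
    i += 3;
  } else if (len - i >= 6 && code[i] == 0x81 &&
             code[i + 1] == 0xec) {
    /* sub with a 32-bit little-endian value */
    *frame = (unsigned long)code[i + 2]
           | (unsigned long)code[i + 3] << 8
           | (unsigned long)code[i + 4] << 16
           | (unsigned long)code[i + 5] << 24;
    i += 6;
  }

  for (; i < len; i++) {
    if (code[i] == 0x53)
      mask |= SAVED_EBX;
    else if (code[i] == 0x56)
      mask |= SAVED_ESI;
    else if (code[i] == 0x57)
      mask |= SAVED_EDI;
    else
      break;
  }
  return mask;
}
\end{verbatim}
}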

%
% Include an example
%

% more interesting.  One approach is to simply ignore the problem
% altogether and return to the interpreter with the registers in an
% essentially random state.  Surprisingly, this approach actually seems to work (although a considerable degree of
% caution might be in order).
% This is because the return of an error code tends to trigger
% a cascade of procedure returns within the implementation of the interpreter.
% As a result, the values of the registers are simply discarded and
% overwritten with restored values as the interpreter unwinds itself and prepares to handle an
% exception.  A better solution to this problem is to modify the recovery mechanism to discover and
% restore saved registers from the stack.  Unfortunately, there is
% no standardized way to know exactly where the registers might have been saved.
% Therefore, a heuristic scheme that examines the machine code for each procedure would
% have to be used to try and identify stack locations. This approach is used by gdb
% and other debuggers when they allow users to inspect register values
% within arbitrary stack frames \cite{gdb}.  However, this technique has 
% not yet been implemented in WAD due to its obvious implementation difficulty and the
% fact that the WAD prototype has primarily been developed for the SPARC.

As a fall-back, WAD could be configured to return control to a location
previously specified with {\tt setjmp}.  Unfortunately, this requires
modifications to either the interpreter or its extension modules.
Although this kind of instrumentation could be facilitated by automatic
wrapper code generators, it is not a preferred solution and is not discussed further.

\section{Initialization}

To simplify the debugging of extension modules, it
is desirable to make the use of WAD as transparent as possible.
Currently, there are two ways in which the system is used.  First, WAD
may be explicitly loaded as a scripting language extension module.
For instance, in Python, a user can include the statement {\tt import
libwadpy} in a script to load the debugger.  Alternatively, WAD can be
enabled by linking it to an extension module as a shared
library.  For instance:

\begin{verbatim}
% ld -shared $(OBJS) -lwadpy
\end{verbatim}

In this latter case, WAD initializes itself whenever the extension module is
loaded.  The same shared library is used for both situations by making
sure two types of initialization techniques are used.  First, an empty
initialization function is written to make WAD appear like a proper
scripting language extension module (although it adds no functions to
the interpreter).  Second, the real initialization of the system is
placed into the initialization section of the WAD shared library
object file (the ``init'' section of ELF files).  This code always executes
when a library is loaded by the dynamic loader and is commonly used to
properly initialize C++ objects.  Therefore, a fairly portable way
to force code into the initialization section is to encapsulate the
initialization in a C++ statically constructed object like this:

\begin{verbatim}
class InitWad {
   public:
      InitWad() { wad_init(); }
};
/* This forces InitWad() to execute
   on loading. */
static InitWad init;
\end{verbatim}

The nice part about this technique is that it allows WAD to be enabled
simply by linking or loading; no special initialization code needs to
be added to an extension module to make it work.  In addition, due to
the way in which the loader resolves and initializes libraries, the
initialization of WAD is guaranteed to execute before any of the code
in the extension module to which it has been linked.  The primary
downside to this approach is that the WAD shared object file cannot be
linked directly to an interpreter.   This is because WAD sometimes needs to call the
interpreter to properly initialize its exception handling mechanism (for instance, in Python,
four new types of exceptions are added to the interpreter).  Clearly this type of initialization
is impossible if WAD is linked directly to an interpreter as 
its initialization process would execute before the main program of the
interpreter started.  However, 
if you wanted to permanently add WAD to an interpreter, the problem is easily
corrected by first removing the C++ initializer from WAD and then replacing it with an explicit
initialization call someplace within the interpreter's startup function.

\section{Exception Objects}

Before WAD returns control to the interpreter, it collects all of the
stack-trace and debugging information it was able to obtain into a
special exception object. This object represents the state of the call
stack and includes things like symbolic names for each stack frame,
the names, types, and values of function parameters and stack
variables, as well as a complete copy of data on the stack. This
information is represented in a generic manner that hides
platform specific details related to the CPU, object file formats,
debugging tables, and so forth.

Minimally, the exception data is used to print a stack trace as shown
in Figure 1.  However, if the interpreter is successfully able to
regain control, the contents of the exception object can be
freely examined after an error has occurred.  For example, a Python
script could catch a segmentation fault and print debugging information
like this:

\begin{verbatim}
try:
   # Some buggy code
   ...
except SegFault,e:
   print 'Whoa!'
   # Get WAD exception object
   t = e.args[0]
   # Print location info
   print t.__FILE__
   print t.__LINE__
   print t.__NAME__
   print t.__SOURCE__
   ...
\end{verbatim}

Inspection of the exception object also makes it possible to write post-mortem
script debuggers that merge the call stacks of the two languages and
provide cross-language diagnostics.  Figure 4 shows an
example of a simple mixed language debugging session using the WAD
post-mortem debugger (wpm) after an extension error has occurred in a
Python program.  In the figure, the user is first presented with a
multi-language stack trace.  The information in this trace is obtained
both from the WAD exception object and from the Python traceback
generated when the exception was raised. Next, we see the user walking
up the call stack using the 'u' command of the debugger.  As this
proceeds, there is a seamless transition from C to Python where the
trace crosses between the two languages.  An optional feature of the
debugger (not shown) allows the debugger to walk up the entire C
call-stack (in this case, the trace shows information about the
implementation of the Python interpreter).  More advanced features of
the debugger allow the user to query values of function
parameters, local variables, and stack frames (although some of this
information may not be obtainable due to compiler optimizations and the
difficulties of accurately recovering register values).

\begin{figure*}[t]
{\small
\begin{verbatim}
[ Error occurred ]
>>> from wpm import *
*** WAD Debugger ***
#5   [ Python ] in self.widget._report_exception() in ...
#4   [ Python ] in Button(self,text="Die", command=lambda x=self: ...
#3   [ Python ] in death_by_segmentation() in death.py, line 22
#2   [ Python ] in debug.seg_crash() in death.py, line 5
#1   0xfeee2780 in _wrap_seg_crash(self=0x0,args=0x18f114) in 'pydebug.c', line 512
#0   0xfeee1320 in seg_crash() in 'debug.c', line 20

      int *a = 0;
 =>   *a = 3;
      return 1;

>>> u
#1   0xfeee2780 in _wrap_seg_crash(self=0x0,args=0x18f114) in 'pydebug.c', line 512
        
        if(!PyArg_ParseTuple(args,":seg_crash")) return NULL;
 =>     result = (int )seg_crash();
        resultobj = PyInt_FromLong((long)result);

>>> u
#2   [ Python ] in debug.seg_crash() in death.py, line 5
    
    def death_by_segmentation():
 =>     debug.seg_crash()
    
>>> u
#3   [ Python ] in death_by_segmentation() in death.py, line 22

        if ty == 1:
 =>         death_by_segmentation()
        elif ty == 2:
>>> \end{verbatim}
}
\caption{Cross-language debugging session in Python where a user is walking a mixed language call stack.}
\end{figure*}

\section{Implementation Details}

Currently, WAD is implemented in ANSI C and a small amount of assembly
code to assist in the return to the interpreter.  The current
implementation supports Python and Tcl extensions on SPARC Solaris and
i386-Linux.  Each scripting language is currently supported by a
separate shared library such as {\tt libwadpy.so} and {\tt
libwadtcl.so}.  In addition, a language neutral library {\tt
libwad.so} can be linked against non-scripted applications (in which case
a stack trace is simply printed to standard error when a problem occurs). 
The entire implementation contains approximately 2000
semicolons. Most of this code pertains to the gathering of debugging
information from object files.  Only a small part of the code is
specific to a particular scripting language (170 semicolons for Python
and 50 semicolons for Tcl).

Although there are libraries such as the GNU Binary File Descriptor
(BFD) library that can assist with the manipulation of object files,
these are not used in the implementation \cite{bfd}.  These
libraries tend to be quite large and are oriented more towards
stand-alone tools such as debuggers, linkers, and loaders.  In addition,
the behavior of these libraries with respect to memory management
would need to be carefully studied before they could be safely used in
an embedded environment. Finally, given the small size of the prototype 
implementation, it didn't seem necessary to rely upon such a 
heavyweight solution.

A surprising feature of the implementation is that a significant
amount of the code is language independent.  This is achieved by
placing all of the process introspection, data collection, and
platform specific code within a centralized core.  To provide a
specific scripting language interface, a developer only needs to
supply two things: a table containing symbolic function names where
control can be returned (Table 1), and a handler function in the form
of a callback.  As input, this handler receives an exception object as
described in an earlier section.  From this, the handler can
raise a scripting language exception in whatever manner is most
appropriate.
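
As an illustration of how small this interface can be, the following C
sketch shows one possible shape for the two pieces.  All type, field,
and function names are hypothetical, and the table entries are
placeholders rather than the actual contents of Table 1.

{\small
\begin{verbatim}
#include <stddef.h>
#include <string.h>

/* Opaque exception object. */
struct wad_frame;

/* Callback that raises a scripting
   language exception. */
typedef void (*wad_handler)(
    int signo, struct wad_frame *trace);

/* A function to which control may be
   returned, with its error value. */
struct return_point {
    const char *symbol;
    long        value;
};

/* Everything one language supplies. */
struct wad_language {
    const struct return_point *returns;
    wad_handler handler;
};

/* Placeholder entries only. */
const struct return_point demo_returns[] = {
    { "interp_call_ext", 0 },  /* NULL */
    { "interp_print",   -1 },
    { NULL, 0 }                /* end */
};

/* Find the entry for a symbol seen on
   the call stack; NULL if none. */
const struct return_point *
find_return_point(
    const struct return_point *tab,
    const char *symbol)
{
    for (; tab->symbol != NULL; tab++)
        if (strcmp(tab->symbol, symbol) == 0)
            return tab;
    return NULL;
}
\end{verbatim}
}

In a design along these lines, the core would walk the call stack,
look up each symbol with {\tt find\_return\_point}, and invoke the
handler before returning the associated error value.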

Significant portions of the core are also relatively straightforward
to port between different Unix systems.  For instance, code to read
ELF object files and stabs debugging data is essentially identical for
Linux and Solaris.  In addition, the high-level control logic is
unchanged between platforms.  Platform specific differences primarily
arise in the obvious places such as the examination of CPU
registers, manipulation of the process context in the signal handler,
reading virtual memory maps from /proc, and so forth.  Additional
changes would also need to be made on systems with different object
file and debugging formats such as COFF and DWARF2.  To the extent that it is possible,
these differences could be hidden by abstraction mechanisms (although
the initial implementation of WAD is weak in this regard and would
benefit from techniques used in more advanced debuggers such as gdb).
Despite these porting issues, the primary requirement for WAD is a fully
functional implementation of SVR4 signal handling that allows for
modifications of the process context.
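
The signal handling requirement can be illustrated with standard POSIX
calls.  The sketch below installs a handler with {\tt SA\_SIGINFO}, so
that it receives the process context as its third argument, and runs
it on an alternate stack set up with {\tt sigaltstack}.  The names
{\tt fault\_handler} and {\tt install\_fault\_handler} and the 64 KB
stack size are assumptions.  Actually modifying the context requires
platform specific access to {\tt ucontext\_t} and is omitted.

{\small
\begin{verbatim}
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical flag set by the handler. */
static volatile sig_atomic_t got_context;

static void fault_handler(int sig,
                          siginfo_t *info,
                          void *ctx)
{
    /* ctx points to a ucontext_t whose
       register fields are platform
       specific; it is only tested here. */
    (void)sig;
    (void)info;
    got_context = (ctx != NULL);
}

/* Returns 0 on success, -1 on failure. */
int install_fault_handler(void)
{
    static char altstack[64 * 1024];
    stack_t ss;
    struct sigaction sa;

    ss.ss_sp = altstack;
    ss.ss_size = sizeof altstack;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) != 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = fault_handler;
    sa.sa_flags = SA_SIGINFO | SA_ONSTACK;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
\end{verbatim}
}

Running the handler on an alternate stack matters because the fault
may have been caused by a damaged or exhausted call stack.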

Due to the heavy dependence on Unix signal handling, process
introspection, and object file formats, it is unlikely that WAD could
be easily ported to non-Unix systems such as Windows.  However, it may
be possible to provide a similar capability using advanced features of
Windows structured exception handling \cite{seh}.  For instance, structured
exception handlers can be used to catch hardware faults, they can
receive process context information, and they can arrange to take
corrective action much like the signal implementation described here.  

\section{Modification of Interpreters?}

A logical question to ask about the implementation of WAD is whether
or not it would make sense to modify existing interpreters to assist
in the recovery process. For instance, instrumenting Python or Tcl with setjmp
functions might simplify the implementation since it would eliminate
issues related to register restoration and finding a suitable return
location.
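
The kind of instrumentation in question can be sketched in a few lines
of C, assuming a POSIX system.  The wrapper below saves a return
location with {\tt sigsetjmp} before calling into extension code and
jumps back to it from a {\tt SIGSEGV} handler.  The names
{\tt call\_guarded}, {\tt buggy}, and {\tt harmless} are hypothetical,
and the sketch handles neither nesting nor threads.

{\small
\begin{verbatim}
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* All names are hypothetical. */
static sigjmp_buf recover_point;
static volatile sig_atomic_t last_sig;

static void on_fault(int sig)
{
    last_sig = sig;
    siglongjmp(recover_point, 1);
}

/* Run fn(); return 0, or the signal
   number if fn() raised SIGSEGV. */
int call_guarded(void (*fn)(void))
{
    struct sigaction sa, old;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old);

    last_sig = 0;
    /* Second argument 1: also restore
       the signal mask on siglongjmp. */
    if (sigsetjmp(recover_point, 1) == 0)
        fn();
    sigaction(SIGSEGV, &old, NULL);
    return last_sig;
}

/* Stand-ins for extension code. */
void buggy(void) { raise(SIGSEGV); }
void harmless(void) { }
\end{verbatim}
}

With this wrapper, {\tt call\_guarded(buggy)} returns {\tt SIGSEGV}
instead of terminating the process.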

Although it may be possible to make these changes, there are 
several drawbacks to this approach.  First, the number of required modifications may be
quite large.  For instance, there are well over 50 entry points to
extension code within the implementation of Python.  Second, an
extension module may perform callbacks and evaluation of script code.
This means that the call stack would cross back and forth
between languages and that these modifications would have to be made
in a way that allows arbitrary nesting of extension calls.  Finally,
instrumenting the code in this manner may introduce a performance
impact--a clearly undesirable side effect considering the infrequent
occurrence of fatal extension errors.

\section{Discussion}

The primary goal of embedded error recovery is to provide an
alternative approach for debugging scripting language extensions.
Although this approach has many benefits, there are a number of
drawbacks and issues that must be discussed.

First, like the C {\tt longjmp} function, the error recovery mechanism
does not cleanly unwind the call stack.  For C++, this means that
objects allocated on the stack will not be finalized (destructors will not
be invoked) and that memory allocated on the heap may be
leaked. Similarly, open files, sockets, and other
system resources may never be released. In a multi-threaded environment,
deadlock may occur if a procedure holds a lock when an error occurs.

In certain cases, the use of signals in WAD may interact adversely with scripting
language signal handling. Since scripting languages ordinarily do not catch signals such as
SIGSEGV, SIGBUS, and SIGABRT, the use of WAD is unlikely to conflict
with any existing signal handling. However, most scripting languages would not 
prevent a user from disabling the WAD error recovery mechanism by 
simply specifying a new handler for one or more of these signals.  In addition, the use of 
certain extensions such as the Perl sigtrap module would completely 
disable WAD \cite{perl}.

A more difficult signal handling problem arises when thread libraries
are used. These libraries tend to override default signal handling
behavior in a way that defines how signals are delivered to each
thread \cite{thread}.  In general, asynchronous signals can be
delivered to any thread within a process.  However, this does not
appear to be a problem for WAD since hardware exceptions are delivered
to a signal handler that runs within the same thread in which the
error occurred.  Unfortunately, even in this case, personal experience has
shown that certain implementations of user thread libraries (particularly on older versions
of Linux) do not reliably pass
signal context information nor do they universally support advanced
signal operations such as {\tt sigaltstack}.  Because of this, WAD may
be incompatible with a crippled implementation of user threads on
these platforms.  

An even more subtle problem with threads is that the recovery process
itself is not thread-safe (i.e., it is not possible to concurrently
handle fatal errors occurring in different threads).  For most
scripting language extensions, this limitation does not apply due to
strict run-time restrictions that interpreters currently place on
thread support.  For instance, even though Python supports threaded
programs, it places a global mutex lock around the interpreter that
prevents more than one thread from executing within the interpreter
at a time. A consequence of this restriction is
that extension functions are not interruptible by thread-switching
unless they explicitly release the interpreter lock.  Currently, the
behavior of WAD is undefined if extension code releases the lock and
proceeds to generate a fault.  In this case, the recovery process may
either cause an exception to be raised in an entirely different
thread or cause execution to violate the interpreter's mutual
exclusion constraint.

In certain cases, errors may result in an unrecoverable crash.  For
example, if an application overwrites the heap, it may destroy
critical data structures within the interpreter.  Similarly,
destruction of the call stack (via buffer overflow) makes it
impossible for the recovery mechanism to create a stack-trace and
return to the interpreter.  More subtle memory management problems
such as double-freeing of heap allocated memory can also cause a system
to fail in a manner that bears little resemblance to the actual source
of the problem.  Given that WAD lives in the same process as the
faulting application and that such errors may occur, a common
question to ask is to what extent WAD complicates debugging when it
doesn't work.

To handle potential problems in the implementation of WAD itself,
great care is taken to avoid the use of library functions and
functions that rely on heap allocation (malloc, free, etc.).  For
instance, to provide dynamic memory allocation, WAD implements its own
memory allocator using mmap.  In addition, signals are disabled
immediately upon entry to the WAD signal handler.  Should a fatal
error occur inside WAD, the application will dump core and exit.  Since
the resulting core file contains the stack trace of both WAD and the
faulting application, a traditional C debugger can be used to identify
the problem as before.  The only difference is that a few additional
stack frames will appear on the traceback.
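
A minimal sketch of an allocator of this kind is shown below.  It is an
assumption about the general technique rather than WAD's actual
allocator: it maps a fixed-size region from {\tt /dev/zero} on first
use and hands out memory by advancing an offset, so it never touches
the (possibly corrupted) {\tt malloc} heap.  The names {\tt pool\_alloc}
and {\tt POOL\_SIZE} are hypothetical.

{\small
\begin{verbatim}
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical pool size: 1 MB. */
#define POOL_SIZE (1024 * 1024)

static char  *pool;      /* mapped region */
static size_t pool_used; /* bytes in use  */

/* Bump allocator: never calls malloc
   and never frees.  Returns NULL when
   the pool is exhausted. */
void *pool_alloc(size_t n)
{
    void *p;

    if (pool == NULL) {
        void *m;
        int fd = open("/dev/zero", O_RDWR);
        if (fd < 0)
            return NULL;
        m = mmap(NULL, POOL_SIZE,
                 PROT_READ | PROT_WRITE,
                 MAP_PRIVATE, fd, 0);
        close(fd);
        if (m == MAP_FAILED)
            return NULL;
        pool = m;
    }
    if (n > POOL_SIZE)
        return NULL;
    /* Round up to 16-byte alignment. */
    n = (n + 15) & ~(size_t)15;
    if (n > POOL_SIZE - pool_used)
        return NULL;
    p = pool + pool_used;
    pool_used += n;
    return p;
}
\end{verbatim}
}

Because nothing is ever freed, such an allocator has no free list or
block headers that a faulting application could have damaged.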

An application may also fail after the WAD signal handler has completed
execution if memory or stack frames within the interpreter have been
corrupted in a way that prevents proper exception handling. In this case, the
application may fail in a manner that does not represent the original
programming error. It might also cause the WAD signal handler to be
immediately reinvoked with a different process state--causing it to
report information about a different type of failure.  To address
these kinds of problems, WAD creates a tracefile {\tt
wadtrace} in the current working directory that contains information
about each error that it has handled.  If no recovery was possible, a
programmer can look at this file to obtain all of the stack traces
that were generated.
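
Writing such a file from inside a signal handler is best done with
low-level system calls rather than buffered I/O.  The sketch below
appends one line to a file using {\tt open}, {\tt write}, and
{\tt close}.  The function name {\tt trace\_append} and the
one-line-per-record layout are assumptions, since the format of
{\tt wadtrace} is not described here.

{\small
\begin{verbatim}
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Append one line to the file at path.
   Returns 0 on success, -1 on error. */
int trace_append(const char *path,
                 const char *line)
{
    size_t len = strlen(line);
    int ok;
    int fd = open(path,
        O_WRONLY | O_CREAT | O_APPEND,
        0644);

    if (fd < 0)
        return -1;
    ok = write(fd, line, len)
             == (ssize_t)len
         && write(fd, "\n", 1) == 1;
    close(fd);
    return ok ? 0 : -1;
}
\end{verbatim}
}

Opening the file with {\tt O\_APPEND} means that records from
successive errors accumulate instead of overwriting one another.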

If an application is experiencing a very serious problem, WAD
does not prevent a standard debugger from being attached to the
process.  This is because the debugger overrides the current signal
handling so that it can catch fatal errors.  As a result, even if WAD
is loaded, fatal signals are simply redirected to the attached
debugger.  Such an approach also allows for more complex debugging
tasks such as single-step execution, breakpoints, and
watchpoints--none of which are easily added to WAD itself.

%
% Add comments about what WAD does in this case?
%

Finally, there are a number of issues that pertain
to the interaction of the recovery mechanism with the interpreter.
For instance, the recovery scheme is unable to return to procedures
that might invoke wrapper functions with conflicting return codes.
This problem manifests itself when the interpreter's virtual
machine is built around a large {\tt switch} statement from which different
types of wrapper functions are called.  For example, in Python, certain
internal procedures call a mix of functions where both NULL and -1 are
returned to indicate errors (depending on the function).  In this case, there
is no way to specify a proper erro…