/IronPython_Main/Runtime/Tests/LinqDlrTests/testenv/perl/lib/pod/perlguts.pod
- =head1 NAME
-
- perlguts - Introduction to the Perl API
-
- =head1 DESCRIPTION
-
- This document attempts to describe how to use the Perl API and to provide
- some information on the basic workings of the Perl core. It is far
- from complete and probably contains many errors. Please refer any
- questions or comments to the author below.
-
- =head1 Variables
-
- =head2 Datatypes
-
- Perl has three typedefs that handle Perl's three main data types:
-
- SV Scalar Value
- AV Array Value
- HV Hash Value
-
- Each typedef has specific routines that manipulate the various data types.
-
- =head2 What is an "IV"?
-
- Perl uses a special typedef IV which is a simple signed integer type that is
- guaranteed to be large enough to hold a pointer (as well as an integer).
- Additionally, there is the UV, which is simply an unsigned IV.
-
- Perl also uses two special typedefs, I32 and I16, which will always be at
- least 32 bits and 16 bits long, respectively. (Again, there are U32 and U16,
- as well.)
-
- =head2 Working with SVs
-
- An SV can be created and loaded with one command. There are four types of
- values that can be loaded: an integer value (IV), a double (NV),
- a string (PV), and another scalar (SV).
-
- The six routines are:
-
- SV* newSViv(IV);
- SV* newSVnv(double);
- SV* newSVpv(const char*, int);
- SV* newSVpvn(const char*, int);
- SV* newSVpvf(const char*, ...);
- SV* newSVsv(SV*);
-
- To change the value of an *already-existing* SV, there are eight routines:
-
- void sv_setiv(SV*, IV);
- void sv_setuv(SV*, UV);
- void sv_setnv(SV*, double);
- void sv_setpv(SV*, const char*);
- void sv_setpvn(SV*, const char*, int);
- void sv_setpvf(SV*, const char*, ...);
- void sv_setpvfn(SV*, const char*, STRLEN, va_list *, SV **, I32, bool *);
- void sv_setsv(SV*, SV*);
-
- Notice that you can choose to specify the length of the string to be
- assigned by using C<sv_setpvn>, C<newSVpvn>, or C<newSVpv>, or you may
- allow Perl to calculate the length by using C<sv_setpv> or by specifying
- 0 as the second argument to C<newSVpv>. Be warned, though, that Perl will
- determine the string's length by using C<strlen>, which depends on the
- string terminating with a NUL character.
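-
- The difference between a measured length and a supplied length can be
- sketched in plain C, without the Perl API. The helpers C<toy_set_measured>
- and C<toy_set_counted> are hypothetical names invented for this
- illustration; they only mimic how a C<strlen>-based call and a
- length-taking call would treat the same bytes.
-
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy helpers, not part of the Perl API. */

/* Copy src the way a length-measuring setter would: up to the first NUL. */
size_t toy_set_measured(char *dst, size_t dstsize, const char *src)
{
    size_t n = strlen(src);          /* stops at the first NUL byte */
    assert(n < dstsize);
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}

/* Copy exactly len bytes, the way a length-taking setter would. */
size_t toy_set_counted(char *dst, size_t dstsize, const char *src, size_t len)
{
    assert(len < dstsize);
    memcpy(dst, src, len);           /* embedded NULs are copied like any byte */
    dst[len] = '\0';                 /* trailing NUL, not counted in the length */
    return len;
}
```
-
- Given the five bytes C<"ab\0cd">, the measured variant keeps two bytes and
- the counted variant keeps all five.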
-
- The arguments of C<sv_setpvf> are processed like C<sprintf>, and the
- formatted output becomes the value.
-
- C<sv_setpvfn> is an analogue of C<vsprintf>, but it allows you to specify
- either a pointer to a variable argument list or the address and length of
- an array of SVs. The last argument points to a boolean; on return, if that
- boolean is true, then locale-specific information has been used to format
- the string, and the string's contents are therefore untrustworthy (see
- L<perlsec>). This pointer may be NULL if that information is not
- important. Note that this function requires you to specify the length of
- the format.
-
- STRLEN is an integer type (Size_t, usually defined as size_t in
- config.h) guaranteed to be large enough to represent the size of
- any string that perl can handle.
-
- The C<sv_set*()> functions are not generic enough to operate on values
- that have "magic". See L<Magic Virtual Tables> later in this document.
-
- All SVs that contain strings should be terminated with a NUL character.
- If a string is not NUL-terminated there is a risk of
- core dumps and corruptions from code which passes the string to C
- functions or system calls which expect a NUL-terminated string.
- Perl's own functions typically add a trailing NUL for this reason.
- Nevertheless, you should be very careful when you pass a string stored
- in an SV to a C function or system call.
-
- To access the actual value that an SV points to, you can use the macros:
-
- SvIV(SV*)
- SvUV(SV*)
- SvNV(SV*)
- SvPV(SV*, STRLEN len)
- SvPV_nolen(SV*)
-
- which will automatically coerce the actual scalar type into an IV, UV, double,
- or string.
-
- In the C<SvPV> macro, the length of the string returned is placed into the
- variable C<len> (this is a macro, so you do I<not> use C<&len>). If you do
- not care what the length of the data is, use the C<SvPV_nolen> macro.
- Historically the C<SvPV> macro with the global variable C<PL_na> has been
- used in this case. But that can be quite inefficient because C<PL_na> must
- be accessed in thread-local storage in threaded Perl. In any case, remember
- that Perl allows arbitrary strings of data that may both contain NULs and
- might not be terminated by a NUL.
-
- Also remember that C doesn't allow you to safely say C<foo(SvPV(s, len),
- len);>. It might work with your compiler, but it won't work for everyone.
- Break this sort of statement up into separate assignments:
-
- SV *s;
- STRLEN len;
- char * ptr;
- ptr = SvPV(s, len);
- foo(ptr, len);
-
- If you want to know if the scalar value is TRUE, you can use:
-
- SvTRUE(SV*)
-
- Although Perl will automatically grow strings for you, if you need to force
- Perl to allocate more memory for your SV, you can use the macro
-
- SvGROW(SV*, STRLEN newlen)
-
- which will determine if more memory needs to be allocated. If so, it will
- call the function C<sv_grow>. Note that C<SvGROW> can only increase, not
- decrease, the allocated memory of an SV and that it does not automatically
- add a byte for the trailing NUL (perl's own string functions typically do
- C<SvGROW(sv, len + 1)>).
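-
- The grow-only behaviour can be pictured with a small stand-alone model.
- C<ToyBuf> and C<toy_grow> are hypothetical names, and the model allocates
- exactly the requested size, which is an assumption of this sketch rather
- than a statement about what C<sv_grow> does internally.
-
```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model of a string buffer, not a real SV. */
typedef struct {
    char  *pv;   /* allocated storage, or NULL */
    size_t len;  /* allocated size in bytes */
} ToyBuf;

/* Ensure at least newlen bytes are allocated; never shrink. */
char *toy_grow(ToyBuf *b, size_t newlen)
{
    if (newlen > b->len) {           /* only reallocate when more is needed */
        char *p = realloc(b->pv, newlen);
        assert(p != NULL);
        b->pv  = p;
        b->len = newlen;
    }
    return b->pv;
}
```
-
- A caller that wants room for C<n> bytes plus the trailing NUL would ask
- for C<n + 1>, matching the C<SvGROW(sv, len + 1)> idiom mentioned above.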
-
- If you have an SV and want to know what kind of data Perl thinks is stored
- in it, you can use the following macros to check the type of SV you have.
-
- SvIOK(SV*)
- SvNOK(SV*)
- SvPOK(SV*)
-
- You can get and set the current length of the string stored in an SV with
- the following macros:
-
- SvCUR(SV*)
- SvCUR_set(SV*, I32 val)
-
- You can also get a pointer to the end of the string stored in the SV
- with the macro:
-
- SvEND(SV*)
-
- But note that these last three macros are valid only if C<SvPOK()> is true.
-
- If you want to append something to the end of the string stored in an C<SV*>,
- you can use the following functions:
-
- void sv_catpv(SV*, const char*);
- void sv_catpvn(SV*, const char*, STRLEN);
- void sv_catpvf(SV*, const char*, ...);
- void sv_catpvfn(SV*, const char*, STRLEN, va_list *, SV **, I32, bool *);
- void sv_catsv(SV*, SV*);
-
- The first function calculates the length of the string to be appended by
- using C<strlen>. In the second, you specify the length of the string
- yourself. The third function processes its arguments like C<sprintf> and
- appends the formatted output. The fourth function works like C<vsprintf>.
- You can specify the address and length of an array of SVs instead of the
- va_list argument. The fifth function extends the string stored in the first
- SV with the string stored in the second SV. It also forces the second SV
- to be interpreted as a string.
-
- The C<sv_cat*()> functions are not generic enough to operate on values that
- have "magic". See L<Magic Virtual Tables> later in this document.
-
- If you know the name of a scalar variable, you can get a pointer to its SV
- by using the following:
-
- SV* get_sv("package::varname", FALSE);
-
- This returns NULL if the variable does not exist.
-
- If you want to know if this variable (or any other SV) is actually C<defined>,
- you can call:
-
- SvOK(SV*)
-
- The scalar C<undef> value is stored in an SV instance called C<PL_sv_undef>. Its
- address can be used whenever an C<SV*> is needed.
-
- There are also the two values C<PL_sv_yes> and C<PL_sv_no>, which contain Boolean
- TRUE and FALSE values, respectively. Like C<PL_sv_undef>, their addresses can
- be used whenever an C<SV*> is needed.
-
- Do not be fooled into thinking that C<(SV *) 0> is the same as C<&PL_sv_undef>.
- Take this code:
-
- SV* sv = (SV*) 0;
- if (I-am-to-return-a-real-value) {
- sv = sv_2mortal(newSViv(42));
- }
- sv_setsv(ST(0), sv);
-
- This code tries to return a new SV (which contains the value 42) if it should
- return a real value, or undef otherwise. Instead it has returned a NULL
- pointer which, somewhere down the line, will cause a segmentation violation,
- bus error, or just weird results. Change the zero to C<&PL_sv_undef> in the first
- line and all will be well.
-
- To free an SV that you've created, call C<SvREFCNT_dec(SV*)>. Normally this
- call is not necessary (see L<Reference Counts and Mortality>).
-
- =head2 Offsets
-
- Perl provides the function C<sv_chop> to efficiently remove characters
- from the beginning of a string; you give it an SV and a pointer to
- somewhere inside the PV, and it discards everything before the
- pointer. The efficiency comes by means of a little hack: instead of
- actually removing the characters, C<sv_chop> sets the flag C<OOK>
- (offset OK) to signal to other functions that the offset hack is in
- effect, and it puts the number of bytes chopped off into the IV field
- of the SV. It then moves the PV pointer (called C<SvPVX>) forward that
- many bytes, and adjusts C<SvCUR> and C<SvLEN>.
-
- Hence, at this point, the start of the buffer that we allocated lives
- at C<SvPVX(sv) - SvIV(sv)> in memory and the PV pointer is pointing
- into the middle of this allocated storage.
-
- This is best demonstrated by example:
-
- % ./perl -Ilib -MDevel::Peek -le '$a="12345"; $a=~s/.//; Dump($a)'
- SV = PVIV(0x8128450) at 0x81340f0
- REFCNT = 1
- FLAGS = (POK,OOK,pPOK)
- IV = 1 (OFFSET)
- PV = 0x8135781 ( "1" . ) "2345"\0
- CUR = 4
- LEN = 5
-
- Here the number of bytes chopped off (1) is put into IV, and
- C<Devel::Peek::Dump> helpfully reminds us that this is an offset. The
- portion of the string between the "real" and the "fake" beginnings is
- shown in parentheses, and the values of C<SvCUR> and C<SvLEN> reflect
- the fake beginning, not the real one.
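-
- The bookkeeping behind the offset hack can be modelled in a few lines of
- stand-alone C. This is a toy: C<ToyStr>, its field names, C<toy_chop> and
- C<toy_alloc_start> are all hypothetical and only mirror the roles of
- C<SvPVX>, C<SvCUR>, C<SvLEN>, the IV slot and the C<OOK> flag described
- above.
-
```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the string part of an SV; not perl's layout. */
typedef struct {
    char  *pv;      /* visible start of the string (role of SvPVX) */
    size_t cur;     /* visible length (role of SvCUR) */
    size_t len;     /* bytes available from pv onward (role of SvLEN) */
    size_t offset;  /* bytes chopped so far (role of the IV slot) */
    int    ook;     /* nonzero once the offset hack is in effect */
} ToyStr;

/* Discard everything before ptr without moving any bytes. */
void toy_chop(ToyStr *s, char *ptr)
{
    size_t delta;
    assert(ptr >= s->pv && ptr <= s->pv + s->cur);
    delta = (size_t)(ptr - s->pv);
    s->ook = 1;
    s->offset += delta;
    s->pv  += delta;
    s->cur -= delta;
    s->len -= delta;
}

/* Real start of the allocation, as in SvPVX(sv) - SvIV(sv). */
char *toy_alloc_start(const ToyStr *s)
{
    return s->pv - s->offset;
}
```
-
- Chopping one byte off a six-byte buffer holding C<"12345"> leaves a
- visible length of 4, an available size of 5 and an offset of 1, the same
- numbers as in the dump above.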
-
- Something similar to the offset hack is performed on AVs to enable
- efficient shifting and splicing off the beginning of the array; while
- C<AvARRAY> points to the first element in the array that is visible from
- Perl, C<AvALLOC> points to the real start of the C array. These are
- usually the same, but a C<shift> operation can be carried out by
- increasing C<AvARRAY> by one and decreasing C<AvFILL> and C<AvLEN>.
- Again, the location of the real start of the C array only comes into
- play when freeing the array. See C<av_shift> in F<av.c>.
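-
- A shift that moves a pointer instead of copying elements might look like
- the following toy. C<ToyArray> and C<toy_shift> are hypothetical; the
- elements are plain C<int>s rather than C<SV*>s, and the fields only play
- the roles that the text assigns to C<AvALLOC>, C<AvARRAY>, C<AvFILL> and
- C<AvLEN>.
-
```c
#include <assert.h>
#include <stddef.h>

/* Toy array, not perl's AV. */
typedef struct {
    int *alloc;  /* real start of the C array (role of AvALLOC) */
    int *array;  /* first element visible to the program (role of AvARRAY) */
    long fill;   /* highest visible index, -1 when empty (role of AvFILL) */
    long len;    /* slots available from array onward (role of AvLEN) */
} ToyArray;

/* Remove and return the first element by stepping the visible start. */
int toy_shift(ToyArray *a)
{
    int first;
    assert(a->fill >= 0);
    first = *a->array;
    a->array++;      /* nothing is copied; alloc still marks the real start */
    a->fill--;
    a->len--;
    return first;
}
```
-
- After one shift the C<alloc> pointer is unchanged, which is why it is the
- pointer that matters when the storage is finally released.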
-
- =head2 What's Really Stored in an SV?
-
- Recall that the usual method of determining the type of scalar you have is
- to use C<Sv*OK> macros. Because a scalar can be both a number and a string,
- these macros will usually return TRUE, and calling the C<Sv*V>
- macros will do the appropriate conversion of string to integer/double or
- integer/double to string.
-
- If you I<really> need to know if you have an integer, double, or string
- pointer in an SV, you can use the following three macros instead:
-
- SvIOKp(SV*)
- SvNOKp(SV*)
- SvPOKp(SV*)
-
- These will tell you if you truly have an integer, double, or string pointer
- stored in your SV. The "p" stands for private.
-
- In general, though, it's best to use the C<Sv*V> macros.
-
- =head2 Working with AVs
-
- There are two ways to create and load an AV. The first method creates an
- empty AV:
-
- AV* newAV();
-
- The second method both creates the AV and initially populates it with SVs:
-
- AV* av_make(I32 num, SV **ptr);
-
- The second argument points to an array containing C<num> C<SV*>'s. Once the
- AV has been created, the SVs can be destroyed, if so desired.
-
- Once the AV has been created, the following operations are possible on AVs:
-
- void av_push(AV*, SV*);
- SV* av_pop(AV*);
- SV* av_shift(AV*);
- void av_unshift(AV*, I32 num);
-
- These should be familiar operations, with the exception of C<av_unshift>.
- This routine adds C<num> elements at the front of the array with the C<undef>
- value. You must then use C<av_store> (described below) to assign values
- to these new elements.
-
- Here are some other functions:
-
- I32 av_len(AV*);
- SV** av_fetch(AV*, I32 key, I32 lval);
- SV** av_store(AV*, I32 key, SV* val);
-
- The C<av_len> function returns the highest index value in the array (just
- like $#array in Perl). If the array is empty, -1 is returned. The
- C<av_fetch> function returns the value at index C<key>, but if C<lval>
- is non-zero, then C<av_fetch> will store an undef value at that index.
- The C<av_store> function stores the value C<val> at index C<key>, and does
- not increment the reference count of C<val>. Thus the caller is responsible
- for taking care of that, and if C<av_store> returns NULL, the caller will
- have to decrement the reference count to avoid a memory leak. Note that
- C<av_fetch> and C<av_store> both return C<SV**>'s, not C<SV*>'s as their
- return value.
-
- void av_clear(AV*);
- void av_undef(AV*);
- void av_extend(AV*, I32 key);
-
- The C<av_clear> function deletes all the elements in the AV* array, but
- does not actually delete the array itself. The C<av_undef> function will
- delete all the elements in the array plus the array itself. The
- C<av_extend> function extends the array so that it contains at least C<key+1>
- elements. If C<key+1> is less than the currently allocated length of the array,
- then nothing is done.
-
- If you know the name of an array variable, you can get a pointer to its AV
- by using the following:
-
- AV* get_av("package::varname", FALSE);
-
- This returns NULL if the variable does not exist.
-
- See L<Understanding the Magic of Tied Hashes and Arrays> for more
- information on how to use the array access functions on tied arrays.
-
- =head2 Working with HVs
-
- To create an HV, you use the following routine:
-
- HV* newHV();
-
- Once the HV has been created, the following operations are possible on HVs:
-
- SV** hv_store(HV*, const char* key, U32 klen, SV* val, U32 hash);
- SV** hv_fetch(HV*, const char* key, U32 klen, I32 lval);
-
- The C<klen> parameter is the length of the key being passed in (Note that
- you cannot pass 0 in as a value of C<klen> to tell Perl to measure the
- length of the key). The C<val> argument contains the SV pointer to the
- scalar being stored, and C<hash> is the precomputed hash value (zero if
- you want C<hv_store> to calculate it for you). The C<lval> parameter
- indicates whether this fetch is actually a part of a store operation, in
- which case a new undefined value will be added to the HV with the supplied
- key and C<hv_fetch> will return as if the value had already existed.
-
- Remember that C<hv_store> and C<hv_fetch> return C<SV**>'s and not just
- C<SV*>. To access the scalar value, you must first dereference the return
- value. However, you should check to make sure that the return value is
- not NULL before dereferencing it.
-
- These two functions check whether a hash table entry exists and delete an
- entry, respectively.
-
- bool hv_exists(HV*, const char* key, U32 klen);
- SV* hv_delete(HV*, const char* key, U32 klen, I32 flags);
-
- If C<flags> does not include the C<G_DISCARD> flag then C<hv_delete> will
- create and return a mortal copy of the deleted value.
-
- And more miscellaneous functions:
-
- void hv_clear(HV*);
- void hv_undef(HV*);
-
- Like their AV counterparts, C<hv_clear> deletes all the entries in the hash
- table but does not actually delete the hash table. The C<hv_undef> function deletes
- both the entries and the hash table itself.
-
- Perl keeps the actual data in a linked list of structures with a typedef of HE.
- These contain the actual key and value pointers (plus extra administrative
- overhead). The key is a string pointer; the value is an C<SV*>. However,
- once you have an C<HE*>, to get the actual key and value, use the routines
- specified below.
-
- I32 hv_iterinit(HV*);
- /* Prepares starting point to traverse hash table */
- HE* hv_iternext(HV*);
- /* Get the next entry, and return a pointer to a
- structure that has both the key and value */
- char* hv_iterkey(HE* entry, I32* retlen);
- /* Get the key from an HE structure and also return
- the length of the key string */
- SV* hv_iterval(HV*, HE* entry);
- /* Return a SV pointer to the value of the HE
- structure */
- SV* hv_iternextsv(HV*, char** key, I32* retlen);
- /* This convenience routine combines hv_iternext,
- hv_iterkey, and hv_iterval. The key and retlen
- arguments are return values for the key and its
- length. The value is the SV* that the function returns */
-
- If you know the name of a hash variable, you can get a pointer to its HV
- by using the following:
-
- HV* get_hv("package::varname", FALSE);
-
- This returns NULL if the variable does not exist.
-
- The hash algorithm is defined in the C<PERL_HASH(hash, key, klen)> macro:
-
- hash = 0;
- while (klen--)
- hash = (hash * 33) + *key++;
- hash = hash + (hash >> 5); /* after 5.6 */
-
- The last step was added in version 5.6 to improve distribution of
- lower bits in the resulting hash value.
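-
- Written as an ordinary function, the algorithm quoted above could look
- like this. The name C<toy_perl_hash>, the use of C<uint32_t> in place of
- U32, and the cast to C<unsigned char> (so that the result does not depend
- on whether plain C<char> is signed) are assumptions of this sketch; it is
- not the C<PERL_HASH> macro itself.
-
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative restatement of the quoted algorithm, not perl's macro. */
uint32_t toy_perl_hash(const char *key, size_t klen)
{
    uint32_t hash = 0;
    assert(key != NULL || klen == 0);
    while (klen--)
        hash = (hash * 33) + (unsigned char)*key++;  /* multiply-and-add step */
    hash = hash + (hash >> 5);                       /* the step added in 5.6 */
    return hash;
}
```
-
- For the one-byte key C<"a"> (byte value 97) the loop yields 97, and the
- final step adds C<97 E<gt>E<gt> 5>, which is 3, giving 100.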
-
- See L<Understanding the Magic of Tied Hashes and Arrays> for more
- information on how to use the hash access functions on tied hashes.
-
- =head2 Hash API Extensions
-
- Beginning with version 5.004, the following functions are also supported:
-
- HE* hv_fetch_ent (HV* tb, SV* key, I32 lval, U32 hash);
- HE* hv_store_ent (HV* tb, SV* key, SV* val, U32 hash);
-
- bool hv_exists_ent (HV* tb, SV* key, U32 hash);
- SV* hv_delete_ent (HV* tb, SV* key, I32 flags, U32 hash);
-
- SV* hv_iterkeysv (HE* entry);
-
- Note that these functions take C<SV*> keys, which simplifies writing
- of extension code that deals with hash structures. These functions
- also allow passing of C<SV*> keys to C<tie> functions without forcing
- you to stringify the keys (unlike the previous set of functions).
-
- They also return and accept whole hash entries (C<HE*>), making their
- use more efficient (since the hash number for a particular string
- doesn't have to be recomputed every time). See L<perlapi> for detailed
- descriptions.
-
- The following macros must always be used to access the contents of hash
- entries. Note that the arguments to these macros must be simple
- variables, since they may get evaluated more than once. See
- L<perlapi> for detailed descriptions of these macros.
-
- HePV(HE* he, STRLEN len)
- HeVAL(HE* he)
- HeHASH(HE* he)
- HeSVKEY(HE* he)
- HeSVKEY_force(HE* he)
- HeSVKEY_set(HE* he, SV* sv)
-
- These two lower level macros are defined, but must only be used when
- dealing with keys that are not C<SV*>s:
-
- HeKEY(HE* he)
- HeKLEN(HE* he)
-
- Note that both C<hv_store> and C<hv_store_ent> do not increment the
- reference count of the stored C<val>, which is the caller's responsibility.
- If these functions return a NULL value, the caller will usually have to
- decrement the reference count of C<val> to avoid a memory leak.
-
- =head2 References
-
- References are a special type of scalar that point to other data types
- (including references).
-
- To create a reference, use either of the following functions:
-
- SV* newRV_inc((SV*) thing);
- SV* newRV_noinc((SV*) thing);
-
- The C<thing> argument can be any of an C<SV*>, C<AV*>, or C<HV*>. The
- functions are identical except that C<newRV_inc> increments the reference
- count of the C<thing>, while C<newRV_noinc> does not. For historical
- reasons, C<newRV> is a synonym for C<newRV_inc>.
-
- Once you have a reference, you can use the following macro to dereference
- the reference:
-
- SvRV(SV*)
-
- then call the appropriate routines, casting the returned C<SV*> to either an
- C<AV*> or C<HV*>, if required.
-
- To determine if an SV is a reference, you can use the following macro:
-
- SvROK(SV*)
-
- To discover what type of value the reference refers to, use the following
- macro and then check the return value.
-
- SvTYPE(SvRV(SV*))
-
- The most useful types that will be returned are:
-
- SVt_IV Scalar
- SVt_NV Scalar
- SVt_PV Scalar
- SVt_RV Scalar
- SVt_PVAV Array
- SVt_PVHV Hash
- SVt_PVCV Code
- SVt_PVGV Glob (possibly a file handle)
- SVt_PVMG Blessed or Magical Scalar
-
- See the sv.h header file for more details.
-
- =head2 Blessed References and Class Objects
-
- References are also used to support object-oriented programming. In the
- OO lexicon, an object is simply a reference that has been blessed into a
- package (or class). Once blessed, the programmer may now use the reference
- to access the various methods in the class.
-
- A reference can be blessed into a package with the following function:
-
- SV* sv_bless(SV* sv, HV* stash);
-
- The C<sv> argument must be a reference. The C<stash> argument specifies
- which class the reference will belong to. See
- L<Stashes and Globs> for information on converting class names into stashes.
-
- /* Still under construction */
-
- Upgrades rv to reference if not already one. Creates new SV for rv to
- point to. If C<classname> is non-null, the SV is blessed into the specified
- class. SV is returned.
-
- SV* newSVrv(SV* rv, const char* classname);
-
- Copies integer or double into an SV whose reference is C<rv>. SV is blessed
- if C<classname> is non-null.
-
- SV* sv_setref_iv(SV* rv, const char* classname, IV iv);
- SV* sv_setref_nv(SV* rv, const char* classname, NV nv);
-
- Copies the pointer value (I<the address, not the string!>) into an SV whose
- reference is rv. SV is blessed if C<classname> is non-null.
-
- SV* sv_setref_pv(SV* rv, const char* classname, PV iv);
-
- Copies string into an SV whose reference is C<rv>. Set length to 0 to let
- Perl calculate the string length. SV is blessed if C<classname> is non-null.
-
- SV* sv_setref_pvn(SV* rv, const char* classname, PV iv, STRLEN length);
-
- Tests whether the SV is blessed into the specified class. It does not
- check inheritance relationships.
-
- int sv_isa(SV* sv, const char* name);
-
- Tests whether the SV is a reference to a blessed object.
-
- int sv_isobject(SV* sv);
-
- Tests whether the SV is derived from the specified class. SV can be either
- a reference to a blessed object or a string containing a class name. This
- is the function implementing the C<UNIVERSAL::isa> functionality.
-
- bool sv_derived_from(SV* sv, const char* name);
-
- To check if you've got an object derived from a specific class you have
- to write:
-
- if (sv_isobject(sv) && sv_derived_from(sv, class)) { ... }
-
- =head2 Creating New Variables
-
- To create a new Perl variable with an undef value which can be accessed from
- your Perl script, use the following routines, depending on the variable type.
-
- SV* get_sv("package::varname", TRUE);
- AV* get_av("package::varname", TRUE);
- HV* get_hv("package::varname", TRUE);
-
- Notice the use of TRUE as the second parameter. The new variable can now
- be set, using the routines appropriate to the data type.
-
- There are additional macros whose values may be bitwise OR'ed with the
- C<TRUE> argument to enable certain extra features. Those bits are:
-
- GV_ADDMULTI Marks the variable as multiply defined, thus preventing the
- "Name <varname> used only once: possible typo" warning.
- GV_ADDWARN Issues the warning "Had to create <varname> unexpectedly" if
- the variable did not exist before the function was called.
-
- If you do not specify a package name, the variable is created in the current
- package.
-
- =head2 Reference Counts and Mortality
-
- Perl uses a reference-count-driven garbage collection mechanism. SVs,
- AVs, or HVs (xV for short in the following) start their life with a
- reference count of 1. If the reference count of an xV ever drops to 0,
- then it will be destroyed and its memory made available for reuse.
-
- This normally doesn't happen at the Perl level unless a variable is
- undef'ed or the last variable holding a reference to it is changed or
- overwritten. At the internal level, however, reference counts can be
- manipulated with the following macros:
-
- int SvREFCNT(SV* sv);
- SV* SvREFCNT_inc(SV* sv);
- void SvREFCNT_dec(SV* sv);
-
- However, there is one other function which manipulates the reference
- count of its argument. The C<newRV_inc> function, you will recall,
- creates a reference to the specified argument. As a side effect,
- it increments the argument's reference count. If this is not what
- you want, use C<newRV_noinc> instead.
-
- For example, imagine you want to return a reference from an XSUB function.
- Inside the XSUB routine, you create an SV which initially has a reference
- count of one. Then you call C<newRV_inc>, passing it the just-created SV.
- This returns the reference as a new SV, but the reference count of the
- SV you passed to C<newRV_inc> has been incremented to two. Now you
- return the reference from the XSUB routine and forget about the SV.
- But Perl hasn't! Whenever the returned reference is destroyed, the
- reference count of the original SV is decreased to one and nothing happens.
- The SV will hang around without any way to access it until Perl itself
- terminates. This is a memory leak.
-
- The correct procedure, then, is to use C<newRV_noinc> instead of
- C<newRV_inc>. Then, if and when the last reference is destroyed,
- the reference count of the SV will go to zero and it will be destroyed,
- stopping any memory leak.
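-
- The leak described above can be reproduced in a toy reference-counting
- model written in plain C. Everything here (C<ToyVal>, C<toy_new>,
- C<toy_dec>, C<toy_newrv_inc>, C<toy_newrv_noinc> and the C<toy_live>
- counter) is hypothetical and only imitates the counting rules that the
- text gives for C<newRV_inc> and C<newRV_noinc>.
-
```c
#include <assert.h>
#include <stdlib.h>

/* Toy reference-counted value, not a real SV. */
typedef struct ToyVal {
    int refcnt;
    struct ToyVal *target;   /* non-NULL when this value is a reference */
} ToyVal;

int toy_live = 0;            /* values allocated and not yet freed */

/* New values start life with a reference count of 1. */
ToyVal *toy_new(void)
{
    ToyVal *v = calloc(1, sizeof *v);
    assert(v != NULL);
    v->refcnt = 1;
    toy_live++;
    return v;
}

/* Drop one count; free at zero, releasing the referent as well. */
void toy_dec(ToyVal *v)
{
    if (v == NULL)
        return;
    assert(v->refcnt > 0);
    if (--v->refcnt == 0) {
        toy_dec(v->target);
        free(v);
        toy_live--;
    }
}

/* Make a reference and bump the referent's count. */
ToyVal *toy_newrv_inc(ToyVal *thing)
{
    ToyVal *rv = toy_new();
    rv->target = thing;
    thing->refcnt++;
    return rv;
}

/* Make a reference that takes over the caller's existing count. */
ToyVal *toy_newrv_noinc(ToyVal *thing)
{
    ToyVal *rv = toy_new();
    rv->target = thing;
    return rv;
}
```
-
- Creating a value, wrapping it with C<toy_newrv_inc> and then dropping
- only the reference leaves the value alive with a count of 1 and nothing
- pointing at it. The same steps with C<toy_newrv_noinc> free both.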
-
- There are some convenience functions available that can help with the
- destruction of xVs. These functions introduce the concept of "mortality".
- An xV that is mortal has had its reference count marked to be decremented,
- but not actually decremented, until "a short time later". Generally the
- term "short time later" means a single Perl statement, such as a call to
- an XSUB function. The actual determinant for when mortal xVs have their
- reference count decremented depends on two macros, SAVETMPS and FREETMPS.
- See L<perlcall> and L<perlxs> for more details on these macros.
-
- "Mortalization" then is at its simplest a deferred C<SvREFCNT_dec>.
- However, if you mortalize a variable twice, the reference count will
- later be decremented twice.
-
- You should be careful about creating mortal variables. Strange things
- can happen if you make the same value mortal within multiple contexts,
- or if you make a variable mortal multiple times.
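-
- Mortality as a deferred decrement can be pictured with a small list of
- pending decrements. This is a toy under stated assumptions:
- C<ToyCounted>, C<toy_2mortal>, C<toy_free_tmps> and the fixed-size
- C<toy_tmps> array are hypothetical names, and the model only counts; it
- never frees anything.
-
```c
#include <assert.h>
#include <stddef.h>

#define TOY_TMPS_MAX 16

/* Toy counted value, not a real SV. */
typedef struct {
    int refcnt;
} ToyCounted;

ToyCounted *toy_tmps[TOY_TMPS_MAX];  /* decrements that are still pending */
size_t toy_tmps_ix = 0;              /* number of pending entries */

/* Schedule one later decrement and hand the value back unchanged. */
ToyCounted *toy_2mortal(ToyCounted *v)
{
    assert(toy_tmps_ix < TOY_TMPS_MAX);
    toy_tmps[toy_tmps_ix++] = v;
    return v;
}

/* Perform every decrement scheduled above the given floor. */
void toy_free_tmps(size_t floor)
{
    while (toy_tmps_ix > floor) {
        ToyCounted *v = toy_tmps[--toy_tmps_ix];
        assert(v->refcnt > 0);
        v->refcnt--;   /* a fuller model would free the value at zero */
    }
}
```
-
- Passing the same value to C<toy_2mortal> twice records two entries, so
- the later sweep lowers its count by two. That is the double decrement the
- text warns about.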
-
- To create a mortal variable, use the functions:
-
- SV* sv_newmortal()
- SV* sv_2mortal(SV*)
- SV* sv_mortalcopy(SV*)
-
- The first call creates a mortal SV, the second converts an existing
- SV to a mortal SV (and thus defers a call to C<SvREFCNT_dec>), and the
- third creates a mortal copy of an existing SV.
-
- The mortal routines are not just for SVs -- AVs and HVs can be
- made mortal by passing their address (cast to C<SV*>) to the
- C<sv_2mortal> or C<sv_mortalcopy> routines.
-
- =head2 Stashes and Globs
-
- A "stash" is a hash that contains all of the different objects that
- are contained within a package. Each key of the stash is a symbol
- name (shared by all the different types of objects that have the same
- name), and each value in the hash table is a GV (Glob Value). This GV
- in turn contains references to the various objects of that name,
- including (but not limited to) the following:
-
- Scalar Value
- Array Value
- Hash Value
- I/O Handle
- Format
- Subroutine
-
- There is a single stash called "PL_defstash" that holds the items that exist
- in the "main" package. To get at the items in other packages, append the
- string "::" to the package name. The items in the "Foo" package are in
- the stash "Foo::" in PL_defstash. The items in the "Bar::Baz" package are
- in the stash "Baz::" in "Bar::"'s stash.
-
- To get the stash pointer for a particular package, use the function:
-
- HV* gv_stashpv(const char* name, I32 create)
- HV* gv_stashsv(SV*, I32 create)
-
- The first function takes a literal string, the second uses the string stored
- in the SV. Remember that a stash is just a hash table, so you get back an
- C<HV*>. The C<create> flag will create a new package if it is set.
-
- The name that C<gv_stash*v> wants is the name of the package whose symbol table
- you want. The default package is called C<main>. If you have multiply nested
- packages, pass their names to C<gv_stash*v>, separated by C<::> as in the Perl
- language itself.
-
- Alternatively, if you have an SV that is a blessed reference, you can find
- out the stash pointer by using:
-
- HV* SvSTASH(SvRV(SV*));
-
- then use the following to get the package name itself:
-
- char* HvNAME(HV* stash);
-
- If you need to bless or re-bless an object you can use the following
- function:
-
- SV* sv_bless(SV*, HV* stash)
-
- where the first argument, an C<SV*>, must be a reference, and the second
- argument is a stash. The returned C<SV*> can now be used in the same way
- as any other SV.
-
- For more information on references and blessings, consult L<perlref>.
-
- =head2 Double-Typed SVs
-
- Scalar variables normally contain only one type of value, an integer,
- double, pointer, or reference. Perl will automatically convert the
- actual scalar data from the stored type into the requested type.
-
- Some scalar variables contain more than one type of scalar data. For
- example, the variable C<$!> contains either the numeric value of C<errno>
- or its string equivalent from either C<strerror> or C<sys_errlist[]>.
-
- To force multiple data values into an SV, you must do two things: use the
- C<sv_set*v> routines to add the additional scalar type, then set a flag
- so that Perl will believe it contains more than one type of data. The
- four macros to set the flags are:
-
- SvIOK_on
- SvNOK_on
- SvPOK_on
- SvROK_on
-
- The particular macro you must use depends on which C<sv_set*v> routine
- you called first. This is because every C<sv_set*v> routine turns on
- only the bit for the particular type of data being set, and turns off
- all the rest.
-
- For example, to create a new Perl variable called "dberror" that contains
- both the numeric and descriptive string error values, you could use the
- following code:
-
- extern int dberror;
- extern char *dberror_list[];
-
- SV* sv = get_sv("dberror", TRUE);
- sv_setiv(sv, (IV) dberror);
- sv_setpv(sv, dberror_list[dberror]);
- SvIOK_on(sv);
-
- If the order of C<sv_setiv> and C<sv_setpv> had been reversed, then the
- macro C<SvPOK_on> would need to be called instead of C<SvIOK_on>.
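-
- Why the order of the calls decides which flag has to be restored can be
- seen in a toy model of the flag handling. C<ToyScalar>, C<TOY_IOK>,
- C<TOY_POK> and the three functions are hypothetical; they assume, as the
- text states, that each setter turns on only its own bit and turns off
- the rest, while leaving the other slot's stored value in place.
-
```c
#include <assert.h>
#include <stddef.h>

enum { TOY_IOK = 1, TOY_POK = 2 };

/* Toy scalar with one slot per type, not a real SV. */
typedef struct {
    unsigned    flags;
    long        iv;
    const char *pv;
} ToyScalar;

/* Store an integer; only the integer flag survives. */
void toy_setiv(ToyScalar *s, long iv)
{
    s->iv = iv;
    s->flags = TOY_IOK;
}

/* Store a string pointer; only the string flag survives. */
void toy_setpv(ToyScalar *s, const char *pv)
{
    assert(pv != NULL);
    s->pv = pv;
    s->flags = TOY_POK;
}

/* Re-assert that the integer slot is valid too. */
void toy_iok_on(ToyScalar *s)
{
    s->flags |= TOY_IOK;
}
```
-
- Calling C<toy_setiv>, then C<toy_setpv>, leaves only the string flag set,
- so the integer flag is the one that must be switched back on.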
-
- =head2 Magic Variables
-
- [This section still under construction. Ignore everything here. Post no
- bills. Everything not permitted is forbidden.]
-
- Any SV may be magical, that is, it has special features that a normal
- SV does not have. These features are stored in the SV structure in a
- linked list of C<struct magic>'s, typedef'ed to C<MAGIC>.
-
- struct magic {
- MAGIC* mg_moremagic;
- MGVTBL* mg_virtual;
- U16 mg_private;
- char mg_type;
- U8 mg_flags;
- SV* mg_obj;
- char* mg_ptr;
- I32 mg_len;
- };
-
- Note this is current as of patchlevel 0, and could change at any time.
-
- =head2 Assigning Magic
-
- Perl adds magic to an SV using the sv_magic function:
-
- void sv_magic(SV* sv, SV* obj, int how, const char* name, I32 namlen);
-
- The C<sv> argument is a pointer to the SV that is to acquire a new magical
- feature.
-
- If C<sv> is not already magical, Perl uses the C<SvUPGRADE> macro to
- set the C<SVt_PVMG> flag for the C<sv>. Perl then continues by adding
- the new magic to the beginning of the linked list of magical features. Any prior
- entry of the same type of magic is deleted. Note that this can be
- overridden, and multiple instances of the same type of magic can be
- associated with an SV.
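-
- The list handling itself (a new entry goes on the front, and entries are
- looked up by type) can be sketched with a cut-down structure. C<ToyMagic>,
- C<toy_magic_push> and C<toy_magic_find> are hypothetical; the struct keeps
- only two of the fields shown earlier, and the sketch does not model the
- deletion of a prior entry of the same type.
-
```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-in for struct magic. */
typedef struct ToyMagic {
    struct ToyMagic *mg_moremagic;  /* next entry in the chain */
    char             mg_type;       /* which kind of magic this entry is */
} ToyMagic;

/* Push a caller-owned entry on the front of the chain. */
void toy_magic_push(ToyMagic **head, ToyMagic *mg)
{
    assert(head != NULL && mg != NULL);
    mg->mg_moremagic = *head;
    *head = mg;
}

/* Return the first entry of the requested type, or NULL if none exists. */
ToyMagic *toy_magic_find(ToyMagic *head, char type)
{
    ToyMagic *mg;
    for (mg = head; mg != NULL; mg = mg->mg_moremagic)
        if (mg->mg_type == type)
            return mg;
    return NULL;
}
```
-
- Because new entries are pushed on the front, the most recently added
- magic is the first one a walk of the chain encounters.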
-
- The C<name> and C<namlen> arguments are used to associate a string with
- the magic, typically the name of a variable. C<namlen> is stored in the
- C<mg_len> field and if C<name> is non-null and C<namlen> >= 0 a malloc'd
- copy of the name is stored in C<mg_ptr> field.
-
- The sv_magic function uses C<how> to determine which, if any, predefined
- "Magic Virtual Table" should be assigned to the C<mg_virtual> field.
- See the "Magic Virtual Table" section below. The C<how> argument is also
- stored in the C<mg_type> field.
-
- The C<obj> argument is stored in the C<mg_obj> field of the C<MAGIC>
- structure. If it is not the same as the C<sv> argument, the reference
- count of the C<obj> object is incremented. If it is the same, or if
- the C<how> argument is "#", or if it is a NULL pointer, then C<obj> is
- merely stored, without the reference count being incremented.
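-
- The reference-counting rule above can be modelled in a few lines of plain
- C. This is only a simplified sketch, not Perl's implementation: C<ToyObj>
- and C<toy_attach_obj> are hypothetical names invented for illustration,
- and only the three exceptions named in the text are modelled.
-
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an SV; only the reference count is modelled. */
typedef struct {
    int refcnt;
} ToyObj;

/*
 * Model of how sv_magic is described as treating its obj argument:
 * the count is bumped unless obj is NULL, obj is sv itself, or the
 * magic type is '#'.  Returns 1 if the count was incremented, else 0.
 */
int toy_attach_obj(ToyObj *sv, ToyObj *obj, char how)
{
    assert(sv != NULL);
    if (obj == NULL || obj == sv || how == '#')
        return 0;               /* merely stored, no increment */
    obj->refcnt++;
    return 1;
}
```
-
- Attaching an object to itself (C<toy_attach_obj(&a, &a, 'P')>) leaves the
- count alone, which is what keeps a self-referencing SV from holding itself
- alive.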
-
- There is also a function to add magic to an C<HV>:
-
- void hv_magic(HV *hv, GV *gv, int how);
-
- This simply calls C<sv_magic> and coerces the C<gv> argument into an C<SV>.
-
- To remove the magic from an SV, call the function sv_unmagic:
-
- void sv_unmagic(SV *sv, int type);
-
- The C<type> argument should be equal to the C<how> value when the C<SV>
- was initially made magical.
-
- =head2 Magic Virtual Tables
-
- The C<mg_virtual> field in the C<MAGIC> structure is a pointer to a
- C<MGVTBL> (short for "Magic Virtual Table"), a structure of function
- pointers that handle the various operations that might be applied to
- that variable.
-
- The C<MGVTBL> has five pointers to the following routine types:
-
- int (*svt_get)(SV* sv, MAGIC* mg);
- int (*svt_set)(SV* sv, MAGIC* mg);
- U32 (*svt_len)(SV* sv, MAGIC* mg);
- int (*svt_clear)(SV* sv, MAGIC* mg);
- int (*svt_free)(SV* sv, MAGIC* mg);
-
- This MGVTBL structure is set at compile-time in C<perl.h> and there are
- currently 19 types (or 21 with overloading turned on). These different
- structures contain pointers to various routines that perform additional
- actions depending on which function is being called.
-
-     Function pointer    Action taken
-     ----------------    ------------
-     svt_get             Do something before the value of the SV is retrieved.
-     svt_set             Do something after the SV is assigned a value.
-     svt_len             Report on the SV's length.
-     svt_clear           Clear something the SV represents.
-     svt_free            Free any extra storage associated with the SV.
-
- For instance, the MGVTBL structure called C<vtbl_sv> (which corresponds
- to an C<mg_type> of '\0') contains:
-
- { magic_get, magic_set, magic_len, 0, 0 }
-
- Thus, when an SV is determined to be magical and of type '\0', if a get
- operation is being performed, the routine C<magic_get> is called. All
- the various routines for the various magical types begin with C<magic_>.
- NOTE: the magic routines are not considered part of the Perl API, and may
- not be exported by the Perl library.
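-
- The dispatch described above can be sketched as a linked list of entries,
- each pointing at a table of function pointers. The code below is a toy
- model under stated assumptions, not the Perl core: every C<Toy*> and
- C<toy_*> name is hypothetical, only the get and set slots are modelled,
- and the hooks merely count how often they are called.
-
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real SV, MAGIC or MGVTBL. */
typedef struct toy_sv ToySV;
typedef struct toy_magic ToyMagic;

typedef struct {
    int (*svt_get)(ToySV *sv, ToyMagic *mg);    /* may be NULL */
    int (*svt_set)(ToySV *sv, ToyMagic *mg);    /* may be NULL */
} ToyVtbl;

struct toy_magic {
    ToyMagic      *mg_moremagic;    /* next entry in the list */
    const ToyVtbl *mg_virtual;      /* may be NULL, as for '~' magic */
    char           mg_type;
};

struct toy_sv {
    long      value;
    ToyMagic *magic;        /* head of the magic list */
    int       get_calls;    /* bookkeeping for the example hooks */
    int       set_calls;
};

/* Call every non-NULL svt_get hook; returns how many hooks ran. */
int toy_mg_get(ToySV *sv)
{
    ToyMagic *mg;
    int called = 0;

    assert(sv != NULL);
    for (mg = sv->magic; mg != NULL; mg = mg->mg_moremagic) {
        if (mg->mg_virtual != NULL && mg->mg_virtual->svt_get != NULL) {
            mg->mg_virtual->svt_get(sv, mg);
            called++;
        }
    }
    return called;
}

/* Call every non-NULL svt_set hook; returns how many hooks ran. */
int toy_mg_set(ToySV *sv)
{
    ToyMagic *mg;
    int called = 0;

    assert(sv != NULL);
    for (mg = sv->magic; mg != NULL; mg = mg->mg_moremagic) {
        if (mg->mg_virtual != NULL && mg->mg_virtual->svt_set != NULL) {
            mg->mg_virtual->svt_set(sv, mg);
            called++;
        }
    }
    return called;
}

/* Example hooks that only record that they were invoked. */
static int toy_count_get(ToySV *sv, ToyMagic *mg)
{
    (void)mg;
    sv->get_calls++;
    return 0;
}

static int toy_count_set(ToySV *sv, ToyMagic *mg)
{
    (void)mg;
    sv->set_calls++;
    return 0;
}

/* A table in the spirit of { magic_get, magic_set, ... }. */
const ToyVtbl toy_vtbl_counter = { toy_count_get, toy_count_set };
```
-
- An entry whose table pointer is NULL is simply skipped, which is how a
- magic type with no table (shown as "(none)" in the listing of magic types)
- can sit in the same list without any hooks firing.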
-
- The current kinds of Magic Virtual Tables are:
-
-     mg_type  MGVTBL              Type of magic
-     -------  ------              ----------------------------
-     \0       vtbl_sv             Special scalar variable
-     A        vtbl_amagic         %OVERLOAD hash
-     a        vtbl_amagicelem     %OVERLOAD hash element
-     c        (none)              Holds overload table (AMT) on stash
-     B        vtbl_bm             Boyer-Moore (fast string search)
-     D        vtbl_regdata        Regex match position data (@+ and @- vars)
-     d        vtbl_regdatum       Regex match position data element
-     E        vtbl_env            %ENV hash
-     e        vtbl_envelem        %ENV hash element
-     f        vtbl_fm             Formline ('compiled' format)
-     g        vtbl_mglob          m//g target / study()ed string
-     I        vtbl_isa            @ISA array
-     i        vtbl_isaelem        @ISA array element
-     k        vtbl_nkeys          scalar(keys()) lvalue
-     L        (none)              Debugger %_<filename
-     l        vtbl_dbline         Debugger %_<filename element
-     o        vtbl_collxfrm       Locale transformation
-     P        vtbl_pack           Tied array or hash
-     p        vtbl_packelem       Tied array or hash element
-     q        vtbl_packelem       Tied scalar or handle
-     S        vtbl_sig            %SIG hash
-     s        vtbl_sigelem        %SIG hash element
-     t        vtbl_taint          Taintedness
-     U        vtbl_uvar           Available for use by extensions
-     v        vtbl_vec            vec() lvalue
-     x        vtbl_substr         substr() lvalue
-     y        vtbl_defelem        Shadow "foreach" iterator variable /
-                                  smart parameter vivification
-     *        vtbl_glob           GV (typeglob)
-     #        vtbl_arylen         Array length ($#ary)
-     .        vtbl_pos            pos() lvalue
-     ~        (none)              Available for use by extensions
-
- When an uppercase and lowercase letter both exist in the table, then the
- uppercase letter is used to represent some kind of composite type (a list
- or a hash), and the lowercase letter is used to represent an element of
- that composite type.
-
- The '~' and 'U' magic types are defined specifically for use by
- extensions and will not be used by perl itself. Extensions can use
- '~' magic to 'attach' private information to variables (typically
- objects). This is especially useful because there is no way for
- normal perl code to corrupt this private information (unlike using
- extra elements of a hash object).
-
- Similarly, 'U' magic can be used much like tie() to call a C function
- any time a scalar's value is used or changed. The C<MAGIC>'s
- C<mg_ptr> field points to a C<ufuncs> structure:
-
- struct ufuncs {
- I32 (*uf_val)(IV, SV*);
- I32 (*uf_set)(IV, SV*);
- IV uf_index;
- };
-
- When the SV is read from or written to, the C<uf_val> or C<uf_set>
- function will be called with C<uf_index> as the first arg and a
- pointer to the SV as the second. A simple example of how to add 'U'
- magic is shown below. Note that the ufuncs structure is copied by
- sv_magic, so you can safely allocate it on the stack.
-
- void
- Umagic(sv)
- SV *sv;
- PREINIT:
- struct ufuncs uf;
- CODE:
- uf.uf_val = &my_get_fn;
- uf.uf_set = &my_set_fn;
- uf.uf_index = 0;
- sv_magic(sv, 0, 'U', (char*)&uf, sizeof(uf));
-
- Note that because multiple extensions may be using '~' or 'U' magic,
- it is important for extensions to take extra care to avoid conflict.
- Typically only using the magic on objects blessed into the same class
- as the extension is sufficient. For '~' magic, it may also be
- appropriate to add an I32 'signature' at the top of the private data
- area and check that.
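-
- The 'signature' idea can be sketched in plain C as follows. This is an
- illustration under stated assumptions rather than code from any real
- extension: the C<ToyPrivate> layout, the C<TOY_EXT_SIGNATURE> value and
- both C<toy_private_*> functions are hypothetical, and C<int32_t> stands in
- for Perl's C<I32>.
-
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Arbitrary, hypothetical tag identifying this extension's data. */
#define TOY_EXT_SIGNATURE ((int32_t)0x4D795478)

typedef struct {
    int32_t signature;  /* must stay the first member */
    int     handle;     /* example private payload */
} ToyPrivate;

/* Stamp the signature and fill in the payload. */
void toy_private_init(ToyPrivate *p, int handle)
{
    assert(p != NULL);
    p->signature = TOY_EXT_SIGNATURE;
    p->handle = handle;
}

/*
 * Given the pointer and length found on a magic entry, return the
 * private data if it looks like ours, or NULL if it belongs to
 * someone else.
 */
ToyPrivate *toy_private_check(void *ptr, size_t len)
{
    int32_t sig;

    if (ptr == NULL || len != sizeof(ToyPrivate))
        return NULL;
    /* Read only the leading signature before trusting the rest. */
    memcpy(&sig, ptr, sizeof sig);
    return sig == TOY_EXT_SIGNATURE ? (ToyPrivate *)ptr : NULL;
}
```
-
- Checking the length as well as the signature is a cheap extra guard in
- this sketch; a real extension would pick whatever checks suit its data.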
-
- Also note that the C<sv_set*()> and C<sv_cat*()> functions described
- earlier do B<not> invoke 'set' magic on their targets. This must
- be done by the user either by calling the C<SvSETMAGIC()> macro after
- calling these functions, or by using one of the C<sv_set*_mg()> or
- C<sv_cat*_mg()> functions. Similarly, generic C code must call the
- C<SvGETMAGIC()> macro to invoke any 'get' magic if they use an SV
- obtained from external sources in functions that don't handle magic.
- See L<perlapi> for a description of these functions.
- For example, calls to the C<sv_cat*()> functions typically need to be
- followed by C<SvSETMAGIC()>, but they don't need a prior C<SvGETMAGIC()>
- since their implementation handles 'get' magic.
-
- =head2 Finding Magic
-
- MAGIC* mg_find(SV*, int type); /* Finds the magic pointer of that type */
-
- This routine returns a pointer to the C<MAGIC> structure stored in the SV.
- If the SV does not have that magical feature, C<NULL> is returned. Also,
- if the SV is not of type SVt_PVMG, Perl may core dump.
-
- int mg_copy(SV* sv, SV* nsv, const char* key, STRLEN klen);
-
- This routine checks to see what types of magic C<sv> has. If the mg_type
- field is an uppercase letter, then the mg_obj is copied to C<nsv>, but
- the mg_type field is changed to be the lowercase letter.
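-
- The uppercase-to-lowercase rule can be modelled as below. This is a toy
- sketch of the behaviour described in the paragraph above, not the real
- C<mg_copy>: C<ToyEntry> and C<toy_mg_copy> are hypothetical, and entries
- are kept in plain arrays instead of a linked list for brevity.
-
```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical, simplified magic entry: a type letter and an object. */
typedef struct {
    char  mg_type;
    void *mg_obj;
} ToyEntry;

/*
 * Copy container magic from src[0..n) into dst as element magic.
 * Only entries whose type is an uppercase letter are copied, and the
 * copy gets the lowercase letter.  dst must have room for n entries.
 * Returns the number of entries written.
 */
size_t toy_mg_copy(const ToyEntry *src, size_t n, ToyEntry *dst)
{
    size_t i;
    size_t copied = 0;

    assert(src != NULL && dst != NULL);
    for (i = 0; i < n; i++) {
        unsigned char t = (unsigned char)src[i].mg_type;
        if (isupper(t)) {
            dst[copied].mg_type = (char)tolower(t);
            dst[copied].mg_obj = src[i].mg_obj;
            copied++;
        }
    }
    return copied;
}
```
-
- With this model a 'P' entry on a container yields a 'p' entry on the
- element, while an entry such as '~' is not propagated.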
-
- =head2 Understanding the Magic of Tied Hashes and Arrays
-
- Tied hashes and arrays are magical beasts of the 'P' magic type.
-
- WARNING: As of the 5.004 release, proper usage of the array and hash
- access functions requires understanding a few caveats. Some
- of these caveats are actually considered bugs in the API, to be fixed
- in later releases, and are bracketed with [MAYCHANGE] below. If
- you find yourself actually applying such information in this section, be
- aware that the behavior may change in the future, umm, without warning.
-
- The perl tie function associates a variable with an object that implements
- the various GET, SET, etc. methods. To perform the equivalent of the perl
- tie function from an XSUB, you must mimic this behaviour. The code below
- carries out the necessary steps - firstly it creates a new hash, and then
- creates a second hash which it blesses into the class which will implement
- the tie methods. Lastly it ties the two hashes together, and returns a
- reference to the new tied hash. Note that the code below does NOT call the
- TIEHASH method in the MyTie class -
- see L<Calling Perl Routines from within C Programs> for details on how
- to do this.
-
- SV*
- mytie()
- PREINIT:
- HV *hash;
- HV *stash;
- SV *tie;
- CODE:
- hash = newHV();
- tie = newRV_noinc((SV*)newHV());
- stash = gv_stashpv("MyTie", TRUE);
- sv_bless(tie, stash);
- hv_magic(hash, (GV*)tie, 'P');
- RETVAL = newRV_noinc((SV*)hash);
- OUTPUT:
- RETVAL
-
- The C<av_store> function, when given a tied array argument, merely
- copies the magic of the array onto the value to be "stored", using
- C<mg_copy>. It may also return NULL, indicating that the value did not
- actually need to be stored in the array. [MAYCHANGE] After a call to
- C<av_store> on a tied array, the caller will usually need to call
- C<mg_set(val)> to actually invoke the perl level "STORE" method on the
- TIEARRAY object. If C<av_store> did return NULL, a call to
- C<SvREFCNT_dec(val)> will usually also be necessary to avoid a memory
- leak. [/MAYCHANGE]
-
- The previous paragraph is applicable verbatim to tied hash access using the
- C<hv_store> and C<hv_store_ent> functions as well.
-
- C<av_fetch> and the corresponding hash functions C<hv_fetch> and
- C<hv_fetch_ent> actually return an undefined mortal value whose magic
- has been initialized using C<mg_copy>. Note the value so returned does not
- need to be deallocated, as it is already mortal. [MAYCHANGE] But you will
- need to call C<mg_get()> on the returned value in order to actually invoke
- the perl level "FETCH" method on the underlying TIE object. Similarly,
- you may also call C<mg_set()> on the return value after possibly assigning
- a suitable value to it using C<sv_setsv>, which will invoke the "STORE"
- method on the TIE object. [/MAYCHANGE]
-
- [MAYCHANGE]
- In other words, the array or hash fetch/store functions don't really
- fetch and store actual values in the case of tied arrays and hashes. They
- merely call C<mg_copy> to attach magic to the values that were meant to be
- "stored" or "fetched". Later calls to C<mg_get> and C<mg_set> actually
- do the job of invoking the TIE methods on the underlying objects. Thus
- the magic mechanism currently implements a kind of lazy access to arrays
- and hashes.
-
- Currently (as of perl version 5.004), use of the hash and array access
- functions requires the user to be aware of whether they are operating on
- "normal" hashes and arrays, or on their tied variants. The API may be
- changed to provide more transparent access to both tied and normal data
- types in future versions.
- [/MAYCHANGE]
-
- You would do well to understand that the TIEARRAY and TIEHASH interfaces
- are mere sugar to invoke some perl method calls while using the uniform hash
- and array syntax. The use of this sugar imposes some overhead (typically
- about two to four extra opcodes per FETCH/STORE operation, in addition to
- the creation of all the mortal variables required to invoke the methods).
- This overhead will be comparatively small if the TIE methods are themselves
- substantial, but if they are only a few statements long, the overhead
- will not be insignificant.
-
- =head2 Localizing changes
-
- Perl has a very handy construction
-
- {
- local $var = 2;
- ...
- }
-
- This construction is I<approximately> equivalent to
-
- {
- my $oldvar = $var;
- $var = 2;
- ...
- $var = $oldvar;
- }
-
- The biggest difference is that the first construction would
- reinstate the initial value of $var, irrespective of how control exits
- the block: C<goto>, C<return>, C<die>/C<eval> etc. It is a little bit
- more efficient as well.
-
- There is a way to achieve a similar task from C via the Perl API: create a
- I<pseudo-block>, and arrange for some changes to be automatically
- undone at the end of it, either explicitly or via a non-local exit (via
- die()). A I<block>-like construct is created by a pair of
- C<ENTER>/C<LEAVE> macros (see L<perlcall/"Returning a Scalar">).
- Such a construct may be created specially for some important localized
- task, or an existing one (like the boundaries of an enclosing Perl
- subroutine/block, or an existing pair for freeing TMPs) may be
- used. (In the second case the overhead of additional localization should
- be almost negligible.) Note that any XSUB is automatically enclosed in
- an C<ENTER>/C<LEAVE> pair.
-
- Inside such a I<pseudo-block> the following services are available:
-
- =over 4
-
- =item C<SAVEINT(int i)>
-
- =item C<SAVEIV(IV i)>
-
- =item C<SAVEI32(I32 i)>
-
- =item C<SAVELONG(long i)>
-
- These macros arrange things to restore the value of integer variable
- C<i> at the end of the enclosing I<pseudo-block>.
-
- =item C<SAVESPTR(s)>
-
- =item C<SAVEPPTR(p)>
-
- These macros arrange things to restore the value of pointers C<s> and
- C<p>. C<s> must be a pointer of a type which survives conversion to
- C<SV*> and back, C<p> should be able to survive conversion to C<char*>
- and back.
-
- =item C<SAVEFREESV(SV *sv)>
-
- The refcount of C<sv> will be decremented at the end of the
- I<pseudo-block>. This is similar to C<sv_2mortal> in that it is also a
- mechanism for doing a delayed C<SvREFCNT_dec>. However, while C<sv_2mortal>
- extends the lifetime of C<sv> until the beginning of the next statement,
- C<SAVEFREESV> extends it until the end of the enclosing scope. These
- lifetimes can be wildly different.
-
- Also compare C<SAVEMORTALIZESV>.
-
- =item C<SAVEMORTALIZESV(SV *sv)>
-
- Just like C<SAVEFREESV>, but mortalizes C<sv> at the end of the current
- scope instead of decrementing its reference count. This usually has the
- effect of keeping C<sv> alive until the statement that called the currently
- live scope has finished executing.
-
- =item C<SAVEFREEOP(OP *op)>
-
- The C<OP *> is op_free()ed at the end of I<pseudo-block>.
-
- =item C<SAVEFREEPV(p)>
-
- The chunk of memory which is pointed to by C<p> is Safefree()ed at the
- end of I<pseudo-block>.
-
- =item C<SAVECLEARSV(SV *sv)>
-
- Clears a slot in the current scratchpad which corresponds to C<sv> at
- the end of I<pseudo-block>.
-
- =item C<SAVEDELETE(HV *hv, char *key, I32 length)>
-
- The key C<key> of C<hv> is deleted at the end of I<pseudo-block>. The
- string pointed to by C<key> is Safefree()ed. If one has a I<key> in
- short-lived storage, the corresponding string may be reallocated like
- this:
-
- SAVEDELETE(PL_defstash, savepv(tmpbuf), strlen(tmpbuf));
-
- =item C<SAVEDESTRUCTOR(DESTRUCTORFUNC_NOCONTEXT_t f, void *p)>
-
- At the end of I<pseudo-block> the function C<f> is called with the
- only argument C<p>.
-
- =item C<SAVEDESTRUCTOR_X(DESTRUCTORFUNC_t f, void *p)>
-
- At the end of I<pseudo-block> the function C<f> is called with the
- implicit context argument (if any), and C<p>.
-
- =item C<SAVESTACK_POS()>
-
- The current offset on the Perl internal stack (cf. C<SP>) is restored
- at the end of I<pseudo-block>.
-
- =back
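-
- The idea behind these macros can be modelled with a small save stack. The
- sketch below is a toy in plain C, not how the Perl core is written: every
- C<Toy*> and C<toy_*> name is hypothetical, only an integer save (in the
- spirit of C<SAVEINT>) and a destructor save (in the spirit of
- C<SAVEDESTRUCTOR>) are modelled, and the stack has a fixed size.
-
```c
#include <assert.h>
#include <stddef.h>

#define TOY_SAVE_MAX 64

typedef void (*toy_destructor_t)(void *);

typedef enum { TOY_SAVE_INT, TOY_SAVE_DESTRUCTOR } ToySaveKind;

typedef struct {
    ToySaveKind      kind;
    int             *int_ptr;   /* TOY_SAVE_INT: variable to restore */
    int              int_old;   /* TOY_SAVE_INT: value to put back */
    toy_destructor_t fn;        /* TOY_SAVE_DESTRUCTOR: callback */
    void            *arg;       /* TOY_SAVE_DESTRUCTOR: its argument */
} ToySaveEntry;

typedef struct {
    ToySaveEntry entries[TOY_SAVE_MAX];
    size_t       top;                   /* number of saved entries */
    size_t       scopes[TOY_SAVE_MAX];  /* value of top at each enter */
    size_t       depth;                 /* number of open pseudo-blocks */
} ToySaveStack;

void toy_save_init(ToySaveStack *ss)
{
    assert(ss != NULL);
    ss->top = 0;
    ss->depth = 0;
}

/* Open a pseudo-block by remembering where the save stack stands. */
void toy_enter(ToySaveStack *ss)
{
    assert(ss->depth < TOY_SAVE_MAX);
    ss->scopes[ss->depth++] = ss->top;
}

/* Arrange for *p to get its current value back at toy_leave. */
void toy_save_int(ToySaveStack *ss, int *p)
{
    ToySaveEntry *e;

    assert(ss->top < TOY_SAVE_MAX && p != NULL);
    e = &ss->entries[ss->top++];
    e->kind = TOY_SAVE_INT;
    e->int_ptr = p;
    e->int_old = *p;
}

/* Arrange for fn(arg) to be called at toy_leave. */
void toy_save_destructor(ToySaveStack *ss, toy_destructor_t fn, void *arg)
{
    ToySaveEntry *e;

    assert(ss->top < TOY_SAVE_MAX && fn != NULL);
    e = &ss->entries[ss->top++];
    e->kind = TOY_SAVE_DESTRUCTOR;
    e->fn = fn;
    e->arg = arg;
}

/* Close the innermost pseudo-block, undoing its saves newest first. */
void toy_leave(ToySaveStack *ss)
{
    size_t base;

    assert(ss->depth > 0);
    base = ss->scopes[--ss->depth];
    while (ss->top > base) {
        ToySaveEntry *e = &ss->entries[--ss->top];
        if (e->kind == TOY_SAVE_INT)
            *e->int_ptr = e->int_old;
        else
            e->fn(e->arg);
    }
}

/* Example destructor: sets the int it is handed to 1. */
void toy_set_flag(void *p)
{
    *(int *)p = 1;
}
```
-
- Undoing the entries newest first means that nested saves of the same
- variable unwind to the value each enclosing pseudo-block expects.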
-
- The following API list contains functions rather than macros, so one
- needs to provide pointers to the modifiable data explicitly (either C
- pointers, or Perlish C<GV *>s). Where the above macros take C<int>, a
- similar function takes C<int *>.
-
- =over 4
-
- =item C<SV* save_scalar(GV *gv)>
-
- Equivalent to Perl code C<local $gv>.
-
- =item C<AV* save_ary(GV *gv)>
-
- =item C<HV* save_hash(GV *gv)>
-
- Similar to C<save_scalar>, but localize C<@gv> and C<%gv>.
-
- =item C<void save_item(SV *item)>
-
- Duplicates the current value of C<SV>. On exit from the current
- C<ENTER>/C<LEAVE> I<pseudo-block>, the value of C<SV> is restored
- from the stored copy.
-
- =item C<void save_list(SV **sarg, I32 maxsarg)>
-
- A variant of C<save_item> which takes multiple arguments via an array
- C<sarg> of C<SV*> of length C<maxsarg>.
-
- =item C<SV* save_svref(SV **sptr)>
-
- Similar to C<save_scalar>, but will reinstate a C<SV *>.
-
- =item C<void save_aptr(AV **aptr)>
-
- =item C<void save_hptr(HV **hptr)>
-
- Similar to C<save_svref>, but localize C<AV *> and C<HV *>.
-
- =back
-
- The C<Alias> module implements localization of the basic types within the
- I<caller's scope>. People who are interested in how to localize things in
- the containing scope should take a look there too.
-
- =head1 Subroutines
-
- =head2 XSUBs and the Argument Stack
-
- The XSUB mechanism is a simple way for Perl programs to access C subroutines.
- An XSUB routine will have a stack that contains the arguments from the Perl
- program, and a way to map from the Perl data structures to a C equivalent.
-
- The stack arguments are accessible through the C<ST(n)> macro, which returns
- the C<n>'th stack argument. Argument 0 is the first argument passed in the
- Perl subroutine call. These arguments are C<SV*>, and can be used anywhere
- an C<SV*> is used.
-
- Most of the time, output from the C routine can be handled through use of
- the RETVAL and OUTPUT directives. However, there are some cases where the
- argument stack is not already long enough to handle all the return values.
- An example is the POSIX tzname() call, which takes no arguments, but returns
- two, the local time zone's standard and summer time abbreviations.
-
- To handle this …