- <HTML>
- <HEAD>
- <TITLE>Metaclasses in Python 1.5</TITLE>
- </HEAD>
- <BODY BGCOLOR="FFFFFF">
- <H1>Metaclasses in Python 1.5</H1>
- <H2>(A.k.a. The Killer Joke :-)</H2>
- <HR>
- (<i>Postscript:</i> reading this essay is probably not the best way to
- understand the metaclass hook described here. See a <A
- HREF="meta-vladimir.txt">message posted by Vladimir Marangozov</A>
- which may give a gentler introduction to the matter. You may also
- want to search Deja News for messages with "metaclass" in the subject
- posted to comp.lang.python in July and August 1998.)
- <HR>
- <P>In previous Python releases (and still in 1.5), there is something
- called the ``Don Beaudry hook'', after its inventor and champion.
- This allows C extensions to provide alternate class behavior, thereby
- allowing the Python class syntax to be used to define other class-like
- entities. Don Beaudry has used this in his infamous <A
- HREF="http://maigret.cog.brown.edu/pyutil/">MESS</A> package; Jim
- Fulton has used it in his <A
- HREF="http://www.digicool.com/releases/ExtensionClass/">Extension
- Classes</A> package. (It has also been referred to as the ``Don
- Beaudry <i>hack</i>,'' but that's a misnomer. There's nothing hackish
- about it -- in fact, it is rather elegant and deep, even though
- there's something dark to it.)
- <P>(On first reading, you may want to skip directly to the examples in
- the section "Writing Metaclasses in Python" below, unless you want
- your head to explode.)
- <P>
- <HR>
- <P>Documentation of the Don Beaudry hook has purposefully been kept
- minimal, since it is a feature of incredible power, and is easily
- abused. Basically, it checks whether the <b>type of the base
- class</b> is callable, and if so, it is called to create the new
- class.
- <P>Note the two indirection levels. Take a simple example:
- <PRE>
- class B:
-     pass
- class C(B):
-     pass
- </PRE>
- Take a look at the second class definition, and try to fathom ``the
- type of the base class is callable.''
- <P>(Types are not classes, by the way. See questions 4.2, 4.19 and in
- particular 6.22 in the <A
- HREF="http://www.python.org/cgi-bin/faqw.py" >Python FAQ</A>
- for more on this topic.)
- <P>
- <UL>
- <LI>The <b>base class</b> is B; this one's easy.<P>
- <LI>Since B is a class, its type is ``class''; so the <b>type of the
- base class</b> is the type ``class''. This is also known as
- types.ClassType, assuming the standard module <code>types</code> has
- been imported.<P>
- <LI>Now is the type ``class'' <b>callable</b>? No, because types (in
- core Python) are never callable. Classes are callable (calling a
- class creates a new instance) but types aren't.<P>
- </UL>
- <P>So our conclusion is that in our example, the type of the base
- class (of C) is not callable. So the Don Beaudry hook does not apply,
- and the default class creation mechanism is used (which is also used
- when there is no base class). In fact, the Don Beaudry hook never
- applies when using only core Python, since the type of a core object
- is never callable.
- <P>So what do Don and Jim do in order to use Don's hook? Write an
- extension that defines at least two new Python object types. The
- first would be the type for ``class-like'' objects usable as a base
- class, to trigger Don's hook. This type must be made callable.
- That's why we need a second type. Whether an object is callable
- depends on its type. So whether a type object is callable depends on
- <i>its</i> type, which is a <i>meta-type</i>. (In core Python there
- is only one meta-type, the type ``type'' (types.TypeType), which is
- the type of all type objects, even itself.) A new meta-type must
- be defined that makes the type of the class-like objects callable.
- (Normally, a third type would also be needed, the new ``instance''
- type, but this is not an absolute requirement -- the new class type
- could return an object of some existing type when invoked to create an
- instance.)
- <P>Still confused? Here's a simple device due to Don himself to
- explain metaclasses. Take a simple class definition; assume B is a
- special class that triggers Don's hook:
- <PRE>
- class C(B):
-     a = 1
-     b = 2
- </PRE>
- This can be thought of as equivalent to:
- <PRE>
- C = type(B)('C', (B,), {'a': 1, 'b': 2})
- </PRE>
- If that's too dense for you, here's the same thing written out using
- temporary variables:
- <PRE>
- creator = type(B) # The type of the base class
- name = 'C' # The name of the new class
- bases = (B,) # A tuple containing the base class(es)
- namespace = {'a': 1, 'b': 2} # The namespace of the class statement
- C = creator(name, bases, namespace)
- </PRE>
- This is analogous to what happens without the Don Beaudry hook, except
- that in that case the creator function is set to the default class
- creator.
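- <P>A minimal sketch of this equivalence, written for modern Python 3
- rather than the Python 1.5 this essay describes (an assumption of the
- illustration). There every class has a callable type, so with an
- ordinary base class B the expression type(B) is simply the default
- class creator:

```python
# Python 3 sketch: a class statement versus the explicit three-argument call.
class B:
    pass

# The class statement form.
class C(B):
    a = 1
    b = 2

# The explicit form: call the type of the base class with
# (name, bases, namespace).
creator = type(B)
C2 = creator('C', (B,), {'a': 1, 'b': 2})

print(C2.__name__, C2.__bases__ == C.__bases__, C2.a, C2.b)
```

- <P>Both forms yield a class named C with the same base and the same
- class variables; only the spelling differs.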
- <P>In either case, the creator is called with three arguments. The
- first one, <i>name</i>, is the name of the new class (as given at the
- top of the class statement). The <i>bases</i> argument is a tuple of
- base classes (a singleton tuple if there's only one base class, like
- the example). Finally, <i>namespace</i> is a dictionary containing
- the local variables collected during execution of the class statement.
- <P>Note that the contents of the namespace dictionary are simply
- whatever names were defined in the class statement. A little-known
- fact is that when Python executes a class statement, it enters a new
- local namespace, and all assignments and function definitions take
- place in this namespace. Thus, after executing the following class
- statement:
- <PRE>
- class C:
-     a = 1
-     def f(s): pass
- </PRE>
- the class namespace's contents would be {'a': 1, 'f': <function f
- ...>}.
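- <P>One way to look at this namespace is to intercept it on its way to
- the class creator. The following is a sketch for modern Python 3, not
- Python 1.5; the metaclass Recorder and the dictionary captured are
- hypothetical names introduced only for this illustration:

```python
# Python 3 sketch: capture the namespace dictionary handed to the class creator.
captured = {}

class Recorder(type):
    """Hypothetical metaclass that remembers what the class statement defined."""
    def __new__(meta, name, bases, namespace):
        captured[name] = dict(namespace)
        return super().__new__(meta, name, bases, namespace)

class C(metaclass=Recorder):
    a = 1
    def f(s): pass

# Modern Python adds bookkeeping entries such as __module__ and __qualname__;
# filter them out to see only the names assigned in the class body.
user_names = sorted(k for k in captured['C'] if not k.startswith('__'))
print(user_names)
```

- <P>Apart from the bookkeeping entries, the dictionary holds exactly
- the names bound in the class body: the variable a and the function f.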
- <P>But enough already about writing Python metaclasses in C; read the
- documentation of <A
- HREF="http://maigret.cog.brown.edu/pyutil/">MESS</A> or <A
- HREF="http://www.digicool.com/papers/ExtensionClass.html" >Extension
- Classes</A> for more information.
- <P>
- <HR>
- <H2>Writing Metaclasses in Python</H2>
- <P>In Python 1.5, the requirement to write a C extension in order to
- write metaclasses has been dropped (though you can still do
- it, of course). In addition to the check ``is the type of the base
- class callable,'' there's a check ``does the base class have a
- __class__ attribute.'' If so, it is assumed that the __class__
- attribute refers to a class.
- <P>Let's repeat our simple example from above:
- <PRE>
- class C(B):
-     a = 1
-     b = 2
- </PRE>
- Assuming B has a __class__ attribute, this translates into:
- <PRE>
- C = B.__class__('C', (B,), {'a': 1, 'b': 2})
- </PRE>
- This is exactly the same as before except that instead of type(B),
- B.__class__ is invoked. If you have read <A HREF=
- "http://www.python.org/cgi-bin/faqw.py?req=show&file=faq06.022.htp"
- >FAQ question 6.22</A> you will understand that while there is a big
- technical difference between type(B) and B.__class__, they play the
- same role at different abstraction levels. And perhaps at some point
- in the future they will really be the same thing (at which point you
- would be able to derive subclasses from built-in types).
- <P>At this point it may be worth mentioning that C.__class__ is the
- same object as B.__class__, i.e., C's metaclass is the same as B's
- metaclass. In other words, subclassing an existing class creates a
- new (meta)instance of the base class's metaclass.
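- <P>The same rule can be observed in modern Python 3, where a
- metaclass is named with the metaclass keyword instead of the 1.5
- hook, and where type(B) and B.__class__ are one and the same object
- for classes. A small sketch, with Meta as a hypothetical metaclass:

```python
# Python 3 sketch; Meta is a hypothetical metaclass for illustration.
class Meta(type):
    pass

class B(metaclass=Meta):
    pass

class C(B):          # no metaclass named here
    pass

# In Python 3, type(X) and X.__class__ are the same object for classes.
print(type(B) is B.__class__)
# Subclassing B creates a new instance of B's metaclass.
print(C.__class__ is B.__class__ is Meta)
```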
- <P>Going back to the example, the class B.__class__ is instantiated,
- passing its constructor the same three arguments that are passed to
- the default class constructor or to an extension's metaclass:
- <i>name</i>, <i>bases</i>, and <i>namespace</i>.
- <P>It is easy to be confused by what exactly happens when using a
- metaclass, because we lose the absolute distinction between classes
- and instances: a class is an instance of a metaclass (a
- ``metainstance''), but technically (i.e. in the eyes of the Python
- runtime system), the metaclass is just a class, and the metainstance
- is just an instance. At the end of the class statement, the metaclass
- whose metainstance is used as a base class is instantiated, yielding a
- second metainstance (of the same metaclass). This metainstance is
- then used as a (normal, non-meta) class; instantiation of the class
- means calling the metainstance, and this will return a real instance.
- And what class is that an instance of? Conceptually, it is of course
- an instance of our metainstance; but in most cases the Python runtime
- system will see it as an instance of a helper class used by the
- metaclass to implement its (non-meta) instances...
- <P>Hopefully an example will make things clearer. Let's presume we
- have a metaclass MetaClass1. Its helper class (for non-meta
- instances) is called HelperClass1. We now (manually) instantiate
- MetaClass1 once to get an empty special base class:
- <PRE>
- BaseClass1 = MetaClass1("BaseClass1", (), {})
- </PRE>
- We can now use BaseClass1 as a base class in a class statement:
- <PRE>
- class MySpecialClass(BaseClass1):
-     i = 1
-     def f(s): pass
- </PRE>
- At this point, MySpecialClass is defined; it is a metainstance of
- MetaClass1 just like BaseClass1, and in fact the expression
- ``BaseClass1.__class__ == MySpecialClass.__class__ == MetaClass1''
- yields true.
- <P>We are now ready to create instances of MySpecialClass. Let's
- assume that no constructor arguments are required:
- <PRE>
- x = MySpecialClass()
- y = MySpecialClass()
- print x.__class__, y.__class__
- </PRE>
- The print statement shows that x and y are instances of HelperClass1.
- How did this happen? MySpecialClass is an instance of MetaClass1
- (``meta'' is irrelevant here); when an instance is called, its
- __call__ method is invoked, and presumably the __call__ method defined
- by MetaClass1 returns an instance of HelperClass1.
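- <P>The essay never shows the bodies of MetaClass1 and HelperClass1,
- so the definitions below are assumptions, kept as small as possible.
- The sketch is written for modern Python 3, which still calls the
- type of the first base class with the same three arguments when that
- base is not itself a class:

```python
# Python 3 sketch. The bodies of MetaClass1 and HelperClass1 are assumptions;
# only their names and roles come from the text.
class HelperClass1:
    """Stands in for the (non-meta) instances."""
    def __init__(self, klass):
        self.klass = klass   # remember which metainstance created us

class MetaClass1:
    """An ordinary class used as a metaclass."""
    def __init__(self, name, bases, namespace):
        self.name = name
        self.bases = bases
        self.namespace = namespace
    def __call__(self):
        # Calling a metainstance yields a real instance.
        return HelperClass1(self)

BaseClass1 = MetaClass1("BaseClass1", (), {})

class MySpecialClass(BaseClass1):
    i = 1
    def f(s): pass

x = MySpecialClass()
y = MySpecialClass()
print(x.__class__, y.__class__)
```

- <P>Here x.klass refers back to MySpecialClass, which is how a helper
- instance can find the namespace of the class that conceptually owns
- it.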
- <P>Now let's see how we could use metaclasses -- what can we do
- with metaclasses that we can't easily do without them? Here's one
- idea: a metaclass could automatically insert trace calls for all
- method calls. Let's first develop a simplified example, without
- support for inheritance or other ``advanced'' Python features (we'll
- add those later).
- <PRE>
- import types
- class Tracing:
-     def __init__(self, name, bases, namespace):
-         """Create a new class."""
-         self.__name__ = name
-         self.__bases__ = bases
-         self.__namespace__ = namespace
-     def __call__(self):
-         """Create a new instance."""
-         return Instance(self)
- class Instance:
-     def __init__(self, klass):
-         self.__klass__ = klass
-     def __getattr__(self, name):
-         try:
-             value = self.__klass__.__namespace__[name]
-         except KeyError:
-             raise AttributeError, name
-         if type(value) is not types.FunctionType:
-             return value
-         return BoundMethod(value, self)
- class BoundMethod:
-     def __init__(self, function, instance):
-         self.function = function
-         self.instance = instance
-     def __call__(self, *args):
-         print "calling", self.function, "for", self.instance, "with", args
-         return apply(self.function, (self.instance,) + args)
- Trace = Tracing('Trace', (), {})
- class MyTracedClass(Trace):
-     def method1(self, a):
-         self.a = a
-     def method2(self):
-         return self.a
- aninstance = MyTracedClass()
- aninstance.method1(10)
- print "the answer is %d" % aninstance.method2()
- </PRE>
- Confused already? The intention is to read this from top down. The
- Tracing class is the metaclass we're defining. Its structure is
- really simple.
- <P>
- <UL>
- <LI>The __init__ method is invoked when a new Tracing instance is
- created, e.g. the definition of class MyTracedClass later in the
- example. It simply saves the class name, base classes and namespace
- as instance variables.<P>
- <LI>The __call__ method is invoked when a Tracing instance is called,
- e.g. the creation of aninstance later in the example. It returns an
- instance of the class Instance, which is defined next.<P>
- </UL>
- <P>The class Instance is the class used for all instances of classes
- built using the Tracing metaclass, e.g. aninstance. It has two
- methods:
- <P>
- <UL>
- <LI>The __init__ method is invoked from the Tracing.__call__ method
- above to initialize a new instance. It saves the class reference as
- an instance variable. It uses a funny name because the user's
- instance variables (e.g. self.a later in the example) live in the same
- namespace.<P>
- <LI>The __getattr__ method is invoked whenever the user code
- references an attribute of the instance that is not an instance
- variable (nor a class variable; but except for __init__ and
- __getattr__ there are no class variables). It will be called, for
- example, when aninstance.method1 is referenced in the example, with
- self set to aninstance and name set to the string "method1".<P>
- </UL>
- <P>The __getattr__ method looks the name up in the __namespace__
- dictionary. If it isn't found, it raises an AttributeError exception.
- (In a more realistic example, it would first have to look through the
- base classes as well.) If it is found, there are two possibilities:
- it's either a function or it isn't. If it's not a function, it is
- assumed to be a class variable, and its value is returned. If it's a
- function, we have to ``wrap'' it in an instance of yet another helper
- class, BoundMethod.
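- <P>The search through base classes mentioned in passing could look
- roughly like this. The sketch is in modern Python 3 and reuses the
- attribute names __namespace__ and __bases__ from the example; the
- function lookup and the stand-in class FakeClass are hypothetical,
- and the depth-first, left-to-right order is an assumption:

```python
# Python 3 sketch of a depth-first, left-to-right search through base classes.
class FakeClass:
    """Hypothetical stand-in for a Tracing metainstance."""
    def __init__(self, name, bases, namespace):
        self.__name__ = name
        self.__bases__ = bases
        self.__namespace__ = namespace

def lookup(klass, name):
    """Return the value bound to name in klass or its bases, else raise."""
    try:
        return klass.__namespace__[name]
    except KeyError:
        pass
    for base in klass.__bases__:
        try:
            return lookup(base, name)
        except AttributeError:
            continue
    raise AttributeError(name)

Base = FakeClass('Base', (), {'a': 1, 'b': 2})
Derived = FakeClass('Derived', (Base,), {'b': 20})
print(lookup(Derived, 'a'), lookup(Derived, 'b'))
```

- <P>A name defined in the derived namespace shadows the base class
- definition, and a name found nowhere still ends in AttributeError.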
- <P>The BoundMethod class is needed to implement a familiar feature:
- when a method is defined, it has an initial argument, self, which is
- automatically bound to the relevant instance when it is called. For
- example, aninstance.method1(10) is equivalent to method1(aninstance,
- 10). To execute this call, first a temporary BoundMethod
- instance is created with the following constructor call: temp =
- BoundMethod(method1, aninstance); then this instance is called as
- temp(10). After the call, the temporary instance is discarded.
- <P>
- <UL>
- <LI>The __init__ method is invoked for the constructor call
- BoundMethod(method1, aninstance). It simply saves away its
- arguments.<P>
- <LI>The __call__ method is invoked when the bound method instance is
- called, as in temp(10). It needs to call method1(aninstance, 10).
- However, even though self.function is now method1 and self.instance is
- aninstance, it can't call self.function(self.instance, args) directly,
- because it should work regardless of the number of arguments passed.
- (For simplicity, support for keyword arguments has been omitted.)<P>
- </UL>
- <P>In order to be able to support arbitrary argument lists, the
- __call__ method first constructs a new argument tuple. Conveniently,
- because of the notation *args in __call__'s own argument list, the
- arguments to __call__ (except for self) are placed in the tuple args.
- To construct the desired argument list, we concatenate a singleton
- tuple containing the instance with the args tuple: (self.instance,) +
- args. (Note the trailing comma used to construct the singleton
- tuple.) In our example, the resulting argument tuple is (aninstance,
- 10).
- <P>The intrinsic function apply() takes a function and an argument
- tuple and calls the function for it. In our example, we are calling
- apply(method1, (aninstance, 10)) which is equivalent to calling
- method1(aninstance, 10).
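- <P>In modern Python 3 the function apply() no longer exists; the
- same call is written with the star notation at the call site. A
- short sketch of the tuple construction and the call, using a plain
- object as a stand-in for the instance (an assumption made to keep
- the fragment self-contained):

```python
# Python 3 sketch: apply() no longer exists, so the call is spelled f(*args).
def method1(self, a):
    return (self, a)

aninstance = object()       # stand-in for the instance
args = (10,)                # what *args collected in __call__

full_args = (aninstance,) + args     # singleton tuple + remaining arguments
result = method1(*full_args)         # same call as method1(aninstance, 10)
print(result == method1(aninstance, 10))
```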
- <P>From here on, things should come together quite easily. The output
- of the example code is something like this:
- <PRE>
- calling <function method1 at ae8d8> for <Instance instance at 95ab0> with (10,)
- calling <function method2 at ae900> for <Instance instance at 95ab0> with ()
- the answer is 10
- </PRE>
- <P>That was about the shortest meaningful example that I could come up
- with. A real tracing metaclass (for example, <A
- HREF="#Trace">Trace.py</A> discussed below) needs to be more
- complicated in two dimensions.
- <P>First, it needs to support more advanced Python features such as
- class variables, inheritance, __init__ methods, and keyword arguments.
- <P>Second, it needs to provide a more flexible way to handle the
- actual tracing information; perhaps it should be possible to write
- your own tracing function that gets called, perhaps it should be
- possible to enable and disable tracing on a per-class or per-instance
- basis, and perhaps a filter so that only interesting calls are traced;
- it should also be able to trace the return value of the call (or the
- exception it raised if an error occurs). Even the Trace.py example
- doesn't support all these features yet.
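- <P>To give an idea of the second dimension, here is a sketch (in
- modern Python 3) of a bound-method wrapper that accepts keyword
- arguments, reports return values and exceptions, and sends its
- reports to a replaceable tracing function. The names TracedCall,
- trace and record are hypothetical and are not taken from Trace.py:

```python
# Python 3 sketch; TracedCall and its trace parameter are hypothetical names.
class TracedCall:
    """Wrap a function so calls, results, and exceptions are reported."""
    def __init__(self, function, instance, trace=print):
        self.function = function
        self.instance = instance
        self.trace = trace            # user-supplied tracing function
    def __call__(self, *args, **kwargs):
        name = self.function.__name__
        self.trace("call", name, args, kwargs)
        try:
            result = self.function(self.instance, *args, **kwargs)
        except Exception as exc:
            self.trace("raise", name, exc)
            raise
        self.trace("return", name, result)
        return result

events = []
def record(*event):
    """Collect trace events in a list instead of printing them."""
    events.append(event)

def scale(self, x, factor=2):
    return x * factor

bound = TracedCall(scale, None, trace=record)
bound(5, factor=3)
print(events)
```

- <P>Passing the tracing function as a parameter is one possible
- design; filtering or per-instance switches could be layered on top of
- it in the same way.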
- <P>
- <HR>
- <H1>Real-life Examples</H1>
- <P>Have a look at some very preliminary examples that I coded up to
- teach myself how to write metaclasses:
- <DL>
- <DT><A HREF="Enum.py">Enum.py</A>
- <DD>This (ab)uses the class syntax as an elegant way to define
- enumerated types. The resulting classes are never instantiated --
- rather, their class attributes are the enumerated values. For
- example:
- <PRE>
- class Color(Enum):
-     red = 1
-     green = 2
-     blue = 3
- print Color.red
- </PRE>
- will print the string ``Color.red'', while ``Color.red==1'' is true,
- and ``Color.red + 1'' raises a TypeError exception.
- <P>
- <DT><A NAME=Trace></A><A HREF="Trace.py">Trace.py</A>
- <DD>The resulting classes work much like standard
- classes, but by setting a special class or instance attribute
- __trace_output__ to point to a file, all calls to the class's methods
- are traced. It was a bit of a struggle to get this right. This
- should probably be redone using the generic metaclass below.
- <P>
- <DT><A HREF="Meta.py">Meta.py</A>
- <DD>A generic metaclass. This is an attempt at finding out how much
- standard class behavior can be mimicked by a metaclass. The
- preliminary answer appears to be that everything's fine as long as the
- class (or its clients) doesn't look at the instance's __class__
- attribute, nor at the class's __dict__ attribute. The use of
- __getattr__ internally makes the classic implementation of __getattr__
- hooks tough; we provide a similar hook _getattr_ instead.
- (__setattr__ and __delattr__ are not affected.)
- (XXX Hm. Could detect presence of __getattr__ and rename it.)
- <P>
- <DT><A HREF="Eiffel.py">Eiffel.py</A>
- <DD>Uses the above generic metaclass to implement Eiffel style
- pre-conditions and post-conditions.
- <P>
- <DT><A HREF="Synch.py">Synch.py</A>
- <DD>Uses the above generic metaclass to implement synchronized
- methods.
- <P>
- <DT><A HREF="Simple.py">Simple.py</A>
- <DD>The example module used above.
- <P>
- </DL>
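- <P>The behavior described for the Color example can be approximated
- in a few lines of modern Python 3. This is only a sketch of the idea:
- EnumMeta, EnumValue and the Enum base class below are hypothetical,
- and are related neither to the contents of Enum.py nor to the
- standard enum module:

```python
# Python 3 sketch; EnumMeta, EnumValue and Enum here are hypothetical
# illustrations, unrelated to the Enum.py file and to the standard enum module.
class EnumValue:
    """One enumerated value: knows its class name, its own name, its number."""
    def __init__(self, classname, name, value):
        self.classname = classname
        self.name = name
        self.value = value
    def __repr__(self):
        return "%s.%s" % (self.classname, self.name)
    def __eq__(self, other):
        if isinstance(other, EnumValue):
            other = other.value
        return self.value == other
    def __hash__(self):
        return hash(self.value)

class EnumMeta(type):
    """Replace integer class attributes by EnumValue objects."""
    def __new__(meta, name, bases, namespace):
        wrapped = dict(namespace)
        for key, value in namespace.items():
            if not key.startswith('_') and isinstance(value, int):
                wrapped[key] = EnumValue(name, key, value)
        return super().__new__(meta, name, bases, wrapped)

class Enum(metaclass=EnumMeta):
    pass

class Color(Enum):
    red = 1
    green = 2
    blue = 3

print(Color.red)            # uses EnumValue.__repr__
print(Color.red == 1)
try:
    Color.red + 1
except TypeError:
    print("arithmetic is not supported")
```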
- <P>A pattern seems to be emerging: almost all these uses of
- metaclasses (except for Enum, which is probably more cute than useful)
- work by placing wrappers around method calls. An obvious
- problem with that is that it's not easy to combine the features of
- different metaclasses, while this would actually be quite useful: for
- example, I wouldn't mind getting a trace from the test run of the
- Synch module, and it would be interesting to add preconditions to it
- as well. This needs more research. Perhaps a metaclass could be
- provided that allows stackable wrappers...
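- <P>One possible shape for such stackable wrappers, sketched in modern
- Python 3. Everything here is hypothetical: the metaclass Stackable,
- the class attribute wrappers that lists the wrappers to apply, and
- the two sample wrappers traced and checked:

```python
# Python 3 sketch; Stackable, wrappers, traced and checked are hypothetical.
import functools

log = []

def traced(func):
    """Record entry to and exit from the wrapped method."""
    @functools.wraps(func)
    def wrapper(self, *args):
        log.append("enter " + func.__name__)
        try:
            return func(self, *args)
        finally:
            log.append("leave " + func.__name__)
    return wrapper

def checked(func):
    """A toy precondition: no argument may be None."""
    @functools.wraps(func)
    def wrapper(self, *args):
        if any(a is None for a in args):
            raise ValueError("precondition failed")
        return func(self, *args)
    return wrapper

class Stackable(type):
    """Apply every wrapper listed in the class's `wrappers` attribute."""
    def __new__(meta, name, bases, namespace):
        wrapped = dict(namespace)
        for key, value in namespace.items():
            if callable(value) and not key.startswith('_'):
                for wrap in namespace.get('wrappers', ()):
                    value = wrap(value)
                wrapped[key] = value
        return super().__new__(meta, name, bases, wrapped)

class Account(metaclass=Stackable):
    wrappers = (checked, traced)     # innermost first
    def deposit(self, amount):
        return amount

print(Account().deposit(5), log)
```

- <P>Because each wrapper only sees a callable, the order in the
- wrappers tuple decides which feature runs outermost.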
- <P>
- <HR>
- <H2>Things You Could Do With Metaclasses</H2>
- <P>There are lots of things you could do with metaclasses. Most of
- these can also be done with creative use of __getattr__, but
- metaclasses make it easier to modify the attribute lookup behavior of
- classes. Here's a partial list.
- <P>
- <UL>
- <LI>Enforce different inheritance semantics, e.g. automatically call
- base class methods when a derived class overrides<P>
- <LI>Implement class methods (e.g. if the first argument is not named
- 'self')<P>
- <LI>Implement that each instance is initialized with <b>copies</b> of
- all class variables<P>
- <LI>Implement a different way to store instance variables (e.g. in a
- list kept outside the instance but indexed by the instance's id())<P>
- <LI>Automatically wrap or trap all or certain methods
- <UL>
- <LI>for tracing
- <LI>for precondition and postcondition checking
- <LI>for synchronized methods
- <LI>for automatic value caching
- </UL>
- <P>
- <LI>When an attribute is a parameterless function, call it on
- reference (to mimic it being an instance variable); same on assignment<P>
- <LI>Instrumentation: see how many times various attributes are used<P>
- <LI>Different semantics for __setattr__ and __getattr__ (e.g. disable
- them when they are being used recursively)<P>
- <LI>Abuse class syntax for other things<P>
- <LI>Experiment with automatic type checking<P>
- <LI>Delegation (or acquisition)<P>
- <LI>Dynamic inheritance patterns<P>
- <LI>Automatic caching of methods<P>
- </UL>
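- <P>As an illustration of one item from this list, initializing each
- instance with copies of all class variables, here is a sketch in
- modern Python 3. The metaclass CopyClassVars and the class Basket are
- hypothetical, and only variables defined directly in the class (not
- in its bases) are copied:

```python
# Python 3 sketch; CopyClassVars is a hypothetical metaclass.
import copy

class CopyClassVars(type):
    """Give every new instance its own deep copy of the class variables."""
    def __call__(cls, *args, **kwargs):
        instance = super().__call__(*args, **kwargs)
        for key, value in vars(cls).items():
            if not key.startswith('__') and not callable(value):
                instance.__dict__.setdefault(key, copy.deepcopy(value))
        return instance

class Basket(metaclass=CopyClassVars):
    items = []                # normally shared between all instances

a = Basket()
b = Basket()
a.items.append("apple")
print(a.items, b.items, Basket.items)
```

- <P>Without the metaclass, the append would be visible through every
- instance, since all of them would share the single list stored on
- the class.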
- <P>
- <HR>
- <H4>Credits</H4>
- <P>Many thanks to David Ascher and Donald Beaudry for their comments
- on an earlier draft of this paper. Also thanks to Matt Conway and Tommy
- Burnette for putting a seed for the idea of metaclasses in my
- mind, nearly three years ago, even though at the time my response was
- ``you can do that with __getattr__ hooks...'' :-)
- <P>
- <HR>
- </BODY>
- </HTML>