/Doc/whatsnew/2.3.rst
ReStructuredText | 2078 lines | 1595 code | 483 blank | 0 comment | 0 complexity | 8db9b7b78675d404b6b3341c06d9e9af MD5 | raw file
Possible License(s): 0BSD, BSD-3-Clause
- ****************************
- What's New in Python 2.3
- ****************************
- :Author: A.M. Kuchling
- .. |release| replace:: 1.01
- .. $Id: whatsnew23.tex 54631 2007-03-31 11:58:36Z georg.brandl $
- This article explains the new features in Python 2.3. Python 2.3 was released
- on July 29, 2003.
- The main themes for Python 2.3 are polishing some of the features added in 2.2,
- adding various small but useful enhancements to the core language, and expanding
- the standard library. The new object model introduced in the previous version
- has benefited from 18 months of bugfixes and from optimization efforts that have
- improved the performance of new-style classes. A few new built-in functions
- have been added such as :func:`sum` and :func:`enumerate`. The :keyword:`in`
- operator can now be used for substring searches (e.g. ``"ab" in "abc"`` returns
- :const:`True`).
- Some of the many new library features include Boolean, set, heap, and date/time
- data types, the ability to import modules from ZIP-format archives, metadata
- support for the long-awaited Python catalog, an updated version of IDLE, and
- modules for logging messages, wrapping text, parsing CSV files, processing
- command-line options, using BerkeleyDB databases... the list of new and
- enhanced modules is lengthy.
- This article doesn't attempt to provide a complete specification of the new
- features, but instead provides a convenient overview. For full details, you
- should refer to the documentation for Python 2.3, such as the Python Library
- Reference and the Python Reference Manual. If you want to understand the
- complete implementation and design rationale, refer to the PEP for a particular
- new feature.
- .. ======================================================================
- PEP 218: A Standard Set Datatype
- ================================
- The new :mod:`sets` module contains an implementation of a set datatype. The
- :class:`Set` class is for mutable sets, sets that can have members added and
- removed. The :class:`ImmutableSet` class is for sets that can't be modified,
- and instances of :class:`ImmutableSet` can therefore be used as dictionary keys.
- Sets are built on top of dictionaries, so the elements within a set must be
- hashable.
- Here's a simple example::
- >>> import sets
- >>> S = sets.Set([1,2,3])
- >>> S
- Set([1, 2, 3])
- >>> 1 in S
- True
- >>> 0 in S
- False
- >>> S.add(5)
- >>> S.remove(3)
- >>> S
- Set([1, 2, 5])
- >>>
- The union and intersection of sets can be computed with the :meth:`union` and
- :meth:`intersection` methods; an alternative notation uses the bitwise operators
- ``&`` and ``|``. Mutable sets also have in-place versions of these methods,
- :meth:`union_update` and :meth:`intersection_update`. ::
- >>> S1 = sets.Set([1,2,3])
- >>> S2 = sets.Set([4,5,6])
- >>> S1.union(S2)
- Set([1, 2, 3, 4, 5, 6])
- >>> S1 | S2 # Alternative notation
- Set([1, 2, 3, 4, 5, 6])
- >>> S1.intersection(S2)
- Set([])
- >>> S1 & S2 # Alternative notation
- Set([])
- >>> S1.union_update(S2)
- >>> S1
- Set([1, 2, 3, 4, 5, 6])
- >>>
- It's also possible to take the symmetric difference of two sets. This is the
- set of all elements in the union that aren't in the intersection. Another way
- of putting it is that the symmetric difference contains all elements that are in
- exactly one set. Again, there's an alternative notation (``^``), and an in-
- place version with the ungainly name :meth:`symmetric_difference_update`. ::
- >>> S1 = sets.Set([1,2,3,4])
- >>> S2 = sets.Set([3,4,5,6])
- >>> S1.symmetric_difference(S2)
- Set([1, 2, 5, 6])
- >>> S1 ^ S2
- Set([1, 2, 5, 6])
- >>>
- There are also :meth:`issubset` and :meth:`issuperset` methods for checking
- whether one set is a subset or superset of another::
- >>> S1 = sets.Set([1,2,3])
- >>> S2 = sets.Set([2,3])
- >>> S2.issubset(S1)
- True
- >>> S1.issubset(S2)
- False
- >>> S1.issuperset(S2)
- True
- >>>
- .. seealso::
- :pep:`218` - Adding a Built-In Set Object Type
- PEP written by Greg V. Wilson. Implemented by Greg V. Wilson, Alex Martelli, and
- GvR.
- .. ======================================================================
- .. _section-generators:
- PEP 255: Simple Generators
- ==========================
- In Python 2.2, generators were added as an optional feature, to be enabled by a
- ``from __future__ import generators`` directive. In 2.3 generators no longer
- need to be specially enabled, and are now always present; this means that
- :keyword:`yield` is now always a keyword. The rest of this section is a copy of
- the description of generators from the "What's New in Python 2.2" document; if
- you read it back when Python 2.2 came out, you can skip the rest of this
- section.
- You're doubtless familiar with how function calls work in Python or C. When you
- call a function, it gets a private namespace where its local variables are
- created. When the function reaches a :keyword:`return` statement, the local
- variables are destroyed and the resulting value is returned to the caller. A
- later call to the same function will get a fresh new set of local variables.
- But, what if the local variables weren't thrown away on exiting a function?
- What if you could later resume the function where it left off? This is what
- generators provide; they can be thought of as resumable functions.
- Here's the simplest example of a generator function::
- def generate_ints(N):
- for i in range(N):
- yield i
- A new keyword, :keyword:`yield`, was introduced for generators. Any function
- containing a :keyword:`yield` statement is a generator function; this is
- detected by Python's bytecode compiler which compiles the function specially as
- a result.
- When you call a generator function, it doesn't return a single value; instead it
- returns a generator object that supports the iterator protocol. On executing
- the :keyword:`yield` statement, the generator outputs the value of ``i``,
- similar to a :keyword:`return` statement. The big difference between
- :keyword:`yield` and a :keyword:`return` statement is that on reaching a
- :keyword:`yield` the generator's state of execution is suspended and local
- variables are preserved. On the next call to the generator's ``.next()``
- method, the function will resume executing immediately after the
- :keyword:`yield` statement. (For complicated reasons, the :keyword:`yield`
- statement isn't allowed inside the :keyword:`try` block of a :keyword:`try`...\
- :keyword:`finally` statement; read :pep:`255` for a full explanation of the
- interaction between :keyword:`yield` and exceptions.)
- Here's a sample usage of the :func:`generate_ints` generator::
- >>> gen = generate_ints(3)
- >>> gen
- <generator object at 0x8117f90>
- >>> gen.next()
- 0
- >>> gen.next()
- 1
- >>> gen.next()
- 2
- >>> gen.next()
- Traceback (most recent call last):
- File "stdin", line 1, in ?
- File "stdin", line 2, in generate_ints
- StopIteration
- You could equally write ``for i in generate_ints(5)``, or ``a,b,c =
- generate_ints(3)``.
- Inside a generator function, the :keyword:`return` statement can only be used
- without a value, and signals the end of the procession of values; afterwards the
- generator cannot return any further values. :keyword:`return` with a value, such
- as ``return 5``, is a syntax error inside a generator function. The end of the
- generator's results can also be indicated by raising :exc:`StopIteration`
- manually, or by just letting the flow of execution fall off the bottom of the
- function.
- You could achieve the effect of generators manually by writing your own class
- and storing all the local variables of the generator as instance variables. For
- example, returning a list of integers could be done by setting ``self.count`` to
- 0, and having the :meth:`next` method increment ``self.count`` and return it.
- However, for a moderately complicated generator, writing a corresponding class
- would be much messier. :file:`Lib/test/test_generators.py` contains a number of
- more interesting examples. The simplest one implements an in-order traversal of
- a tree using generators recursively. ::
- # A recursive generator that generates Tree leaves in in-order.
- def inorder(t):
- if t:
- for x in inorder(t.left):
- yield x
- yield t.label
- for x in inorder(t.right):
- yield x
- Two other examples in :file:`Lib/test/test_generators.py` produce solutions for
- the N-Queens problem (placing $N$ queens on an $NxN$ chess board so that no
- queen threatens another) and the Knight's Tour (a route that takes a knight to
- every square of an $NxN$ chessboard without visiting any square twice).
- The idea of generators comes from other programming languages, especially Icon
- (http://www.cs.arizona.edu/icon/), where the idea of generators is central. In
- Icon, every expression and function call behaves like a generator. One example
- from "An Overview of the Icon Programming Language" at
- http://www.cs.arizona.edu/icon/docs/ipd266.htm gives an idea of what this looks
- like::
- sentence := "Store it in the neighboring harbor"
- if (i := find("or", sentence)) > 5 then write(i)
- In Icon the :func:`find` function returns the indexes at which the substring
- "or" is found: 3, 23, 33. In the :keyword:`if` statement, ``i`` is first
- assigned a value of 3, but 3 is less than 5, so the comparison fails, and Icon
- retries it with the second value of 23. 23 is greater than 5, so the comparison
- now succeeds, and the code prints the value 23 to the screen.
- Python doesn't go nearly as far as Icon in adopting generators as a central
- concept. Generators are considered part of the core Python language, but
- learning or using them isn't compulsory; if they don't solve any problems that
- you have, feel free to ignore them. One novel feature of Python's interface as
- compared to Icon's is that a generator's state is represented as a concrete
- object (the iterator) that can be passed around to other functions or stored in
- a data structure.
- .. seealso::
- :pep:`255` - Simple Generators
- Written by Neil Schemenauer, Tim Peters, Magnus Lie Hetland. Implemented mostly
- by Neil Schemenauer and Tim Peters, with other fixes from the Python Labs crew.
- .. ======================================================================
- .. _section-encodings:
- PEP 263: Source Code Encodings
- ==============================
- Python source files can now be declared as being in different character set
- encodings. Encodings are declared by including a specially formatted comment in
- the first or second line of the source file. For example, a UTF-8 file can be
- declared with::
- #!/usr/bin/env python
- # -*- coding: UTF-8 -*-
- Without such an encoding declaration, the default encoding used is 7-bit ASCII.
- Executing or importing modules that contain string literals with 8-bit
- characters and have no encoding declaration will result in a
- :exc:`DeprecationWarning` being signalled by Python 2.3; in 2.4 this will be a
- syntax error.
- The encoding declaration only affects Unicode string literals, which will be
- converted to Unicode using the specified encoding. Note that Python identifiers
- are still restricted to ASCII characters, so you can't have variable names that
- use characters outside of the usual alphanumerics.
- .. seealso::
- :pep:`263` - Defining Python Source Code Encodings
- Written by Marc-André Lemburg and Martin von Löwis; implemented by Suzuki Hisao
- and Martin von Löwis.
- .. ======================================================================
- PEP 273: Importing Modules from ZIP Archives
- ============================================
- The new :mod:`zipimport` module adds support for importing modules from a ZIP-
- format archive. You don't need to import the module explicitly; it will be
- automatically imported if a ZIP archive's filename is added to ``sys.path``.
- For example::
- amk@nyman:~/src/python$ unzip -l /tmp/example.zip
- Archive: /tmp/example.zip
- Length Date Time Name
- -------- ---- ---- ----
- 8467 11-26-02 22:30 jwzthreading.py
- -------- -------
- 8467 1 file
- amk@nyman:~/src/python$ ./python
- Python 2.3 (#1, Aug 1 2003, 19:54:32)
- >>> import sys
- >>> sys.path.insert(0, '/tmp/example.zip') # Add .zip file to front of path
- >>> import jwzthreading
- >>> jwzthreading.__file__
- '/tmp/example.zip/jwzthreading.py'
- >>>
- An entry in ``sys.path`` can now be the filename of a ZIP archive. The ZIP
- archive can contain any kind of files, but only files named :file:`\*.py`,
- :file:`\*.pyc`, or :file:`\*.pyo` can be imported. If an archive only contains
- :file:`\*.py` files, Python will not attempt to modify the archive by adding the
- corresponding :file:`\*.pyc` file, meaning that if a ZIP archive doesn't contain
- :file:`\*.pyc` files, importing may be rather slow.
- A path within the archive can also be specified to only import from a
- subdirectory; for example, the path :file:`/tmp/example.zip/lib/` would only
- import from the :file:`lib/` subdirectory within the archive.
- .. seealso::
- :pep:`273` - Import Modules from Zip Archives
- Written by James C. Ahlstrom, who also provided an implementation. Python 2.3
- follows the specification in :pep:`273`, but uses an implementation written by
- Just van Rossum that uses the import hooks described in :pep:`302`. See section
- :ref:`section-pep302` for a description of the new import hooks.
- .. ======================================================================
- PEP 277: Unicode file name support for Windows NT
- =================================================
- On Windows NT, 2000, and XP, the system stores file names as Unicode strings.
- Traditionally, Python has represented file names as byte strings, which is
- inadequate because it renders some file names inaccessible.
- Python now allows using arbitrary Unicode strings (within the limitations of the
- file system) for all functions that expect file names, most notably the
- :func:`open` built-in function. If a Unicode string is passed to
- :func:`os.listdir`, Python now returns a list of Unicode strings. A new
- function, :func:`os.getcwdu`, returns the current directory as a Unicode string.
- Byte strings still work as file names, and on Windows Python will transparently
- convert them to Unicode using the ``mbcs`` encoding.
- Other systems also allow Unicode strings as file names but convert them to byte
- strings before passing them to the system, which can cause a :exc:`UnicodeError`
- to be raised. Applications can test whether arbitrary Unicode strings are
- supported as file names by checking :attr:`os.path.supports_unicode_filenames`,
- a Boolean value.
- Under MacOS, :func:`os.listdir` may now return Unicode filenames.
- .. seealso::
- :pep:`277` - Unicode file name support for Windows NT
- Written by Neil Hodgson; implemented by Neil Hodgson, Martin von Löwis, and Mark
- Hammond.
- .. ======================================================================
- PEP 278: Universal Newline Support
- ==================================
- The three major operating systems used today are Microsoft Windows, Apple's
- Macintosh OS, and the various Unix derivatives. A minor irritation of cross-
- platform work is that these three platforms all use different characters to
- mark the ends of lines in text files. Unix uses the linefeed (ASCII character
- 10), MacOS uses the carriage return (ASCII character 13), and Windows uses a
- two-character sequence of a carriage return plus a newline.
- Python's file objects can now support end of line conventions other than the one
- followed by the platform on which Python is running. Opening a file with the
- mode ``'U'`` or ``'rU'`` will open a file for reading in universal newline mode.
- All three line ending conventions will be translated to a ``'\n'`` in the
- strings returned by the various file methods such as :meth:`read` and
- :meth:`readline`.
- Universal newline support is also used when importing modules and when executing
- a file with the :func:`execfile` function. This means that Python modules can
- be shared between all three operating systems without needing to convert the
- line-endings.
- This feature can be disabled when compiling Python by specifying the
- :option:`--without-universal-newlines` switch when running Python's
- :program:`configure` script.
- .. seealso::
- :pep:`278` - Universal Newline Support
- Written and implemented by Jack Jansen.
- .. ======================================================================
- .. _section-enumerate:
- PEP 279: enumerate()
- ====================
- A new built-in function, :func:`enumerate`, will make certain loops a bit
- clearer. ``enumerate(thing)``, where *thing* is either an iterator or a
- sequence, returns a iterator that will return ``(0, thing[0])``, ``(1,
- thing[1])``, ``(2, thing[2])``, and so forth.
- A common idiom to change every element of a list looks like this::
- for i in range(len(L)):
- item = L[i]
- # ... compute some result based on item ...
- L[i] = result
- This can be rewritten using :func:`enumerate` as::
- for i, item in enumerate(L):
- # ... compute some result based on item ...
- L[i] = result
- .. seealso::
- :pep:`279` - The enumerate() built-in function
- Written and implemented by Raymond D. Hettinger.
- .. ======================================================================
- PEP 282: The logging Package
- ============================
- A standard package for writing logs, :mod:`logging`, has been added to Python
- 2.3. It provides a powerful and flexible mechanism for generating logging
- output which can then be filtered and processed in various ways. A
- configuration file written in a standard format can be used to control the
- logging behavior of a program. Python includes handlers that will write log
- records to standard error or to a file or socket, send them to the system log,
- or even e-mail them to a particular address; of course, it's also possible to
- write your own handler classes.
- The :class:`Logger` class is the primary class. Most application code will deal
- with one or more :class:`Logger` objects, each one used by a particular
- subsystem of the application. Each :class:`Logger` is identified by a name, and
- names are organized into a hierarchy using ``.`` as the component separator.
- For example, you might have :class:`Logger` instances named ``server``,
- ``server.auth`` and ``server.network``. The latter two instances are below
- ``server`` in the hierarchy. This means that if you turn up the verbosity for
- ``server`` or direct ``server`` messages to a different handler, the changes
- will also apply to records logged to ``server.auth`` and ``server.network``.
- There's also a root :class:`Logger` that's the parent of all other loggers.
- For simple uses, the :mod:`logging` package contains some convenience functions
- that always use the root log::
- import logging
- logging.debug('Debugging information')
- logging.info('Informational message')
- logging.warning('Warning:config file %s not found', 'server.conf')
- logging.error('Error occurred')
- logging.critical('Critical error -- shutting down')
- This produces the following output::
- WARNING:root:Warning:config file server.conf not found
- ERROR:root:Error occurred
- CRITICAL:root:Critical error -- shutting down
- In the default configuration, informational and debugging messages are
- suppressed and the output is sent to standard error. You can enable the display
- of informational and debugging messages by calling the :meth:`setLevel` method
- on the root logger.
- Notice the :func:`warning` call's use of string formatting operators; all of the
- functions for logging messages take the arguments ``(msg, arg1, arg2, ...)`` and
- log the string resulting from ``msg % (arg1, arg2, ...)``.
- There's also an :func:`exception` function that records the most recent
- traceback. Any of the other functions will also record the traceback if you
- specify a true value for the keyword argument *exc_info*. ::
- def f():
- try: 1/0
- except: logging.exception('Problem recorded')
- f()
- This produces the following output::
- ERROR:root:Problem recorded
- Traceback (most recent call last):
- File "t.py", line 6, in f
- 1/0
- ZeroDivisionError: integer division or modulo by zero
- Slightly more advanced programs will use a logger other than the root logger.
- The :func:`getLogger(name)` function is used to get a particular log, creating
- it if it doesn't exist yet. :func:`getLogger(None)` returns the root logger. ::
- log = logging.getLogger('server')
- ...
- log.info('Listening on port %i', port)
- ...
- log.critical('Disk full')
- ...
- Log records are usually propagated up the hierarchy, so a message logged to
- ``server.auth`` is also seen by ``server`` and ``root``, but a :class:`Logger`
- can prevent this by setting its :attr:`propagate` attribute to :const:`False`.
- There are more classes provided by the :mod:`logging` package that can be
- customized. When a :class:`Logger` instance is told to log a message, it
- creates a :class:`LogRecord` instance that is sent to any number of different
- :class:`Handler` instances. Loggers and handlers can also have an attached list
- of filters, and each filter can cause the :class:`LogRecord` to be ignored or
- can modify the record before passing it along. When they're finally output,
- :class:`LogRecord` instances are converted to text by a :class:`Formatter`
- class. All of these classes can be replaced by your own specially-written
- classes.
- With all of these features the :mod:`logging` package should provide enough
- flexibility for even the most complicated applications. This is only an
- incomplete overview of its features, so please see the package's reference
- documentation for all of the details. Reading :pep:`282` will also be helpful.
- .. seealso::
- :pep:`282` - A Logging System
- Written by Vinay Sajip and Trent Mick; implemented by Vinay Sajip.
- .. ======================================================================
- .. _section-bool:
- PEP 285: A Boolean Type
- =======================
- A Boolean type was added to Python 2.3. Two new constants were added to the
- :mod:`__builtin__` module, :const:`True` and :const:`False`. (:const:`True` and
- :const:`False` constants were added to the built-ins in Python 2.2.1, but the
- 2.2.1 versions are simply set to integer values of 1 and 0 and aren't a
- different type.)
- The type object for this new type is named :class:`bool`; the constructor for it
- takes any Python value and converts it to :const:`True` or :const:`False`. ::
- >>> bool(1)
- True
- >>> bool(0)
- False
- >>> bool([])
- False
- >>> bool( (1,) )
- True
- Most of the standard library modules and built-in functions have been changed to
- return Booleans. ::
- >>> obj = []
- >>> hasattr(obj, 'append')
- True
- >>> isinstance(obj, list)
- True
- >>> isinstance(obj, tuple)
- False
- Python's Booleans were added with the primary goal of making code clearer. For
- example, if you're reading a function and encounter the statement ``return 1``,
- you might wonder whether the ``1`` represents a Boolean truth value, an index,
- or a coefficient that multiplies some other quantity. If the statement is
- ``return True``, however, the meaning of the return value is quite clear.
- Python's Booleans were *not* added for the sake of strict type-checking. A very
- strict language such as Pascal would also prevent you performing arithmetic with
- Booleans, and would require that the expression in an :keyword:`if` statement
- always evaluate to a Boolean result. Python is not this strict and never will
- be, as :pep:`285` explicitly says. This means you can still use any expression
- in an :keyword:`if` statement, even ones that evaluate to a list or tuple or
- some random object. The Boolean type is a subclass of the :class:`int` class so
- that arithmetic using a Boolean still works. ::
- >>> True + 1
- 2
- >>> False + 1
- 1
- >>> False * 75
- 0
- >>> True * 75
- 75
- To sum up :const:`True` and :const:`False` in a sentence: they're alternative
- ways to spell the integer values 1 and 0, with the single difference that
- :func:`str` and :func:`repr` return the strings ``'True'`` and ``'False'``
- instead of ``'1'`` and ``'0'``.
- .. seealso::
- :pep:`285` - Adding a bool type
- Written and implemented by GvR.
- .. ======================================================================
- PEP 293: Codec Error Handling Callbacks
- =======================================
- When encoding a Unicode string into a byte string, unencodable characters may be
- encountered. So far, Python has allowed specifying the error processing as
- either "strict" (raising :exc:`UnicodeError`), "ignore" (skipping the
- character), or "replace" (using a question mark in the output string), with
- "strict" being the default behavior. It may be desirable to specify alternative
- processing of such errors, such as inserting an XML character reference or HTML
- entity reference into the converted string.
- Python now has a flexible framework to add different processing strategies. New
- error handlers can be added with :func:`codecs.register_error`, and codecs then
- can access the error handler with :func:`codecs.lookup_error`. An equivalent C
- API has been added for codecs written in C. The error handler gets the necessary
- state information such as the string being converted, the position in the string
- where the error was detected, and the target encoding. The handler can then
- either raise an exception or return a replacement string.
- Two additional error handlers have been implemented using this framework:
- "backslashreplace" uses Python backslash quoting to represent unencodable
- characters and "xmlcharrefreplace" emits XML character references.
- .. seealso::
- :pep:`293` - Codec Error Handling Callbacks
- Written and implemented by Walter Dörwald.
- .. ======================================================================
- .. _section-pep301:
- PEP 301: Package Index and Metadata for Distutils
- =================================================
- Support for the long-requested Python catalog makes its first appearance in 2.3.
- The heart of the catalog is the new Distutils :command:`register` command.
- Running ``python setup.py register`` will collect the metadata describing a
- package, such as its name, version, maintainer, description, &c., and send it to
- a central catalog server. The resulting catalog is available from
- http://www.python.org/pypi.
- To make the catalog a bit more useful, a new optional *classifiers* keyword
- argument has been added to the Distutils :func:`setup` function. A list of
- `Trove <http://catb.org/~esr/trove/>`_-style strings can be supplied to help
- classify the software.
- Here's an example :file:`setup.py` with classifiers, written to be compatible
- with older versions of the Distutils::
- from distutils import core
- kw = {'name': "Quixote",
- 'version': "0.5.1",
- 'description': "A highly Pythonic Web application framework",
- # ...
- }
- if (hasattr(core, 'setup_keywords') and
- 'classifiers' in core.setup_keywords):
- kw['classifiers'] = \
- ['Topic :: Internet :: WWW/HTTP :: Dynamic Content',
- 'Environment :: No Input/Output (Daemon)',
- 'Intended Audience :: Developers'],
- core.setup(**kw)
- The full list of classifiers can be obtained by running ``python setup.py
- register --list-classifiers``.
- .. seealso::
- :pep:`301` - Package Index and Metadata for Distutils
- Written and implemented by Richard Jones.
- .. ======================================================================
- .. _section-pep302:
- PEP 302: New Import Hooks
- =========================
- While it's been possible to write custom import hooks ever since the
- :mod:`ihooks` module was introduced in Python 1.3, no one has ever been really
- happy with it because writing new import hooks is difficult and messy. There
- have been various proposed alternatives such as the :mod:`imputil` and :mod:`iu`
- modules, but none of them has ever gained much acceptance, and none of them were
- easily usable from C code.
- :pep:`302` borrows ideas from its predecessors, especially from Gordon
- McMillan's :mod:`iu` module. Three new items are added to the :mod:`sys`
- module:
- * ``sys.path_hooks`` is a list of callable objects; most often they'll be
- classes. Each callable takes a string containing a path and either returns an
- importer object that will handle imports from this path or raises an
- :exc:`ImportError` exception if it can't handle this path.
- * ``sys.path_importer_cache`` caches importer objects for each path, so
- ``sys.path_hooks`` will only need to be traversed once for each path.
- * ``sys.meta_path`` is a list of importer objects that will be traversed before
- ``sys.path`` is checked. This list is initially empty, but user code can add
- objects to it. Additional built-in and frozen modules can be imported by an
- object added to this list.
- Importer objects must have a single method, :meth:`find_module(fullname,
- path=None)`. *fullname* will be a module or package name, e.g. ``string`` or
- ``distutils.core``. :meth:`find_module` must return a loader object that has a
- single method, :meth:`load_module(fullname)`, that creates and returns the
- corresponding module object.
- Pseudo-code for Python's new import logic, therefore, looks something like this
- (simplified a bit; see :pep:`302` for the full details)::
- for mp in sys.meta_path:
- loader = mp(fullname)
- if loader is not None:
- <module> = loader.load_module(fullname)
- for path in sys.path:
- for hook in sys.path_hooks:
- try:
- importer = hook(path)
- except ImportError:
- # ImportError, so try the other path hooks
- pass
- else:
- loader = importer.find_module(fullname)
- <module> = loader.load_module(fullname)
- # Not found!
- raise ImportError
- .. seealso::
- :pep:`302` - New Import Hooks
- Written by Just van Rossum and Paul Moore. Implemented by Just van Rossum.
- .. ======================================================================
- .. _section-pep305:
- PEP 305: Comma-separated Files
- ==============================
- Comma-separated files are a format frequently used for exporting data from
- databases and spreadsheets. Python 2.3 adds a parser for comma-separated files.
- Comma-separated format is deceptively simple at first glance::
- Costs,150,200,3.95
- Read a line and call ``line.split(',')``: what could be simpler? But toss in
- string data that can contain commas, and things get more complicated::
- "Costs",150,200,3.95,"Includes taxes, shipping, and sundry items"
- A big ugly regular expression can parse this, but using the new :mod:`csv`
- package is much simpler::
- import csv
- input = open('datafile', 'rb')
- reader = csv.reader(input)
- for line in reader:
- print line
- The :func:`reader` function takes a number of different options. The field
- separator isn't limited to the comma and can be changed to any character, and so
- can the quoting and line-ending characters.
- Different dialects of comma-separated files can be defined and registered;
- currently there are two dialects, both used by Microsoft Excel. A separate
- :class:`csv.writer` class will generate comma-separated files from a succession
- of tuples or lists, quoting strings that contain the delimiter.
- .. seealso::
- :pep:`305` - CSV File API
- Written and implemented by Kevin Altis, Dave Cole, Andrew McNamara, Skip
- Montanaro, Cliff Wells.
- .. ======================================================================
- .. _section-pep307:
- PEP 307: Pickle Enhancements
- ============================
- The :mod:`pickle` and :mod:`cPickle` modules received some attention during the
- 2.3 development cycle. In 2.2, new-style classes could be pickled without
- difficulty, but they weren't pickled very compactly; :pep:`307` quotes a trivial
- example where a new-style class results in a pickled string three times longer
- than that for a classic class.
- The solution was to invent a new pickle protocol. The :func:`pickle.dumps`
- function has supported a text-or-binary flag for a long time. In 2.3, this
- flag is redefined from a Boolean to an integer: 0 is the old text-mode pickle
- format, 1 is the old binary format, and now 2 is a new 2.3-specific format. A
- new constant, :const:`pickle.HIGHEST_PROTOCOL`, can be used to select the
- fanciest protocol available.
- Unpickling is no longer considered a safe operation. 2.2's :mod:`pickle`
- provided hooks for trying to prevent unsafe classes from being unpickled
- (specifically, a :attr:`__safe_for_unpickling__` attribute), but none of this
- code was ever audited and therefore it's all been ripped out in 2.3. You should
- not unpickle untrusted data in any version of Python.
- To reduce the pickling overhead for new-style classes, a new interface for
- customizing pickling was added using three special methods:
- :meth:`__getstate__`, :meth:`__setstate__`, and :meth:`__getnewargs__`. Consult
- :pep:`307` for the full semantics of these methods.
- As a way to compress pickles yet further, it's now possible to use integer codes
- instead of long strings to identify pickled classes. The Python Software
- Foundation will maintain a list of standardized codes; there's also a range of
- codes for private use. Currently no codes have been specified.
- .. seealso::
- :pep:`307` - Extensions to the pickle protocol
- Written and implemented by Guido van Rossum and Tim Peters.
- .. ======================================================================
- .. _section-slices:
- Extended Slices
- ===============
- Ever since Python 1.4, the slicing syntax has supported an optional third "step"
- or "stride" argument. For example, these are all legal Python syntax:
- ``L[1:10:2]``, ``L[:-1:1]``, ``L[::-1]``. This was added to Python at the
- request of the developers of Numerical Python, which uses the third argument
- extensively. However, Python's built-in list, tuple, and string sequence types
- have never supported this feature, raising a :exc:`TypeError` if you tried it.
- Michael Hudson contributed a patch to fix this shortcoming.
- For example, you can now easily extract the elements of a list that have even
- indexes::
- >>> L = range(10)
- >>> L[::2]
- [0, 2, 4, 6, 8]
- Negative values also work to make a copy of the same list in reverse order::
- >>> L[::-1]
- [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
- This also works for tuples, arrays, and strings::
- >>> s='abcd'
- >>> s[::2]
- 'ac'
- >>> s[::-1]
- 'dcba'
- If you have a mutable sequence such as a list or an array you can assign to or
- delete an extended slice, but there are some differences between assignment to
- extended and regular slices. Assignment to a regular slice can be used to
- change the length of the sequence::
- >>> a = range(3)
- >>> a
- [0, 1, 2]
- >>> a[1:3] = [4, 5, 6]
- >>> a
- [0, 4, 5, 6]
- Extended slices aren't this flexible. When assigning to an extended slice, the
- list on the right hand side of the statement must contain the same number of
- items as the slice it is replacing::
- >>> a = range(4)
- >>> a
- [0, 1, 2, 3]
- >>> a[::2]
- [0, 2]
- >>> a[::2] = [0, -1]
- >>> a
- [0, 1, -1, 3]
- >>> a[::2] = [0,1,2]
- Traceback (most recent call last):
- File "<stdin>", line 1, in ?
- ValueError: attempt to assign sequence of size 3 to extended slice of size 2
- Deletion is more straightforward::
- >>> a = range(4)
- >>> a
- [0, 1, 2, 3]
- >>> a[::2]
- [0, 2]
- >>> del a[::2]
- >>> a
- [1, 3]
- One can also now pass slice objects to the :meth:`__getitem__` methods of the
- built-in sequences::
- >>> range(10).__getitem__(slice(0, 5, 2))
- [0, 2, 4]
- Or use slice objects directly in subscripts::
- >>> range(10)[slice(0, 5, 2)]
- [0, 2, 4]
- To simplify implementing sequences that support extended slicing, slice objects
- now have a method :meth:`indices(length)` which, given the length of a sequence,
- returns a ``(start, stop, step)`` tuple that can be passed directly to
- :func:`range`. :meth:`indices` handles omitted and out-of-bounds indices in a
- manner consistent with regular slices (and this innocuous phrase hides a welter
- of confusing details!). The method is intended to be used like this::
- class FakeSeq:
- ...
- def calc_item(self, i):
- ...
- def __getitem__(self, item):
- if isinstance(item, slice):
- indices = item.indices(len(self))
- return FakeSeq([self.calc_item(i) for i in range(*indices)])
- else:
- return self.calc_item(i)
- From this example you can also see that the built-in :class:`slice` object is
- now the type object for the slice type, and is no longer a function. This is
- consistent with Python 2.2, where :class:`int`, :class:`str`, etc., underwent
- the same change.
- .. ======================================================================
- Other Language Changes
- ======================
- Here are all of the changes that Python 2.3 makes to the core Python language.
- * The :keyword:`yield` statement is now always a keyword, as described in
- section :ref:`section-generators` of this document.
- * A new built-in function :func:`enumerate` was added, as described in section
- :ref:`section-enumerate` of this document.
- * Two new constants, :const:`True` and :const:`False` were added along with the
- built-in :class:`bool` type, as described in section :ref:`section-bool` of this
- document.
- * The :func:`int` type constructor will now return a long integer instead of
- raising an :exc:`OverflowError` when a string or floating-point number is too
- large to fit into an integer. This can lead to the paradoxical result that
- ``isinstance(int(expression), int)`` is false, but that seems unlikely to cause
- problems in practice.
- * Built-in types now support the extended slicing syntax, as described in
- section :ref:`section-slices` of this document.
- * A new built-in function, :func:`sum(iterable, start=0)`, adds up the numeric
- items in the iterable object and returns their sum. :func:`sum` only accepts
- numbers, meaning that you can't use it to concatenate a bunch of strings.
- (Contributed by Alex Martelli.)
- * ``list.insert(pos, value)`` used to insert *value* at the front of the list
- when *pos* was negative. The behaviour has now been changed to be consistent
- with slice indexing, so when *pos* is -1 the value will be inserted before the
- last element, and so forth.
- * ``list.index(value)``, which searches for *value* within the list and returns
- its index, now takes optional *start* and *stop* arguments to limit the search
- to only part of the list.
- * Dictionaries have a new method, :meth:`pop(key[, *default*])`, that returns
- the value corresponding to *key* and removes that key/value pair from the
- dictionary. If the requested key isn't present in the dictionary, *default* is
- returned if it's specified and :exc:`KeyError` raised if it isn't. ::
- >>> d = {1:2}
- >>> d
- {1: 2}
- >>> d.pop(4)
- Traceback (most recent call last):
- File "stdin", line 1, in ?
- KeyError: 4
- >>> d.pop(1)
- 2
- >>> d.pop(1)
- Traceback (most recent call last):
- File "stdin", line 1, in ?
- KeyError: 'pop(): dictionary is empty'
- >>> d
- {}
- >>>
- There's also a new class method, :meth:`dict.fromkeys(iterable, value)`, that
- creates a dictionary with keys taken from the supplied iterator *iterable* and
- all values set to *value*, defaulting to ``None``.
- (Patches contributed by Raymond Hettinger.)
- Also, the :func:`dict` constructor now accepts keyword arguments to simplify
- creating small dictionaries::
- >>> dict(red=1, blue=2, green=3, black=4)
- {'blue': 2, 'black': 4, 'green': 3, 'red': 1}
- (Contributed by Just van Rossum.)
- * The :keyword:`assert` statement no longer checks the ``__debug__`` flag, so
- you can no longer disable assertions by assigning to ``__debug__``. Running
- Python with the :option:`-O` switch will still generate code that doesn't
- execute any assertions.
- * Most type objects are now callable, so you can use them to create new objects
- such as functions, classes, and modules. (This means that the :mod:`new` module
- can be deprecated in a future Python version, because you can now use the type
- objects available in the :mod:`types` module.) For example, you can create a new
- module object with the following code:
- ::
- >>> import types
- >>> m = types.ModuleType('abc','docstring')
- >>> m
- <module 'abc' (built-in)>
- >>> m.__doc__
- 'docstring'
- * A new warning, :exc:`PendingDeprecationWarning` was added to indicate features
- which are in the process of being deprecated. The warning will *not* be printed
- by default. To check for use of features that will be deprecated in the future,
- supply :option:`-Walways::PendingDeprecationWarning::` on the command line or
- use :func:`warnings.filterwarnings`.
- * The process of deprecating string-based exceptions, as in ``raise "Error
- occurred"``, has begun. Raising a string will now trigger
- :exc:`PendingDeprecationWarning`.
- * Using ``None`` as a variable name will now result in a :exc:`SyntaxWarning`
- warning. In a future version of Python, ``None`` may finally become a keyword.
- * The :meth:`xreadlines` method of file objects, introduced in Python 2.1, is no
- longer necessary because files now behave as their own iterator.
- :meth:`xreadlines` was originally introduced as a faster way to loop over all
- the lines in a file, but now you can simply write ``for line in file_obj``.
- File objects also have a new read-only :attr:`encoding` attribute that gives the
- encoding used by the file; Unicode strings written to the file will be
- automatically converted to bytes using the given encoding.
- * The method resolution order used by new-style classes has changed, though
- you'll only notice the difference if you have a really complicated inheritance
- hierarchy. Classic classes are unaffected by this change. Python 2.2
- originally used a topological sort of a class's ancestors, but 2.3 now uses the
- C3 algorithm as described in the paper `"A Monotonic Superclass Linearization
- for Dylan" <http://www.webcom.com/haahr/dylan/linearization-oopsla96.html>`_. To
- understand the motivation for this change, read Michele Simionato's article
- `"Python 2.3 Method Resolution Order" <http://www.python.org/2.3/mro.html>`_, or
- read the thread on python-dev starting with the message at
- http://mail.python.org/pipermail/python-dev/2002-October/029035.html. Samuele
- Pedroni first pointed out the problem and also implemented the fix by coding the
- C3 algorithm.
- * Python runs multithreaded programs by switching between threads after
- executing N bytecodes. The default value for N has been increased from 10 to
- 100 bytecodes, speeding up single-threaded applications by reducing the
- switching overhead. Some multithreaded applications may suffer slower response
- time, but that's easily fixed by setting the limit back to a lower number using
- :func:`sys.setcheckinterval(N)`. The limit can be retrieved with the new
- :func:`sys.getcheckinterval` function.
- * One minor but far-reaching change is that the names of extension types defined
- by the modules included with Python now contain the module and a ``'.'`` in
- front of the type name. For example, in Python 2.2, if you created a socket and
- printed its :attr:`__class__`, you'd get this output::
- >>> s = socket.socket()
- >>> s.__class__
- <type 'socket'>
- In 2.3, you get this::
- >>> s.__class__
- <type '_socket.socket'>
- * One of the noted incompatibilities between old- and new-style classes has been
- removed: you can now assign to the :attr:`__name__` and :attr:`__bases__`
- attributes of new-style classes. There are some restrictions on what can be
- assigned to :attr:`__bases__` along the lines of those relating to assigning to
- an instance's :attr:`__class__` attribute.
- .. ======================================================================
- String Changes
- --------------
- * The :keyword:`in` operator now works differently for strings. Previously, when
- evaluating ``X in Y`` where *X* and *Y* are strings, *X* could only be a single
- character. That's now changed; *X* can be a string of any length, and ``X in Y``
- will return :const:`True` if *X* is a substring of *Y*. If *X* is the empty
- string, the result is always :const:`True`. ::
- >>> 'ab' in 'abcd'
- True
- >>> 'ad' in 'abcd'
- False
- >>> '' in 'abcd'
- True
- Note that this doesn't tell you where the substring starts; if you need that
- information, use the :meth:`find` string method.
- * The :meth:`strip`, :meth:`lstrip`, and :meth:`rstrip` string methods now have
- an optional argument for specifying the characters to strip. The default is
- still to remove all whitespace characters::
- >>> ' abc '.strip()
- 'abc'
- >>> '><><abc<><><>'.strip('<>')
- 'abc'
- >>> '><><abc<><><>\n'.strip('<>')
- 'abc<><><>\n'
- >>> u'\u4000\u4001abc\u4000'.strip(u'\u4000')
- u'\u4001abc'
- >>>
- (Suggested by Simon Brunning and implemented by Walter Dörwald.)
- * The :meth:`startswith` and :meth:`endswith` string methods now accept negative
- numbers for the *start* and *end* parameters.
- * Another new string method is :meth:`zfill`, originally a function in the
- :mod:`string` module. :meth:`zfill` pads a numeric string with zeros on the
- left until it's the specified width. Note that the ``%`` operator is still more
- flexible and powerful than :meth:`zfill`. ::
- >>> '45'.zfill(4)
- '0045'
- >>> '12345'.zfill(4)
- '12345'
- >>> 'goofy'.zfill(6)
- '0goofy'
- (Contributed by Walter Dörwald.)
- * A new type object, :class:`basestring`, has been added. Both 8-bit strings and
- Unicode strings inherit from this type, so ``isinstance(obj, basestring)`` will
- return :const:`True` for either kind of string. It's a completely abstract
- type, so you can't create :class:`basestring` instances.
- * Interned strings are no longer immortal and will now be garbage-collected in
- the usual way when the only reference to them is from the internal dictionary of
- interned strings. (Implemented by Oren Tirosh.)
- .. ======================================================================
- Optimizations
- -------------
- * The creation of new-style class instances has been made much faster; they're
- now faster than classic classes!
- * The :meth:`sort` method of list objects has been extensively rewritten by Tim
- Peters, and the implementation is significantly faster.
- * Multiplication of large long integers is now much faster thanks to an
- implementation of Karatsuba multiplication, an algorithm that scales better than
- the O(n\*n) required for the grade-school multiplication algorithm. (Original
- patch by Christopher A. Craig, and significantly reworked by Tim Peters.)
- * The ``SET_LINENO`` opcode is now gone. This may provide a small speed
- increase, depending on your compiler's idiosyncrasies. See section
- :ref:`section-other` for a longer explanation. (Removed by Michael Hudson.)
- * :func:`xrange` objects now have their own iterator, making ``for i in
- xrange(n)`` slightly faster than ``for i in range(n)``. (Patch by Raymond
- Hettinger.)
- * A number of small rearrangements have been made in various hotspots to improve
- performance, such as inlining a function or removing some code. (Implemented
- mostly by GvR, but lots of people have contributed single changes.)
- The net result of the 2.3 optimizations is that Python 2.3 runs the pystone
- benchmark around 25% faster than Python 2.2.
- .. ======================================================================
- New, Improved, and Deprecated Modules
- =====================================
- As usual, Python's standard library received a number of enhancements and bug
- fixes. Here's a partial list of the most notable changes, sorted alphabetically
- by module name. Consult the :file:`Misc/NEWS` file in the source tree for a more
- complete list of changes, or look through the CVS logs for all the details.
- * The :mod:`array` module now supports arrays of Unicode characters using the
- ``'u'`` format character. Arrays also now support using the ``+=`` assignment
- operator to add another array's contents, and the ``*=`` assignment operator to
- repeat an array. (Contributed by Jason Orendorff.)
- * The :mod:`bsddb` module has been replaced by version 4.1.6 of the `PyBSDDB
- <http://pybsddb.sourceforge.net>`_ package, providing a more complete interface
- to the transactional features of the BerkeleyDB library.
- The old version of the module has been renamed to :mod:`bsddb185` and is no
- longer built automatically; you'll have to edit :file:`Modules/Setup` to enable
- it. Note that the new :mod:`bsddb` package is intended to be compatible with
- the old module, so be sure to file bugs if you discover any incompatibilities.
- When upgrading to Python 2.3, if the new interpreter is compiled with a new
- version of the underlying BerkeleyDB library, you will almost certainly have to
- convert your database files to the new version. You can do this fairly easily
- with the new scripts :file:`db2pickle.py` and :file:`pickle2db.py` which you
- will find in the distribution's :file:`Tools/scripts` directory. If you've
- already been using the PyBSDDB package and importing it as :mod:`bsddb3`, you
- will have to change your ``import`` statements to import it as :mod:`bsddb`.
- * The new :mod:`bz2` module is an interface to the bz2 data compression library.
- bz2-compressed data is usually smaller than corresponding :mod:`zlib`\
- -compressed data. (Contributed by Gustavo Niemeyer.)
- * A set of standard date/time types has been added in the new :mod:`datetime`
- module. See the following section for more details.
- * The Distutils :class:`Extension` class now supports an extra constructor
- argument named *depends* for listing additional source files that an extension
- depends on. This lets Distutils recompile the module if any of the dependency
- files are modified. For example, if :file:`sampmodule.c` includes the header
- file :file:`sample.h`, you would create the :class:`Extension` object like
- this::
- ext = Extension("samp",
- sources=["sampmodule.c"],
- depends=["sample.h"])
- Modifying :file:`sample.h` would then cause the module to be recompiled.
- (Contributed by Jeremy Hylton.)
- * Other minor changes to Distutils: it now checks for the :envvar:`CC`,
- :envvar:`CFLAGS`, :envvar:`CPP`, :envvar:`LDFLAGS`, and :envvar:`CPPFLAGS`
- environment variables, using them to override the settings in Python's
- configuration (contributed by Robert Weber).
- * Previously the :mod:`doctest` module would only search the docstrings of
- public methods and functions for test cases, but it now also examines private
- ones as well. The :func:`DocTestSuite(` function creates a
- :class:`unittest.TestSuite` object from a set of :mod:`doctest` tests.
- * The new :func:`gc.get_referents(object)` function returns a list of all the
- objects referenced by *object*.
- * The :mod:`getopt` module gained a new function, :func:`gnu_getopt`, that
- supports the same arguments as the existing :func:`getopt` function but uses
- GNU-style scanning mode. The existing :func:`getopt` stops processing options as
- soon as a non-option argument is encountered, but in GNU-style mode processing
- continues, meaning that options and arguments can be mixed. For example::
- >>> getopt.getopt(['-f', 'filename', 'output', '-v'], 'f:v')
- ([('-f', 'filename')], ['output', '-v'])
- >>> getopt.gnu_getopt(['-f', 'filename', 'output', '-v'], 'f:v')
- ([('-f', 'filename'), ('-v', '')], ['output'])
- (Contributed by Peter Ă…strand.)
- * The :mod:`grp`, :mod:`pwd`, and :mod:`resource` modules now return enhanced
- tuples::
- >>> import grp
- >>> g = grp.getgrnam('amk')
- >>> g.gr_name, g.gr_gid
- ('amk', 500)
- * The :mod:`gzip` module can now handle files exceeding 2 GiB.
- * The new :mod:`heapq` module contains an implementation of a heap queue
- algorithm. A heap is an array-like data structure that keeps items in a
- partially sorted order such that, for every index *k*, ``heap[k] <=
- heap[2*k+1]`` and ``heap[k] <= heap[2*k+2]``. This makes it quick to remove the
- smallest item, and inserting a new item while maintaining the heap property is
- O(lg n). (See http://www.nist.gov/dads/HTML/priorityque.html for more
- information about the priority queue data structure.)
- The :mod:`heapq` module provides :func:`heappush` and :func:`heappop` functions
- for adding and removing items while maintaining the heap property on top of some
- other mutable Python sequence type. Here's an example that uses a Python list::
- >>> import heapq
- >>> heap = []
- >>> for item in [3, 7, 5, 11, 1]:
- ... heapq.heappush(heap, item)
- ...
- >>> heap
- [1, 3, 5, 11, 7]
- >>> heapq.heappop(heap)
- 1
- >>> heapq.heappop(heap)
- 3
- >>> heap
- [5, 7, 11]
- (Contributed by Kevin O'Connor.)
- * The IDLE integrated development environment has been updated using the code
- from the IDLEfork project (http://idlefork.sf.net). The most notable feature is
- that the code being developed is now executed in a subprocess, meaning that
- there's no longer any need for manual ``reload()`` operations. IDLE's core code
- has been incorporated into the standard library as the :mod:`idlelib` package.
- * The :mod:`imaplib` module now supports IMAP over SSL. (Contributed by Piers
- Lauder and Tino Lange.)
- * The :mod:`itertools` contains a number of useful functions for use with
- iterators, inspired by various functions provided by the ML and Haskell
- languages. For example, ``itertools.ifilter(predicate, iterator)`` returns all
- elements in the iterator for which the function :func:`predicate` returns
- :const:`True`, and ``itertools.repeat(obj, N)`` returns ``obj`` *N* times.
- There are a number of other functions in the module; see the package's reference
- documentation for details.
- (Contributed by Raymond Hettinger.)
- * Two new functions in the :mod:`math` module, :func:`degrees(rads)` and
- :func:`radians(degs)`, convert between radians and degrees. Other functions in
- the :mod:`math` module such as :func:`math.sin` and :func:`math.cos` have always
- required input values measured in radians. Also, an optional *base* argument
- was added to :func:`math.log` to make it easier to compute logarithms for bases
- other than ``e`` and ``10``. (Contributed by Raymond Hettinger.)
- * Several new POSIX functions (:func:`getpgid`, :func:`killpg`, :func:`lchown`,
- :func:`loadavg`, :func:`major`, :func:`makedev`, :func:`minor`, and
- :func:`mknod`) were added to the :mod:`posix` module that underlies the
- :mod:`os` module. (Contributed by Gustavo Niemeyer, Geert Jansen, and Denis S.
- Otkidach.)
- * In the :mod:`os` module, the :func:`\*stat` family of functions can now report
- fractions of a second in a timestamp. Such time stamps are represented as
- floats, similar to the value returned by :func:`time.time`.
- During testing, it was found that some applications will break if time stamps
- are floats. For compatibility, when using the tuple interface of the
- :class:`stat_result` time stamps will be represented as integers. When using
- named fields (a feature first introduced in Python 2.2), time stamps are still
- represented as integers, unless :func:`os.stat_float_times` is invoked to enable
- float return values::
- >>> os.stat("/tmp").st_mtime
- 1034791200
- >>> os.stat_float_times(True)
- >>> os.stat("/tmp").st_mtime
- 1034791200.6335014
- In Python 2.4, the default will change to always returning floats.
- Application developers should enable this feature only if all their libraries
- work properly when confronted with floating point time stamps, or if they use
- the tuple API. If used, the feature should be activated on an application level
- instead of trying to enable it on a per-use basis.
- * The :mod:`optparse` module contains a new parser for command-line arguments
- that can convert option values to a particular Python type and will
- automatically generate a usage message. See the following section for more
- details.
- * The old and never-documented :mod:`linuxaudiodev` module has been deprecated,
- and a new version named :mod:`ossaudiodev` has been added. The module was
- renamed because the OSS sound drivers can be used on platforms other than Linux,
- and the interface has also been tidied and brought up to date in various ways.
- (Contributed by Greg Ward and Nicholas FitzRoy-Dale.)
- * The new :mod:`platform` module contains a number of functions that try to
- determine various properties of the platform you're running on. There are
- functions for getting the architecture, CPU type, the Windows OS version, and
- even the Linux distribution version. (Contributed by Marc-André Lemburg.)
- * The parser objects provided by the :mod:`pyexpat` module can now optionally
- buffer character data, resulting in fewer calls to your character data handler
- and therefore faster performance. Setting the parser object's
- :attr:`buffer_text` attribute to :const:`True` will enable buffering.
- * The :func:`sample(population, k)` function was added to the :mod:`random`
- module. *population* is a sequence or :class:`xrange` object containing the
- elements of a population, and :func:`sample` chooses *k* elements from the
- population without replacing chosen elements. *k* can be any value up to
- ``len(population)``. For example::
- >>> days = ['Mo', 'Tu', 'We', 'Th', 'Fr', 'St', 'Sn']
- >>> random.sample(days, 3) # Choose 3 elements
- ['St', 'Sn', 'Th']
- >>> random.sample(days, 7) # Choose 7 elements
- ['Tu', 'Th', 'Mo', 'We', 'St', 'Fr', 'Sn']
- >>> random.sample(days, 7) # Choose 7 again
- ['We', 'Mo', 'Sn', 'Fr', 'Tu', 'St', 'Th']
- >>> random.sample(days, 8) # Can't choose eight
- Traceback (most recent call last):
- File "<stdin>", line 1, in ?
- File "random.py", line 414, in sample
- raise ValueError, "sample larger than population"
- ValueError: sample larger than population
- >>> random.sample(xrange(1,10000,2), 10) # Choose ten odd nos. under 10000
- [3407, 3805, 1505, 7023, 2401, 2267, 9733, 3151, 8083, 9195]
- The :mod:`random` module now uses a new algorithm, the Mersenne Twister,
- implemented in C. It's faster and more extensively studied than the previous
- algorithm.
- (All changes contributed by Raymond Hettinger.)
- * The :mod:`readline` module also gained a number of new functions:
- :func:`get_history_item`, :func:`get_current_history_length`, and
- :func:`redisplay`.
- * The :mod:`rexec` and :mod:`Bastion` modules have been declared dead, and
- attempts to import them will fail with a :exc:`RuntimeError`. New-style classes
- provide new ways to break out of the restricted execution environment provided
- by :mod:`rexec`, and no one has interest in fixing them or time to do so. If
- you have applications using :mod:`rexec`, rewrite them to use something else.
- (Sticking with Python 2.2 or 2.1 will not make your applications any safer
- because there are known bugs in the :mod:`rexec` module in those versions. To
- repeat: if you're using :mod:`rexec`, stop using it immediately.)
- * The :mod:`rotor` module has been deprecated because the algorithm it uses for
- encryption is not believed to be secure. If you need encryption, use one of the
- several AES Python modules that are available separately.
- * The :mod:`shutil` module gained a :func:`move(src, dest)` function that
- recursively moves a file or directory to a new location.
- * Support for more advanced POSIX signal handling was added to the :mod:`signal`
- but then removed again as it proved impossible to make it work reliably across
- platforms.
- * The :mod:`socket` module now supports timeouts. You can call the
- :meth:`settimeout(t)` method on a socket object to set a timeout of *t* seconds.
- Subsequent socket operations that take longer than *t* seconds to complete will
- abort and raise a :exc:`socket.timeout` exception.
- The original timeout implementation was by Tim O'Malley. Michael Gilfix
- integrated it into the Python :mod:`socket` module and shepherded it through a
- lengthy review. After the code was checked in, Guido van Rossum rewrote parts
- of it. (This is a good example of a collaborative development process in
- action.)
- * On Windows, the :mod:`socket` module now ships with Secure Sockets Layer
- (SSL) support.
- * The value of the C :const:`PYTHON_API_VERSION` macro is now exposed at the
- Python level as ``sys.api_version``. The current exception can be cleared by
- calling the new :func:`sys.exc_clear` function.
- * The new :mod:`tarfile` module allows reading from and writing to
- :program:`tar`\ -format archive files. (Contributed by Lars Gustäbel.)
- * The new :mod:`textwrap` module contains functions for wrapping strings
- containing paragraphs of text. The :func:`wrap(text, width)` function takes a
- string and returns a list containing the text split into lines of no more than
- the chosen width. The :func:`fill(text, width)` function returns a single
- string, reformatted to fit into lines no longer than the chosen width. (As you
- can guess, :func:`fill` is built on top of :func:`wrap`. For example::
- >>> import textwrap
- >>> paragraph = "Not a whit, we defy augury: ... more text ..."
- >>> textwrap.wrap(paragraph, 60)
- ["Not a whit, we defy augury: there's a special providence in",
- "the fall of a sparrow. If it be now, 'tis not to come; if it",
- ...]
- >>> print textwrap.fill(paragraph, 35)
- Not a whit, we defy augury: there's
- a special providence in the fall of
- a sparrow. If it be now, 'tis not
- to come; if it be not to come, it
- will be now; if it be not now, yet
- it will come: the readiness is all.
- >>>
- The module also contains a :class:`TextWrapper` class that actually implements
- the text wrapping strategy. Both the :class:`TextWrapper` class and the
- :func:`wrap` and :func:`fill` functions support a number of additional keyword
- arguments for fine-tuning the formatting; consult the module's documentation
- for details. (Contributed by Greg Ward.)
- * The :mod:`thread` and :mod:`threading` modules now have companion modules,
- :mod:`dummy_thread` and :mod:`dummy_threading`, that provide a do-nothing
- implementation of the :mod:`thread` module's interface for platforms where
- threads are not supported. The intention is to simplify thread-aware modules
- (ones that *don't* rely on threads to run) by putting the following code at the
- top::
- try:
- import threading as _threading
- except ImportError:
- import dummy_threading as _threading
- In this example, :mod:`_threading` is used as the module name to make it clear
- that the module being used is not necessarily the actual :mod:`threading`
- module. Code can call functions and use classes in :mod:`_threading` whether or
- not threads are supported, avoiding an :keyword:`if` statement and making the
- code slightly clearer. This module will not magically make multithreaded code
- run without threads; code that waits for another thread to return or to do
- something will simply hang forever.
- * The :mod:`time` module's :func:`strptime` function has long been an annoyance
- because it uses the platform C library's :func:`strptime` implementation, and
- different platforms sometimes have odd bugs. Brett Cannon contributed a
- portable implementation that's written in pure Python and should behave
- identically on all platforms.
- * The new :mod:`timeit` module helps measure how long snippets of Python code
- take to execute. The :file:`timeit.py` file can be run directly from the
- command line, or the module's :class:`Timer` class can be imported and used
- directly. Here's a short example that figures out whether it's faster to
- convert an 8-bit string to Unicode by appending an empty Unicode string to it or
- by using the :func:`unicode` function::
- import timeit
- timer1 = timeit.Timer('unicode("abc")')
- timer2 = timeit.Timer('"abc" + u""')
- # Run three trials
- print timer1.repeat(repeat=3, number=100000)
- print timer2.repeat(repeat=3, number=100000)
- # On my laptop this outputs:
- # [0.36831796169281006, 0.37441694736480713, 0.35304892063140869]
- # [0.17574405670166016, 0.18193507194519043, 0.17565798759460449]
- * The :mod:`Tix` module has received various bug fixes and updates for the
- current version of the Tix package.
- * The :mod:`Tkinter` module now works with a thread-enabled version of Tcl.
- Tcl's threading model requires that widgets only be accessed from the thread in
- which they're created; accesses from another thread can cause Tcl to panic. For
- certain Tcl interfaces, :mod:`Tkinter` will now automatically avoid this when a
- widget is accessed from a different thread by marshalling a command, passing it
- to the correct thread, and waiting for the results. Other interfaces can't be
- handled automatically but :mod:`Tkinter` will now raise an exception on such an
- access so that you can at least find out about the problem. See
- http://mail.python.org/pipermail/python-dev/2002-December/031107.html for a more
- detailed explanation of this change. (Implemented by Martin von Löwis.)
- * Calling Tcl methods through :mod:`_tkinter` no longer returns only strings.
- Instead, if Tcl returns other objects those objects are converted to their
- Python equivalent, if one exists, or wrapped with a :class:`_tkinter.Tcl_Obj`
- object if no Python equivalent exists. This behavior can be controlled through
- the :meth:`wantobjects` method of :class:`tkapp` objects.
- When using :mod:`_tkinter` through the :mod:`Tkinter` module (as most Tkinter
- applications will), this feature is always activated. It should not cause
- compatibility problems, since Tkinter would always convert string results to
- Python types where possible.
- If any incompatibilities are found, the old behavior can be restored by setting
- the :attr:`wantobjects` variable in the :mod:`Tkinter` module to false before
- creating the first :class:`tkapp` object. ::
- import Tkinter
- Tkinter.wantobjects = 0
- Any breakage caused by this change should be reported as a bug.
- * The :mod:`UserDict` module has a new :class:`DictMixin` class which defines
- all dictionary methods for classes that already have a minimum mapping
- interface. This greatly simplifies writing classes that need to be
- substitutable for dictionaries, such as the classes in the :mod:`shelve`
- module.
- Adding the mix-in as a superclass provides the full dictionary interface
- whenever the class defines :meth:`__getitem__`, :meth:`__setitem__`,
- :meth:`__delitem__`, and :meth:`keys`. For example::
- >>> import UserDict
- >>> class SeqDict(UserDict.DictMixin):
- ... """Dictionary lookalike implemented with lists."""
- ... def __init__(self):
- ... self.keylist = []
- ... self.valuelist = []
- ... def __getitem__(self, key):
- ... try:
- ... i = self.keylist.index(key)
- ... except ValueError:
- ... raise KeyError
- ... return self.valuelist[i]
- ... def __setitem__(self, key, value):
- ... try:
- ... i = self.keylist.index(key)
- ... self.valuelist[i] = value
- ... except ValueError:
- ... self.keylist.append(key)
- ... self.valuelist.append(value)
- ... def __delitem__(self, key):
- ... try:
- ... i = self.keylist.index(key)
- ... except ValueError:
- ... raise KeyError
- ... self.keylist.pop(i)
- ... self.valuelist.pop(i)
- ... def keys(self):
- ... return list(self.keylist)
- ...
- >>> s = SeqDict()
- >>> dir(s) # See that other dictionary methods are implemented
- ['__cmp__', '__contains__', '__delitem__', '__doc__', '__getitem__',
- '__init__', '__iter__', '__len__', '__module__', '__repr__',
- '__setitem__', 'clear', 'get', 'has_key', 'items', 'iteritems',
- 'iterkeys', 'itervalues', 'keylist', 'keys', 'pop', 'popitem',
- 'setdefault', 'update', 'valuelist', 'values']
- (Contributed by Raymond Hettinger.)
- * The DOM implementation in :mod:`xml.dom.minidom` can now generate XML output
- in a particular encoding by providing an optional encoding argument to the
- :meth:`toxml` and :meth:`toprettyxml` methods of DOM nodes.
- * The :mod:`xmlrpclib` module now supports an XML-RPC extension for handling nil
- data values such as Python's ``None``. Nil values are always supported on
- unmarshalling an XML-RPC response. To generate requests containing ``None``,
- you must supply a true value for the *allow_none* parameter when creating a
- :class:`Marshaller` instance.
- * The new :mod:`DocXMLRPCServer` module allows writing self-documenting XML-RPC
- servers. Run it in demo mode (as a program) to see it in action. Pointing the
- Web browser to the RPC server produces pydoc-style documentation; pointing
- xmlrpclib to the server allows invoking the actual methods. (Contributed by
- Brian Quinlan.)
- * Support for internationalized domain names (RFCs 3454, 3490, 3491, and 3492)
- has been added. The "idna" encoding can be used to convert between a Unicode
- domain name and the ASCII-compatible encoding (ACE) of that name. ::
- >{}>{}> u"www.Alliancefrançaise.nu".encode("idna")
- 'www.xn--alliancefranaise-npb.nu'
- The :mod:`socket` module has also been extended to transparently convert
- Unicode hostnames to the ACE version before passing them to the C library.
- Modules that deal with hostnames such as :mod:`httplib` and :mod:`ftplib`)
- also support Unicode host names; :mod:`httplib` also sends HTTP ``Host``
- headers using the ACE version of the domain name. :mod:`urllib` supports
- Unicode URLs with non-ASCII host names as long as the ``path`` part of the URL
- is ASCII only.
- To implement this change, the :mod:`stringprep` module, the ``mkstringprep``
- tool and the ``punycode`` encoding have been added.
- .. ======================================================================
- Date/Time Type
- --------------
- Date and time types suitable for expressing timestamps were added as the
- :mod:`datetime` module. The types don't support different calendars or many
- fancy features, and just stick to the basics of representing time.
- The three primary types are: :class:`date`, representing a day, month, and year;
- :class:`time`, consisting of hour, minute, and second; and :class:`datetime`,
- which contains all the attributes of both :class:`date` and :class:`time`.
- There's also a :class:`timedelta` class representing differences between two
- points in time, and time zone logic is implemented by classes inheriting from
- the abstract :class:`tzinfo` class.
- You can create instances of :class:`date` and :class:`time` by either supplying
- keyword arguments to the appropriate constructor, e.g.
- ``datetime.date(year=1972, month=10, day=15)``, or by using one of a number of
- class methods. For example, the :meth:`date.today` class method returns the
- current local date.
- Once created, instances of the date/time classes are all immutable. There are a
- number of methods for producing formatted strings from objects::
- >>> import datetime
- >>> now = datetime.datetime.now()
- >>> now.isoformat()
- '2002-12-30T21:27:03.994956'
- >>> now.ctime() # Only available on date, datetime
- 'Mon Dec 30 21:27:03 2002'
- >>> now.strftime('%Y %d %b')
- '2002 30 Dec'
- The :meth:`replace` method allows modifying one or more fields of a
- :class:`date` or :class:`datetime` instance, returning a new instance::
- >>> d = datetime.datetime.now()
- >>> d
- datetime.datetime(2002, 12, 30, 22, 15, 38, 827738)
- >>> d.replace(year=2001, hour = 12)
- datetime.datetime(2001, 12, 30, 12, 15, 38, 827738)
- >>>
- Instances can be compared, hashed, and converted to strings (the result is the
- same as that of :meth:`isoformat`). :class:`date` and :class:`datetime`
- instances can be subtracted from each other, and added to :class:`timedelta`
- instances. The largest missing feature is that there's no standard library
- support for parsing strings and getting back a :class:`date` or
- :class:`datetime`.
- For more information, refer to the module's reference documentation.
- (Contributed by Tim Peters.)
- .. ======================================================================
- The optparse Module
- -------------------
- The :mod:`getopt` module provides simple parsing of command-line arguments. The
- new :mod:`optparse` module (originally named Optik) provides more elaborate
- command-line parsing that follows the Unix conventions, automatically creates
- the output for :option:`--help`, and can perform different actions for different
- options.
- You start by creating an instance of :class:`OptionParser` and telling it what
- your program's options are. ::
- import sys
- from optparse import OptionParser
- op = OptionParser()
- op.add_option('-i', '--input',
- action='store', type='string', dest='input',
- help='set input filename')
- op.add_option('-l', '--length',
- action='store', type='int', dest='length',
- help='set maximum length of output')
- Parsing a command line is then done by calling the :meth:`parse_args` method. ::
- options, args = op.parse_args(sys.argv[1:])
- print options
- print args
- This returns an object containing all of the option values, and a list of
- strings containing the remaining arguments.
- Invoking the script with the various arguments now works as you'd expect it to.
- Note that the length argument is automatically converted to an integer. ::
- $ ./python opt.py -i data arg1
- <Values at 0x400cad4c: {'input': 'data', 'length': None}>
- ['arg1']
- $ ./python opt.py --input=data --length=4
- <Values at 0x400cad2c: {'input': 'data', 'length': 4}>
- []
- $
- The help message is automatically generated for you::
- $ ./python opt.py --help
- usage: opt.py [options]
- options:
- -h, --help show this help message and exit
- -iINPUT, --input=INPUT
- set input filename
- -lLENGTH, --length=LENGTH
- set maximum length of output
- $
- See the module's documentation for more details.
- Optik was written by Greg Ward, with suggestions from the readers of the Getopt
- SIG.
- .. ======================================================================
- .. _section-pymalloc:
- Pymalloc: A Specialized Object Allocator
- ========================================
- Pymalloc, a specialized object allocator written by Vladimir Marangozov, was a
- feature added to Python 2.1. Pymalloc is intended to be faster than the system
- :cfunc:`malloc` and to have less memory overhead for allocation patterns typical
- of Python programs. The allocator uses C's :cfunc:`malloc` function to get large
- pools of memory and then fulfills smaller memory requests from these pools.
- In 2.1 and 2.2, pymalloc was an experimental feature and wasn't enabled by
- default; you had to explicitly enable it when compiling Python by providing the
- :option:`--with-pymalloc` option to the :program:`configure` script. In 2.3,
- pymalloc has had further enhancements and is now enabled by default; you'll have
- to supply :option:`--without-pymalloc` to disable it.
- This change is transparent to code written in Python; however, pymalloc may
- expose bugs in C extensions. Authors of C extension modules should test their
- code with pymalloc enabled, because some incorrect code may cause core dumps at
- runtime.
- There's one particularly common error that causes problems. There are a number
- of memory allocation functions in Python's C API that have previously just been
- aliases for the C library's :cfunc:`malloc` and :cfunc:`free`, meaning that if
- you accidentally called mismatched functions the error wouldn't be noticeable.
- When the object allocator is enabled, these functions aren't aliases of
- :cfunc:`malloc` and :cfunc:`free` any more, and calling the wrong function to
- free memory may get you a core dump. For example, if memory was allocated using
- :cfunc:`PyObject_Malloc`, it has to be freed using :cfunc:`PyObject_Free`, not
- :cfunc:`free`. A few modules included with Python fell afoul of this and had to
- be fixed; doubtless there are more third-party modules that will have the same
- problem.
- As part of this change, the confusing multiple interfaces for allocating memory
- have been consolidated down into two API families. Memory allocated with one
- family must not be manipulated with functions from the other family. There is
- one family for allocating chunks of memory and another family of functions
- specifically for allocating Python objects.
- * To allocate and free an undistinguished chunk of memory use the "raw memory"
- family: :cfunc:`PyMem_Malloc`, :cfunc:`PyMem_Realloc`, and :cfunc:`PyMem_Free`.
- * The "object memory" family is the interface to the pymalloc facility described
- above and is biased towards a large number of "small" allocations:
- :cfunc:`PyObject_Malloc`, :cfunc:`PyObject_Realloc`, and :cfunc:`PyObject_Free`.
- * To allocate and free Python objects, use the "object" family
- :cfunc:`PyObject_New`, :cfunc:`PyObject_NewVar`, and :cfunc:`PyObject_Del`.
- Thanks to lots of work by Tim Peters, pymalloc in 2.3 also provides debugging
- features to catch memory overwrites and doubled frees in both extension modules
- and in the interpreter itself. To enable this support, compile a debugging
- version of the Python interpreter by running :program:`configure` with
- :option:`--with-pydebug`.
- To aid extension writers, a header file :file:`Misc/pymemcompat.h` is
- distributed with the source to Python 2.3 that allows Python extensions to use
- the 2.3 interfaces to memory allocation while compiling against any version of
- Python since 1.5.2. You would copy the file from Python's source distribution
- and bundle it with the source of your extension.
- .. seealso::
- http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Objects/obmalloc.c
- For the full details of the pymalloc implementation, see the comments at the top
- of the file :file:`Objects/obmalloc.c` in the Python source code. The above
- link points to the file within the SourceForge CVS browser.
- .. ======================================================================
- Build and C API Changes
- =======================
- Changes to Python's build process and to the C API include:
- * The cycle detection implementation used by the garbage collection has proven
- to be stable, so it's now been made mandatory. You can no longer compile Python
- without it, and the :option:`--with-cycle-gc` switch to :program:`configure` has
- been removed.
- * Python can now optionally be built as a shared library
- (:file:`libpython2.3.so`) by supplying :option:`--enable-shared` when running
- Python's :program:`configure` script. (Contributed by Ondrej Palkovsky.)
- * The :cmacro:`DL_EXPORT` and :cmacro:`DL_IMPORT` macros are now deprecated.
- Initialization functions for Python extension modules should now be declared
- using the new macro :cmacro:`PyMODINIT_FUNC`, while the Python core will
- generally use the :cmacro:`PyAPI_FUNC` and :cmacro:`PyAPI_DATA` macros.
- * The interpreter can be compiled without any docstrings for the built-in
- functions and modules by supplying :option:`--without-doc-strings` to the
- :program:`configure` script. This makes the Python executable about 10% smaller,
- but will also mean that you can't get help for Python's built-ins. (Contributed
- by Gustavo Niemeyer.)
- * The :cfunc:`PyArg_NoArgs` macro is now deprecated, and code that uses it
- should be changed. For Python 2.2 and later, the method definition table can
- specify the :const:`METH_NOARGS` flag, signalling that there are no arguments,
- and the argument checking can then be removed. If compatibility with pre-2.2
- versions of Python is important, the code could use ``PyArg_ParseTuple(args,
- "")`` instead, but this will be slower than using :const:`METH_NOARGS`.
- * :cfunc:`PyArg_ParseTuple` accepts new format characters for various sizes of
- unsigned integers: ``B`` for :ctype:`unsigned char`, ``H`` for :ctype:`unsigned
- short int`, ``I`` for :ctype:`unsigned int`, and ``K`` for :ctype:`unsigned
- long long`.
- * A new function, :cfunc:`PyObject_DelItemString(mapping, char \*key)` was added
- as shorthand for ``PyObject_DelItem(mapping, PyString_New(key))``.
- * File objects now manage their internal string buffer differently, increasing
- it exponentially when needed. This results in the benchmark tests in
- :file:`Lib/test/test_bufio.py` speeding up considerably (from 57 seconds to 1.7
- seconds, according to one measurement).
- * It's now possible to define class and static methods for a C extension type by
- setting either the :const:`METH_CLASS` or :const:`METH_STATIC` flags in a
- method's :ctype:`PyMethodDef` structure.
- * Python now includes a copy of the Expat XML parser's source code, removing any
- dependence on a system version or local installation of Expat.
- * If you dynamically allocate type objects in your extension, you should be
- aware of a change in the rules relating to the :attr:`__module__` and
- :attr:`__name__` attributes. In summary, you will want to ensure the type's
- dictionary contains a ``'__module__'`` key; making the module name the part of
- the type name leading up to the final period will no longer have the desired
- effect. For more detail, read the API reference documentation or the source.
- .. ======================================================================
- Port-Specific Changes
- ---------------------
- Support for a port to IBM's OS/2 using the EMX runtime environment was merged
- into the main Python source tree. EMX is a POSIX emulation layer over the OS/2
- system APIs. The Python port for EMX tries to support all the POSIX-like
- capability exposed by the EMX runtime, and mostly succeeds; :func:`fork` and
- :func:`fcntl` are restricted by the limitations of the underlying emulation
- layer. The standard OS/2 port, which uses IBM's Visual Age compiler, also
- gained support for case-sensitive import semantics as part of the integration of
- the EMX port into CVS. (Contributed by Andrew MacIntyre.)
- On MacOS, most toolbox modules have been weaklinked to improve backward
- compatibility. This means that modules will no longer fail to load if a single
- routine is missing on the current OS version. Instead calling the missing
- routine will raise an exception. (Contributed by Jack Jansen.)
- The RPM spec files, found in the :file:`Misc/RPM/` directory in the Python
- source distribution, were updated for 2.3. (Contributed by Sean Reifschneider.)
- Other new platforms now supported by Python include AtheOS
- (http://www.atheos.cx/), GNU/Hurd, and OpenVMS.
- .. ======================================================================
- .. _section-other:
- Other Changes and Fixes
- =======================
- As usual, there were a bunch of other improvements and bugfixes scattered
- throughout the source tree. A search through the CVS change logs finds there
- were 523 patches applied and 514 bugs fixed between Python 2.2 and 2.3. Both
- figures are likely to be underestimates.
- Some of the more notable changes are:
- * If the :envvar:`PYTHONINSPECT` environment variable is set, the Python
- interpreter will enter the interactive prompt after running a Python program, as
- if Python had been invoked with the :option:`-i` option. The environment
- variable can be set before running the Python interpreter, or it can be set by
- the Python program as part of its execution.
- * The :file:`regrtest.py` script now provides a way to allow "all resources
- except *foo*." A resource name passed to the :option:`-u` option can now be
- prefixed with a hyphen (``'-'``) to mean "remove this resource." For example,
- the option '``-uall,-bsddb``' could be used to enable the use of all resources
- except ``bsddb``.
- * The tools used to build the documentation now work under Cygwin as well as
- Unix.
- * The ``SET_LINENO`` opcode has been removed. Back in the mists of time, this
- opcode was needed to produce line numbers in tracebacks and support trace
- functions (for, e.g., :mod:`pdb`). Since Python 1.5, the line numbers in
- tracebacks have been computed using a different mechanism that works with
- "python -O". For Python 2.3 Michael Hudson implemented a similar scheme to
- determine when to call the trace function, removing the need for ``SET_LINENO``
- entirely.
- It would be difficult to detect any resulting difference from Python code, apart
- from a slight speed up when Python is run without :option:`-O`.
- C extensions that access the :attr:`f_lineno` field of frame objects should
- instead call ``PyCode_Addr2Line(f->f_code, f->f_lasti)``. This will have the
- added effect of making the code work as desired under "python -O" in earlier
- versions of Python.
- A nifty new feature is that trace functions can now assign to the
- :attr:`f_lineno` attribute of frame objects, changing the line that will be
- executed next. A ``jump`` command has been added to the :mod:`pdb` debugger
- taking advantage of this new feature. (Implemented by Richie Hindle.)
- .. ======================================================================
- Porting to Python 2.3
- =====================
- This section lists previously described changes that may require changes to your
- code:
- * :keyword:`yield` is now always a keyword; if it's used as a variable name in
- your code, a different name must be chosen.
- * For strings *X* and *Y*, ``X in Y`` now works if *X* is more than one
- character long.
- * The :func:`int` type constructor will now return a long integer instead of
- raising an :exc:`OverflowError` when a string or floating-point number is too
- large to fit into an integer.
- * If you have Unicode strings that contain 8-bit characters, you must declare
- the file's encoding (UTF-8, Latin-1, or whatever) by adding a comment to the top
- of the file. See section :ref:`section-encodings` for more information.
- * Calling Tcl methods through :mod:`_tkinter` no longer returns only strings.
- Instead, if Tcl returns other objects those objects are converted to their
- Python equivalent, if one exists, or wrapped with a :class:`_tkinter.Tcl_Obj`
- object if no Python equivalent exists.
- * Large octal and hex literals such as ``0xffffffff`` now trigger a
- :exc:`FutureWarning`. Currently they're stored as 32-bit numbers and result in a
- negative value, but in Python 2.4 they'll become positive long integers.
- There are a few ways to fix this warning. If you really need a positive number,
- just add an ``L`` to the end of the literal. If you're trying to get a 32-bit
- integer with low bits set and have previously used an expression such as ``~(1
- << 31)``, it's probably clearest to start with all bits set and clear the
- desired upper bits. For example, to clear just the top bit (bit 31), you could
- write ``0xffffffffL &~(1L<<31)``.
- * You can no longer disable assertions by assigning to ``__debug__``.
- * The Distutils :func:`setup` function has gained various new keyword arguments
- such as *depends*. Old versions of the Distutils will abort if passed unknown
- keywords. A solution is to check for the presence of the new
- :func:`get_distutil_options` function in your :file:`setup.py` and only uses the
- new keywords with a version of the Distutils that supports them::
- from distutils import core
- kw = {'sources': 'foo.c', ...}
- if hasattr(core, 'get_distutil_options'):
- kw['depends'] = ['foo.h']
- ext = Extension(**kw)
- * Using ``None`` as a variable name will now result in a :exc:`SyntaxWarning`
- warning.
- * Names of extension types defined by the modules included with Python now
- contain the module and a ``'.'`` in front of the type name.
- .. ======================================================================
- .. _23acks:
- Acknowledgements
- ================
- The author would like to thank the following people for offering suggestions,
- corrections and assistance with various drafts of this article: Jeff Bauer,
- Simon Brunning, Brett Cannon, Michael Chermside, Andrew Dalke, Scott David
- Daniels, Fred L. Drake, Jr., David Fraser, Kelly Gerber, Raymond Hettinger,
- Michael Hudson, Chris Lambert, Detlef Lannert, Martin von Löwis, Andrew
- MacIntyre, Lalo Martins, Chad Netzer, Gustavo Niemeyer, Neal Norwitz, Hans
- Nowak, Chris Reedy, Francesco Ricciardi, Vinay Sajip, Neil Schemenauer, Roman
- Suzi, Jason Tishler, Just van Rossum.