.. _tut-informal:

**********************************
An Informal Introduction to Python
**********************************

In the following examples, input and output are distinguished by the presence or
absence of prompts (``>>>`` and ``...``): to repeat the example, you must type
everything after the prompt, when the prompt appears; lines that do not begin
with a prompt are output from the interpreter. Note that a secondary prompt on a
line by itself in an example means you must type a blank line; this is used to
end a multi-line command.

Many of the examples in this manual, even those entered at the interactive
prompt, include comments.  Comments in Python start with the hash character,
``#``, and extend to the end of the physical line.  A comment may appear at the
start of a line or following whitespace or code, but not within a string
literal.  A hash character within a string literal is just a hash character.
Since comments are to clarify code and are not interpreted by Python, they may
be omitted when typing in examples.

Some examples::

   # this is the first comment
   SPAM = 1                 # and this is the second comment
                            # ... and now a third!
   STRING = "# This is not a comment."


.. _tut-calculator:

Using Python as a Calculator
============================

Let's try some simple Python commands.  Start the interpreter and wait for the
primary prompt, ``>>>``.  (It shouldn't take long.)


.. _tut-numbers:

Numbers
-------

The interpreter acts as a simple calculator: you can type an expression at it
and it will write the value.  Expression syntax is straightforward: the
operators ``+``, ``-``, ``*`` and ``/`` work just like in most other languages
(for example, Pascal or C); parentheses can be used for grouping.  For example::

   >>> 2+2
   4
   >>> # This is a comment
   ... 2+2
   4
   >>> 2+2  # and a comment on the same line as code
   4
   >>> (50-5*6)/4
   5
   >>> # Integer division returns the floor:
   ... 7/3
   2
   >>> 7/-3
   -3

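The flooring behaviour shown above can be made explicit with the ``//``
operator, which floors the quotient of two integers regardless of how ``/`` is
defined.  The following is a minimal sketch rather than part of the interactive
session; the variable names are chosen only for illustration.

```python
# Floor division rounds toward negative infinity, not toward zero.
q_pos = 7 // 3       # 2
q_neg = 7 // -3      # -3, because -2.33... is floored to -3

# The remainder takes the sign of the divisor, so that
# (a // b) * b + (a % b) == a holds for both cases.
r_pos = 7 % 3        # 1
r_neg = 7 % -3       # -2

# divmod() returns the floored quotient and the remainder together.
pair = divmod(7, -3)   # (-3, -2)
```
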
The equal sign (``=``) is used to assign a value to a variable. Afterwards, no
result is displayed before the next interactive prompt::

   >>> width = 20
   >>> height = 5*9
   >>> width * height
   900

A value can be assigned to several variables simultaneously::

   >>> x = y = z = 0  # Zero x, y and z
   >>> x
   0
   >>> y
   0
   >>> z
   0

Variables must be "defined" (assigned a value) before they can be used, or an
error will occur::

   >>> # try to access an undefined variable
   ... n
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   NameError: name 'n' is not defined

There is full support for floating point; operators with mixed type operands
convert the integer operand to floating point::

   >>> 3 * 3.75 / 1.5
   7.5
   >>> 7.0 / 2
   3.5

Complex numbers are also supported; imaginary numbers are written with a suffix
of ``j`` or ``J``.  Complex numbers with a nonzero real component are written as
``(real+imagj)``, or can be created with the ``complex(real, imag)`` function.
::

   >>> 1j * 1J
   (-1+0j)
   >>> 1j * complex(0,1)
   (-1+0j)
   >>> 3+1j*3
   (3+3j)
   >>> (3+1j)*3
   (9+3j)
   >>> (1+2j)/(1+1j)
   (1.5+0.5j)

Complex numbers are always represented as two floating point numbers, the real
and imaginary parts.  To extract these parts from a complex number *z*, use
``z.real`` and ``z.imag``.   ::

   >>> a=1.5+0.5j
   >>> a.real
   1.5
   >>> a.imag
   0.5

The conversion functions to floating point and integer (:func:`float`,
:func:`int` and :func:`long`) don't work for complex numbers --- there is no one
correct way to convert a complex number to a real number.  Use ``abs(z)`` to get
its magnitude (as a float) or ``z.real`` to get its real part. ::

   >>> a=3.0+4.0j
   >>> float(a)
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
   TypeError: can't convert complex to float; use abs(z)
   >>> a.real
   3.0
   >>> a.imag
   4.0
   >>> abs(a)  # sqrt(a.real**2 + a.imag**2)
   5.0
   >>>

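As a script-style illustration of the same point, the sketch below checks that
``abs(z)`` agrees with the magnitude computed by hand from ``z.real`` and
``z.imag``, and that :func:`float` refuses a complex argument.  The names
``by_hand`` and ``converted`` are introduced here only for the example.

```python
import math

z = 3.0 + 4.0j

# abs() of a complex number is its magnitude, returned as a float.
magnitude = abs(z)

# The same value computed explicitly from the two components.
by_hand = math.sqrt(z.real ** 2 + z.imag ** 2)

# float() cannot pick one "correct" real value, so it raises TypeError.
try:
    float(z)
    converted = True
except TypeError:
    converted = False
```
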
In interactive mode, the last printed expression is assigned to the variable
``_``.  This means that when you are using Python as a desk calculator, it is
somewhat easier to continue calculations, for example::

   >>> tax = 12.5 / 100
   >>> price = 100.50
   >>> price * tax
   12.5625
   >>> price + _
   113.0625
   >>> round(_, 2)
   113.06
   >>>

This variable should be treated as read-only by the user.  Don't explicitly
assign a value to it --- you would create an independent local variable with the
same name masking the built-in variable with its magic behavior.


.. _tut-strings:

Strings
-------

Besides numbers, Python can also manipulate strings, which can be expressed in
several ways.  They can be enclosed in single quotes or double quotes::

   >>> 'spam eggs'
   'spam eggs'
   >>> 'doesn\'t'
   "doesn't"
   >>> "doesn't"
   "doesn't"
   >>> '"Yes," he said.'
   '"Yes," he said.'
   >>> "\"Yes,\" he said."
   '"Yes," he said.'
   >>> '"Isn\'t," she said.'
   '"Isn\'t," she said.'

String literals can span multiple lines in several ways.  Continuation lines can
be used, with a backslash as the last character on the line indicating that the
next line is a logical continuation of the line::

   hello = "This is a rather long string containing\n\
   several lines of text just as you would do in C.\n\
       Note that whitespace at the beginning of the line is\
    significant."

   print hello

Note that newlines still need to be embedded in the string using ``\n``; the
newline following the trailing backslash is discarded.  This example would print
the following::

   This is a rather long string containing
   several lines of text just as you would do in C.
       Note that whitespace at the beginning of the line is significant.

Or, strings can be enclosed in a pair of matching triple-quotes: ``"""`` or
``'''``.  Line ends do not need to be escaped when using triple-quotes, but
they will be included in the string. ::

   print """
   Usage: thingy [OPTIONS]
        -h                        Display this usage message
        -H hostname               Hostname to connect to
   """

produces the following output::

   Usage: thingy [OPTIONS]
        -h                        Display this usage message
        -H hostname               Hostname to connect to

If we make the string literal a "raw" string, ``\n`` sequences are not converted
to newlines, but the backslash at the end of the line, and the newline character
in the source, are both included in the string as data.  Thus, the example::

   hello = r"This is a rather long string containing\n\
   several lines of text much as you would do in C."

   print hello

would print::

   This is a rather long string containing\n\
   several lines of text much as you would do in C.

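One way to see the difference between an ordinary and a raw literal is to
compare their lengths.  This is a small sketch with made-up variable names
(``cooked`` and ``raw``); it should behave the same under Python 2 and
Python 3.

```python
cooked = "a\nb"   # backslash-n is converted to a single newline character
raw = r"a\nb"     # the backslash and the n are kept as two characters

cooked_length = len(cooked)   # 3: 'a', newline, 'b'
raw_length = len(raw)         # 4: 'a', backslash, 'n', 'b'

# A raw literal is just another spelling; the same string can be
# written as an ordinary literal by doubling the backslash.
same_string = (raw == "a\\nb")
```
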
The interpreter prints the result of string operations in the same way as they
are typed for input: inside quotes, and with quotes and other special characters
escaped by backslashes, to show the precise value.  The string is enclosed in
double quotes if the string contains a single quote and no double quotes, else
it's enclosed in single quotes.  (The :keyword:`print` statement, described
later, can be used to write strings without quotes or escapes.)

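The quoting rule can be observed from a script with the built-in :func:`repr`
function, which returns the same text the interactive prompt echoes for a
string.  The sketch below is only an illustration; the variable names are
assumptions made for the example.

```python
# Contains a single quote and no double quotes: shown in double quotes.
only_single = repr("doesn't")

# Contains double quotes only: shown in single quotes, nothing escaped.
only_double = repr('"Yes," he said.')

# Contains both kinds: shown in single quotes, single quote escaped.
both_kinds = repr('"Isn\'t," she said.')

# str() gives the plain value, without added quotes or escapes.
plain = str("doesn't")
```
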
Strings can be concatenated (glued together) with the ``+`` operator, and
repeated with ``*``::

   >>> word = 'Help' + 'A'
   >>> word
   'HelpA'
   >>> '<' + word*5 + '>'
   '<HelpAHelpAHelpAHelpAHelpA>'

Two string literals next to each other are automatically concatenated; the first
line above could also have been written ``word = 'Help' 'A'``; this only works
with two literals, not with arbitrary string expressions::

   >>> 'str' 'ing'                   #  <-  This is ok
   'string'
   >>> 'str'.strip() + 'ing'   #  <-  This is ok
   'string'
   >>> 'str'.strip() 'ing'     #  <-  This is invalid
     File "<stdin>", line 1, in ?
       'str'.strip() 'ing'
                         ^
   SyntaxError: invalid syntax

Strings can be subscripted (indexed); as in C, the first character of a string
has subscript (index) 0.  There is no separate character type; a character is
simply a string of size one.  As in Icon, substrings can be specified with the
*slice notation*: two indices separated by a colon. ::

   >>> word[4]
   'A'
   >>> word[0:2]
   'He'
   >>> word[2:4]
   'lp'

Slice indices have useful defaults; an omitted first index defaults to zero, an
omitted second index defaults to the size of the string being sliced. ::

   >>> word[:2]    # The first two characters
   'He'
   >>> word[2:]    # Everything except the first two characters
   'lpA'

Unlike C strings, Python strings cannot be changed.  Assigning to an indexed
position in the string results in an error::

   >>> word[0] = 'x'
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
   TypeError: object does not support item assignment
   >>> word[:1] = 'Splat'
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
   TypeError: object does not support slice assignment

However, creating a new string with the combined content is easy and efficient::

   >>> 'x' + word[1:]
   'xelpA'
   >>> 'Splat' + word[4]
   'SplatA'

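The same behaviour can be checked in a script by catching the
:exc:`TypeError` and then building a new string instead.  This is a minimal
sketch; ``changed_in_place`` and ``new_word`` are names introduced only for
the example.

```python
word = 'HelpA'

# Item assignment on a string raises TypeError; the string is untouched.
try:
    word[0] = 'x'
    changed_in_place = True
except TypeError:
    changed_in_place = False

# Build a new string from a literal and a slice of the old one.
new_word = 'x' + word[1:]
```
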
Here's a useful invariant of slice operations: ``s[:i] + s[i:]`` equals ``s``.
::

   >>> word[:2] + word[2:]
   'HelpA'
   >>> word[:3] + word[3:]
   'HelpA'

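The invariant can be checked for every split point of a string, not just the
two shown.  The sketch below does so for ``word``; the name ``holds`` is an
assumption made for the example.

```python
word = 'HelpA'

# Try every split point from 0 (empty left part) up to len(word)
# (empty right part) and confirm the two halves rebuild the string.
holds = all(word[:i] + word[i:] == word for i in range(len(word) + 1))
```
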
Degenerate slice indices are handled gracefully: an index that is too large is
replaced by the string size, an upper bound smaller than the lower bound returns
an empty string. ::

   >>> word[1:100]
   'elpA'
   >>> word[10:]
   ''
   >>> word[2:1]
   ''

Indices may be negative numbers, to start counting from the right. For example::

   >>> word[-1]     # The last character
   'A'
   >>> word[-2]     # The last-but-one character
   'p'
   >>> word[-2:]    # The last two characters
   'pA'
   >>> word[:-2]    # Everything except the last two characters
   'Hel'

But note that -0 is really the same as 0, so it does not count from the right!
::

   >>> word[-0]     # (since -0 equals 0)
   'H'

Out-of-range negative slice indices are truncated, but don't try this for
single-element (non-slice) indices::

   >>> word[-100:]
   'HelpA'
   >>> word[-10]    # error
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
   IndexError: string index out of range

One way to remember how slices work is to think of the indices as pointing
*between* characters, with the left edge of the first character numbered 0.
Then the right edge of the last character of a string of *n* characters has
index *n*, for example::

    +---+---+---+---+---+
    | H | e | l | p | A |
    +---+---+---+---+---+
    0   1   2   3   4   5
   -5  -4  -3  -2  -1

The first row of numbers gives the position of the indices 0...5 in the string;
the second row gives the corresponding negative indices. The slice from *i* to
*j* consists of all characters between the edges labeled *i* and *j*,
respectively.

For non-negative indices, the length of a slice is the difference of the
indices, if both are within bounds.  For example, the length of ``word[1:3]`` is
2.

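Both rules from the diagram can be verified mechanically.  The following sketch
checks that an in-bounds slice ``word[i:j]`` has length ``j - i``, and that a
negative index ``k`` selects the same character as ``k + n``; the names
``lengths_match`` and ``negatives_match`` are chosen only for illustration.

```python
word = 'HelpA'
n = len(word)

# For 0 <= i <= j <= n, the slice length is the difference of the indices.
lengths_match = all(
    len(word[i:j]) == j - i
    for i in range(n + 1)
    for j in range(i, n + 1)
)

# A negative index k counts from the right: it names the same
# character as the non-negative index k + n.
negatives_match = all(word[k] == word[k + n] for k in range(-n, 0))
```
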
The built-in function :func:`len` returns the length of a string::

   >>> s = 'supercalifragilisticexpialidocious'
   >>> len(s)
   34


.. seealso::

   :ref:`typesseq`
      Strings, and the Unicode strings described in the next section, are
      examples of *sequence types*, and support the common operations supported
      by such types.

   :ref:`string-methods`
      Both strings and Unicode strings support a large number of methods for
      basic transformations and searching.

   :ref:`new-string-formatting`
      Information about string formatting with :meth:`str.format` is described
      here.

   :ref:`string-formatting`
      The old formatting operations invoked when strings and Unicode strings are
      the left operand of the ``%`` operator are described in more detail here.


.. _tut-unicodestrings:

Unicode Strings
---------------

.. sectionauthor:: Marc-Andre Lemburg <mal@lemburg.com>


Starting with Python 2.0 a new data type for storing text data is available to
the programmer: the Unicode object. It can be used to store and manipulate
Unicode data (see http://www.unicode.org/) and integrates well with the existing
string objects, providing auto-conversions where necessary.

Unicode has the advantage of providing one ordinal for every character in every
script used in modern and ancient texts. Previously, there were only 256
possible ordinals for script characters. Texts were typically bound to a code
page which mapped the ordinals to script characters. This led to a great deal of
confusion, especially with respect to internationalization (usually written as
``i18n`` --- ``'i'`` + 18 characters + ``'n'``) of software.  Unicode solves
these problems by defining one code page for all scripts.

Creating Unicode strings in Python is just as simple as creating normal
strings::

   >>> u'Hello World !'
   u'Hello World !'

The small ``'u'`` in front of the quote indicates that a Unicode string is
supposed to be created. If you want to include special characters in the string,
you can do so by using the Python *Unicode-Escape* encoding. The following
example shows how::

   >>> u'Hello\u0020World !'
   u'Hello World !'

The escape sequence ``\u0020`` indicates to insert the Unicode character with
the ordinal value 0x0020 (the space character) at the given position.

Other characters are interpreted by using their respective ordinal values
directly as Unicode ordinals.  If you have literal strings in the standard
Latin-1 encoding that is used in many Western countries, you will find it
convenient that the lower 256 characters of Unicode are the same as the 256
characters of Latin-1.

For experts, there is also a raw mode just like the one for normal strings. You
have to prefix the opening quote with 'ur' to have Python use the
*Raw-Unicode-Escape* encoding. It will only apply the above ``\uXXXX``
conversion if there is an odd number of backslashes in front of the small
'u'. ::

   >>> ur'Hello\u0020World !'
   u'Hello World !'
   >>> ur'Hello\\u0020World !'
   u'Hello\\\\u0020World !'

The raw mode is most useful when you have to enter lots of backslashes, as can
be necessary in regular expressions.

Apart from these standard encodings, Python provides a whole set of other ways
of creating Unicode strings on the basis of a known encoding.

.. index:: builtin: unicode

The built-in function :func:`unicode` provides access to all registered Unicode
codecs (COders and DECoders). Some of the better-known encodings which these
codecs can convert are *Latin-1*, *ASCII*, *UTF-8*, and *UTF-16*. The latter two
are variable-length encodings that store each Unicode character in one or more
bytes. The default encoding is normally set to ASCII, which passes through
characters in the range 0 to 127 and rejects any other characters with an error.
When a Unicode string is printed, written to a file, or converted with
:func:`str`, conversion takes place using this default encoding. ::

   >>> u"abc"
   u'abc'
   >>> str(u"abc")
   'abc'
   >>> u"äöü"
   u'\xe4\xf6\xfc'
   >>> str(u"äöü")
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
   UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)

To convert a Unicode string into an 8-bit string using a specific encoding,
Unicode objects provide an :meth:`encode` method that takes one argument, the
name of the encoding.  Lowercase names for encodings are preferred. ::

   >>> u"äöü".encode('utf-8')
   '\xc3\xa4\xc3\xb6\xc3\xbc'

If you have data in a specific encoding and want to produce a corresponding
Unicode string from it, you can use the :func:`unicode` function with the
encoding name as the second argument. ::

   >>> unicode('\xc3\xa4\xc3\xb6\xc3\xbc', 'utf-8')
   u'\xe4\xf6\xfc'


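The encode and decode steps above form a round trip.  The sketch below shows
this with the same three characters, written as escapes so the source file
itself stays ASCII.  Note one deliberate substitution: it decodes with the
``decode`` method of the encoded string instead of calling :func:`unicode`,
so that the same code also runs on interpreters where :func:`unicode` is not
a built-in.  All variable names are assumptions made for the example.

```python
text = u"\xe4\xf6\xfc"    # the characters a-umlaut, o-umlaut, u-umlaut

# UTF-8 needs two bytes for each of these characters; Latin-1 needs one.
utf8_bytes = text.encode('utf-8')
latin1_bytes = text.encode('latin-1')

# Decoding with the matching codec restores the original Unicode string.
round_trip = utf8_bytes.decode('utf-8')

# ASCII covers only ordinals 0 to 127, so encoding fails with an error.
try:
    text.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
```
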
.. _tut-lists:

Lists
-----

Python knows a number of *compound* data types, used to group together other
values.  The most versatile is the *list*, which can be written as a list of
comma-separated values (items) between square brackets.  List items need not all
have the same type. ::

   >>> a = ['spam', 'eggs', 100, 1234]
   >>> a
   ['spam', 'eggs', 100, 1234]

Like string indices, list indices start at 0, and lists can be sliced,
concatenated and so on::

   >>> a[0]
   'spam'
   >>> a[3]
   1234
   >>> a[-2]
   100
   >>> a[1:-1]
   ['eggs', 100]
   >>> a[:2] + ['bacon', 2*2]
   ['spam', 'eggs', 'bacon', 4]
   >>> 3*a[:3] + ['Boo!']
   ['spam', 'eggs', 100, 'spam', 'eggs', 100, 'spam', 'eggs', 100, 'Boo!']

Unlike strings, which are *immutable*, it is possible to change individual
elements of a list::

   >>> a
   ['spam', 'eggs', 100, 1234]
   >>> a[2] = a[2] + 23
   >>> a
   ['spam', 'eggs', 123, 1234]

Assignment to slices is also possible, and this can even change the size of the
list or clear it entirely::

   >>> # Replace some items:
   ... a[0:2] = [1, 12]
   >>> a
   [1, 12, 123, 1234]
   >>> # Remove some:
   ... a[0:2] = []
   >>> a
   [123, 1234]
   >>> # Insert some:
   ... a[1:1] = ['bletch', 'xyzzy']
   >>> a
   [123, 'bletch', 'xyzzy', 1234]
   >>> # Insert (a copy of) itself at the beginning
   ... a[:0] = a
   >>> a
   [123, 'bletch', 'xyzzy', 1234, 123, 'bletch', 'xyzzy', 1234]
   >>> # Clear the list: replace all items with an empty list
   ... a[:] = []
   >>> a
   []

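Slice assignment changes the existing list object rather than creating a new
one, which matters when the list is reachable under more than one name.  The
sketch below contrasts clearing a list in place with rebinding its name to a
fresh empty list; ``alias`` and ``other`` are hypothetical names used only
here.

```python
a = [1, 12, 123, 1234]
alias = a                  # a second name for the same list object

a[1:3] = ['x']             # replace two items with one; the list shrinks
after_replace = list(a)    # snapshot: [1, 'x', 1234]

a[len(a):] = [5, 6]        # assigning to the empty slice at the end appends
after_append = list(a)     # snapshot: [1, 'x', 1234, 5, 6]

a[:] = []                  # clears the shared object; alias sees it too

b = [1, 2, 3]
other = b
b = []                     # rebinding only: other still holds the old list
```
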
The built-in function :func:`len` also applies to lists::

   >>> a = ['a', 'b', 'c', 'd']
   >>> len(a)
   4

It is possible to nest lists (create lists containing other lists), for
example::

   >>> q = [2, 3]
   >>> p = [1, q, 4]
   >>> len(p)
   3
   >>> p[1]
   [2, 3]
   >>> p[1][0]
   2
   >>> p[1].append('xtra')     # See section 5.1
   >>> p
   [1, [2, 3, 'xtra'], 4]
   >>> q
   [2, 3, 'xtra']

Note that in the last example, ``p[1]`` and ``q`` really refer to the same
object!  We'll come back to *object semantics* later.


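The sharing can be confirmed with the ``is`` operator, and avoided by nesting a
copy made with a full slice.  This is a minimal sketch under that assumption;
``p_copy`` is a name introduced only for the example, and ``q[:]`` makes a
shallow copy (a new list holding the same items).

```python
q = [2, 3]
p = [1, q, 4]            # p[1] is the very same object as q
p_copy = [1, q[:], 4]    # q[:] is a new list with the same items

same_object = p[1] is q
copied_object = p_copy[1] is q

# Appending through q is visible through p, but not through p_copy.
q.append('xtra')
```
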
.. _tut-firststeps:

First Steps Towards Programming
===============================

Of course, we can use Python for more complicated tasks than adding two and two
together.  For instance, we can write an initial sub-sequence of the *Fibonacci*
series as follows::

   >>> # Fibonacci series:
   ... # the sum of two elements defines the next
   ... a, b = 0, 1
   >>> while b < 10:
   ...     print b
   ...     a, b = b, a+b
   ...
   1
   1
   2
   3
   5
   8

This example introduces several new features.

* The first line contains a *multiple assignment*: the variables ``a`` and ``b``
  simultaneously get the new values 0 and 1.  On the last line this is used again,
  demonstrating that the expressions on the right-hand side are all evaluated
  first before any of the assignments take place.  The right-hand side expressions
  are evaluated from the left to the right.

* The :keyword:`while` loop executes as long as the condition (here: ``b < 10``)
  remains true.  In Python, as in C, any non-zero integer value is true; zero is
  false.  The condition may also be a string or list value, in fact any sequence;
  anything with a non-zero length is true, empty sequences are false.  The test
  used in the example is a simple comparison.  The standard comparison operators
  are written the same as in C: ``<`` (less than), ``>`` (greater than), ``==``
  (equal to), ``<=`` (less than or equal to), ``>=`` (greater than or equal to)
  and ``!=`` (not equal to).

* The *body* of the loop is *indented*: indentation is Python's way of grouping
  statements.  Python does not (yet!) provide an intelligent input line editing
  facility, so you have to type a tab or space(s) for each indented line.  In
  practice you will prepare more complicated input for Python with a text editor;
  most text editors have an auto-indent facility.  When a compound statement is
  entered interactively, it must be followed by a blank line to indicate
  completion (since the parser cannot guess when you have typed the last line).
  Note that each line within a basic block must be indented by the same amount.

* The :keyword:`print` statement writes the value of the expression(s) it is
  given.  It differs from just writing the expression you want to write (as we did
  earlier in the calculator examples) in the way it handles multiple expressions
  and strings.  Strings are printed without quotes, and a space is inserted
  between items, so you can format things nicely, like this::

     >>> i = 256*256
     >>> print 'The value of i is', i
     The value of i is 65536

  A trailing comma avoids the newline after the output::

     >>> a, b = 0, 1
     >>> while b < 1000:
     ...     print b,
     ...     a, b = b, a+b
     ...
     1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987
 
  Note that the interpreter inserts a newline before it prints the next prompt if
  the last line was not completed.
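The points in the list above can be tried together in a small script.  The
sketch below wraps the loop in a function that collects the values in a list
instead of printing them; the function name ``fib_below`` is hypothetical, and
function definitions and the ``append`` method are used here ahead of their
introduction in the tutorial.

```python
def fib_below(limit):
    """Return the Fibonacci numbers smaller than limit, as a list."""
    result = []
    a, b = 0, 1                # multiple assignment
    while b < limit:           # loop while the comparison is true
        result.append(b)
        # Both right-hand expressions are evaluated before either name
        # is rebound, so a + b still uses the old value of a.
        a, b = b, a + b
    return result

small = fib_below(10)
large = fib_below(1000)

# Zero and empty sequences count as false; anything else as true.
truth_values = [bool(0), bool(''), bool([]), bool(7), bool('x'), bool([0])]
```
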