  1. .. _tut-structures:
  2. ***************
  3. Data Structures
  4. ***************
  5. This chapter describes some things you've learned about already in more detail,
  6. and adds some new things as well.
  7. .. _tut-morelists:
  8. More on Lists
  9. =============
  10. The list data type has some more methods. Here are all of the methods of list
  11. objects:
  12. .. method:: list.append(x)
  13. :noindex:
  14. Add an item to the end of the list; equivalent to ``a[len(a):] = [x]``.
  15. .. method:: list.extend(L)
  16. :noindex:
  17. Extend the list by appending all the items in the given list; equivalent to
  18. ``a[len(a):] = L``.
  19. .. method:: list.insert(i, x)
  20. :noindex:
  21. Insert an item at a given position. The first argument is the index of the
  22. element before which to insert, so ``a.insert(0, x)`` inserts at the front of
  23. the list, and ``a.insert(len(a), x)`` is equivalent to ``a.append(x)``.
  24. .. method:: list.remove(x)
  25. :noindex:
  26. Remove the first item from the list whose value is *x*. It is an error if there
  27. is no such item.
  28. .. method:: list.pop([i])
  29. :noindex:
  30. Remove the item at the given position in the list, and return it. If no index
  31. is specified, ``a.pop()`` removes and returns the last item in the list. (The
  32. square brackets around the *i* in the method signature denote that the parameter
  33. is optional, not that you should type square brackets at that position. You
  34. will see this notation frequently in the Python Library Reference.)
  35. .. method:: list.index(x)
  36. :noindex:
  37. Return the index in the list of the first item whose value is *x*. It is an
  38. error if there is no such item.
  39. .. method:: list.count(x)
  40. :noindex:
  41. Return the number of times *x* appears in the list.
  42. .. method:: list.sort()
  43. :noindex:
  44. Sort the items of the list, in place.
  45. .. method:: list.reverse()
  46. :noindex:
  47. Reverse the elements of the list, in place.
  48. An example that uses most of the list methods::
  49. >>> a = [66.25, 333, 333, 1, 1234.5]
  50. >>> print a.count(333), a.count(66.25), a.count('x')
  51. 2 1 0
  52. >>> a.insert(2, -1)
  53. >>> a.append(333)
  54. >>> a
  55. [66.25, 333, -1, 333, 1, 1234.5, 333]
  56. >>> a.index(333)
  57. 1
  58. >>> a.remove(333)
  59. >>> a
  60. [66.25, -1, 333, 1, 1234.5, 333]
  61. >>> a.reverse()
  62. >>> a
  63. [333, 1234.5, 1, 333, -1, 66.25]
  64. >>> a.sort()
  65. >>> a
  66. [-1, 1, 66.25, 333, 333, 1234.5]
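The slice equivalences given in the method descriptions above can be checked directly. The following is a minimal sketch (the names ``a`` and ``b`` are chosen only for illustration) that should run unchanged on Python 2.6+ and Python 3. It also exercises :meth:`extend` and :meth:`pop`, which the session above does not use, and shows the ``ValueError`` raised by :meth:`remove` when the value is missing.

```python
a = [1, 2, 3]
b = [1, 2, 3]

a.append(4)                 # method form
b[len(b):] = [4]            # equivalent slice assignment
same_after_append = (a == b)

a.extend([5, 6])            # method form
b[len(b):] = [5, 6]         # equivalent slice assignment
same_after_extend = (a == b)

last = a.pop()              # no index: removes and returns the last item (6)
second = a.pop(1)           # explicit index: removes and returns a[1] (2)

try:
    a.remove(99)            # 99 is not in the list
    remove_failed = False
except ValueError:
    remove_failed = True
```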
  67. .. _tut-lists-as-stacks:
  68. Using Lists as Stacks
  69. ---------------------
  70. .. sectionauthor:: Ka-Ping Yee <ping@lfw.org>
  71. The list methods make it very easy to use a list as a stack, where the last
  72. element added is the first element retrieved ("last-in, first-out"). To add an
  73. item to the top of the stack, use :meth:`append`. To retrieve an item from the
  74. top of the stack, use :meth:`pop` without an explicit index. For example::
  75. >>> stack = [3, 4, 5]
  76. >>> stack.append(6)
  77. >>> stack.append(7)
  78. >>> stack
  79. [3, 4, 5, 6, 7]
  80. >>> stack.pop()
  81. 7
  82. >>> stack
  83. [3, 4, 5, 6]
  84. >>> stack.pop()
  85. 6
  86. >>> stack.pop()
  87. 5
  88. >>> stack
  89. [3, 4]
  90. .. _tut-lists-as-queues:
  91. Using Lists as Queues
  92. ---------------------
  93. .. sectionauthor:: Ka-Ping Yee <ping@lfw.org>
  94. It is also possible to use a list as a queue, where the first element added is
  95. the first element retrieved ("first-in, first-out"); however, lists are not
  96. efficient for this purpose. While appends and pops from the end of a list are
  97. fast, doing inserts or pops from the beginning of a list is slow (because all
  98. of the other elements have to be shifted by one).
  99. To implement a queue, use :class:`collections.deque`, which was designed to
  100. have fast appends and pops from both ends. For example::
  101. >>> from collections import deque
  102. >>> queue = deque(["Eric", "John", "Michael"])
  103. >>> queue.append("Terry") # Terry arrives
  104. >>> queue.append("Graham") # Graham arrives
  105. >>> queue.popleft() # The first to arrive now leaves
  106. 'Eric'
  107. >>> queue.popleft() # The second to arrive now leaves
  108. 'John'
  109. >>> queue # Remaining queue in order of arrival
  110. deque(['Michael', 'Terry', 'Graham'])
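As a rough sketch of the point made above, a list and a deque can both serve items in first-in, first-out order; the difference lies in cost, not in the result. The names ``list_queue`` and ``deque_queue`` are illustrative only.

```python
from collections import deque

names = ["Eric", "John", "Michael"]

list_queue = list(names)
deque_queue = deque(names)

# Both produce the same order of departure. list.pop(0) has to shift every
# remaining element one position to the left; deque.popleft() does not.
from_list = [list_queue.pop(0) for _ in range(len(names))]
from_deque = [deque_queue.popleft() for _ in range(len(names))]
```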
  111. .. _tut-functional:
  112. Functional Programming Tools
  113. ----------------------------
  114. There are three built-in functions that are very useful when used with lists:
  115. :func:`filter`, :func:`map`, and :func:`reduce`.
  116. ``filter(function, sequence)`` returns a sequence consisting of those items from
  117. the sequence for which ``function(item)`` is true. If *sequence* is a
  118. :class:`str` or :class:`tuple`, the result will be of the same type;
  119. otherwise, it is always a :class:`list`. For example, to select the numbers in a range that are divisible by neither 2 nor 3::
  120. >>> def f(x): return x % 2 != 0 and x % 3 != 0
  121. ...
  122. >>> filter(f, range(2, 25))
  123. [5, 7, 11, 13, 17, 19, 23]
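Note that the predicate ``f`` above only rejects multiples of 2 and 3, so it is not a primality test; the result contains only primes because the range stops at 24. The sketch below widens the range (an assumption made purely for illustration) and wraps the call in ``list()`` so that it behaves the same where :func:`filter` returns an iterator, as it does in Python 3.

```python
def f(x):
    # True for numbers divisible by neither 2 nor 3
    return x % 2 != 0 and x % 3 != 0

selected = list(filter(f, range(2, 40)))

# Items that passed the filter but have a divisor other than 1 and themselves
non_primes = [n for n in selected if any(n % d == 0 for d in range(2, n))]
```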
  124. ``map(function, sequence)`` calls ``function(item)`` for each of the sequence's
  125. items and returns a list of the return values. For example, to compute some
  126. cubes::
  127. >>> def cube(x): return x*x*x
  128. ...
  129. >>> map(cube, range(1, 11))
  130. [1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]
  131. More than one sequence may be passed; the function must then have as many
  132. arguments as there are sequences and is called with the corresponding item from
  133. each sequence (or ``None`` if some sequence is shorter than another). For
  134. example::
  135. >>> seq = range(8)
  136. >>> def add(x, y): return x+y
  137. ...
  138. >>> map(add, seq, seq)
  139. [0, 2, 4, 6, 8, 10, 12, 14]
  140. ``reduce(function, sequence)`` returns a single value constructed by calling the
  141. binary function *function* on the first two items of the sequence, then on the
  142. result and the next item, and so on. For example, to compute the sum of the
  143. numbers 1 through 10::
  144. >>> def add(x,y): return x+y
  145. ...
  146. >>> reduce(add, range(1, 11))
  147. 55
  148. If there's only one item in the sequence, its value is returned; if the sequence
  149. is empty, an exception is raised.
  150. A third argument can be passed to indicate the starting value. In this case the
  151. starting value is returned for an empty sequence, and the function is first
  152. applied to the starting value and the first sequence item, then to the result
  153. and the next item, and so on. For example, ::
  154. >>> def sum(seq):
  155. ...     def add(x,y): return x+y
  156. ...     return reduce(add, seq, 0)
  157. ...
  158. >>> sum(range(1, 11))
  159. 55
  160. >>> sum([])
  161. 0
  162. Don't use this example's definition of :func:`sum`: since summing numbers is
  163. such a common need, a built-in function ``sum(sequence)`` is already provided,
  164. and works exactly like this.
  165. .. versionadded:: 2.3
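The behaviour of :func:`reduce` described above (with and without a starting value) can be sketched in plain Python. The helper ``fold`` below is hypothetical and exists only to illustrate the steps; it is not how the built-in is implemented. The import from :mod:`functools` is used so that the comparison also runs on Python 3, where :func:`reduce` is no longer a built-in.

```python
from functools import reduce   # available in Python 2.6+ and Python 3

_MISSING = object()            # sentinel meaning "no starting value given"

def fold(function, sequence, start=_MISSING):
    """Illustrative stand-in for reduce(); hypothetical helper."""
    iterator = iter(sequence)
    if start is _MISSING:
        try:
            result = next(iterator)      # first item seeds the result
        except StopIteration:
            raise TypeError("fold() of empty sequence with no starting value")
    else:
        result = start                   # starting value seeds the result
    for item in iterator:
        result = function(result, item)  # combine running result with next item
    return result

def add(x, y):
    return x + y

total = fold(add, range(1, 11))          # ((1 + 2) + 3) + ... + 10
single = fold(add, [42])                 # one item: returned unchanged
empty_total = fold(add, [], 0)           # empty sequence: starting value
matches_builtin = (total == reduce(add, range(1, 11)))
```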
  166. List Comprehensions
  167. -------------------
  168. List comprehensions provide a concise way to create lists without resorting to
  169. :func:`map`, :func:`filter` and/or :keyword:`lambda`. The resulting list
  170. definition is often clearer than lists built using those constructs.
  171. Each list comprehension consists of an expression followed by a :keyword:`for`
  172. clause, then zero or more :keyword:`for` or :keyword:`if` clauses. The result
  173. will be a list resulting from evaluating the expression in the context of the
  174. :keyword:`for` and :keyword:`if` clauses which follow it. If the expression
  175. would evaluate to a tuple, it must be parenthesized. ::
  176. >>> freshfruit = [' banana', ' loganberry ', 'passion fruit ']
  177. >>> [weapon.strip() for weapon in freshfruit]
  178. ['banana', 'loganberry', 'passion fruit']
  179. >>> vec = [2, 4, 6]
  180. >>> [3*x for x in vec]
  181. [6, 12, 18]
  182. >>> [3*x for x in vec if x > 3]
  183. [12, 18]
  184. >>> [3*x for x in vec if x < 2]
  185. []
  186. >>> [[x,x**2] for x in vec]
  187. [[2, 4], [4, 16], [6, 36]]
  188. >>> [x, x**2 for x in vec] # error - parens required for tuples
  189.   File "<stdin>", line 1, in ?
  190.     [x, x**2 for x in vec]
  191.                ^
  192. SyntaxError: invalid syntax
  193. >>> [(x, x**2) for x in vec]
  194. [(2, 4), (4, 16), (6, 36)]
  195. >>> vec1 = [2, 4, 6]
  196. >>> vec2 = [4, 3, -9]
  197. >>> [x*y for x in vec1 for y in vec2]
  198. [8, 6, -18, 16, 12, -36, 24, 18, -54]
  199. >>> [x+y for x in vec1 for y in vec2]
  200. [6, 5, -7, 8, 7, -5, 10, 9, -3]
  201. >>> [vec1[i]*vec2[i] for i in range(len(vec1))]
  202. [8, 12, -54]
  203. List comprehensions are much more flexible than :func:`map` and can be applied
  204. to complex expressions and nested functions::
  205. >>> [str(round(355/113.0, i)) for i in range(1,6)]
  206. ['3.1', '3.14', '3.142', '3.1416', '3.14159']
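One way to read a comprehension with several :keyword:`for` and :keyword:`if` clauses is to expand it into nested statements in the same left-to-right order. The sketch below adds an ``if`` clause to one of the examples above purely for illustration; the names ``products`` and ``expanded`` are not part of the original text.

```python
vec1 = [2, 4, 6]
vec2 = [4, 3, -9]

# Comprehension form: expression first, then the clauses
products = [x * y for x in vec1 for y in vec2 if x * y > 0]

# Equivalent statements: the clauses nest in the order they are written
expanded = []
for x in vec1:
    for y in vec2:
        if x * y > 0:
            expanded.append(x * y)
```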
  207. Nested List Comprehensions
  208. --------------------------
  209. If you've got the stomach for it, list comprehensions can be nested. They are a
  210. powerful tool but -- like all powerful tools -- they need to be used carefully,
  211. if at all.
  212. Consider the following example of a 3x3 matrix held as a list containing three
  213. lists, one list per row::
  214. >>> mat = [
  215. ... [1, 2, 3],
  216. ... [4, 5, 6],
  217. ... [7, 8, 9],
  218. ... ]
  219. Now, if you wanted to swap rows and columns, you could use a list
  220. comprehension::
  221. >>> print [[row[i] for row in mat] for i in [0, 1, 2]]
  222. [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
  223. Special care has to be taken for the *nested* list comprehension:
  224. To avoid apprehension when nesting list comprehensions, read from right to
  225. left.
  226. A more verbose version of this snippet shows the flow explicitly::
  227.     for i in [0, 1, 2]:
  228.         for row in mat:
  229.             print row[i],
  230.         print
  231. In the real world, you should prefer built-in functions to complex flow statements.
  232. The :func:`zip` function would do a great job for this use case::
  233. >>> zip(*mat)
  234. [(1, 4, 7), (2, 5, 8), (3, 6, 9)]
  235. See :ref:`tut-unpacking-arguments` for details on the asterisk in this line.
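The three approaches to transposing the matrix can be compared side by side. This is a minimal sketch that collects results in lists instead of printing them, so that it runs on both Python 2 and Python 3; the names ``nested``, ``explicit`` and ``zipped`` are illustrative.

```python
mat = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

# Nested list comprehension: the rightmost "for" is the outer loop
nested = [[row[i] for row in mat] for i in range(len(mat[0]))]

# The same flow written out as explicit loops
explicit = []
for i in range(len(mat[0])):
    column = []
    for row in mat:
        column.append(row[i])
    explicit.append(column)

# zip(*mat) pairs up the i-th items of every row; it yields tuples,
# so each one is converted to a list for comparison
zipped = [list(column) for column in zip(*mat)]
```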
  236. .. _tut-del:
  237. The :keyword:`del` statement
  238. ============================
  239. There is a way to remove an item from a list given its index instead of its
  240. value: the :keyword:`del` statement. This differs from the :meth:`pop` method,
  241. which returns a value. The :keyword:`del` statement can also be used to remove
  242. slices from a list or clear the entire list (which we did earlier by assignment
  243. of an empty list to the slice). For example::
  244. >>> a = [-1, 1, 66.25, 333, 333, 1234.5]
  245. >>> del a[0]
  246. >>> a
  247. [1, 66.25, 333, 333, 1234.5]
  248. >>> del a[2:4]
  249. >>> a
  250. [1, 66.25, 1234.5]
  251. >>> del a[:]
  252. >>> a
  253. []
  254. :keyword:`del` can also be used to delete entire variables::
  255. >>> del a
  256. Referencing the name ``a`` hereafter is an error (at least until another value
  257. is assigned to it). We'll find other uses for :keyword:`del` later.
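The differences between removing by index with :keyword:`del`, removing by index with :meth:`pop`, and removing by value with :meth:`remove` can be sketched as follows. The flag ``name_still_bound`` is an illustrative name used to record that referencing a deleted variable raises ``NameError``.

```python
a = [-1, 1, 66.25, 333, 333, 1234.5]

popped = a.pop(0)       # removes by index and returns the item (-1)
del a[0]                # removes by index, returns nothing (drops 1)
a.remove(333)           # removes by value: only the first 333 goes
remaining = list(a)     # copy taken before the name is deleted

del a                   # deletes the variable itself
try:
    a                   # the name no longer refers to anything
    name_still_bound = True
except NameError:
    name_still_bound = False
```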
  258. .. _tut-tuples:
  259. Tuples and Sequences
  260. ====================
  261. We saw that lists and strings have many common properties, such as indexing and
  262. slicing operations. They are two examples of *sequence* data types (see
  263. :ref:`typesseq`). Since Python is an evolving language, other sequence data
  264. types may be added. There is also another standard sequence data type: the
  265. *tuple*.
  266. A tuple consists of a number of values separated by commas, for instance::
  267. >>> t = 12345, 54321, 'hello!'
  268. >>> t[0]
  269. 12345
  270. >>> t
  271. (12345, 54321, 'hello!')
  272. >>> # Tuples may be nested:
  273. ... u = t, (1, 2, 3, 4, 5)
  274. >>> u
  275. ((12345, 54321, 'hello!'), (1, 2, 3, 4, 5))
  276. As you see, on output tuples are always enclosed in parentheses, so that nested
  277. tuples are interpreted correctly; they may be input with or without surrounding
  278. parentheses, although often parentheses are necessary anyway (if the tuple is
  279. part of a larger expression).
  280. Tuples have many uses. For example: (x, y) coordinate pairs, employee records
  281. from a database, etc. Tuples, like strings, are immutable: it is not possible
  282. to assign to the individual items of a tuple (you can simulate much of the same
  283. effect with slicing and concatenation, though). It is also possible to create
  284. tuples which contain mutable objects, such as lists.
  285. A special problem is the construction of tuples containing 0 or 1 items: the
  286. syntax has some extra quirks to accommodate these. Empty tuples are constructed
  287. by an empty pair of parentheses; a tuple with one item is constructed by
  288. following a value with a comma (it is not sufficient to enclose a single value
  289. in parentheses). Ugly, but effective. For example::
  290. >>> empty = ()
  291. >>> singleton = 'hello', # <-- note trailing comma
  292. >>> len(empty)
  293. 0
  294. >>> len(singleton)
  295. 1
  296. >>> singleton
  297. ('hello',)
  298. The statement ``t = 12345, 54321, 'hello!'`` is an example of *tuple packing*:
  299. the values ``12345``, ``54321`` and ``'hello!'`` are packed together in a tuple.
  300. The reverse operation is also possible::
  301. >>> x, y, z = t
  302. This is called, appropriately enough, *sequence unpacking* and works for any
  303. sequence on the right-hand side. Sequence unpacking requires the list of
  304. variables on the left to have the same number of elements as the length of the
  305. sequence. Note that multiple assignment is really just a combination of tuple
  306. packing and sequence unpacking.
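A short sketch of the points above: packing and unpacking, multiple assignment used to swap two names, the error raised when the number of names does not match, and the immutability of tuples. The flag names and ``holder`` are illustrative only.

```python
t = 12345, 54321, 'hello!'      # tuple packing
x, y, z = t                     # sequence unpacking

x, y = y, x                     # multiple assignment: pack (y, x), then unpack

try:
    p, q = t                    # three values, only two names
    unpack_failed = False
except ValueError:
    unpack_failed = True

try:
    t[0] = 1                    # tuples do not support item assignment
    assign_failed = False
except TypeError:
    assign_failed = True

holder = ([1, 2], 'fixed')      # a tuple may contain a mutable object
holder[0].append(3)             # the list inside can still change
```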
  307. .. XXX Add a bit on the difference between tuples and lists.
  308. .. _tut-sets:
  309. Sets
  310. ====
  311. Python also includes a data type for *sets*. A set is an unordered collection
  312. with no duplicate elements. Basic uses include membership testing and
  313. eliminating duplicate entries. Set objects also support mathematical operations
  314. like union, intersection, difference, and symmetric difference.
  315. Here is a brief demonstration::
  316. >>> basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
  317. >>> fruit = set(basket) # create a set without duplicates
  318. >>> fruit
  319. set(['orange', 'pear', 'apple', 'banana'])
  320. >>> 'orange' in fruit # fast membership testing
  321. True
  322. >>> 'crabgrass' in fruit
  323. False
  324. >>> # Demonstrate set operations on unique letters from two words
  325. ...
  326. >>> a = set('abracadabra')
  327. >>> b = set('alacazam')
  328. >>> a # unique letters in a
  329. set(['a', 'r', 'b', 'c', 'd'])
  330. >>> a - b # letters in a but not in b
  331. set(['r', 'd', 'b'])
  332. >>> a | b # letters in either a or b
  333. set(['a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'])
  334. >>> a & b # letters in both a and b
  335. set(['a', 'c'])
  336. >>> a ^ b # letters in a or b but not both
  337. set(['r', 'd', 'b', 'm', 'z', 'l'])
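Because sets are unordered, the order in which elements are displayed may differ from the session above. The following sketch wraps each result in :func:`sorted` to get a predictable order, and checks that the operators agree with the corresponding set methods (``difference``, ``union``, ``intersection`` and ``symmetric_difference``).

```python
a = set('abracadabra')
b = set('alacazam')

only_a = sorted(a - b)      # letters in a but not in b
either = sorted(a | b)      # letters in either a or b
both = sorted(a & b)        # letters in both a and b
one_side = sorted(a ^ b)    # letters in a or b but not both

# Each operator has an equivalent method
methods_agree = (
    a - b == a.difference(b)
    and a | b == a.union(b)
    and a & b == a.intersection(b)
    and a ^ b == a.symmetric_difference(b)
)
```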
  338. .. _tut-dictionaries:
  339. Dictionaries
  340. ============
  341. Another useful data type built into Python is the *dictionary* (see
  342. :ref:`typesmapping`). Dictionaries are sometimes found in other languages as
  343. "associative memories" or "associative arrays". Unlike sequences, which are
  344. indexed by a range of numbers, dictionaries are indexed by *keys*, which can be
  345. any immutable type; strings and numbers can always be keys. Tuples can be used
  346. as keys if they contain only strings, numbers, or tuples; if a tuple contains
  347. any mutable object either directly or indirectly, it cannot be used as a key.
  348. You can't use lists as keys, since lists can be modified in place using index
  349. assignments, slice assignments, or methods like :meth:`append` and
  350. :meth:`extend`.
  351. It is best to think of a dictionary as an unordered set of *key: value* pairs,
  352. with the requirement that the keys are unique (within one dictionary). A pair of
  353. braces creates an empty dictionary: ``{}``. Placing a comma-separated list of
  354. key:value pairs within the braces adds initial key:value pairs to the
  355. dictionary; this is also the way dictionaries are written on output.
  356. The main operations on a dictionary are storing a value with some key and
  357. extracting the value given the key. It is also possible to delete a key:value
  358. pair with ``del``. If you store using a key that is already in use, the old
  359. value associated with that key is forgotten. It is an error to extract a value
  360. using a non-existent key.
  361. The :meth:`keys` method of a dictionary object returns a list of all the keys
  362. used in the dictionary, in arbitrary order (if you want it sorted, just apply
  363. the :meth:`sort` method to the list of keys). To check whether a single key is
  364. in the dictionary, use the :keyword:`in` keyword.
  365. Here is a small example using a dictionary::
  366. >>> tel = {'jack': 4098, 'sape': 4139}
  367. >>> tel['guido'] = 4127
  368. >>> tel
  369. {'sape': 4139, 'guido': 4127, 'jack': 4098}
  370. >>> tel['jack']
  371. 4098
  372. >>> del tel['sape']
  373. >>> tel['irv'] = 4127
  374. >>> tel
  375. {'guido': 4127, 'irv': 4127, 'jack': 4098}
  376. >>> tel.keys()
  377. ['guido', 'irv', 'jack']
  378. >>> 'guido' in tel
  379. True
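The rules stated earlier in this section (overwriting an existing key, the error on a missing key, and which objects may serve as keys) can be sketched like this. The dictionary ``grid`` and the flag names are assumptions introduced only for the example.

```python
tel = {'jack': 4098, 'sape': 4139}

tel['jack'] = 4100                 # a key already in use: old value is forgotten

try:
    tel['guido']                   # extracting with a non-existent key
    lookup_failed = False
except KeyError:
    lookup_failed = True

has_guido = 'guido' in tel         # a membership test never raises

grid = {(0, 0): 'origin'}          # a tuple of numbers is a valid key

try:
    grid[[0, 1]] = 'list key'      # lists are mutable and cannot be keys
    list_key_rejected = False
except TypeError:
    list_key_rejected = True

try:
    grid[(0, [1])] = 'nested'      # a tuple that contains a list fails too
    nested_key_rejected = False
except TypeError:
    nested_key_rejected = True
```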
  380. The :func:`dict` constructor builds dictionaries directly from lists of
  381. key-value pairs stored as tuples. When the pairs form a pattern, list
  382. comprehensions can compactly specify the key-value list. ::
  383. >>> dict([('sape', 4139), ('guido', 4127), ('jack', 4098)])
  384. {'sape': 4139, 'jack': 4098, 'guido': 4127}
  385. >>> dict([(x, x**2) for x in (2, 4, 6)]) # use a list comprehension
  386. {2: 4, 4: 16, 6: 36}
  387. Later in the tutorial, we will learn about generator expressions, which are even
  388. better suited for the task of supplying key-value pairs to the :func:`dict`
  389. constructor.
  390. When the keys are simple strings, it is sometimes easier to specify pairs using
  391. keyword arguments::
  392. >>> dict(sape=4139, guido=4127, jack=4098)
  393. {'sape': 4139, 'jack': 4098, 'guido': 4127}
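The different ways of building a dictionary shown above all produce equal dictionaries; only the notation differs. The sketch below also passes a generator expression straight to :func:`dict`, which avoids building the intermediate list.

```python
from_literal = {'sape': 4139, 'guido': 4127, 'jack': 4098}
from_pairs = dict([('sape', 4139), ('guido', 4127), ('jack', 4098)])
from_keywords = dict(sape=4139, guido=4127, jack=4098)

# Generator expression: no square brackets, no temporary list
squares = dict((x, x**2) for x in (2, 4, 6))
```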
  394. .. _tut-loopidioms:
  395. Looping Techniques
  396. ==================
  397. When looping through dictionaries, the key and corresponding value can be
  398. retrieved at the same time using the :meth:`iteritems` method. ::
  399. >>> knights = {'gallahad': 'the pure', 'robin': 'the brave'}
  400. >>> for k, v in knights.iteritems():
  401. ...     print k, v
  402. ...
  403. gallahad the pure
  404. robin the brave
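The :meth:`iteritems` method exists only in Python 2. As a hedged, version-neutral variant, the sketch below uses :meth:`items`, which is available on dictionaries in both Python 2 and Python 3, and sorts the pairs so that the order is predictable. The list ``lines`` is an illustrative stand-in for printing.

```python
knights = {'gallahad': 'the pure', 'robin': 'the brave'}

lines = []
for k, v in sorted(knights.items()):   # (key, value) pairs, sorted by key
    lines.append('%s %s' % (k, v))
```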
  405. When looping through a sequence, the position index and corresponding value can
  406. be retrieved at the same time using the :func:`enumerate` function. ::
  407. >>> for i, v in enumerate(['tic', 'tac', 'toe']):
  408. ...     print i, v
  409. ...
  410. 0 tic
  411. 1 tac
  412. 2 toe
  413. To loop over two or more sequences at the same time, the entries can be paired
  414. with the :func:`zip` function. ::
  415. >>> questions = ['name', 'quest', 'favorite color']
  416. >>> answers = ['lancelot', 'the holy grail', 'blue']
  417. >>> for q, a in zip(questions, answers):
  418. ...     print 'What is your {0}? It is {1}.'.format(q, a)
  419. ...
  420. What is your name? It is lancelot.
  421. What is your quest? It is the holy grail.
  422. What is your favorite color? It is blue.
  423. To loop over a sequence in reverse, first specify the sequence in a forward
  424. direction and then call the :func:`reversed` function. ::
  425. >>> for i in reversed(xrange(1,10,2)):
  426. ...     print i
  427. ...
  428. 9
  429. 7
  430. 5
  431. 3
  432. 1
  433. To loop over a sequence in sorted order, use the :func:`sorted` function, which
  434. returns a new sorted list while leaving the source unaltered. ::
  435. >>> basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
  436. >>> for f in sorted(set(basket)):
  437. ...     print f
  438. ...
  439. apple
  440. banana
  441. orange
  442. pear
  443. .. _tut-conditions:
  444. More on Conditions
  445. ==================
  446. The conditions used in ``while`` and ``if`` statements can contain any
  447. operators, not just comparisons.
  448. The comparison operators ``in`` and ``not in`` check whether a value occurs
  449. (does not occur) in a sequence. The operators ``is`` and ``is not`` compare
  450. whether two objects are really the same object; this only matters for mutable
  451. objects like lists. All comparison operators have the same priority, which is
  452. lower than that of all numerical operators.
  453. Comparisons can be chained. For example, ``a < b == c`` tests whether ``a`` is
  454. less than ``b`` and moreover ``b`` equals ``c``.
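A chained comparison behaves like the individual comparisons joined with ``and``, except that the middle operand is evaluated only once. The sketch below makes this visible with a hypothetical helper, ``b_value``, that records each call.

```python
calls = []

def b_value():
    """Hypothetical helper: returns 2 and records that it was called."""
    calls.append('b')
    return 2

a, c = 1, 2

chained = a < b_value() == c          # b_value() is called once
calls_for_chained = len(calls)

del calls[:]                          # reset the record
spelled_out = (a < b_value()) and (b_value() == c)   # called twice here
calls_for_spelled_out = len(calls)
```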
  455. Comparisons may be combined using the Boolean operators ``and`` and ``or``, and
  456. the outcome of a comparison (or of any other Boolean expression) may be negated
  457. with ``not``. These have lower priorities than comparison operators; between
  458. them, ``not`` has the highest priority and ``or`` the lowest, so that ``A and
  459. not B or C`` is equivalent to ``(A and (not B)) or C``. As always, parentheses
  460. can be used to express the desired composition.
  461. The Boolean operators ``and`` and ``or`` are so-called *short-circuit*
  462. operators: their arguments are evaluated from left to right, and evaluation
  463. stops as soon as the outcome is determined. For example, if ``A`` and ``C`` are
  464. true but ``B`` is false, ``A and B and C`` does not evaluate the expression
  465. ``C``. When used as a general value and not as a Boolean, the return value of a
  466. short-circuit operator is the last evaluated argument.
  467. It is possible to assign the result of a comparison or other Boolean expression
  468. to a variable. For example, ::
  469. >>> string1, string2, string3 = '', 'Trondheim', 'Hammer Dance'
  470. >>> non_null = string1 or string2 or string3
  471. >>> non_null
  472. 'Trondheim'
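Short-circuit evaluation can be observed by recording which operands are actually evaluated. In this sketch the helper ``note`` is hypothetical; it logs a label and returns the given value unchanged.

```python
evaluated = []

def note(label, value):
    """Hypothetical helper: log the label, then return the value."""
    evaluated.append(label)
    return value

# A and C are true, B is false: evaluation stops at B, so C is never reached
result = note('A', True) and note('B', False) and note('C', True)

string1, string2, string3 = '', 'Trondheim', 'Hammer Dance'
first_true = string1 or string2 or string3    # first true operand
last_of_and = string2 and string3             # all true: last operand
stopped_and = string1 and string2             # stops at the first false operand
```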
  473. Note that in Python, unlike C, assignment cannot occur inside expressions. C
  474. programmers may grumble about this, but it avoids a common class of problems
  475. encountered in C programs: typing ``=`` in an expression when ``==`` was
  476. intended.
  477. .. _tut-comparing:
  478. Comparing Sequences and Other Types
  479. ===================================
  480. Sequence objects may be compared to other objects with the same sequence type.
  481. The comparison uses *lexicographical* ordering: first the first two items are
  482. compared, and if they differ this determines the outcome of the comparison; if
  483. they are equal, the next two items are compared, and so on, until either
  484. sequence is exhausted. If two items to be compared are themselves sequences of
  485. the same type, the lexicographical comparison is carried out recursively. If
  486. all items of two sequences compare equal, the sequences are considered equal.
  487. If one sequence is an initial sub-sequence of the other, the shorter sequence is
  488. the smaller (lesser) one. Lexicographical ordering for strings uses the ASCII
  489. ordering for individual characters. Some examples of comparisons between
  490. sequences of the same type::
  491. (1, 2, 3) < (1, 2, 4)
  492. [1, 2, 3] < [1, 2, 4]
  493. 'ABC' < 'C' < 'Pascal' < 'Python'
  494. (1, 2, 3, 4) < (1, 2, 4)
  495. (1, 2) < (1, 2, -1)
  496. (1, 2, 3) == (1.0, 2.0, 3.0)
  497. (1, 2, ('aa', 'ab')) < (1, 2, ('abc', 'a'), 4)
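Each of the comparisons listed above evaluates to true, which can be confirmed with a small sketch. Only same-type comparisons are included here, so the code does not depend on any rule for ordering objects of different types.

```python
checks = [
    (1, 2, 3) < (1, 2, 4),                  # first difference decides
    [1, 2, 3] < [1, 2, 4],
    'ABC' < 'C' < 'Pascal' < 'Python',      # character by character
    (1, 2, 3, 4) < (1, 2, 4),               # 3 < 4 decides before length matters
    (1, 2) < (1, 2, -1),                    # an initial sub-sequence is smaller
    (1, 2, 3) == (1.0, 2.0, 3.0),           # items compare by numeric value
    (1, 2, ('aa', 'ab')) < (1, 2, ('abc', 'a'), 4),   # nested tuples recurse
]

all_true = all(checks)
```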
  498. Note that comparing objects of different types is legal. The outcome is
  499. deterministic but arbitrary: the types are ordered by their name. Thus, a list
  500. is always smaller than a string, a string is always smaller than a tuple, etc.
  501. [#]_ Mixed numeric types are compared according to their numeric value, so 0
  502. equals 0.0, etc.
  503. .. rubric:: Footnotes
  504. .. [#] The rules for comparing objects of different types should not be relied upon;
  505. they may change in a future version of the language.