#### /Doc/tutorial/datastructures.rst

.. _tut-structures:

***************
Data Structures
***************

This chapter describes some things you've learned about already in more detail,
and adds some new things as well.


.. _tut-morelists:

More on Lists
=============

The list data type has some more methods. Here are all of the methods of list
objects:


.. method:: list.append(x)
   :noindex:

   Add an item to the end of the list; equivalent to ``a[len(a):] = [x]``.


.. method:: list.extend(L)
   :noindex:

   Extend the list by appending all the items in the given list; equivalent to
   ``a[len(a):] = L``.


.. method:: list.insert(i, x)
   :noindex:

   Insert an item at a given position. The first argument is the index of the
   element before which to insert, so ``a.insert(0, x)`` inserts at the front of
   the list, and ``a.insert(len(a), x)`` is equivalent to ``a.append(x)``.


.. method:: list.remove(x)
   :noindex:

   Remove the first item from the list whose value is *x*. It is an error if there
   is no such item.


.. method:: list.pop([i])
   :noindex:

   Remove the item at the given position in the list, and return it. If no index
   is specified, ``a.pop()`` removes and returns the last item in the list. (The
   square brackets around the *i* in the method signature denote that the parameter
   is optional, not that you should type square brackets at that position. You
   will see this notation frequently in the Python Library Reference.)


.. method:: list.index(x)
   :noindex:

   Return the index in the list of the first item whose value is *x*. It is an
   error if there is no such item.


.. method:: list.count(x)
   :noindex:

   Return the number of times *x* appears in the list.


.. method:: list.sort()
   :noindex:

   Sort the items of the list, in place.


.. method:: list.reverse()
   :noindex:

   Reverse the elements of the list, in place.

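The equivalences stated in the method descriptions can be checked directly. The
following is only a small sketch, written without ``print`` so that it should
run unchanged on Python 2 and Python 3; the variable names are illustrative.

```python
# Check the slice-assignment equivalences stated for append, extend and insert.
a = [1, 2, 3]
b = list(a)               # independent copy to apply the slice forms to

a.append(4)
b[len(b):] = [4]          # same effect as b.append(4)
assert a == b == [1, 2, 3, 4]

a.extend([5, 6])
b[len(b):] = [5, 6]       # same effect as b.extend([5, 6])
assert a == b == [1, 2, 3, 4, 5, 6]

a.insert(0, 0)            # insert at the front of the list
a.insert(len(a), 7)       # same effect as a.append(7)
assert a == [0, 1, 2, 3, 4, 5, 6, 7]

last = a.pop()            # no index: removes and returns the last item
first = a.pop(0)          # explicit index: removes and returns that item
assert (last, first) == (7, 0)

# remove() and index() raise ValueError when the value is absent.
try:
    a.remove(99)
    missing_is_error = False
except ValueError:
    missing_is_error = True
```

After the two ``pop`` calls the list is back to ``[1, 2, 3, 4, 5, 6]``, and
``missing_is_error`` is true because ``99`` was never in the list.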

An example that uses most of the list methods::

   >>> a = [66.25, 333, 333, 1, 1234.5]
   >>> print a.count(333), a.count(66.25), a.count('x')
   2 1 0
   >>> a.insert(2, -1)
   >>> a.append(333)
   >>> a
   [66.25, 333, -1, 333, 1, 1234.5, 333]
   >>> a.index(333)
   1
   >>> a.remove(333)
   >>> a
   [66.25, -1, 333, 1, 1234.5, 333]
   >>> a.reverse()
   >>> a
   [333, 1234.5, 1, 333, -1, 66.25]
   >>> a.sort()
   >>> a
   [-1, 1, 66.25, 333, 333, 1234.5]


.. _tut-lists-as-stacks:

Using Lists as Stacks
---------------------

.. sectionauthor:: Ka-Ping Yee <ping@lfw.org>


The list methods make it very easy to use a list as a stack, where the last
element added is the first element retrieved ("last-in, first-out"). To add an
item to the top of the stack, use :meth:`append`. To retrieve an item from the
top of the stack, use :meth:`pop` without an explicit index. For example::

   >>> stack = [3, 4, 5]
   >>> stack.append(6)
   >>> stack.append(7)
   >>> stack
   [3, 4, 5, 6, 7]
   >>> stack.pop()
   7
   >>> stack
   [3, 4, 5, 6]
   >>> stack.pop()
   6
   >>> stack.pop()
   5
   >>> stack
   [3, 4]


.. _tut-lists-as-queues:

Using Lists as Queues
---------------------

.. sectionauthor:: Ka-Ping Yee <ping@lfw.org>

It is also possible to use a list as a queue, where the first element added is
the first element retrieved ("first-in, first-out"); however, lists are not
efficient for this purpose. While appends and pops from the end of a list are
fast, doing inserts or pops from the beginning of a list is slow (because all
of the other elements have to be shifted by one).

To implement a queue, use :class:`collections.deque` which was designed to
have fast appends and pops from both ends.
For example::

   >>> from collections import deque
   >>> queue = deque(["Eric", "John", "Michael"])
   >>> queue.append("Terry")           # Terry arrives
   >>> queue.append("Graham")          # Graham arrives
   >>> queue.popleft()                 # The first to arrive now leaves
   'Eric'
   >>> queue.popleft()                 # The second to arrive now leaves
   'John'
   >>> queue                           # Remaining queue in order of arrival
   deque(['Michael', 'Terry', 'Graham'])


.. _tut-functional:

Functional Programming Tools
----------------------------

There are three built-in functions that are very useful when used with lists:
:func:`filter`, :func:`map`, and :func:`reduce`.

``filter(function, sequence)`` returns a sequence consisting of those items from
the sequence for which ``function(item)`` is true. If *sequence* is a
:class:`str` or :class:`tuple`, the result will be of the same type;
otherwise, it is always a :class:`list`. For example, to compute some primes::

   >>> def f(x): return x % 2 != 0 and x % 3 != 0
   ...
   >>> filter(f, range(2, 25))
   [5, 7, 11, 13, 17, 19, 23]

``map(function, sequence)`` calls ``function(item)`` for each of the sequence's
items and returns a list of the return values. For example, to compute some
cubes::

   >>> def cube(x): return x*x*x
   ...
   >>> map(cube, range(1, 11))
   [1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]

More than one sequence may be passed; the function must then have as many
arguments as there are sequences and is called with the corresponding item from
each sequence (or ``None`` if some sequence is shorter than another). For
example::

   >>> seq = range(8)
   >>> def add(x, y): return x+y
   ...
   >>> map(add, seq, seq)
   [0, 2, 4, 6, 8, 10, 12, 14]

``reduce(function, sequence)`` returns a single value constructed by calling the
binary function *function* on the first two items of the sequence, then on the
result and the next item, and so on. For example, to compute the sum of the
numbers 1 through 10::

   >>> def add(x,y): return x+y
   ...
   >>> reduce(add, range(1, 11))
   55

If there's only one item in the sequence, its value is returned; if the sequence
is empty, an exception is raised.

A third argument can be passed to indicate the starting value. In this case the
starting value is returned for an empty sequence, and the function is first
applied to the starting value and the first sequence item, then to the result
and the next item, and so on. For example, ::

   >>> def sum(seq):
   ...     def add(x,y): return x+y
   ...     return reduce(add, seq, 0)
   ...
   >>> sum(range(1, 11))
   55
   >>> sum([])
   0

Don't use this example's definition of :func:`sum`: since summing numbers is
such a common need, a built-in function ``sum(sequence)`` is already provided,
and works exactly like this.

.. versionadded:: 2.3


List Comprehensions
-------------------

List comprehensions provide a concise way to create lists without resorting to
use of :func:`map`, :func:`filter` and/or :keyword:`lambda`. The resulting list
definition tends often to be clearer than lists built using those constructs.
Each list comprehension consists of an expression followed by a :keyword:`for`
clause, then zero or more :keyword:`for` or :keyword:`if` clauses. The result
will be a list resulting from evaluating the expression in the context of the
:keyword:`for` and :keyword:`if` clauses which follow it. If the expression
would evaluate to a tuple, it must be parenthesized.
::

   >>> freshfruit = [' banana', ' loganberry ', 'passion fruit ']
   >>> [weapon.strip() for weapon in freshfruit]
   ['banana', 'loganberry', 'passion fruit']
   >>> vec = [2, 4, 6]
   >>> [3*x for x in vec]
   [6, 12, 18]
   >>> [3*x for x in vec if x > 3]
   [12, 18]
   >>> [3*x for x in vec if x < 2]
   []
   >>> [[x,x**2] for x in vec]
   [[2, 4], [4, 16], [6, 36]]
   >>> [x, x**2 for x in vec]  # error - parens required for tuples
     File "<stdin>", line 1, in ?
       [x, x**2 for x in vec]
                  ^
   SyntaxError: invalid syntax
   >>> [(x, x**2) for x in vec]
   [(2, 4), (4, 16), (6, 36)]
   >>> vec1 = [2, 4, 6]
   >>> vec2 = [4, 3, -9]
   >>> [x*y for x in vec1 for y in vec2]
   [8, 6, -18, 16, 12, -36, 24, 18, -54]
   >>> [x+y for x in vec1 for y in vec2]
   [6, 5, -7, 8, 7, -5, 10, 9, -3]
   >>> [vec1[i]*vec2[i] for i in range(len(vec1))]
   [8, 12, -54]

List comprehensions are much more flexible than :func:`map` and can be applied
to complex expressions and nested functions::

   >>> [str(round(355/113.0, i)) for i in range(1,6)]
   ['3.1', '3.14', '3.142', '3.1416', '3.14159']


Nested List Comprehensions
--------------------------

If you've got the stomach for it, list comprehensions can be nested. They are a
powerful tool but -- like all powerful tools -- they need to be used carefully,
if at all.

Consider the following example of a 3x3 matrix held as a list containing three
lists, one list per row::

    >>> mat = [
    ...        [1, 2, 3],
    ...        [4, 5, 6],
    ...        [7, 8, 9],
    ...       ]

Now, if you wanted to swap rows and columns, you could use a list
comprehension::

    >>> print [[row[i] for row in mat] for i in [0, 1, 2]]
    [[1, 4, 7], [2, 5, 8], [3, 6, 9]]

Special care has to be taken for the *nested* list comprehension:

    To avoid apprehension when nesting list comprehensions, read from right to
    left.

A more verbose version of this snippet shows the flow explicitly::

    for i in [0, 1, 2]:
        for row in mat:
            print row[i],
        print

In the real world, you should prefer built-in functions to complex flow statements.
The :func:`zip` function would do a great job for this use case::

    >>> zip(*mat)
    [(1, 4, 7), (2, 5, 8), (3, 6, 9)]

See :ref:`tut-unpacking-arguments` for details on the asterisk in this line.

.. _tut-del:

The :keyword:`del` statement
============================

There is a way to remove an item from a list given its index instead of its
value: the :keyword:`del` statement. This differs from the :meth:`pop` method
which returns a value. The :keyword:`del` statement can also be used to remove
slices from a list or clear the entire list (which we did earlier by assignment
of an empty list to the slice). For example::

   >>> a = [-1, 1, 66.25, 333, 333, 1234.5]
   >>> del a[0]
   >>> a
   [1, 66.25, 333, 333, 1234.5]
   >>> del a[2:4]
   >>> a
   [1, 66.25, 1234.5]
   >>> del a[:]
   >>> a
   []

:keyword:`del` can also be used to delete entire variables::

   >>> del a

Referencing the name ``a`` hereafter is an error (at least until another value
is assigned to it). We'll find other uses for :keyword:`del` later.


.. _tut-tuples:

Tuples and Sequences
====================

We saw that lists and strings have many common properties, such as indexing and
slicing operations. They are two examples of *sequence* data types (see
:ref:`typesseq`). Since Python is an evolving language, other sequence data
types may be added. There is also another standard sequence data type: the
*tuple*.

A tuple consists of a number of values separated by commas, for instance::

   >>> t = 12345, 54321, 'hello!'
   >>> t[0]
   12345
   >>> t
   (12345, 54321, 'hello!')
   >>> # Tuples may be nested:
   ... u = t, (1, 2, 3, 4, 5)
   >>> u
   ((12345, 54321, 'hello!'), (1, 2, 3, 4, 5))

As you see, on output tuples are always enclosed in parentheses, so that nested
tuples are interpreted correctly; they may be input with or without surrounding
parentheses, although often parentheses are necessary anyway (if the tuple is
part of a larger expression).

Tuples have many uses. For example: (x, y) coordinate pairs, employee records
from a database, etc. Tuples, like strings, are immutable: it is not possible
to assign to the individual items of a tuple (you can simulate much of the same
effect with slicing and concatenation, though). It is also possible to create
tuples which contain mutable objects, such as lists.

A special problem is the construction of tuples containing 0 or 1 items: the
syntax has some extra quirks to accommodate these. Empty tuples are constructed
by an empty pair of parentheses; a tuple with one item is constructed by
following a value with a comma (it is not sufficient to enclose a single value
in parentheses). Ugly, but effective. For example::

   >>> empty = ()
   >>> singleton = 'hello',    # <-- note trailing comma
   >>> len(empty)
   0
   >>> len(singleton)
   1
   >>> singleton
   ('hello',)

The statement ``t = 12345, 54321, 'hello!'`` is an example of *tuple packing*:
the values ``12345``, ``54321`` and ``'hello!'`` are packed together in a tuple.
The reverse operation is also possible::

   >>> x, y, z = t

This is called, appropriately enough, *sequence unpacking* and works for any
sequence on the right-hand side. Sequence unpacking requires the list of
variables on the left to have the same number of elements as the length of the
sequence. Note that multiple assignment is really just a combination of tuple
packing and sequence unpacking.

.. XXX Add a bit on the difference between tuples and lists.


.. _tut-sets:

Sets
====

Python also includes a data type for *sets*. A set is an unordered collection
with no duplicate elements. Basic uses include membership testing and
eliminating duplicate entries. Set objects also support mathematical operations
like union, intersection, difference, and symmetric difference.

Here is a brief demonstration::

   >>> basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
   >>> fruit = set(basket)               # create a set without duplicates
   >>> fruit
   set(['orange', 'pear', 'apple', 'banana'])
   >>> 'orange' in fruit                 # fast membership testing
   True
   >>> 'crabgrass' in fruit
   False

   >>> # Demonstrate set operations on unique letters from two words
   ...
   >>> a = set('abracadabra')
   >>> b = set('alacazam')
   >>> a                                  # unique letters in a
   set(['a', 'r', 'b', 'c', 'd'])
   >>> a - b                              # letters in a but not in b
   set(['r', 'd', 'b'])
   >>> a | b                              # letters in either a or b
   set(['a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'])
   >>> a & b                              # letters in both a and b
   set(['a', 'c'])
   >>> a ^ b                              # letters in a or b but not both
   set(['r', 'd', 'b', 'm', 'z', 'l'])


.. _tut-dictionaries:

Dictionaries
============

Another useful data type built into Python is the *dictionary* (see
:ref:`typesmapping`). Dictionaries are sometimes found in other languages as
"associative memories" or "associative arrays". Unlike sequences, which are
indexed by a range of numbers, dictionaries are indexed by *keys*, which can be
any immutable type; strings and numbers can always be keys. Tuples can be used
as keys if they contain only strings, numbers, or tuples; if a tuple contains
any mutable object either directly or indirectly, it cannot be used as a key.
You can't use lists as keys, since lists can be modified in place using index
assignments, slice assignments, or methods like :meth:`append` and
:meth:`extend`.

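The rule about keys can be tried out in a few lines. This is a minimal sketch
rather than anything from the reference documentation; the ``grid`` dictionary
and its contents are made up for illustration, and the code avoids ``print`` so
that it should run on both Python 2 and Python 3.

```python
# Tuples of immutable values are hashable, so they can serve as dictionary keys.
grid = {(0, 0): 'origin'}          # hypothetical data, for illustration only
grid[(1, 2)] = 'somewhere else'
assert grid[(0, 0)] == 'origin'

# A list is mutable and therefore unhashable: using it as a key raises TypeError.
try:
    grid[[0, 0]] = 'origin'
    list_key_rejected = False
except TypeError:
    list_key_rejected = True

# A tuple that contains a list is rejected as well, because hashing the tuple
# requires hashing each of its items.
try:
    grid[(0, [0])] = 'origin'
    nested_list_rejected = False
except TypeError:
    nested_list_rejected = True
```

Both flags end up true, and ``grid`` still holds only the two tuple keys, since
the rejected assignments never modified it.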

It is best to think of a dictionary as an unordered set of *key: value* pairs,
with the requirement that the keys are unique (within one dictionary). A pair of
braces creates an empty dictionary: ``{}``. Placing a comma-separated list of
key:value pairs within the braces adds initial key:value pairs to the
dictionary; this is also the way dictionaries are written on output.

The main operations on a dictionary are storing a value with some key and
extracting the value given the key. It is also possible to delete a key:value
pair with ``del``. If you store using a key that is already in use, the old
value associated with that key is forgotten. It is an error to extract a value
using a non-existent key.

The :meth:`keys` method of a dictionary object returns a list of all the keys
used in the dictionary, in arbitrary order (if you want it sorted, just apply
the :meth:`sort` method to the list of keys). To check whether a single key is
in the dictionary, use the :keyword:`in` keyword.

Here is a small example using a dictionary::

   >>> tel = {'jack': 4098, 'sape': 4139}
   >>> tel['guido'] = 4127
   >>> tel
   {'sape': 4139, 'guido': 4127, 'jack': 4098}
   >>> tel['jack']
   4098
   >>> del tel['sape']
   >>> tel['irv'] = 4127
   >>> tel
   {'guido': 4127, 'irv': 4127, 'jack': 4098}
   >>> tel.keys()
   ['guido', 'irv', 'jack']
   >>> 'guido' in tel
   True

The :func:`dict` constructor builds dictionaries directly from lists of
key-value pairs stored as tuples. When the pairs form a pattern, list
comprehensions can compactly specify the key-value list.
::

   >>> dict([('sape', 4139), ('guido', 4127), ('jack', 4098)])
   {'sape': 4139, 'jack': 4098, 'guido': 4127}
   >>> dict([(x, x**2) for x in (2, 4, 6)])     # use a list comprehension
   {2: 4, 4: 16, 6: 36}

Later in the tutorial, we will learn about Generator Expressions which are even
better suited for the task of supplying key-value pairs to the :func:`dict`
constructor.

When the keys are simple strings, it is sometimes easier to specify pairs using
keyword arguments::

   >>> dict(sape=4139, guido=4127, jack=4098)
   {'sape': 4139, 'jack': 4098, 'guido': 4127}


.. _tut-loopidioms:

Looping Techniques
==================

When looping through dictionaries, the key and corresponding value can be
retrieved at the same time using the :meth:`iteritems` method. ::

   >>> knights = {'gallahad': 'the pure', 'robin': 'the brave'}
   >>> for k, v in knights.iteritems():
   ...     print k, v
   ...
   gallahad the pure
   robin the brave

When looping through a sequence, the position index and corresponding value can
be retrieved at the same time using the :func:`enumerate` function. ::

   >>> for i, v in enumerate(['tic', 'tac', 'toe']):
   ...     print i, v
   ...
   0 tic
   1 tac
   2 toe

To loop over two or more sequences at the same time, the entries can be paired
with the :func:`zip` function. ::

   >>> questions = ['name', 'quest', 'favorite color']
   >>> answers = ['lancelot', 'the holy grail', 'blue']
   >>> for q, a in zip(questions, answers):
   ...     print 'What is your {0}?  It is {1}.'.format(q, a)
   ...
   What is your name?  It is lancelot.
   What is your quest?  It is the holy grail.
   What is your favorite color?  It is blue.

To loop over a sequence in reverse, first specify the sequence in a forward
direction and then call the :func:`reversed` function. ::

   >>> for i in reversed(xrange(1,10,2)):
   ...     print i
   ...
   9
   7
   5
   3
   1

To loop over a sequence in sorted order, use the :func:`sorted` function which
returns a new sorted list while leaving the source unaltered. ::

   >>> basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
   >>> for f in sorted(set(basket)):
   ...     print f
   ...
   apple
   banana
   orange
   pear


.. _tut-conditions:

More on Conditions
==================

The conditions used in ``while`` and ``if`` statements can contain any
operators, not just comparisons.

The comparison operators ``in`` and ``not in`` check whether a value occurs
(does not occur) in a sequence. The operators ``is`` and ``is not`` compare
whether two objects are really the same object; this only matters for mutable
objects like lists. All comparison operators have the same priority, which is
lower than that of all numerical operators.

Comparisons can be chained. For example, ``a < b == c`` tests whether ``a`` is
less than ``b`` and moreover ``b`` equals ``c``.

Comparisons may be combined using the Boolean operators ``and`` and ``or``, and
the outcome of a comparison (or of any other Boolean expression) may be negated
with ``not``. These have lower priorities than comparison operators; between
them, ``not`` has the highest priority and ``or`` the lowest, so that ``A and
not B or C`` is equivalent to ``(A and (not B)) or C``. As always, parentheses
can be used to express the desired composition.

The Boolean operators ``and`` and ``or`` are so-called *short-circuit*
operators: their arguments are evaluated from left to right, and evaluation
stops as soon as the outcome is determined. For example, if ``A`` and ``C`` are
true but ``B`` is false, ``A and B and C`` does not evaluate the expression
``C``. When used as a general value and not as a Boolean, the return value of a
short-circuit operator is the last evaluated argument.

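Short-circuit evaluation can be made visible by recording which operands are
actually evaluated. The sketch below assumes a helper named ``operand`` that is
invented for this illustration; it should run on both Python 2 and Python 3.

```python
calls = []  # records which operands were actually evaluated


def operand(name, value):
    """Hypothetical helper: note the evaluation, then return the value."""
    calls.append(name)
    return value


# A and C are true, B is false: evaluation stops at B, so C is never evaluated.
result = operand('A', 1) and operand('B', 0) and operand('C', 2)
and_calls = list(calls)
and_result = result         # the last evaluated argument (0), not a Boolean

# With ``or``, evaluation stops at the first true operand.
del calls[:]
result = operand('A', 0) or operand('B', 'x') or operand('C', 'y')
or_calls = list(calls)
or_result = result          # 'x', again the last evaluated argument
```

In both expressions only ``A`` and ``B`` are evaluated, and the value of the
whole expression is whatever ``B`` returned.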

It is possible to assign the result of a comparison or other Boolean expression
to a variable. For example, ::

   >>> string1, string2, string3 = '', 'Trondheim', 'Hammer Dance'
   >>> non_null = string1 or string2 or string3
   >>> non_null
   'Trondheim'

Note that in Python, unlike C, assignment cannot occur inside expressions. C
programmers may grumble about this, but it avoids a common class of problems
encountered in C programs: typing ``=`` in an expression when ``==`` was
intended.


.. _tut-comparing:

Comparing Sequences and Other Types
===================================

Sequence objects may be compared to other objects with the same sequence type.
The comparison uses *lexicographical* ordering: first the first two items are
compared, and if they differ this determines the outcome of the comparison; if
they are equal, the next two items are compared, and so on, until either
sequence is exhausted. If two items to be compared are themselves sequences of
the same type, the lexicographical comparison is carried out recursively. If
all items of two sequences compare equal, the sequences are considered equal.
If one sequence is an initial sub-sequence of the other, the shorter sequence is
the smaller (lesser) one. Lexicographical ordering for strings uses the ASCII
ordering for individual characters. Some examples of comparisons between
sequences of the same type::

   (1, 2, 3)              < (1, 2, 4)
   [1, 2, 3]              < [1, 2, 4]
   'ABC' < 'C' < 'Pascal' < 'Python'
   (1, 2, 3, 4)           < (1, 2, 4)
   (1, 2)                 < (1, 2, -1)
   (1, 2, 3)             == (1.0, 2.0, 3.0)
   (1, 2, ('aa', 'ab'))   < (1, 2, ('abc', 'a'), 4)

Note that comparing objects of different types is legal. The outcome is
deterministic but arbitrary: the types are ordered by their name. Thus, a list
is always smaller than a string, a string is always smaller than a tuple, etc.
[#]_ Mixed numeric types are compared according to their numeric value, so 0
equals 0.0, etc.


.. rubric:: Footnotes

.. [#] The rules for comparing objects of different types should not be relied upon;
       they may change in a future version of the language.