:mod:`cgi` --- Common Gateway Interface support.
================================================

.. module:: cgi
   :synopsis: Helpers for running Python scripts via the Common Gateway Interface.


.. index::
   pair: WWW; server
   pair: CGI; protocol
   pair: HTTP; protocol
   pair: MIME; headers
   single: URL
   single: Common Gateway Interface

Support module for Common Gateway Interface (CGI) scripts.

This module defines a number of utilities for use by CGI scripts written in
Python.


Introduction
------------

.. _cgi-intro:

A CGI script is invoked by an HTTP server, usually to process user input
submitted through an HTML ``<FORM>`` or ``<ISINDEX>`` element.

Most often, CGI scripts live in the server's special :file:`cgi-bin` directory.
The HTTP server places all sorts of information about the request (such as the
client's hostname, the requested URL, the query string, and lots of other
goodies) in the script's shell environment, executes the script, and sends the
script's output back to the client.

The script's input is connected to the client too, and sometimes the form data
is read this way; at other times the form data is passed via the "query string"
part of the URL. This module is intended to take care of the different cases
and provide a simpler interface to the Python script. It also provides a number
of utilities that help in debugging scripts, and the latest addition is support
for file uploads from a form (if your browser supports it).

The output of a CGI script should consist of two sections, separated by a blank
line. The first section contains a number of headers, telling the client what
kind of data is following. Python code to generate a minimal header section
looks like this::

   print "Content-Type: text/html"     # HTML is following
   print                               # blank line, end of headers

The second section is usually HTML, which allows the client software to display
nicely formatted text with header, in-line images, etc. Here's Python code that
prints a simple piece of HTML::

   print "<TITLE>CGI script output</TITLE>"
   print "<H1>This is my first CGI script</H1>"
   print "Hello, world!"


.. _using-the-cgi-module:

Using the cgi module
--------------------

Begin by writing ``import cgi``. Do not use ``from cgi import *`` --- the
module defines all sorts of names for its own use or for backward compatibility
that you don't want in your namespace.

When you write a new script, consider adding these lines::

   import cgitb
   cgitb.enable()

This activates a special exception handler that will display detailed reports in
the Web browser if any errors occur. If you'd rather not show the guts of your
program to users of your script, you can have the reports saved to files
instead, with code like this::

   import cgitb
   cgitb.enable(display=0, logdir="/tmp")

It's very helpful to use this feature during script development. The reports
produced by :mod:`cgitb` provide information that can save you a lot of time in
tracking down bugs. You can always remove the ``cgitb`` line later when you
have tested your script and are confident that it works correctly.

To get at submitted form data, it's best to use the :class:`FieldStorage` class.
The other classes defined in this module are provided mostly for backward
compatibility. Instantiate :class:`FieldStorage` exactly once, without
arguments. This reads the form contents from standard input or the environment
(depending on the value of various environment variables set according to the
CGI standard). Since it may consume standard input, it should be instantiated
only once.

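
As a rough illustration of why the instance should be created only once, the
sketch below mimics, in a much simplified form, how form data can be located
from the CGI environment: the query string for ``GET`` requests, standard input
for ``POST`` requests. It is not the module's implementation.
``read_form_data`` is a hypothetical helper, it handles only URL-encoded data,
and the import fallback is an assumption added so that the sketch also runs on
Python 3.

```python
import io

try:                                    # Python 2, as used in this document
    from urlparse import parse_qs
except ImportError:                     # Python 3 fallback (assumption)
    from urllib.parse import parse_qs


def read_form_data(environ, stdin):
    """Hypothetical helper: return {field name: [values]} for one request."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # The request body arrives on standard input; reading it consumes it.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
        if isinstance(raw, bytes) and not isinstance(raw, str):
            raw = raw.decode("ascii")   # only needed on Python 3
    else:
        # For GET requests the data is in the environment.
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw)


get_env = {"REQUEST_METHOD": "GET",
           "QUERY_STRING": "name=Joe+Blow&addr=At+Home"}
get_fields = read_form_data(get_env, io.BytesIO(b""))

body = io.BytesIO(b"name=Joe+Blow")
post_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "13"}
post_fields = read_form_data(post_env, body)
second_read = read_form_data(post_env, body)    # the body is already consumed
```

In this sketch the second read of the same ``POST`` body finds nothing left on
the stream and yields an empty dictionary, which is the practical reason for
reading the form a single time and passing the result around.
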

The :class:`FieldStorage` instance can be indexed like a Python dictionary, and
also supports the standard dictionary methods :meth:`has_key` and :meth:`keys`.
The built-in :func:`len` is also supported. Form fields containing empty
strings are ignored and do not appear in the dictionary; to keep such values,
provide a true value for the optional *keep_blank_values* keyword parameter when
creating the :class:`FieldStorage` instance.

For instance, the following code (which assumes that the
:mailheader:`Content-Type` header and blank line have already been printed)
checks that the fields ``name`` and ``addr`` are both set to a non-empty
string::

   import cgi

   form = cgi.FieldStorage()
   if not (form.has_key("name") and form.has_key("addr")):
       print "<H1>Error</H1>"
       print "Please fill in the name and addr fields."
   else:
       print "<p>name:", form["name"].value
       print "<p>addr:", form["addr"].value
       # ...further form processing here...

Here the fields, accessed through ``form[key]``, are themselves instances of
:class:`FieldStorage` (or :class:`MiniFieldStorage`, depending on the form
encoding). The :attr:`value` attribute of the instance yields the string value
of the field. The :meth:`getvalue` method returns this string value directly;
it also accepts an optional second argument as a default to return if the
requested key is not present.

If the submitted form data contains more than one field with the same name, the
object retrieved by ``form[key]`` is not a :class:`FieldStorage` or
:class:`MiniFieldStorage` instance but a list of such instances. Similarly, in
this situation, ``form.getvalue(key)`` would return a list of strings. If you
expect this possibility (when your HTML form contains multiple fields with the
same name), use the :meth:`getlist` method, which always returns a list of
values (so that you do not need to special-case the single item case). For
example, this code concatenates any number of username fields, separated by
commas::

   value = form.getlist("username")
   usernames = ",".join(value)

If a field represents an uploaded file, accessing the value via the
:attr:`value` attribute or the :meth:`getvalue` method reads the entire file in
memory as a string. This may not be what you want. You can test for an uploaded
file by testing either the :attr:`filename` attribute or the :attr:`file`
attribute. You can then read the data at leisure from the :attr:`file`
attribute::

   fileitem = form["userfile"]
   if fileitem.file:
       # It's an uploaded file; count lines
       linecount = 0
       while True:
           line = fileitem.file.readline()
           if not line:
               break
           linecount = linecount + 1

If an error is encountered when obtaining the contents of an uploaded file
(for example, when the user interrupts the form submission by clicking on
a Back or Cancel button) the :attr:`done` attribute of the object for the
field will be set to the value -1.

The file upload draft standard entertains the possibility of uploading multiple
files from one field (using a recursive :mimetype:`multipart/\*` encoding).
When this occurs, the item will be a dictionary-like :class:`FieldStorage` item.
This can be determined by testing its :attr:`type` attribute, which should be
:mimetype:`multipart/form-data` (or perhaps another MIME type matching
:mimetype:`multipart/\*`). In this case, it can be iterated over recursively
just like the top-level form object.

When a form is submitted in the "old" format (as the query string or as a single
data part of type :mimetype:`application/x-www-form-urlencoded`), the items will
actually be instances of the class :class:`MiniFieldStorage`. In this case, the
:attr:`list`, :attr:`file`, and :attr:`filename` attributes are always ``None``.
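
The cases described above (a list of items, an uploaded file, a plain field)
can be told apart with attribute checks alone. The following is a minimal
sketch under stated assumptions: ``FakeItem`` is a hypothetical stand-in for
the objects returned by ``form[key]``, used here only so the example runs
without a web server, and ``describe`` is an illustrative helper, not part of
the module.

```python
from collections import namedtuple

# Hypothetical stand-in for a form item; real items come from form[key].
FakeItem = namedtuple("FakeItem", ["value", "filename", "file"])


def describe(item):
    """Classify what form[key] returned, using only the attributes above."""
    if isinstance(item, list):
        # Several fields shared the same name.
        return "list of %d items" % len(item)
    if item.filename:
        # Uploaded file: read from item.file rather than item.value.
        return "uploaded file %s" % item.filename
    # Plain field; for MiniFieldStorage, file and filename are None.
    return "plain value %s" % item.value


plain = FakeItem(value="Joe Blow", filename=None, file=None)
upload = FakeItem(value=None, filename="notes.txt", file=object())

print(describe(plain))
print(describe(upload))
print(describe([plain, plain]))
```

Checking for a list first matters, because a list has none of the item
attributes.
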

A form submitted via POST that also has a query string will contain both
:class:`FieldStorage` and :class:`MiniFieldStorage` items.

Higher Level Interface
----------------------

.. versionadded:: 2.2

The previous section explains how to read CGI form data using the
:class:`FieldStorage` class. This section describes a higher level interface
which was added to this class to allow one to do it in a more readable and
intuitive way. The interface doesn't make the techniques described in previous
sections obsolete --- they are still useful to process file uploads efficiently,
for example.

.. XXX: Is this true ?

The interface consists of two simple methods. Using the methods you can process
form data in a generic way, without the need to worry whether only one or more
values were posted under one name.

In the previous section, you learned to write the following code anytime you
expected a user to post more than one value under one name::

   item = form.getvalue("item")
   if isinstance(item, list):
       # The user is requesting more than one item.
       pass
   else:
       # The user is requesting only one item.
       pass

This situation is common, for example, when a form contains a group of multiple
checkboxes with the same name::

   <input type="checkbox" name="item" value="1" />
   <input type="checkbox" name="item" value="2" />

In most situations, however, there's only one form control with a particular
name in a form and then you expect and need only one value associated with this
name. So you write a script containing, for example, this code::

   user = form.getvalue("user").upper()

The problem with the code is that you should never expect that a client will
provide valid input to your scripts.
For example, if a curious user appends
another ``user=foo`` pair to the query string, then the script would crash,
because in this situation the ``getvalue("user")`` method call returns a list
instead of a string. Calling the :meth:`upper` method on a list is not valid
(since lists do not have a method of this name) and results in an
:exc:`AttributeError` exception.

Therefore, the appropriate way to read form data values was to always use the
code which checks whether the obtained value is a single value or a list of
values. That's annoying and leads to less readable scripts.

A more convenient approach is to use the methods :meth:`getfirst` and
:meth:`getlist` provided by this higher level interface.


.. method:: FieldStorage.getfirst(name[, default])

   This method always returns only one value associated with form field *name*.
   The method returns only the first value if more than one value was posted
   under that name. Please note that the order in which the values are received
   may vary from browser to browser and should not be counted on. [#]_ If no such
   form field or value exists then the method returns the value specified by the
   optional parameter *default*. This parameter defaults to ``None`` if not
   specified.


.. method:: FieldStorage.getlist(name)

   This method always returns a list of values associated with form field *name*.
   The method returns an empty list if no such form field or value exists for
   *name*. It returns a list consisting of one item if only one such value exists.

Using these methods you can write nice compact code::

   import cgi
   form = cgi.FieldStorage()
   user = form.getfirst("user", "").upper()    # This way it's safe.
   for item in form.getlist("item"):
       do_something(item)


Old classes
-----------

.. deprecated:: 2.6

   These classes, present in earlier versions of the :mod:`cgi` module, are
   still supported for backward compatibility. New applications should use the
   :class:`FieldStorage` class.

:class:`SvFormContentDict` stores single value form content as dictionary; it
assumes each field name occurs in the form only once.

:class:`FormContentDict` stores multiple value form content as a dictionary (the
form items are lists of values). Useful if your form contains multiple fields
with the same name.

Other classes (:class:`FormContent`, :class:`InterpFormContentDict`) are present
for backwards compatibility with really old applications only.


.. _functions-in-cgi-module:

Functions
---------

These are useful if you want more control, or if you want to employ some of the
algorithms implemented in this module in other circumstances.


.. function:: parse(fp[, keep_blank_values[, strict_parsing]])

   Parse a query in the environment or from a file (the file defaults to
   ``sys.stdin``). The *keep_blank_values* and *strict_parsing* parameters are
   passed to :func:`urlparse.parse_qs` unchanged.


.. function:: parse_qs(qs[, keep_blank_values[, strict_parsing]])

   This function is deprecated in this module. Use :func:`urlparse.parse_qs`
   instead. It is maintained here only for backward compatibility.

.. function:: parse_qsl(qs[, keep_blank_values[, strict_parsing]])

   This function is deprecated in this module. Use :func:`urlparse.parse_qsl`
   instead. It is maintained here only for backward compatibility.

.. function:: parse_multipart(fp, pdict)

   Parse input of type :mimetype:`multipart/form-data` (for file uploads).
   Arguments are *fp* for the input file and *pdict* for a dictionary containing
   other parameters in the :mailheader:`Content-Type` header.

   Returns a dictionary just like :func:`urlparse.parse_qs`: keys are the field
   names, each value is a list of values for that field. This is easy to use but
   not much good if you are expecting megabytes to be uploaded --- in that case,
   use the :class:`FieldStorage` class instead, which is much more flexible.

   Note that this does not parse nested multipart parts --- use
   :class:`FieldStorage` for that.


.. function:: parse_header(string)

   Parse a MIME header (such as :mailheader:`Content-Type`) into a main value and a
   dictionary of parameters.


.. function:: test()

   Robust test CGI script, usable as main program. Writes minimal HTTP headers and
   formats all information provided to the script in HTML form.


.. function:: print_environ()

   Format the shell environment in HTML.


.. function:: print_form(form)

   Format a form in HTML.


.. function:: print_directory()

   Format the current directory in HTML.


.. function:: print_environ_usage()

   Print a list of useful (used by CGI) environment variables in HTML.


.. function:: escape(s[, quote])

   Convert the characters ``'&'``, ``'<'`` and ``'>'`` in string *s* to HTML-safe
   sequences. Use this if you need to display text that might contain such
   characters in HTML. If the optional flag *quote* is true, the quotation mark
   character (``'"'``) is also translated; this helps for inclusion in an HTML
   attribute value, as in ``<A HREF="...">``. If the value to be quoted might
   include single- or double-quote characters, or both, consider using the
   :func:`quoteattr` function in the :mod:`xml.sax.saxutils` module instead.


.. _cgi-security:

Caring about security
---------------------

.. index:: pair: CGI; security

There's one important rule: if you invoke an external program (via the
:func:`os.system` or :func:`os.popen` functions, or others with similar
functionality), make very sure you don't pass arbitrary strings received from
the client to the shell. This is a well-known security hole whereby clever
hackers anywhere on the Web can exploit a gullible CGI script to invoke
arbitrary shell commands. Even parts of the URL or field names cannot be
trusted, since the request doesn't have to come from your form!

To be on the safe side, if you must pass a string obtained from a form to a shell
command, you should make sure the string contains only alphanumeric characters,
dashes, underscores, and periods.


Installing your CGI script on a Unix system
-------------------------------------------

Read the documentation for your HTTP server and check with your local system
administrator to find the directory where CGI scripts should be installed;
usually this is in a directory :file:`cgi-bin` in the server tree.

Make sure that your script is readable and executable by "others"; the Unix file
mode should be ``0755`` octal (use ``chmod 0755 filename``). Make sure that the
first line of the script contains ``#!`` starting in column 1 followed by the
pathname of the Python interpreter, for instance::

   #!/usr/local/bin/python

Make sure the Python interpreter exists and is executable by "others".

Make sure that any files your script needs to read or write are readable or
writable, respectively, by "others" --- their mode should be ``0644`` for
readable and ``0666`` for writable. This is because, for security reasons, the
HTTP server executes your script as user "nobody", without any special
privileges. It can only read (write, execute) files that everybody can read
(write, execute).
The current directory at execution time is also different (it
is usually the server's cgi-bin directory) and the set of environment variables
is also different from what you get when you log in. In particular, don't count
on the shell's search path for executables (:envvar:`PATH`) or the Python module
search path (:envvar:`PYTHONPATH`) to be set to anything interesting.

If you need to load modules from a directory which is not on Python's default
module search path, you can change the path in your script, before importing
other modules. For example::

   import sys
   sys.path.insert(0, "/usr/home/joe/lib/python")
   sys.path.insert(0, "/usr/local/lib/python")

(This way, the directory inserted last will be searched first!)

Instructions for non-Unix systems will vary; check your HTTP server's
documentation (it will usually have a section on CGI scripts).


Testing your CGI script
-----------------------

Unfortunately, a CGI script will generally not run when you try it from the
command line, and a script that works perfectly from the command line may fail
mysteriously when run from the server. There's one reason why you should still
test your script from the command line: if it contains a syntax error, the
Python interpreter won't execute it at all, and the HTTP server will most likely
send a cryptic error to the client.

Assuming your script has no syntax errors, yet it does not work, you have no
choice but to read the next section.


Debugging CGI scripts
---------------------

.. index:: pair: CGI; debugging

First of all, check for trivial installation errors --- reading the section
above on installing your CGI script carefully can save you a lot of time. If
you wonder whether you have understood the installation procedure correctly, try
installing a copy of this module file (:file:`cgi.py`) as a CGI script.
When invoked as a script, the file will dump its environment and the contents of the
form in HTML form. Give it the right mode etc., and send it a request. If it's
installed in the standard :file:`cgi-bin` directory, it should be possible to
send it a request by entering a URL into your browser of the form::

   http://yourhostname/cgi-bin/cgi.py?name=Joe+Blow&addr=At+Home

If this gives an error of type 404, the server cannot find the script; perhaps
you need to install it in a different directory. If it gives another error,
there's an installation problem that you should fix before trying to go any
further. If you get a nicely formatted listing of the environment and form
content (in this example, the fields should be listed as "addr" with value "At
Home" and "name" with value "Joe Blow"), the :file:`cgi.py` script has been
installed correctly. If you follow the same procedure for your own script, you
should now be able to debug it.

The next step could be to call the :mod:`cgi` module's :func:`test` function
from your script: replace its main code with the single statement ::

   cgi.test()

This should produce the same results as those obtained from installing the
:file:`cgi.py` file itself.

When an ordinary Python script raises an unhandled exception (for whatever
reason: a typo in a module name, a file that can't be opened, etc.), the
Python interpreter prints a nice traceback and exits. While the Python
interpreter will still do this when your CGI script raises an exception, most
likely the traceback will end up in one of the HTTP server's log files, or be
discarded altogether.

Fortunately, once you have managed to get your script to execute *some* code,
you can easily send tracebacks to the Web browser using the :mod:`cgitb` module.
If you haven't done so already, just add the lines::

   import cgitb
   cgitb.enable()

to the top of your script.
Then try running it again; when a problem occurs,
you should see a detailed report that will likely make apparent the cause of the
crash.

If you suspect that there may be a problem in importing the :mod:`cgitb` module,
you can use an even more robust approach (which only uses built-in modules)::

   import sys
   sys.stderr = sys.stdout
   print "Content-Type: text/plain"
   print
   # ...your code here...

This relies on the Python interpreter to print the traceback. The content type
of the output is set to plain text, which disables all HTML processing. If your
script works, the raw HTML will be displayed by your client. If it raises an
exception, most likely after the first two lines have been printed, a traceback
will be displayed. Because no HTML interpretation is going on, the traceback
will be readable.


Common problems and solutions
-----------------------------

* Most HTTP servers buffer the output from CGI scripts until the script is
  completed. This means that it is not possible to display a progress report on
  the client's display while the script is running.

* Check the installation instructions above.

* Check the HTTP server's log files. (``tail -f logfile`` in a separate window
  may be useful!)

* Always check a script for syntax errors first, by doing something like
  ``python script.py``.

* If your script does not have any syntax errors, try adding ``import cgitb;
  cgitb.enable()`` to the top of the script.

* When invoking external programs, make sure they can be found. Usually, this
  means using absolute path names --- :envvar:`PATH` is usually not set to a very
  useful value in a CGI script.

* When reading or writing external files, make sure they can be read or written
  by the userid under which your CGI script will be running: this is typically the
  userid under which the web server is running, or some explicitly specified
  userid for a web server's ``suexec`` feature.

* Don't try to give a CGI script a set-uid mode. This doesn't work on most
  systems, and is a security liability as well.

.. rubric:: Footnotes

.. [#] Note that some recent versions of the HTML specification do state what order the
   field values should be supplied in, but knowing whether a request was
   received from a conforming browser, or even from a browser at all, is tedious
   and error-prone.