Network Working Group                                        D. Robinson
Request for Comments: 3875                                       K. Coar
Category: Informational                   The Apache Software Foundation
                                                            October 2004


            The Common Gateway Interface (CGI) Version 1.1

Status of this Memo

   This memo provides information for the Internet community. It does
   not specify an Internet standard of any kind. Distribution of this
   memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2004).

IESG Note

   This document is not a candidate for any level of Internet Standard.
   The IETF disclaims any knowledge of the fitness of this document for
   any purpose, and in particular notes that it has not had IETF review
   for such things as security, congestion control or inappropriate
   interaction with deployed protocols. The RFC Editor has chosen to
   publish this document at its discretion. Readers of this document
   should exercise caution in evaluating its value for implementation
   and deployment.

Abstract

   The Common Gateway Interface (CGI) is a simple interface for running
   external programs, software or gateways under an information server
   in a platform-independent manner. Currently, the supported
   information servers are HTTP servers.

   The interface has been in use by the World-Wide Web (WWW) since 1993.
   This specification defines the 'current practice' parameters of the
   'CGI/1.1' interface developed and documented at the U.S. National
   Centre for Supercomputing Applications. This document also defines
   the use of the CGI/1.1 interface on UNIX(R) and other, similar
   systems.

Robinson & Coar              Informational                      [Page 1]

RFC 3875                    CGI Version 1.1                 October 2004

Table of Contents

   1. Introduction
      1.1. Purpose
      1.2. Requirements
      1.3. Specifications
      1.4. Terminology

   2. Notational Conventions and Generic Grammar
      2.1. Augmented BNF
      2.2. Basic Rules
      2.3. URL Encoding

   3. Invoking the Script
      3.1. Server Responsibilities
      3.2. Script Selection
      3.3. The Script-URI
      3.4. Execution

   4. The CGI Request
      4.1. Request Meta-Variables
           4.1.1. AUTH_TYPE
           4.1.2. CONTENT_LENGTH
           4.1.3. CONTENT_TYPE
           4.1.4. GATEWAY_INTERFACE
           4.1.5. PATH_INFO
           4.1.6. PATH_TRANSLATED
           4.1.7. QUERY_STRING
           4.1.8. REMOTE_ADDR
           4.1.9. REMOTE_HOST
           4.1.10. REMOTE_IDENT
           4.1.11. REMOTE_USER
           4.1.12. REQUEST_METHOD
           4.1.13. SCRIPT_NAME
           4.1.14. SERVER_NAME
           4.1.15. SERVER_PORT
           4.1.16. SERVER_PROTOCOL
           4.1.17. SERVER_SOFTWARE
           4.1.18. Protocol-Specific Meta-Variables
      4.2. Request Message-Body
      4.3. Request Methods
           4.3.1. GET
           4.3.2. POST
           4.3.3. HEAD
           4.3.4. Protocol-Specific Methods
      4.4. The Script Command Line

   5. NPH Scripts
      5.1. Identification
      5.2. NPH Response

   6. CGI Response
      6.1. Response Handling
      6.2. Response Types
           6.2.1. Document Response
           6.2.2. Local Redirect Response
           6.2.3. Client Redirect Response
           6.2.4. Client Redirect Response with Document
      6.3. Response Header Fields
           6.3.1. Content-Type
           6.3.2. Location
           6.3.3. Status
           6.3.4. Protocol-Specific Header Fields
           6.3.5. Extension Header Fields
      6.4. Response Message-Body

   7. System Specifications
      7.1. AmigaDOS
      7.2. UNIX
      7.3. EBCDIC/POSIX

   8. Implementation
      8.1. Recommendations for Servers
      8.2. Recommendations for Scripts

   9. Security Considerations
      9.1. Safe Methods
      9.2. Header Fields Containing Sensitive Information
      9.3. Data Privacy
      9.4. Information Security Model
      9.5. Script Interference with the Server
      9.6. Data Length and Buffering Considerations
      9.7. Stateless Processing
      9.8. Relative Paths
      9.9. Non-parsed Header Output

   10. Acknowledgements

   11. References
       11.1. Normative References
       11.2. Informative References

   12. Authors' Addresses

   13. Full Copyright Statement

1. Introduction

1.1. Purpose

   The Common Gateway Interface (CGI) [22] allows an HTTP [1], [4]
   server and a CGI script to share responsibility for responding to
   client requests. The client request comprises a Uniform Resource
   Identifier (URI) [11], a request method and various ancillary
   information about the request provided by the transport protocol.

   The CGI defines the abstract parameters, known as meta-variables,
   which describe a client's request.
   Together with a concrete
   programmer interface this specifies a platform-independent interface
   between the script and the HTTP server.

   The server is responsible for managing connection, data transfer,
   transport and network issues related to the client request, whereas
   the CGI script handles the application issues, such as data access
   and document processing.

1.2. Requirements

   The key words 'MUST', 'MUST NOT', 'REQUIRED', 'SHALL', 'SHALL NOT',
   'SHOULD', 'SHOULD NOT', 'RECOMMENDED', 'MAY' and 'OPTIONAL' in this
   document are to be interpreted as described in BCP 14, RFC 2119 [3].

   An implementation is not compliant if it fails to satisfy one or more
   of the 'must' requirements for the protocols it implements. An
   implementation that satisfies all of the 'must' and all of the
   'should' requirements for its features is said to be 'unconditionally
   compliant'; one that satisfies all of the 'must' requirements but not
   all of the 'should' requirements for its features is said to be
   'conditionally compliant'.

1.3. Specifications

   Not all of the functions and features of the CGI are defined in the
   main part of this specification. The following phrases are used to
   describe the features that are not specified:

   'system-defined'
      The feature may differ between systems, but must be the same for
      different implementations using the same system. A system will
      usually identify a class of operating systems. Some systems are
      defined in section 7 of this document. New systems may be defined
      by new specifications without revision of this document.

   'implementation-defined'
      The behaviour of the feature may vary from implementation to
      implementation; a particular implementation must document its
      behaviour.

1.4. Terminology

   This specification uses many terms defined in the HTTP/1.1
   specification [4]; however, the following terms are used here in a
   sense which may not accord with their definitions in that document,
   or with their common meaning.

   'meta-variable'
      A named parameter which carries information from the server to the
      script. It is not necessarily a variable in the operating
      system's environment, although that is the most common
      implementation.

   'script'
      The software that is invoked by the server according to this
      interface. It need not be a standalone program, but could be a
      dynamically-loaded or shared library, or even a subroutine in the
      server. It might be a set of statements interpreted at run-time,
      as the term 'script' is frequently understood, but that is not a
      requirement and within the context of this specification the term
      has the broader definition stated.

   'server'
      The application program that invokes the script in order to
      service requests from the client.

2. Notational Conventions and Generic Grammar

2.1. Augmented BNF

   All of the mechanisms specified in this document are described in
   both prose and an augmented Backus-Naur Form (BNF) similar to that
   used by RFC 822 [13]. Unless stated otherwise, the elements are
   case-sensitive. This augmented BNF contains the following
   constructs:

   name = definition
      The name of a rule and its definition are separated by the equals
      character ('='). Whitespace is only significant in that
      continuation lines of a definition are indented.

   "literal"
      Double quotation marks (") surround literal text, except for a
      literal quotation mark, which is surrounded by angle-brackets ('<'
      and '>').

   rule1 | rule2
      Alternative rules are separated by a vertical bar ('|').

   (rule1 rule2 rule3)
      Elements enclosed in parentheses are treated as a single element.

   *rule
      A rule preceded by an asterisk ('*') may have zero or more
      occurrences. The full form is 'n*m rule' indicating at least n
      and at most m occurrences of the rule. n and m are optional
      decimal values with default values of 0 and infinity respectively.

   [rule]
      An element enclosed in square brackets ('[' and ']') is optional,
      and is equivalent to '*1 rule'.

   N rule
      A rule preceded by a decimal number represents exactly N
      occurrences of the rule. It is equivalent to 'N*N rule'.

2.2. Basic Rules

   This specification uses a BNF-like grammar defined in terms of
   characters. Unlike many specifications which define the bytes
   allowed by a protocol, here each literal in the grammar corresponds
   to the character it represents. How these characters are represented
   in terms of bits and bytes within a system is either system-defined
   or specified in the particular context. The single exception is the
   rule 'OCTET', defined below.

   The following rules are used throughout this specification to
   describe basic parsing constructs.

      alpha         = lowalpha | hialpha
      lowalpha      = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" |
                      "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" |
                      "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" |
                      "y" | "z"
      hialpha       = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" |
                      "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" |
                      "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" |
                      "Y" | "Z"
      digit         = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
                      "8" | "9"
      alphanum      = alpha | digit
      OCTET         = <any 8-bit byte>
      CHAR          = alpha | digit | separator | "!" | "#" | "$" |
                      "%" | "&" | "'" | "*" | "+" | "-" | "." | "`" |
                      "^" | "_" | "{" | "|" | "}" | "~" | CTL
      CTL           = <any control character>
      SP            = <space character>
      HT            = <horizontal tab character>
      NL            = <newline>
      LWSP          = SP | HT | NL
      separator     = "(" | ")" | "<" | ">" | "@" | "," | ";" | ":" |
                      "\" | <"> | "/" | "[" | "]" | "?" | "=" | "{" |
                      "}" | SP | HT
      token         = 1*<any CHAR except CTLs or separators>
      quoted-string = <"> *qdtext <">
      qdtext        = <any CHAR except <"> and CTLs but including LWSP>
      TEXT          = <any printable character>

   Note that newline (NL) need not be a single control character, but
   can be a sequence of control characters. A system MAY define TEXT to
   be a larger set of characters than <any CHAR excluding CTLs but
   including LWSP>.

2.3. URL Encoding

   Some variables and constructs used here are described as being
   'URL-encoded'. This encoding is described in section 2 of RFC 2396
   [2]. In a URL-encoded string an escape sequence consists of a
   percent character ("%") followed by two hexadecimal digits, where the
   two hexadecimal digits form an octet. An escape sequence represents
   the graphic character that has the octet as its code within the
   US-ASCII [9] coded character set, if it exists. Currently there is
   no provision within the URI syntax to identify which character set
   non-ASCII codes represent, so CGI handles this issue on an ad-hoc
   basis.

   Note that some unsafe (reserved) characters may have different
   semantics when encoded. The definition of which characters are
   unsafe depends on the context; see section 2 of RFC 2396 [2], updated
   by RFC 2732 [7], for an authoritative treatment. These reserved
   characters are generally used to provide syntactic structure to the
   character string, for example as field separators.
   In all cases, the
   string is first processed with regard to any reserved characters
   present, and then the resulting data can be URL-decoded by replacing
   "%" escape sequences by their character values.

   To encode a character string, all reserved and forbidden characters
   are replaced by the corresponding "%" escape sequences. The string
   can then be used in assembling a URI. The reserved characters will
   vary from context to context, but will always be drawn from this set:

      reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | "$" |
                 "," | "[" | "]"

   The last two characters were added by RFC 2732 [7]. In any
   particular context, a sub-set of these characters will be reserved;
   the other characters from this set MUST NOT be encoded when a string
   is URL-encoded in that context. Other basic rules used to describe
   URI syntax are:

      hex        = digit | "A" | "B" | "C" | "D" | "E" | "F" | "a" | "b"
                   | "c" | "d" | "e" | "f"
      escaped    = "%" hex hex
      unreserved = alpha | digit | mark
      mark       = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"

3. Invoking the Script

3.1. Server Responsibilities

   The server acts as an application gateway. It receives the request
   from the client, selects a CGI script to handle the request, converts
   the client request to a CGI request, executes the script and converts
   the CGI response into a response for the client. When processing the
   client request, it is responsible for implementing any protocol or
   transport level authentication and security. The server MAY also
   function in a 'non-transparent' manner, modifying the request or
   response in order to provide some additional service, such as media
   type transformation or protocol reduction.

   The server MUST perform translations and protocol conversions on the
   client request data required by this specification. Furthermore, the
   server retains its responsibility to the client to conform to the
   relevant network protocol even if the CGI script fails to conform to
   this specification.

   If the server is applying authentication to the request, then it MUST
   NOT execute the script unless the request passes all defined access
   controls.

3.2. Script Selection

   The server determines which CGI script is to be executed based on a
   generic-form URI supplied by the client. This URI includes a
   hierarchical path with components separated by "/". For any
   particular request, the server will identify all or a leading part of
   this path with an individual script, thus placing the script at a
   particular point in the path hierarchy. The remainder of the path,
   if any, is a resource or sub-resource identifier to be interpreted by
   the script.

   Information about this split of the path is available to the script
   in the meta-variables, described below. Support for non-hierarchical
   URI schemes is outside the scope of this specification.

3.3. The Script-URI

   The mapping from client request URI to choice of script is defined by
   the particular server implementation and its configuration. The
   server may allow the script to be identified with a set of several
   different URI path hierarchies, and therefore is permitted to replace
   the URI by other members of this set during processing and generation
   of the meta-variables. The server

      1. MAY preserve the URI in the particular client request; or

      2. it MAY select a canonical URI from the set of possible values
         for each script; or

      3. it can implement any other selection of URI from the set.

   From the meta-variables thus generated, a URI, the 'Script-URI', can
   be constructed. This MUST have the property that if the client had
   accessed this URI instead, then the script would have been executed
   with the same values for the SCRIPT_NAME, PATH_INFO and QUERY_STRING
   meta-variables. The Script-URI has the structure of a generic URI as
   defined in section 3 of RFC 2396 [2], with the exception that object
   parameters and fragment identifiers are not permitted. The various
   components of the Script-URI are defined by some of the
   meta-variables (see below);

      script-URI = <scheme> "://" <server-name> ":" <server-port>
                   <script-path> <extra-path> "?" <query-string>

   where <scheme> is found from SERVER_PROTOCOL, <server-name>,
   <server-port> and <query-string> are the values of the respective
   meta-variables. The SCRIPT_NAME and PATH_INFO values, URL-encoded
   with ";", "=" and "?" reserved, give <script-path> and <extra-path>.
   See section 4.1.5 for more information about the PATH_INFO
   meta-variable.

   The scheme and the protocol are not identical as the scheme
   identifies the access method in addition to the application protocol.
   For example, a resource accessed using Transport Layer Security (TLS)
   [14] would have a request URI with a scheme of https when using the
   HTTP protocol [19]. CGI/1.1 provides no generic means for the script
   to reconstruct this, and therefore the Script-URI as defined includes
   the base protocol used. However, a script MAY make use of
   scheme-specific meta-variables to better deduce the URI scheme.
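
   As an illustration only, the Script-URI construction described in
   this section might be sketched in Python as follows. The function
   name script_uri, the use of a plain dictionary to hold the
   meta-variables, and the choice to omit the "?" when QUERY_STRING is
   empty are assumptions of this sketch, not part of the specification.
   The scheme is taken from the protocol name in SERVER_PROTOCOL, so a
   request made over TLS would still yield "http" here.

```python
from urllib.parse import quote

# Characters left unescaped in the path: "/" plus the marks and
# punctuation that RFC 2396 permits in a path, leaving out ";", "="
# and "?" so that they are escaped. (Letters, digits and "_.-~" are
# never escaped by quote().)
_PATH_SAFE = "/:@&+$,!*'()"


def script_uri(meta):
    """Build a Script-URI from a mapping of CGI meta-variables (sketch)."""
    # "HTTP/1.1" -> "http": the base protocol name, lower-cased.
    scheme = meta["SERVER_PROTOCOL"].split("/", 1)[0].lower()
    script_path = quote(meta.get("SCRIPT_NAME", ""), safe=_PATH_SAFE)
    extra_path = quote(meta.get("PATH_INFO", ""), safe=_PATH_SAFE)
    uri = "{}://{}:{}{}{}".format(
        scheme, meta["SERVER_NAME"], meta["SERVER_PORT"],
        script_path, extra_path,
    )
    # QUERY_STRING is already URL-encoded, so it is appended unchanged.
    query = meta.get("QUERY_STRING", "")
    return uri + "?" + query if query else uri
```

   Treatment of non US-ASCII characters is system-defined; this sketch
   simply relies on the default UTF-8 behaviour of quote().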

   Note that this definition also allows URIs to be constructed which
   would invoke the script with any permitted values for the path-info
   or query-string, by modifying the appropriate components.

3.4. Execution

   The script is invoked in a system-defined manner. Unless specified
   otherwise, the file containing the script will be invoked as an
   executable program. The server prepares the CGI request as described
   in section 4; this comprises the request meta-variables (immediately
   available to the script on execution) and request message data. The
   request data need not be immediately available to the script; the
   script can be executed before all this data has been received by the
   server from the client. The response from the script is returned to
   the server as described in sections 5 and 6.

   In the event of an error condition, the server can interrupt or
   terminate script execution at any time and without warning. That
   could occur, for example, in the event of a transport failure between
   the server and the client; so the script SHOULD be prepared to handle
   abnormal termination.

4. The CGI Request

   Information about a request comes from two different sources; the
   request meta-variables and any associated message-body.

4.1. Request Meta-Variables

   Meta-variables contain data about the request passed from the server
   to the script, and are accessed by the script in a system-defined
   manner. Meta-variables are identified by case-insensitive names;
   there cannot be two different variables whose names differ in case
   only. Here they are shown using a canonical representation of
   capitals plus underscore ("_"). A particular system can define a
   different representation.
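
   A minimal Python sketch of one such system-defined access method
   follows. It assumes, as is common but not required, that the
   meta-variables are passed in the process environment under their
   canonical upper-case names. The helper name meta_variable is
   hypothetical, and returning the empty string for a variable that is
   not set is a choice made by this sketch.

```python
import os


def meta_variable(name, environ=None):
    """Return the value of a CGI meta-variable, or "" if it is not set."""
    env = os.environ if environ is None else environ
    # Names are case-insensitive; look them up in canonical upper case.
    return env.get(name.upper(), "")
```

   A script could then call meta_variable("query_string") or
   meta_variable("QUERY_STRING") and obtain the same value.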

      meta-variable-name = "AUTH_TYPE" | "CONTENT_LENGTH" |
                           "CONTENT_TYPE" | "GATEWAY_INTERFACE" |
                           "PATH_INFO" | "PATH_TRANSLATED" |
                           "QUERY_STRING" | "REMOTE_ADDR" |
                           "REMOTE_HOST" | "REMOTE_IDENT" |
                           "REMOTE_USER" | "REQUEST_METHOD" |
                           "SCRIPT_NAME" | "SERVER_NAME" |
                           "SERVER_PORT" | "SERVER_PROTOCOL" |
                           "SERVER_SOFTWARE" | scheme |
                           protocol-var-name | extension-var-name
      protocol-var-name  = ( protocol | scheme ) "_" var-name
      scheme             = alpha *( alpha | digit | "+" | "-" | "." )
      var-name           = token
      extension-var-name = token

   Meta-variables with the same name as a scheme, and names beginning
   with the name of a protocol or scheme (e.g., HTTP_ACCEPT) are also
   defined. The number and meaning of these variables may change
   independently of this specification. (See also section 4.1.18.)

   The server MAY set additional implementation-defined extension
   meta-variables, whose names SHOULD be prefixed with "X_".

   This specification does not distinguish between zero-length (NULL)
   values and missing values. For example, a script cannot distinguish
   between the two requests http://host/script and http://host/script?
   as in both cases the QUERY_STRING meta-variable would be NULL.

      meta-variable-value = "" | 1*<TEXT, CHAR or tokens of value>

   An optional meta-variable may be omitted (left unset) if its value is
   NULL. Meta-variable values MUST be considered case-sensitive except
   as noted otherwise. The representation of the characters in the
   meta-variables is system-defined; the server MUST convert values to
   that representation.

4.1.1. AUTH_TYPE

   The AUTH_TYPE variable identifies any mechanism used by the server to
   authenticate the user. It contains a case-insensitive value defined
   by the client protocol or server implementation.
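
   Because the AUTH_TYPE value is case-insensitive, a script would
   normally normalise it before comparing. The following Python sketch
   shows one way to do so; the helper name uses_auth_scheme is
   hypothetical, and the meta-variables are assumed to be available as
   a dictionary.

```python
def uses_auth_scheme(meta, scheme):
    """True if AUTH_TYPE names the given scheme, ignoring case."""
    auth_type = meta.get("AUTH_TYPE", "")
    # An empty or missing AUTH_TYPE never matches any scheme.
    return auth_type != "" and auth_type.lower() == scheme.lower()
```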

   For HTTP, if the client request required authentication for external
   access, then the server MUST set the value of this variable from the
   'auth-scheme' token in the request Authorization header field.

      AUTH_TYPE      = "" | auth-scheme
      auth-scheme    = "Basic" | "Digest" | extension-auth
      extension-auth = token

   HTTP access authentication schemes are described in RFC 2617 [5].

4.1.2. CONTENT_LENGTH

   The CONTENT_LENGTH variable contains the size of the message-body
   attached to the request, if any, in decimal number of octets. If no
   data is attached, then NULL (or unset).

      CONTENT_LENGTH = "" | 1*digit

   The server MUST set this meta-variable if and only if the request is
   accompanied by a message-body entity. The CONTENT_LENGTH value must
   reflect the length of the message-body after the server has removed
   any transfer-codings or content-codings.

4.1.3. CONTENT_TYPE

   If the request includes a message-body, the CONTENT_TYPE variable is
   set to the Internet Media Type [6] of the message-body.

      CONTENT_TYPE = "" | media-type
      media-type   = type "/" subtype *( ";" parameter )
      type         = token
      subtype      = token
      parameter    = attribute "=" value
      attribute    = token
      value        = token | quoted-string

   The type, subtype and parameter attribute names are not
   case-sensitive. Parameter values may be case sensitive. Media types
   and their use in HTTP are described in section 3.7 of the HTTP/1.1
   specification [4].

   There is no default value for this variable. If and only if it is
   unset, then the script MAY attempt to determine the media type from
   the data received.
   If the type remains unknown, then the script MAY
   choose to assume a type of application/octet-stream or it may reject
   the request with an error (as described in section 6.3.3).

   Each media-type defines a set of optional and mandatory parameters.
   This may include a charset parameter with a case-insensitive value
   defining the coded character set for the message-body. If the
   charset parameter is omitted, then the default value should be
   derived according to whichever of the following rules is the first to
   apply:

      1. There MAY be a system-defined default charset for some
         media-types.

      2. The default for media-types of type "text" is ISO-8859-1 [4].

      3. Any default defined in the media-type specification.

      4. The default is US-ASCII.

   The server MUST set this meta-variable if an HTTP Content-Type field
   is present in the client request header. If the server receives a
   request with an attached entity but no Content-Type header field, it
   MAY attempt to determine the correct content type, otherwise it
   should omit this meta-variable.

4.1.4. GATEWAY_INTERFACE

   The GATEWAY_INTERFACE variable MUST be set to the dialect of CGI
   being used by the server to communicate with the script. Syntax:

      GATEWAY_INTERFACE = "CGI" "/" 1*digit "." 1*digit

   Note that the major and minor numbers are treated as separate
   integers and hence each may be incremented higher than a single
   digit. Thus CGI/2.4 is a lower version than CGI/2.13 which in turn
   is lower than CGI/12.3. Leading zeros MUST be ignored by the script
   and MUST NOT be generated by the server.

   This document defines the 1.1 version of the CGI interface.

4.1.5. PATH_INFO

   The PATH_INFO variable specifies a path to be interpreted by the CGI
   script.
   It identifies the resource or sub-resource to be returned by
   the CGI script, and is derived from the portion of the URI path
   hierarchy following the part that identifies the script itself.
   Unlike a URI path, the PATH_INFO is not URL-encoded, and cannot
   contain path-segment parameters. A PATH_INFO of "/" represents a
   single void path segment.

      PATH_INFO = "" | ( "/" path )
      path      = lsegment *( "/" lsegment )
      lsegment  = *lchar
      lchar     = <any TEXT or CTL except "/">

   The value is considered case-sensitive and the server MUST preserve
   the case of the path as presented in the request URI. The server MAY
   impose restrictions and limitations on what values it permits for
   PATH_INFO, and MAY reject the request with an error if it encounters
   any values considered objectionable. That MAY include any requests
   that would result in an encoded "/" being decoded into PATH_INFO, as
   this might represent a loss of information to the script. Similarly,
   treatment of non US-ASCII characters in the path is system-defined.

   URL-encoded, the PATH_INFO string forms the extra-path component of
   the Script-URI (see section 3.3) which follows the SCRIPT_NAME part
   of that path.

4.1.6. PATH_TRANSLATED

   The PATH_TRANSLATED variable is derived by taking the PATH_INFO
   value, parsing it as a local URI in its own right, and performing any
   virtual-to-physical translation appropriate to map it onto the
   server's document repository structure. The set of characters
   permitted in the result is system-defined.
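
   The translation itself is left to the server. Purely as an
   illustration, the Python sketch below maps a PATH_INFO value onto a
   single document root on a UNIX-like filesystem. The function name
   path_translated and the idea of a single document_root directory
   are assumptions of this example; a real server would apply its own
   mapping rules and access checks.

```python
import posixpath


def path_translated(path_info, document_root):
    """Map a PATH_INFO value onto a document root (illustrative only)."""
    if path_info == "":
        # No extra path: PATH_TRANSLATED is NULL as well.
        return ""
    # PATH_INFO starts with "/"; strip it so join() keeps the root.
    # No ".." handling is done here; a real server must make sure the
    # result cannot escape its document repository.
    return posixpath.join(document_root, path_info.lstrip("/"))
```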
755 756 PATH_TRANSLATED = *<any character> 757 758 This is the file location that would be accessed by a request for 759 760 <scheme> "://" <server-name> ":" <server-port> <extra-path> 761 762 where <scheme> is the scheme for the original client request and 763 <extra-path> is a URL-encoded version of PATH_INFO, with ";", "=" and 764 "?" reserved. For example, a request such as the following: 765 766 http://somehost.com/cgi-bin/somescript/this%2eis%2epath%3binfo 767 768 would result in a PATH_INFO value of 769 770 /this.is.the.path;info 771 772 An internal URI is constructed from the scheme, server location and 773 the URL-encoded PATH_INFO: 774 775 http://somehost.com/this.is.the.path%3binfo 776 777 This would then be translated to a location in the server's document 778 repository, perhaps a filesystem path something like this: 779 780 /usr/local/www/htdocs/this.is.the.path;info 781 782 The value of PATH_TRANSLATED is the result of the translation. 783 784 785 786Robinson & Coar Informational [Page 14] 787 788RFC 3875 CGI Version 1.1 October 2004 789 790 791 The value is derived in this way irrespective of whether it maps to a 792 valid repository location. The server MUST preserve the case of the 793 extra-path segment unless the underlying repository supports case- 794 insensitive names. If the repository is only case-aware, case- 795 preserving, or case-blind with regard to document names, the server 796 is not required to preserve the case of the original segment through 797 the translation. 798 799 The translation algorithm the server uses to derive PATH_TRANSLATED 800 is implementation-defined; CGI scripts which use this variable may 801 suffer limited portability. 802 803 The server SHOULD set this meta-variable if the request URI includes 804 a path-info component. If PATH_INFO is NULL, then the 805 PATH_TRANSLATED variable MUST be set to NULL (or unset). 806 8074.1.7. 
QUERY_STRING

   The QUERY_STRING variable contains a URL-encoded search or parameter
   string; it provides information to the CGI script to affect or refine
   the document to be returned by the script.

   The URL syntax for a search string is described in section 3 of RFC
   2396 [2]. The QUERY_STRING value is case-sensitive.

      QUERY_STRING = query-string
      query-string = *uric
      uric         = reserved | unreserved | escaped

   When parsing and decoding the query string, the details of the
   parsing, reserved characters and support for non US-ASCII characters
   depend on the context. For example, form submission from an HTML
   document [18] uses application/x-www-form-urlencoded encoding, in
   which the characters "+", "&" and "=" are reserved, and the ISO
   8859-1 encoding may be used for non US-ASCII characters.

   The QUERY_STRING value provides the query-string part of the
   Script-URI. (See section 3.3).

   The server MUST set this variable; if the Script-URI does not include
   a query component, the QUERY_STRING MUST be defined as an empty
   string ("").

4.1.8. REMOTE_ADDR

   The REMOTE_ADDR variable MUST be set to the network address of the
   client sending the request to the server.

      REMOTE_ADDR  = hostnumber
      hostnumber   = ipv4-address | ipv6-address
      ipv4-address = 1*3digit "." 1*3digit "." 1*3digit "." 1*3digit
      ipv6-address = hexpart [ ":" ipv4-address ]
      hexpart      = hexseq | ( [ hexseq ] "::" [ hexseq ] )
      hexseq       = 1*4hex *( ":" 1*4hex )

   The format of an IPv6 address is described in RFC 3513 [15].

4.1.9. REMOTE_HOST

   The REMOTE_HOST variable contains the fully qualified domain name of
   the client sending the request to the server, if available, otherwise
   NULL.
   Fully qualified domain names take the form described in
   section 3.5 of RFC 1034 [17] and section 2.1 of RFC 1123 [12].
   Domain names are not case sensitive.

      REMOTE_HOST   = "" | hostname | hostnumber
      hostname      = *( domainlabel "." ) toplabel [ "." ]
      domainlabel   = alphanum [ *alphahypdigit alphanum ]
      toplabel      = alpha [ *alphahypdigit alphanum ]
      alphahypdigit = alphanum | "-"

   The server SHOULD set this variable. If the hostname is not
   available for performance reasons or otherwise, the server MAY
   substitute the REMOTE_ADDR value.

4.1.10. REMOTE_IDENT

   The REMOTE_IDENT variable MAY be used to provide identity information
   reported about the connection by an RFC 1413 [20] request to the
   remote agent, if available. The server may choose not to support
   this feature, or not to request the data for efficiency reasons, or
   not to return available identity data.

      REMOTE_IDENT = *TEXT

   The data returned may be used for authentication purposes, but the
   level of trust reposed in it should be minimal.

4.1.11. REMOTE_USER

   The REMOTE_USER variable provides a user identification string
   supplied by the client as part of user authentication.

      REMOTE_USER = *TEXT

   If the client request required HTTP Authentication [5] (e.g., the
   AUTH_TYPE meta-variable is set to "Basic" or "Digest"), then the
   value of the REMOTE_USER meta-variable MUST be set to the user-ID
   supplied.

4.1.12. REQUEST_METHOD

   The REQUEST_METHOD meta-variable MUST be set to the method which
   should be used by the script to process the request, as described in
   section 4.3.

      REQUEST_METHOD   = method
      method           = "GET" | "POST" | "HEAD" | extension-method
      extension-method = "PUT" | "DELETE" | token

   The method is case sensitive.
   The HTTP methods are described in
   section 5.1.1 of the HTTP/1.0 specification [1] and section 5.1.1 of
   the HTTP/1.1 specification [4].

4.1.13. SCRIPT_NAME

   The SCRIPT_NAME variable MUST be set to a URI path (not URL-encoded)
   which could identify the CGI script (rather than the script's
   output). The syntax is the same as for PATH_INFO (section 4.1.5).

      SCRIPT_NAME = "" | ( "/" path )

   The leading "/" is not part of the path. It is optional if the path
   is NULL; however, the variable MUST still be set in that case.

   The SCRIPT_NAME string forms some leading part of the path component
   of the Script-URI derived in some implementation-defined manner. No
   PATH_INFO segment (see section 4.1.5) is included in the SCRIPT_NAME
   value.

4.1.14. SERVER_NAME

   The SERVER_NAME variable MUST be set to the name of the server host
   to which the client request is directed. It is a case-insensitive
   hostname or network address. It forms the host part of the
   Script-URI.

      SERVER_NAME = server-name
      server-name = hostname | ipv4-address | ( "[" ipv6-address "]" )

   A deployed server can have more than one possible value for this
   variable, where several HTTP virtual hosts share the same IP address.
   In that case, the server would use the contents of the request's Host
   header field to select the correct virtual host.

4.1.15. SERVER_PORT

   The SERVER_PORT variable MUST be set to the TCP/IP port number on
   which this request is received from the client. This value is used
   in the port part of the Script-URI.

      SERVER_PORT = server-port
      server-port = 1*digit

   Note that this variable MUST be set, even if the port is the default
   port for the scheme and could otherwise be omitted from a URI.

4.1.16.
SERVER_PROTOCOL

   The SERVER_PROTOCOL variable MUST be set to the name and version of
   the application protocol used for this CGI request. This MAY differ
   from the protocol version used by the server in its communication
   with the client.

      SERVER_PROTOCOL   = HTTP-Version | "INCLUDED" | extension-version
      HTTP-Version      = "HTTP" "/" 1*digit "." 1*digit
      extension-version = protocol [ "/" 1*digit "." 1*digit ]
      protocol          = token

   Here, 'protocol' defines the syntax of some of the information
   passing between the server and the script (the 'protocol-specific'
   features). It is not case sensitive and is usually presented in
   upper case. The protocol is not the same as the scheme part of the
   script URI, which defines the overall access mechanism used by the
   client to communicate with the server. For example, a request that
   reaches the script with a protocol of "HTTP" may have used an "https"
   scheme.

   A well-known value for SERVER_PROTOCOL which the server MAY use is
   "INCLUDED", which signals that the current document is being included
   as part of a composite document, rather than being the direct target
   of the client request. The script should treat this as an HTTP/1.0
   request.

4.1.17. SERVER_SOFTWARE

   The SERVER_SOFTWARE meta-variable MUST be set to the name and version
   of the information server software making the CGI request (and
   running the gateway). It SHOULD be the same as the server
   description reported to the client, if any.

      SERVER_SOFTWARE = 1*( product | comment )
      product         = token [ "/" product-version ]
      product-version = token
      comment         = "(" *( ctext | comment ) ")"
      ctext           = <any TEXT excluding "(" and ")">

4.1.18.
Protocol-Specific Meta-Variables

   The server SHOULD set meta-variables specific to the protocol and
   scheme for the request. Interpretation of protocol-specific
   variables depends on the protocol version in SERVER_PROTOCOL. The
   server MAY set a meta-variable with the name of the scheme to a
   non-NULL value if the scheme is not the same as the protocol. The
   presence of such a variable indicates to a script which scheme is
   used by the request.

   Meta-variables with names beginning with "HTTP_" contain values read
   from the client request header fields, if the protocol used is HTTP.
   The HTTP header field name is converted to upper case, has all
   occurrences of "-" replaced with "_" and has "HTTP_" prepended to
   give the meta-variable name. The header data can be presented as
   sent by the client, or can be rewritten in ways which do not change
   its semantics. If multiple header fields with the same field-name
   are received then the server MUST rewrite them as a single value
   having the same semantics. Similarly, a header field that spans
   multiple lines MUST be merged onto a single line. The server MUST,
   if necessary, change the representation of the data (for example, the
   character set) to be appropriate for a CGI meta-variable.

   The server is not required to create meta-variables for all the
   header fields that it receives. In particular, it SHOULD remove any
   header fields carrying authentication information, such as
   'Authorization'; or that are available to the script in other
   variables, such as 'Content-Length' and 'Content-Type'. The server
   MAY remove header fields that relate solely to client-side
   communication issues, such as 'Connection'.

4.2.
Request Message-Body

   Request data is accessed by the script in a system-defined method;
   unless defined otherwise, this will be by reading the 'standard
   input' file descriptor or file handle.

      Request-Data   = [ request-body ] [ extension-data ]
      request-body   = <CONTENT_LENGTH>OCTET
      extension-data = *OCTET

   A request-body is supplied with the request if the CONTENT_LENGTH is
   not NULL. The server MUST make at least that many bytes available
   for the script to read. The server MAY signal an end-of-file
   condition after CONTENT_LENGTH bytes have been read or it MAY supply
   extension data. Therefore, the script MUST NOT attempt to read more
   than CONTENT_LENGTH bytes, even if more data is available. However,
   it is not obliged to read any of the data.

   For non-parsed header (NPH) scripts (section 5), the server SHOULD
   attempt to ensure that the data supplied to the script is precisely
   as supplied by the client and is unaltered by the server.

   As transfer-codings are not supported on the request-body, the server
   MUST remove any such codings from the message-body, and recalculate
   the CONTENT_LENGTH. If this is not possible (for example, because of
   large buffering requirements), the server SHOULD reject the client
   request. It MAY also remove content-codings from the message-body.

4.3. Request Methods

   The Request Method, as supplied in the REQUEST_METHOD meta-variable,
   identifies the processing method to be applied by the script in
   producing a response. The script author can choose to implement the
   methods most appropriate for the particular application. If the
   script receives a request with a method it does not support it SHOULD
   reject it with an error (see section 6.3.3).

4.3.1. GET

   The GET method indicates that the script should produce a document
   based on the meta-variable values.
   By convention, the GET method is
   'safe' and 'idempotent' and SHOULD NOT have the significance of
   taking an action other than producing a document.

   The meaning of the GET method may be modified and refined by
   protocol-specific meta-variables.

4.3.2. POST

   The POST method is used to request that the script perform processing
   and produce a document based on the data in the request message-body,
   in addition to meta-variable values. A common use is form submission
   in HTML [18], intended to initiate processing by the script that has a
   permanent effect, such as a change in a database.

   The script MUST check the value of the CONTENT_LENGTH variable before
   reading the attached message-body, and SHOULD check the CONTENT_TYPE
   value before processing it.

4.3.3. HEAD

   The HEAD method requests the script to do sufficient processing to
   return the response header fields, without providing a response
   message-body. The script MUST NOT provide a response message-body
   for a HEAD request. If it does, then the server MUST discard the
   message-body when reading the response from the script.

4.3.4. Protocol-Specific Methods

   The script MAY implement any protocol-specific method, such as
   HTTP/1.1 PUT and DELETE; it SHOULD check the value of SERVER_PROTOCOL
   when doing so.

   The server MAY decide that some methods are not appropriate or
   permitted for a script, and may handle the methods itself or return
   an error to the client.

4.4. The Script Command Line

   Some systems support a method for supplying an array of strings to
   the CGI script.
   This is only used in the case of an 'indexed' HTTP
   query, which is identified by a 'GET' or 'HEAD' request with a URI
   query string that does not contain any unencoded "=" characters. For
   such a request, the server SHOULD treat the query-string as a
   search-string and parse it into words, using the rules

      search-string = search-word *( "+" search-word )
      search-word   = 1*schar
      schar         = unreserved | escaped | xreserved
      xreserved     = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "," |
                      "$"

   After parsing, each search-word is URL-decoded, optionally encoded in
   a system-defined manner and then added to the command line argument
   list.

   If the server cannot create any part of the argument list, then the
   server MUST NOT generate any command line information. For example,
   the number of arguments may be greater than operating system or
   server limits, or one of the words may not be representable as an
   argument.

   The script SHOULD check to see if the QUERY_STRING value contains an
   unencoded "=" character, and SHOULD NOT use the command line
   arguments if it does.

5. NPH Scripts

5.1. Identification

   The server MAY support NPH (Non-Parsed Header) scripts; these are
   scripts to which the server passes all responsibility for response
   processing.

   This specification provides no mechanism for an NPH script to be
   identified on the basis of its output data alone. By convention,
   therefore, any particular script can only ever provide output of one
   type (NPH or CGI) and hence the script itself is described as an 'NPH
   script'. A server with NPH support MUST provide an
   implementation-defined mechanism for identifying NPH scripts, perhaps
   based on the name or location of the script.

5.2.
NPH Response

   There MUST be a system-defined method for the script to send data
   back to the server or client; a script MUST always return some data.
   Unless defined otherwise, this will be the same as for conventional
   CGI scripts.

   Currently, NPH scripts are only defined for HTTP client requests. An
   (HTTP) NPH script MUST return a complete HTTP response message,
   currently described in section…