Network Working Group                                        D. Robinson
Request for Comments: 3875                                       K. Coar
Category: Informational                   The Apache Software Foundation
                                                            October 2004


            The Common Gateway Interface (CGI) Version 1.1
  6. Status of this Memo
  7. This memo provides information for the Internet community. It does
  8. not specify an Internet standard of any kind. Distribution of this
  9. memo is unlimited.
  10. Copyright Notice
  11. Copyright (C) The Internet Society (2004).
  12. IESG Note
  13. This document is not a candidate for any level of Internet Standard.
  14. The IETF disclaims any knowledge of the fitness of this document for
  15. any purpose, and in particular notes that it has not had IETF review
  16. for such things as security, congestion control or inappropriate
  17. interaction with deployed protocols. The RFC Editor has chosen to
  18. publish this document at its discretion. Readers of this document
  19. should exercise caution in evaluating its value for implementation
  20. and deployment.
  21. Abstract
  22. The Common Gateway Interface (CGI) is a simple interface for running
  23. external programs, software or gateways under an information server
  24. in a platform-independent manner. Currently, the supported
  25. information servers are HTTP servers.
  26. The interface has been in use by the World-Wide Web (WWW) since 1993.
  27. This specification defines the 'current practice' parameters of the
  28. 'CGI/1.1' interface developed and documented at the U.S. National
  29. Centre for Supercomputing Applications. This document also defines
  30. the use of the CGI/1.1 interface on UNIX(R) and other, similar
  31. systems.
  32. Robinson & Coar Informational [Page 1]
  33. RFC 3875 CGI Version 1.1 October 2004
  34. Table of Contents
   1.  Introduction
       1.1.  Purpose
       1.2.  Requirements
       1.3.  Specifications
       1.4.  Terminology
   2.  Notational Conventions and Generic Grammar
       2.1.  Augmented BNF
       2.2.  Basic Rules
       2.3.  URL Encoding
   3.  Invoking the Script
       3.1.  Server Responsibilities
       3.2.  Script Selection
       3.3.  The Script-URI
       3.4.  Execution
   4.  The CGI Request
       4.1.  Request Meta-Variables
             4.1.1.  AUTH_TYPE
             4.1.2.  CONTENT_LENGTH
             4.1.3.  CONTENT_TYPE
             4.1.4.  GATEWAY_INTERFACE
             4.1.5.  PATH_INFO
             4.1.6.  PATH_TRANSLATED
             4.1.7.  QUERY_STRING
             4.1.8.  REMOTE_ADDR
             4.1.9.  REMOTE_HOST
             4.1.10. REMOTE_IDENT
             4.1.11. REMOTE_USER
             4.1.12. REQUEST_METHOD
             4.1.13. SCRIPT_NAME
             4.1.14. SERVER_NAME
             4.1.15. SERVER_PORT
             4.1.16. SERVER_PROTOCOL
             4.1.17. SERVER_SOFTWARE
             4.1.18. Protocol-Specific Meta-Variables
       4.2.  Request Message-Body
       4.3.  Request Methods
             4.3.1.  GET
             4.3.2.  POST
             4.3.3.  HEAD
             4.3.4.  Protocol-Specific Methods
       4.4.  The Script Command Line
   5.  NPH Scripts
       5.1.  Identification
       5.2.  NPH Response
   6.  CGI Response
       6.1.  Response Handling
       6.2.  Response Types
             6.2.1.  Document Response
             6.2.2.  Local Redirect Response
             6.2.3.  Client Redirect Response
             6.2.4.  Client Redirect Response with Document
       6.3.  Response Header Fields
             6.3.1.  Content-Type
             6.3.2.  Location
             6.3.3.  Status
             6.3.4.  Protocol-Specific Header Fields
             6.3.5.  Extension Header Fields
       6.4.  Response Message-Body
   7.  System Specifications
       7.1.  AmigaDOS
       7.2.  UNIX
       7.3.  EBCDIC/POSIX
   8.  Implementation
       8.1.  Recommendations for Servers
       8.2.  Recommendations for Scripts
   9.  Security Considerations
       9.1.  Safe Methods
       9.2.  Header Fields Containing Sensitive Information
       9.3.  Data Privacy
       9.4.  Information Security Model
       9.5.  Script Interference with the Server
       9.6.  Data Length and Buffering Considerations
       9.7.  Stateless Processing
       9.8.  Relative Paths
       9.9.  Non-parsed Header Output
   10. Acknowledgements
   11. References
       11.1. Normative References
       11.2. Informative References
   12. Authors' Addresses
   13. Full Copyright Statement
  120. 1. Introduction
  121. 1.1. Purpose
  122. The Common Gateway Interface (CGI) [22] allows an HTTP [1], [4]
  123. server and a CGI script to share responsibility for responding to
  124. client requests. The client request comprises a Uniform Resource
  125. Identifier (URI) [11], a request method and various ancillary
  126. information about the request provided by the transport protocol.
  127. The CGI defines the abstract parameters, known as meta-variables,
  128. which describe a client's request. Together with a concrete
  129. programmer interface this specifies a platform-independent interface
  130. between the script and the HTTP server.
  131. The server is responsible for managing connection, data transfer,
  132. transport and network issues related to the client request, whereas
  133. the CGI script handles the application issues, such as data access
  134. and document processing.
  135. 1.2. Requirements
  136. The key words 'MUST', 'MUST NOT', 'REQUIRED', 'SHALL', 'SHALL NOT',
  137. 'SHOULD', 'SHOULD NOT', 'RECOMMENDED', 'MAY' and 'OPTIONAL' in this
  138. document are to be interpreted as described in BCP 14, RFC 2119 [3].
  139. An implementation is not compliant if it fails to satisfy one or more
  140. of the 'must' requirements for the protocols it implements. An
  141. implementation that satisfies all of the 'must' and all of the
  142. 'should' requirements for its features is said to be 'unconditionally
  143. compliant'; one that satisfies all of the 'must' requirements but not
  144. all of the 'should' requirements for its features is said to be
  145. 'conditionally compliant'.
  146. 1.3. Specifications
  147. Not all of the functions and features of the CGI are defined in the
  148. main part of this specification. The following phrases are used to
  149. describe the features that are not specified:
  150. 'system-defined'
  151. The feature may differ between systems, but must be the same for
  152. different implementations using the same system. A system will
  153. usually identify a class of operating systems. Some systems are
  154. defined in section 7 of this document. New systems may be defined
  155. by new specifications without revision of this document.
  158. 'implementation-defined'
  159. The behaviour of the feature may vary from implementation to
  160. implementation; a particular implementation must document its
  161. behaviour.
  162. 1.4. Terminology
  163. This specification uses many terms defined in the HTTP/1.1
  164. specification [4]; however, the following terms are used here in a
  165. sense which may not accord with their definitions in that document,
  166. or with their common meaning.
  167. 'meta-variable'
  168. A named parameter which carries information from the server to the
  169. script. It is not necessarily a variable in the operating
  170. system's environment, although that is the most common
  171. implementation.
  172. 'script'
  173. The software that is invoked by the server according to this
  174. interface. It need not be a standalone program, but could be a
  175. dynamically-loaded or shared library, or even a subroutine in the
  176. server. It might be a set of statements interpreted at run-time,
  177. as the term 'script' is frequently understood, but that is not a
  178. requirement and within the context of this specification the term
  179. has the broader definition stated.
  180. 'server'
  181. The application program that invokes the script in order to
  182. service requests from the client.
  183. 2. Notational Conventions and Generic Grammar
  184. 2.1. Augmented BNF
  185. All of the mechanisms specified in this document are described in
  186. both prose and an augmented Backus-Naur Form (BNF) similar to that
  187. used by RFC 822 [13]. Unless stated otherwise, the elements are
  188. case-sensitive. This augmented BNF contains the following
  189. constructs:
  190. name = definition
  191. The name of a rule and its definition are separated by the equals
  192. character ('='). Whitespace is only significant in that
  193. continuation lines of a definition are indented.
  196. "literal"
  197. Double quotation marks (") surround literal text, except for a
  198. literal quotation mark, which is surrounded by angle-brackets ('<'
  199. and '>').
  200. rule1 | rule2
  201. Alternative rules are separated by a vertical bar ('|').
  202. (rule1 rule2 rule3)
  203. Elements enclosed in parentheses are treated as a single element.
  204. *rule
  205. A rule preceded by an asterisk ('*') may have zero or more
  206. occurrences. The full form is 'n*m rule' indicating at least n
  207. and at most m occurrences of the rule. n and m are optional
  208. decimal values with default values of 0 and infinity respectively.
  209. [rule]
  210. An element enclosed in square brackets ('[' and ']') is optional,
  211. and is equivalent to '*1 rule'.
  212. N rule
  213. A rule preceded by a decimal number represents exactly N
  214. occurrences of the rule. It is equivalent to 'N*N rule'.
  215. 2.2. Basic Rules
  216. This specification uses a BNF-like grammar defined in terms of
  217. characters. Unlike many specifications which define the bytes
  218. allowed by a protocol, here each literal in the grammar corresponds
to the character it represents. How these characters are represented
in terms of bits and bytes within a system is either system-defined
or specified in the particular context. The single exception is the
  222. rule 'OCTET', defined below.
  223. The following rules are used throughout this specification to
  224. describe basic parsing constructs.
      alpha         = lowalpha | hialpha
      lowalpha      = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" |
                      "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" |
                      "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" |
                      "y" | "z"
      hialpha       = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" |
                      "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" |
                      "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" |
                      "Y" | "Z"
      digit         = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
                      "8" | "9"
      alphanum      = alpha | digit
      OCTET         = <any 8-bit byte>
      CHAR          = alpha | digit | separator | "!" | "#" | "$" |
                      "%" | "&" | "'" | "*" | "+" | "-" | "." | "`" |
                      "^" | "_" | "{" | "|" | "}" | "~" | CTL
      CTL           = <any control character>
      SP            = <space character>
      HT            = <horizontal tab character>
      NL            = <newline>
      LWSP          = SP | HT | NL
      separator     = "(" | ")" | "<" | ">" | "@" | "," | ";" | ":" |
                      "\" | <"> | "/" | "[" | "]" | "?" | "=" | "{" |
                      "}" | SP | HT
      token         = 1*<any CHAR except CTLs or separators>
      quoted-string = <"> *qdtext <">
      qdtext        = <any CHAR except <"> and CTLs but including LWSP>
      TEXT          = <any printable character>
  255. Note that newline (NL) need not be a single control character, but
  256. can be a sequence of control characters. A system MAY define TEXT to
  257. be a larger set of characters than <any CHAR excluding CTLs but
  258. including LWSP>.
  259. 2.3. URL Encoding
  260. Some variables and constructs used here are described as being
  261. 'URL-encoded'. This encoding is described in section 2 of RFC 2396
  262. [2]. In a URL-encoded string an escape sequence consists of a
  263. percent character ("%") followed by two hexadecimal digits, where the
  264. two hexadecimal digits form an octet. An escape sequence represents
  265. the graphic character that has the octet as its code within the
  266. US-ASCII [9] coded character set, if it exists. Currently there is
  267. no provision within the URI syntax to identify which character set
  268. non-ASCII codes represent, so CGI handles this issue on an ad-hoc
  269. basis.
  270. Note that some unsafe (reserved) characters may have different
  271. semantics when encoded. The definition of which characters are
  272. unsafe depends on the context; see section 2 of RFC 2396 [2], updated
  273. by RFC 2732 [7], for an authoritative treatment. These reserved
  274. characters are generally used to provide syntactic structure to the
  275. character string, for example as field separators. In all cases, the
  276. string is first processed with regard to any reserved characters
  277. present, and then the resulting data can be URL-decoded by replacing
  278. "%" escape sequences by their character values.
  281. To encode a character string, all reserved and forbidden characters
  282. are replaced by the corresponding "%" escape sequences. The string
  283. can then be used in assembling a URI. The reserved characters will
  284. vary from context to context, but will always be drawn from this set:
      reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | "$" |
                 "," | "[" | "]"
  287. The last two characters were added by RFC 2732 [7]. In any
  288. particular context, a sub-set of these characters will be reserved;
  289. the other characters from this set MUST NOT be encoded when a string
  290. is URL-encoded in that context. Other basic rules used to describe
  291. URI syntax are:
      hex        = digit | "A" | "B" | "C" | "D" | "E" | "F" | "a" | "b"
                   | "c" | "d" | "e" | "f"
      escaped    = "%" hex hex
      unreserved = alpha | digit | mark
      mark       = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"
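
As an illustration of the decoding order described in this section (split on the reserved characters first, URL-decode second), here is a minimal Python sketch. The helper names decode_pairs and encode_component, and the use of "&" and "=" as field separators, are assumptions made for the example and are not defined by this specification. The standard-library unquote call also assumes UTF-8 for non-ASCII octets, which is one possible ad-hoc choice.

```python
from urllib.parse import quote, unquote


def decode_pairs(encoded: str) -> list:
    """Split on the reserved separators first, then URL-decode each piece."""
    pairs = []
    for field in encoded.split("&"):
        if not field:
            continue
        name, _, value = field.partition("=")
        # Decoding happens last, so an escaped "&" or "=" stays literal data.
        pairs.append((unquote(name), unquote(value)))
    return pairs


def encode_component(text: str, keep: str = "") -> str:
    """Percent-encode everything except unreserved characters and `keep`.

    `keep` lists the characters that are NOT reserved in the current context.
    """
    # The RFC 2396 'mark' characters are unreserved and are left alone.
    return quote(text, safe="-_.!~*'()" + keep)
```

Decoding the whole string before splitting would turn an escaped "&" inside a value into a field separator, which is why the order matters.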
  297. 3. Invoking the Script
  298. 3.1. Server Responsibilities
  299. The server acts as an application gateway. It receives the request
  300. from the client, selects a CGI script to handle the request, converts
  301. the client request to a CGI request, executes the script and converts
  302. the CGI response into a response for the client. When processing the
  303. client request, it is responsible for implementing any protocol or
  304. transport level authentication and security. The server MAY also
  305. function in a 'non-transparent' manner, modifying the request or
  306. response in order to provide some additional service, such as media
  307. type transformation or protocol reduction.
  308. The server MUST perform translations and protocol conversions on the
  309. client request data required by this specification. Furthermore, the
  310. server retains its responsibility to the client to conform to the
  311. relevant network protocol even if the CGI script fails to conform to
  312. this specification.
  313. If the server is applying authentication to the request, then it MUST
  314. NOT execute the script unless the request passes all defined access
  315. controls.
  318. 3.2. Script Selection
The server determines which CGI script is to be executed based on a
  320. generic-form URI supplied by the client. This URI includes a
  321. hierarchical path with components separated by "/". For any
  322. particular request, the server will identify all or a leading part of
  323. this path with an individual script, thus placing the script at a
  324. particular point in the path hierarchy. The remainder of the path,
  325. if any, is a resource or sub-resource identifier to be interpreted by
  326. the script.
  327. Information about this split of the path is available to the script
  328. in the meta-variables, described below. Support for non-hierarchical
  329. URI schemes is outside the scope of this specification.
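
The path split described above might be sketched as follows in Python. The mapping from URI to script is defined by the server and its configuration, so the list of script mount points and the longest-prefix rule used here are purely hypothetical choices for illustration.

```python
from typing import Optional, Sequence, Tuple


def split_path(uri_path: str, script_mounts: Sequence[str]) -> Optional[Tuple[str, str]]:
    """Return (script_path, extra_path) for the longest matching mount, or None."""
    for mount in sorted(script_mounts, key=len, reverse=True):
        # Match either the script itself or the script followed by "/...".
        if uri_path == mount or uri_path.startswith(mount + "/"):
            return mount, uri_path[len(mount):]
    return None  # no script is placed at this point in the hierarchy
```

The first element identifies the script and the second is the remainder of the path, left for the script to interpret.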
  330. 3.3. The Script-URI
  331. The mapping from client request URI to choice of script is defined by
  332. the particular server implementation and its configuration. The
  333. server may allow the script to be identified with a set of several
  334. different URI path hierarchies, and therefore is permitted to replace
  335. the URI by other members of this set during processing and generation
  336. of the meta-variables. The server
  337. 1. MAY preserve the URI in the particular client request; or
  338. 2. it MAY select a canonical URI from the set of possible values
  339. for each script; or
  340. 3. it can implement any other selection of URI from the set.
  341. From the meta-variables thus generated, a URI, the 'Script-URI', can
  342. be constructed. This MUST have the property that if the client had
  343. accessed this URI instead, then the script would have been executed
  344. with the same values for the SCRIPT_NAME, PATH_INFO and QUERY_STRING
  345. meta-variables. The Script-URI has the structure of a generic URI as
  346. defined in section 3 of RFC 2396 [2], with the exception that object
  347. parameters and fragment identifiers are not permitted. The various
  348. components of the Script-URI are defined by some of the
  349. meta-variables (see below);
      script-URI = <scheme> "://" <server-name> ":" <server-port>
                   <script-path> <extra-path> "?" <query-string>
  352. where <scheme> is found from SERVER_PROTOCOL, <server-name>,
  353. <server-port> and <query-string> are the values of the respective
  354. meta-variables. The SCRIPT_NAME and PATH_INFO values, URL-encoded
  355. with ";", "=" and "?" reserved, give <script-path> and <extra-path>.
  358. See section 4.1.5 for more information about the PATH_INFO
  359. meta-variable.
  360. The scheme and the protocol are not identical as the scheme
  361. identifies the access method in addition to the application protocol.
  362. For example, a resource accessed using Transport Layer Security (TLS)
  363. [14] would have a request URI with a scheme of https when using the
  364. HTTP protocol [19]. CGI/1.1 provides no generic means for the script
  365. to reconstruct this, and therefore the Script-URI as defined includes
  366. the base protocol used. However, a script MAY make use of
  367. scheme-specific meta-variables to better deduce the URI scheme.
  368. Note that this definition also allows URIs to be constructed which
  369. would invoke the script with any permitted values for the path-info
  370. or query-string, by modifying the appropriate components.
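
A minimal Python sketch of assembling the Script-URI from the meta-variables follows. The function name script_uri and the dictionary-based interface are assumptions for the example. Two choices are also the sketch's own: the "?" is appended only when QUERY_STRING is non-empty, and non-ASCII characters are encoded as UTF-8, which the specification leaves system-defined. As noted above, the scheme derived this way is the base protocol, so an https request would still yield "http".

```python
from typing import Mapping
from urllib.parse import quote

# Characters left unescaped in the path: "/", the RFC 2396 unreserved marks,
# and the reserved characters other than ";", "=" and "?".
_PATH_SAFE = "/:@&+$,-_.!~*'()"


def script_uri(env: Mapping[str, str]) -> str:
    # SERVER_PROTOCOL looks like "HTTP/1.1"; the scheme is the lower-cased name.
    scheme = env["SERVER_PROTOCOL"].split("/", 1)[0].lower()
    # SCRIPT_NAME and PATH_INFO are URL-encoded with ";", "=" and "?" reserved.
    path = quote(env.get("SCRIPT_NAME", "") + env.get("PATH_INFO", ""), safe=_PATH_SAFE)
    uri = f"{scheme}://{env['SERVER_NAME']}:{env['SERVER_PORT']}{path}"
    query = env.get("QUERY_STRING", "")  # already URL-encoded by the server
    return f"{uri}?{query}" if query else uri
```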
  371. 3.4. Execution
  372. The script is invoked in a system-defined manner. Unless specified
  373. otherwise, the file containing the script will be invoked as an
  374. executable program. The server prepares the CGI request as described
  375. in section 4; this comprises the request meta-variables (immediately
  376. available to the script on execution) and request message data. The
  377. request data need not be immediately available to the script; the
  378. script can be executed before all this data has been received by the
  379. server from the client. The response from the script is returned to
  380. the server as described in sections 5 and 6.
  381. In the event of an error condition, the server can interrupt or
  382. terminate script execution at any time and without warning. That
  383. could occur, for example, in the event of a transport failure between
  384. the server and the client; so the script SHOULD be prepared to handle
  385. abnormal termination.
  386. 4. The CGI Request
Information about a request comes from two different sources: the
request meta-variables and any associated message-body.
  389. 4.1. Request Meta-Variables
  390. Meta-variables contain data about the request passed from the server
  391. to the script, and are accessed by the script in a system-defined
  392. manner. Meta-variables are identified by case-insensitive names;
  393. there cannot be two different variables whose names differ in case
  394. only. Here they are shown using a canonical representation of
  395. capitals plus underscore ("_"). A particular system can define a
  396. different representation.
      meta-variable-name = "AUTH_TYPE" | "CONTENT_LENGTH" |
                           "CONTENT_TYPE" | "GATEWAY_INTERFACE" |
                           "PATH_INFO" | "PATH_TRANSLATED" |
                           "QUERY_STRING" | "REMOTE_ADDR" |
                           "REMOTE_HOST" | "REMOTE_IDENT" |
                           "REMOTE_USER" | "REQUEST_METHOD" |
                           "SCRIPT_NAME" | "SERVER_NAME" |
                           "SERVER_PORT" | "SERVER_PROTOCOL" |
                           "SERVER_SOFTWARE" | scheme |
                           protocol-var-name | extension-var-name
      protocol-var-name  = ( protocol | scheme ) "_" var-name
      scheme             = alpha *( alpha | digit | "+" | "-" | "." )
      var-name           = token
      extension-var-name = token
  413. Meta-variables with the same name as a scheme, and names beginning
  414. with the name of a protocol or scheme (e.g., HTTP_ACCEPT) are also
  415. defined. The number and meaning of these variables may change
  416. independently of this specification. (See also section 4.1.18.)
  417. The server MAY set additional implementation-defined extension meta-
  418. variables, whose names SHOULD be prefixed with "X_".
  419. This specification does not distinguish between zero-length (NULL)
  420. values and missing values. For example, a script cannot distinguish
  421. between the two requests http://host/script and http://host/script?
  422. as in both cases the QUERY_STRING meta-variable would be NULL.
  423. meta-variable-value = "" | 1*<TEXT, CHAR or tokens of value>
  424. An optional meta-variable may be omitted (left unset) if its value is
  425. NULL. Meta-variable values MUST be considered case-sensitive except
  426. as noted otherwise. The representation of the characters in the
  427. meta-variables is system-defined; the server MUST convert values to
  428. that representation.
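
A script-side accessor consistent with these rules might look like the Python sketch below. It assumes a system in which meta-variables are exposed as upper-case environment variables, which is the most common implementation but not the only one permitted; the function name meta_variable is hypothetical.

```python
import os
from typing import Mapping


def meta_variable(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Return a meta-variable's value, treating unset and zero-length alike."""
    # Names are case-insensitive; this sketch assumes the canonical
    # upper-case representation is the one used by the environment.
    return env.get(name.upper(), "")
```

Returning an empty string for a missing variable mirrors the rule that a script cannot distinguish a NULL value from an omitted one.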
  429. 4.1.1. AUTH_TYPE
  430. The AUTH_TYPE variable identifies any mechanism used by the server to
  431. authenticate the user. It contains a case-insensitive value defined
  432. by the client protocol or server implementation.
  433. For HTTP, if the client request required authentication for external
  434. access, then the server MUST set the value of this variable from the
  435. 'auth-scheme' token in the request Authorization header field.
  438. AUTH_TYPE = "" | auth-scheme
  439. auth-scheme = "Basic" | "Digest" | extension-auth
  440. extension-auth = token
  441. HTTP access authentication schemes are described in RFC 2617 [5].
  442. 4.1.2. CONTENT_LENGTH
  443. The CONTENT_LENGTH variable contains the size of the message-body
  444. attached to the request, if any, in decimal number of octets. If no
  445. data is attached, then NULL (or unset).
  446. CONTENT_LENGTH = "" | 1*digit
  447. The server MUST set this meta-variable if and only if the request is
  448. accompanied by a message-body entity. The CONTENT_LENGTH value must
  449. reflect the length of the message-body after the server has removed
  450. any transfer-codings or content-codings.
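
As a sketch of how a script might consume the message-body using this variable, the Python function below reads exactly CONTENT_LENGTH octets from a binary stream. The function name read_body and the decision to raise an error on malformed or short input are assumptions for the example, not requirements of this specification.

```python
from typing import BinaryIO, Mapping


def read_body(env: Mapping[str, str], stream: BinaryIO) -> bytes:
    """Read the request message-body as announced by CONTENT_LENGTH."""
    raw = env.get("CONTENT_LENGTH", "")
    if raw == "":
        return b""  # NULL or unset: no message-body is attached
    if not (raw.isascii() and raw.isdigit()):
        raise ValueError(f"malformed CONTENT_LENGTH: {raw!r}")
    length = int(raw)
    body = stream.read(length)  # never read past the announced length
    if len(body) != length:
        raise EOFError("message-body shorter than CONTENT_LENGTH")
    return body
```

On a system where the body arrives on standard input, the call would typically be read_body(os.environ, sys.stdin.buffer).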
  451. 4.1.3. CONTENT_TYPE
  452. If the request includes a message-body, the CONTENT_TYPE variable is
  453. set to the Internet Media Type [6] of the message-body.
  454. CONTENT_TYPE = "" | media-type
  455. media-type = type "/" subtype *( ";" parameter )
  456. type = token
  457. subtype = token
  458. parameter = attribute "=" value
  459. attribute = token
  460. value = token | quoted-string
  461. The type, subtype and parameter attribute names are not
  462. case-sensitive. Parameter values may be case sensitive. Media types
and their use in HTTP are described in section 3.7 of the HTTP/1.1
  464. specification [4].
  465. There is no default value for this variable. If and only if it is
  466. unset, then the script MAY attempt to determine the media type from
  467. the data received. If the type remains unknown, then the script MAY
  468. choose to assume a type of application/octet-stream or it may reject
  469. the request with an error (as described in section 6.3.3).
  470. Each media-type defines a set of optional and mandatory parameters.
  471. This may include a charset parameter with a case-insensitive value
  472. defining the coded character set for the message-body. If the
  475. charset parameter is omitted, then the default value should be
  476. derived according to whichever of the following rules is the first to
  477. apply:
  478. 1. There MAY be a system-defined default charset for some
  479. media-types.
  480. 2. The default for media-types of type "text" is ISO-8859-1 [4].
  481. 3. Any default defined in the media-type specification.
  482. 4. The default is US-ASCII.
  483. The server MUST set this meta-variable if an HTTP Content-Type field
  484. is present in the client request header. If the server receives a
  485. request with an attached entity but no Content-Type header field, it
  486. MAY attempt to determine the correct content type, otherwise it
  487. should omit this meta-variable.
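
The charset defaulting rules for a script can be sketched in Python as follows. The sketch uses the standard library's email.message.Message to parse the media type and its parameters. It models only rule 2 (type "text") and rule 4 (US-ASCII); system-defined defaults and defaults from individual media-type specifications are omitted. The function name body_charset is hypothetical.

```python
from email.message import Message


def body_charset(content_type: str) -> str:
    """Pick a charset for the message-body from a CONTENT_TYPE value."""
    if not content_type.strip():
        # There is no default media type, so nothing can be inferred here.
        raise ValueError("CONTENT_TYPE is NULL or unset")
    msg = Message()
    msg["Content-Type"] = content_type
    declared = msg.get_content_charset()  # lower-cased, or None if absent
    if declared:
        return declared
    if msg.get_content_maintype() == "text":
        return "iso-8859-1"  # default for media types of type "text"
    return "us-ascii"  # final fallback
```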
  488. 4.1.4. GATEWAY_INTERFACE
  489. The GATEWAY_INTERFACE variable MUST be set to the dialect of CGI
  490. being used by the server to communicate with the script. Syntax:
  491. GATEWAY_INTERFACE = "CGI" "/" 1*digit "." 1*digit
  492. Note that the major and minor numbers are treated as separate
  493. integers and hence each may be incremented higher than a single
  494. digit. Thus CGI/2.4 is a lower version than CGI/2.13 which in turn
  495. is lower than CGI/12.3. Leading zeros MUST be ignored by the script
  496. and MUST NOT be generated by the server.
  497. This document defines the 1.1 version of the CGI interface.
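
Because the major and minor numbers are separate integers, a version comparison should not be done on the raw string. A minimal Python sketch, with the hypothetical helper name cgi_version, parses the value into a pair of integers that compare in the intended order:

```python
import re

_VERSION = re.compile(r"CGI/([0-9]+)\.([0-9]+)")


def cgi_version(value: str) -> tuple:
    """Parse a GATEWAY_INTERFACE value into a (major, minor) integer pair."""
    match = _VERSION.fullmatch(value)
    if match is None:
        raise ValueError(f"not a CGI version: {value!r}")
    # int() discards leading zeros, which the script is required to ignore.
    return int(match.group(1)), int(match.group(2))


# Tuples compare element by element, so 2.4 < 2.13 < 12.3 holds.
assert cgi_version("CGI/2.4") < cgi_version("CGI/2.13") < cgi_version("CGI/12.3")
```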
  498. 4.1.5. PATH_INFO
  499. The PATH_INFO variable specifies a path to be interpreted by the CGI
  500. script. It identifies the resource or sub-resource to be returned by
  501. the CGI script, and is derived from the portion of the URI path
  502. hierarchy following the part that identifies the script itself.
  503. Unlike a URI path, the PATH_INFO is not URL-encoded, and cannot
  504. contain path-segment parameters. A PATH_INFO of "/" represents a
  505. single void path segment.
  506. PATH_INFO = "" | ( "/" path )
  507. path = lsegment *( "/" lsegment )
  508. lsegment = *lchar
  509. lchar = <any TEXT or CTL except "/">
  512. The value is considered case-sensitive and the server MUST preserve
  513. the case of the path as presented in the request URI. The server MAY
  514. impose restrictions and limitations on what values it permits for
  515. PATH_INFO, and MAY reject the request with an error if it encounters
  516. any values considered objectionable. That MAY include any requests
  517. that would result in an encoded "/" being decoded into PATH_INFO, as
  518. this might represent a loss of information to the script. Similarly,
  519. treatment of non US-ASCII characters in the path is system-defined.
  520. URL-encoded, the PATH_INFO string forms the extra-path component of
  521. the Script-URI (see section 3.3) which follows the SCRIPT_NAME part
  522. of that path.
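
One way a server might derive PATH_INFO from the URL-encoded extra path is sketched below in Python. Rejecting any encoded "/" is one of the restrictions the server is permitted to impose, not a requirement. The helper name path_info_from and the UTF-8 interpretation of non-ASCII octets are assumptions for the example.

```python
import re
from urllib.parse import unquote

_ENCODED_SLASH = re.compile(r"%2f", re.IGNORECASE)


def path_info_from(extra_path: str) -> str:
    """Decode the extra path into PATH_INFO, refusing an encoded "/"."""
    if _ENCODED_SLASH.search(extra_path):
        # Decoding "%2F" would make it indistinguishable from a real
        # segment separator, losing information for the script.
        raise ValueError("encoded '/' in extra path")
    return unquote(extra_path)
```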
  523. 4.1.6. PATH_TRANSLATED
  524. The PATH_TRANSLATED variable is derived by taking the PATH_INFO
  525. value, parsing it as a local URI in its own right, and performing any
  526. virtual-to-physical translation appropriate to map it onto the
  527. server's document repository structure. The set of characters
  528. permitted in the result is system-defined.
  529. PATH_TRANSLATED = *<any character>
  530. This is the file location that would be accessed by a request for
  531. <scheme> "://" <server-name> ":" <server-port> <extra-path>
  532. where <scheme> is the scheme for the original client request and
  533. <extra-path> is a URL-encoded version of PATH_INFO, with ";", "=" and
  534. "?" reserved. For example, a request such as the following:
  535. http://somehost.com/cgi-bin/somescript/this%2eis%2ethe%2epath%3binfo
  536. would result in a PATH_INFO value of
  537. /this.is.the.path;info
  538. An internal URI is constructed from the scheme, server location and
  539. the URL-encoded PATH_INFO:
  540. http://somehost.com/this.is.the.path%3binfo
  541. This would then be translated to a location in the server's document
  542. repository, perhaps a filesystem path something like this:
  543. /usr/local/www/htdocs/this.is.the.path;info
  544. The value of PATH_TRANSLATED is the result of the translation.
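The derivation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the translation any particular server performs: the function names, the DOCUMENT_ROOT value and the choice to map the path directly below a single directory are all hypothetical, and real servers apply their own aliasing and access rules.

```python
import posixpath
from urllib.parse import unquote

# Hypothetical document root, used only for this illustration.
DOCUMENT_ROOT = "/usr/local/www/htdocs"


def extra_path(request_path, script_name):
    """Return the URL-decoded extra path that follows SCRIPT_NAME."""
    if not request_path.startswith(script_name):
        raise ValueError("request path does not begin with SCRIPT_NAME")
    return unquote(request_path[len(script_name):])


def translate(path_info, document_root=DOCUMENT_ROOT):
    """Map a decoded PATH_INFO onto a location below document_root."""
    if path_info == "":
        return None  # no extra path, so PATH_TRANSLATED stays unset
    if not path_info.startswith("/"):
        raise ValueError("a non-empty PATH_INFO must begin with '/'")
    # normpath() resolves "." and ".." segments; for an absolute path it
    # never climbs above "/", so the joined result stays under the root.
    relative = posixpath.normpath(path_info).lstrip("/")
    return posixpath.join(document_root, relative)
```

With a request path of /cgi-bin/somescript/this%2eis%2ethe%2epath%3binfo and a SCRIPT_NAME of /cgi-bin/somescript, extra_path() yields /this.is.the.path;info, and translate() maps that value below the assumed document root.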
  547. The value is derived in this way irrespective of whether it maps to a
  548. valid repository location. The server MUST preserve the case of the
  549. extra-path segment unless the underlying repository supports case-
  550. insensitive names. If the repository is only case-aware, case-
  551. preserving, or case-blind with regard to document names, the server
  552. is not required to preserve the case of the original segment through
  553. the translation.
  554. The translation algorithm the server uses to derive PATH_TRANSLATED
  555. is implementation-defined; CGI scripts which use this variable may
  556. suffer limited portability.
  557. The server SHOULD set this meta-variable if the request URI includes
  558. a path-info component. If PATH_INFO is NULL, then the
  559. PATH_TRANSLATED variable MUST be set to NULL (or unset).
  560. 4.1.7. QUERY_STRING
  561. The QUERY_STRING variable contains a URL-encoded search or parameter
  562. string; it provides information to the CGI script to affect or refine
  563. the document to be returned by the script.
  564. The URL syntax for a search string is described in section 3 of RFC
  565. 2396 [2]. The QUERY_STRING value is case-sensitive.
  566. QUERY_STRING = query-string
  567. query-string = *uric
  568. uric = reserved | unreserved | escaped
  569. When parsing and decoding the query string, the details of the
  570. parsing, reserved characters and support for non US-ASCII characters
  571. depend on the context. For example, form submission from an HTML
  572. document [18] uses application/x-www-form-urlencoded encoding, in
  573. which the characters "+", "&" and "=" are reserved, and the ISO
  574. 8859-1 encoding may be used for non US-ASCII characters.
  575. The QUERY_STRING value provides the query-string part of the
  576. Script-URI. (See section 3.3).
  577. The server MUST set this variable; if the Script-URI does not include
  578. a query component, the QUERY_STRING MUST be defined as an empty
  579. string ("").
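As an illustration of the form-submission case, a script might decode QUERY_STRING as sketched below. The function name form_fields and its parameters are assumptions made for this example. The decoding itself is done by parse_qs from the Python standard library, and the character encoding is left to the caller because it depends on the context.

```python
import os
from urllib.parse import parse_qs


def form_fields(environ=os.environ, encoding="utf-8"):
    """Decode an application/x-www-form-urlencoded QUERY_STRING.

    Returns a dict that maps each field name to a list of values.
    """
    # The server always sets QUERY_STRING; "" means no query component.
    query = environ.get("QUERY_STRING", "")
    # parse_qs splits on "&" and "=", turns "+" into a space and decodes
    # %XX escapes using the requested character encoding.
    return parse_qs(query, keep_blank_values=True, encoding=encoding)
```

Passing encoding="iso-8859-1" would cover a client that submits form data in ISO 8859-1; which encoding a given client really used cannot be determined from the query string alone.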
  580. 4.1.8. REMOTE_ADDR
  581. The REMOTE_ADDR variable MUST be set to the network address of the
  582. client sending the request to the server.
  585. REMOTE_ADDR = hostnumber
  586. hostnumber = ipv4-address | ipv6-address
  587. ipv4-address = 1*3digit "." 1*3digit "." 1*3digit "." 1*3digit
  588. ipv6-address = hexpart [ ":" ipv4-address ]
  589. hexpart = hexseq | ( [ hexseq ] "::" [ hexseq ] )
  590. hexseq = 1*4hex *( ":" 1*4hex )
  591. The format of an IPv6 address is described in RFC 3513 [15].
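A script that needs to interpret REMOTE_ADDR, rather than merely log it, could parse the value as sketched here. The helper name parse_remote_addr is hypothetical. The sketch relies on the Python standard ipaddress module, which is stricter than the grammar above: for example, it rejects an IPv4 field such as "999" that 1*3digit would permit.

```python
import ipaddress


def parse_remote_addr(value):
    """Parse a REMOTE_ADDR value into an IPv4Address or IPv6Address.

    Raises ValueError if the value is not a valid address literal.
    """
    return ipaddress.ip_address(value)
```

The version attribute of the result (4 or 6) then tells the script which address family the client used.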
  592. 4.1.9. REMOTE_HOST
  593. The REMOTE_HOST variable contains the fully qualified domain name of
  594. the client sending the request to the server, if available, otherwise
  595. NULL. Fully qualified domain names take the form described in
  596. section 3.5 of RFC 1034 [17] and section 2.1 of RFC 1123 [12].
  597. Domain names are not case sensitive.
  598. REMOTE_HOST = "" | hostname | hostnumber
  599. hostname = *( domainlabel "." ) toplabel [ "." ]
  600. domainlabel = alphanum [ *alphahypdigit alphanum ]
  601. toplabel = alpha [ *alphahypdigit alphanum ]
  602. alphahypdigit = alphanum | "-"
  603. The server SHOULD set this variable. If the hostname is not
  604. available for performance reasons or otherwise, the server MAY
  605. substitute the REMOTE_ADDR value.
  606. 4.1.10. REMOTE_IDENT
  607. The REMOTE_IDENT variable MAY be used to provide identity information
  608. reported about the connection by an RFC 1413 [20] request to the
  609. remote agent, if available. The server may choose not to support
  610. this feature, or not to request the data for efficiency reasons, or
  611. not to return available identity data.
  612. REMOTE_IDENT = *TEXT
  613. The data returned may be used for authentication purposes, but the
  614. level of trust reposed in it should be minimal.
  615. 4.1.11. REMOTE_USER
  616. The REMOTE_USER variable provides a user identification string
  617. supplied by the client as part of user authentication.
  618. REMOTE_USER = *TEXT
  621. If the client request required HTTP Authentication [5] (e.g., the
  622. AUTH_TYPE meta-variable is set to "Basic" or "Digest"), then the
  623. value of the REMOTE_USER meta-variable MUST be set to the user-ID
  624. supplied.
  625. 4.1.12. REQUEST_METHOD
  626. The REQUEST_METHOD meta-variable MUST be set to the method which
  627. should be used by the script to process the request, as described in
  628. section 4.3.
  629. REQUEST_METHOD = method
  630. method = "GET" | "POST" | "HEAD" | extension-method
  631. extension-method = "PUT" | "DELETE" | token
  632. The method is case sensitive. The HTTP methods are described in
  633. section 5.1.1 of the HTTP/1.0 specification [1] and section 5.1.1 of
  634. the HTTP/1.1 specification [4].
  635. 4.1.13. SCRIPT_NAME
  636. The SCRIPT_NAME variable MUST be set to a URI path (not URL-encoded)
  637. which could identify the CGI script (rather than the script's
  638. output). The syntax is the same as for PATH_INFO (section 4.1.5).
  639. SCRIPT_NAME = "" | ( "/" path )
  640. The leading "/" is not part of the path. It is optional if the path
  641. is NULL; however, the variable MUST still be set in that case.
  642. The SCRIPT_NAME string forms some leading part of the path component
  643. of the Script-URI derived in some implementation-defined manner. No
  644. PATH_INFO segment (see section 4.1.5) is included in the SCRIPT_NAME
  645. value.
  646. 4.1.14. SERVER_NAME
  647. The SERVER_NAME variable MUST be set to the name of the server host
  648. to which the client request is directed. It is a case-insensitive
  649. hostname or network address. It forms the host part of the
  650. Script-URI.
  651. SERVER_NAME = server-name
  652. server-name = hostname | ipv4-address | ( "[" ipv6-address "]" )
  655. A deployed server can have more than one possible value for this
  656. variable, where several HTTP virtual hosts share the same IP address.
  657. In that case, the server would use the contents of the request's Host
  658. header field to select the correct virtual host.
  659. 4.1.15. SERVER_PORT
  660. The SERVER_PORT variable MUST be set to the TCP/IP port number on
  661. which this request is received from the client. This value is used
  662. in the port part of the Script-URI.
  663. SERVER_PORT = server-port
  664. server-port = 1*digit
  665. Note that this variable MUST be set, even if the port is the default
  666. port for the scheme and could otherwise be omitted from a URI.
  667. 4.1.16. SERVER_PROTOCOL
  668. The SERVER_PROTOCOL variable MUST be set to the name and version of
  669. the application protocol used for this CGI request. This MAY differ
  670. from the protocol version used by the server in its communication
  671. with the client.
  672. SERVER_PROTOCOL = HTTP-Version | "INCLUDED" | extension-version
  673. HTTP-Version = "HTTP" "/" 1*digit "." 1*digit
  674. extension-version = protocol [ "/" 1*digit "." 1*digit ]
  675. protocol = token
  676. Here, 'protocol' defines the syntax of some of the information
  677. passing between the server and the script (the 'protocol-specific'
  678. features). It is not case sensitive and is usually presented in
  679. upper case. The protocol is not the same as the scheme part of the
  680. script URI, which defines the overall access mechanism used by the
  681. client to communicate with the server. For example, a request that
  682. reaches the script with a protocol of "HTTP" may have used an "https"
  683. scheme.
  684. A well-known value for SERVER_PROTOCOL which the server MAY use is
  685. "INCLUDED", which signals that the current document is being included
  686. as part of a composite document, rather than being the direct target
  687. of the client request. The script should treat this as an HTTP/1.0
  688. request.
  691. 4.1.17. SERVER_SOFTWARE
  692. The SERVER_SOFTWARE meta-variable MUST be set to the name and version
  693. of the information server software making the CGI request (and
  694. running the gateway). It SHOULD be the same as the server
  695. description reported to the client, if any.
  696. SERVER_SOFTWARE = 1*( product | comment )
  697. product = token [ "/" product-version ]
  698. product-version = token
  699. comment = "(" *( ctext | comment ) ")"
  700. ctext = <any TEXT excluding "(" and ")">
  701. 4.1.18. Protocol-Specific Meta-Variables
  702. The server SHOULD set meta-variables specific to the protocol and
  703. scheme for the request. Interpretation of protocol-specific
  704. variables depends on the protocol version in SERVER_PROTOCOL. The
  705. server MAY set a meta-variable with the name of the scheme to a
  706. non-NULL value if the scheme is not the same as the protocol. The
  707. presence of such a variable indicates to a script which scheme is
  708. used by the request.
  709. Meta-variables with names beginning with "HTTP_" contain values read
  710. from the client request header fields, if the protocol used is HTTP.
  711. The HTTP header field name is converted to upper case, has all
  712. occurrences of "-" replaced with "_" and has "HTTP_" prepended to
  713. give the meta-variable name. The header data can be presented as
  714. sent by the client, or can be rewritten in ways which do not change
  715. its semantics. If multiple header fields with the same field-name
  716. are received then the server MUST rewrite them as a single value
  717. having the same semantics. Similarly, a header field that spans
  718. multiple lines MUST be merged onto a single line. The server MUST,
  719. if necessary, change the representation of the data (for example, the
  720. character set) to be appropriate for a CGI meta-variable.
  721. The server is not required to create meta-variables for all the
  722. header fields that it receives. In particular, it SHOULD remove any
  723. header fields carrying authentication information, such as
  724. 'Authorization'; or that are available to the script in other
  725. variables, such as 'Content-Length' and 'Content-Type'. The server
  726. MAY remove header fields that relate solely to client-side
  727. communication issues, such as 'Connection'.
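The naming and merging rules for "HTTP_" meta-variables can be sketched from the server's side as follows. The function name, the OMITTED set and the input shape (a list of name/value pairs) are assumptions made for this example. Joining repeated fields with ", " follows the usual HTTP convention for list-valued fields and would not preserve the semantics of every header field.

```python
# Header fields this sketch does not pass on: credentials, fields already
# exposed through other meta-variables, and connection-level fields.
OMITTED = {"authorization", "content-length", "content-type", "connection"}


def header_meta_variables(headers):
    """Build HTTP_* meta-variables from (field-name, value) pairs."""
    variables = {}
    for name, value in headers:
        if name.lower() in OMITTED:
            continue
        # Upper-case the field name, replace "-" with "_", prepend "HTTP_".
        key = "HTTP_" + name.upper().replace("-", "_")
        # Merge a field that spans several lines onto a single line.
        value = " ".join(part.strip() for part in value.splitlines())
        if key in variables:
            # Repeated fields become one value (assumes list semantics).
            variables[key] += ", " + value
        else:
            variables[key] = value
    return variables
```

For example, two Accept fields received with the values text/html and text/plain would be presented to the script as a single HTTP_ACCEPT variable holding "text/html, text/plain".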
  730. 4.2. Request Message-Body
  731. Request data is accessed by the script in a system-defined method;
  732. unless defined otherwise, this will be by reading the 'standard
  733. input' file descriptor or file handle.
  734. Request-Data = [ request-body ] [ extension-data ]
  735. request-body = <CONTENT_LENGTH>OCTET
  736. extension-data = *OCTET
  737. A request-body is supplied with the request if the CONTENT_LENGTH is
  738. not NULL. The server MUST make at least that many bytes available
  739. for the script to read. The server MAY signal an end-of-file
  740. condition after CONTENT_LENGTH bytes have been read or it MAY supply
  741. extension data. Therefore, the script MUST NOT attempt to read more
  742. than CONTENT_LENGTH bytes, even if more data is available. However,
  743. it is not obliged to read any of the data.
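A script following these rules might read the request-body as in the sketch below. The function name read_request_body and its parameters are hypothetical. The sketch assumes a system where the body arrives on standard input, and it stops after CONTENT_LENGTH bytes even if more data is available.

```python
import os
import sys


def read_request_body(environ=os.environ, stream=None):
    """Read at most CONTENT_LENGTH bytes of request-body from the stream."""
    if stream is None:
        stream = sys.stdin.buffer  # binary view of standard input
    raw = environ.get("CONTENT_LENGTH", "")
    if raw == "":
        return b""  # CONTENT_LENGTH is NULL: no request-body was supplied
    remaining = int(raw)
    if remaining < 0:
        raise ValueError("CONTENT_LENGTH must not be negative")
    chunks = []
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            break  # end-of-file arrived before the promised length
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

The loop guards against short reads, which can occur when the input is a pipe or socket. Any extension data that follows the body is deliberately left unread.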
  744. For non-parsed header (NPH) scripts (section 5), the server SHOULD
  745. attempt to ensure that the data supplied to the script is precisely
  746. as supplied by the client and is unaltered by the server.
  747. As transfer-codings are not supported on the request-body, the server
  748. MUST remove any such codings from the message-body, and recalculate
  749. the CONTENT_LENGTH. If this is not possible (for example, because of
  750. large buffering requirements), the server SHOULD reject the client
  751. request. It MAY also remove content-codings from the message-body.
  752. 4.3. Request Methods
  753. The Request Method, as supplied in the REQUEST_METHOD meta-variable,
  754. identifies the processing method to be applied by the script in
  755. producing a response. The script author can choose to implement the
  756. methods most appropriate for the particular application. If the
  757. script receives a request with a method it does not support, it SHOULD
  758. reject it with an error (see section 6.3.3).
  759. 4.3.1. GET
  760. The GET method indicates that the script should produce a document
  761. based on the meta-variable values. By convention, the GET method is
  762. 'safe' and 'idempotent' and SHOULD NOT have the significance of
  763. taking an action other than producing a document.
  764. The meaning of the GET method may be modified and refined by
  765. protocol-specific meta-variables.
  768. 4.3.2. POST
  769. The POST method is used to request that the script perform processing and
  770. produce a document based on the data in the request message-body, in
  771. addition to meta-variable values. A common use is form submission in
  772. HTML [18], intended to initiate processing by the script that has a
  773. permanent effect, such as a change in a database.
  774. The script MUST check the value of the CONTENT_LENGTH variable before
  775. reading the attached message-body, and SHOULD check the CONTENT_TYPE
  776. value before processing it.
  777. 4.3.3. HEAD
  778. The HEAD method requests the script to do sufficient processing to
  779. return the response header fields, without providing a response
  780. message-body. The script MUST NOT provide a response message-body
  781. for a HEAD request. If it does, then the server MUST discard the
  782. message-body when reading the response from the script.
  783. 4.3.4. Protocol-Specific Methods
  784. The script MAY implement any protocol-specific method, such as
  785. HTTP/1.1 PUT and DELETE; it SHOULD check the value of SERVER_PROTOCOL
  786. when doing so.
  787. The server MAY decide that some methods are not appropriate or
  788. permitted for a script, and may handle the methods itself or return
  789. an error to the client.
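The method handling described in this section can be sketched as a small dispatcher. Everything here is illustrative: the function name respond, the plain-text document and the decision to support only GET and HEAD are assumptions. The error case uses the CGI Status header field with a 405 status code, which is one reasonable way for a script to reject an unsupported method.

```python
def respond(environ, document="hello\n"):
    """Return the complete CGI response text for one request."""
    # The method is case-sensitive, so "get" is not treated as "GET".
    method = environ.get("REQUEST_METHOD", "")
    if method not in ("GET", "HEAD"):
        return (
            "Status: 405 Method Not Allowed\n"
            "Allow: GET, HEAD\n"
            "Content-Type: text/plain\n"
            "\n"
            "method not supported\n"
        )
    header = "Content-Type: text/plain\n\n"
    if method == "HEAD":
        return header  # header fields only, no response message-body
    return header + document
```

A script would typically write the returned text to standard output, for example with sys.stdout.write(respond(os.environ)).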
  790. 4.4. The Script Command Line
  791. Some systems support a method for supplying an array of strings to
  792. the CGI script. This is only used in the case of an 'indexed' HTTP
  793. query, which is identified by a 'GET' or 'HEAD' request with a URI
  794. query string that does not contain any unencoded "=" characters. For
  795. such a request, the server SHOULD treat the query-string as a
  796. search-string and parse it into words, using the rules
  797. search-string = search-word *( "+" search-word )
  798. search-word = 1*schar
  799. schar = unreserved | escaped | xreserved
  800. xreserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "," |
  801. "$"
  802. After parsing, each search-word is URL-decoded, optionally encoded in
  803. a system-defined manner and then added to the command line argument
  804. list.
  807. If the server cannot create any part of the argument list, then the
  808. server MUST NOT generate any command line information. For example,
  809. the number of arguments may be greater than operating system or
  810. server limits, or one of the words may not be representable as an
  811. argument.
  812. The script SHOULD check to see if the QUERY_STRING value contains an
  813. unencoded "=" character, and SHOULD NOT use the command line
  814. arguments if it does.
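The treatment of an 'indexed' query can be sketched as follows. The function name search_words is hypothetical, and the sketch simplifies the grammar: it checks for an unencoded "=" and for empty words, but does not verify that every character is a valid schar. It returns None when no command line should be generated.

```python
from urllib.parse import unquote


def search_words(query_string, method):
    """Return the decoded search-words of an indexed query, or None."""
    if method not in ("GET", "HEAD"):
        return None  # only GET and HEAD requests can be indexed queries
    if "=" in query_string:
        return None  # an unencoded "=" means this is not an indexed query
    words = query_string.split("+")
    if any(word == "" for word in words):
        return None  # each search-word needs at least one character
    # URL-decode each word; an encoded "=" (%3D) becomes a literal "=".
    return [unquote(word) for word in words]
```

A server could append the returned list to the argument vector of the script, after any system-defined encoding the platform requires.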
  815. 5. NPH Scripts
  816. 5.1. Identification
  817. The server MAY support NPH (Non-Parsed Header) scripts; these are
  818. scripts to which the server passes all responsibility for response
  819. processing.
  820. This specification provides no mechanism for an NPH script to be
  821. identified on the basis of its output data alone. By convention,
  822. therefore, any particular script can only ever provide output of one
  823. type (NPH or CGI) and hence the script itself is described as an 'NPH
  824. script'. A server with NPH support MUST provide an implementation-
  825. defined mechanism for identifying NPH scripts, perhaps based on the
  826. name or location of the script.
  827. 5.2. NPH Response
  828. There MUST be a system-defined method for the script to send data
  829. back to the server or client; a script MUST always return some data.
  830. Unless defined otherwise, this will be the same as for conventional
  831. CGI scripts.
  832. Currently, NPH scripts are only defined for HTTP client requests. An
  833. (HTTP) NPH script MUST return a complete HTTP response message,
  834. currently described in section 6 of