http://github.com/skarab/ewgi
Abstract
========

This document specifies a proposed standard interface between web
servers and Erlang web applications or frameworks to promote web
application portability across a variety of web server
implementations.

This EEP is originally based on the Python Web Server Gateway
Interface v1.0 (`PEP 333`_).

Rationale
=========

At the time of writing, there is no standard way for Erlang
applications to interact with a web server or HTTP toolkit. Many
other language communities (e.g. Python and Ruby) have dedicated
significant research towards developing robust standards for web
applications. For developers interested in using Erlang to build web
applications, such standards are important because they encourage the
reuse of code dealing with common HTTP problems such as cookies,
sessions, and URL routing.

Specification Overview
======================

The EWGI interface has two sides: the "server" or "gateway" side, and
the "application" or "framework" side. The server side invokes a
function or module (the "application") that is provided by the
application side. The specifics of how that function or module is
provided are up to the server or gateway. It is assumed that some
servers or gateways will require an application's deployer to write
some code to create an instance of the server or gateway, and supply
it with the application. Other servers and gateways may use
configuration files or other mechanisms to specify where an
application should be obtained.

In addition to "pure" servers/gateways and applications/frameworks, it
is also possible to create "middleware" components that implement both
sides of this specification. Such components act as an application to
their containing server, and as a server to a contained application.
They can be used to provide extended APIs, content transformations,
navigation, and other useful functions.

Hot code reloading
==================

It is important to note that one of the core features of the Erlang
runtime system, hot code reloading, may be affected by the use of
first-class functions. This specification does not deal directly with
the problems associated with hot code reloading and maintains that it
is the responsibility of the server and application developers to
implement the desired release behaviour.

The Application/Framework Side
==============================

The application is simply a function that accepts a single 3-tuple
argument. Applications MUST be able to be invoked more than once, as
virtually all servers/gateways will make such repeated requests. The
function should return a similarly-structured 3-tuple. The 3-tuple
may be defined by a record for convenience, but this is not required.
The first element of the context tuple MUST be the atom
``'ewgi_context'``.

`Note: although we refer to it as an "application", this should not be
construed to mean that application developers will use EWGI as a web
programming API! It is assumed that application developers will
continue to use high-level framework services to develop their
applications. EWGI is a tool for framework and server developers, and
is not necessarily intended to directly support application developers
as "yet another web framework."`

Here is an example of an application::

    simple_app({ewgi_context, Request, _Response}) ->
        StatusCode = 200,
        ReasonPhrase = "OK",
        Status = {StatusCode, ReasonPhrase},
        ResponseHeaders = [{"Content-type", "text/plain"}],
        Body = [<<"Hello world!">>],
        Response = {ewgi_response, Status,
                    ResponseHeaders, Body, undefined},
        {ewgi_context, Request, Response}.

As stated above, a record may be used for convenience::

    -record(ewgi_context, {
        request,
        response
    }).

The Server/Gateway Side
=======================

The server or gateway invokes the application callable once for each
request it receives from an HTTP client that is directed at the
application.

`An example server using the MochiWeb HTTP toolkit is provided with
the EWGI reference implementation.`

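
As a rough sketch of that invocation (not taken from the reference
implementation), a gateway might build a context, call the application
once, and unpack the response it gets back. The module name
``ewgi_sketch_server``, the function ``handle/2``, and the choice of an
initial response with an ``undefined`` body are assumptions made for
illustration; the request is treated as an opaque term::

    -module(ewgi_sketch_server).
    -export([handle/2, hello_app/1]).

    %% Invoke the application once for a single request and unpack the
    %% response it returns. Request is whatever request tuple the
    %% server has built; it is passed through untouched here.
    handle(App, Request) when is_function(App, 1) ->
        %% Assumed starting point: status 200 with no headers or body.
        Initial = {ewgi_response, {200, "OK"}, [], undefined, undefined},
        {ewgi_context, _Request1, Response} =
            App({ewgi_context, Request, Initial}),
        {ewgi_response, {Code, Reason}, Headers, Body, _Err} = Response,
        {Code, Reason, Headers, Body}.

    %% A trivial application used to exercise handle/2.
    hello_app({ewgi_context, Request, _Response}) ->
        Response = {ewgi_response, {200, "OK"},
                    [{"content-type", "text/plain"}],
                    [<<"Hello world!">>], undefined},
        {ewgi_context, Request, Response}.

A real gateway would go on to write the status line, headers, and body
to the client socket instead of returning them.
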
Middleware: Components that "Play Both Sides"
=============================================

Note that a single object may play the role of a server with respect
to some application(s), while also acting as an application with
respect to some server(s). Such "middleware" components can perform
such functions as:

* Routing a request to different application objects based on the
  target URL, after rewriting the ``Request`` accordingly.
* Allowing multiple applications or frameworks to run side-by-side in
  the same process.
* Load balancing and remote processing by forwarding requests and
  responses over a network.
* Content postprocessing, such as applying XSL stylesheets.

The presence of middleware in general is transparent to both the
"server/gateway" and the "application/framework" sides of the
interface, and should require no special support. A user who desires
to incorporate middleware into an application simply provides the
middleware component to the server as if it were an application and
configures the middleware component to invoke the application as if
the middleware component were a server. Of course, the "application"
that the middleware wraps may in fact be another middleware component
wrapping another application and so on, creating what is referred to
as a "middleware stack" or "pipeline."

For the most part, middleware must conform to the restrictions and
requirements of both the server and application sides of EWGI. In
some cases, however, requirements for middleware are more stringent
than for a "pure" server or application, and these points will be
noted in the specification.

Following is an example which naively converts the output of an
application to uppercase::

    %% Wrap application A so that its response body is upper-cased.
    get_upcase_mw(A) when is_function(A, 1) ->
        F = fun(Ctx) ->
                    {ewgi_context, Req, Rsp} = A(Ctx),
                    %% Element 4 of the response tuple is the message body.
                    Body = case element(4, Rsp) of
                               Body0 when is_function(Body0, 0) ->
                                   upcase_chunks(Body0);
                               Body0 when is_list(Body0) ->
                                   upcase_iolist(Body0);
                               Body0 when is_binary(Body0) ->
                                   upcase_binary(Body0)
                           end,
                    {ewgi_context, Req, setelement(4, Rsp, Body)}
            end,
        F.

    %% Lazily wrap a stream
    upcase_chunks(F0) ->
        F = fun() ->
                    case F0() of
                        {H, T} ->
                            {upcase_iolist(H), upcase_chunks(T)};
                        {} ->
                            {}
                    end
            end,
        F.

    upcase_binary(Bin) when is_binary(Bin) ->
        list_to_binary(string:to_upper(binary_to_list(Bin))).

    upcase_iolist(L) ->
        lists:map(fun(A) when is_integer(A) ->
                          string:to_upper(A);
                     (A) when is_binary(A) ->
                          upcase_binary(A);
                     (A) when is_list(A) ->
                          upcase_iolist(A)
                  end, L).

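
The "middleware stack" described in this section can be sketched as a
fold over a list of middleware constructors, each of which takes an
application and returns a wrapped application (the same shape as
``get_upcase_mw/1``). The module ``ewgi_sketch_pipeline`` and both of
its functions are hypothetical and only illustrate the idea::

    -module(ewgi_sketch_pipeline).
    -export([compose/2, add_header_mw/2]).

    %% Wrap App in each middleware constructor in turn. The first
    %% constructor in the list ends up outermost, so it sees the
    %% request first and the response last.
    compose(Middlewares, App)
      when is_list(Middlewares), is_function(App, 1) ->
        lists:foldr(fun(Mw, Inner) -> Mw(Inner) end, App, Middlewares).

    %% Middleware constructor that appends one response header.
    %% Element 3 of the response tuple is the header list.
    add_header_mw(Name, Value) ->
        fun(App) ->
                fun(Ctx) ->
                        {ewgi_context, Req, Rsp} = App(Ctx),
                        Headers = element(3, Rsp),
                        {ewgi_context, Req,
                         setelement(3, Rsp, Headers ++ [{Name, Value}])}
                end
        end.

Because each wrapper acts on the response after its inner application
returns, the innermost middleware's header is appended first.
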
Specification Details
=====================

The application callable must accept one 3-tuple argument. For the
sake of illustration, we have named the second and third elements of
this tuple ``request`` and ``response``, and the specification shall
refer to them by those names. A server or gateway must invoke the
callable by passing the tuple argument (e.g. by calling
``Result = Application({ewgi_context, Request, Response})`` as shown
above).

Request
-------

The ``Request`` parameter is a tuple containing various CGI-influenced
environment variables. This term must be a 21-tuple, and the
application is allowed to modify the ``Request`` in any way it desires
(except for HTTP header restrictions outlined later). Element 5 of
the tuple must itself be a 6-tuple including certain EWGI-required
terms (described in a later section), and may also include
server-specific extension variables by making use of the final element
(a bag or multiset). Element 7 of the tuple must itself be an 8-tuple
including certain commonly-encountered HTTP headers and a dictionary
for additional variables. The following records may be used for
convenience::

    -record(ewgi_spec, {
        read_input,
        write_error,
        url_scheme,
        version,
        data % set
    }).

    -record(ewgi_http_headers, {
        http_accept,
        http_cookie,
        http_host,
        http_if_modified_since,
        http_user_agent,
        http_x_http_method_override,
        other % multiset
    }).

    -record(ewgi_request, {
        auth_type,
        content_length,
        content_type,
        ewgi=#ewgi_spec{},
        gateway_interface,
        http_headers=#ewgi_http_headers{},
        path_info,
        path_translated,
        query_string,
        remote_addr,
        remote_host,
        remote_ident,
        remote_user,
        remote_user_data,
        request_method,
        script_name,
        server_name,
        server_port,
        server_protocol,
        server_software
    }).

EWGI request variables
''''''''''''''''''''''

The ``Request`` tuple is required to contain these CGI environment
variables, as originally defined by the `Common Gateway Interface
specification`_.

``auth_type``: (Element 2) The type of authentication provided or
``'undefined'`` if absent.

``content_length``: (Element 3) The contents of any ``Content-Length``
fields in the HTTP request. May be empty or ``'undefined'``.

``content_type``: (Element 4) The contents of any ``Content-Type``
fields in the HTTP request. May be empty or ``'undefined'``.

``ewgi``: (Element 5) See the section below.

``gateway_interface``: (Element 6) The gateway interface and revision
used. Should be ``EWGI/1.1`` for this version of the specification.

``http_headers``: (Element 7) See the section below.

``path_info``: (Element 8) The remainder of the request URL's "path",
designating the virtual "location" of the request's target within the
application. This may be an empty string, if the request URL targets
the application root and does not have a trailing slash.

``path_translated``: (Element 9) The path as may be translated by the
server to a physical location.

``query_string``: (Element 10) The portion of the request URL that
follows the ``"?"``, if any. May be empty or ``'undefined'``.

``remote_addr``: (Element 11) The remote IP address of the client
issuing the request.

``remote_host``: (Element 12) The remote hostname of the client
issuing the request. May be empty or ``'undefined'``.

``remote_ident``: (Element 13) If the server supports `RFC 931`_
identification, this variable may be set to the remote user
name. Should only be used for logging purposes.

``remote_user``: (Element 14) If authentication is supported by the
server (or middleware), this should be set to the authenticated
username.

``remote_user_data``: (Element 15) Any additional data provided by the
authentication mechanism.

``request_method``: (Element 16) An atom or string describing the HTTP
request method. Common methods MUST be atoms and include
``'OPTIONS'``, ``'GET'``, ``'HEAD'``, ``'POST'``, ``'PUT'``,
``'DELETE'``, ``'TRACE'``, and ``'CONNECT'``. A value is always
required and it MUST NOT be an empty string.

``script_name``: (Element 17) The initial portion of the request URL's
"path" that corresponds to the application object, so that the
application knows its virtual "location". This may be an empty
string, if the application corresponds to the "root" of the server.

``server_name``, ``server_port``: (Elements 18 and 19) When combined
with ``script_name`` and ``path_info``, these variables can be used to
complete the URL. Note, however, that ``http_host``, if present,
should be used in preference to ``server_name`` for reconstructing the
request URL. ``server_name`` and ``server_port`` can never be empty
strings, and so are always required.

``server_protocol``: (Element 20) The version of the protocol the
client used to send the request. Typically this will be something like
``"HTTP/1.0"`` or ``"HTTP/1.1"`` and may be used by the application to
determine how to treat any HTTP request headers. (This variable
should probably be called ``request_protocol``, since it denotes the
protocol used in the request, and is not necessarily the protocol that
will be used in the server's response. However, for compatibility
with CGI we have to keep the existing name).

``server_software``: (Element 21) The name and revision of the server
software answering the request.

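
To illustrate how ``script_name``, ``path_info``, ``server_name``,
``server_port``, and ``http_host`` fit together, here is a minimal
sketch of URL reconstruction that uses only the element positions
listed above. The module ``ewgi_sketch_url`` is hypothetical, the
sketch performs no URL quoting, and it assumes that ``http_host`` is
either absent or a non-empty list of ``{FieldName, FieldValue}``
tuples::

    -module(ewgi_sketch_url).
    -export([request_url/1]).

    %% Rebuild the request URL from a 21-element request tuple.
    request_url(Request) when tuple_size(Request) =:= 21 ->
        Scheme = element(4, element(5, Request)),      % url_scheme
        Host = case element(4, element(7, Request)) of % http_host
                   [{_Name, Value} | _] -> Value;
                   _ -> host_from_server(Scheme, Request)
               end,
        %% script_name followed by path_info
        Path = str(element(17, Request)) ++ str(element(8, Request)),
        Query = case str(element(10, Request)) of      % query_string
                    "" -> "";
                    Q -> "?" ++ Q
                end,
        Scheme ++ "://" ++ Host ++ Path ++ Query.

    %% Fall back to server_name, adding server_port unless it is the
    %% default port for the scheme.
    host_from_server(Scheme, Request) ->
        Name = element(18, Request),
        Port = element(19, Request),
        case {Scheme, Port} of
            {"http", "80"} -> Name;
            {"https", "443"} -> Name;
            _ -> Name ++ ":" ++ Port
        end.

    %% Treat 'undefined' as the empty string.
    str(undefined) -> "";
    str(S) when is_list(S) -> S.
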
EWGI-specification parameters
'''''''''''''''''''''''''''''

``read_input``: (Element 2) A 2-arity function which takes a 1-arity
``Callback`` function and a non-zero integer ``Size``. The
``Callback`` function will be called with chunks of data in the form
``{data, Bin}`` where ``Bin`` is a binary. At the end of reading, the
``Callback`` function will be called with ``eof`` as its argument.
The supplied function should return another function of the same kind.

``write_error``: (Element 3) A 1-arity function which takes an
``iolist`` and writes to the server-defined error log mechanism
(usually ``error_logger``).

``url_scheme``: (Element 4) A string representing the "scheme" portion
of the URL at which the application is being invoked. Normally, this
will have the value ``"http"`` or ``"https"`` where appropriate.

``version``: (Element 5) The tuple ``{1,1}``, representing EWGI major
version 1, minor version 1.

``data``: (Element 6) A dictionary (implemented by the OTP module
``gb_trees``) which can be used for server or application-specific
data to be included with the request. A common use for this
dictionary is in configuring higher-level web frameworks or providing
cached data.

Additionally, a server or gateway should attempt to provide as many
other CGI variables as are applicable. In addition, if SSL is in use,
the server or gateway should also provide as many of the `Apache SSL
environment variables`_ as are applicable, such as ``https`` and
``ssl_protocol``. Note, however, that an application that uses any
CGI variables other than the ones listed above is necessarily
non-portable to web servers that do not support the relevant
extensions. An EWGI-compliant server or gateway should document what
variables it provides, along with their definitions as appropriate.
Applications should check for the presence of any variables they
require, and have a fallback plan in the event such a variable is
``'undefined'``.

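
The callback protocol of ``read_input`` can be sketched as follows.
The specification does not say what ``Size`` measures or what
``read_input`` returns, so this example makes two assumptions:
``Size`` is the number of bytes to read, and the value the callback
returns for ``eof`` becomes the result of the call. The module
``ewgi_sketch_input`` and the stand-in ``fake_read_input/2`` are
hypothetical and exist only to make the example self-contained::

    -module(ewgi_sketch_input).
    -export([collect_body/2, fake_read_input/2]).

    %% Read the request body into one binary. ReadInput is the
    %% read_input function taken from the ewgi tuple.
    collect_body(ReadInput, Size)
      when is_function(ReadInput, 2), Size > 0 ->
        ReadInput(collector([]), Size).

    %% Callback that accumulates chunks. For {data, Bin} it returns a
    %% new callback of the same kind; for eof it returns the collected
    %% body (assumed to become the return value of read_input).
    collector(Acc) ->
        fun({data, Bin}) when is_binary(Bin) -> collector([Bin | Acc]);
           (eof) -> iolist_to_binary(lists:reverse(Acc))
        end.

    %% Hypothetical stand-in for a server's read_input: feeds up to
    %% Size bytes of Body to the callback, ChunkSize bytes at a time.
    fake_read_input(Body, ChunkSize)
      when is_binary(Body), ChunkSize > 0 ->
        fun(Callback, Size) ->
                Len = min(Size, byte_size(Body)),
                feed(binary:part(Body, 0, Len), Callback, ChunkSize)
        end.

    feed(<<>>, Callback, _ChunkSize) ->
        Callback(eof);
    feed(Body, Callback, ChunkSize) when byte_size(Body) =< ChunkSize ->
        feed(<<>>, Callback({data, Body}), ChunkSize);
    feed(Body, Callback, ChunkSize) ->
        <<Chunk:ChunkSize/binary, Rest/binary>> = Body,
        feed(Rest, Callback({data, Chunk}), ChunkSize).
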
HTTP headers
''''''''''''

EWGI provides a tuple with commonly-used HTTP request headers to
optimise retrieval. Each of the values is a list of 2-tuples of the
form ``{FieldName, FieldValue}``. Servers MUST preserve the order
of headers as they are given in the request. Servers SHOULD preserve
the case of the ``FieldName`` values.

``http_accept``: (Element 2) The ``Accept:`` header.

``http_cookie``: (Element 3) The ``Cookie:`` header.

``http_host``: (Element 4) The ``Host:`` header.

``http_if_modified_since``: (Element 5) The ``If-Modified-Since:``
header.

``http_user_agent``: (Element 6) The ``User-Agent:`` header.

``http_x_http_method_override``: (Element 7) The
``X-Http-Method-Override:`` header. While not part of the HTTP/1.1
specification, this header can be used to overcome a common browser
limitation which prevents browsers from sending a ``PUT`` or
``DELETE`` request to a URI.

``other``: (Element 8) A multiset (implemented by the OTP module
``gb_trees``) which contains all other HTTP request headers. The keys
of the dictionary should be lower-case representations of the header
names and the values should be a list of tuples of the form
``{HeaderName, HeaderValue}``. Servers SHOULD attempt to preserve
the original case of header names in the tuple list.

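
As an illustration of the ``other`` structure, the following sketch
stores and looks up headers in a ``gb_trees`` tree keyed by the
lower-cased header name, keeping the original case inside each stored
tuple. The module ``ewgi_sketch_headers`` and its two functions are
hypothetical helpers, not part of the specification::

    -module(ewgi_sketch_headers).
    -export([other_header/2, add_other_header/3]).

    %% Return every value recorded for a header, in request order, or
    %% [] when the header is absent. The lookup is case-insensitive.
    other_header(Name, Other) when is_list(Name) ->
        case gb_trees:lookup(string:to_lower(Name), Other) of
            {value, Pairs} -> [V || {_OrigName, V} <- Pairs];
            none -> []
        end.

    %% Record one header under its lower-cased name, appending to any
    %% values already stored so that request order is preserved.
    add_other_header(Name, Value, Other) ->
        Key = string:to_lower(Name),
        Pairs = case gb_trees:lookup(Key, Other) of
                    {value, Existing} -> Existing;
                    none -> []
                end,
        gb_trees:enter(Key, Pairs ++ [{Name, Value}], Other).
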
Notes
'''''

Missing variables (where allowed, such as ``remote_user`` when no
authentication has occurred) should be defined by the atom
``'undefined'``. Also note that CGI-defined variables must be strings
if they are defined. It is a violation of this specification for a
CGI variable's value to be of any type other than ``string`` or the
``'undefined'`` atom.

Response
--------

The ``Response`` parameter is a 5-tuple of the form ``{ewgi_response,
{StatusCode, ReasonPhrase}, HeaderList, MessageBody, Error}``. A
convenient record definition is::

    -record(ewgi_response, {
        status={200, "OK"},
        headers=[],
        message_body,
        err
    }).

Status Code
'''''''''''

The ``StatusCode`` parameter should be a 3-digit integer corresponding
to the HTTP status code as defined in the HTTP specification (See `RFC
2616, Section 6.1.1`_ for more information). For example, ``200``
corresponds to a successful request.

Reason Phrase
'''''''''''''

The ``ReasonPhrase`` parameter is intended to be a human readable
representation of ``StatusCode`` and should be a string or binary.

Headers
'''''''

``Headers`` is a list of ``{HeaderName, HeaderValue}`` tuples
describing the HTTP response headers.

Each ``HeaderName`` must be a valid HTTP header field-name (as defined
by `RFC 2616, Section 4.2`_), without a trailing colon or other
punctuation. Note: ``HeaderName`` is case insensitive, but should be
lower-case for optimising comparisons. (A reminder for server/gateway
authors: be sure to take that into consideration when examining
application-supplied headers).

Each ``HeaderValue`` must not include any control characters,
including CR or LF, in any position.

In general, the server or gateway is responsible for ensuring that
correct headers are sent to the client: if the application omits a
header required by HTTP (or other relevant specifications that are in
effect), the server or gateway must add it. For example, the HTTP
``Date:`` and ``Server:`` headers would normally be supplied by the
server or gateway.

Applications and middleware are forbidden from using HTTP/1.1
"hop-by-hop" features or headers, any equivalent features in HTTP/1.0,
or any headers that would affect the persistence of the client's
connection to the web server. These features are the exclusive
province of the actual web server, and a server or gateway should
consider it a fatal error for an application to attempt sending them,
and raise an error if they are supplied.

For example::

    [{"content-type", "application/json"}, {"etag", "8a920bc001df"}]

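
A server or gateway could enforce two of the rules above (no
hop-by-hop headers, no control characters in values) with a check
along these lines. This is a minimal sketch: the module
``ewgi_sketch_validate`` and the shape of its error terms are
assumptions, header names and values are assumed to be flat strings,
and the hop-by-hop list is the one given in RFC 2616, Section 13.5.1::

    -module(ewgi_sketch_validate).
    -export([check_headers/1]).

    %% Hop-by-hop headers listed in RFC 2616, Section 13.5.1.
    -define(HOP_BY_HOP, ["connection", "keep-alive",
                         "proxy-authenticate", "proxy-authorization",
                         "te", "trailers", "transfer-encoding",
                         "upgrade"]).

    %% Return ok, or {error, Reason} for the first offending header.
    check_headers([]) ->
        ok;
    check_headers([{Name, Value} | Rest]) ->
        %% Header names are case-insensitive, so compare in lower case.
        case lists:member(string:to_lower(Name), ?HOP_BY_HOP) of
            true ->
                {error, {hop_by_hop, Name}};
            false ->
                case lists:any(fun is_control/1, Value) of
                    true -> {error, {control_character, Name}};
                    false -> check_headers(Rest)
                end
        end.

    %% Control characters are 0..31 and 127 (DEL).
    is_control(C) -> C < 32 orelse C =:= 127.
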
Message Body
''''''''''''

The ``MessageBody`` parameter is either an ``iolist`` or a "stream,"
which is a lazy, recursive list-like structure. A stream is a
zero-arity function which returns either the empty tuple ``{}`` or a
2-tuple of the form ``{Head, Tail}`` where ``Head`` is an ``iolist``
and ``Tail`` is another stream. Servers may choose to transmit
message bodies represented by a stream using the chunked transfer
encoding. However, the server or gateway must transmit each
``iolist`` to the client in an unbuffered fashion, completing the
transmission of each ``iolist`` before requesting another one. (In
other words, applications should perform their own buffering).

The server or gateway should not alter the ``iolist`` returned by the
application in any way. The application is responsible for ensuring
that the ``iolist`` to be written is in a format suitable for the
client. However, the server or gateway may apply HTTP transfer
encodings or perform other transformations for the purpose of
implementing HTTP features such as byte-range transmission.

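
The stream structure can be illustrated with a small sketch that
builds a stream from a list and walks a message body one element at a
time. The module ``ewgi_sketch_stream`` and the ``Send`` callback
(standing in for whatever writes to the client) are hypothetical::

    -module(ewgi_sketch_stream).
    -export([from_list/1, send_body/2]).

    %% Build a stream from a list of iolists.
    from_list([]) ->
        fun() -> {} end;
    from_list([Head | Tail]) ->
        fun() -> {Head, from_list(Tail)} end.

    %% Hand each part of a message body to Send, in order. A stream is
    %% forced one element at a time, so each iolist is sent before the
    %% next one is requested. A plain iolist is sent in one call.
    send_body(Stream, Send) when is_function(Stream, 0) ->
        case Stream() of
            {} ->
                ok;
            {Head, Tail} ->
                Send(Head),
                send_body(Tail, Send)
        end;
    send_body(IoList, Send) ->
        Send(IoList),
        ok.
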
EWGI Reference Implementation
=============================

The EWGI reference implementation includes an API module ``ewgi_api``
which defines helper functions to access and modify the EWGI context,
parse query strings, etc. It also includes a module
``ewgi_application`` which contains convenience functions for dealing
with application functions as well as sample middleware components.

An include file (``include/ewgi.hrl``) is also provided, which
contains macros for standard HTTP status values and the convenience
record definitions. These may be used to help development of servers
and applications, but should not be required.

Copyright
=========

This document has been placed in the public domain.

.. _PEP 333:
   http://www.python.org/dev/peps/pep-0333/

.. _Common Gateway Interface specification:
   http://cgi-spec.golux.com/draft-coar-cgi-v11-03.txt

.. _Apache SSL environment variables:
   http://www.modssl.org/docs/2.8/ssl_reference.html#ToC25

.. _RFC 2616, Section 6.1.1:
   http://www.w3.org/Protocols/rfc2616/rfc2616-sec6.html#sec6.1.1

.. _RFC 931:
   http://www.faqs.org/rfcs/rfc931.html

.. _RFC 2616, Section 4.2:
   http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2