Abstract
========

This document specifies a proposed standard interface between web
servers and Erlang web applications or frameworks to promote web
application portability across a variety of web server
implementations.

This EEP is originally based on the Python Web Server Gateway
Interface v1.0 (`PEP 333`_).

Rationale
=========

At the time of writing, there is no standard way for Erlang
applications to interact with a web server or HTTP toolkit.  Other
language communities (e.g. Python and Ruby) have invested significant
effort in developing robust standards for web applications.  For
developers interested in using Erlang to build web applications, such
standards matter because they encourage the reuse of code dealing
with common HTTP problems such as cookies, sessions, and URL routing.

Specification Overview
======================

The EWGI interface has two sides: the "server" or "gateway" side, and
the "application" or "framework" side.  The server side invokes a
function or module (the "application") that is provided by the
application side.  The specifics of how that function or module is
provided are up to the server or gateway.  It is assumed that some
servers or gateways will require an application's deployer to write
some code to create an instance of the server or gateway, and supply
it with the application.  Other servers and gateways may use
configuration files or other mechanisms to specify where an
application should be obtained.

In addition to "pure" servers/gateways and applications/frameworks, it
is also possible to create "middleware" components that implement both
sides of this specification.  Such components act as an application to
their containing server, and as a server to a contained application.
They can be used to provide extended APIs, content transformations,
navigation, and other useful functions.

Hot code reloading
==================

It is important to note that one of the core features of the Erlang
runtime system, hot code reloading, may be affected by the use of
first-class functions.  This specification does not deal directly with
the problems associated with hot code reloading and maintains that it
is the responsibility of the server and application developers to
implement the desired release behaviour.

The Application/Framework Side
==============================

The application is simply a function that accepts a single 3-tuple
argument.  Applications MUST be able to be invoked more than once, as
virtually all servers/gateways will make such repeated requests.  The
function should return a similarly structured 3-tuple.  The 3-tuple
may be defined by a record for convenience, but this is not required.
The first element of the context tuple MUST be the atom
``'ewgi_context'``.

`Note: although we refer to it as an "application", this should not be
construed to mean that application developers will use EWGI as a web
programming API!  It is assumed that application developers will
continue to use high-level framework services to develop their
applications.  EWGI is a tool for framework and server developers, and
is not necessarily intended to directly support application developers
as "yet another web framework."`

Here is an example of an application::

    simple_app({ewgi_context, Request, _Response}) ->
        StatusCode = 200,
        ReasonPhrase = "OK",
        Status = {StatusCode, ReasonPhrase},
        ResponseHeaders = [{"Content-type", "text/plain"}],
        Body = [<<"Hello world!">>],
        Response = {ewgi_response, Status,
                    ResponseHeaders, Body, undefined},
        {ewgi_context, Request, Response}.

As stated above, a record may be used for convenience::

    -record(ewgi_context, {
              request,
              response
             }).

The Server/Gateway Side
=======================

The server or gateway invokes the application callable once for each
request it receives from an HTTP client that is directed at the
application.

`An example server using the MochiWeb HTTP toolkit is provided with
the EWGI reference implementation.`
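
The call itself can be sketched as follows.  This is only an
illustration, not the reference implementation: the module name
``ewgi_server_sketch`` and the function ``call_application/2`` are
hypothetical, and ``Request`` is treated as an opaque term that a real
server or gateway would have built from the incoming HTTP request::

    -module(ewgi_server_sketch).
    -export([call_application/2, hello_app/1]).

    %% Invoke an EWGI application once and unpack its response.
    call_application(App, Request) when is_function(App, 1) ->
        %% Start from an empty response; the application fills it in.
        Response0 = {ewgi_response, {200, "OK"}, [], [], undefined},
        {ewgi_context, _Request1, Response} =
            App({ewgi_context, Request, Response0}),
        {ewgi_response, Status, Headers, Body, _Err} = Response,
        {Status, Headers, Body}.

    %% A trivial application used to exercise call_application/2.
    hello_app({ewgi_context, Request, _Response}) ->
        Response = {ewgi_response, {200, "OK"},
                    [{"Content-type", "text/plain"}],
                    [<<"Hello world!">>], undefined},
        {ewgi_context, Request, Response}.

A server would then serialise the returned status, headers, and body
onto the client connection, once per request.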

Middleware: Components that "Play Both Sides"
=============================================

Note that a single object may play the role of a server with respect
to some application(s), while also acting as an application with
respect to some server(s).  Such "middleware" components can perform
such functions as:

* Routing a request to different application objects based on the
  target URL, after rewriting the ``Request`` accordingly

* Allowing multiple applications or frameworks to run side-by-side in
  the same process

* Load balancing and remote processing by forwarding requests and
  responses over a network

* Content postprocessing, such as applying XSL stylesheets

The presence of middleware in general is transparent to both the
"server/gateway" and the "application/framework" sides of the
interface, and should require no special support.  A user who desires
to incorporate middleware into an application simply provides the
middleware component to the server as if it were an application and
configures the middleware component to invoke the application as if
the middleware component were a server.  Of course, the "application"
that the middleware wraps may in fact be another middleware component
wrapping another application and so on, creating what is referred to
as a "middleware stack" or "pipeline."
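
One possible way to assemble such a stack is sketched here.  It
assumes each middleware is supplied as a constructor function that
takes the wrapped application and returns a new application; the
names ``ewgi_stack_sketch``, ``stack/2``, and ``add_header_mw/2`` are
hypothetical and not part of EWGI::

    -module(ewgi_stack_sketch).
    -export([stack/2, add_header_mw/2, base_app/1]).

    %% Wrap App in each middleware constructor.  The first element of
    %% Middlewares ends up outermost (closest to the server).
    stack(Middlewares, App) when is_function(App, 1) ->
        lists:foldr(fun(Mw, Inner) -> Mw(Inner) end, App, Middlewares).

    %% Hypothetical middleware constructor: appends one response header.
    add_header_mw(Name, Value) ->
        fun(App) ->
                fun(Ctx) ->
                        {ewgi_context, Req, Rsp} = App(Ctx),
                        %% Element 3 of the response tuple is the header list.
                        Headers = element(3, Rsp),
                        {ewgi_context, Req,
                         setelement(3, Rsp, Headers ++ [{Name, Value}])}
                end
        end.

    %% Innermost application with no headers of its own.
    base_app({ewgi_context, Req, _Rsp}) ->
        {ewgi_context, Req,
         {ewgi_response, {200, "OK"}, [], [<<"hi">>], undefined}}.

Because the response travels outwards through the stack, the
innermost middleware appends its header first and the outermost one
appends last.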

For the most part, middleware must conform to the restrictions and
requirements of both the server and application sides of EWGI.  In
some cases, however, requirements for middleware are more stringent
than for a "pure" server or application, and these points will be
noted in the specification.

Following is an example which naively converts the output of an
application to uppercase::

    get_upcase_mw(A) when is_function(A, 1) ->
        F = fun(Ctx) ->
                    {ewgi_context, Req, Rsp} = A(Ctx),
                    %% Element 4 of the response tuple is the message body
                    Body = case element(4, Rsp) of
                               Body0 when is_function(Body0, 0) ->
                                   upcase_chunks(Body0);
                               Body0 when is_list(Body0) ->
                                   upcase_iolist(Body0);
                               Body0 when is_binary(Body0) ->
                                   upcase_binary(Body0)
                           end,
                    {ewgi_context, Req, setelement(4, Rsp, Body)}
            end,
        F.

    %% Lazily wrap a stream
    upcase_chunks(F0) ->
        F = fun() ->
                    case F0() of
                        {H, T} ->
                            {upcase_iolist(H), upcase_chunks(T)};
                        {} ->
                            {}
                    end
            end,
        F.

    upcase_binary(Bin) when is_binary(Bin) ->
        list_to_binary(string:to_upper(binary_to_list(Bin))).

    %% The head of a stream chunk may be a bare binary rather than a list
    upcase_iolist(B) when is_binary(B) ->
        upcase_binary(B);
    upcase_iolist(L) when is_list(L) ->
        lists:map(fun(A) when is_integer(A) ->
                          string:to_upper(A);
                     (A) when is_binary(A) ->
                          upcase_binary(A);
                     (A) when is_list(A) ->
                          upcase_iolist(A)
                  end, L).

Specification Details
=====================

The application callable must accept one 3-tuple argument.  For the
sake of illustration, we have named the second and third elements of
this tuple ``request`` and ``response``, and the specification shall
refer to them by those names.  A server or gateway must invoke the
callable by passing the tuple argument (e.g. by calling ``Result =
Application({ewgi_context, Request, Response})`` as shown above).

Request
-------

The ``Request`` parameter is a tuple containing various CGI-influenced
environment variables.  This term must be a 21-tuple, and the
application is allowed to modify the ``Request`` in any way it desires
(except for HTTP header restrictions outlined later).  Element 5 of
the tuple must itself be a 6-tuple including certain EWGI-required
terms (described in a later section), and may also include
server-specific extension variables by making use of the final element
(the ``data`` dictionary).  Element 7 of the tuple must itself be an
8-tuple including certain commonly-encountered HTTP headers and a
dictionary for additional variables.  The following records may be
used for convenience::

    -record(ewgi_spec, {
              read_input,
              write_error,
              url_scheme,
              version,
              data % dictionary (gb_trees)
             }).

    -record(ewgi_http_headers, {
              http_accept,
              http_cookie,
              http_host,
              http_if_modified_since,
              http_user_agent,
              http_x_http_method_override,
              other % multiset (gb_trees)
             }).

    -record(ewgi_request, {
              auth_type,
              content_length,
              content_type,
              ewgi=#ewgi_spec{},
              gateway_interface,
              http_headers=#ewgi_http_headers{},
              path_info,
              path_translated,
              query_string,
              remote_addr,
              remote_host,
              remote_ident,
              remote_user,
              remote_user_data,
              request_method,
              script_name,
              server_name,
              server_port,
              server_protocol,
              server_software
             }).

EWGI request variables
''''''''''''''''''''''

The ``Request`` tuple is required to contain these CGI environment
variables, as originally defined by the `Common Gateway Interface
specification`_.

``auth_type``: (Element 2) The type of authentication provided or
``'undefined'`` if absent.

``content_length``: (Element 3) The contents of any ``Content-Length``
fields in the HTTP request. May be empty or ``'undefined'``.

``content_type``: (Element 4) The contents of any ``Content-Type``
fields in the HTTP request. May be empty or ``'undefined'``.

``ewgi``: (Element 5) See section below.

``gateway_interface``: (Element 6) The gateway interface and revision
used. Should be ``EWGI/1.1`` for this version of the specification.

``http_headers``: (Element 7) See section below.

``path_info``: (Element 8) The remainder of the request URL's "path",
designating the virtual "location" of the request's target within the
application.  This may be an empty string, if the request URL targets
the application root and does not have a trailing slash.

``path_translated``: (Element 9) The path as may be translated by the
server to a physical location.

``query_string``: (Element 10) The portion of the request URL that
follows the ``"?"``, if any. May be empty or ``'undefined'``.

``remote_addr``: (Element 11) The remote IP address of the client
issuing the request.

``remote_host``: (Element 12) The remote hostname of the client
issuing the request. May be empty or ``'undefined'``.

``remote_ident``: (Element 13) If the server supports `RFC 931`_
identification, this variable may be set to the remote user
name. Should only be used for logging purposes.

``remote_user``: (Element 14) If authentication is supported by the
server (or middleware), this should be set to the authenticated
username.

``remote_user_data``: (Element 15) Any additional data provided by the
authentication mechanism.

``request_method``: (Element 16) An atom or string describing the HTTP
request method.  Common methods MUST be atoms and include
``'OPTIONS'``, ``'GET'``, ``'HEAD'``, ``'POST'``, ``'PUT'``,
``'DELETE'``, ``'TRACE'``, and ``'CONNECT'``.  A value is always
required and it MUST NOT be an empty string.

``script_name``: (Element 17) The initial portion of the request URL's
"path" that corresponds to the application object, so that the
application knows its virtual "location".  This may be an empty
string, if the application corresponds to the "root" of the server.

``server_name``, ``server_port``: (Elements 18 and 19) When combined
with ``script_name`` and ``path_info``, these variables can be used to
complete the URL.  Note, however, that ``http_host``, if present,
should be used in preference to ``server_name`` for reconstructing the
request URL. ``server_name`` and ``server_port`` can never be empty
strings, and so are always required.

``server_protocol``: (Element 20) The version of the protocol the
client used to send the request. Typically this will be something like
``"HTTP/1.0"`` or ``"HTTP/1.1"`` and may be used by the application to
determine how to treat any HTTP request headers.  (This variable
should probably be called ``request_protocol``, since it denotes the
protocol used in the request, and is not necessarily the protocol that
will be used in the server's response.  However, for compatibility
with CGI we have to keep the existing name.)

``server_software``: (Element 21) The name and revision of the server
software answering the request.
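
The URL reconstruction mentioned under ``server_name`` might look
roughly like the sketch below.  It is an illustration only: the module
name ``ewgi_url_sketch`` and the function ``request_url/1`` are
hypothetical, all values are assumed to be strings (flat character
lists), and dropping the default ports 80 and 443 is a convention
borrowed from common practice rather than a requirement stated here::

    -module(ewgi_url_sketch).
    -export([request_url/1]).

    %% Rebuild the request URL from a 21-tuple Request.
    request_url(Request) when tuple_size(Request) =:= 21 ->
        Scheme = element(4, element(5, Request)),      % url_scheme in ewgi
        HostHeader = element(4, element(7, Request)),  % http_host in http_headers
        Query = case element(10, Request) of           % query_string
                    undefined -> "";
                    "" -> "";
                    Q -> "?" ++ Q
                end,
        lists:flatten([Scheme, "://", host(Scheme, HostHeader, Request),
                       element(17, Request),           % script_name
                       element(8, Request),            % path_info
                       Query]).

    %% Prefer the Host header; fall back to server_name and server_port.
    host(_Scheme, [{_FieldName, Host} | _], _Request) ->
        Host;
    host(Scheme, _NoHostHeader, Request) ->
        Name = element(18, Request),
        case {Scheme, element(19, Request)} of
            {"http", "80"} -> Name;
            {"https", "443"} -> Name;
            {_, Port} -> [Name, ":", Port]
        end.

The fallback clause is taken whenever ``http_host`` is ``'undefined'``
or an empty list.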

EWGI-specification parameters
'''''''''''''''''''''''''''''

``read_input``: (Element 2) A 2-arity function which takes a
``Callback`` 1-arity function and a ``Size`` non-zero integer.  The
``Callback`` function will be called with chunks of data in the form
``{data, Bin}`` where ``Bin`` is a binary.  At the end of reading, the
``Callback`` function will be called with ``eof`` as its argument.
The supplied function should return another function of the same kind.
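
A callback of this shape can be sketched as below.  Two things here
are assumptions rather than statements of this specification: that
``read_input`` returns whatever the callback returns for ``eof``, and
that the callback may return a final value instead of a function at
that point.  The module ``ewgi_input_sketch`` and the stand-in reader
``fake_read_input/1`` (which ignores ``Size``) are hypothetical::

    -module(ewgi_input_sketch).
    -export([read_body/2, fake_read_input/1]).

    %% Collect the whole request body as one binary.
    %% ReadInput is element 2 of the ewgi tuple.
    read_body(ReadInput, Size)
      when is_function(ReadInput, 2), is_integer(Size), Size > 0 ->
        iolist_to_binary(ReadInput(collector([]), Size)).

    %% Callback that accumulates chunks, returning a new callback each
    %% time.  On eof it returns the chunks in arrival order (assumption).
    collector(Acc) ->
        fun({data, Bin}) when is_binary(Bin) -> collector([Bin | Acc]);
           (eof) -> lists:reverse(Acc)
        end.

    %% Stand-in for a server-provided read_input: feeds fixed chunks,
    %% then eof.  A real server would read from the client socket.
    fake_read_input(Chunks) ->
        fun(Callback, _Size) -> feed(Callback, Chunks) end.

    feed(Callback, []) -> Callback(eof);
    feed(Callback, [Chunk | Rest]) -> feed(Callback({data, Chunk}), Rest).

Threading the accumulator through successive callbacks keeps the
reader free of mutable state, which is why each call returns a new
function.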

``write_error``: (Element 3) A 1-arity function which takes an
``iolist`` and writes to the server-defined error log mechanism
(usually ``error_logger``).

``url_scheme``: (Element 4) A string representing the "scheme" portion
of the URL at which the application is being invoked. Normally, this
will have the value ``"http"`` or ``"https"`` where appropriate.

``version``: (Element 5) The tuple ``{1,1}``, representing EWGI major
version 1, minor version 1.

``data``: (Element 6) A dictionary (implemented by the OTP module
``gb_trees``) which can be used for server or application-specific
data to be included with the request.  A common use for this
dictionary is in configuring higher-level web frameworks or providing
cached data.  A server or gateway should also attempt to provide as
many other CGI variables as are applicable.  If SSL is in use, the
server or gateway should likewise provide as many of the
`Apache SSL environment variables`_ as are applicable, such as
``https`` and ``ssl_protocol``.  Note, however, that an application
that uses any CGI variables other than the ones listed above is
necessarily non-portable to web servers that do not support the
relevant extensions.  An EWGI-compliant server or gateway should
document what variables it provides, along with their definitions as
appropriate.  Applications should check for the presence of any
variables they require, and have a fallback plan in the event such a
variable is ``'undefined'``.
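
Reading and writing the ``data`` dictionary with a fallback could be
sketched as follows.  The module ``ewgi_data_sketch`` and its two
functions are hypothetical helpers, and storing extension variables
under string keys is an assumption; this specification does not
prescribe a key type::

    -module(ewgi_data_sketch).
    -export([put_data/3, get_data/3]).

    %% Store Value under Key in the data dictionary, which is element 6
    %% of the ewgi tuple (itself element 5 of Request).
    put_data(Key, Value, Request) ->
        Ewgi = element(5, Request),
        Data = gb_trees:enter(Key, Value, element(6, Ewgi)),
        setelement(5, Request, setelement(6, Ewgi, Data)).

    %% Look up Key, returning Default when the variable is absent.
    get_data(Key, Request, Default) ->
        case gb_trees:lookup(Key, element(6, element(5, Request))) of
            {value, Value} -> Value;
            none -> Default
        end.

Passing ``undefined`` as ``Default`` gives the application the single
"absent" value that the fallback advice above refers to.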

HTTP headers
''''''''''''

EWGI provides a tuple with commonly-used HTTP request headers to
optimise retrieval.  Each of the values is a list of 2-tuples of the
form ``{FieldName, FieldValue}``.  Servers MUST preserve the order
of headers as they are given in the request.  Servers SHOULD preserve
the case of the ``FieldName`` values.

``http_accept``: (Element 2) The ``Accept:`` header

``http_cookie``: (Element 3) The ``Cookie:`` header

``http_host``: (Element 4) The ``Host:`` header

``http_if_modified_since``: (Element 5) The ``If-Modified-Since:``
header

``http_user_agent``: (Element 6) The ``User-Agent:`` header

``http_x_http_method_override``: (Element 7) The
``X-Http-Method-Override:`` header.  While not part of the HTTP 1.1
specification, this header can be used to overcome a common browser
limitation which prevents browsers from sending a ``PUT`` or
``DELETE`` request to a URI.

``other``: (Element 8) A multiset (implemented by the OTP module
``gb_trees``) which contains all other HTTP request headers. The keys
of the multiset should be lower-case representations of the header
names and the values should be lists of tuples of the form
``{HeaderName, HeaderValue}``.  Servers SHOULD attempt to preserve
the original case of header names in the tuple list.
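
The layout of ``other`` can be illustrated with the sketch below.
The module ``ewgi_header_sketch`` and both function names are
hypothetical, and header names and values are assumed to be strings::

    -module(ewgi_header_sketch).
    -export([add_other/3, other_values/2]).

    %% Record one header that has no dedicated field.  The key is the
    %% lower-cased name; the stored tuple keeps the original case.
    add_other(Name, Value, Other) ->
        Key = string:to_lower(Name),
        Existing = case gb_trees:lookup(Key, Other) of
                       {value, Pairs} -> Pairs;
                       none -> []
                   end,
        gb_trees:enter(Key, Existing ++ [{Name, Value}], Other).

    %% Return all values of a header, in request order, ignoring case.
    other_values(Name, Other) ->
        case gb_trees:lookup(string:to_lower(Name), Other) of
            {value, Pairs} -> [V || {_N, V} <- Pairs];
            none -> []
        end.

Appending to the existing list keeps repeated headers in the order
they were received, as the ordering requirement above demands.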

Notes
'''''

Missing variables (where allowed, such as ``remote_user`` when no
authentication has occurred) should be defined by the atom
``'undefined'``.  Also note that CGI-defined variables must be strings
if they are defined.  It is a violation of this specification for a
CGI variable's value to be of any type other than ``string`` or the
``'undefined'`` atom.

Response
--------

The ``Response`` parameter is a 5-tuple of the form ``{ewgi_response,
{StatusCode, ReasonPhrase}, HeaderList, MessageBody, Error}``.  A
convenient record definition is::

    -record(ewgi_response, {
              status={200, "OK"},
              headers=[],
              message_body,
              err
             }).

Status Code
'''''''''''

The ``StatusCode`` parameter should be a 3-digit integer corresponding
to the HTTP status code as defined in the HTTP specification (See `RFC
2616, Section 6.1.1`_ for more information).  For example, ``200``
corresponds to a successful request.

Reason Phrase
'''''''''''''

The ``ReasonPhrase`` parameter is intended to be a human readable
representation of ``StatusCode`` and should be a string or binary.

Headers
'''''''

The ``HeaderList`` parameter is a list of ``{HeaderName,
HeaderValue}`` tuples describing the HTTP response headers.

Each ``HeaderName`` must be a valid HTTP header field-name (as defined
by `RFC 2616, Section 4.2`_), without a trailing colon or other
punctuation.  Note: ``HeaderName`` is case-insensitive, but should be
lower-case for optimising comparisons. (A reminder for server/gateway
authors: be sure to take that into consideration when examining
application-supplied headers.)

Each ``HeaderValue`` must not include any control characters,
including CR or LF, in any position.

In general, the server or gateway is responsible for ensuring that
correct headers are sent to the client: if the application omits a
header required by HTTP (or other relevant specifications that are in
effect), the server or gateway must add it.  For example, the HTTP
``Date:`` and ``Server:`` headers would normally be supplied by the
server or gateway.

Applications and middleware are forbidden from using HTTP/1.1
"hop-by-hop" features or headers, any equivalent features in HTTP/1.0,
or any headers that would affect the persistence of the client's
connection to the web server.  These features are the exclusive
province of the actual web server, and a server or gateway should
consider it a fatal error for an application to attempt sending them,
and raise an error if they are supplied.

For example, a valid header list::

    [{"content-type", "application/json"}, {"etag", "8a920bc001df"}]
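
A server-side check for the two restrictions above (no hop-by-hop
headers, no control characters in values) might be sketched like
this.  The module ``ewgi_header_check_sketch``, the function
``check_headers/1``, and the shape of the error terms are
hypothetical; the header names are those listed as hop-by-hop in RFC
2616, Section 13.5.1, and names and values are assumed to be
strings::

    -module(ewgi_header_check_sketch).
    -export([check_headers/1]).

    %% Hop-by-hop headers listed in RFC 2616, Section 13.5.1.
    -define(HOP_BY_HOP, ["connection", "keep-alive", "proxy-authenticate",
                         "proxy-authorization", "te", "trailers",
                         "transfer-encoding", "upgrade"]).

    %% Return ok, or {error, Reason} for the first offending header.
    check_headers([]) ->
        ok;
    check_headers([{Name, Value} | Rest]) ->
        case {lists:member(string:to_lower(Name), ?HOP_BY_HOP),
              lists:any(fun is_control/1, Value)} of
            {true, _} -> {error, {hop_by_hop, Name}};
            {_, true} -> {error, {control_character, Name}};
            {false, false} -> check_headers(Rest)
        end.

    %% ASCII control characters: 0..31 and DEL (127).
    is_control(C) -> C < 32 orelse C =:= 127.

The name is lower-cased before comparison because ``HeaderName`` is
case-insensitive.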

Message Body
''''''''''''

The ``MessageBody`` parameter is either an ``iolist`` or a "stream,"
which is a lazy, recursive list-like structure.  A stream is a
zero-arity function which returns either the empty tuple ``{}`` or a
2-tuple of the form ``{Head, Tail}`` where ``Head`` is an ``iolist``
and ``Tail`` is another stream.  Servers may choose to transmit
message bodies represented by a stream using the chunked transfer
encoding.  However, the server or gateway must transmit each
``iolist`` to the client in an unbuffered fashion, completing the
transmission of each ``iolist`` before requesting another one.  (In
other words, applications should perform their own buffering.)
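
Both sides of the stream contract can be sketched briefly.  The
module ``ewgi_stream_sketch`` and its functions are hypothetical, and
``Send`` stands in for whatever 1-arity function a server uses to
write an ``iolist`` to the client::

    -module(ewgi_stream_sketch).
    -export([from_list/1, send/2]).

    %% Application side: build a stream from a list of iolists.
    from_list([]) ->
        fun() -> {} end;
    from_list([Head | Rest]) ->
        fun() -> {Head, from_list(Rest)} end.

    %% Server side: pass a message body to Send, one iolist at a time,
    %% finishing each write before asking the stream for the next chunk.
    send(Body, Send) when is_function(Body, 0) ->
        case Body() of
            {} ->
                ok;
            {Head, Tail} ->
                Send(Head),
                send(Tail, Send)
        end;
    send(Body, Send) ->
        %% Not a stream, so it is a plain iolist.
        Send(Body),
        ok.

Because ``Tail`` is only called after ``Send(Head)`` returns, the
application never produces a chunk before the previous one has been
handed to the client.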

The server or gateway should not alter the ``iolist`` returned by the
application in any way.  The application is responsible for ensuring
that the ``iolist`` to be written is in a format suitable for the
client.  However, the server or gateway may apply HTTP transfer
encodings or perform other transformations for the purpose of
implementing HTTP features such as byte-range transmission.

EWGI Reference Implementation
=============================

The EWGI reference implementation includes an API module ``ewgi_api``
which defines helper functions to access and modify the EWGI context,
parse query strings, etc.  It also includes a module
``ewgi_application`` which contains convenience functions for dealing
with application functions as well as sample middleware components.
An include file (``include/ewgi.hrl``) is also provided, which
contains macros for standard HTTP status values and the convenience
record definitions.  These may be used to help development of servers
and applications, but should not be required.

Copyright
=========

This document has been placed in the public domain.

.. _PEP 333:
    http://www.python.org/dev/peps/pep-0333/

.. _Common Gateway Interface specification:
    http://cgi-spec.golux.com/draft-coar-cgi-v11-03.txt

.. _Apache SSL environment variables:
    http://www.modssl.org/docs/2.8/ssl_reference.html#ToC25

.. _RFC 2616, Section 6.1.1:
    http://www.w3.org/Protocols/rfc2616/rfc2616-sec6.html#sec6.1.1

.. _RFC 931:
    http://www.faqs.org/rfcs/rfc931.html

.. _RFC 2616, Section 4.2:
    http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2