\chapter{Preparation}

The previous chapter introduced the concept of web applications and functional reactive programming (FRP). This chapter describes in detail the languages and libraries which will be used in the project. It shows the plans for the architecture of the applications to be produced, the infrastructure which will be used to support development and how testing will be carried out. It also contains a work plan for this implementation and testing.

\section{Requirements Analysis}
A successful implementation of this project should meet the following criteria:
\begin{itemize}
\item It should provide two or more significantly different web applications.
\item These applications must be cross-browser compatible.
\item Each must be compiled into JavaScript from a functional programming language and make use of reactive programming.
\item The applications should be used to determine whether these methods aid the creation of web applications and what drawbacks they introduce.
\end{itemize}

\section{Starting Point}

I have had previous experience developing web applications in JavaScript. I found the language difficult to use, especially with respect to debugging. For example, a common problem was that typing errors (such as misspelled variable names) silently introduced new, uninitialised variables. The code was still valid JavaScript but the program produced the wrong output. This section will outline the disadvantages of JavaScript and compare it to OCaml, a language which does not have these drawbacks.

\subsection{JavaScript}
JavaScript is used for client-side scripting in web pages. It is run on the local machine inside the browser environment and can access and modify the Document Object Model (\emph{DOM}), which is the browser's representation of the objects on the current web page. By manipulating the DOM at run-time, JavaScript can be used to make interactive user interfaces and dynamic pages.

\subsection{OCaml}
OCaml is a language derived from Caml (Categorical Abstract Machine Language). Caml is a general purpose language which supports functional, imperative and object-oriented programming styles. It features a powerful type system that uses parametric polymorphism and type inference. This allows functions to be written without explicitly declaring the types of their parameters or result, so functions can be reused with many different types of inputs. It also has pattern matching, which can be used to direct the control flow of a program depending on the shape of its inputs~\cite{bib:caml}.
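
As a minimal sketch of these two features (the function name \texttt{length} is chosen here for illustration and is not part of the project), the following function is defined by pattern matching on the shape of a list. No type annotations are written; the compiler infers the polymorphic type \texttt{'a list -> int}, so the same function works on lists of any element type.

\begin{lstlisting}[caption={Sketch: type inference and pattern matching}]
(* Inferred type: 'a list -> int *)
let rec length = function
  | [] -> 0                        (* the empty list *)
  | _ :: tail -> 1 + length tail   (* a head followed by a tail *)

let three = length [10; 20; 30]    (* used on an int list *)
let two = length ["a"; "b"]        (* used on a string list *)
\end{lstlisting}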

OCaml (Objective Caml) is a variant of the Caml language. It is an extension of Caml which adds an object-oriented layer and a module system. It is designed for use in developing commercial systems and is the most popular Caml derivative~\cite{bib:ocaml}.

\subsection{JavaScript vs OCaml}
In some ways JavaScript is very similar to OCaml. In both languages, functions are first class objects: they can be assigned to variables, passed as function parameters and invoked. However, the type systems of OCaml and JavaScript are very different. OCaml is statically typed: the compiler infers a type for every expression, starting from the most general type and specialising it according to how the value is used. In JavaScript variables themselves do not have a type and so can be reassigned to a value of any type. This can lead to programming errors because the programmer is not guaranteed to know the type of an object at any point. JavaScript also uses type coercion, which further complicates the situation. For example, if a number is used in place of a boolean, zero (and \texttt{NaN}) will evaluate to \emph{false} and all other numbers to \emph{true}. Even worse, if a string is used as a boolean then the empty string represents \emph{false} and any non-empty string (including the value ``false''!) will evaluate to \emph{true}. Another difference is that OCaml is checked for errors at compile-time whereas JavaScript errors only surface at run-time: if a piece of JavaScript is not executed in a test-run, the programmer cannot be sure that it will succeed.
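
The difference can be illustrated with a small OCaml sketch (all function names here are hypothetical). Because OCaml performs no implicit coercion, a number or string must be converted to a \texttt{bool} explicitly, and any attempt to use one directly as a condition is rejected at compile-time.

\begin{lstlisting}[caption={Sketch: explicit conversions instead of coercion}]
let describe (flag : bool) = if flag then "yes" else "no"

(* Explicit conversions replace JavaScript's implicit coercion. *)
let is_nonzero n = n <> 0
let is_nonempty s = s <> ""

let a = describe (is_nonzero 0)          (* "no" *)
let b = describe (is_nonempty "false")   (* "yes": the string is non-empty *)

(* [describe 0] and [describe "false"] do not compile:
   the type checker expects a bool in both cases. *)
\end{lstlisting}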

\section{Tools}
The previous section looked at and compared JavaScript and OCaml. This section will look at some libraries that can be used to write web applications in OCaml rather than JavaScript.

\subsection{ocamljs}
\emph{ocamljs}\footnote{\url{https://github.com/jaked/ocamljs}} is a modified version of the OCaml compiler written by Jake Donham\footnote{\url{http://jaked.org/}}. It uses the standard OCaml compiler up to the point when lambda code (see below) is generated; it replaces the last stage where byte code is generated and outputs JavaScript instead. Figure \ref{fig:ocamlc} shows the stages of the OCaml compiler and \emph{ocamljs}~\cite{bib:oreilly}.

\begin{figure}
  \centering
  \includegraphics[scale=0.5]{images/ocamlc.pdf}
  \caption{Stages of the OCaml compiler (ocamlc) and \emph{ocamljs}~\cite{bib:oreilly}.}
  \label{fig:ocamlc}
\end{figure}

\subsubsection{Lambda Code}
Lambda code is based on the $\lambda$-calculus, a notation invented by Alonzo Church in the 1930s. It is used to describe the most basic ways that operators and functions can be combined. Conversion from mathematical functions to lambda-expressions is fairly straightforward~\cite{bib:lambda}. Here is a short example:

\begin{center}
$f(x) = x - 1$ becomes $\lambda x.x-1$
\end{center}

This may not look like a significant change but it makes the notation much closer to that used in programming. When there is more than one parameter for a function, multiple single-parameter lambda-expressions are chained (this is known as \emph{currying}) instead of having a multi-parameter lambda-expression. For example:

\begin{center}
$f(x,y) = x-y$ becomes $\lambda x.\lambda y.x - y$
\end{center}
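
The same chaining can be written directly in OCaml. The sketch below (with illustrative names) defines subtraction as two nested single-parameter functions; supplying only the first argument yields a new function which awaits the second.

\begin{lstlisting}[caption={Sketch: chained single-parameter functions}]
(* Explicit chaining, mirroring the lambda-expression above. *)
let sub = fun x -> fun y -> x - y

(* The usual OCaml shorthand for exactly the same function. *)
let sub' x y = x - y

(* Applying only the first argument gives a function of y. *)
let from_ten = sub 10
let seven = from_ten 3
\end{lstlisting}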

Lambda code is very similar to $\lambda$-calculus. Consider the following OCaml program:

\begin{lstlisting}[caption={simple.ml},label=lst:simple-ml]
let f a b = a+b
let three = f 1 2;;
\end{lstlisting}

The OCaml compiler can output the lambda representation if the \emph{-dlambda} command-line switch is used. The lambda output for the above OCaml is shown below:

\begin{lstlisting}[caption={Lambda Code},label=lst:simple-lambda]
(setglobal Simple!
(let (
f/58 (function a/59 b/60 (+ a/59 b/60))
three/61 (apply f/58 1 2))
(makeblock 0 f/58 three/61)))
\end{lstlisting}

In this lambda code each variable has been renamed with a unique numeric suffix. The lambda-expression part is \texttt{function a/59 b/60 (+ a/59 b/60)} which is equivalent to \texttt{$\lambda a.\lambda b.a+b$}.

\subsubsection{Lambda to JavaScript}
\label{lab:ocaml-js}
\emph{ocamljs} transforms lambda code into JavaScript. Functions and exceptions map directly onto their JavaScript counterparts. Integers and floats can be represented as a JavaScript \texttt{number} and booleans by the JavaScript \texttt{boolean}. The standard library functions have been reimplemented in a static JavaScript file. Lists are implemented as a collection of nested JavaScript objects, each with two elements: \emph{0} holds the value of the list at that position and \emph{1} holds the tail of the list~\cite{bib:js_comp}.

Function applications are more complex. Functions in JavaScript are expected to be called with the declared number of arguments (if fewer arguments are provided the rest default to \texttt{undefined}) whereas in OCaml, functions can receive more arguments than they are defined with (\emph{over-application}) or fewer (\emph{partial application}). When we have a partial application, we want to return a closure and when we have an over-application, we want to apply the extra arguments to the result. This is solved using Simon Peyton Jones' \emph{eval-apply} method~\cite{bib:js_comp,bib:krivines_machine}.

\label{lab:eval-apply}
With the eval-apply scheme the caller is responsible for providing the correct number of arguments to a function. If there are not enough, a closure has to be created and if there are too many, the left over arguments are applied to the result of the function. This is implemented using the \texttt{apply} function outlined in Figure \ref{eval-apply}~\cite{bib:krivines_machine}.

\begin{figure}
  \begin{graybox}
  \begin{alltt}
f a\subs{1} ... a\subs{n} -> apply\subs{n}(f, a\subs{1}, ..., a\subs{n})

apply\subs{n} = \lam f x\subs{1} ... x\subs{n}.
  match arity(f) with
    | 1   -> apply\subs{n-1} (f(x\subs{1}), x\subs{2}, ..., x\subs{n})
    | ...
    | n-1 -> apply\subs{1} (f(x\subs{1}, ..., x\subs{n-1}), x\subs{n})
    | n   -> f(x\subs{1}, ..., x\subs{n})
    | n+1 -> papp\subs{n+1,n}(f, x\subs{1}, ..., x\subs{n})
    | n+2 -> papp\subs{n+2,n}(f, x\subs{1}, ..., x\subs{n})
    | ...

papp\subs{p,q} = \lam f x\subs{1} ... x\subs{q}. (\lam x\subs{q+1} ... x\subs{p}. f(x\subs{1}, ..., x\subs{p}))
  \end{alltt}
  \end{graybox}
  \caption{Eval-apply implementation}
  \label{eval-apply}
\end{figure}
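
The scheme can be sketched as a small OCaml model. This is an illustration only and not the \emph{ocamljs} runtime (which is written in JavaScript): the \texttt{value} type, the helper \texttt{take} and the example functions are all assumptions made for this sketch. A function value records its arity, and \texttt{apply} compares that arity with the number of arguments supplied by the caller.

\begin{lstlisting}[caption={Sketch: a model of eval-apply}]
(* Hypothetical model: a function knows its arity and expects
   exactly that many arguments in a list. *)
type value =
  | Int of int
  | Fun of int * (value list -> value)

(* Split a list into its first n elements and the rest. *)
let rec take n = function
  | x :: xs when n > 0 ->
    let (front, rest) = take (n - 1) xs in
    (x :: front, rest)
  | rest -> ([], rest)

let rec apply f args =
  match f, args with
  | _, [] -> f
  | Int _, _ -> failwith "apply: not a function"
  | Fun (arity, body), _ ->
    let n = List.length args in
    if n = arity then body args              (* exact call *)
    else if n < arity then
      (* partial application: build a closure for the rest *)
      Fun (arity - n, fun rest -> body (args @ rest))
    else
      (* too many: apply the left-over arguments to the result *)
      let (now, later) = take arity args in
      apply (body now) later

let add =
  Fun (2, function [Int a; Int b] -> Int (a + b) | _ -> failwith "add")
let curried_add =
  Fun (1, function
    | [Int a] ->
      Fun (1, function [Int b] -> Int (a + b) | _ -> failwith "add")
    | _ -> failwith "add")

let exact = apply add [Int 1; Int 2]
let partial = apply (apply add [Int 1]) [Int 2]
let over = apply curried_add [Int 1; Int 2]
\end{lstlisting}

All three calls produce \texttt{Int 3}: the first matches the arity exactly, the second goes through an intermediate closure and the third applies the extra argument to the function returned by the first call.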

\subsubsection{ocamljs Example}
Below is the JavaScript code compiled from the OCaml example in Listing~\ref{lst:simple-ml}.

\begin{lstlisting}[caption={Compiled JavaScript},label=lst:simple-comp]
function () {
  var f$58 =
    _f(2, function (a$59, b$60) {
      return a$59 + b$60;
    });
  var three$61 = _(f$58, [ 1, 2 ]);
  return $(f$58, three$61);
}
\end{lstlisting}

\subsection{froc}

\emph{froc}\footnote{\url{https://github.com/jaked/froc}} is a library, also written by Jake Donham, for reactive programming in OCaml.

\subsubsection{Self Adjusting Computation}
\emph{froc} uses \emph{self adjusting computation} to push updates to input variables through data paths in the program. Self adjusting means that once a variable has been defined, the program will automatically forward changes in its value to everything that depends on it. These dependencies are stored by the program as a \emph{dependency graph}.

\subsubsection{Dependency Graphs}
Given a set of variables and a set of dependencies (pairs of variables), a directed graph can be created where the variables form the nodes and the dependencies form the edges. This is a dependency graph and it is used by reactive programs to represent data-flow throughout the program.
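
A minimal sketch of such a graph is given below. The variable names and the list-of-pairs representation are assumptions made for illustration; they say nothing about how \emph{froc} stores its graph internally. Each pair is an edge, and following the edges from a variable gives every variable which must be recomputed when it changes.

\begin{lstlisting}[caption={Sketch: a dependency graph as a list of edges}]
(* Each pair (source, target) means "target depends on source". *)
let edges = [ ("a", "c"); ("b", "c"); ("c", "d") ]

(* Variables which depend directly on v. *)
let dependents v =
  List.map snd (List.filter (fun (source, _) -> source = v) edges)

(* Every variable affected by a change to v.
   Assumes the graph is acyclic. *)
let rec affected v =
  dependents v
  |> List.map (fun d -> d :: affected d)
  |> List.concat
  |> List.sort_uniq compare
\end{lstlisting}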

\subsection{Behaviors and Events}
\label{lab:behavior}
Reactive programming uses two polymorphic data types, \emph{behaviors} and \emph{events}. Behaviors are values which vary over time, such as the colour of an object or its width. Events are time-ordered series of values which correspond to real occurrences such as mouse presses~\cite{bib:lambda}.

In order to use these time-varying values \emph{froc} provides a way to create dependencies by \emph{binding} them to functions. Naturally this function is called \texttt{bind} and has the following interface:

\texttt{bind : 'a behavior -> ('a -> 'b behavior) -> 'b behavior}

This is a function which takes a \texttt{behavior} of type $\alpha$ and a callback function which converts an $\alpha$ to a $\beta$ \texttt{behavior}, and returns a new $\beta$ \texttt{behavior}. Binding other \emph{behavior}s to a variable is how the dependency graph is constructed; whenever the \emph{behavior}s it depends on change, the callback function is invoked with the new values. There is also an infix shortcut for \texttt{bind}, \texttt{\textgreater\textgreater=}.
\pagebreak
Any value can be converted into a \emph{behavior} using the \texttt{return} function:

\texttt{return : 'a -> 'a behavior}

Sometimes the callback is an ordinary function, such as a built-in one, which does not itself return a \emph{behavior}. Rather than wrapping it in a new function and calling \texttt{bind}, there is another function called \texttt{lift} which does this automatically.

\texttt{lift : 'a behavior -> ('a -> 'b) -> 'b behavior}

There are also multiple-argument versions of each function which allow variables to depend on more than one \emph{behavior} at once. Here are the two argument versions of \texttt{bind} and \texttt{lift}:

\texttt{bind2 : 'a behavior -> 'b behavior -> ('a -> 'b -> 'c behavior) -> 'c behavior}\\
\texttt{lift2 : 'a behavior -> 'b behavior -> ('a -> 'b -> 'c) -> 'c behavior}
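
To show how these signatures relate to one another, the following is a toy model which does not use \emph{froc} at all. It assumes a \emph{behavior} is simply a function which recomputes its value each time it is sampled, whereas \emph{froc} pushes changes through a dependency graph. The names \texttt{sample} and \texttt{cell} are hypothetical. The point of the sketch is only that \texttt{lift} and \texttt{lift2}, with the argument order given above, can be written in terms of \texttt{bind} and \texttt{return}.

\begin{lstlisting}[caption={Sketch: a toy model of the behavior interface}]
(* Toy model only: re-evaluated on every sample, unlike froc. *)
type 'a behavior = unit -> 'a

let return (v : 'a) : 'a behavior = fun () -> v
let bind (b : 'a behavior) (f : 'a -> 'b behavior) : 'b behavior =
  fun () -> f (b ()) ()
let ( >>= ) = bind

(* lift and lift2 wrap a plain function using bind and return. *)
let lift b f = b >>= fun v -> return (f v)
let lift2 a b f = a >>= fun x -> b >>= fun y -> return (f x y)

let sample (b : 'a behavior) : 'a = b ()

(* A mutable input standing in for a time-varying value. *)
let cell = ref 1
let x : int behavior = fun () -> !cell
let doubled = lift x (fun v -> v * 2)
let sum = lift2 x doubled ( + )

let before = sample sum     (* 1 + 2 *)
let () = cell := 5
let after = sample sum      (* 5 + 10 *)
\end{lstlisting}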

Consider the following piece of OCaml:

\begin{lstlisting}[caption={Example with \emph{froc} \texttt{bind} (\texttt{\textgreater\textgreater=})}]
let x = return 1
let y = return 2
let z = return 3

let i0 =
    x >>= fun x ->
        y >>= fun y ->
            return (x + y)
let ans =
    i0 >>= fun i0 ->
        z >>= fun z ->
            return (i0 + z)
\end{lstlisting}

This program constructs a \emph{froc} \emph{behavior} which is the summation of three other \emph{behavior}s. Figure \ref{add_graph} shows the dependency graph that \emph{froc} holds internally for this program.

Figure \ref{if_graph} shows a more complex dependency graph. In this example, if the value of \emph{behavior} \emph{b} were calculated before \emph{behavior} \emph{a}, an exception could be raised. However, this example does not cause a divide by zero exception. The first thing this tells us about \emph{froc} is that it doesn't always execute every statement in the dependency graph; if an intermediate \emph{behavior} is unused \emph{froc} doesn't waste computation time on it. This is called \emph{lazy evaluation}. The second thing this tells us is that \emph{froc} evaluates \emph{behavior}s in a top-down order. It starts with the outputs and works out which \emph{behavior}s it needs to compute next from there~\cite{bib:froc}.

One final thing to mention is that \emph{froc} will not propagate changes to a \emph{behavior} if the callback returns the existing value of the \emph{behavior}. This is useful because it allows cycles in the dependency graph: without this property, propagation around a cycle would never terminate.
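
The termination argument can be checked with a small stand-alone sketch. It does not use \emph{froc}; the \texttt{cell} type and the functions \texttt{make}, \texttt{subscribe} and \texttt{set} are hypothetical. A cell notifies its listeners only when its value actually changes, so two cells which update each other settle after one round.

\begin{lstlisting}[caption={Sketch: suppressing unchanged values breaks cycles}]
type 'a cell = {
  mutable value : 'a;
  mutable listeners : ('a -> unit) list;
}

let make v = { value = v; listeners = [] }
let subscribe c f = c.listeners <- f :: c.listeners

(* Propagate only if the new value differs from the old one. *)
let set c v =
  if v <> c.value then begin
    c.value <- v;
    List.iter (fun f -> f v) c.listeners
  end

(* A cycle: a updates b and b updates a. *)
let a = make 0
let b = make 0
let updates = ref 0
let () = subscribe a (fun v -> incr updates; set b v)
let () = subscribe b (fun v -> incr updates; set a v)

let () = set a 3   (* terminates: the second set on a is a no-op *)
\end{lstlisting}

Without the comparison in \texttt{set}, the two listeners would call each other indefinitely.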

\begin{figure}
  \centering
  \includegraphics[scale=0.5]{graphs/addition.png}
  \caption{Example dependency graph}
  \label{add_graph}
\end{figure}

\begin{figure}
  \centering
  \includegraphics[scale=0.5]{graphs/if.png}
  \caption{Example if-statement dependency graph}
  \label{if_graph}
\end{figure}

%\subsection{HTTP Server}
%An HTTP server is required to serve up a web page. At first a stock web server, such as \emph{Apache}, seemed like a good idea because it requires minimal setup. This is good for serving static content (such as HTML pages and JavaScript files) but dynamic content (such as time-dependent JSON messages) proves more tricky. In order to deliver interesting JSON data some server code is required. The two options are to use some sort of server side scripting which Apache can execute, although this requires learning a new language such as \emph{PHP} or to find an implementation of an HTTP server in a language this project is already using (such as OCaml) and modify it such that JSON data can be generated at run-time and delivered to the client.

%Using an OCaml HTTP server is the more sensible solution because it gives the greatest amount of time and lets me concentrate on writing the JSON generating code rather than getting stuck learning a new syntax. The OCaml web server I shall use is \emph{ocaml-cohttpserver}\footnote{https://github.com/avsm/ocaml-cohttpserver}.

\section{Architecture}

This section will describe the design of three web applications which will make up this project. Each application will use the common design architecture called \emph{client-server}. Client-server is normally based on many clients and one server. It is designed such that the amount of processing performed by the server after each request is minimal. The majority of the computation, which is usually involved in rendering the data as elements on a page, is performed by each client~\cite{bib:dist_arch}.

\begin{figure}
  \includegraphics[width=\linewidth]{images/client-server.png}
  \caption{Client/Server Architecture}
  \label{fig:client_server}
\end{figure}

\subsection{Log Viewer}
If we have a system where there are multiple threads which all run the same code, the log file will be interleaved with messages from each thread. This will make following the path of one thread through the logs difficult. An HTTP server is an example of a system with these properties. Each connection made to the server is run in a new thread and each of these runs almost the same sequence of code.

\textbf{Application 1}: Design a system using \emph{ocamljs} and \emph{froc} which could replace the logging module in a program to display the messages in a more helpful way.
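
The core operation such a viewer needs is to separate the interleaved messages by thread. A minimal sketch is shown below, assuming each log entry is a pair of a thread identifier and a message; the data and the function name are made up for illustration and the real log format may differ.

\begin{lstlisting}[caption={Sketch: extracting one thread from an interleaved log}]
(* Interleaved log entries: (thread id, message). *)
let log =
  [ (1, "accept"); (2, "accept"); (1, "read");
    (2, "read"); (1, "close") ]

(* The messages of a single thread, in their original order. *)
let messages_of thread =
  log
  |> List.filter (fun (id, _) -> id = thread)
  |> List.map snd
\end{lstlisting}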

\subsection{Dataset Graph}
Graphing is generally fairly straightforward. However, sometimes we wish to represent data which has more than two variables. This is difficult on a graph with just two axes. One way we can represent a third axis is by mapping it to time. The graph can show the values for the other two variables with the third one fixed. A \emph{play} feature can be used to show all of these graphs with the third variable changing over time.

\textbf{Application 2}: This application will show data with three variables, one on each axis and one which is varied using a \emph{play} function.
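
One way to picture the \emph{play} function is as a sequence of two-variable frames, one for each value of the third variable. The sketch below assumes the data is a list of triples whose third component is the variable mapped to time; the values and names are invented for illustration.

\begin{lstlisting}[caption={Sketch: splitting three-variable data into frames}]
(* (x, y, t) triples; t is the variable mapped to time. *)
let data =
  [ (1.0, 2.0, 2009); (2.0, 3.5, 2009);
    (1.2, 2.4, 2010); (2.1, 3.9, 2010) ]

(* The two-variable graph shown while the third variable is t. *)
let frame t =
  data
  |> List.filter (fun (_, _, t') -> t' = t)
  |> List.map (fun (x, y, _) -> (x, y))

(* Playing the graph means showing one frame per distinct t. *)
let steps = List.sort_uniq compare (List.map (fun (_, _, t) -> t) data)
let frames = List.map frame steps
\end{lstlisting}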

\subsection{Heat Map}
A heat map is another type of graph. Instead of varying in position along the X and Y axes, points remain fixed. Each point varies over time in colour or \emph{heat}. As the time changes each of the data points updates its colour to the new value.

\textbf{Application 3}: Create an application which displays data about energy usage over time for a number of rooms in a building.
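
As a sketch of the mapping from a data value to a colour, the function below assumes the value has already been normalised to the range 0 to 1 and produces a CSS colour string running from blue (cold) to red (hot). The function name and the choice of a linear blue-to-red scale are assumptions, not a description of the final application.

\begin{lstlisting}[caption={Sketch: mapping a normalised value to a colour}]
(* Map a value in [0, 1] to a colour between blue and red.
   Values outside the range are clamped. *)
let heat_colour v =
  let v = max 0.0 (min 1.0 v) in
  let red = int_of_float (255.0 *. v) in
  let blue = 255 - red in
  Printf.sprintf "rgb(%d,0,%d)" red blue
\end{lstlisting}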

\section{Infrastructure}

This section describes the tools that will be used to help with development, version control, source code back-up and compilation of the project.

\subsection{Version Control}
Version control is very important for a software project. It involves breaking the project into a number of changesets. Each changeset consists of differences to files along with a brief description explaining what changes were made. The idea is that the code is in a consistent state before and after each commit. This is often used in conjunction with pushing changesets to a remote server which is regularly backed up. As a consequence, if files become corrupted, deleted or changed in such a way that work has been undone, the files can be reverted to a working copy\footnote{It turned out to be very useful that I had used version control on this project because a failed operating system upgrade left my machine unbootable. Once I had it working again I only needed to clone the repository again and I could continue working where I left off.}.

There are many version control systems. \emph{Git}\footnote{\url{http://git-scm.com/}} is the one used for the \emph{ocamljs} and \emph{froc} projects. It has the required functionality and there is a free to use service run by \emph{GitHub}\footnote{\url{http://www.github.com}} on condition that your code is publicly viewable and anyone can fork your repository. The \emph{GitHub} service also provides some social networking features which allow other developers to follow changes to repositories they are interested in. The repository for this project can be found at \url{https://github.com/hhughes/ocaml-frui}.

\subsection{Compiling the Project}
The \emph{GNU Make}\footnote{\url{http://www.gnu.org/software/make/}} system will be used to perform compilation of the project. Make uses shell scripts and dependencies to compile just those parts of the project which have changed since it was last compiled. Compilation of this project is likely to be reasonably quick but it is good practice to use Make in case the project becomes larger. Make is also a commonly used and simple tool. It is likely that those who clone this repository will already be familiar with the tool.

\section{Development Model}
This project will use the \emph{waterfall model} of development. This method separates out each stage of development which simplifies the design-implement-test process~\cite{bib:royce}. The maintenance stage has been removed from the original model because that is beyond the scope of this project. Figure \ref{fig:waterfall} shows the stages of the waterfall model as used in this project.

\begin{figure}
  \centering
  \includegraphics[scale=0.5]{images/waterfall.pdf}
  \caption{Waterfall development model (with \emph{maintenance} stage removed)}
  \label{fig:waterfall}
\end{figure}

\section{Testing}
In order to test the applications written using \emph{ocamljs} and \emph{froc}, each application has to be recreated using handwritten JavaScript. These implementations should provide the same functionality and use as close to the same algorithms as possible. Each version of each application will be tested using dummy input data and the average execution time of the JavaScript in each case will be used as a comparison. Each application should also be tested on at least two web browsers which use different JavaScript engines (for example \emph{Google Chrome}\footnote{\url{http://www.google.com/chrome/}} uses \emph{V8}\footnote{\url{http://code.google.com/p/v8/}} and \emph{Mozilla Firefox}\footnote{\url{http://www.mozilla.org/firefox}} uses \emph{SpiderMonkey}\footnote{\url{https://developer.mozilla.org/en/SpiderMonkey}}). This is because different browsers will optimise different parts of the code. A note should also be made of the estimated number of man-hours invested in both the \emph{ocamljs} and pure JavaScript implementations to compare the \emph{ease} of writing the code.

Micro-tests (e.g. implementations of simple algorithms such as \emph{quick-sort} or \emph{matrix multiplication}) may also be required if test results from the web applications are not significant.
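
A micro-test of this kind could be timed with a harness along the following lines. This is only a sketch under stated assumptions: it uses the standard library's \texttt{Sys.time} and \texttt{List.sort} on generated input, the name \texttt{average\_time} is hypothetical, and code compiled with \emph{ocamljs} would need a browser-side clock in place of \texttt{Sys.time}.

\begin{lstlisting}[caption={Sketch: averaging the execution time of a micro-test}]
(* Average processor time, in seconds, of one call to f. *)
let average_time ~runs f =
  let start = Sys.time () in
  for _i = 1 to runs do
    ignore (f ())
  done;
  (Sys.time () -. start) /. float_of_int runs

(* Dummy input data: 10000 pseudo-shuffled integers. *)
let input = List.init 10_000 (fun i -> (i * 7919) mod 10_007)

let mean = average_time ~runs:20 (fun () -> List.sort compare input)
\end{lstlisting}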
\vfill

\section{Gantt Chart}
\begin{center}
\includegraphics[angle=270]{charts/gantt.pdf}
\end{center}