\chapter{Evaluation}

Having written the code, the next stage is to assess how the applications perform under testing. This section compares the three applications from the implementation section with hand-written JavaScript equivalents, describing the testing process in detail from both a quantitative and a qualitative viewpoint.
\section{Volume Testing}

Testing can be split into two categories: functional and non-functional. Functional tests relate to user workflows; for example, the user clicks a button and a prompt dialog with certain text appears. Non-functional tests, which are the focus of this section, more often relate to performance. Specifically, the testing here is \emph{volume testing}, which looks at how an application scales with large amounts of data~\cite{bib:art-of-testing}.

All tests were carried out on a 2GHz dual-core laptop running Ubuntu 11.04.2. The web browsers used were Google Chrome 11.0.696.65 beta and Mozilla Firefox 4.0.1 with Firebug 1.7.0.
\subsection{Purpose}

The purpose of this testing is to find out how applications written using \emph{ocamljs} compare to hand-coded JavaScript versions. It is also useful to measure performance on different JavaScript engines. The two used here are Google Chrome's V8 and SpiderMonkey, which is used by Mozilla Firefox. Both compile JavaScript code, but where SpiderMonkey compiles JavaScript to byte code (which is then executed by a virtual machine), V8 compiles it to native code and so should, in theory, be faster~\cite{bib:v8-proj,bib:spidermonkey}.

The re-implementations of the applications have to be written carefully to follow the same algorithms as the \emph{ocamljs} versions. Following the same algorithms, rather than optimising the JavaScript heavily, should give a fairer comparison between the two methods.
\subsection{Strategy}

The testing will be carried out manually, as it is difficult to automate tests of a user interface. There are tools for doing this, such as \emph{Selenium}\footnote{\url{http://seleniumhq.org/}}, however these are difficult to use and do not integrate with the browsers' development tools.

Ideally the client and server processes would be the only ones running on the testing machine. This isolation is not possible in practice because both depend on many other parts of the operating system. The best that can be done is to ensure that no applications other than the server and the web browser are running during the tests. Each test will be repeated three times, with the mean of the three runs used as the final result.

One problem is how to obtain comparable measurements in both Chrome and Firefox. Google Chrome's built-in developer tools include a feature called the \emph{timeline}. The timeline records events such as network calls, mouse events and painting events. Events are also nested, so if one event follows from another they are grouped together in the panel. Recording the time for the callback event therefore measures how long the JavaScript took to parse the messages and set up their data structures. This is exactly what is needed for these tests, except that no such feature exists for Mozilla Firefox.
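One cross-browser alternative would be to instrument the callback by hand and log the elapsed time to the console. The listing below is a minimal sketch of this approach; the URL and the \emph{renderMessages} function are placeholders invented for illustration, not names from the actual applications.

\begin{lstlisting}[caption={Manual timing sketch (hypothetical names)}]
// Time the AJAX callback by hand and log the result.
// "/messages" and renderMessages are placeholders, not
// names taken from the real applications.
var xhr = new XMLHttpRequest();
xhr.open("GET", "/messages", true);
xhr.onreadystatechange = function () {
  if (xhr.readyState === 4 && xhr.status === 200) {
    var start = new Date().getTime();
    renderMessages(xhr.responseText); // parse and build the page
    var elapsed = new Date().getTime() - start;
    if (window.console) console.log("callback: " + elapsed + "ms");
  }
};
xhr.send(null);
\end{lstlisting}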
Both browsers do, however, feature a code profiler. The profiler shows how execution time was distributed among the JavaScript functions in the application, so the running time of the callback function for the AJAX GET call measures how long the application took to load the messages and render the page. The reason this is not the preferred method is that \emph{ocamljs} makes all of the OCaml functions anonymous: they are declared as values and assigned to variables. Such functions cannot be identified in the profiler because they have no names. To give them distinguishable names, the compiled JavaScript code must be modified. For example, take the compiled version of the function which adds two numbers together (from listing \ref{lst:simple-comp}):
\begin{lstlisting}[caption={Compiled JavaScript example}]
var f$58 =
  _f(2, function (a$59, b$60) {
    return a$59 + b$60;
  });
\end{lstlisting}
The modification required to show function \emph{f} in the profiler is to declare a named function inside the anonymous one and call it. The wrapper \emph{\_f} is still needed: it returns an object containing the function so that it can be partially applied (see the \emph{eval-apply} method from section \ref{lab:eval-apply}).
\begin{lstlisting}[caption={Example with named function}]
var f$58 =
  _f(2, function (a$59, b$60) {
    function f(a$59, b$60) {
      return a$59 + b$60;
    }
    return f(a$59, b$60);
  });
\end{lstlisting}
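To make the role of \emph{\_f} concrete, the sketch below shows one plausible way such a wrapper could support the eval-apply calling convention. It is an illustration only and is not \emph{ocamljs}'s actual implementation; the \emph{\_apply} helper and the \emph{\$arity} property are invented for this example.

\begin{lstlisting}[caption={Hypothetical eval-apply wrapper (illustration only)}]
// Record the arity of a compiled OCaml function.
// This is NOT ocamljs's real _f; it only illustrates the idea.
function _f(arity, fn) {
  fn.$arity = arity;
  return fn;
}

// Generic application: call directly when saturated, build a
// closure for partial application, or re-apply any surplus.
function _apply(fn, args) {
  if (args.length === fn.$arity)
    return fn.apply(null, args);
  if (args.length < fn.$arity)
    return _f(fn.$arity - args.length, function () {
      var rest = Array.prototype.slice.call(arguments);
      return _apply(fn, args.concat(rest));
    });
  var result = fn.apply(null, args.slice(0, fn.$arity));
  return _apply(result, args.slice(fn.$arity));
}
\end{lstlisting}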
\subsubsection{Application 1: Log Viewer}

The test for the Log Viewer consisted of measuring the loading time for varying numbers of messages. To make the test fair, the messages must be identical on each run, so the first task is to generate a fixed set of test data. The longer the test data takes to load, the more variation there will be in the results. The Google Chrome profiler also only displays times to two decimal places, so if a test takes longer than a minute a lot of accuracy is lost. The aim, therefore, is to use a range of message volumes that can each be loaded in under a minute.

Initially the range 100 to 1000 messages, in steps of 100, was chosen. However, the \emph{ocamljs} version of the application crashed Google Chrome when loading more than 700 messages. It failed silently, but the crash is likely related to a limit on how OCaml lists are represented in JavaScript (see section \ref{lab:ocaml-js}). Worse still, Mozilla Firefox could not cope with more than 200 messages in the \emph{ocamljs} implementation without crashing. I therefore decided to start with 25 messages and test every increment of 25 up to 200 messages. Figure \ref{fig:log-viewer-test} shows the results for this testing run.
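One way to see why long lists could be a problem: an OCaml list compiles to a chain of linked cons cells, and naive recursive traversal of such a chain consumes one JavaScript stack frame per element. The sketch below reproduces that failure mode under a simplified representation; it is not \emph{ocamljs}'s exact encoding, only an illustration of the general limit.

\begin{lstlisting}[caption={Stack overflow on a long linked list (simplified representation)}]
// Build a linked list of n cons cells: [hd, tl], with 0 as nil.
// A simplification of how a compiler might encode OCaml's
// 'a list, not ocamljs's exact representation.
function make(n) {
  var list = 0;
  for (var i = n; i > 0; i--) list = [i, list];
  return list;
}

// Naive recursive length: one stack frame per element, so a
// sufficiently long list overflows the JavaScript call stack.
function length(list) {
  return list === 0 ? 0 : 1 + length(list[1]);
}

length(make(100));     // fine
length(make(1000000)); // RangeError / "too much recursion"
\end{lstlisting}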
\begin{figure}
\includegraphics{charts/log-viewer-get.pdf}
\caption{Log Viewer tests}
\label{fig:log-viewer-test}
\end{figure}
Because, unexpectedly, the \emph{ocamljs} application ran faster on V8 than the JavaScript implementation did, extra tests were run using Google Chrome's timeline. The timeline breaks execution down into \emph{loading}, \emph{scripting} and \emph{rendering}, which should give a better idea of where the extra processing time went. Figures \ref{fig:log-viewer-scripting} \& \ref{fig:log-viewer-rendering} show the results of these extra tests for each implementation (the value for \emph{loading} was always zero).
\begin{figure}
\includegraphics{charts/log-viewer-scripting.pdf}
\caption{Log Viewer scripting tests}
\label{fig:log-viewer-scripting}
\end{figure}

\begin{figure}
\includegraphics{charts/log-viewer-rendering.pdf}
\caption{Log Viewer rendering tests}
\label{fig:log-viewer-rendering}
\end{figure}
\subsubsection{Application 2: Dataset Graph}

Testing the Dataset Graph involved timing how long it took to load varying amounts of data. The source data for the graph is split into several pages, and these tests vary how many of those pages are loaded. Once again, it is important that both implementations on both platforms can run all the tests without crashing. The resulting graph can be found in figure \ref{fig:data-graph-test}. The \emph{ocamljs} application performed particularly badly in Mozilla Firefox under this test, so a second graph (figure \ref{fig:data-graph-test-noff}) has been created which omits this result.
\begin{figure}
\includegraphics{charts/data-graph-get.pdf}
\caption{Dataset Graph tests}
\label{fig:data-graph-test}
\end{figure}

\begin{figure}
\includegraphics{charts/data-graph-get-noff.pdf}
\caption{Dataset Graph tests (without Firefox \emph{ocamljs})}
\label{fig:data-graph-test-noff}
\end{figure}
\subsubsection{Application 3: Heat Map}

There was not as much data available for these tests as there was for the Dataset Graph. As a result the tests did not produce much variation in the results, except for the \emph{ocamljs} application under SpiderMonkey, which once again performed extremely badly (see figure \ref{fig:heat-map-test}; figure \ref{fig:heat-map-test-noff} shows the same results with the Firefox \emph{ocamljs} series omitted).
\begin{figure}
\includegraphics{charts/heat-map-get.pdf}
\caption{Heat Map tests}
\label{fig:heat-map-test}
\end{figure}
\begin{figure}
\includegraphics{charts/heat-map-get-noff.pdf}
\caption{Heat Map tests (without Firefox \emph{ocamljs})}
\label{fig:heat-map-test-noff}
\end{figure}
\subsection{Interpretation of Results}

The first thing to note is that in all cases Chrome's V8 was faster than Mozilla's SpiderMonkey. This is unsurprising, since the V8 project page claims that it is faster than SpiderMonkey. The outcome obviously depends on the benchmark used, but under these tests V8 does appear to be faster~\cite{bib:v8-proj,bib:are-we-fast}.

Generally the JavaScript implementations perform better than the \emph{ocamljs} ones. In the Heat Map and Dataset Graph tests the JavaScript versions are around four times faster. The differences are less pronounced for the Log Viewer under SpiderMonkey and, surprisingly, \emph{ocamljs} is in fact faster than the hand-written JavaScript when running the Log Viewer under V8.

The compilation from OCaml and the additional bookkeeping required by \emph{froc} are likely to add overhead, so the \emph{ocamljs} implementations are expected to be slower than the JavaScript ones. Explaining why the \emph{ocamljs} Log Viewer is faster is difficult, especially since the relative speed of the two applications is reversed on the other JavaScript engine.

The extra tests for the Log Viewer on Google Chrome (figures \ref{fig:log-viewer-scripting} \& \ref{fig:log-viewer-rendering}) show that the rendering times for the two implementations were roughly the same, but that the scripting times were consistently different. This could be for several reasons: either the V8 engine was particularly good at optimising the code for the \emph{ocamljs} Log Viewer, or the code for the JavaScript application was computationally less efficient. The fact that the JavaScript solution was faster than the \emph{ocamljs} one on SpiderMonkey suggests that it was probably the former, and that, by chance, V8 could successfully optimise the \emph{ocamljs} compiled code.
\section{Qualitative Evaluation}

As opposed to quantitative testing, which concentrates on measured facts and numbers, qualitative evaluation focuses on the developer's experience of building the applications using the two different methods. This section gives a brief account of how the different programming styles of OCaml and JavaScript \emph{performed} during development compared to each other.
\subsection{ocamljs vs JavaScript}

An important factor in comparing these two methods of producing web applications is how long it takes a developer to write the same application with each. A full user study, in which a large group of programmers with varying levels of experience coded up many web applications in both OCaml and JavaScript, would be time consuming and beyond the scope of this project. However, with a fair amount of experience in writing web applications in JavaScript and some experience in writing commercial OCaml code, I feel I am in a good position to comment on this topic.

When writing the OCaml, I found the going fairly slow to begin with. The code has to be written with a view to it running in an imperative way. Accessing the style methods (such as those for changing an element's colour or position) was wordy and cumbersome. One advantage, however, was that the compiler's type checker found a lot of the programming errors. Although this was frustrating at the time, the same errors would have been much harder to find had they been discovered at run-time in JavaScript code. There were a few occasions when the compilation succeeded but the application did not work (for example the JSON float typing pitfall in section \ref{lab:json-pitfall}). These were time consuming to debug, because after compilation the code looked very different to the OCaml source: the variables had been renamed and all the functions had been moved into variables so they could use the eval-apply closures (section \ref{lab:eval-apply}).

I found writing the JavaScript implementations much easier. This could be because I knew exactly what algorithms the programs should use, how they should look and how they should behave. It could also be down to prior experience of writing applications in JavaScript, or it could simply be that these applications make more sense written in procedural JavaScript. It is probably a combination of all three factors that made producing the JavaScript applications faster.
\subsection{Was froc useful?}

\emph{froc} was used in all three applications (although not in the JavaScript implementations). In all three it was used to position elements automatically and to update the display when necessary. It was useful in the Log Viewer together with the \emph{froc-list} extension (see section \ref{lab:froc-list}): functions could be associated with the lists so that, when new elements were added, those functions were automatically bound to the new item and run. However, all of this could have been done with a single loop running the layout function on each element every time anything changed (which is exactly what \emph{froc} ended up doing), as the sketch below illustrates.
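For comparison, this is roughly what that single loop looks like in plain JavaScript; \emph{layoutMessage} and \emph{messages} are hypothetical names used only for illustration.

\begin{lstlisting}[caption={Plain-JavaScript equivalent of the froc-list layout (hypothetical names)}]
// Whenever anything changes, rerun the layout function over
// every element. layoutMessage and messages are placeholders.
function onAnythingChanged(messages) {
  for (var i = 0; i < messages.length; i++) {
    layoutMessage(messages[i], i); // position the i-th element
  }
}
\end{lstlisting}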
The other two applications used \emph{froc} in a slightly more useful way. There was a \emph{froc} \emph{behavior} (see section \ref{lab:behavior}) for each data point's x and y coordinates. When the time variable for the graph was updated, the new values were loaded into these \emph{behavior}s, so if a value did not change, no repositioning was done for that point. This does not affect the overall computational complexity, but it may make the application faster in the average case.
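The essential mechanism can be sketched in plain JavaScript as a cell that only notifies its listeners when its value actually changes. This is a conceptual illustration, not \emph{froc}'s API; the element id is invented for the example.

\begin{lstlisting}[caption={Change-propagating cell sketch (not froc's API)}]
// A cell that skips notification when set to its current value,
// so unchanged coordinates trigger no repositioning work.
function cell(initial) {
  var value = initial, listeners = [];
  return {
    set: function (v) {
      if (v === value) return; // unchanged: do nothing
      value = v;
      for (var i = 0; i < listeners.length; i++) listeners[i](v);
    },
    subscribe: function (fn) { listeners.push(fn); }
  };
}

// e.g. one cell per coordinate; the listener moves the element.
var point = document.getElementById("point0"); // hypothetical id
var x = cell(0);
x.subscribe(function (v) {
  if (point) point.style.left = v + "px"; // reposition
});
x.set(120); // repositions the element
x.set(120); // same value: no work done
\end{lstlisting}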
\section{Summary}

\emph{ocamljs} and \emph{froc} have much more expressive power than JavaScript: they bring functional and reactive programming styles to web application development. They do introduce an overhead, and with these tools more time is spent thinking about how to write the program than would be with JavaScript. However, less time is spent debugging, because the type checker picks up most programming errors. There is also the advantage that the OCaml compiler checks all parts of the code. JavaScript, by contrast, will be loaded as long as it can be parsed, but there is no guarantee that every part of the code has been exercised by a test run (unless a tool such as \emph{JSCoverage}~\cite{bib:jscover} is used).