\chapter{System description}
\label{chap:impl}
This chapter discusses the implementation of the project with
references to the source tree (figure \ref{fig:overview}).
Note that the source tree has been refactored by E.W. Thomassen
to fit better with other PyPy projects. The chapter describes how
and where the specific functionality is implemented. The current
implementation has a number of problems, which are discussed in
chapter \ref{chap:conc}.
\section{Toplevel}
The toplevel of the source tree contains a "readme" file, a folder
called "pyhaskell" where the main interpreter functionality is
implemented, and a file called "targethaskellstandalone.py" that
defines the target and entry point for the PyPy translation tool.
\section{The main functionality}
The "pyhaskell" folder contains three main programs: "main.py",
"makegraph.py" and "run\_tests.py".

"main.py" is the entry point of the compilation system. It defines
the target for the RPython translation tool and imports the \emph{builtin}
functionality; its main function begins the interpretation of the
JSCore program.

The program "makegraph.py" contains a JSON parser and dumps the resulting
JSON tokens to a dot file, from which a graph can be generated using the
graphviz tool.

The "runtests.py" program executes all the programs
in the sub folder "test" and then prints the results of the tests.
\begin{sidewaysfigure}
\begin{figure}[H]
\centering
\includegraphics[width=\textheight]{../diags/overview.png}
\caption{Overview of source tree}
\label{fig:overview}
\end{figure}
\end{sidewaysfigure}
The sub folder "interpreter" contains the actual
interpreter code, the parser and the module system implementation. These are
discussed in the following sections.
\subsection*{Interpreter}
The file "haskell.py" in the folder "interpreter" contains the
Haskell-Python interpreter, the base of the compilation system.
This subsection describes the key parts of the Haskell-Python implementation.
Haskell-Python consists of a few base classes:
\begin{itemize}
\item \emph{Symbol} contains a static dictionary of all Symbols. It is
used only to compare "names" by their object identity.
\item \emph{HaskellObject} is the base class for all objects handled by the
interpreter.
\item \emph{Value} is the base class for evaluated values.
\item \emph{Constructor} inherits from \emph{Value} and is an abstract
base class for constructors with different numbers of arguments.
\item \emph{ConstructorN} inherits from \emph{Constructor} and is used as the
base for a number of other Constructor classes created by a function called
\emph{make\_arg\_subclass}. That function is in turn called by
\emph{make\_constructor}, which creates a Constructor based on the length
of the list it receives as argument. \emph{make\_constructor} is itself
called by \emph{constr}, which takes the name argument (a string)
and looks it up among the known symbols. The
actual Constructor is therefore created by a call to \emph{constr(name, *args)}.
\item \emph{AbstractFunction} inherits from \emph{Value} and is the base class for
the functions of Haskell-Python.
\item \emph{Function} inherits from \emph{AbstractFunction} and represents a user-defined
Haskell-Python function; it contains a name and a list of \emph{Rules}.
\item \emph{Rule} is a field of the \emph{Function}. It consists of a list of patterns and
an expression. If the function is applied and its arguments match the patterns, the
expression is evaluated.
\item \emph{Substitution} inherits from \emph{HaskellObject} and is the body of a function
with its numbered variables substituted by values.
\item \emph{PrimFunction} inherits from \emph{AbstractFunction} and describes a function
implemented at the machine level. A function called \emph{expose\_primitive} can
be used as a Python decorator to create \emph{PrimFunction} objects.
\item \emph{Var} inherits from \emph{HaskellObject} and describes a Haskell variable.
\item \emph{NumberedVar} inherits from \emph{HaskellObject}. When a \emph{Rule} is created,
each \emph{Var} is replaced by a \emph{NumberedVar} through a call to the method
\emph{enumerate\_head}, which is inherited from \emph{HaskellObject}.
\item \emph{Application} inherits from \emph{HaskellObject} and is an
abstract base class for Haskell-Python function application. As with the constructors,
the concrete classes for various numbers of arguments are created by a function.
\item \emph{ApplicationN} inherits from \emph{Application} and is used by the function
\emph{make\_application} to create Application classes with various numbers of arguments.
Like \emph{make\_constructor}, this function relies on
\emph{make\_arg\_subclass} to generate the classes.
\item \emph{Thunk} inherits from \emph{HaskellObject} and represents an unevaluated
function application.
\item \emph{StackElement} is a base class for the elements of the evaluation stack.
\item \emph{CopyStackElement} inherits from \emph{StackElement} and contains an
\emph{Application}.
\item \emph{UpdateStackElement} inherits from \emph{StackElement} and contains a
\emph{Thunk}. The \emph{Thunk} contained in the object is updated after its
content has been evaluated.
\end{itemize}
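The chain from \emph{constr} through \emph{make\_constructor} to \emph{make\_arg\_subclass} can be sketched as follows. This is a minimal, self-contained illustration and not the code in "haskell.py": the accessor \emph{Symbol.get}, the constant \emph{MAX\_SPECIALISED}, the list \emph{SPECIALISED} and all method bodies are assumptions made for the example.

```python
class Symbol(object):
    """A name that is compared by object identity."""
    all_symbols = {}  # static dictionary shared by every Symbol

    def __init__(self, name):
        self.name = name

    @staticmethod
    def get(name):
        # Hypothetical accessor: reuse the Symbol of an already known name.
        symbol = Symbol.all_symbols.get(name)
        if symbol is None:
            symbol = Symbol.all_symbols[name] = Symbol(name)
        return symbol


class Constructor(object):
    """Abstract base class for constructors of any arity."""

    def __init__(self, symbol):
        self.symbol = symbol


class ConstructorN(Constructor):
    """Generic constructor that keeps its arguments in a list."""

    def __init__(self, symbol, args):
        Constructor.__init__(self, symbol)
        self.args = list(args)

    def getargs(self):
        return list(self.args)


def make_arg_subclass(n, base):
    """Create a subclass of `base` that accepts exactly n arguments."""
    class Specialised(base):
        def __init__(self, symbol, args):
            assert len(args) == n
            base.__init__(self, symbol, args)
    Specialised.__name__ = "Constructor%d" % n
    return Specialised


MAX_SPECIALISED = 4  # assumption: arities 0..3 get a class of their own
SPECIALISED = [make_arg_subclass(n, ConstructorN)
               for n in range(MAX_SPECIALISED)]


def make_constructor(symbol, args):
    """Pick the class that matches the length of `args`."""
    if len(args) < MAX_SPECIALISED:
        return SPECIALISED[len(args)](symbol, args)
    return ConstructorN(symbol, args)


def constr(name, *args):
    """Look up the Symbol for `name` and build the Constructor."""
    return make_constructor(Symbol.get(name), list(args))
```

In this sketch two constructors built from the same name share one \emph{Symbol} object, so their names can be compared with an identity check instead of a string comparison.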
In addition to these classes, the most important functions are:
\begin{itemize}
\item \emph{expose\_primitive}, like the following two functions, has already been
mentioned. It can be used as a Python decorator.
\item \emph{constr} creates a \emph{Constructor} object by looking up the "name" in
the dictionary of the \emph{Symbol} class and selecting the correct sub-class from the
sub-classes generated for \emph{Constructor}.
\item \emph{make\_application} is similar to \emph{make\_constructor}, but needs no
wrapper function because it does not have to look up a \emph{Symbol}.
\item \emph{main\_loop} is the main function; it reduces a Haskell-Python
\emph{Application} to a \emph{Value}. The function \emph{evaluate\_hnf} is
simply a wrapper for \emph{main\_loop}.
\end{itemize}
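The role of \emph{UpdateStackElement} during evaluation can be illustrated with a much simplified loop. The sketch below is not \emph{main\_loop}: it assumes that a \emph{Thunk} wraps a plain Python callable, and the names \emph{compute}, \emph{result}, \emph{payload} and \emph{evaluate\_sketch} are hypothetical. It only shows the idea that a thunk is overwritten with its value once that value is known, so that it is never evaluated twice.

```python
class Value(object):
    """An evaluated value (stand-in for the interpreter's Value class)."""

    def __init__(self, payload):
        self.payload = payload


class Thunk(object):
    """An unevaluated computation that caches its result."""

    def __init__(self, compute):
        self.compute = compute  # callable returning a Value or a Thunk
        self.result = None      # filled in after the first evaluation


class UpdateStackElement(object):
    """Remembers which Thunk must be updated with the final value."""

    def __init__(self, thunk):
        self.thunk = thunk


def evaluate_sketch(obj):
    """Reduce `obj` to a Value using an explicit stack."""
    stack = []
    while True:
        if isinstance(obj, Thunk):
            if obj.result is not None:
                obj = obj.result          # already evaluated earlier
            else:
                stack.append(UpdateStackElement(obj))
                obj = obj.compute()       # may yield another Thunk
        elif stack:
            frame = stack.pop()
            frame.thunk.result = obj      # update the Thunk in place
            frame.thunk.compute = None
        else:
            return obj
```

Using an explicit stack instead of Python recursion keeps the depth of nested thunks independent of the host language's call stack.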
This subsection has described the Haskell-Python implementation in terms of
the classes and functions it consists of.
\subsection*{Primitive types}
The file "primtypes.py" in the folder "interpreter" contains the primitive types
for the Haskell-Python interpreter. The types implemented inherit from the
\emph{Value} class in "haskell.py".
The following primitive types are implemented:
\begin{itemize}
\item \emph{Char} represents a single Haskell character value.
\item \emph{Int} represents a Haskell Int value.
\item \emph{Addr} is a memory address. Among other things, it is used to represent
the String type.
\item \emph{Double} is a double precision floating point value.
\item \emph{Float} is a single precision floating point value.
\end{itemize}
Since these classes all inherit from the \emph{Value} class of Haskell-Python,
they have a common interface: each contains a field called \emph{value} that
holds its actual contents. This field is currently an ordinary Python
value, so it is not represented like its Haskell equivalent at the low level.
See listing \ref{lst:int1} for an example of how a primitive value is implemented.
\begin{figure}[H]
\lstset{ %
language=Python,
caption=Python class implementing the Haskell Int Value.,
label=lst:int1
}
\begin{lstlisting}
class Int(haskell.Value):
    _immutable_fields_ = ["value"]

    def __init__(self, integer):
        assert isinstance(integer, int)
        self.value = integer

    def match(self, other, subst):
        # Compare against the other object only if it is already evaluated.
        value = other.getvalue()
        if value:
            assert isinstance(value, Int)
            if self.value == value.value:
                return haskell.DEFINITE_MATCH
            return haskell.NO_MATCH
        # The other object must first be reduced to head normal form.
        return haskell.NEEDS_HNF

    def __eq__(self, other):
        return (isinstance(other, Int) and self.value == other.value)

    def __ne__(self, other):
        return not (self == other)

    def tostr(self):
        return str(self.value)
\end{lstlisting}
\end{figure}
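The three possible results of \emph{match} in listing \ref{lst:int1} can be tried out in isolation with the sketch below. It is a hedged stand-in and not the interpreter's code: the three constants are plain strings here, \emph{Unevaluated} is a hypothetical substitute for an argument that has not been evaluated yet, and \emph{getvalue} is assumed to return the object itself for evaluated values and \emph{None} otherwise.

```python
# Stand-ins for the constants that the listing reads from the haskell module.
DEFINITE_MATCH = "DEFINITE_MATCH"
NO_MATCH = "NO_MATCH"
NEEDS_HNF = "NEEDS_HNF"


class Value(object):
    def getvalue(self):
        # Assumption: an evaluated value is its own value.
        return self


class Unevaluated(object):
    """Hypothetical stand-in for an argument that is not yet evaluated."""

    def getvalue(self):
        return None


class Int(Value):
    def __init__(self, integer):
        assert isinstance(integer, int)
        self.value = integer

    def match(self, other, subst):
        value = other.getvalue()
        if value is None:
            return NEEDS_HNF  # the caller has to evaluate `other` first
        assert isinstance(value, Int)
        if self.value == value.value:
            return DEFINITE_MATCH
        return NO_MATCH


pattern = Int(3)
print(pattern.match(Int(3), None))
print(pattern.match(Int(4), None))
print(pattern.match(Unevaluated(), None))
```

The third result is what makes the pattern matching lazy: an argument is only forced when a pattern actually needs to inspect it.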
\subsection*{Modules}
The file "module.py" in the folder "interpreter" contains the basics of the
module system. It contains a single class, \emph{CoreMod}, which corresponds
to a Haskell module. A \emph{CoreMod} object contains a name and three dictionaries.
These dictionaries are called \emph{qvars}, \emph{qtycons} and \emph{qdcons}, and they
contain \emph{qualified variables}, \emph{qualified type constructors} and
\emph{qualified data constructors} respectively. When a \emph{CoreMod} object is created,
it is immediately added to the list of Haskell modules.
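A class with this shape could look roughly like the sketch below. It is an illustration under assumptions, not the contents of "module.py": in particular the registry is written here as a dictionary keyed by module name (the text above only speaks of a list), and the attribute name \emph{modules} is hypothetical.

```python
class CoreMod(object):
    """One Haskell module with its three name spaces."""

    modules = {}  # assumption: registry of all modules, keyed by name

    def __init__(self, name):
        self.name = name
        self.qvars = {}    # qualified variables
        self.qtycons = {}  # qualified type constructors
        self.qdcons = {}   # qualified data constructors
        # Register the module as soon as it is created.
        CoreMod.modules[name] = self


num = CoreMod("GHC.Num")
num.qvars["-"] = "placeholder for the minus function"
print(sorted(CoreMod.modules))
```

Registering the object in the constructor means that code which only knows a module name can find the module and its definitions later on.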
\subsection*{JSCore parser}
The file "jscparser.py" in the "interpreter" folder contains the parser code.
This code is responsible for loading JSCore files and creating the AST based
on the interpreter and module-system implementations.

The parser implementation takes advantage of some of the parsing tools
available in the PyPy code base. By writing an EBNF grammar for JSON and giving it to a
function called \emph{parse\_ebnf}, a JSON parser is created.
The result is used through a base class called \emph{RPythonVisitor}, whose
subclasses implement \emph{visit} functions for the constructs defined by the
EBNF grammar. For JSON these are \emph{visit\_object},
\emph{visit\_number} etc.

As a result, there is a set of functions that get called when
visiting JSON constructs. Since JSCore is a JSON-compatible description of
External-Core, this lets us parse the Core programs by branching
in the visitor functions.
In this way, the mapping described in chapter \ref{chap:rewrite} is implemented.
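The visitor idea can be illustrated without the PyPy parsing tools. The sketch below uses Python's standard \emph{json} module in place of the generated parser and a hand-written \emph{dispatch} method in place of \emph{RPythonVisitor}; the class names \emph{JsonVisitor} and \emph{NumberCollector} as well as \emph{visit\_array} and \emph{visit\_other} are assumptions made for the example. Only \emph{visit\_object} and \emph{visit\_number} are names taken from the text.

```python
import json


class JsonVisitor(object):
    """Call one visit_* method per kind of JSON construct."""

    def dispatch(self, node):
        if isinstance(node, dict):
            return self.visit_object(node)
        if isinstance(node, list):
            return self.visit_array(node)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(node, (int, float)) and not isinstance(node, bool):
            return self.visit_number(node)
        return self.visit_other(node)

    def visit_object(self, node):
        for key in node:
            self.dispatch(node[key])

    def visit_array(self, node):
        for item in node:
            self.dispatch(item)

    def visit_number(self, node):
        pass

    def visit_other(self, node):
        pass


class NumberCollector(JsonVisitor):
    """Example subclass: remember every number in the document."""

    def __init__(self):
        self.numbers = []

    def visit_number(self, node):
        self.numbers.append(node)


collector = NumberCollector()
collector.dispatch(json.loads('{"a": [1, 2.5, {"b": 3}], "c": "x", "d": true}'))
print(collector.numbers)
```

A JSCore loader built in this style would override the visit functions to construct interpreter objects instead of collecting numbers.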
\section{Built-in functionality}
The "builtin" folder contains all the Haskell library functionality that
has been implemented in Python. Most of this functionality should have been
stored in the sub folder "ghc\_modules" as JSCore files. However, due to some
issues that are discussed in the conclusions and future work, this is not currently
the case.

As an example, the file "num.py" contains the implementation of the generic
Num class. Listing \ref{lst:zmnum} contains the implementation of the
generic minus function. Currently it only supports the Int type. Note that
it takes three arguments: the first is used to resolve which
function actually implements the minus operation, and the other two are the
arguments for that function.
\begin{figure}[H]
\lstset{ %
language=Python,
caption=Python function implementing the generic minus function.,
label=lst:zmnum
}
\begin{lstlisting}
@haskell.expose_primitive(3)
def zm(args):
    ty = args[0]
    a = args[1]
    b = args[2]
    if ty == mod.qvars["$fNumInt"]:
        return haskell.make_partial_app(izhconstr,
            [haskell.make_partial_app(prim.zmzh, [a.getarg(0), b.getarg(0)])])
    else:
        raise NotImplementedError

mod.qvars["-"] = zm
\end{lstlisting}
\end{figure}
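How a decorator with an arity argument can turn a plain Python function into a primitive, and how the first argument selects the implementation, is sketched below. This is a simplified stand-in and not the code in "haskell.py" or "num.py": the fields and the \emph{call} method of \emph{PrimFunction}, the marker object \emph{NUM\_INT} and the function \emph{minus} are all hypothetical, and plain Python integers take the place of Haskell-Python values.

```python
class PrimFunction(object):
    """A primitive function together with its expected argument count."""

    def __init__(self, name, arity, function):
        self.name = name
        self.arity = arity
        self.function = function

    def call(self, args):
        assert len(args) == self.arity
        return self.function(args)


def expose_primitive(arity):
    """Decorator factory: wrap the decorated function in a PrimFunction."""
    def decorator(function):
        return PrimFunction(function.__name__, arity, function)
    return decorator


# Hypothetical marker standing in for the Num instance of Int.
NUM_INT = object()


@expose_primitive(3)
def minus(args):
    instance, a, b = args
    # The first argument decides which implementation is used.
    if instance is NUM_INT:
        return a - b
    raise NotImplementedError("only the Int instance is sketched here")


print(minus.call([NUM_INT, 7, 2]))
```

After decoration the name \emph{minus} refers to a \emph{PrimFunction} object and no longer to the plain function, which is why it can be stored directly in a dictionary of definitions.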
\section{External-Core to JSCore}
The "core" folder contains the Haskell program responsible for generating the
JSCore intermediate format. This program is called "core2js". Currently, the
program uses the External-Core parser from the "extcore" Haskell package and
the JSON package to read the Haskell file in External-Core format and dump it
in JSCore format.
\section{GHC modules}
The folder "ghc\_modules" contains the GHC boot libraries. It also contains a
script that goes through all these Haskell modules and uses GHC with the
"-fext-core" flag to create External-Core files. It then uses "core2js" to create
JSCore files from the resulting External-Core files.

The GHC library files are organized exactly as in the GHC source tree. Each
folder in the "libraries" directory corresponds to a Haskell package. These
packages contain the Haskell modules, e.g. the module "GHC.Tuple" is located
in "ghc-prim/GHC/Tuple.hs".
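The mapping from a package and module name to a source path is mechanical, as the short sketch below shows. The function name \emph{module\_source\_path} is hypothetical, and the result is relative to the "libraries" directory.

```python
import posixpath


def module_source_path(package, module):
    """Turn a package and a dotted module name into a relative .hs path."""
    return posixpath.join(package, *module.split(".")) + ".hs"


print(module_source_path("ghc-prim", "GHC.Tuple"))
```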
\section{Tests}
Tests have been designed to exercise different parts of the interpreter.
The "test" directory contains one subdirectory for each test, and each
subdirectory contains a single Haskell file with the same name as the
subdirectory (other files are created and put in the directory when running
the test). For example, the test "fibonacci" is located at
"test/fibonacci/fibonacci.hs", and after running the test the "fibonacci"
folder should contain the additional files "fibonacci.hcr", "fibonacci.hi",
"fibonacci.o" and "fibonacci.hcj". ".hcj" is the extension used for JSCore
files.

The output from running the test script is:
\begin{figure}[H]
\lstset{ %
language=Python,
caption=Output from running the test script,
label=lst:testoutput
}
\begin{lstlisting}
-----------------------
Results (29.96 seconds)
-----------------------
case : Success
data : Success
factorial : Success
fibonacci : Success
helloworld : Success
let : Success
multiply : Success
partialapp : Success
\end{lstlisting}
\end{figure}
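The directory convention used by the tests can be expressed in a few lines. The sketch below is not the test script itself: the function names are hypothetical, the generated files are assumed to share the base name of the test, and the word "Failure" for an unsuccessful test is an assumption, since only successful results appear in the output above.

```python
import os

# Extensions of the files created next to the source when a test is run.
GENERATED_EXTENSIONS = [".hcr", ".hi", ".o", ".hcj"]


def discover_tests(test_root):
    """Return the names of all subdirectories holding <name>/<name>.hs."""
    names = []
    for name in sorted(os.listdir(test_root)):
        source = os.path.join(test_root, name, name + ".hs")
        if os.path.isfile(source):
            names.append(name)
    return names


def generated_files(test_root, name):
    """Paths of the files expected in a test directory after a run."""
    return [os.path.join(test_root, name, name + extension)
            for extension in GENERATED_EXTENSIONS]


def format_results(results):
    """Render one line per test from a {name: passed} dictionary."""
    return ["%s : %s" % (name, "Success" if passed else "Failure")
            for name, passed in sorted(results.items())]
```

Because a test is identified purely by its directory name, adding a new test only requires creating a directory with a Haskell file of the same name.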