/3rd_party/llvm/docs/HistoricalNotes/2000-11-18-EarlyDesignIdeasResp.txt

  1. Date: Sun, 19 Nov 2000 16:23:57 -0600 (CST)
  2. From: Chris Lattner <sabre@nondot.org>
  3. To: Vikram Adve <vadve@cs.uiuc.edu>
  4. Subject: Re: a few thoughts
  5. Okay... here are a few of my thoughts on this (it's good to know that we
  6. think so alike!):
  7. > 1. We need to be clear on our goals for the VM. Do we want to emphasize
  8. > portability and safety like the Java VM? Or shall we focus on the
  9. > architecture interface first (i.e., consider the code generation and
  10. > processor issues), since the architecture interface question is also
  11. > important for portable Java-type VMs?
  12. I foresee the architecture looking kinda like this: (which is completely
  13. subject to change)
  14. 1. The VM code is NOT guaranteed safe in a Java sense. Doing so makes it
  15. basically impossible to support C like languages. Besides that,
  16. certifying a register based language as safe at run time would be a
  17. pretty expensive operation to have to do. Additionally, we would like
  18. to be able to statically eliminate many bounds checks in Java
  19. programs... for example.
  20. 2. Instead, we can do the following (eventually):
  21. * Java bytecode is used as our "safe" representation (to avoid
  22. reinventing something that we don't add much value to). When the
  23. user chooses to execute Java bytecodes directly (i.e., not
  24. precompiled) the runtime compiler can do some very simple
  25. transformations (JIT style) to convert it into valid input for our
  26. VM. Performance is not wonderful, but it works right.
  27. * The file is scheduled to be compiled (rigorously) at a later
  28. time. This could be done by some background process or by a second
  29. processor in the system during idle time or something...
  30. * To keep things "safe", i.e. to enforce a sandbox on Java/foreign code,
  31. we could sign the generated VM code with a host specific private
  32. key. Then before the code is executed/loaded, we can check to see if
  33. the trusted compiler generated the code. This would be much quicker
  34. than having to validate consistency (especially if bounds checks have
  35. been removed, for example)
  36. > This is important because the audiences for these two goals are very
  37. > different. Architects and many compiler people care much more about
  38. > the second question. The Java compiler and OS community care much more
  39. > about the first one.
  40. 3. By focusing on a more low level virtual machine, we have much more room
  41. for value add. The nice safe "sandbox" VM can be provided as a layer
  42. on top of it. It also lets us focus on the more interesting compilers
  43. related projects.
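The signing idea in point 2 can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses a shared-secret HMAC from the Python standard library as a stand-in for the "host specific private key" (a real design would more likely use an asymmetric signature), and the names HOST_KEY, sign_vm_code and load_vm_code are hypothetical.

```python
import hashlib
import hmac

# Hypothetical host-specific secret. The email only says "host specific
# private key"; an HMAC key is used here as a simple stand-in.
HOST_KEY = b"example-host-secret"


def sign_vm_code(code: bytes, key: bytes = HOST_KEY) -> bytes:
    """Return the tag the trusted compiler attaches to generated VM code."""
    return hmac.new(key, code, hashlib.sha256).digest()


def load_vm_code(code: bytes, tag: bytes, key: bytes = HOST_KEY) -> bytes:
    """Accept the code only if its tag matches; otherwise refuse to load it."""
    expected = hmac.new(key, code, hashlib.sha256).digest()
    # compare_digest avoids leaking how many leading bytes matched.
    if not hmac.compare_digest(expected, tag):
        raise ValueError("VM code was not produced by the trusted compiler")
    return code
```

The check costs one hash over the code, which is the point made above: it should be much cheaper than re-validating the code's consistency at load time.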
  44. > 2. Design issues to consider (an initial list that we should continue
  45. > to modify). Note that I'm not trying to suggest actual solutions here,
  46. > but just various directions we can pursue:
  47. Understood. :)
  48. > a. A single-assignment VM, which we've both already been thinking
  49. > about.
  50. Yup, I think that this makes a lot of sense. I am still intrigued,
  51. however, by the prospect of a minimally allocated VM representation... I
  52. think that it could have definite advantages for certain applications
  53. (think very small machines, like PDAs). I don't, however, think that our
  54. initial implementations should focus on this. :)
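For concreteness, here is a minimal sketch of what "single-assignment" means for one straight-line block of three-address code. The tuple layout and the name to_single_assignment are assumptions made for illustration, not a proposed VM format, and merging values across branches is left out entirely.

```python
from typing import Dict, List, Tuple

# One three-address instruction: (destination, operator, operand_a, operand_b).
Instr = Tuple[str, str, str, str]


def to_single_assignment(block: List[Instr]) -> List[Instr]:
    """Rename destinations so that every name is assigned exactly once.

    Handles a single straight-line block only; control-flow merges are
    deliberately out of scope for this sketch.
    """
    version: Dict[str, int] = {}  # how many times each variable was assigned
    current: Dict[str, str] = {}  # variable -> its latest renamed form

    def use(operand: str) -> str:
        # Operands never assigned in this block keep their original spelling.
        return current.get(operand, operand)

    out: List[Instr] = []
    for dest, op, a, b in block:
        a, b = use(a), use(b)  # read operands before redefining dest
        version[dest] = version.get(dest, 0) + 1
        new_name = f"{dest}.{version[dest]}"
        current[dest] = new_name
        out.append((new_name, op, a, b))
    return out
```

Given x = a + b; x = x * c; y = x + x, the two assignments to x become x.1 and x.2, and the final instruction reads x.2 twice.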
  55. Here are some other auxiliary goals that I think we should consider:
  56. 1. Primary goal: Support a high performance dynamic compilation
  57. system. This means that we have an "ideal" division of labor between
  58. the runtime and static compilers. Of course, the other goals of the
  59. system somewhat reduce the importance of this point (f.e. portability
  60. reduces performance, but hopefully not much)
  61. 2. Portability to different processors. Since we are most familiar with
  62. x86 and Solaris, I think that these two are excellent candidates when
  63. we get that far...
  64. 3. Support for all languages & styles of programming (general purpose
  65. VM). This is the point that disallows Java style bytecodes, where all
  66. array refs are checked for bounds, etc...
  67. 4. Support linking between different language families. For example, call
  68. C functions directly from Java without using the nasty/slow/gross JNI
  69. layer. This involves several subpoints:
  70. A. Support for languages that require garbage collectors and integration
  71. with languages that don't. As a base point, we could insist on
  72. always using a conservative GC, but implement free as a noop, f.e.
  73. > b. A strongly-typed VM. One question is do we need the types to be
  74. > explicitly declared or should they be inferred by the dynamic
  75. > compiler?
  76. B. This is kind of similar to another idea that I have: make OOP
  77. constructs (virtual function tables, class hierarchies, etc) explicit
  78. in the VM representation. I believe that the number of additional
  79. constructs would be fairly low, but would give us lots of important
  80. information... something else that would/could be important is to
  81. have exceptions as first class types so that they would be handled in
  82. a uniform way for the entire VM... so that C functions can call Java
  83. functions for example...
  84. > c. How do we get more high-level information into the VM while keeping
  85. > to a low-level VM design?
  86. > o Explicit array references as operands? An alternative is
  87. > to have just an array type, and let the index computations be
  88. > separate 3-operand instructions.
  89. C. In the model I was thinking of (subject to change of course), we
  90. would just have an array type (distinct from the pointer
  91. types). This would allow us to have arbitrarily complex index
  92. expressions, while still distinguishing "load" from "Array load",
  93. for example. Perhaps also, switch jump tables would be first class
  94. types as well? This would allow better reasoning about the program.
  95. 5. Support dynamic loading of code from various sources. Already
  96. mentioned above was the example of loading Java bytecodes, but we want
  97. to support dynamic loading of VM code as well. This makes the job of
  98. the runtime compiler much more interesting: it can do interprocedural
  99. optimizations that the static compiler can't do, because it doesn't
  100. have all of the required information (for example, inlining from
  101. shared libraries, etc...)
  102. 6. Define a set of generally useful annotations to add to the VM
  103. representation. For example, a function can be analysed to see if it
  104. has any side effects when run... also, the MOD/REF sets could be
  105. calculated, etc... we would have to determine what is reasonable. This
  106. would generally be used to make IP optimizations cheaper for the
  107. runtime compiler...
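As a rough illustration of the annotations in point 6, the sketch below computes per-function MOD/REF sets (globals possibly written / possibly read, including through callees) by iterating to a fixed point. The Func record and the summarize name are hypothetical, and the sketch assumes every callee is present in the input table.

```python
from typing import Dict, FrozenSet, NamedTuple, Set, Tuple


class Func(NamedTuple):
    reads: FrozenSet[str]   # globals the function reads directly
    writes: FrozenSet[str]  # globals the function writes directly
    calls: FrozenSet[str]   # functions it calls directly


def summarize(funcs: Dict[str, Func]) -> Tuple[Dict[str, Set[str]], Dict[str, Set[str]]]:
    """Return (mod, ref): for each function, what it may write and may read.

    Assumes every callee appears in funcs. Recursion is fine: the sets only
    grow, so the loop stops once nothing changes.
    """
    mod = {name: set(f.writes) for name, f in funcs.items()}
    ref = {name: set(f.reads) for name, f in funcs.items()}
    changed = True
    while changed:
        changed = False
        for name, f in funcs.items():
            for callee in f.calls:
                new_mod = mod[callee] - mod[name]
                new_ref = ref[callee] - ref[name]
                if new_mod or new_ref:
                    mod[name] |= new_mod
                    ref[name] |= new_ref
                    changed = True
    return mod, ref
```

A function whose MOD set comes out empty writes no global state under this model, which is the kind of fact a runtime compiler could read from an annotation instead of recomputing.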
  108. > o Explicit instructions to handle aliasing, e.g.:
  109. > -- an instruction to say "I speculate that these two values are not
  110. > aliased, but check at runtime", like speculative execution in
  111. > EPIC?
  112. > -- or an instruction to check whether two values are aliased and
  113. > execute different code depending on the answer, somewhat like
  114. > predicated code in EPIC
  115. These are also very good points... if this can be determined at compile
  116. time. I think that an EPIC style of representation (not the instruction
  117. packing, just the information presented) could be a very interesting model
  118. to use... more later...
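The second suggestion quoted above (check whether two values are aliased and run different code depending on the answer) can be pictured with a small sketch. It is written in Python purely for illustration: object identity stands in for the address comparison a VM instruction would perform, and scale_in_place is a hypothetical name.

```python
from typing import List


def scale_in_place(values: List[int], factor_cell: List[int]) -> None:
    """Multiply every element of values by factor_cell[0], in place."""
    if values is not factor_cell:
        # Not aliased: the factor cannot change during the loop,
        # so it is safe to load it once up front.
        factor = factor_cell[0]
        for i in range(len(values)):
            values[i] *= factor
    else:
        # Aliased: writing values[0] changes the factor itself, so the
        # conservative version reloads it on every iteration.
        for i in range(len(values)):
            values[i] *= factor_cell[0]
```

Both paths give the same result as the naive loop; the check only decides whether the optimized version is legal to run.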
  119. > o (This one is a difficult but powerful idea.)
  120. > A "thread-id" field on every instruction that allows the static
  121. > compiler to generate a set of parallel threads, and then have
  122. > the runtime compiler and hardware do what they please with it.
  123. > This has very powerful uses, but thread-id on every instruction
  124. > is expensive in terms of instruction size and code size.
  125. > We would need to compactly encode it somehow.
  126. Yes yes yes! :) I think it would be *VERY* useful to include this kind
  127. of information (which EPIC architectures *implicitly* encode). The trend
  128. that we are seeing supports this greatly:
  129. 1. Commodity processors are getting massive SIMD support:
  130. * Intel/AMD MMX/MMX2
  131. * AMD's 3DNow!
  132. * Intel's SSE/SSE2
  133. * Sun's VIS
  134. 2. SMP is becoming much more common, especially in the server space.
  135. 3. Multiple processors on a die are right around the corner.
  136. If nothing else, not designing this in would severely limit our future
  137. expansion of the project...
  138. > Also, this will require some reading on at least two other
  139. > projects:
  140. > -- Multiscalar architecture from Wisconsin
  141. > -- Simultaneous multithreading architecture from Washington
  142. >
  143. > o Or forget all this and stick to a traditional instruction set?
  144. Heh... :) Well, from a pure research point of view, it is almost more
  145. attractive to go with the most extreme/different ISA possible. On one axis
  146. you get safety and conservatism, and on the other you get degree of
  147. influence that the results have. Of course the problem with pure research
  148. is that often times there is no concrete product of the research... :)
  149. > BTW, on an unrelated note, after the meeting yesterday, I did remember
  150. > that you had suggested doing instruction scheduling on SSA form instead
  151. > of a dependence DAG earlier in the semester. When we talked about
  152. > it yesterday, I didn't remember where the idea had come from but I
  153. > remembered later. Just giving credit where it's due...
  154. :) Thanks.
  155. > Perhaps you can save the above as a file under RCS so you and I can
  156. > continue to expand on this.
  157. I think it makes sense to do so when we get our ideas more formalized and
  158. bounce it back and forth a couple of times... then I'll do a more formal
  159. writeup of our goals and ideas. Obviously our first implementation will
  160. not want to do all of the stuff that I pointed out above... but we will
  161. want to design the project so that we do not artificially limit ourselves
  162. at some time in the future...
  163. Anyways, let me know what you think about these ideas... and if they sound
  164. reasonable...
  165. -Chris