/3rd_party/llvm/docs/HistoricalNotes/2000-11-18-EarlyDesignIdeasResp.txt
https://code.google.com/p/softart/ · Plain Text · 199 lines · 165 code · 34 blank · 0 comment · 0 complexity · f3666a05a60a7ec6a9f7985d77ec7076 MD5 · raw file
- Date: Sun, 19 Nov 2000 16:23:57 -0600 (CST)
- From: Chris Lattner <sabre@nondot.org>
- To: Vikram Adve <vadve@cs.uiuc.edu>
- Subject: Re: a few thoughts
-
- Okay... here are a few of my thoughts on this (it's good to know that we
- think so alike!):
-
- > 1. We need to be clear on our goals for the VM. Do we want to emphasize
- > portability and safety like the Java VM? Or shall we focus on the
- > architecture interface first (i.e., consider the code generation and
- > processor issues), since the architecture interface question is also
- > important for portable Java-type VMs?
-
- I forsee the architecture looking kinda like this: (which is completely
- subject to change)
-
- 1. The VM code is NOT guaranteed safe in a java sense. Doing so makes it
- basically impossible to support C like languages. Besides that,
- certifying a register based language as safe at run time would be a
- pretty expensive operation to have to do. Additionally, we would like
- to be able to statically eliminate many bounds checks in Java
- programs... for example.
-
- 2. Instead, we can do the following (eventually):
- * Java bytecode is used as our "safe" representation (to avoid
- reinventing something that we don't add much value to). When the
- user chooses to execute Java bytecodes directly (ie, not
- precompiled) the runtime compiler can do some very simple
- transformations (JIT style) to convert it into valid input for our
- VM. Performance is not wonderful, but it works right.
- * The file is scheduled to be compiled (rigorously) at a later
- time. This could be done by some background process or by a second
- processor in the system during idle time or something...
- * To keep things "safe" ie to enforce a sandbox on Java/foreign code,
- we could sign the generated VM code with a host specific private
- key. Then before the code is executed/loaded, we can check to see if
- the trusted compiler generated the code. This would be much quicker
- than having to validate consistency (especially if bounds checks have
- been removed, for example)
-
- > This is important because the audiences for these two goals are very
- > different. Architects and many compiler people care much more about
- > the second question. The Java compiler and OS community care much more
- > about the first one.
-
- 3. By focusing on a more low level virtual machine, we have much more room
- for value add. The nice safe "sandbox" VM can be provided as a layer
- on top of it. It also lets us focus on the more interesting compilers
- related projects.
-
- > 2. Design issues to consider (an initial list that we should continue
- > to modify). Note that I'm not trying to suggest actual solutions here,
- > but just various directions we can pursue:
-
- Understood. :)
-
- > a. A single-assignment VM, which we've both already been thinking
- > about.
-
- Yup, I think that this makes a lot of sense. I am still intrigued,
- however, by the prospect of a minimally allocated VM representation... I
- think that it could have definite advantages for certain applications
- (think very small machines, like PDAs). I don't, however, think that our
- initial implementations should focus on this. :)
-
- Here are some other auxiliary goals that I think we should consider:
-
- 1. Primary goal: Support a high performance dynamic compilation
- system. This means that we have an "ideal" division of labor between
- the runtime and static compilers. Of course, the other goals of the
- system somewhat reduce the importance of this point (f.e. portability
- reduces performance, but hopefully not much)
- 2. Portability to different processors. Since we are most familiar with
- x86 and solaris, I think that these two are excellent candidates when
- we get that far...
- 3. Support for all languages & styles of programming (general purpose
- VM). This is the point that disallows java style bytecodes, where all
- array refs are checked for bounds, etc...
- 4. Support linking between different language families. For example, call
- C functions directly from Java without using the nasty/slow/gross JNI
- layer. This involves several subpoints:
- A. Support for languages that require garbage collectors and integration
- with languages that don't. As a base point, we could insist on
- always using a conservative GC, but implement free as a noop, f.e.
-
- > b. A strongly-typed VM. One question is do we need the types to be
- > explicitly declared or should they be inferred by the dynamic
- > compiler?
-
- B. This is kind of similar to another idea that I have: make OOP
- constructs (virtual function tables, class heirarchies, etc) explicit
- in the VM representation. I believe that the number of additional
- constructs would be fairly low, but would give us lots of important
- information... something else that would/could be important is to
- have exceptions as first class types so that they would be handled in
- a uniform way for the entire VM... so that C functions can call Java
- functions for example...
-
- > c. How do we get more high-level information into the VM while keeping
- > to a low-level VM design?
- > o Explicit array references as operands? An alternative is
- > to have just an array type, and let the index computations be
- > separate 3-operand instructions.
-
- C. In the model I was thinking of (subject to change of course), we
- would just have an array type (distinct from the pointer
- types). This would allow us to have arbitrarily complex index
- expressions, while still distinguishing "load" from "Array load",
- for example. Perhaps also, switch jump tables would be first class
- types as well? This would allow better reasoning about the program.
-
- 5. Support dynamic loading of code from various sources. Already
- mentioned above was the example of loading java bytecodes, but we want
- to support dynamic loading of VM code as well. This makes the job of
- the runtime compiler much more interesting: it can do interprocedural
- optimizations that the static compiler can't do, because it doesn't
- have all of the required information (for example, inlining from
- shared libraries, etc...)
-
- 6. Define a set of generally useful annotations to add to the VM
- representation. For example, a function can be analysed to see if it
- has any sideeffects when run... also, the MOD/REF sets could be
- calculated, etc... we would have to determine what is reasonable. This
- would generally be used to make IP optimizations cheaper for the
- runtime compiler...
-
- > o Explicit instructions to handle aliasing, e.g.s:
- > -- an instruction to say "I speculate that these two values are not
- > aliased, but check at runtime", like speculative execution in
- > EPIC?
- > -- or an instruction to check whether two values are aliased and
- > execute different code depending on the answer, somewhat like
- > predicated code in EPIC
-
- These are also very good points... if this can be determined at compile
- time. I think that an epic style of representation (not the instruction
- packing, just the information presented) could be a very interesting model
- to use... more later...
-
- > o (This one is a difficult but powerful idea.)
- > A "thread-id" field on every instruction that allows the static
- > compiler to generate a set of parallel threads, and then have
- > the runtime compiler and hardware do what they please with it.
- > This has very powerful uses, but thread-id on every instruction
- > is expensive in terms of instruction size and code size.
- > We would need to compactly encode it somehow.
-
- Yes yes yes! :) I think it would be *VERY* useful to include this kind
- of information (which EPIC architectures *implicitly* encode. The trend
- that we are seeing supports this greatly:
-
- 1. Commodity processors are getting massive SIMD support:
- * Intel/Amd MMX/MMX2
- * AMD's 3Dnow!
- * Intel's SSE/SSE2
- * Sun's VIS
- 2. SMP is becoming much more common, especially in the server space.
- 3. Multiple processors on a die are right around the corner.
-
- If nothing else, not designing this in would severely limit our future
- expansion of the project...
-
- > Also, this will require some reading on at least two other
- > projects:
- > -- Multiscalar architecture from Wisconsin
- > -- Simultaneous multithreading architecture from Washington
- >
- > o Or forget all this and stick to a traditional instruction set?
-
- Heh... :) Well, from a pure research point of view, it is almost more
- attactive to go with the most extreme/different ISA possible. On one axis
- you get safety and conservatism, and on the other you get degree of
- influence that the results have. Of course the problem with pure research
- is that often times there is no concrete product of the research... :)
-
- > BTW, on an unrelated note, after the meeting yesterday, I did remember
- > that you had suggested doing instruction scheduling on SSA form instead
- > of a dependence DAG earlier in the semester. When we talked about
- > it yesterday, I didn't remember where the idea had come from but I
- > remembered later. Just giving credit where its due...
-
- :) Thanks.
-
- > Perhaps you can save the above as a file under RCS so you and I can
- > continue to expand on this.
-
- I think it makes sense to do so when we get our ideas more formalized and
- bounce it back and forth a couple of times... then I'll do a more formal
- writeup of our goals and ideas. Obviously our first implementation will
- not want to do all of the stuff that I pointed out above... be we will
- want to design the project so that we do not artificially limit ourselves
- at sometime in the future...
-
- Anyways, let me know what you think about these ideas... and if they sound
- reasonable...
-
- -Chris