/docs/IFAQ.txt


(Copyright 2006 Sriram Srinivasan)
Kilim IFAQ: Infrequently Asked Questions. Kilim v 1.0
-- sriram srinivasan (Kilim _at_ malhar.net)
======================================================================
Why is multi-threaded programming considered so hard?
======================================================================
It is relatively easy to get thread programming correct (to a first
approximation) by synchronizing all your shared data structures and
taking locks in the right order.
You could have one giant lock and just do things one at a time (like
the current Python interpreter with its Global Interpreter Lock).
Clearly, this is not efficient. Increasing concurrent access to a
data structure (by using finer-grained locks) is what makes it
error-prone and hard to debug.
======================================================================
Kilim uses kernel threads. Where do tasks and threads meet?
======================================================================
Kilim's tasks are cooperatively scheduled on a kernel thread pool.
Tasks are needed when you want to split up your workflow into small
stages and write code as if it were blocking (instead of writing a
callback and having to jump to that function when it gets called).
Ideally, tasks should not make thread-blocking calls, although if you
*have* to call one, it is not the end of the world. That's what other
threads are for .. they'll take care of the other tasks meanwhile.
A Kilim task is owned and managed by a scheduler, which manages the
thread pool. When a task needs to pause, it removes itself from the
thread by popping its call stack, remembering enough about each
activation frame to rebuild the stack and resume at a later point.
The scheduler then reuses that thread for some other task.
You can have more than one scheduler (read: thread pool) and assign
each task to a particular scheduler. See the bench directory for
examples.
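The pausing mechanism described above can be sketched, very roughly, in
plain Java. This is not Kilim's actual API or generated code; every name
below is invented for illustration. The idea is that a pause point saves
an activation frame and returns (freeing the kernel thread), and a later
resume() dispatches back to the saved position:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative only: a saved activation frame, recording where to
// resume and which live locals the analysis decided to keep.
class Frame {
    final int pc;
    final Object[] locals;
    Frame(int pc, Object... locals) { this.pc = pc; this.locals = locals; }
}

// Illustrative only: a task rewritten as a resumable state machine.
class PausableTask {
    final Deque<Frame> savedStack = new ArrayDeque<>();
    boolean done = false;

    // Each call runs until the next pause point, then returns,
    // letting the scheduler reuse this thread for another task.
    void resume() {
        int pc = savedStack.isEmpty() ? 0 : savedStack.pop().pc;
        switch (pc) {
            case 0:
                // ... first stage of work ...
                savedStack.push(new Frame(1)); // pause: remember where we are
                return;
            case 1:
                // ... second stage, after being rescheduled ...
                done = true;
        }
    }
}
```

Roughly speaking, Kilim's weaver performs this kind of rewriting on the
bytecode of pausable methods automatically, so source code keeps its
ordinary blocking shape.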
======================================================================
How lightweight is "lightweight"?
======================================================================
The amount of memory occupied by a task is:
1. The Java object that represents the task class
2. If paused, an array of activation frames. The Kilim
weaver performs data flow, live variable and constant analysis
(intra-procedurally) to ensure that it captures only as
much as is needed to resume.
3. The contents of all mailboxes that the task is receiving on.
Clearly, all of these depend on your application.
The depth of the task stack is limited only by the thread's stack; no
memory is preallocated. Note that when written in the message-passing
style, stacks tend not to be too deep because each task is like a
stage in a workflow, with its own stack.
======================================================================
What's the difference between channels in Ada, CSP/Occam, Newsqueak,
Alef etc. and Kilim's mailboxes?
======================================================================
Most of these languages use synchronous channels as their basic
construct, where a sending task can proceed only after the receiver
has received (or vice-versa).
Synchronous channels are easier to reason about because there is
automatic flow control; the sender does not proceed unless the
recipient drains the channel. Tony Hoare, Robin Milner, Rob Pike
and John Reppy have all written extensively about synchronous
programming, so I will take their word for it. However, I still
find asynchronous programming (through buffering) a better default
choice, for practical reasons:
1. Context switching has a cost, however inexpensive Kilim's tasks are
to create and context-switch (unlike the Occam/transputer world
with its hardware-assisted switching). Although Kilim's mailboxes
can be configured to be synchronous, that is not the default. There
are many cases where you want to send messages to multiple
recipients before waiting to collect replies. I find the CSP
approach of spawning a task to avoid blocking while sending tedious.
2. I like having the same interface for both concurrent and distributed
programming (although support for distributed programming is yet to
be bundled with Kilim). Synchronous _distributed_ programming is
horribly inefficient .. every put has to be acked when the
corresponding _get_ is done.
This is why I have followed Erlang's example and prefer buffered
channels (called mailboxes) as the default choice.
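The send-to-many-then-collect pattern mentioned above can be sketched in
plain Java, with java.util.concurrent queues standing in for Kilim
mailboxes (the class and method names below are this sketch's own, not
Kilim's). With synchronous channels, each send would block until its
recipient picked the message up; with buffered mailboxes, all sends
complete before we wait on anything:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class FanOut {
    static List<String> broadcastAndCollect(int workers) throws InterruptedException {
        BlockingQueue<String> replies = new LinkedBlockingQueue<>();
        List<BlockingQueue<String>> inboxes = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            BlockingQueue<String> inbox = new LinkedBlockingQueue<>();
            inboxes.add(inbox);
            // Each worker waits for one message and sends back a reply.
            new Thread(() -> {
                try {
                    replies.put("echo:" + inbox.take());
                } catch (InterruptedException ignored) {}
            }).start();
        }
        // Buffered sends: every message goes out before we wait.
        for (int i = 0; i < workers; i++)
            inboxes.get(i).put("msg" + i);
        // Only now do we block, collecting one reply per worker.
        List<String> collected = new ArrayList<>();
        for (int i = 0; i < workers; i++)
            collected.add(replies.take());
        return collected;
    }
}
```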
======================================================================
Erlang vs. Kilim
======================================================================
Kilim is an ode to Erlang (www.erlang.org), and strives to bring
some of its features into the more familiar Java world.
The term "Erlang", like "Perl", refers to both the language and the sole
available implementation, so comparisons have to be made on these two
axes separately.
The Erlang language is a soft-typed, purely functional language and
has many of the goodies of a functional setting: higher-order
functions, beautifully simple syntax and pattern matching on terms,
features that I'd love to see in Java. However, programming in a purely
functional style is not everyone's cup of tea, and there is no reason
that higher-order functions and pattern matching can't be made
available in an imperative setting (see Scala, JMatch, Tom (from INRIA),
etc.). If you have to have types, it is better to have OCaml-style
types (or even Smalltalk's); but compared to Java-style types, I prefer
the simplicity of Erlang's soft types.
The argument for Java lies not in the language, but in the incredible
JIT compilers, the JDK, an enormous open code base and community,
excellent IDEs, and good network, database, GUI and systems support.
Why throw away all that?
The Erlang *environment* (not the language) offers lightweight
processes, fast messaging, a uniform abstraction for concurrency and
distribution, and many systemic features (process monitoring,
automatic restart, process isolation, failure isolation, etc.).
These can be built atop Kilim as well.
The idea behind Kilim is that one can have all the features of the
Erlang environment without having to move to the Erlang
*language*.
======================================================================
Kilim vs. Transactional Memory
======================================================================
Hardware/software transactional memory (TM) is currently the new hope
and an alternative for concurrent programming in the shared-memory
world. It is appropriate in a mostly functional setting where most
objects are immutable and side effects are rare or contained. In an
imperative setting, I have my doubts about TM's scalability; hotspots
are expensive. Atomic sections can't be too big, otherwise they risk
being retried all over again. And the part of code that retries had
better not have any side effects that the TM doesn't know about or
control, such as sending messages on the network.
I think the task-and-mailbox approach is a more understandable model:
it has nice run-to-completion semantics, it has convenient graphical
representations (dataflow diagrams, workflow diagrams, Petri nets), it
brings the interaction with other processes out in the open, and it
allows batched and efficient communication.
That said, there is absolutely no reason not to use TM facilities
internally inside Kilim. I intend to use non-blocking data structures
when they perform well (currently, Java's data structures aren't
as fast as I'd like them to be).
======================================================================
What's the relation between CCS/pi-calculus and Kilim?
======================================================================
The notion that the Mailbox itself is a first-class message datatype
and can be sent as part of a message is inspired by Prof. Robin
Milner's pi-calculus. This allows the topology to change with time:
A can send a mailbox in a message to B, B can forward that message to C,
and C and D can share that mailbox.
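This mobility can be sketched in plain Java, with BlockingQueue standing
in for a Kilim mailbox. The Request type and the method names are
invented for the example; the point is that the reply channel travels
inside the message, so the receiver learns where to answer at runtime:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class Mobility {
    // A message that carries the mailbox on which to answer.
    record Request(String payload, BlockingQueue<String> replyTo) {}

    static String roundTrip(String payload) throws InterruptedException {
        BlockingQueue<Request> serviceInbox = new LinkedBlockingQueue<>();
        new Thread(() -> {
            try {
                Request r = serviceInbox.take();
                // Reply on the mailbox that arrived inside the message.
                r.replyTo().put(r.payload().toUpperCase());
            } catch (InterruptedException ignored) {}
        }).start();
        BlockingQueue<String> replies = new LinkedBlockingQueue<>();
        serviceInbox.put(new Request(payload, replies)); // send the mailbox itself
        return replies.take();
    }
}
```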
Beyond that, CCS, like CSP, is a modeling and specification language,
and uses synchronous interaction between processes. At a practical
level, this is terribly inefficient (esp. in Java).
======================================================================
RMI vs. Kilim
======================================================================
We need to distinguish between RMI implementations and the concept.
RMI implementations block the Java thread. That's a no-no for
scalability. They are also incredibly heavyweight -- I/O
serialization is always used, even in a concurrent setting, to
ensure isolation. The request-response paradigm doesn't allow many
other patterns of communication: fork/join, flow control, rate
control, timeouts, streaming, etc.
Kilim, in a concurrent (local) setting, is at least 100x faster than
Java RMI on even the simplest benchmarks. In a distributed setting,
the Kilim approach is better because asynchronous messaging is much
more scalable. Combine this with automatic stack management and you
get a far easier programming model.
======================================================================
What are Continuations and what is Continuation Passing Style (CPS)?
======================================================================
There is so much doubt and misinformation on the topic that a few
words are in order.
Simply put, CPS is a style of programming in which a "return" keyword
is not needed.
The notion of procedures calling procedures by building up a stack
has been burnt into our collective programming consciousness. If a()
calls b(), which calls c(), we think the stack must be three deep.
Suppose a() has nothing more to do after calling b(). It (that is, a())
really doesn't need b() to return to it, so there is no point pushing a
return address on the stack. In other words, the flow of control
_continues_ from a to b, never to return. Most respectable code
generators recognize this special case and prevent the stack from
building up ("tail call optimization"). It is a pity this isn't
available on the standard JVMs. Even GCC doesn't do it all the time.
Now consider:
a() {
    do stuff
    b()
    do more stuff
}
b() {
    ...
    return
}
Now you need a stack, and you want b() to return in order to "do more
stuff". However, this bit of code can be transformed to ensure that b()
doesn't return; instead it continues on to another procedure that
performs the "do more stuff" bit.
a() {
    do stuff
    b("c")    // pass a reference to c()
}
b(nextProc) {
    ...
    call nextProc
}
c() {
    do more stuff
}
The "do more stuff" part has now been separated out into c(). Now,
a() chains on to b(), supplying it the name of the next call to
make. For its part, b() _continues_ to the procedure referred to by its
nextProc parameter, instead of returning.
This transformation ensures that you never need the "return" keyword
... you always continue onwards to the parameter supplied.
What if "do more stuff" needed to refer to local variables in a()'s
stack frame? Well, the transformation ensures that a() packages the
values of those variables along with a reference to the next proc to
call. Now, instead of "nextProc", we have an _object_ (with state and
a procedure) called a continuation.
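In Java, this packaging of locals falls out naturally from a lambda that
closes over them. A runnable version of the transformation, with all
names invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;

class Cps {
    static List<String> log = new ArrayList<>();

    static void a() {
        int x = 42;                 // a local that "do more stuff" needs
        log.add("do stuff");
        // The continuation captures x; control never returns to a().
        b(() -> log.add("do more stuff with x=" + x));
    }

    static void b(Runnable nextProc) {
        log.add("b runs");
        nextProc.run();             // continue onward, don't return
    }
}
```

Here the lambda is the continuation object: a procedure plus the
captured state it needs.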
The obvious question is: why bother? The stack worked well, didn't it?
Why dispense with it? Yes, the stack works incredibly well, which is
why CPUs and compilers have special support for it. However, the
continuation-passing style allows for other forms of transfer of
control very simply. C++ and Java provide two forms of "return", one
normal and another using exceptions. If we had CPS, we wouldn't need
these special cases.
Instead of a() installing an exception handler, it would pass in two
continuation objects to b() that know what to do under normal and
under exceptional conditions. b() simply chains on to the appropriate
object as its last move.
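A minimal Java sketch of that two-continuation idea, with invented
names: the callee takes one continuation for the normal path and one for
the failure path, and chains to exactly one of them as its last move.

```java
import java.util.function.Consumer;

class TwoPaths {
    static void parse(String s, Consumer<Integer> onSuccess, Consumer<String> onError) {
        int n;
        try {
            n = Integer.parseInt(s);
        } catch (NumberFormatException e) {
            onError.accept("not a number: " + s);  // exceptional continuation
            return;
        }
        onSuccess.accept(n);                       // normal continuation
    }
}
```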
As another example, you can have tasks that pass control to a
scheduler that in turn passes control to another task, all without
having to return to whoever called them.
In a programming language with explicit support for continuations (ML,
Lisp, Haskell), one can have the "return" keyword merely as
syntactic sugar (like a macro). Internally, the compiler CPS-transforms
the entire code, so no procedure returns to its caller.
Are there any disadvantages to continuations? Oh yes. Machines are
so well optimized for stack usage, and so lacking in tail calls, that
the system is biased against continuations, performance-wise. The
continuation object has to be allocated from the heap and depends
on garbage collection. This is one reason why OCaml doesn't use CPS.
That said, the current crop of garbage collectors and the amortized
cost of garbage collection often match that of stack-based
allocation, and continuations are simply too powerful a feature to
ignore.
Where does Kilim fit into all this?
Kilim's transformation is similar to CPS, but it needs to live within
a JVM that does not even support tail calls. It also needs to live
with the Java verifier, which doesn't allow random gotos to be inserted
in the code willy-nilly. More details are in the paper "A Thread of One's
Own" (included in the docs directory).