I've thought about it a little more, and drawn some pictures in my head.

I still have to disagree with Christian when he says:

> Making Python completely coroutine aware, without
> tricking the C stack, is 90 percent of the problem.
> But after walking that far, there is no reason
> to leave the other 10 percent alone.

Without continuations, but with microthreads (uthreads) or coroutines, each (Python) stack frame can simply be "paused" at a specific point and continued later. The semantics here are completely clear (except perhaps for end cases such as unhandled exceptions and intervening C stack frames).

With continuations, you have to decide how much state to save for a future continuation. It would seem easy enough: save all state kept in the frame except for the local variables. But now consider this: the "evaluation stack" contained in each frame could easily be replaced by a bunch of temporary variables, if we had a slightly different instruction set (3-address opcodes instead of stack-based opcodes). Then would we save those temporary variables or not? It can make a difference! Since the "save continuation" operation is a function call, you can easily save a continuation while there are some items on the value stack. I believe the current implementation saves these so they are restored when you jump to the continuation. But if the compiler were to use temporary variables instead of the evaluation stack, they might not have been restored!

Here's another example. Suppose you set up a for loop. After three iterations through the loop you save a continuation. Then you finish three more iterations. Then you return to the saved continuation. Where does the loop continue: at 3 or at 6 iterations? Try to answer this without trying it. My guess: it gets restarted at 3 iterations, because the loop index is saved on the value stack.
If you rewrite this using a while loop, however, it would get restarted at 6 iterations, because then your loop index is an unsaved local variable. Ditto if you changed the bytecode compiler so for loops use an anonymous local variable instead of an entry on the evaluation stack.

This semantic imprecision is one of the reasons why I don't like the concept of continuations. (I've been told that the exact semantics of continuations in Scheme differ greatly between Scheme implementations.)

Now let's look at Jython. In Jython, we can simulate "paused frames" well enough by using real Java threads. However, full continuations would require modifications to the JVM -- which is unacceptable to a language boasting "100% Pure Java". Another reason against allowing continuations.

So, all in all, I don't think of continuations as "the last 10% that we might as well add to finish the job." I see it as an ill-specified hypergeneralization.

What *do* I want to see in a stackless PEP? Not surprisingly, I'd like to see generators, coroutines, and uthreads. These all share a mechanism that pauses one frame and resumes another. I propose to make the concept of uthreads fundamental -- this will simplify the emulation in Jython.

A strawman proposal:

The uthread module provides the new functionality at the lowest level. Uthread objects represent microthreads. An uthread has a chain of stack frames linked by back pointers just like a regular thread. Pause/resume operations are methods on uthread objects. Pause/resume operations do not address specific frames but specific uthreads; within an uthread the normal call/return mechanisms can be used, and only the top frame in the uthread's stack of call frames can be paused/resumed (the ones below it are paused implicitly by the call to the next frame, and resumed when that call returns).

- u = uthread.new(func) creates a new uthread object, u. The new uthread is poised to call func() but doesn't execute yet.
- u = uthread.current() returns the uthread object for the current frame.
- u.yield() pauses the current uthread and resumes the uthread u where it was paused. The current uthread is resumed when some other uthread calls its yield() method. Calling uthread.current().yield() is a no-op.
- When func() returns, the uthread that was executing it ceases to be runnable. The uthread that most recently yielded to it is resumed, unless that is no longer runnable, in which case the uthread that most recently yielded to *it* is resumed, and so on until a runnable uthread is found or until no runnable uthreads are left, in which case the program terminates. (XXX I need a proof here that this works.)
- When func() raises an unhandled exception, the exception gets propagated using the same rules as when it returns, and similarly its uthread ceases to be runnable.
- u.kill(exc) causes the yield() call that paused u to raise the exception exc. (This can be caught in a try/except of course.)
- Calling u.yield() or u.kill() for a non-runnable uthread is an error and raises an exception.

I think this API should be enough to implement Gordon's SelectDispatcher code. In general, it's easy to create a scheduler uthread that schedules other uthreads.

Open issues:

- I'm not sure that I got the start conditions right. Should func() be allowed to run until its first yield() when uthread.new(func) is called?
- I'm not sure that the rules for returning and raising exceptions from func() are the right ones.
- Should it be possible to pass a value to another uthread by passing an argument to u.yield(), which then gets returned by the resumed yield() call in that uthread?
- How do uthreads interact with real threads? Uthreads are explicitly scheduled through yield() calls; real threads use preemptive scheduling. I suppose we could create a new "main" uthread for each real thread. But what if we yield() to an uthread that's already executing in another thread? How is that error detected?
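To make the strawman a bit more concrete, here is a rough sketch of how the API might be emulated on top of real threads, much as the Jython remark suggests. It is only an illustration under several assumptions, not a reference implementation: the method is spelled yield_() because yield is a reserved word in current Python; UthreadError, the Uthread class layout, and the underscore-prefixed helpers are made-up names; each real thread lazily gets an implicit "main" uthread; killing a uthread that has not started yet simply propagates the exception without calling func(); and the search for a runnable uthread gives up if the "most recently yielded" chain loops back on itself.

```python
import threading

class UthreadError(Exception):
    """Raised when a non-runnable uthread is resumed or killed (assumed name)."""

_local = threading.local()  # remembers which uthread each OS thread is running

class Uthread:
    def __init__(self, func=None):
        self.func = func                     # None marks an implicit "main" uthread
        self.started = func is None          # a main uthread is already executing
        self.runnable = True
        self.resumer = None                  # uthread that most recently yielded to us
        self._wake = threading.Semaphore(0)  # released to let this uthread continue
        self._pending = None                 # exception to raise on the next resume

    def yield_(self):
        """Pause the current uthread and resume this one."""
        me = current()
        if self is me:
            return                           # yielding to yourself is a no-op
        if not self.runnable:
            raise UthreadError("uthread is not runnable")
        self.resumer = me
        if self.started:
            self._wake.release()
        else:                                # first resume: start the backing thread
            self.started = True
            threading.Thread(target=self._bootstrap, daemon=True).start()
        me._pause()

    def kill(self, exc):
        """Resume this uthread with exc raised from the yield_() that paused it."""
        if not self.runnable:
            raise UthreadError("uthread is not runnable")
        if self is current():
            raise exc                        # assumption: killing yourself just raises
        self._pending = exc
        self.yield_()

    def _pause(self):
        self._wake.acquire()                 # block until another uthread resumes us
        exc, self._pending = self._pending, None
        if exc is not None:
            raise exc

    def _bootstrap(self):
        _local.uthread = self
        error, self._pending = self._pending, None  # set if killed before first run
        if error is None:
            try:
                self.func()
            except BaseException as e:       # propagate using the same rules as a return
                error = e
        self.runnable = False
        # Walk back through "who most recently yielded to whom" for a runnable
        # uthread; stop if the chain cycles instead of looping forever.
        heir, seen = self.resumer, {self}
        while heir is not None and not heir.runnable and heir not in seen:
            seen.add(heir)
            heir = heir.resumer
        if heir is not None and heir.runnable:
            heir._pending = error
            heir._wake.release()

def new(func):
    return Uthread(func)

def current():
    if not hasattr(_local, "uthread"):
        _local.uthread = Uthread()           # implicit main uthread for this OS thread
    return _local.uthread

# Usage: main and a worker hand control back and forth, then main kills the worker.
log = []
main = current()

def worker():
    log.append("worker: start")
    main.yield_()
    log.append("worker: resumed")
    try:
        main.yield_()
    except ValueError as e:
        log.append(f"worker: killed with {e}")

u = new(worker)
log.append("main: created")
u.yield_()
log.append("main: back")
u.yield_()
log.append("main: back again")
u.kill(ValueError("stop"))
log.append("main: worker finished")
print("\n".join(log))
```

Only one backing thread is ever unblocked at a time, so the scheduling stays explicit even though real threads sit underneath. The sketch does not try to detect a yield to an uthread that is already executing in another real thread; that open issue remains open.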
Please help!

--Guido van Rossum (home page: http://www.python.org/~guido/)