smalltalk

More than one way to skin a cat

September 1, 2005 7:39:07.832

Tim Bray makes some assumptions about threading and scalability here:

In that recent Ruby piece, I remarked that Ruby threading struck me as kind of feeble, and that threading is getting real important. Well, I know one way to solve that problem. So I tracked down JRuby geek Tom Enebo and got some news and he pointed me to some code that I think is pretty cool.
On that threading thing, Tom tells me that “We currently implement Ruby threads using Java native threads. It is our plan to continue doing so.” So, JRuby is ready for the threading era.

That may not be the best way to handle it. We actually have some good experience in this area, due to the port of Opentalk (our communications framework) to ObjectStudio for the last release. ObjectStudio maps its threads down to platform (in this case, Windows) threads. That makes them non-deterministic and much harder to control. It also means that any error made by a developer in an individual thread has the ability to tie things up very nicely - at the platform level.

In VisualWorks, threads are green. This makes them really, really inexpensive. For instance, in BottomFeeder, I spawn a thread per feed (I'm subscribing to 309 as I write this). I don't need to worry about thread pools, or overwhelming the platform. If I tried such a feat with native threads, I'd have to worry about those things.
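
To make the thread-per-feed idea concrete, here is a minimal workspace sketch. It is not BottomFeeder's code: the feed URLs and the placeholder "fetch" step are hypothetical, and only the standard Process and Semaphore protocol is assumed.

```smalltalk
"Sketch only: feedUrls and the fetch step are hypothetical stand-ins."
| feedUrls results guard done |
feedUrls := (1 to: 309) collect: [:i | 'http://example.com/feed', i printString].
results := OrderedCollection new.
guard := Semaphore forMutualExclusion.   "protects the shared collection"
done := Semaphore new.                   "signalled once per finished process"
feedUrls do: [:url |
    [| item |
    item := url -> url size.             "placeholder for the real fetch and parse"
    guard critical: [results add: item].
    done signal] forkAt: Processor userBackgroundPriority].
"Wait for every forked process before looking at the results."
feedUrls size timesRepeat: [done wait].
results size
```

Each feed gets its own Smalltalk process, with no pool and no cap on the count. The mutual-exclusion semaphore is a precaution for the shared collection rather than something the scheduler strictly forces on you here.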

Sure, you'll say, but what if I have a multi-CPU (or multi-core) box that I want to take advantage of? Simple - run more than one image (process) and have them communicate as needed. That's actually the classic Unix approach to scaling, and it works quite well. The programming model is vastly simpler - I don't have as many complex synchronization issues to deal with. External APIs (database calls, for example) can already be threaded at the platform level, so you don't have a blocking issue.

Update: Patrick Logan has some thoughts on the issue.

Comments

Green threads good

[George Paci] September 1, 2005 10:05:27.359

I'm glad in CST "all threads are green" - I hate when those red and blue threads in other environments screw up my color scheme.

More seriously, I'm surprised to see you using a term from the Java camp, instead of something less jargony like "user-space cooperative threading"; nice guy Peter van der Linden explains the origin of the term:

When Java 1.0 first came out on Solaris, it did not use the native Solaris library libthread.so to support threads. Instead it used runtime thread support that had been written in Java for an earlier project code-named "Green." That thread library came to be known as "green threads."

Finally, does "Java native threads" in the code really mean OS-native threads, or green threads in the JVM?

Finally finally, are you sure VisualWorks threads don't show up in the process table? You could actually "overwhelm the platform" if you fill that table up with thousands of entries.

Finally finally finally, I ask everyone to please refrain from reviving the stupid "cooperative vs. preemptive multitasking" holy war from ten years ago. Neither approach is dominant.

VW Threads

[ James Robertson] September 1, 2005 10:33:41.984

VW Threads are a purely Smalltalk creation, so they won't end up in the platform process (or thread, as the case may be) table. Try this code while looking at a list of processes/threads in the platform:


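"Fork 1000 Smalltalk processes; watch the platform's process/thread list while this runs."
| all |
all := OrderedCollection new.
1 to: 1000 do: [:each | | value |
     "Each block runs in its own Smalltalk process, below the UI's priority."
     [value := each factorial.
     all add: (each -> value)] forkAt: Processor userBackgroundPriority].
"The collection may still be filling when the inspector opens."
all inspect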

But...

[murphee ( http://www.jroller.com/page/murphee )] September 1, 2005 11:01:38.900

Hmm... I don't understand one thing: if VisualWorks doesn't have native threads, then what do you do about blocking I/O? Doesn't this block your whole GUI or the whole VM? Is there some M:N scheme available? (The Jikes RVM, a Java VM written in Java, uses this; basically, on a machine with n CPUs they use n native threads, as far as I know.)

BTW: I find it surprising that you advocate restricting flexibility and power (native threads) with the argument "Developers might have to think a bit to use it right and not shoot themselves in the foot". From reading your blog, I'd gathered that you think that's what Smalltalk is all about.

Threads

[ James Robertson] September 1, 2005 11:27:29.421

Sockets are non-blocking at the VM level, and the GUI is all emulated - so you don't get platform blocking. Actually, the issue with providing native threads to developers via ST process --> platform thread is not a "restrict power" issue. It's an implementation issue, and a "how fast can we provide an answer" issue.

The existing VM - especially in areas like garbage collection - makes assumptions based on single threadedness. Changing that at the VM level is harder than doing the equivalent thing at the Smalltalk level via multi-process communication libraries.

blocking I/O

[Isaac Gouy] September 1, 2005 11:56:12.614

murphee wrote: what do you do about blocking I/O?
See VisualWorks THAPI

Simple - run more than one image (process) and have them communicate as needed
That's the old approach used with Erlang (a language designed for massively concurrent applications) but for several years they've been wondering how to make better use of multiple cpus
A Parallel and Multi-threaded Erlang Implementation

do you hate cats, or just love the taste?

[lurker] September 1, 2005 12:57:46.304

If neither, why would you want to skin one?

VW process model is more flexible

[ Terry] September 1, 2005 13:30:14.271

Murphee

I think you misunderstood James' response. The VW process model is very flexible. The process scheduler has a list of priority queues. It always runs the process at the highest priority that is at the front of the queue. A process is suspended only if a higher priority process is available to run or it relinquishes control. This makes for predictable execution.

If a developer wishes to have some processes execute in a round robin fashion, it is a simple matter to create a high priority process that periodically suspends the appropriate process and rearranges processes in the queue.
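
The priority rule described above can be seen in a small workspace sketch. This is an illustration only, assuming it is evaluated from the UI process at user scheduling priority; the labels are made up for the example.

```smalltalk
"Sketch: assumes evaluation from the UI process at user scheduling priority."
| log done |
log := OrderedCollection new.
done := Semaphore new.
"Lower priority than the caller: queued until the caller waits."
[log add: 'background'. done signal] forkAt: Processor userBackgroundPriority.
"Higher priority than the caller: runs as soon as it is forked."
[log add: 'interrupt'. done signal] forkAt: Processor userInterruptPriority.
log add: 'workspace'.
"Waiting gives up the processor, which lets the background process run."
2 timesRepeat: [done wait].
log
```

Under that assumption, the log should read interrupt, workspace, background: the higher-priority process preempts the caller at once, and the lower-priority one runs only when the caller waits on the semaphore.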

parallelism vs. concurrency

[Software transaction memory] September 1, 2005 14:37:12.116

Seems like we're making the common mistake of conflating parallelism and concurrency. If we want to enable multiple things to happen without blocking one another, then we're looking for concurrency and we can use lightweight language ('green') threads. If we want to take advantage of multiple CPUs (i.e. speed things up) then we need to use platform native (kernel) threads. Doesn't that make it easier?

[Ziv Caspi] September 2, 2005 8:46:58.210

So what does Smalltalk on Windows do when it needs (for example) to do DNS lookup? Does it implement its own async DNS client because Windows doesn't provide a non-blocking one?

DNS

[ James Robertson] September 2, 2005 9:28:31.239

At present, DNS lookups do, in fact, block at the VM level. We are implementing an async client in order to deal with that OS-level limitation.

More than one way to skin a cat

[ Alan Knight] September 2, 2005 10:17:30.102

As I recall, DNS is not re-entrant anyway.

DNS

[Ziv Caspi] September 3, 2005 16:49:12.426

I'm not sure what you mean by "reentrant" here. Different people use different meanings for this word in the multi-threading world, so I don't want to assume any single one.