smalltalk

Method lookup answers

April 21, 2005 22:16:15.133

In response to the very end of this article, where Rodney Bates said this:

Smalltalk pays a high price elsewhere for taking object orientation to the extreme, notably in complete loss of static typing and serious runtime efficiency penalties. Special, one-instance forms of classes are, for many programming problems, not as good a conceptual match as modules. But at least it provides a single, consistent, and syntactically explicit call mechanism.

I thought I'd ask our lead VM engineer - Eliot Miranda - for some details on method lookup in Smalltalk (VisualWorks in particular):

Rodney, you should read the following books & papers (in order); they'll help you understand Smalltalk's performance.

[Goldberg83] Adele Goldberg, David Robson, Smalltalk-80: The Language and its Implementation, Addison-Wesley, 1983, ISBN 0.201.11371.6.

Now out of print, but available by combining:

  • Adele Goldberg, David Robson, Smalltalk-80: The Language, Addison-Wesley, 1989, ISBN 0.201.13688.0
  • The Blue Book (Implementation)

[Deutsch84] L. Peter Deutsch, Allan M. Schiffman, "Efficient Implementation of the Smalltalk-80 System", 11th Annual Symposium on Principles of Programming Languages, pp. 297-302, January 1984, ACM.

Eliot Miranda, "Context Management in VisualWorks 5i". Available on the web here (PDF).

But briefly, here's how things are faster than you expect. Message selectors are maintained by the system as a pool of unique strings (Symbols), so that equality comparison of message selectors requires only comparing the addresses of the symbol objects. In the 70's and early 80's message lookup was optimized by the run-time system maintaining a small (1024 or 2048 entry) method lookup cache that remembers recent method lookups. The table is hashed by the identity hash of the receiver's class and the message selector. On early systems the id hash is equivalent to the object's address. Method lookup then becomes:

	"CacheSize is a power of two, so CacheSize - 1 acts as the index mask"
	hash := (receiver class hash + selector hash) bitAnd: CacheSize - 1.
	((cache at: hash) class == receiver class
	and: [(cache at: hash) selector == selector])
	       ifTrue:
	               [targetMethod := (cache at: hash) method]
	       ifFalse:
	               [targetMethod := self lookup: selector in: receiver class.
	               (cache at: hash)
	                       class: receiver class;
	                       selector: selector;
	                       method: targetMethod].
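
As a rough illustration only (this is a Python model, not VisualWorks code), the cache logic above can be sketched as follows. The names CACHE_SIZE, CacheEntry, slow_lookup and cached_lookup are made up for the example, sys.intern stands in for the Symbol table, and id() stands in for the address-based identity hash (which is a CPython detail).

```python
import sys

CACHE_SIZE = 1024  # power of two, so CACHE_SIZE - 1 works as a bit mask


class CacheEntry:
    """One slot of the lookup cache: (class, selector) -> method."""
    __slots__ = ("cls", "selector", "method")

    def __init__(self):
        self.cls = None
        self.selector = None
        self.method = None


cache = [CacheEntry() for _ in range(CACHE_SIZE)]
stats = {"hits": 0, "misses": 0}


def slow_lookup(selector, cls):
    """Walk the class hierarchy, as the full lookup does on a cache miss."""
    for klass in cls.__mro__:
        if selector in vars(klass):
            return vars(klass)[selector]
    raise AttributeError(selector)


def cached_lookup(receiver, selector):
    selector = sys.intern(selector)  # one shared object per selector string
    cls = type(receiver)
    # Assumption: id() plays the role of the identity hash (an address in CPython).
    index = (id(cls) + id(selector)) & (CACHE_SIZE - 1)
    entry = cache[index]
    # Identity comparisons only, mirroring the address comparisons in the text.
    if entry.cls is cls and entry.selector is selector:
        stats["hits"] += 1
        return entry.method
    stats["misses"] += 1
    method = slow_lookup(selector, cls)
    entry.cls, entry.selector, entry.method = cls, selector, method
    return method


class Animal:
    def speak(self):
        return "..."


class Dog(Animal):
    def speak(self):
        return "Woof"


class Puppy(Dog):
    pass  # inherits speak from Dog


for _ in range(3):
    method = cached_lookup(Puppy(), "speak")
    print(method(Puppy()))  # Woof
print(stats)  # {'hits': 2, 'misses': 1}
```

Only the first send walks the hierarchy; the next two are answered from the cache with two identity comparisons.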


Since then, dynamic translation (a.k.a. JIT compilation) has increased performance by nearly an order of magnitude. The run-time system does not interpret bytecode; instead it maintains a cache of the most recently used methods, compiled on demand to machine code. We call these nmethods. Every message send is first translated into the following machine code sequence:


	classRegister := selector.  "i.e. load a register with the address of a symbol"
	call unlinkedSend1Args.     "i.e. call a run-time routine to find the method, encoding
	                             the arg count in the call for arg counts 0 to small n"


When unlinkedSend is invoked, it locates the receiver (e.g. from the argument registers), obtains the selector from classRegister, and uses a modified version of the first-level method lookup cache algorithm above to locate an nmethod for the send. If an nmethod isn't found, it searches the class hierarchy, translates the method to native code and stores it in the first-level lookup cache. And now the clever bit... The send site is rewritten from

	classRegister := selector.
	call unlinkedSend1Args.


to


	classRegister := class. "i.e. whatever the class of the receiver was when unlinkedSend was called"
	call nmethod.entryPoint


The nmethod's code at its entry point then checks that the class of the current receiver agrees with the one stored in classRegister, e.g.


entry:
	tempRegister := receiver class.
	tempRegister != classRegister ifTrue:
	       [self handleSendMiss].
	...


So if the receiver's class is the same as it was when the send site was rewritten, the target method is the same and we're done. The whole check is just a class dereference, a register assignment and a comparison. 90% of send sites are monomorphic, so this speeds things up enormously.
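
To make the send-site rewriting concrete, here is a small Python model of a monomorphic inline cache. It is only a sketch: a real VM patches machine code, whereas this hypothetical SendSite class just keeps the cached class and target method in two fields, and it does the class check in send rather than in the callee's entry code.

```python
class SendSite:
    """One call site for a fixed selector, with a monomorphic inline cache."""

    def __init__(self, selector):
        self.selector = selector
        self.cached_class = None  # plays the role of classRegister
        self.target = None        # plays the role of nmethod.entryPoint
        self.misses = 0

    def _link(self, receiver):
        """Unlinked send or send miss: full lookup, then rewrite the site."""
        self.misses += 1
        cls = type(receiver)
        for klass in cls.__mro__:  # search the class hierarchy
            if self.selector in vars(klass):
                self.cached_class = cls
                self.target = vars(klass)[self.selector]
                return
        raise AttributeError(self.selector)

    def send(self, receiver, *args):
        # Fast path: one class fetch and one identity comparison.
        if type(receiver) is not self.cached_class:
            self._link(receiver)
        return self.target(receiver, *args)


class Square:
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side


class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


site = SendSite("area")
results = [site.send(Square(s)) for s in (1, 2, 3)]  # same class every time
print(results, site.misses)  # [1, 4, 9] 1
print(site.send(Rectangle(2, 3)), site.misses)  # 6 2
```

As long as the site keeps seeing Square receivers, it links once and never looks anything up again; the first Rectangle causes a miss and a relink.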

Polymorphic send sites are sped up by using "polymorphic inline caches" or PICs, which look like a jump table, doing a series of class comparisons.
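
A PIC can be modelled in Python as a short list of (class, method) pairs that is scanned with identity comparisons. This is a sketch under assumptions: the PolymorphicSendSite name and the MAX_ENTRIES limit of 4 are invented for the example and say nothing about how VisualWorks sizes or lays out its PICs.

```python
class PolymorphicSendSite:
    """A call site whose cache holds several (class, method) pairs."""

    MAX_ENTRIES = 4  # hypothetical limit, chosen only for this sketch

    def __init__(self, selector):
        self.selector = selector
        self.entries = []  # the "jump table": (class, method) pairs
        self.misses = 0

    def _lookup(self, cls):
        """Full lookup through the class hierarchy."""
        for klass in cls.__mro__:
            if self.selector in vars(klass):
                return vars(klass)[self.selector]
        raise AttributeError(self.selector)

    def send(self, receiver, *args):
        cls = type(receiver)
        # A series of class comparisons, one per cached case.
        for cached_class, method in self.entries:
            if cached_class is cls:
                return method(receiver, *args)
        # Miss: look the method up and add a new case if there is room.
        self.misses += 1
        method = self._lookup(cls)
        if len(self.entries) < self.MAX_ENTRIES:
            self.entries.append((cls, method))
        return method(receiver, *args)


class Square:
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side


class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


site = PolymorphicSendSite("area")
shapes = [Square(2), Rectangle(2, 3), Square(3), Rectangle(1, 1)]
areas = [site.send(shape) for shape in shapes]
print(areas, site.misses)  # [4, 6, 9, 1] 2
```

Each receiver class costs one miss; after that, alternating between the two classes never goes back to the full lookup.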

There is also substantial mechanism to allow native stack frames to be used, creating context objects for method activations only when required.

Smalltalk method lookup is fast

Comments

[BM] April 21, 2005 23:25:22.075

There was a discussion on c.l.s.dolphin recently after a user raised concerns about using methods, because he thought performance was going to be an issue. Dolphin doesn't have a JIT engine, so everything is interpreted. Even with an interpreted VM, method lookup/dispatch is extremely fast (measured in nanoseconds).

Re: Where to find the book

[ James T. Savidge] April 22, 2005 0:11:27.316

Used copies of the book are available: Smalltalk-80: The Language and its Implementation

[Loryn Jenkins] April 22, 2005 4:21:11.583

You haven't shown that method lookup is fast, James. You've shown it is faster. I'd love to see a followup post where you compare method lookup techniques from C++, Java, CLR, Delphi, SmallEiffel with those used in VW Smalltalk.

C++ comparison

[ James Robertson] April 22, 2005 7:37:32.899

There's a comparison to the vtable approach here.

HotSpot Java engine will be quick

[Isaac Gouy] April 22, 2005 14:47:50.018

There's a comparison to the vtable approach here
And it concludes: "As Eric points out Sun's new HotSpot Java engine will be quick. Urs (sic Urs Hölzle) consulted on the implementation."
29 Jul 1999

nfib Results: Smalltalk vs Delphi

[Loryn Jenkins] April 22, 2005 16:41:54.342

Here's the Delphi and Smalltalk results for nfib, expressed as activations per millisecond.

Borland Delphi 7.0: 88901.3
VW Smalltalk 7.3NC: 84622.2


Results are the average of four runs.
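
The benchmark source isn't shown in this thread. Assuming the conventional definition of nfib, where the value returned equals the number of function activations performed, a measurement of this kind might look like the following Python sketch. The function names, the input size and the timing method are choices made for the example, not the ones behind the figures above.

```python
import time


def nfib(n):
    """Return the number of nfib activations needed to compute nfib(n)."""
    if n < 2:
        return 1
    return nfib(n - 1) + nfib(n - 2) + 1


def activations_per_ms(n, runs=4):
    """Average activations per millisecond over `runs` timed calls."""
    rates = []
    for _ in range(runs):
        start = time.perf_counter()
        activations = nfib(n)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        rates.append(activations / elapsed_ms)
    return sum(rates) / len(rates)


print(nfib(10))  # 177
print(round(activations_per_ms(20), 1))  # machine-dependent
```

Because the work is almost entirely calls and returns, the rate is a reasonable proxy for call overhead rather than for arithmetic speed.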

nfib Results: Smalltalk vs Eiffel

[Loryn Jenkins] April 23, 2005 2:05:34.160

Here's the Eiffel and Smalltalk results for nfib, expressed as activations per millisecond.

ISE Eiffel 5.2:     76148.0
ISE Eiffel 5.2:    123611.0 << (inlining = 10)
VW Smalltalk 7.3NC: 84622.2


Results are the average of four runs. ISE Eiffel does not use a vtable lookup mechanism. Inlining this example effectively eliminated function lookup time.

nfib Results: Smalltalk vs CLR

[Loryn Jenkins] April 23, 2005 4:34:48.451

Here's the CLR and Smalltalk results for nfib, expressed as activations per millisecond.

Borland Delphi.NET: 42608.4
ISE Eiffel.NET 5.2: 53475.4
VW Smalltalk 7.3NC: 84622.2

Results are the average of four runs.

The performance impact of the CLR vis-à-vis native compilation for the Delphi and Eiffel programs is quite significant.

nfib Results: Smalltalk vs C++

[Loryn Jenkins] April 23, 2005 7:01:16.259

Here's the C++ and Smalltalk results for nfib, expressed as activations per millisecond.

Visual Studio 2003: 93869.0
VW Smalltalk 7.3NC: 84622.2

Results are the average of four runs.

Obviously, C++ is the overall speed winner. (Even more so, considering this was compiled using /Od (optimizations off), because the optimizations were reordering my timer calls, rendering them useless. Having said that, the fact that an optimization changes the meaning of the code is ludicrous!)

Smalltalk Speed Wrap

[Loryn Jenkins] April 23, 2005 7:27:45.879

I've now compared Smalltalk's performance on nfib with Delphi, Eiffel, C++, & CLR (Delphi.NET, Eiffel.NET). The speed rankings are as follows (activations per millisecond):

ISE Eiffel 5.2 (with inlining): 123611.0 (inlining avoids virtual function calls)
Microsoft VS2003 C++: 93869.0 (no optimizations; optimizing introduced errors)
Borland Delphi 7.0: 88901.3
VW Smalltalk 7.3NC: 84622.2
ISE Eiffel 5.2 Win32: 76148.0
ISE Eiffel.NET 5.2: 53475.4
Borland Delphi.NET: 42608.4

So, James, it is apparent that VW Smalltalk lags C++. But being within cooee of Delphi is no mean feat! Playing around the same speed as fully-compiled applications is laudable! VW Smalltalk is certainly fast compared with other VM-architectures. Which brings me to seriously question why Cincom is even entertaining porting Object Studio to the CLR. If anything, you should be porting Object Studio to VW.

The other side of speed

[Loryn Jenkins] April 23, 2005 7:44:47.674

Having compared the computer speed, I thought I'd just add a note about the speed of programming. Here is my subjective speed ranking for programming nfib in these different environments:

VW Smalltalk: I did nfib on Smalltalk first. It was clearly the quickest environment in which to write something this small. This is especially meaningful in that I have no commercial experience in VW Smalltalk: my only previous experience is programming a Genetic Algorithm in it.
Delphi: Perhaps reflecting my familiarity with Delphi, having programmed in it for the last two years.
Delphi.NET: This was literally a direct re-compile from the Win32 version. In terms of actual time, I should have placed it on top of this list. But I didn't think that fair, seeing as all I had to do was push a button once. (Actually, I compiled three times, cause I didn't believe it would be that easy!)
ISE Eiffel: It surprised me how slow it is to program this small program in Eiffel, seeing as I've used Eiffel for five years, albeit no Eiffel programming in the last three years.
Visual Studio 2003 C++: This was my first-ever C++ program. And the optimizations causing errors in the program certainly surprised me.
ISE Eiffel.NET: Given the excellent experience Delphi gave me in migrating my app to the CLR, I kind of expected the same here. Alas, recompiling on .NET revealed two bugs in ISE's binding to the .NET framework. These bugs slowed me down, so that this was literally the slowest solution to prepare. (Please note that 5.2 is two releases behind the currently available release, and, if memory serves correctly, was the first release to contain .NET support.)

My hat is off to VW Smalltalk for extreme productivity. (On tiny, little apps, anyway.)
