George Bosworth's keynote on the Microsoft CLR, and his personal comparisons of it to the Smalltalk/V VM. George was one of the founders of Digitalk.
Why a personal perspective on this? George is one of dozens of architects on the CLR. Why compare specifically to Smalltalk/V? Because it's what George knows best from the Smalltalk world. The CLR has been built by a lot of people from a diverse set of backgrounds - COM, VB/VC, J++, MTS, Smalltalk, Scheme, Lisp, etc.
I am and always will be a Smalltalk fanatic
Nomenclature - the CLR has lots of code names, and terms have been (and still are) reused. George admits that he's not good at nomenclature anyway. "There are official names for things that he's supposed to know" :)
The CLR is a commercial platform for languages and tools. A platform for languages and tools tries to enable divergence and differentiation over time. A language toolset (like Smalltalk/V) leverages conformity and uniformity. So there's a different perspective.
Smalltalk/V
- GC, IL Based
- Debugging, execution svcs
- SLL's
- FFI
- Smalltalk libs and components
- Basic language libs, change control, UI builder, source control
CLR?
- CLR Execution engine
- GC, IL, JIT
- Debugging, execution svc, remoting (etc)
- Externalized PE files (DLL, Exe) contain pieces
- BCL base class libs
- CLS language interop rules - this is the layer that defines how different languages will interoperate and play together here
- Pile of specs - important to a platform
- Codified standards
- What's not part of the CLR? Lots of tools and services and libs from others (both in and out of MS).
Where are the tools and libs? Available from MS and others (insert marketing information here :) ). Unlike Smalltalk vendors, MS is not trying to provide everything - it's providing a platform to build on, not an end-all. The CLI runs in something and things run on it. As a technology, the CLR is similar to a Smalltalk VM. It's in the details that differences emerge.
The CLR is not itself a product though - things built on it are products. That's unlike Smalltalk, which is itself sold as a product.
Now a walk through the CLR (Common Language Runtime). There are actually many CLR implementations at MS:
- Desktop/Server (CLR)
- VS.NET
- WS2003
- Whidbey
- Longhorn
- Rotor - close to desktop version, under shared source license
- Compact Framework
- SPOT watch - see Scoble's blog
This talk is mostly about the desktop/server implementation. The runtime control flow looks a lot like any other VM (in particular, like the JVM with the security model). So code (VB, VC, etc.) is compiled to IL, which the JIT then turns into native code, which executes. There is an FFI. There's an interesting "pre-JIT" piece as well, which he'll get to in a bit.
- "Econo Jit" - used in Rotor (slow)
  - Generates unoptimized native code
  - Code can be discarded and regenerated
- "Standard JIT"
  - Generates optimized native code
- Pre-JIT generation
  - Done at install time
  - Reduces start up time
  - Native code has version checks and reverts to runtime JIT if they fail
- Assemblies - akin to parcels, SLL's, Squeak Segments
  - One or more files, independent of packaging
  - Self describing via metadata (manifest) - everything but the executable bits
  - Versioning
    - Captured by compiler
    - Policy per-application as well as per machine
  - Security boundary
    - Assemblies are granted permissions
    - Methods can demand proof that a permission has been granted to entire call chain
  - Mediate type import and export
    - Types named relative to assembly
What happens at compile time?
- Compiler reads imported metadata and builds internal symbol table
- Compiler resolves method overloading using language specific type matching rules
- Compiler adds any required coercions or casts
- Compiler selects and records a precise member ref
- Compiler provides object layout requirements through its choice of
  - Layout technique
  - Non-static fields
  - Virtual methods, with "new slot" vs. "use existing slot"
- Compiler emits metadata by merging incoming metadata with metadata that's been generated
- Compiler emits IL with metadata tokens
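The overload resolution and member ref steps above can be sketched roughly like this. It's a toy in Python, under loud assumptions: the `MemberRef` record, the `IMPORTED_METADATA` table, and the single int-to-float widening rule are all invented for illustration, and real compilers have far richer matching rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemberRef:
    assembly: str
    type_name: str
    method: str
    param_types: tuple


# Stand-in for metadata imported from a referenced assembly (hypothetical).
IMPORTED_METADATA = [
    MemberRef("MathLib", "Calc", "add", ("int", "int")),
    MemberRef("MathLib", "Calc", "add", ("float", "float")),
]

# Assumed language rule: an int argument may be widened to a float parameter.
WIDENING = {("int", "float")}


def conversion_cost(arg_type, param_type):
    if arg_type == param_type:
        return 0
    if (arg_type, param_type) in WIDENING:
        return 1
    return None  # not convertible under this language's rules


def resolve(type_name, method, arg_types):
    """Return (member_ref, coercions) for the best-matching overload."""
    candidates = []
    for ref in IMPORTED_METADATA:
        if (ref.type_name, ref.method) != (type_name, method):
            continue
        if len(ref.param_types) != len(arg_types):
            continue
        costs = [conversion_cost(a, p) for a, p in zip(arg_types, ref.param_types)]
        if None in costs:
            continue
        candidates.append((sum(costs), ref))
    if not candidates:
        raise LookupError(f"no overload of {type_name}.{method} accepts {arg_types}")
    candidates.sort(key=lambda pair: pair[0])
    if len(candidates) > 1 and candidates[0][0] == candidates[1][0]:
        raise LookupError(f"ambiguous call to {type_name}.{method}")
    best = candidates[0][1]
    # Record which arguments need a coercion inserted: (index, from, to).
    coercions = [
        (i, a, p)
        for i, (a, p) in enumerate(zip(arg_types, best.param_types))
        if a != p
    ]
    return best, coercions
```

Calling `resolve("Calc", "add", ("int", "float"))` picks the `(float, float)` overload and reports that argument 0 needs an int-to-float coercion. The returned `MemberRef` plays the role of the precise member ref that gets recorded in the emitted metadata, so the choice is fixed at compile time rather than re-decided at runtime.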
What happens at class load time?
- A type ref occurs and is resolved to an existing type or an assembly ref
- Assembly is loaded
- If no managed native code available, IL module is loaded and first validity checks run
- Class is then validated and the CLR creates in memory data structs
- On first run, JIT is used (and verified if needed)
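The "verify, then JIT on first run" step can be sketched with a toy stack-based IL. This is only an illustration in Python: the opcodes (`ldarg`, `ldc`, `add`, `mul`, `ret`), the `Method` class, and the closure-based "native code" are assumptions of the sketch, not the real MSIL instruction set or JIT.

```python
import operator


class VerificationError(Exception):
    pass


def verify(il):
    """Check that the toy IL never underflows the stack and ends with one value."""
    depth = 0
    for op, _ in il:
        if op in ("ldarg", "ldc"):
            depth += 1
        elif op in ("add", "mul"):
            if depth < 2:
                raise VerificationError(f"{op} needs two operands")
            depth -= 1
        elif op == "ret":
            if depth != 1:
                raise VerificationError("ret needs exactly one value on the stack")
            return
        else:
            raise VerificationError(f"unknown opcode {op}")
    raise VerificationError("missing ret")


def jit_compile(il):
    """Translate toy IL into a Python closure (the stand-in for native code)."""
    stack = []
    for op, arg in il:
        if op == "ldarg":
            stack.append(lambda args, i=arg: args[i])
        elif op == "ldc":
            stack.append(lambda args, v=arg: v)
        elif op in ("add", "mul"):
            right, left = stack.pop(), stack.pop()
            fn = operator.add if op == "add" else operator.mul
            stack.append(lambda args, l=left, r=right, f=fn: f(l(args), r(args)))
        elif op == "ret":
            body = stack.pop()
            return lambda *args: body(args)


class Method:
    """Starts out holding only IL; compiled code appears on the first call."""

    def __init__(self, name, il):
        self.name = name
        self.il = il
        self.native = None
        self.jit_count = 0

    def __call__(self, *args):
        if self.native is None:
            verify(self.il)
            self.native = jit_compile(self.il)
            self.jit_count += 1
        return self.native(*args)


add_then_double = Method("add_then_double", [
    ("ldarg", 0), ("ldarg", 1), ("add", None),
    ("ldc", 2), ("mul", None), ("ret", None),
])
```

The first call to `add_then_double` pays for verification and compilation; later calls reuse the cached closure, so `jit_count` stays at 1 no matter how often the method runs.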
JIT - a whole lot of the stuff you'd expect (inheritance checks, etc.) happens here, plus security checks. Finally, the code runs. Permissions can apply here - the security model can block execution. This sounds like a richer model than Java's, because it can be tuned much more finely.
Ok, a picture of the "standard" model for compiled apps: code to exe file to directory to run. How about the managed world? Source is compiled to an IL assembly. To deploy, you can use tools that verify the assembly - PEVerify will do checks on it. Then deploy to the GAC (the new registry!). To run, the assembly is loaded and the policy manager checks permissions (based on the security model, if any).
One interesting fallout - assemblies (and the CLR) have been written to be hosted inside other applications. That's different from most Smalltalk systems.
What's the Pre-JIT? An MSIL-to-native compiler: precompile, pre-load, pre-layout. Validated at execution time. May factor in the processor type, other assemblies, etc. If it ends up invalid, the normal JIT path is used. The goal is to speed up startup, thanks to a smaller startup working set. Usually used as part of assembly installation. The tool, ngen, ships as part of VS.NET. The theory is that a lot of steps get skipped if the pre-compiled assembly is valid at runtime.
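A rough sketch of the "validate at execution time, else fall back" idea, in Python. Everything here is hypothetical: the `NativeImage` record, the three stamps it carries, and the string stand-ins for native code are assumptions for illustration, not how ngen actually stores or checks its images.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NativeImage:
    il_version: str       # version of the IL assembly it was compiled from
    runtime_version: str  # runtime it was compiled for
    cpu: str              # processor type it targets
    code: str             # stand-in for the precompiled native code


def jit(il):
    """Stand-in for the normal runtime JIT path."""
    return f"jitted({il})"


def precompile(il, il_version, runtime_version, cpu):
    """Install-time step: produce a native image stamped with what it depends on."""
    return NativeImage(il_version, runtime_version, cpu, f"precompiled({il})")


def load(il, il_version, runtime_version, cpu, image=None):
    """Use the native image only if every recorded stamp still matches."""
    if image is not None and (image.il_version, image.runtime_version, image.cpu) == (
        il_version, runtime_version, cpu
    ):
        return image.code
    # Stale or missing image: revert to the normal JIT path.
    return jit(il)
```

With an image built by `precompile("IL", "1.0", "1.1", "x86")`, loading under the same versions returns the precompiled code, while loading after the IL moves to version "1.1" quietly goes back through `jit`. The IL always ships, so a stale image costs startup time rather than correctness.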
Deployment Models
- Shipping and Installation
  - Compile IL in build lab
  - Ship IL
  - PreJIT during installation
  - Possible JIT during execution
- Patching and servicing
  - Deliver new IL image - could be huge though
  - Repair native images (all depending on assemblies)
Application Domains - where assemblies are loaded and executed. Subprocess level of isolation. Defines own types and manages own memory - controlled by hosting policies. Objects in the domain are isolated from other domains, and will talk to them via remoting (proxy operations). This allows you to run disparate versions of libraries and have them all work. Hosting is about policies:
- Security
- Threading
- Threads vs. fiber scheduling
- Robustness
- Hosts load, initialize, and set policy for the CLR
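To make the isolation-plus-remoting idea concrete, here's a small Python sketch. It's an assumption-laden toy: `AppDomain`, `Proxy`, and `Collector` are hypothetical classes, and copying arguments and results with `copy.deepcopy` stands in for whatever marshalling the real remoting layer does.

```python
import copy


class AppDomain:
    """Toy isolation unit: owns its loaded assemblies and its objects."""

    def __init__(self, name):
        self.name = name
        self.assemblies = {}  # assembly name -> version loaded in this domain
        self._objects = {}    # objects never leave the domain directly

    def load(self, assembly, version):
        self.assemblies[assembly] = version

    def create(self, key, obj):
        """Store obj inside the domain and hand back only a proxy to it."""
        self._objects[key] = obj
        return Proxy(self, key)


class Proxy:
    """Cross-domain handle: arguments and results are copied, not shared."""

    def __init__(self, domain, key):
        self._domain = domain
        self._key = key

    def invoke(self, method, *args):
        target = self._domain._objects[self._key]
        result = getattr(target, method)(*copy.deepcopy(args))
        return copy.deepcopy(result)


class Collector:
    def __init__(self):
        self.items = []

    def add_all(self, values):
        values.append("seen")  # mutates only the domain's own copy
        self.items.extend(values)
        return self.items


domain_a = AppDomain("A")
domain_a.load("Lib", "1.0")
domain_b = AppDomain("B")
domain_b.load("Lib", "2.0")  # a different version of the same library
collector = domain_b.create("collector", Collector())
```

A caller that passes its own list through `collector.invoke("add_all", ...)` finds the list unchanged afterwards, and mutating the returned list doesn't touch the state held inside domain B. Each domain also keeps its own record of which `Lib` version it loaded, which is the "disparate versions" point from the talk.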
An example - SQL Server won't allow a loaded application to use threads. So if yours does that, it won't load you. I see the point, but I have an issue here - same one I have with the final/sealed quandary. To my mind, it all assumes that the developer shouldn't have control...
What about memory issues? Again, this ends up being a policy setting thing. Ok - on to the ST comparison - he's calling it a toolset vs. platform view (I'd argue that Smalltalk, particularly VW, is a platform - but I get his point). The platform (CLR in this case) is concerned with guarantees that people can depend on. Language models (here he means Smalltalk) are different and concern themselves with conventions.
Specs matter more to a platform - changes affect too many people, so they are difficult to fix. Interestingly enough, Joshua Bloch of Sun said the same thing at ot2004.
Opportunity for questions:
Q: Rotor vs. CLR
A: Lousy code generator. Not as optimized. Not concerned with things like COM/OLE, so a lot of infrastructure for it not there. Mostly close
Q: What about in-memory stability of objects (in ST)? What about the CLR?
A: What do you mean by that? Important to have type safe memory? Yes. The CLR should never corrupt memory.
Really talking about productivity/interactivity of objects during runtime (image, etc)
A: We don't do that (image) now. Statement that the image can be brittle. Uses PARTS (originally built for OS/2) and portability as an example.
Question is really about interactive development (workspaces, etc)
A: What is the state of the application that you want to persist? Hmm. I'm not sure he's answering this... Question - can you modify the code while it's running? You can load classes. What about shape changes? My question - the blog server - patch on the fly. About state changes - apparently, ASP.NET has some facilities in this direction, but it sounds complex and sounds like it only handles things going forward, as opposed to extant stuff in memory. I think the short answer is "No".
Q: What do we do to survive in the CLR world?
A: Learn to live with the CLR.
Q: What is lacking in the CLR to support Smalltalk?
A: Tag typing support so that Smalltalk's object model could work. Says that anonymous delegate stuff will help. "Intermediate late binding" support - not good support now, needs to have it for good Smalltalk support.
Q: To make Smalltalk work and play in .NET, does it need to be type capable?
A: No, but it needs to be type aware. George related some early Smalltalk/V work in that direction.
Heh - comment from a Smalltalker doing VB.NET on the IRC channel: "I'd say it's about as good as you'll get at putting objects on basic, but it and .NET strike me as rather Rube Goldberg-esque"