development

Misunderstanding the SOLID principles

February 07, 2009 17:39:13 EST

In a recent podcast, with parts transcribed here, Joel Spolsky attacks the SOLID principles. The SOLID principles give advice on how to handle dependency management in software projects. They explain how to prevent dependencies getting out of control, and how to structure code in the various layers of a project.

Joel's claim is that following the SOLID principles causes too much code, and that developers who follow the principles spend “(…) enormous amount of time writing a lot of extra code.”

I think it is generally accepted that dividing a project into modules helps the whole development process. The SOLID principles simply explain how to do this in practice. Joel should either state that he does not care about dependency management, or describe alternatives to the SOLID principles if he thinks they are not usable to keep dependencies under control. Keeping dependencies under control is hard, and you need to apply non-trivial techniques to accomplish it. If you think it is OK to let each software project be defined as a single module, it is true that the SOLID principles are a waste of time.

Joel does not fully understand the SOLID principles. He believes that following them will lead to an explosion in the number of classes required:

One of the SOLID principles, and I'm totally butchering this, but, one of the principles was that you shouldn't have two things in the same class that would be changed for a different reason [PDF]. Like, you don't want to have an Employee class, because it's got his name which might get changed if he gets married, and it has his salary, which might get changed if he gets a raise. Those have to be two separate classes, because they get changed under different circumstances. And you wind up with millions of tiny little classes, like the EmployeeSalary class, and it's just... (laughs) idiotic!

It seems clear that name and salary are both part of the same module. They are probably both part of the module that is commonly termed the “business layer”. There is then no need to create separate classes for these two properties. You should divide functionality along the layers/components of the problem domain, not per property, as Joel believes.

The linked article from Object Mentor specifically mentions an example where modules for geometry and graphics are mixed in the same class. The advice is to move the graphics code into its own module. The new graphics module will depend on the geometry module, so changes in the graphics module will not affect the geometry layer. The article nowhere asks for the structure Joel describes.

Joel reads the article in the context of C++, without reflecting on the progress that has been made in the tools he uses himself. C# and Visual Basic both support layering code into modules without making "millions of tiny little classes". In C# the construct is named "extension methods", and the same technique has been used to build modules in Smalltalk projects since the mid 1990s. A module simply adds methods to existing classes that live in other modules.
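To make the idea concrete, here is a minimal sketch in Python (not C# or Smalltalk) of a graphics layer adding a method to a class owned by a geometry layer. The names Rectangle, area and draw are made up for illustration, and attaching a function to an existing class is only a rough analogue of an extension method or a Smalltalk class extension.

```python
# --- geometry layer: knows nothing about drawing ---
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


# --- graphics layer: depends on geometry, never the other way round ---
def draw(self):
    """Render the rectangle as rows of '#' characters."""
    return "\n".join("#" * self.width for _ in range(self.height))


# Attach the method to the existing class instead of creating a new class.
# (Hypothetical example; this is plain attribute assignment in Python.)
Rectangle.draw = draw
```

The geometry layer can be loaded and tested without the graphics layer, and no extra RectangleDrawer class was needed.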

When the layering is not done per property and tools support layering without making new classes, following the SOLID principles is not … “idiotic”. You do not end up with a lot more code. But you get more control over how the various parts of your system interact.

Smalltalk

Supporting Multiple CPU Cores in VisualWorks

July 29, 2008 16:28:52 EDT

The VisualWorks Roadmap says Cincom will research how to make it “(…) easier to leverage multi-core computer”. Cincom will follow a share-nothing approach for multi-core support in VisualWorks. Their approach will therefore probably be to add better support for running multiple images.

But how should spawning new images happen? In a new OS thread per image? How many images will developers spawn, and when will it be done? Per task you need to parallelize, or during startup of the application?

As Anwar Ghuloum of Intel explains, there are two directions when scaling for multiple cores. I think these directions can be summarized as:

  • Parallelize the code to match the current hardware. This typically means spawning threads to match the number of cores you have at hand. You parallelize your code into n jobs in order to keep n cores busy.
  • Parallelize the code to match the domain problem. Erlang-style programming follows this approach. This style results in process counts far larger than the number of cores on today’s CPUs.

Using the match-the-hardware approach means you target hardware that soon becomes dated. Anwar Ghuloum recommends programming “(…) for as many cores as possible, even if it is more cores than are currently in shipping products.” This means choosing the match-the-domain approach, which is what Cincom should try to support in VisualWorks.

A problem with the match-the-domain approach, when every process is backed by its own OS thread, is that spawning many OS threads bogs down the operating system. Erlang solves this nicely by using lightweight VM (Virtual Machine) threads that are executed using a pool of OS threads. The number of OS threads matches the number of cores.
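The scheduling structure described above can be sketched in Python as many domain-sized tasks multiplexed onto a pool of OS threads sized to the core count. This is only an illustration of the shape: simulate_account and run_domain_tasks are hypothetical names, and CPython threads do not execute Python bytecode in parallel, so the sketch shows the task-to-thread mapping rather than a real speed-up.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor


def simulate_account(account_id):
    # Hypothetical domain-sized unit of work; reports which OS thread ran it.
    return account_id, threading.get_ident()


def run_domain_tasks(task_count):
    # One OS thread per core, regardless of how many tasks the domain needs.
    pool_size = os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        results = list(pool.map(simulate_account, range(task_count)))
    threads_used = {ident for _, ident in results}
    return pool_size, len(results), len(threads_used)
```

Calling run_domain_tasks(1000) runs a thousand tasks while never using more OS threads than the machine has cores.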

VisualWorks should have support for letting multiple images share all static data. This would for example include all Smalltalk byte code. Letting each image consume several megabytes is too much. The question is whether the match-the-domain approach is feasible in VisualWorks at all. Without some smart work by Cincom, starting a new image will be many times as expensive as starting a new OS thread. Developers will then need to use the match-the-hardware approach.

Cincom should look at running multiple instances of a (single) image from one VM. All images should share their static data. The VM would use a pool of OS threads (possibly matching the number of cores), and schedule the execution of the images using its own algorithm. Starting a new image should consume fewer resources than starting an OS thread. Erlang consumes around 1200 bytes per process. Maybe this is an impossible goal for Smalltalk. Basically, Cincom needs to look at Erlang and see if there are lessons to learn.

As I wrote earlier, there already exists a solution to scale on multiple cores using Squeak. The Hydra VM basically eases start-up and communication between multiple images running in parallel. In Hydra, each image uses its own memory (no sharing of static data), and one OS thread. Cincom should aim for an approach similar to Hydra’s, but needs to improve it.

Smalltalk

New Avi Bryant Video Interview.

July 24, 2008 13:12:11 EDT

Smalltalk

Using the Croquet Hydra VM to Scale on Multiple CPU Cores

July 02, 2008 18:53:24 EDT

After spending some time playing with Erlang, I decided it was time to test running Smalltalk on multiple cores too.

"Hydra VM is a virtual machine capable of running multiple Croquet images side-by-side, therefore being able to effectively utilize multi-core CPUs."

To test Hydra on Windows, follow these steps:

A workspace explains how to save a headless copy of the current image. After saving it you can start the headless image by evaluating:

HydraVM loadAndRunNewImage: fileNameOfImage.

To evaluate some code (asynchronously) on the started image, evaluate:

HydraVM doIt: 'Transcript show: ''Doit Test''' at: indexOfStartedImage.

I added a load stress class before saving the headless image, started multiple copies of the image, and sent the instruction to start the load test to each image. All worker images ran the stress code in parallel.

Now, this way of using images has a lot of similarities to using processes in Erlang. You fork off another process/image in your code, keep a reference to it and send it asynchronous messages. I can easily see a lot of Erlang’s infrastructure being added at the Smalltalk level, and then you have something that works well for a lot of heavy computational problems.

Hydra does however have some aspects which leave it well short of Erlang’s approach to scaling on multiple cores. When running a large number of Erlang processes, the VM will not create one OS thread per Erlang process. Instead, it will create one OS thread per CPU core, and use its own (green-thread-like) scheduler to run the processes on the (relatively small number of) OS threads. Hydra, on the other hand, will use one (or two -- I am not sure) OS threads per Smalltalk image started. The disadvantage of this method is that operating systems cope poorly with a large number of concurrently running OS threads. Erlang programs might start thousands of processes, but this is out of the question with Hydra’s approach.

Another aspect that prevents starting many images is the fact that each image will consume a lot of memory. We are talking about several megabytes, while an Erlang process starts at roughly a kilobyte. I think this might be solvable by letting images share their static parts, but that might take a lot of work at the VM level to support.

I really think the Hydra VM is a great initiative. It shows how Smalltalk can scale on multiple CPU cores, using the only sane method to parallelize a problem (non-shared memory). Even if Hydra has similarities to the Erlang approach, the coding style will be different. While Erlang emphasizes using processes as a dynamic part of the program, the Hydra approach will probably be to create a few worker images when your program starts and let the main image send jobs to these images. You do not want to start and stop these images dynamically as your program executes. That will be too expensive.
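The worker-image style can be sketched as follows, in Python rather than Smalltalk. Threads with private inbox queues stand in for headless images here, and nothing in the sketch is HydraVM API: worker_loop and run_jobs are hypothetical names, and the "job" is simply squaring a number. The point is the shape: a fixed set of workers is started once, the main side sends asynchronous messages, and no state is shared apart from the message queues.

```python
import queue
import threading


def worker_loop(inbox, outbox):
    # Each worker owns its inbox and only communicates through messages.
    while True:
        job = inbox.get()
        if job is None:          # sentinel: no more jobs for this worker
            break
        outbox.put((job, job * job))


def run_jobs(jobs, worker_count=3):
    outbox = queue.Queue()
    inboxes = [queue.Queue() for _ in range(worker_count)]
    # Start the fixed pool of workers up front, as at program start-up.
    workers = [threading.Thread(target=worker_loop, args=(inbox, outbox))
               for inbox in inboxes]
    for worker in workers:
        worker.start()
    # The main side hands out jobs round-robin without waiting for answers.
    for index, job in enumerate(jobs):
        inboxes[index % worker_count].put(job)
    for inbox in inboxes:
        inbox.put(None)
    for worker in workers:
        worker.join()
    # All workers have finished, so the outbox can be drained safely.
    results = {}
    while not outbox.empty():
        job, answer = outbox.get()
        results[job] = answer
    return results
```

For example, run_jobs([1, 2, 3, 4, 5]) collects the squares of the five numbers from three long-lived workers.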

Smalltalk

Helpful Comments

April 12, 2008 15:43:52 EDT

Here is the comment for CompositePart>>initialize

"Initialize the receiver."

How helpful! We could auto-document all code by getting the name of the selector and appending "the receiver".

If your comment does not add any useful information, drop it.

Smalltalk

Use accessors to access private instance variables

April 12, 2008 15:32:28 EDT

I need to subclass CompositePart and change the way it stores its components. I basically want to hold a predefined set of named components and answer those when a client of the class sends it #components.

Of course CompositePart chooses to access its instance variable #components directly, instead of going through an accessor. Had it decided to always use the accessor, I could have overridden a single method. Now I have around 15 methods I would need to change. Better look for another way to solve this problem…

If you want to ease white-box reuse, use accessors to access your private instance variables.
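A small sketch of the difference, in Python rather than Smalltalk. The classes below are hypothetical stand-ins, not the real VisualWorks CompositePart: the base class reads its components only through the accessor, so a subclass can change the storage by overriding that single method.

```python
class CompositePart:
    def __init__(self, components=()):
        self._components = list(components)

    def components(self):
        # The only place that touches the instance variable directly.
        return self._components

    def component_count(self):
        return len(self.components())

    def describe(self):
        return ", ".join(str(c) for c in self.components())


class NamedCompositePart(CompositePart):
    """Holds a predefined set of named components instead of a list."""

    def __init__(self, named_components):
        super().__init__()
        self._named = dict(named_components)

    def components(self):
        # One override is enough; the inherited methods pick it up.
        return list(self._named.values())
```

With NamedCompositePart({"header": "Label", "body": "TextEditor"}), the inherited component_count and describe work unchanged on the new storage.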

development

Presentation of the Office 2007 UI design process

April 11, 2008 17:16:57 EDT

Jensen Harris is one of the designers behind the new Office 2007 UI, and has a blog which is filled with interesting details on how Microsoft ended up with this advancement in Office.

A video of a presentation on the design process behind the Office 2007 UI is now available. The video sums up some of the details from the earlier blog posts, but also shows previously unpublished information.

Gui Frameworks

Do you need "native" widgets?

April 11, 2008 17:03:29 EDT

…probably not. Here’s a post that sums up the various Microsoft applications that ignore their own “native” widgets.