Caching is not agile
October 21, 2003, 1:25:04 am

One thing you'll hear programming experts say over and over again is that premature optimisation is bad. Why is it so bad? Is it because it's harder to change the code? If that's the reason, then you're frelled as soon as you decide you're not being premature any more.

Does that sum it up - you never want to optimise? Well, it depends on the kind of optimisation you're doing. If you decide to use instance variables to cache values so that you don't have to recalculate them, then yes, you're going to have trouble.

Nonsense, you say! Well, I'll be happy to argue this one out with you. As soon as you decide to cache a value, you suddenly need to know all its dependencies: not only who calls it and when, but what conditions can change it, and what conditions set in motion other conditions that can change it, and so on.
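To make the dependency problem concrete, here is a minimal sketch in Python (standing in for Smalltalk). The Invoice and LineItem classes are hypothetical, invented purely for illustration. The cache is invalidated for the obvious dependency, adding an item, but not for the less obvious one, an item changing after it was added.

```python
class LineItem:
    def __init__(self, price, quantity):
        self.price = price
        self.quantity = quantity


class Invoice:
    def __init__(self):
        self.items = []
        self._total = None  # cached value; None means "not computed yet"

    def add(self, item):
        self.items.append(item)
        self._total = None  # the obvious dependency: the item list changed

    def total(self):
        if self._total is None:
            self._total = sum(i.price * i.quantity for i in self.items)
        return self._total


invoice = Invoice()
widget = LineItem(price=10, quantity=2)
invoice.add(widget)
first = invoice.total()   # computes and caches the total

widget.price = 15         # a dependency the cache knows nothing about
stale = invoice.total()   # served from the cache, so the price change is missed
fresh = sum(i.price * i.quantity for i in invoice.items)  # what it should be
```

Here stale and fresh disagree, and nothing in Invoice looks wrong when read on its own. The bug lives in the relationship between two classes.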

Even debugging can cause your value to be computed at a bad moment, which suddenly makes your program run one way, not the way you were trying to debug.

Is there a solution to this? Well, certainly. Firstly, you can optimise your program by changing algorithms. An algorithm is a design decision, not an optimisation trick. Secondly, you might consider using a programming language that will cache values for you. Functional programming languages do this by knowing what causes side-effects and what doesn't.

Could the latter approach be adopted in Smalltalk? Well, it depends. If you go down the side-effects path, you quickly realise that you must look at the state of all instance variables, and potentially the state of the objects they refer to, to be sure you have enough context to say that the cached answer you are about to return is the correct one.

The alternative is to give the system hints as to what instance variables matter - but this will take us right back to the caching problem.
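As a rough sketch of what such hints could look like, here is a Python decorator that caches a method's result and recomputes it whenever one of the named instance variables changes. The name cached_on and the Rectangle class are hypothetical, made up for this illustration. They are not an existing library facility.

```python
import functools


def cached_on(*attr_names):
    """Cache a no-argument method; recompute when a named attribute changes."""
    def decorate(method):
        slot = "_cache_" + method.__name__  # per-instance storage for the cache

        @functools.wraps(method)
        def wrapper(self):
            # The "hint": a snapshot of only the attributes the programmer listed.
            key = tuple(getattr(self, name) for name in attr_names)
            cached = self.__dict__.get(slot)
            if cached is None or cached[0] != key:
                cached = (key, method(self))
                self.__dict__[slot] = cached
            return cached[1]
        return wrapper
    return decorate


class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.computations = 0  # counts real calculations, for demonstration

    @cached_on("width", "height")
    def area(self):
        self.computations += 1
        return self.width * self.height


r = Rectangle(2, 3)
first = r.area()
second = r.area()            # served from the cache
r.width = 5
third = r.area()             # width changed, so the area is recomputed
```

The weakness is the one described above. The list of names has to be complete, and the comparison is shallow, so an object that changes internally while still comparing equal will not trigger a recalculation.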

Does anyone have any smart ideas on adding a meta-facility to Smalltalk for caching optimisations, so that programmers don't pollute their programs with premature optimisations?

By loser on October 21, 2003, 8:11:56 am

I know that a fairly common optimization in Lisp is to memoize a function. You don't save the actual values in instance variables but in the function itself, so you have these advantages:
- you can add a memoized function later in development, but leave the interface identical to the original
- you know which variables you need to track: they're just the variables you use as parameters to the function
- the cache is local to the function, so encapsulation is safe
I'm not really a good programmer. I know this can be done really easily in Ruby, and thus I think it can be done in Smalltalk too, but possibly I misunderstood your post. In that case, would you mind explaining to me why a memoize() approach is wrong?
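A minimal sketch of the memoize idea, written in Python rather than Lisp or Ruby. The memoize decorator and slow_square function are illustrative names. Python's standard library offers functools.lru_cache for the same purpose.

```python
import functools


def memoize(func):
    """Wrap func so repeated calls with the same arguments reuse the result."""
    cache = {}  # lives in the wrapper's closure, not in any instance variable

    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    return wrapper


calls = []  # records each real computation, for demonstration only


@memoize
def slow_square(n):
    calls.append(n)
    return n * n


a = slow_square(12)   # computed
b = slow_square(12)   # looked up; slow_square's body does not run again
```

The caller's interface is unchanged, and the only dependencies are the arguments. This only holds if the function has no side effects and reads no state other than its (hashable) arguments. A method that consults instance variables is back to the dependency problem from the post.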

By anthony on October 21, 2003, 9:08:57 am

In the first place, you're taking a fairly general aphorism, "don't optimize prematurely," and giving it a precise - and somewhat arbitrary - meaning, "don't cache variables". What if the meaning is "Don't spend all your time making a half-finished program run twice as fast"?

In the second place, caching objects in instance variables is fine because objects do not expose instance variables: they expose methods. They expose behaviour. Encapsulating behaviour funnels all outside access through a small, well-defined, easy-to-understand set of methods. Exposing behaviour and hiding implementation frees you, the developer, to build algorithms however you like, knowing that they are safely hidden behind a well-defined API. The real error is providing get/set methods, which break encapsulation by exposing internal state.
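A small sketch of that argument, in Python with a hypothetical Circle class. The state is only reachable through methods that express behaviour, so every place the cached value can go stale is inside the class itself.

```python
import math


class Circle:
    def __init__(self, radius):
        self._radius = radius   # internal state, no getter or setter
        self._area = None       # cached value; None means "not computed yet"

    def scale(self, factor):
        """Behaviour, not a setter: the only way the radius ever changes."""
        self._radius *= factor
        self._area = None       # invalidate the cache in the same place

    def area(self):
        if self._area is None:
            self._area = math.pi * self._radius ** 2
        return self._area


c = Circle(1.0)
before = c.area()   # computed and cached
c.scale(2.0)        # changes state and drops the cached area
after = c.area()    # recomputed from the new radius
```

This works because the cached value depends only on state the object owns. If the radius were a mutable object handed in from outside, the class could no longer see every change.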