Refactoring and types
Jim Robertson's blog had a discussion recently about manifest typing and refactoring support. It got rapidly out of hand, but the basic point was interesting. Essentially, given static type information, a refactoring engine can work better. In particular, it's possible to rename methods (with Integer + as the example) which you couldn't in a dynamically typed system because you can't tell which sends are to integers and which to other things.
This is true, but more complicated than it first appears. It's not that the dynamically typed system doesn't have enough information to do the refactoring, but rather that renaming a method implemented by multiple classes incompatibly is not necessarily a behaviour-preserving transformation in a dynamically typed system. If there weren't enough information, then the refactoring engine would just have trouble finding which call sites needed to be modified. But we can easily have call sites that can't be automatically modified at all. Consider:
frobnicate: something
    ^something + 7
The parameter might be an integer, but it might not. I believe that the same issue arises in a statically typed system with type inference, or in a manifestly typed system with templates. Suppose that we created a templated type SummableList<T>. This is just a parameterized collection with a sum operation that returns the sum of all the items in it. Now if we want to rename the + operation, do we change the send of + in our class or not?
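The ambiguity can be sketched as follows, using Python as a stand-in for a dynamically typed language (the original example is Smalltalk). The names frobnicate and SummableList follow the text; the bodies are illustrative assumptions, not code from the original discussion. In both cases a single + in the source is reached by integers and by other things, so a tool renaming only the integer + has no single correct edit to make.

```python
from typing import Generic, Iterable, List, TypeVar

T = TypeVar("T")


def frobnicate(something):
    # The single `+` here dispatches on whatever `something` turns out to be.
    return something + 7


class SummableList(Generic[T]):
    """Illustrative parameterized collection with a sum operation."""

    def __init__(self, items: Iterable[T]) -> None:
        self.items: List[T] = list(items)

    def sum(self) -> T:
        # One `+` in the source text, used for every element type T.
        if not self.items:
            raise ValueError("cannot sum an empty list")
        total = self.items[0]
        for item in self.items[1:]:
            total = total + item
        return total
```

Calling SummableList([1, 2, 3]).sum() and SummableList(["a", "b"]).sum() both go through the same line of source, which is exactly the line a rename of the integer + would have to either change or leave alone.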
More generally, a dynamically typed program will have less information about itself than one with manifest typing (although I presume it would be the same as a statically typed program with type inference). That's part of the point - you don't have to specify as much. At least as far as Smalltalk is concerned, this is part of the reason it has development traditions like test-driven design and unit testing (and is where the original refactoring tools came from, as well as the original xUnit). These kinds of measures help ensure correctness when the program changes, while preserving the flexibility, terseness, and other advantages of not specifying types up-front. As well, keyword messages make name collisions that much less likely.
But the nice thing about many of these techniques is that they don't just affect behaviour-preserving transformations. In my opinion, there's too much emphasis on that. While refactoring is wonderful, it's my considered opinion that to be really productive in development, you will eventually have to do something that changes the behaviour of the program. So it's not just the ability to make changes that don't affect behaviour, it's the extent to which changes in one place propagate through the rest of the program. Do they damp out quickly, or do they force lots of other code to be changed? Examples of things that I'd consider to propagate are Java's checked exceptions in method signatures and C++ const. Dynamic typing is an example of something that helps damp out changes.

Comments
Exploring Type Safety in Smalltalk
[Peter William Lount] September 14, 2005 11:15:20.634
Inspired by Alan's thoughts I wrote the article Exploring Types in Smalltalk where I demonstrate that Smalltalk has always had just as much type safety capability as any static or typed language or system. The difference: Smalltalk gives you the control over when to use these "type safety" capabilities while languages like C, Java, C++, C# and others force you to over-constrain your programs into brittle contraptions. Smalltalk respects your judgement while these other systems treat you as an immature kid. If you want true control over your programs choose Smalltalk.
Exploring Type Safety in Smalltalk - link correction
[Peter William Lount] September 14, 2005 11:23:55.442
The corrected link is: Exploring Types in Smalltalk
This Cincom blog does weird things with its "Do Custom Markup?" check box, like messing up my link and altering the format of what I've typed. It changed the underscores into weird codes. Ish. Also it has problems with cookies where it says that I need to enable cookies when they are enabled. This prevented me from adding this new comment until I deleted the cookies! Double ish.
Types, Refactoring, and Metadata
[James Robertson] September 14, 2005 12:11:55.053
Trackback from Smalltalk Tidbits, Industry Rants
Types, Refactoring, and Metadata by James Robertson
After the long-winded discussion of refactoring here, Alan added some light to the heat over here. Today, Peter William Lount adds some more:
Read the whole thing.
instead, presume types were inferred
[Isaac Gouy] September 14, 2005 14:43:48.332
Alan wrote: a dynamically typed program will have less information about itself than one with manifest typing (although I presume it would be the same as a statically typed program with type inference)
(I take for granted that you know) Smalltalk refactoring and rewrite works on ASTs not Smalltalk source code.
If we take a similar step and type check the "statically typed program with type inference" then I presume we'll either successfully infer type information or see a compiler error.
After this static type check, we have type information that was not explicit in the source code - we have information that we don't have about the "dynamically typed program" ASTs.
[isomer] September 14, 2005 15:27:15.330
Isaac,
Can you give a concrete example of how you will have more information in the AST from a statically typed program with type inference than from a dynamically typed one? I can't think of how.
Type Inference in Dynamic Systems
[Peter William Lount] September 14, 2005 16:18:31.770
In response to Isaac's comment I've written an in-depth article Dynamic Runtime Type-Object-Class Inference.
The key point being that static programs are not the sole beneficiaries of "type inference". In fact, Smalltalk and Self had it long ago. Type (or class) inferencing techniques for Smalltalk were experimented with at Xerox PARC and have been incorporated in the Self language's virtual machine since its inception.
It's hard to have a discussion without the basic facts being correctly stated.
more information in the AST
[Isaac Gouy] September 14, 2005 16:32:35.577
isomer wrote: more information in the AST
That is not what I wrote.
You'll notice that back in 2001 Eclipse gathered a variety of information for refactoring - including declared types, methods and fields. In a language without explicit declarations, can you think of how we can still gather type information?
Nice?
[Nice] September 14, 2005 17:02:23.766
Maybe Peter would be interested in the Nice language ;-)
Being Nice?
[Peter William Lount] September 14, 2005 17:14:16.142
Thanks, sure Nice is, well, Nice but nice? Not-so-much.
Nice uses variable, parameter and return type declarations. Not-so-nice and not needed! When will you guys learn that?
Nice uses the round parenthesis "function" style naming convention so it's not a "literate" programming language and thus not nice.
Smalltalk's keywords assist with "literate" programming style. In these and many other respects Smalltalk is much nicer than Nice, and the language that I'm working on, Zoku, will be nicer than Smalltalk.
ps. It would also be nice to know with whom I'm interacting.
Collecting Dynamic Type Information
[Peter William Lount] September 14, 2005 17:44:01.662
Isaac, I can't believe my eyes, you really asked "In a language without explicit declarations, can you think of how we can still gather type information?" Wow! This question seems to make it clear that you haven't understood the power available to dynamic systems, and it's no wonder your comments are often off target as a result.
In a dynamic system type information can simply be collected at runtime as the program executes, as messages are sent, as variables are set and objects returned from methods. All of these "events" can inform the metadata collection system about the types in the dynamic system. An integrated development system can make use of this information in real time as the program executes or later on for many purposes including refactoring, optimizations (à la Self), documentation, unit test case generation/confirmation, and many more awesome ideas.
I'm glad that you asked this fundamental question. I hope it helps clarify dynamic systems for you. I hope it shows why you don't need type declarations in a program at all to be able to "infer the types" of variables, parameters, return values, interfaces, etc...
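The runtime collection described above might look something like this minimal sketch, written in Python rather than Smalltalk. The observed_types registry and the record_types decorator are hypothetical names invented for illustration; they are not part of any Smalltalk or Self tooling mentioned in this thread.

```python
import functools
from collections import defaultdict

# Hypothetical registry: function name -> set of
# (argument type names, return type name) pairs seen so far.
observed_types = defaultdict(set)


def record_types(func):
    """Record the classes of arguments and results each time func is called."""

    @functools.wraps(func)
    def wrapper(*args):
        result = func(*args)
        signature = (
            tuple(type(arg).__name__ for arg in args),
            type(result).__name__,
        )
        observed_types[func.__name__].add(signature)
        return result

    return wrapper


@record_types
def frobnicate(something):
    return something + 7


# Exercising the program is what populates the registry.
frobnicate(1)
frobnicate(2.5)
```

A tool could consult observed_types to see which kinds of arguments have reached frobnicate in the runs so far. It says nothing about code paths that were never executed.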
call sites that can't be automatically modified
[Isaac Gouy] September 14, 2005 18:00:34.925
Alan wrote call sites that can't be automatically modified
The obvious refactoring issue is that for some reason we aren't able to be specific enough - that's the case for polymorphic methods at simple call-sites (when we don't have type information), and that may also be the case for polymorphic methods at polymorphic call-sites (when we have subtypes, even if we have type information).
afaict refactoring needs more information than type-checking.
instead, presume types were inferred
[ Alan Knight] September 14, 2005 18:36:23.311
The problem, as I said originally, is not lack of information, but different semantics. As with templates, the source code is not limited as to which particular interfaces it is using operations from. So, from my understanding, with type inference you essentially get multiple compiled representations specialized for particular types from a single piece of source code, depending on the possible types at the call site. Suppose we define a function f(a,b) to return a + b, and call it as f(1,2) and f(1.0,2.0). That will work just fine, and the appropriate machine-level code will be invoked. Now suppose we want to rename the + operation to something else: what is the appropriate behaviour-preserving transformation to apply to our definition of f?
renaming -> polymorphism
[keith ray] September 14, 2005 18:50:24.045
The method-renaming problem is a problem of polymorphism that can occur in statically typed languages as well as in Smalltalk and other non-static languages... Smalltalk makes it more obvious because polymorphism isn't limited to subclasses of a base class or to classes that are declared to implement various named interfaces.
In a static language, if you rename method "foo" in one class, but not in parent classes / sibling classes / interfaces, and code could be referencing instances of that class through a parent-typed / interface-typed pointer, you don't know if the name-change should apply.
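A small sketch of that situation, in Python with type annotations standing in for a static language. Base, Child, and call_through_parent are hypothetical names; the sketch shows the state of the code after foo has been renamed to bar in Child only.

```python
class Base:
    def foo(self) -> str:
        return "Base.foo"


class Child(Base):
    # Before the rename this method was named `foo` and overrode Base.foo.
    def bar(self) -> str:
        return "Child.foo"


def call_through_parent(obj: Base) -> str:
    # The call site only knows obj as a Base, so a tool cannot tell from
    # the declared type whether this call should have become obj.bar().
    return obj.foo()
```

With the call site left alone, call_through_parent(Child()) now runs Base.foo instead of the behaviour Child used to supply, so the rename was not behaviour-preserving. Rewriting the call site to obj.bar() would instead break every other Base.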
appropriate behaviour-preserving transformation
[Isaac Gouy] September 14, 2005 20:21:37.406
There are several possibilities, and I meant that we did not have enough 'information' (the programmer's intent) to choose between them automatically.
Did we intend for all + operations to be renamed? f(a,b)= a plus b
Did we intend for Integer + operations to be renamed? f(a,b)= a + b and f(a,b)= a plus b
Did we intend...
Are we talking about the same thing?
appropriate behaviour-preserving transformation
[ Alan Knight] September 14, 2005 21:16:10.416
Well, what I was replying to was your mention that with type inference, after inferring types we have information in the AST that was not explicit in the source code. I was saying that this information is not helpful in the case I was describing. The programmer's intent in the example we were using was to rename Integer + operations, but not other + operations. Thus, merely changing the source code of our function is not a behaviour-preserving transformation - we would have to split it into multiple functions. This is the same sort of issue that arises with templates. It does not arise in a manifestly typed system, because we would not have been able to write "f" as a single function in the first place unless we had defined an interface to which all numerics conformed.
I don't know if we are talking about the same thing or not. I'm certainly not following your connections.
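To make the split concrete, here is a rough sketch in Python. Python's built-in int cannot have its + renamed, so Int and Real are hypothetical wrapper classes standing in for Integer and Float, and f_int and f_other are invented names for the two halves of the original f.

```python
class Int:
    """Illustrative integer wrapper whose `+` has been renamed to `plus`."""

    def __init__(self, value):
        self.value = value

    def plus(self, other):
        return Int(self.value + other.value)


class Real:
    """Illustrative float wrapper that keeps the original `+`."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        return Real(self.value + other.value)


def f_int(a, b):
    # The specialization of f for the renamed operation.
    return a.plus(b)


def f_other(a, b):
    # The specialization of f for every receiver that still uses `+`.
    return a + b


def f(a, b):
    # Existing callers of f keep working only if f dispatches on the receiver.
    if isinstance(a, Int):
        return f_int(a, b)
    return f_other(a, b)
```

The single-line body of f has become two functions plus a dispatch, which is a much larger change than a rename.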
yes, we would have to split it
[Isaac Gouy] September 15, 2005 2:26:48.870
Alan wrote: we would have to split it into multiple functions
Yes, that was my understanding as well.
However, as Keith Ray said, it's an issue with polymorphism and not something to do with type inference. To rephrase what you said - this issue does arise in a manifestly typed system, when we can define subtypes or interfaces (subtype polymorphism) or generic functions (parametric polymorphism).
[] September 15, 2005 11:34:29.938
Why I Hate Advocacy