
patterns

Using a pattern language to design a system

June 14, 2009 16:10:12 EDT

One of the questions at ParaPLoP was "If patterns are the vertices of a graph, what are the edges?  How do we know how to go from one pattern to the next when we are using a pattern language?"

These are not the same question.  An arrow from vertex A to vertex B does not mean that using A requires you to use B.  Instead, it means that if you use A, B is a likely candidate.  The edges make it easier to use the pattern language, but they do not force you to use it a certain way.  Each pattern has preconditions.  It makes tradeoffs.  It solves some problems but makes other things worse.  If the thing it makes worse is unimportant to you then the pattern is "good".  If the thing it makes worse is valuable then the pattern is "bad".  Of course, the pattern itself is not good or bad; it is a particular use that is good or bad.

A vertex in a pattern language is a document that describes a pattern.  There is an edge from one vertex to another if the first pattern mentions the second, especially if it mentions the second as helping to complete the pattern, or as a preferred alternative in certain conditions.  Thus, if you think the first pattern is useful, you ought to consider the second, as well.
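To make the graph picture concrete, here is a minimal sketch in Python.  The three pattern names and the `Pattern` and `candidates` names are made up for illustration; they are not taken from any real pattern language.  An edge is just a mention, and following it yields suggestions, not obligations.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Pattern:
    """One vertex: a document that describes a pattern."""
    name: str
    # Outgoing edges: the other patterns this document mentions.
    mentions: List[str] = field(default_factory=list)


def candidates(language: Dict[str, Pattern], used: str) -> List[str]:
    """Patterns worth considering after `used`; suggestions, not requirements."""
    return [name for name in language[used].mentions if name in language]


# Hypothetical three-pattern language, for illustration only.
language = {
    "Pipeline": Pattern("Pipeline", mentions=["Task Queue", "Load Balancing"]),
    "Task Queue": Pattern("Task Queue", mentions=["Load Balancing"]),
    "Load Balancing": Pattern("Load Balancing"),
}

print(candidates(language, "Pipeline"))  # ['Task Queue', 'Load Balancing']
```

Nothing in the sketch forces you to follow an edge.  Whether a candidate is worth applying still depends on its preconditions and tradeoffs, which live in the pattern document, not in the graph.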

When you are using a pattern language, you are always using it to solve some problem, to design a system.  The system might be big and already use a lot of patterns.  Or, it might just be a list of requirements.  It doesn't matter; in both cases you have a system that you are designing and a set of problems that you want to solve.  You choose a pattern that you think will help improve your system.  This pattern ought to solve one or more of the problems, or at least transform them into potentially simpler problems.  After you use the pattern, you get a new version of the system, which has a somewhat different set of problems.  A good description of the pattern will warn you about the new problems that it produces, but it will always be up to you to examine the new system to see whether you can tolerate the problems or whether you should try another pattern.  In fact, you often have to choose among a set of patterns, and often the best way to do this is to try out each one and see which one gives the best results.  Sometimes you have to work with a system a while to discover that "inflexible and hard to change" is very bad for you, and so you have to backtrack and change the pattern that produced that quality.

A pattern should tell you when you can apply it.  You shouldn't need an extra set of instructions for "following an edge".  If the readers of your pattern can't figure out how to apply it properly, you misjudged your audience.  Perhaps the pattern needs more examples, or needs more detail.  Talk to a few readers, and try to figure out what they missed.  Revise the pattern to fix the problem, and try it on a few more people.  Communicating effectively requires feedback.  Good communicators keep revising until they get it right.

patterns

ParaPLoP

June 14, 2009 06:26:48 EDT

Last week I was in Santa Cruz at a workshop on patterns for parallel programming.  It was organized by Tim Mattson, one of the three authors of "Patterns for Parallel Programming", Kurt Keutzer of Berkeley, and me. 

Kurt has been heading up an effort at Berkeley to develop a pattern language of parallel programming.  They have a wiki that holds their patterns.  The upper left corner of that page has a link called "Our Pattern Language" that points to a paper that gives an overview of the language.  The top patterns provide the context for parallelism; they consist of "structural" patterns, which are what the POSA books call "architectural" patterns, and "computational" patterns, which are about algorithms.  The middle patterns are mostly taken from "Patterns for Parallel Programming".  They are the first patterns that are only about parallelism.  The bottom patterns are being worked out.

ParaPLoP was organized around writers workshops.  You can see the papers (and groups).  There were two writers workshops each day, but we mixed things up a bit between the first day and the second day so we would get to meet more people.

Nearly half the people at ParaPLoP were from Berkeley.  So, we talked a fair bit about their patterns.  I've been working with the Berkeley group since fall, so it was more of an ongoing conversation to me, though the patterns were new to others.  The Berkeley group has put the most effort into their computational patterns.  I agree that these are important patterns for parallel programming, since parallel programming, unlike object-oriented programming, has a big impact on the algorithms that you use.  However, I think that most of the "computational patterns" are too complex to be thought of as a single pattern.  Instead, it is better to think of them as small pattern languages.  My paper on "N-Body Pattern Language" illustrates this for one of these computational patterns.

I learned two important things at ParaPLoP.  First, I got a much better idea of what patterns are at the bottom of the pattern language for parallel programming.  I used to think it would be language primitives, but it isn't.  I think the bottom of the pattern language is optimization patterns, such as patterns for optimizing memory access, overlapping computation with communication, and load balancing.  The main reason we want to write parallel programs is that we want our programs to be faster than if we just wrote sequential programs.  (Sometimes we write parallel programs because the problem we are solving is parallel, but not usually.)  So, we have to worry about performance, and at the moment that means load balancing and ensuring most data is in the cache.  People often think these are not as important as increasing the number of threads and preventing threads from interfering with each other, but if you care about performance then they are just as important.  Two papers illustrated these lower-level patterns, "Parallel execution-aware data structures" and "patterns for overlapping communication and computation".  At the moment, there is no place for these patterns on the Berkeley website, but I am sure they will find a place.

The other thing I learned is that people new to patterns have a lot of questions that I know how to answer.  So, I am planning to answer some of them here in the coming weeks.

 

general

Guy Steele at OOPSLA 1999

May 17, 2009 07:10:16 EDT

The most impressive computer science talk I have ever seen was Guy Steele's "Growing A Language" that he gave at OOPSLA back in 1999.  I just discovered a Google video of it here

 

The talk starts slow.  You wonder "is this computer science"?  Stick with it!  Not only is it computer science, it is brilliant art at the same time.

 

OOPSLA is a wonderful conference.  Not only does it have great computer science, but a lot of the OOPSLA leaders like to explore it in non-traditional ways.

general

Rules for designing frameworks

April 25, 2009 16:23:16 EDT

Jonathan Crossland has some good rules for designing frameworks.

general

Onward essays

February 16, 2009 08:52:30 EST

Onward! is a software conference focusing on revolutionary ideas that have not yet been proven enough to be presented at more traditional conferences.  It is colocated with OOPSLA, but the ideas do not have to be object-oriented.  If you have an idea that burns in your mind, but you just haven't been able to get others to see it, or you know that if you submit a paper about it to your favorite conference then it would be rejected as needing more work to be publishable, you should consider submitting it to Onward!  Onward! does not accept second-rate papers, but its criteria are different from other conferences.  The most important aspect of a paper is how important it would be if it were true.  You have to convince the program committee that your idea is plausible, but empirical proof is less important than in most of the top computer conferences.

http://onward-conference.org/

In addition to the regular papers, Onward! has an essay track.  Essays are different from papers.  Instead of describing an invention, an Onward! essay is a thoughtful reflection upon software-related technology. Its goal is to help the reader to share a new insight, engage with an argument, or wrestle with a dilemma.

A successful essay is a clear and compelling piece of writing that explores a topic important to the software community. The subject area should be interpreted broadly, including the relationship of software to human endeavours, or its philosophical, sociological, psychological, historical, or anthropological underpinnings. An essay can be an exploration of its topic, its impact, or the circumstances of its creation; it can present a personal view of what is, explore a terrain, or lead the reader in an act of discovery; it can be a philosophical digression or a deep analysis. It can describe a personal journey, perhaps that by which the author reached an understanding of such a topic. 

The deadline for Onward! Essays is 20 April 2009
To submit, see   http://onward-conference.org/calls/foressays

The best way to understand what an essay is like is to read some.  Here are a couple of (strongly-contrasting) past essays:
 * Dan Grossman "The transactional memory / garbage collection analogy"
       http://www.cs.washington.edu/homes/djg/papers/analogy_oopsla07.pdf
 * Dick Gabriel "Designed as designer"
       http://dreamsongs.org/DesignedAsDesigner.html

 

general

Erich Gamma on 7 years of Eclipse development

February 14, 2009 07:06:58 EST

Erich Gamma gave a talk at an InfoQ conference (perhaps JAOO) on lessons learned from Eclipse.  He talks about open source and architecture.  And Swiss villages, shipping software, and being proud of your software.  It is a great talk!  InfoQ runs a series of conferences around the world.  I have been to several of them and have always enjoyed them.  They don't publish all their talks, but they have a nice collection on their website.

Erich's talk is at http://www.infoq.com/presentations/Eclipse-Lessons-Erich-Gamma   I recommend it.

software engineering

Emergent design, and refactoring in large projects

December 06, 2008 00:27:29 EST

I bought a bunch of books at OOPSLA. Two of them are "Emergent Design" by Scott Bain, and "Refactoring in Large Projects" by Stefan Roock and Martin Lippert. Both are about refactoring and emergent design.

The first book is more motivational, a bit less technical (but still has lots of code), and easier to read. It describes how design patterns can make your system more flexible, but how refactoring is almost always necessary. It describes design principles necessary to create good design, and programming practices that make programs easier to understand and to change. It describes how testing fits in and makes refactoring practical. I enjoyed reading the book, will recommend it to people and loan my copy to students, but didn't learn anything new from it. Scott did a great job of describing the value of refactoring and design patterns. If you are not sure why these are important then the book will be useful to you. However, if you have been practicing them for some time then you will probably not learn more than better ways to argue your points.

"Refactoring in Large Projects" is a much more advanced book. It talks about architectural smells and how to fix them. It describes refactorings that take a long time to perform, and how to carry them out without any tool support and without driving other people crazy. It talks about tools, too, such as refactoring tools (which I use) and Sotograph, which seems very cool and which I will try to get. It has a chapter on refactoring relational databases, and a chapter on API refactorings. All the chapters are precise and have lots of detail. There are code examples. If you are new to refactoring then this book might be over your head, but if you have experience with refactoring then this book will expand your vision for what is possible.

Some people might think that these books are overselling the ability to change software. In fact, they are underselling it. Any kind of change is possible. You can change software to make it more secure, more safe, more reliable, and faster. You can change it to run faster on multicores. All change takes work. "Refactoring in Large Projects" shows the kind of bookkeeping and testing that is necessary to carry out large refactorings. But the biggest problem with most software systems is that their owners have let them get out of control, and have not practiced basic code maintenance. If you follow the practices taught by these books, you'll be able to have your code under control.

general

Erlang, the next Java

August 08, 2007 12:00:51 EDT

I was at ECOOP last week, though I didn't stay until the end. My favorite part was the talk by Joe Armstrong on Erlang, and talking with him afterwards. I met Joe last year and have been following Erlang ever since.

Erlang is going to be a very important language. It could be the next Java. Its main problem is that there is no big company behind it. Instead, it is being pushed as an open source project. Its main advantage is that it is perfectly suited for the multi-core, web services future. In fact, it is the ONLY mature, rock-solid language that is suitable for writing highly scalable systems to run on multicore machines.

Erlang started out over 20 years ago as a kind of concurrent Prolog. Joe Armstrong invented it, and he has been the main person pushing it. He was at Ericsson, the Swedish telecom company, and the first big Erlang project was an electronic switching system that was built by a few hundred people, who wrote a few million lines of code. The emphasis was always on reliability, not raw speed, and the system has an incredible reliability history. Joe claims they have achieved "nine 9's of reliability".

What does that mean, "nine 9's of reliability"? It means 1 second of downtime in 1 billion seconds, or 1 minute of downtime in 1 billion minutes. Now, a billion seconds is roughly 30 years. A billion minutes is roughly 2000 years. This system has been in production for ten years or more, but less than 15 (I think). They have sold hundreds of them, perhaps thousands, but I think hundreds. Two hundred systems at ten years apiece give 2000 years of operations, so they can say they have "nine 9's of reliability" if they have had less than 1 minute of downtime total for all the systems they have installed.

"Five 9's" is about five minutes of downtime in a year, and is pretty good. People who are fanatical about reliability want six 9s, or even seven 9s. It is unheard of to have nine 9s. But a system built in Erlang has done it.
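The arithmetic is easy to check.  This is a small Python sketch; the fleet size (two hundred systems for ten years each) is the rough guess from above, not a measured figure, and the function name is just for illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def allowed_downtime_seconds(nines: int, period_seconds: float) -> float:
    """Downtime permitted by `nines` nines of availability over a period."""
    return period_seconds * 10 ** (-nines)


# Five 9's over one year of operation, in minutes.
five = allowed_downtime_seconds(5, SECONDS_PER_YEAR) / 60
print(f"five 9's: {five:.2f} minutes per year")

# Nine 9's over an assumed fleet: two hundred systems, ten years each.
fleet_seconds = 200 * 10 * SECONDS_PER_YEAR
nine = allowed_downtime_seconds(9, fleet_seconds)
print(f"nine 9's: {nine:.1f} seconds across the fleet")
```

This gives about 5.26 minutes per year for five 9's, and about 63 seconds of total downtime across the whole assumed fleet for nine 9's, which matches the "one minute in 2000 years" estimate.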

That is not what is going to make Erlang big, though. Not enough people care about reliability. Another thing that is not going to make Erlang big is that "sequential Erlang" is a functional programming language. Or that "concurrent Erlang" is an object-oriented language. The thing that is going to make Erlang big is that it is the only mature language with a rock-solid implementation and good set of libraries that lets you write software that can scale seamlessly from a single processor system to a hundred processor system. In a few years, all our desktop systems and laptops will be multiprocessors, and the only way to make our applications run faster on them is to make them use multiple processors.

When you build a system in Erlang, you write a set of processes that communicate only by passing messages between them. There is no shared state; the only way to communicate with a process is to send messages to it. (Or, to have it write to the file system, which is cheating. Or call C code, which is also cheating. There are actually lots of ways to cheat, but we'll ignore them.) Unlike Java or Smalltalk, where you only write threads/processes when you want concurrency, Erlang programmers use processes for modularity, reliability, and reuse. Then they get concurrency for free. Go build your application on a single processor. In theory, you could write it all as a single process, but no decent Erlang programmer would do that. They are more likely to write it as thousands of processes. It doesn't hurt performance on a single processor and makes it likely that it will make good use of a multiprocessor. Then, put it on a ten processor system and make your system run ten times faster (probably eight or nine times faster, but that is still pretty good).
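Here is a rough sketch of that style.  It is written in Python, not Erlang, using a thread and two queues to stand in for a process and its mailboxes; the `counter` function and the message shapes are invented for illustration.  Python threads are far heavier than Erlang processes and can share memory, so this only shows the discipline, not the implementation: the state is private, and the only way to reach it is to send a message.

```python
import queue
import threading


def counter(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """A 'process': owns its state and reacts only to messages."""
    total = 0  # private state, never touched by any other thread
    while True:
        message = inbox.get()  # block until the next message arrives
        if message == "stop":
            # Forward the result to whoever reads the outbox, then exit.
            outbox.put(("total", total))
            return
        total += message


inbox, outbox = queue.Queue(), queue.Queue()
worker = threading.Thread(target=counter, args=(inbox, outbox))
worker.start()

for n in [1, 2, 3]:
    inbox.put(n)  # asynchronous send: the sender does not wait for a reply
inbox.put("stop")

worker.join()
result = outbox.get()
print(result)  # ('total', 6)
```

Because `total` is never shared, no lock appears anywhere in the program.  The queues do the only synchronization there is.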

Of course, just because you wrote your system as thousands of processes doesn't mean that it is scalable. You could have bottlenecks, as with any system. Processes could spend all their time waiting on other processes. But messaging is asynchronous in Erlang, and it is common to send a message to a process and to forget about it, expecting the result to be forwarded to another process. There are a number of design patterns in Erlang that tend to make systems scalable.

Erlang comes with a bunch of libraries. A lot of them are for building or using various kinds of internet services. Erlang has web servers and databases. The Erlang community has been positioning it as the language of choice for building reliable web servers and web services. But the package I find most interesting is the OTP, or the "open telecom platform". Not surprisingly if you look at the names of Erlang packages, it has nothing to do with telecom. It is a framework/platform for building systems that can run decades without being turned off, while updating your software every day and even replacing your hardware periodically. This is needed by telecom applications, but is also useful for on-line banking, on-line stores, and any web service that you want others to build upon.

Joe Armstrong has just finished a book on Erlang that is being published by the Pragmatic Programmers. Joe has an article about his book, too.

It is a very good book, and must reading for anybody interested in Erlang. The thing that bugs me about the book (and about his talks) is that he makes more fuss than he should about the functional language aspect of Erlang and not enough about the OO aspect. In fact, he denies that it is OO.

Processes in Erlang are objects. At the beginning of my course on OO design, I explain three views of OO programming. The Scandinavian view is that an OO system is one whose creators realize that programming is modeling. The mystical view is that an OO system is one that is built out of objects that communicate by sending messages to each other, and computation is the messages flying from object to object. The software engineering view is that an OO system is one that supports data abstraction, polymorphism by late-binding of function calls, and inheritance.

Erlang is a perfect example of the actor model, which is an example of the mystical view. Processes certainly support data abstraction and polymorphism. An Erlang process is a function that reads from the incoming message queue, pattern matches to find a particular message, and then responds to it. A function structured in this particular way is similar to a class in Smalltalk. Moreover, given several kinds of processes that have a common protocol and that share some things in common, it is easy to factor out the commonality into a function that they can both call. This is similar to class inheritance. So, you could even say that Erlang supports inheritance, though it does it very differently than in Java or Smalltalk. I imagine that many Erlang programmers think about programming as modeling. So, Erlang fits all the characteristics of an OO system, even though sequential Erlang is a functional language, not an OO language.
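A sketch may make the analogy easier to see.  Again this is Python rather than Erlang, with no real concurrency: the mailbox is a plain deque, and all the names (`handle_common`, `adder`, `doubler`, `run`) and message shapes are invented for illustration.  Each kind of process is a function that matches on the incoming message, and the protocol they share is factored into a function that both call, which is the part that plays the role of inheritance.

```python
from collections import deque


def handle_common(state, message):
    """Protocol shared by every kind of process; stands in for a superclass."""
    tag = message[0]
    if tag == "get":
        return state, ("value", state)
    if tag == "reset":
        return 0, ("ok",)
    return state, ("unknown", tag)


def adder(state, message):
    """One kind of process: handles 'bump' itself, delegates the rest."""
    if message[0] == "bump":
        return state + message[1], ("ok",)
    return handle_common(state, message)


def doubler(state, message):
    """Another kind: same protocol, different response to 'bump'."""
    if message[0] == "bump":
        return state + 2 * message[1], ("ok",)
    return handle_common(state, message)


def run(behaviour, mailbox):
    """Drain a mailbox in order, threading the state through each message."""
    state, replies = 0, []
    while mailbox:
        state, reply = behaviour(state, mailbox.popleft())
        replies.append(reply)
    return replies


messages = [("bump", 5), ("get",), ("reset",), ("get",)]
print(run(adder, deque(messages)))
print(run(doubler, deque(messages)))
```

The sender of `("bump", 5)` does not know or care which kind of process receives it, which is the polymorphism.  The state is visible only through the replies, which is the data abstraction.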

One way that Erlang differs from OO languages is its emphasis on failure. Any message can fail. Processes don't raise an exception, they fail. Systems are structured as worker processes at the bottom that are likely to fail, with manager processes above them that restart the failed processes. Because programmers expect processes to fail, they

Joe makes too much of functional programming because he says that lack of mutable state implies no locks. However, it is really lack of SHARED state that implies no locks. You could write processes in Basic, Perl, or C. I'm sure that lots of people will look at Erlang and say "we can add that to our language". In my opinion, it is the concurrent programming aspects of Erlang that make it special, along with its mature implementation and powerful library designed for concurrency and reliability.

I do not believe that other languages can catch up with Erlang anytime soon. It will be easy for them to add language features to be like Erlang. It will take a long time for them to build such a high-quality VM and the mature libraries for concurrency and reliability. So, Erlang is poised for success. If you want to build a multicore application in the next few years, you should look at Erlang.