Ethics and open source
A recent paper in CACM ran some metrics on six open source programs and made claims about their need for maintenance. However, it didn't say which programs they were; it gave only their total code size, "application type" (such as "operating system application" or "programming language"), and the number of releases measured. The claims were about what I would expect, and might have been interesting if the figures were checkable. But because the authors didn't say which releases of which programs they used, the paper is nearly worthless.
The authors said that they didn't release the names of the projects because of "standard software engineering ethics", and referred to a paper by El-Emam called "Ethics and Open Source" that I couldn't find on the net. However, I found another paper on ethics that referenced the first one approvingly, and so the first paper probably uses similar arguments.
The basic argument is that anyone doing a case study of open source software is doing an experiment on humans, and either needs their permission or needs to make sure that they can't be harmed by the study. Writing a paper that points out flaws or weaknesses in a system can harm the authors of the system, because it casts doubts on their ability and might prevent them from getting a job. Although theatre and film critics attribute flaws to particular people, they have no alternative. However, there is an alternative when we are studying open source projects, because we can describe the projects in such a vague way that the original system is not identifiable. This requires that we not show any examples of source code, of course.
This opinion is not only wrong, it is pernicious. So far, all the people I've read who have said this have been people I have never heard of, people who probably do not know open source. A major reason for releasing software as open source is to have it criticised. People point out bugs so that others can fix them. The internet is full of people opining on why one system is better than another. This give and take makes the systems better. Why should software engineering researchers be the only ones who can't name names?
One of the things that has kept software engineering from progressing faster has been the lack of good experimental evidence. Science progresses by repeatable experiments. Repeating a software development project is too expensive to be practical. But we don't even repeat our analyses of a particular software development project! One person watches a project, gives opinions, and that is treated as fact. However, from my experience, there are usually a lot of different reasons to explain why something happened. We need groups of people all studying the same projects. Open source is the first place where this can happen. I am hoping that open source will give software engineering researchers a common body of material to study. This will make software engineering progress much faster.
However, now I am afraid that repeatable studies of open source software are going to be declared unethical. Just imagine if the NSF forbade its researchers from mentioning the names of open source projects unless they got permission from the authors. Do you only need the permission of Linus if you want to mention Linux, or do you need the permission of everybody who has contributed? This idiotic opinion could stifle software engineering research. It is dangerous, and must be stopped. Professional bodies like the ACM and the IEEE Computer Society should take a stand and declare that open source is free to be used in any way consistent with its license, which certainly includes being used in scientific studies where all the details of the experiments are repeatable.
Comments
Ethics and open source
[Reinout Heeck] October 6, 2004 6:54:50.000
Well if that were so the typo would have been fixed and our comments removed. After all this is *his* blog ;-) Anyway thanks for the explanation, learned another word today...
And thanks for giving me the hint that it is possible to edit my blog. I didn't realize I could do it, but once I realized I could, it wasn't hard to figure out how to get CST to do it. -Ralph
Open Source and Ethics
[Rich] February 21, 2005 9:43:05.103
You might want to take a look at:
Berry (2004) Internet Ethics: Privacy, Ethics and Alienation - An Open Source Approach.
This discusses these and other issues with some history and is a good background to the material and possible ways out.
R