Title: Smalltalk Documentation Standards: Draft Proposal
Addendum: Cincom plans to realize product support for some features described in this proposal under the SmalltalkDoc initiative. For details, see:
http://www.cincomsmalltalk.com/CincomSmalltalkWiki/SmalltalkDoc
Description of Document: This document is a draft proposal of a set of documentation standards for the Smalltalk language. It reports decisions reached by the author(s) through online discussions with interested parties in the Smalltalk community. A Final Working Draft Proposal is planned for Spring 2003, at which time endorsement will be sought from the Smalltalk Industry Council (STIC).
Addendum: During 2003, Cincom has been working to build product support for improved code/doc integration, with the aim of releasing a proof of concept for evaluation by the Smalltalk community. This new initiative is known as SmalltalkDoc.
General issues:
First, when we speak about "comments" in Smalltalk, we don't just mean comments in method bodies, but comments that might be attached to a variety of different constructs (e.g., classes, name spaces, shared variables, etc). Each construct has a slightly different function in the larger scheme of things: what is appropriate for a package comment may not be appropriate for a method comment.
Second, code comments aren't just for developers. They speak to different audiences and must therefore play a variety of roles. Not only are comments read by both seasoned and new developers (who tend to have different needs), but they are also used by writers, QA, and technical support staff.
Role depends upon audience. Comments are essential for new developers as they try to orient themselves within a package or class. Comments make it possible for technical and marketing writers to produce product documentation, white papers, tutorials, and marketing literature. For new developers and writers, a significant part of work time is spent reading, and code comments at all levels (package, class, method) play a key role in the process of reading and understanding code. For more experienced developers, comments explain what a component, class, or method does, and in this way they make it easier to re-use, debug, and maintain code.
Uncommented code exacerbates a variety of immediate- and long-term problems. For example, if a component or class has no comment, it may be difficult or prohibitively time-consuming to determine what it does, or whether it even meets basic project requirements. Similarly, classes which lack comments can become difficult or too costly to maintain and debug. Without comments in public methods, it may become too time-consuming to answer simple questions about the interface provided by a class.
Finally, code comments should be viewed as a first step in a larger and more open-ended process of product development and maintenance. Comments play a role not just in the evolution of code, but in a product development cycle. The successful entry and positioning of a product in the market, the use of library classes, their re-use, and maintenance -- all of these activities depend either directly or indirectly upon development practices that make systematic and consistent use of comments.
All of this raises a series of questions about what an effective comment looks like. To understand this, we first need to distinguish the various types of code definitions that are unique to Smalltalk.
Note: Other dialects of Smalltalk may define only some of these, or may define additional types (e.g., ENVY components). For the moment, this proposal only considers VisualWorks. Interested parties familiar with other dialects are encouraged to clarify the kinds of comments they use, since this information is needed to specify a cross-dialect documentation scheme.
VisualWorks allows a single comment string to be attached to each of these definitions. Of these five comments, the first three (component, class, method) are the most essential for communicating the functionality of the code. These comments correspond, roughly, to different levels of documentation, from general to particular.
Let's start with component and class comments, since these tend to resemble each other. Method comments generally have a different organization, so they need to be considered separately.
The most important distinction to make in a comment is between what a component does, and how it is implemented. The question "what?" is generally posed before the question "how?", so the explanation of what the component is, what it does, or what service it provides should always be included before the explanation of how it is implemented or why it has been implemented in a certain way.
If the package is bundled with others, the relationship between the packages should be noted. Roughly, this is the reason for dividing a larger component into several packages. What do the classes grouped in this package have in common? Why are they in this package instead of another?
The organization of the package should also be described. For example, if the package includes one or two public classes, and a dozen or so private classes, then it might be worth mentioning the public classes by name.
Notes on usage might include short examples, or links to overviews, tutorials, or walk-throughs.
Important relationships with other classes should be noted. For example, a text scanner that is always used in conjunction with a parser might mention the parser class by name.
It is nevertheless worth distinguishing between comments that appear in the body of a method and the comment that generally appears in the method's heading. As proponents of XP have argued, there are reasons to try to avoid comments in method bodies, but the headings of public methods generally have a different function.
As a rule, public methods should contain a summary sentence in the method heading.
The rationale for this is twofold: first, these summary sentences are used by automated documentation tools. Without a summary sentence, the usefulness of such tools decreases sharply. Second, in a language like Smalltalk, where public methods are typically only distinguished from private ones through the (often inconsistent or sloppy) use of protocols, the method comment becomes an important way to understand the class' public interface. In the absence of this comment, it becomes more difficult for new developers to understand the class, and it becomes more difficult for writers to produce product documentation. If a comprehensive description of the API can't be captured without digging through the code, then the time required to understand the class or produce product documentation begins to become prohibitive.
Although summary sentences are often omitted from accessor methods, they should be included for use by automated documentation tools.
It is important to keep this first sentence concise, as automated tools are likely to use it as a summary. As a rule, the summary sentence needs to make sense independent of the remainder of the comment. This sentence (and the others that follow) should try to avoid allusions to instance variables in the class to which the method belongs. Instance variables are implementation-specific, and may or may not be exposed to external protocol. More specific guidelines for the composition of this summary sentence appear below.
Additionally, the description may include:
Note that if these code fragments are included in the method, they should appear after the summary sentence, so that they do not get placed in API documentation produced by automated tools. Generally, the code fragment is placed within a separate pair of quotation marks, so that it may be selected easily.
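To make the role of the summary sentence concrete, here is a minimal sketch of how an automated documentation tool might pull it out of a method heading. It is written in Python purely for illustration and does not describe any existing tool. The function names, the sample method, and the sentence-splitting rule are all assumptions. The sketch also assumes that comments contain no embedded quotation marks and that the method contains no string literals with double quotes inside them.

```python
import re

# Hypothetical method source following the conventions above: a heading
# comment whose first sentence is the summary, then a code fragment in
# its own separate pair of quotation marks.
METHOD_SOURCE = '''key
    "Answers the lookup key of the receiver. The key is compared by equality."
    "(3 -> 4) key"
    ^key
'''

def heading_comments(source):
    """Return the text of each double-quoted comment, in source order.

    Assumption: no embedded quotation marks inside comments, and no
    string literals that contain double quotes.
    """
    return re.findall(r'"([^"]*)"', source)

def summary_sentence(source):
    """Return the first sentence of the first comment, or None."""
    comments = heading_comments(source)
    if not comments:
        return None
    # Collapse hard returns, tabs, and repeated spaces into single spaces.
    first = " ".join(comments[0].split())
    # Naive rule: a sentence ends at the first '.', '!' or '?' that is
    # followed by whitespace or the end of the comment. Abbreviations
    # such as "e.g." would defeat this rule.
    match = re.match(r'(.+?[.!?])(?:\s|$)', first)
    return match.group(1) if match else first

print(summary_sentence(METHOD_SOURCE))
```

Because the code fragment sits in its own pair of quotation marks after the summary, a tool of this kind never mistakes it for API documentation. A method with no heading comment yields no summary at all, which is the situation the rule above is meant to prevent.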
Standard practice is now to avoid line-by-line comments in methods.
When actually writing a comment sentence-by-sentence, you should consider the following guidelines:
It is generally most economical to write the summary sentence starting with a verb. For example:
Provides support for the Fonebone Database Connect. (suggested)
This package provides support for ... (discouraged)
For methods, it is generally best to write the summary sentence starting with a verb. For example:
Answers whether the receiver is equal to the argument. (suggested)
This method answers ... (discouraged)
As a general rule in professional writing, the overly familiar second-person form ("you...") should be avoided in favor of the third-person form ("it..."). I.e.,
Answers the lookup key of the receiver. (suggested)
Answer the lookup key ... (discouraged)
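The wording guidelines above are regular enough that a tool could check them mechanically. The following Python sketch shows one possible check. It is only an illustration: the function name, the list of discouraged openings, and the "ends in s" test for a third-person verb are assumptions made for this example, and the verb test is a crude heuristic rather than real grammatical analysis.

```python
import re

# Openings that the guidelines above discourage. This list is an
# illustrative assumption, not an exhaustive rule set.
DISCOURAGED_OPENINGS = re.compile(
    r'^(this\s+(method|class|package|component)|you)\b', re.IGNORECASE)

def check_summary(sentence):
    """Return a list of style warnings for one summary sentence."""
    text = sentence.strip()
    if not text:
        return ["summary sentence is empty"]
    if DISCOURAGED_OPENINGS.match(text):
        return ["starts with a subject; start with a verb instead"]
    first_word = text.split()[0]
    # Crude heuristic: a third-person singular verb usually ends in "s"
    # ("Answers", "Provides"); the imperative form does not ("Answer").
    if not first_word.lower().endswith("s"):
        return ["first word does not look like a third-person verb"]
    return []

for example in [
    "Answers whether the receiver is equal to the argument.",
    "This method answers ...",
    "Answer the lookup key ...",
]:
    print(example, "->", check_summary(example))
```

A check like this is best treated as advisory. It will misjudge some sentences, so its warnings are a prompt for the author to reread the summary, not a verdict.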
Coding conventions for many programming languages suggest keeping comment lines under 80 characters long, to avoid line wrapping. Unlike many other toolsets, Smalltalk tools tend to wrap lines automatically. This does not completely eliminate the problem of line wrapping in package and class comments, as developers often insert extra hard returns to achieve formatting effects.
These formatting characters become an issue with various documentation tools. For example, when converting comments for presentation in HTML, tabs, spaces, and hard returns create problems.
In package and class comments, it is strongly recommended that hard returns only be used to separate paragraphs.
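The recommendation above can be illustrated with a short Python sketch of a comment-to-HTML conversion. It is a hypothetical example, not a description of any existing documentation tool. The function name and the sample comment are assumptions. The sketch treats each run of hard returns as a paragraph break and collapses tabs and repeated spaces.

```python
import html

def comment_to_html(comment):
    """Render a plain-text comment as a sequence of HTML paragraphs."""
    rendered = []
    # Every hard return is taken to end a paragraph, as recommended above.
    for line in comment.splitlines():
        # Collapse tabs and repeated spaces into single spaces.
        text = " ".join(line.split())
        if text:  # skip the empty lines left by consecutive hard returns
            rendered.append("<p>" + html.escape(text) + "</p>")
    return "\n".join(rendered)

# Hypothetical package comment that follows the recommendation.
COMMENT = (
    "Provides support for  the\tFonebone Database Connect.\n"
    "\n"
    "Classes in this package are <private> unless noted otherwise."
)

print(comment_to_html(COMMENT))
```

Under this rule, a comment that uses hard returns to wrap lines by hand would come out as many one-line paragraphs. That is the kind of formatting problem the recommendation is intended to avoid.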
For the moment, Smalltalk implementations tend to represent comments using plain-text strings. It may be desirable to be able to italicize text for emphasis, or to make selective use of a special character style that highlights class, package, and method names. This could (if the user so desires) be displayed using a distinctive typeface or color, making the comment easier to read in summaries.
It is also desirable to be able to embed hyperlinks into package and class comment strings.
The fact that comments can contain so many different types of information poses a problem. How can a summary be generated automatically by documentation tools? How can descriptions of components and classes be formatted consistently? To resolve these problems, we must consider the comment's internal representation.
One could imagine the comment being represented as:
The advantages of choosing options (3) or (4) are that documentation tools can be used to present neatly formatted and very precise summaries of components and classes. In the absence of imposed structure, it becomes difficult to build robust documentation tools.
At the very least, support for hyperlinks or any special formatting of comments suggests that an alternate representation will be needed.
The following specific issues are raised by this proposal:
Review Comments:
Mark,
Thank you for taking on this important issue. The lack of adequate documentation is a significant barrier to what Richard Gabriel called "code habitability":
"Habitability is the characteristic of source code that enables programmers, coders, bug-fixers, and people coming to the code later in its life to understand its construction and intentions and to change it comfortably and confidently."
While I agree with the XP notions of self-documenting code, some amount of architectural documentation is almost always needed at the higher component levels.
I am very much in favor of XHTML encoded comments - a subset of XHTML plus a subset of the Dublin Core tags plus Smalltalk-specific tags, plus extension tags defined by additional tools (such as modeling tools). The XHTML subset should be rich enough to support text highlighting, links, paragraphs, lists, tables, and images (often worth a thousand words!).
It should be possible to validate comments against an XML Schema that describes the allowed/required tags. Whether or not this validation is done is a separate issue - Smalltalkers tend to resist authoritarian mandates.
If comments are encoded with tags, then browsers and other tools need to support two options for viewing them: an editable, unformatted version for developers that shows all tags, and an uneditable, formatted version for code readers.
I also have the following specific comments on your proposal:
Regards...Rich Demers
December 31, 2002
Thank you. This really impressed me. Though I fall into that XP freak category and might find some of the parts "more than necessary", I at this point have a very clear view of what it is you want to do, what the vision is, and that's worth a lot. Thanks again.
Travis Griggs