API documentation using literate tools

API documentation is a particular subclass of literate programming. What makes it special?

First, its audience is diverse. Some readers know almost as much as the maintainers about the subject; others are rank beginners. Many know quite a bit about some parts of the subject and are almost ignorant of other parts. Some readers like to point and click, others prefer dead flat trees, and others again prefer on-screen plain text such as man pages (I do, because I can type much faster than I can point and click).

Second, it's generally read to answer some specific question, most commonly "How do I verb?" or, less often, "What does this code do?" or "Where's the bug?" (about code that uses the API). In both cases the reader is interested in learning more about some special area of the subject, and not interested in the subject as a whole.

Third, the implementation of the documented code is irrelevant (at least in principle) and so, if possible, it should not be necessary to mention it at all. The reader will use an API, and in most cases, will expect that API to remain stable even as the implementation changes.

Fourth, an API may have aspects which are not easily expressed in its programming language: "Provided for compatibility; do not use in new code", "not available in foo version" and so on.

Fifth, its audience may not have any good fallbacks if the documentation fails. The reader can perhaps read the code, can perhaps search for the answer in someone's blog, or call support or pay for a seminar. What the reader cannot do is walk to the next office and ask the author.

Litprog's abilities

Most of an API consists of written code, and literate programming tools can document that well. Some do it better than others: javadoc, for example, offers almost no features to help ensure that the API documentation is complete.

An API-specific feature that's well suited to programmatic handling is change control. The documentation tool can point out differences between the API that is available to users (e.g. the public classes in user-accessible files) and the documented one.
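A minimal sketch of that comparison, in Python. It assumes the documentation tool can produce the set of documented names (the hypothetical `documented` set below) and treats every name without a leading underscore as public. The function `api_drift` and the sample class `Shapes` are made up for illustration and are not taken from any real tool.

```python
def public_names(obj):
    """Names a user can reach on obj: everything not starting with an underscore."""
    return {name for name in dir(obj) if not name.startswith("_")}


def api_drift(obj, documented):
    """Compare the available API of obj with the documented one.

    Returns (undocumented, stale): names that exist but are not documented,
    and names that are documented but no longer exist.
    """
    available = public_names(obj)
    return sorted(available - documented), sorted(documented - available)


# Hypothetical API and hypothetical documentation index, for illustration only.
class Shapes:
    def area(self): ...
    def perimeter(self): ...
    def _cache(self): ...  # private by convention, so not part of the API


documented = {"area", "volume"}
undocumented, stale = api_drift(Shapes, documented)
print("undocumented:", undocumented)
print("documented but missing:", stale)
```

Run on each release, a check like this would flag both a new public function that nobody wrote about and a documented function that has quietly disappeared.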

Scanning for omissions is another area where the tool can help. It cannot counter malevolent writers, of course, but if you respond to its warnings, then this is an area where a tool can do much. For an example of how not to do it, look at Android's Adapter, which deals with displaying objects, but says absolutely nothing about what kind of object getItem() returns. "Returns the data at the specified position", it says, and not a word about what that data might be. The writer did stop the tool from complaining about an undocumented return value, but the writer didn't help any readers.

Possible warnings include: Public member functions that aren't mentioned in the class description (fuzzy, needs heuristics) or aren't documented at all (very simple). Examples that use undocumented classes, functions, etc. Documentation blocks that talk about most values of an enum, but miss one. Class graphs that contain undocumented classes. Class documentation that doesn't mention those other classes which the function interfaces do use.
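Two of those warnings are easy to sketch in Python: the very simple one (public member functions with no documentation at all) and the enum one (a documentation block that mentions some values of an enum but misses others). This is only an illustration of the idea. The helper names and the sample classes `Align` and `Label` are hypothetical, and a real tool would parse its own comment format rather than Python docstrings.

```python
import enum
import inspect
import re


def undocumented_members(cls):
    """Public functions defined directly on cls that have no docstring."""
    return sorted(
        name
        for name, member in vars(cls).items()
        if not name.startswith("_")
        and inspect.isfunction(member)
        and not (member.__doc__ or "").strip()
    )


def missed_enum_values(enum_cls, doc_text):
    """Enum values absent from doc_text, but only if the text mentions some.

    A text that mentions no value at all is probably not about the enum,
    so it produces no warning.
    """
    def mentioned(name):
        return re.search(rf"\b{re.escape(name)}\b", doc_text) is not None

    names = [member.name for member in enum_cls]
    if not any(mentioned(name) for name in names):
        return []
    return [name for name in names if not mentioned(name)]


# Hypothetical API, for illustration only.
class Align(enum.Enum):
    LEFT = 1
    CENTER = 2
    RIGHT = 3


class Label:
    """Shows a line of text."""

    def set_text(self, text, align):
        """Replaces the displayed text. align may be Align.LEFT or Align.RIGHT."""

    def text(self):
        pass


print(undocumented_members(Label))
print(missed_enum_values(Align, Label.set_text.__doc__))
```

Neither check can tell whether the text that is present says anything useful. That part still needs a writer who responds to the warning instead of silencing it.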

Tying code to examples: Qdoc used (still uses?) a simple and quite good policy for this. The documentation for each class/function mentions up to five example programs. If more than five examples are available, then it chose (chooses?) the five examples that use the fewest other functions/classes.
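One way to read that policy, sketched in Python. All I know of Qdoc's behaviour is the rule stated above. The data structure, the tie-breaking by example name and the function name `pick_examples` are my assumptions, not Qdoc's implementation.

```python
def pick_examples(target, examples, limit=5):
    """Pick up to `limit` examples that use `target`, most focused first.

    `examples` maps an example name to the set of API names it uses.
    An example is more focused when it uses fewer names other than `target`.
    Ties are broken alphabetically by example name (an arbitrary choice).
    """
    candidates = [
        (len(used - {target}), name)
        for name, used in examples.items()
        if target in used
    ]
    return [name for _, name in sorted(candidates)[:limit]]


# Hypothetical example programs and the classes each one uses.
examples = {
    "hello": {"Button", "App"},
    "calculator": {"Button", "App", "Grid", "LineEdit"},
    "form": {"Button", "App", "LineEdit"},
    "clock": {"Timer", "App"},
}

print(pick_examples("Button", examples, limit=2))
```

The effect is that a reader looking up Button is sent to the smallest programs that use it, not to a large demo where the button is one detail among fifty.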

Explaining examples: Often, examples are basically undocumented collections of magic spells. Microsoft is guilty here, but I'm going to be gentle on myself and not look at any Microsoft examples today. Instead I'll point to some of Qt's examples as positive and most of the Android API demos as negative.

Notice how the Qt example contains text explaining what the example does. Every few lines there's a paragraph reiterating the code in words, often mentioning why this-and-not-that. The code in the example contains links to the documentation for each class and function.

The Android example, on the other hand, explains precisely one thing, namely that Java packages use import to access identifiers from other packages. You might think the LinearLayout6 example would contain text about linear layouts, perhaps explaining what the example code does. Or, since it says its purpose in life is to demonstrate using the uniformSize attribute, shouldn't it explain why the code doesn't contain any use of uniformSize?

Achieving more

With more work, literate programming can be made to do more. That's out of my area of interest.

What I care about is that ideal point where the human does little, the tool does a lot, and the results are good. If the human does a lot and the tool does a lot, the results are better. Of course. I'll care about that when merely good results are commonplace.

Today's tools

I'd love to write here about how various tools have many fine features to help you get okay API documentation easily. But they don't. There is much they could do, but little of it is done today.

Doxygen has something I like: many output formats. Udoc has that too, and at least it has some warnings. But udoc is (I say that as its author) not particularly well suited to API documentation. Qdoc can integrate examples well. Javadoc can't do anything well, as far as I can tell.