Literate Programming: the book

Donald Knuth's book Literate Programming is a collection of articles about what he did in the TeXbook and in various other programs. I read it around 1995, while working on qdoc, and thought it was terribly naïve. I reread it in portions this year (the same copy, which found its way to me from Trolltech's once-extensive library; better to me than wherever Nokia is going) and this time I wanted to write down my thoughts. I wish I'd had a blog in 1995.

The book is all about writing two things at the same time, in two separate languages, not connected, merely adjacent in a single file. "The main point is that WEB is inherently bilingual, and that such a combination of languages proves to be much more powerful than either single language by itself. WEB does not make other languages obsolete; on the contrary, it enhances them." (p101) Yes, no, no. WEB does not enhance either of the two languages (relevantly, at least). The Τεχ source is just that; it receives nothing from WEB. You cannot access the list of Pascal variables in the Τεχ part, and nothing in Pascal is visible to either the WEB or the Τεχ processors.

The other language is invisible to the Τεχ source. That's why WEB is correctly described as inherently bilingual. (WEB is Knuth's top-level language; TANGLE produces compilable Pascal from WEB, and WEAVE produces a text in Τεχ format.)

The example starting on page 144 shows how the WEB sections link to each other. The section which defines the global variables has links to sections that define more globals, and to the point where the definitions are inserted. All that is inside WEB. But there are no links to the sections that use each variable. That's because variables are used only in the Pascal parts of the source, and the Pascal parse tree is inaccessible when the exposition (documentation) is generated.
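To make the limitation concrete, here is a minimal sketch of a tangle-like step in Python. It is a toy under stated assumptions, not Knuth's TANGLE: the `<<name>>=` / `<<name>>` / `@` notation is a simplified, noweb-style stand-in rather than real WEB syntax, and `parse`, `tangle` and `SOURCE` are names made up for the illustration. The point it shows is that repeated definitions of a section are simply appended and pasted in at the reference, purely as text. Nothing here parses the Pascal, so the tool has no idea where `total` is used.

```python
import re
from collections import defaultdict

# Toy notation (an assumption, loosely noweb-like, NOT real WEB syntax):
#   "<<name>>="  on its own line starts or continues a named code section
#   "<<name>>"   on its own line inside code is replaced by that section
#   "@"          on its own line ends the code section (prose follows)
DEF = re.compile(r"^<<(.+)>>=$")
REF = re.compile(r"^(\s*)<<(.+)>>$")


def parse(source):
    """Collect code lines per section name; repeated names append."""
    sections = defaultdict(list)
    current = None
    for line in source.splitlines():
        m = DEF.match(line)
        if m:
            current = m.group(1)
        elif line == "@":
            current = None
        elif current is not None:
            sections[current].append(line)
        # Lines outside any section are prose and are ignored here.
    return sections


def tangle(sections, name, indent=""):
    """Expand a section recursively into a flat list of code lines.

    Purely textual: no Pascal is parsed, and cyclic references are
    not detected in this sketch.
    """
    out = []
    for line in sections[name]:
        m = REF.match(line)
        if m:
            out.extend(tangle(sections, m.group(2), indent + m.group(1)))
        else:
            out.append(indent + line)
    return out


SOURCE = """\
The program counts words.
<<program>>=
program count;
var
  <<globals>>
begin
  total := 0
end.
@
The running total lives in a global.
<<globals>>=
total: integer;
@
Later prose introduces a second global, next to the code that needs it.
<<globals>>=
line: integer;
@
"""

pascal = tangle(parse(SOURCE), "program")
print("\n".join(pascal))
```

Both `globals` sections end up under `var`, in the order they were written. Linking each variable to the sections that use it would need an actual Pascal parser on top of this.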

I find it very comfortable to write about code without having to change into a different editor: over to Word or even PowerPoint, then back to Visual Studio. And I also agree that a combination is more powerful, but Knuth's book does little to exploit the power. He doesn't mention bilingual error diagnostics or any kind of difference analysis (which was extremely valuable at Trolltech), and in many cases it's possible to generate various special sections automatically (such as this API overview page) and generate text output from mixed sources (such as in the Qt tutorial). All WEAVE really does is generate something like the Τεχbook; it leaves much undone.

The third great benefit is one Knuth understands and describes: "There are many senses in which a program can be considered good, of course. In the first place, a program is especially good if it works correctly. Secondly, a program is often good if it is easy to change, when the time for adaptation arises. Both of these goals are achieved when the program is easily readable and understandable to a person who knows the appropriate language." Well put. Knuth writes better than I.

But I doubt it's really so adaptable. The tool does nothing to discourage the material written in the two languages from creeping apart as time passes, and in my experience that's lethal. Code review would help, but Knuth's book doesn't mention it.

A pedantic note. Despite what I wrote above, WEB does improve on some deficiencies of Pascal, such as allowing variables to be declared near their first use instead of all in a single block. I consider that relevant to Pascal and not to WEB or literate programming, so I don't regard it as a significant feature.

Building a balanced binary tree from sorted input in O(n)

Writing about Knuth's literate programming book reminded me of when I met him (at a conference) and asked why the following algorithm wasn't in TAOCP. He grasped the algorithm from a ten-second description, and said it wasn't there because he didn't know about it. Good reason.

My TA at university (when I invented it for an exercise) wasn't aware of it either, but unlike Knuth, my TA didn't understand it. And I hadn't commented the code at all. Sigh.

Here are some words in sorted order: […More…]
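The post's own algorithm is cut off in this excerpt, so the following is not it. As a stand-in, here is the textbook way to get a balanced binary tree from already-sorted input in O(n): take the middle element as the root and recurse on each half. The `Node` class, the function names and the word list are all made up for the illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Node:
    key: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_balanced(keys: Sequence[str], lo: int = 0,
                   hi: Optional[int] = None) -> Optional[Node]:
    """Build a height-balanced BST from keys[lo:hi], assumed sorted.

    Each key is visited exactly once and no slices are copied,
    so the whole build is O(n).
    """
    if hi is None:
        hi = len(keys)
    if lo >= hi:
        return None
    mid = (lo + hi) // 2
    return Node(
        keys[mid],
        build_balanced(keys, lo, mid),
        build_balanced(keys, mid + 1, hi),
    )


def height(node: Optional[Node]) -> int:
    """Number of nodes on the longest root-to-leaf path."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def in_order(node: Optional[Node]) -> List[str]:
    """Keys in sorted order, to check the tree is a valid BST."""
    if node is None:
        return []
    return in_order(node.left) + [node.key] + in_order(node.right)


words = ["ant", "bee", "cat", "dog", "eel", "fox", "gnu"]
root = build_balanced(words)
print(root.key, height(root))
```

This version needs the whole input in an indexable sequence up front. A variant that consumes a stream of unknown length would have to be structured differently.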

Literate programming failed. Why?

Donald Knuth invented literate programming and published the TeXbook as an example. The book is great, or so I've heard from many people who've read it. So why is literate programming practically unused today, at least the kind Knuth invented? […More…]

Kinds of literate programming

The TeXbook employs something called literate programming: Knuth wrote code and text together, effectively writing a narrative about that code, with that code as part of the narrative.

Knuth could do that; he's a genius. He was able to write a sizable program practically without bugs, in a notebook. Mortals like myself could not. I'd have to go back and rewrite earlier bits, and before long the narrative would stink. […More…]

The history of udoc

The origin of udoc goes a long way back, to when I was still a student at the University of Trondheim, the world's first and only Quasar Toolkit user, and about to start working at Trolltech, which at the time was called Quasar Technologies (Hi Haavard) and occupied a room and a half overlooking a busy street in a rather unfashionable part of Oslo.

I wasn't very happy with the Qt documentation, which was then written using LaTeX macros and already obsolete. I was also an opinionated asshole and far too sure of myself, and I'd just learned about Donald Knuth's literate programming techniques, but I hadn't read his book. Naturally I looked at the existing litprog tools (there were quite a few) before discarding them all to write something good. […More…]