Wednesday, 11 August 2010

Unit tests as a language tutorial

Writing a tutorial for a complex piece of software is generally difficult and a bit boring, especially if the tutorial covers a programming language intended for non-technical users and you are still tweaking the language itself at the same time.

You have to introduce features slowly, preferably one at a time, and give many examples and code samples. One particularly boring aspect is that you have to check that these examples actually work. This is especially annoying when you are modifying the language itself at the same time.

This happens more often than not, because writing documentation generally leads you to question the design of the thing being documented. It often makes you realise that you have taken a wrong path on a particular feature, or that something is not convenient enough and could be improved.

So when you document a programming language that is not yet set in stone, and want it to be as good as possible, you make the improvements. You then enter a document/question/improve loop. This is not bad, but the time you spent writing some parts of the documentation is lost.

On the other hand, I have a test file for Lama. It gathers a number of unit tests, some of them very basic, such as checks that the precedence rules between * and + are honoured. As I split my time in a rather anarchic way between documentation, language tweaks and writing libraries, the test file has lagged somewhat behind lately. When reviewing it, I observed that the progression in the tests is quite similar to the progression in the tutorials: basic stuff first, then the more complex features and combinations. This is of course not an accident. The first tests were written first, when Lama was less capable than a desk calculator. Also, if a regression makes, say, string concatenation fail, it is better for me that the string concatenation test fails rather than, say, the closures test.
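As a rough illustration of this ordering, here is a minimal sketch. It is written in Python as a stand-in, since it does not rely on Lama's actual syntax or test file; the function names and the `first_failure` helper are hypothetical. The checks are listed from the most basic feature to the most advanced, and the runner reports the first one that fails, so a regression in a basic feature is pinned on that feature rather than on a later test that merely depends on it.

```python
# Hypothetical sketch (Python stand-in, not Lama): checks ordered from
# the most basic feature to the most advanced one.

def check_precedence():
    # * binds tighter than +
    assert 2 + 3 * 4 == 14

def check_string_concatenation():
    assert "foo" + "bar" == "foobar"

def check_closures():
    def make_adder(n):
        # The returned function captures n from the enclosing scope.
        return lambda x: x + n
    assert make_adder(2)(3) == 5

# Order matters: basic features come first.
ORDERED_CHECKS = [check_precedence, check_string_concatenation, check_closures]

def first_failure(checks):
    """Run checks in order; return the name of the first failing one, or None."""
    for check in checks:
        try:
            check()
        except AssertionError:
            return check.__name__
    return None
```

Calling `first_failure(ORDERED_CHECKS)` returns `None` when everything passes; if string concatenation were broken, it would name the concatenation check before ever reaching the closures check.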

This observation of the near-parallel progression between the tutorial and the unit tests made me realise that the unit test file, with the appropriate comments, could be the tutorial. This is, as a matter of fact, not a new idea; some claim that Test Driven Development is not about making bug-free software, but about driving the design of programs and documenting them. Indeed, unit tests by definition show what to do to get a particular result, or what result to expect from a particular sequence of instructions.

Another observation is that a significant part of the text of the programming language tutorials one may find on the internet is actually code. The Scribble documentation generator for PLT Scheme/Racket even features a means to execute the sample code and include its result in the generated documentation.

In some situations, you may have a document that is one-half plain text and one-half code, or one-half plain text and one-half dynamically generated text (web pages, for instance). One then has to make a choice: either embed the code in the text, or embed the text in the code. All tutorials on programming languages I have seen use the former (code embedded in text). I'll innovate a bit here by choosing the other way.
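To make "text embedded in code" concrete, here is a small sketch, again in Python as a stand-in and not the tooling actually used for Lama. The `render_tutorial` function, the `#` comment prefix and the `SAMPLE` test file are all assumptions made for illustration. The idea is that a single commented test file can be read two ways: run as tests, or rendered as a tutorial in which comments become prose and code lines become indented samples.

```python
def render_tutorial(source, comment_prefix="#"):
    """Render commented source as plain text: comments become prose,
    code lines are indented as samples, blank lines are kept."""
    out = []
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            out.append("")
        elif stripped.startswith(comment_prefix):
            # Drop the comment marker and keep the explanatory text.
            out.append(stripped[len(comment_prefix):].strip())
        else:
            out.append("    " + line)
    return "\n".join(out)

# A hypothetical test file: valid code whose comments carry the tutorial text.
SAMPLE = """\
# Multiplication binds tighter than addition.
assert 2 + 3 * 4 == 14

# Strings are joined with +.
assert "foo" + "bar" == "foobar"
"""

if __name__ == "__main__":
    exec(SAMPLE)                    # run the file as tests
    print(render_tutorial(SAMPLE))  # render the same file as a tutorial
```

Because the tutorial text lives in the file that is executed, the samples cannot silently drift out of date when the language changes: a sample that stops working is a failing test.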