Friday, June 27
Tonight I want to say something about the effect that communicating through the code is having on the project.
When Sarah & I started working together we quickly ran up against the limits of our abilities to reason verbally about the processes we are automating. There are very many variables and cases. The calculations she described were not complicated, but the conditions for applying them were. Very. I suspected that this reflected optimisation long ago to minimise calculation, after which it just got difficult to reason across the cases. An actuary would be better trained to abstract a general framework within which to reason, but Sarah isn't an actuary.
So we tackled the case examples one by one, using control structures (IF, THEN, ELSE, &c.) in the code to relate the rules to the conditions for applying them. Case by case the control structures proliferated.
At first I refactored the code only inside the control structure elements. Had I refactored the control structures as well I should have destroyed the framework that anchored the work we were doing together. Sarah & I were already using the code as a visual representation of the conditions, to ensure we could see while we talked.
I came to see that I could refactor the control structures only where Sarah could follow the refactoring. Sometimes I have been able to teach her this. She sees how multiplying a number by a condition is the same as saying it's zero unless the condition holds. I would eventually be able to refactor to my heart's content only when we had finished and no longer needed to refer to it!
But that severely limited the benefits we could get from refactoring. Pity, because they included finding simpler abstractions from among the detail.
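The equivalence mentioned above, multiplying by a condition instead of branching on it, can be seen in a short session. This is only an illustration: the names fee and age and the figures are invented, not taken from the project.

```apl
      fee ← 25                ⍝ hypothetical flat fee
      age ← 58 65 71          ⍝ hypothetical ages for three cases
      fee × age ≥ 65          ⍝ the condition yields 0 or 1, so the fee is zero unless it holds
0 25 25
```

The one line covers every case in the list at once, where an IF/ELSE would have to be applied case by case.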
In retrospect it seems I did two things right at this point by luck. We both find the pair programming work as intensive as XP practitioners report it. So we break for a few minutes each hour to stretch. (At the moment, when we've achieved a milestone we've been aiming for, I pick and read aloud a poem from A. A. Milne's When We Were Very Young.) I also vary the rhythm a bit by spending a few minutes every now and again deconstructing some recent expression in immediate-execution mode. So besides the variables and functions she's named herself, Sarah also has some insight into what the primitives are doing. For example, I've shown her how a Boolean expansion maps a list of numbers into a longer list, and she has used her knowledge to spot and challenge a line where this was not done as we had agreed it should be. So the first thing I think I did right was to keep discussing and explaining the code as I wrote it.
The second thing was to build upon this learning to support aggressive refactoring. Sarah likes the XP prescription for “merciless refactoring”, and enjoys seeing simplicity emerge from complexity.
When early in the project we coded the rules for calculating Market Value Reductions, we wandered in a depressingly long sequence of nested control structures. When we came recently to the rules for calculating annuities and policy fees, we reached for an array structure that would allow us to represent all the cases. This entailed analysing and recreating rules embedded in an Excel workbook, rules that Sarah doesn't know — she just uses the workbook.
We started by refactoring the Excel spreadsheets in the workbook, displaying and naming new intermediate results until we could see what varies from case to case and what is the same. This allows us to rewrite the rules in APL.
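The project's own arrays are not reproduced in this post. As a purely hypothetical sketch of the general idea, a table indexed by case can stand in for a cascade of nested IFs. The name rate, its shape and every figure below are assumptions made for illustration.

```apl
      ⍝ rows: policy types; columns: age bands (all figures invented)
      rate ← 3 4 ⍴ 0.01 0.02 0.03 0.04 0.02 0.03 0.04 0.05 0 0 0.01 0.02
      type ← 2 ⋄ band ← 3     ⍝ one hypothetical case, with ⎕IO←1
      rate[type;band]         ⍝ look the case up instead of branching to it
0.04
```

Adding a case then means adding a row or column to the table, not another level of nesting.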
In my last post I compared the ephemeral views Excel's auditing tools provide of the information flow with the stable view from the APL code. Even after refactoring, the visual metaphor of the spreadsheet limits how much can be displayed on a monitor. Sarah made the same comparison today and said she finds the APL code considerably easier to follow than the Excel workbook.
I've posted a small Dyalog APL workspace with a working copy of the function we've been editing today. I think many programmers will find it remarkable that a non-programmer should prefer this description to an Excel workbook.
When I began this post about communicating through the code I intended to write separately about the effects on communication quality and on the code quality. But it turns out that they are closely entangled. Weak communication produced loose code which was hard to refactor. Tight communication produced dense code and lots of refactoring opportunities.
In fact Sarah is now contemplating using the insights she's gained from our work to review and refactor the manual processes we're automating.
Next post: thinking about requirements specification as a Wittgensteinian language-game suggests some interesting practices.
posted by Stephen Taylor |
1:42 AM
Tuesday, June 24
Pair programming with the customer
Earlier I described how I was cutting code as part of my communication with the trainers. They tell me what the system is to do, I code that and we verify the results. In this application domain, describing what's wanted in APL takes me about as long as it takes the trainer to describe it in English. So it's like a translation exercise, translating English to APL.
At first I described this as "pair programming with the customer". Then after reflection, I said it wasn't. The aim of pair programming is to raise code quality. The aim of what the trainers and I are doing is to raise communication quality.
We've been doing this for 2 months now and it's time to look at the results. I started working with 2 or 3 trainers; for the last month I have worked with Sarah alone.
Tonight: some key practices, and an unexpected result, communicating through the code with the customer. Tomorrow: some unexpected effects on communication and code quality.
Programming practices
When the code produces wrong answers Sarah and I step through it with the interpreter, examining the intermediate results. So I write the code to produce the same intermediate results that the manual procedures do.
I don't want any other intermediate results, so ideally the only variable names in the code are names for Sarah's intermediate results.
Ideally, to avoid creating any other variables, I use a single line to represent how one intermediate variable is calculated. I tend to write long lines.
I also create anonymous D-fns 'in line' to avoid creating other variables, even to avoid referring to the same variable twice in one line. (Leading me, I think, towards the verb trains in J.)
I want to support skim-reading the code. In skimming code, I need to see only the flow of information between variables. In skimming, dense chains of symbols between the curly braces around a D-fn register simply as 'do something here'.
I also use D-fns merely to highlight the flow of information. For example, where a new value is calculated from a few other variables, I often compose the line as one or a few D-fns so that in skimming the code it is clear that the new value is derived as some function of certain others.
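A small sketch of the two uses of in-line D-fns described above. All the names and figures here (premiums, rate, years, average, projected) are hypothetical and chosen only to show the shape of such lines.

```apl
      ⍝ all names and figures are hypothetical
      premiums ← 100 250 400
      rate ← 0.05 ⋄ years ← 3

      ⍝ premiums is named once; without the D-fn it would be (+/premiums)÷⍴premiums
      average ← {(+/⍵)÷⍴⍵} premiums

      ⍝ skimming shows only that projected derives from average, rate and years
      projected ← average {⍺×(1+1⊃⍵)*2⊃⍵} rate years
```

Read at the first level, each line says which names feed which. The arithmetic inside the braces is there for the second, slower reading.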
I cannot overemphasise the value of being able to read the code at two levels.
This is similar to the use of Excel's Auditing tools, which allow you to display the precedents or dependents of cells in a spreadsheet. The tools draw arrows over the spreadsheet to show where one number is used or obtained. (You have to read the cell formulas to see exactly how. This corresponds to reading the contents of my D-fns.) But it does not take many arrows to make a big mess, so you need to turn Excel's Auditing on and off for cell after cell to trace the flow of information. In contrast, everything in the APL code is available to a visual scan. The code can be skimmed to follow the flow of information, or the eye can stop to review the detail encapsulated in a D-fn.
A related practice is to name quite small fragments of code, either D-fns or derived functions, and comment them, so that later they can be read in context by their semantics rather than their mechanics. For example, in code we were writing today, in two places we needed to convert 3-lists into 4-lists by duplicating the first element. That was the mechanics. The semantics was 'replicate the NPR to produce a Reduced'. So we declared a function
rnr ← 2 1 1∘/¨ ⍝ replicate NPR as Reduced
and used rnr in lines where the reader needs only its semantics, not its mechanics. Unsurprisingly, this semantics for the derived function works only in the context of a single function, so rnr got defined in and localised to that one function.
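To see the mechanics behind the name, here is the derived function applied to two made-up 3-lists. The data is invented, and ↑ is used only to display the result as a matrix, assuming the default ⎕ML setting where ↑ is mix.

```apl
      rnr ← 2 1 1∘/¨          ⍝ replicate NPR as Reduced
      ↑ rnr (1 2 3)(4 5 6)    ⍝ each 3-list becomes a 4-list, first element doubled
1 1 2 3
4 4 5 6
```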
Communicating through the code
I discuss the code with Sarah as if she were a novice programmer. I review the information flow, pointing to the variable and function names (which she chose) and saying in English what the primitive functions are doing. Occasionally I deconstruct a primitive expression in immediate-execution mode to illustrate or confirm a point, sometimes just to share a particularly elegant or powerful expression. Much of this is in the spirit of play: playing with a complex and fascinating toy.
Today Sarah challenged a line of code. Correctly. In a few related lines, I'd been using Boolean expansions to map calculated results into parts of a collection of arrays. Suddenly Sarah leaned forward and pointed at one of them. "Shouldn't they be for all the columns?" She was right, and had been able to see that the code didn't do it.
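For readers who have not met the primitive, a Boolean expansion places values where the Boolean list has 1s and fills the rest with zeros. A minimal illustration with invented numbers, not the line Sarah challenged:

```apl
      1 0 1 1 0 \ 10 20 30    ⍝ three values spread over five positions
10 0 20 30 0
```

With the positions visible in the Boolean list, a reader who knows which columns should receive values can check the line by eye.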
So Sarah skim-reads the code and through it we communicate about the rules we're automating.
This is fascinating. I had similar experiences in the 1970s with customers who actually were novice APL programmers. But Sarah doesn't write any APL.
It reminds me of my ability to read Italian. In limited contexts I'm a competent reader and hearer of Italian. Drawing on menu Italian and word roots shared with English, French and Latin I've acquired a vocabulary that meets my needs on holiday and allows me to skim-read newspaper articles.
It also reminds me of the language games Wittgenstein imagined in Philosophical Investigations. Sarah & I play a language game. It doesn't involve her speaking or writing any APL; it's not the same language game that programming or learning programming is. What are the uses of code fragments in this language game?
Similarly, we couldn't play the game with just any APL code. Certain properties of my code are essential to this working. I suspect that the way it supports skimming is crucial, also the use of Sarah's nomenclature in recognisable spellings.
It would be interesting and useful to identify what those properties are and consider how they can be replicated in other languages.
I would like to pay tribute here to what Ken Iverson taught me in 1977 about teaching APL; his emphasis on exploration, hypothesis testing, Socratic dialogue and working in pairs.
posted by Stephen Taylor |
10:32 PM
About this blog
This is the journal of an APL project in which I'm trying out some XP practices. I presented some of my conclusions in a research report at the XP 2003 conference in Genoa in May 2003.
This is also my exploration of blogging. So you may find the appearance of the blog changes drastically from time to time. Or it might be broken next time you visit. It's my personal sandbox.
Quotes
It's so refreshing to have an almost instant IT solution without endless meetings, planning and yet more meetings.
We've probably been spoilt for the future, but long may this approach reign!
Kevin Wallis
I love it when a plan comes together!
Kim Kennington, The A Team fan
Sarah said
Less is more;
for what we are about to receive, to APL we are truly grateful !!!
Sarah Glasgow, Thomas Cranmer fan
I'll say no more than necessary; if that.
Stephen Taylor, Elmore Leonard fan
Links
The Agile Alliance is a non-profit organization dedicated to promoting the concepts of agile software development, and helping organizations adopt those concepts
XProgramming.com an Extreme Programming resource, including XP Magazine
Dyadic Systems the Dyalog APL developers. Home of D, Namespaces, Reference Arrays, and possibly the finest development environment for GUIs in any language
A Programming Language Paul Mansour's blog on APL software development, and inspiration for this blog. Paul, imitation is the sincerest form of flattery
Vector the Journal of the British APL Association, edited by Stefano 'WildHeart' Lanzavecchia
SIGAPL the ACM Special Interest Group for APL and J
Eberhard Lutz on collection oriented languages
Comp.Lang.APL discussion board
J Software Home of J, Iverson's successor to APL, with special emphasis on understanding mathematics. Take a free copy for personal use
A+ a stripped-down, ASCII-only, run-like-a-train APL subset designed for Morgan Stanley's trading-room applications by Arthur Whitney, the original Jack of Speed
Kx Systems What Arthur Did Next. After Wall St, the language Whitney wrote for his own use to program a database to run 1-2 orders of magnitude faster than Oracle. (Take a free evaluation copy.)
Pavel Kocura teaches K at Loughborough University
Code
This site uses a special font to display APL code that may be downloaded from Dyadic Systems.
Correspondence: sjt@lambenttechnology.com