As software engineers get crafty with our programming languages, we get really good at being able to code up software to solve just about any problem. Depending on our level of experience, the amount of care we take to make it so, and our own personal tolerances for imperfection, we determine that the code we write is elegant, easy to maintain, and extensible.
As our software ages and grows and the ecosystem it lives in changes, even the cleanest of implementations and abstractions become tainted by the encroachment of changing paradigms and the outside world. We see them break down and develop kludgey workarounds and schisms. Eventually, even seemingly insignificant changes become burdensome or impossible due to design contradictions and complicated dependencies.
We call this system “legacy”, we deprecate it, shrug our shoulders and decide to rewrite the whole thing. When we do, it will be elegant, easy to maintain, and extensible.
Next time we won’t make the same mistakes. We’ll make new ones.
It may even be true that the cycle of death and rebirth in software is useful and even desirable, but that is a topic for another article. What I want to address here is an aspect of software design that I feel is potentially the most significant factor in the premature aging of code and in the cost of writing and rewriting software: the encoding of knowledge in source code.
We commonly look to model the concepts and rules of the domain we’re operating on directly in our application’s source code, which means that the code becomes the authoritative source of truth as to the intent and behavior of the system. In other words, the most reliable knowledge we have about our software’s behavior is embedded and lives within the source code itself.
The problem with this is that “code” is written first and foremost to tell the computer what to do, not to describe it to humans.
This is true to varying degrees, depending on the style of the developer and on how expressive the programming language is. You might even be thinking that your code doesn’t have this problem, because you are an adherent of true behavior-driven development and your specs or tests tell the true story of your software, not the application’s code.
In some sense you’d be right, since application code is no longer the authority. In another sense you’d be twice as wrong, since test coverage only really enables a form of parity in the encoding of knowledge, in that now you have simply encoded your intent in two distinct sets of source code, the application and the test suite.
The specification becomes codependent on the code, instead of existing on its own terms.
Before I continue, perhaps I should explain why it is expensive to put the knowledge into source code in the first place. There are actually a great number of reasons and effects, but let’s constrain the issue to the software development lifecycle for this article.
Consider that you are reaching the point of rewriting a system, because it is becoming too costly to extend the current version. Somehow you have to re-express or port all the specifications from the old system that you want to keep into a new system.
There are few artifacts from an old codebase which can be brought to the new one directly. One potential example would be a Cucumber feature file. Cucumber features are interesting software artifacts, because they don’t actually express software implementation details. They can represent a description of the behavior of a software feature, irrespective of the programming language it is written in or any other technical implementation detail.
Cucumber is great at expressing acceptance tests for a feature, but it is not intended to be used for any other purpose. Implementation and test code still need to be written. What if we expressed our applications in something akin to Cucumber’s feature file format, but used them to generate code for the implementation and test suite?
Coding is easy. Definition is hard.
One thing that becomes clear after years of experience working on large software projects is that writing good code is only a small part of the job. In fact, focusing on code can really detract from the job. Successful software development is, at its core, much more about defining concepts, communicating intent, and managing complexity through abstraction than it is about writing code.
I have found it very interesting to approach actual “programming” from a purely non-code perspective. I’d be tempted to call it pseudocode, except that the goal of writing software without code is more about modeling and conceptualization than instructions and execution. Consider the following description of the game “snake”:
Snake is a game. The game begins with an empty game board. A game board is a rectangular space surrounded by walls. When the game begins, a snake is placed in the middle of the board and begins to slither to the right. The player controls the direction of the snake. The snake cannot stop moving. The snake will die if its head touches its own body or any of the walls of the game board. Apples appear on the board every 20 seconds and disappear if not eaten within 10 seconds. A score is kept, beginning at 0 and increasing by 1 for each second played. If the snake’s head touches an apple, the snake will eat the apple. If the snake eats an apple, the snake will grow and the score will increase by 100 points.
This is a fairly simple game to describe, and the specification above gives an implementer a good basis for producing a working version of the game. What interests me about it as an artifact is how I might use Cucumber-style “step definition” matchers to map statements in the specification to generators that produce the implementation code and the corresponding tests.
A definition for “The player controls the direction of the snake” could boil down to the implementation code for the user-input control bindings and some acceptance tests around them. Think Cucumber step definitions, one generating implementation code and the other generating test code.
So I say now that the software is “stupid” because the implementation code itself doesn’t “know” or act as the authority about our intent, because we’re defining our intent in the “plainspeak” specification.
This approach organizes the development effort and its source artifacts around the human-oriented program specification. In a way this is a form of Micro-Feature Driven Development, where each statement acts as a feature unto itself.
I’ve not actually written software this way, yet.
My goal in writing this post is to put the idea out there and possibly start a discussion around it before diving in whole hog. A bit of tooling would probably be needed to make this manageable. I’m currently exploring ways to do this that are simple.
There’s a company called Intentional Software that does this kind of thing in a way that is not simple and wholly relies on a massive set of proprietary software tooling, which is amazing, but really not simple. I recommend reading about their approach in this old Martin Fowler article. It’s a much more technical solution to the knowledge encoding problem.
I’d like to find a way to solve the essential problem using a simpler plain-text, template-driven scheme that is friendly and convenient for typical developers. I’ll update the blog with my progress on this front if and when I get there.