Semantic Wiki Metadata Shorthand

I think I’ve hit upon about the tersest, yet cleanest, syntax for the expression of semantic metadata to use in a semantic wiki context. A quick aside: I’ve been refining concepts related to a semantic wiki-like project of mine since at least as early as 2005, and the deceptively simple notation proposed here took an unusually long time to distill. Here’s an example topic source document using the simple syntax:

# Bob

employer: Apple
  since: 1/15/2010
  until: 8/4/2013

employer: Banana
  since: 8/20/2013
  until: present

It should be obvious from the above article text that an entity “Bob” was employed by “Apple” from 1/15/2010 until 8/4/2013, started work at “Banana” on 8/20/2013 and is still employed there.

In pure RDF fashion, the since and until metadata are related to the employer facts about Bob by reifying those facts, in a way similar to the following expansion (assume prefixes “w” for the local wiki’s namespace and “rdf” for “http://www.w3.org/1999/02/22-rdf-syntax-ns#”):

w:Bob                  w:employer     w:Apple
w:Bob_employer_Apple   rdf:type       rdf:Statement
w:Bob_employer_Apple   rdf:subject    w:Bob
w:Bob_employer_Apple   rdf:predicate  w:employer
w:Bob_employer_Apple   rdf:object     w:Apple
w:Bob_employer_Apple   w:since        "1/15/2010"
w:Bob_employer_Apple   w:until        "8/4/2013"
w:Bob                  w:employer     w:Banana
w:Bob_employer_Banana  rdf:type       rdf:Statement
w:Bob_employer_Banana  rdf:subject    w:Bob
w:Bob_employer_Banana  rdf:predicate  w:employer
w:Bob_employer_Banana  rdf:object     w:Banana
w:Bob_employer_Banana  w:since        "8/20/2013"
w:Bob_employer_Banana  w:until        w:present
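To make the expansion concrete, here is a minimal Python sketch of how the shorthand could be turned into triples like the ones listed above. It is only an illustration, not an existing implementation: the function names (expand, to_term, local), the rule for guessing whether a value is a literal or a wiki topic, and the use of rdf:Statement as the type of every reified statement are all my assumptions.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]

EXAMPLE = """\
# Bob

employer: Apple
  since: 1/15/2010
  until: 8/4/2013

employer: Banana
  since: 8/20/2013
  until: present
"""


def to_term(value: str) -> str:
    """Guess the RDF term for a raw value (a deliberately naive assumption)."""
    if value[:1].isdigit():
        return f'"{value}"'   # dates and numbers become plain literals
    if "://" in value:
        return f"<{value}>"   # URLs become IRIs
    return "w:" + value       # anything else is treated as a wiki topic


def local(term: str) -> str:
    """Strip the w: prefix or literal/IRI delimiters to build identifiers."""
    return term[2:] if term.startswith("w:") else term.strip('"<>')


def expand(source: str) -> List[Triple]:
    """Expand the indented shorthand into a flat list of triples."""
    triples: List[Triple] = []
    topic: Optional[str] = None
    open_statements: List[Tuple[int, Triple]] = []  # (indent, statement)
    reified: Dict[Triple, str] = {}                 # statement -> identifier

    def reify(statement: Triple) -> str:
        # Emit the reification triples the first time a statement is qualified.
        if statement not in reified:
            s, p, o = statement
            node = "w:" + "_".join(local(t) for t in statement)
            reified[statement] = node
            triples.extend([
                (node, "rdf:type", "rdf:Statement"),
                (node, "rdf:subject", s),
                (node, "rdf:predicate", p),
                (node, "rdf:object", o),
            ])
        return reified[statement]

    for line in source.splitlines():
        if not line.strip():
            continue
        if line.startswith("#"):
            topic = "w:" + line.lstrip("#").strip()
            open_statements.clear()
            continue
        if topic is None:
            raise ValueError("a property line appeared before any '# Topic' heading")
        indent = len(line) - len(line.lstrip())
        key, _, value = line.strip().partition(":")
        # Close statements that are at the same or a deeper indent level.
        while open_statements and open_statements[-1][0] >= indent:
            open_statements.pop()
        subject = reify(open_statements[-1][1]) if open_statements else topic
        statement = (subject, "w:" + key.strip(), to_term(value.strip()))
        triples.append(statement)
        open_statements.append((indent, statement))
    return triples


if __name__ == "__main__":
    for s, p, o in expand(EXAMPLE):
        print(s, p, o)
```

Because a statement is only reified when an indented line appears beneath it, the same loop handles deeper nesting. The identifier it would invent for a qualified since statement (something like w:Bob_employer_Apple_since_1/15/2010) is just a side effect of this sketch's naming rule, not part of the proposed syntax.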

IMHO the shorthand captures the spirit and convenience of a Wiki much more so than the cumbersome approach of other Semantic Wikis (okay, maybe there’s only one other in existence: Semantic MediaWiki!). Also, there’s nothing to prevent the syntax from supporting further qualifications of Statements, so data-points like this are expressible:

# Bob

employer: Apple
  since: 1/15/2010
    source: http://example.com/bobs-resume

The nesting syntax doesn’t address using a Statement as an object of another Statement, but it feels more natural for qualifications to be expressed as metadata about the subject Statement as opposed to the other way around.

Just throwing this out there, because it makes me happy!

Let’s make software as stupid as possible

As software engineers get crafty with our programming languages, we get really good at being able to code up software to solve just about any problem. Depending on our level of experience, the amount of care we take to make it so, and our own personal tolerances for imperfection, we determine that the code we write is elegant, easy to maintain, and extensible.

As our software ages and grows and the ecosystem it lives in changes, even the cleanest of implementations and abstractions become tainted by the encroachment of changing paradigms and the outside world. We see them break down and develop kludgey workarounds and schisms. Eventually, even seemingly insignificant changes become burdensome or impossible due to design contradictions and complicated dependencies.

We call this system “legacy”, we deprecate it, shrug our shoulders and decide to rewrite the whole thing. When we do, it will be elegant, easy to maintain, and extensible.

Next time we won’t make the same mistakes. We’ll make new ones.

It may even be true that the cycle of death and rebirth in software is useful and even desirable, but that is a topic for another article. What I want to address here is an aspect of software design that I feel is potentially the most significant factor in the premature aging of code and in the cost of writing and rewriting software: the encoding of knowledge in source code.

We commonly look to model the concepts and rules of the domain we’re operating on directly in our application’s source code, which means that the code becomes the authoritative source of truth as to the intent and behavior of the system. In other words, the most reliable knowledge we have about our software’s behavior is embedded and lives within the source code itself.

The problem with this is that “code” is written first and foremost to tell the computer what to do, not to describe it to humans.

This is true in varying degrees, depending on the style of the developer and the expressiveness or orientation of the programming language. You might even be thinking that your code doesn’t have this problem, because you are an adherent of true behavior-driven development and your specs or tests tell the true story of your software, not the application’s code.

In some sense you’d be right, since application code is no longer the authority. In another sense you’d be twice as wrong, since test coverage only really enables a form of parity in the encoding of knowledge, in that now you have simply encoded your intent in two distinct sets of source code, the application and the test suite.

The specification becomes codependent on the code, instead of existing on its own terms.

Before I continue, perhaps I should explain why it is expensive to put the knowledge into source code in the first place. There are actually a great number of reasons and effects, but let’s constrain the issue to the software development lifecycle for this article.

Consider that you are reaching the point of rewriting a system, because it is becoming too costly to extend the current version. Somehow you have to re-express or port all the specifications from the old system that you want to keep into a new system.

There are few artifacts from an old codebase which can be brought to the new one directly. One potential example would be a Cucumber feature file. Cucumber features are interesting software artifacts, because they don’t actually express software implementation details. They can represent a description of the behavior of a software feature, irrespective of the programming language it is written in or any other technical implementation detail.

Cucumber is great at expressing acceptance tests for a feature, but it is not intended to be used for any other purpose. Implementation and test code still need to be written. What if we expressed our applications in something akin to Cucumber’s feature file format, but used them to generate code for the implementation and test suite?

Coding is easy. Definition is hard.

One thing that becomes clear after years of experience working on large software projects is that writing good code is only a small part of the job. In fact, focusing on code can really detract from the job. Successful software development is, at its core, much more about defining concepts, communicating intent and managing complexity through abstraction than it is about writing code.

I have found it very interesting to approach actual “programming” from a purely non-code perspective. I’d be tempted to call it pseudocode, except that the goal of writing software without code is more about modeling and conceptualization than instructions and execution. Consider the following description of the game “snake”:

Snake is a game. The game begins with an empty game board. A game board is a rectangular space surrounded by walls. When the game begins, a snake is placed in the middle of the board and begins to slither to the right. The player controls the direction of the snake. The snake cannot stop moving. The snake will die if its head touches its own body or any of the walls of the game board. Apples appear on the board every 20 seconds and disappear if not eaten within 10 seconds. A score is kept, beginning at 0 and increasing by 1 for each second played. If the snake’s head touches an apple, the snake will eat the apple. If the snake eats an apple, the snake will grow and the score will increase by 100 points.

This is a fairly simple game to describe and the specification above offers a good basis for an implementer to produce a working version of the game. What is interesting to me about it as an artifact is how I might use Cucumber-style “step definition” matchers to map statements in the specification to generated implementation code and corresponding tests.

A definition for “The player controls the direction of the snake” could boil down to the implementation code for the user-input control bindings and some acceptance tests around them. Think of a pair of Cucumber step definitions for each statement, one generating implementation code and the other generating test code.

So I say now that the software is “stupid” because the implementation code itself doesn’t “know” or act as the authority about our intent, because we’re defining our intent in the “plainspeak” specification.

This approach organizes the development effort and its source artifacts around the human-oriented program specification. In a way this is a form of Micro-Feature Driven Development, where each statement acts as a feature unto itself.

I’ve not actually written software this way, yet.

My goal in writing this post is to put the idea out there and possibly start a discussion around it before diving in whole hog. A bit of tooling would probably be needed to make this manageable. I’m currently exploring ways to do this that are simple.

There’s a company called Intentional Software that does this kind of thing, but their approach relies wholly on a massive set of proprietary software tooling, which is amazing but really not simple. I recommend reading about their approach in this old Martin Fowler article. It’s a much more technical solution to the knowledge encoding problem.

I’d like to find a way to solve the essential problem using a simpler plain text and template driven scheme that is friendly and convenient to typical developers. I’ll update the blog with my progress on this front if/as I get there.

Usergenic Manifesto

As we use tools, they become an extension of us. Our minds trace over the fractal surface of the possibilities they enable and we are ever influenced subtly by their design.

We become fluent with tools when they are convenient and predictable. These qualities are essential to attracting and engaging us. Power at the expense of convenience or cleverness at the expense of predictability frustrates our drive to connect.

Tools can be complicated in ways that meaningfully improve their usefulness, such as with a ratcheted screwdriver handle. They can also be made unwieldy when form is sacrificed in the name of function, as in the Wenger Giant 85-tool Swiss Army Knife.

When tools are humane, they empower, enrich and expand us. When they are not, they perpetuate a myth about our weakness. Tools transform us and the world around us.

Their invention is a sacred art.

Rebooted.

In reviewing my attempts at building a coherent expression of myself online over the past two decades, I conclude I have been wildly inconsistent and ineffective. I have not a singular vision in this regard; I get “visions” all the time, but my focus changes so regularly, I inevitably wind up tearing down my blogs and site experiments after a while, because they feel obsolete when I get bored of them and I’m on to the next vision.

This modus operandi is ultimately very unfulfilling, however, and I feel increasingly adrift. I am at the point now where I really need to set up a place to think out loud again and gather some much-needed psychic traction.

It’s time to start this blog again. And this time… it’s for realsies.

I’ve been inspired by Michael Hyatt and other bloggers and podcasters espousing the virtues of cultivating a “platform” and realize that I’ve been creating and destroying my platform for years in a Sisyphean dance. If I just cut out the destruction part of my process, I might actually have something to be proud of in a little while. So I’m intentionally committing to keeping at least this site around.

With that said, I’ve spent a good amount of time tonight getting everything installed and wired up. I’ll probably write up what I did there in another post. In the interim, I’m off to bed.