Literate OWL (well on blogs)
My next blog post was going to be about function, as I have just had a paper about it accepted. But I got slightly side-tracked along the way, thinking about Literate Programming as it applies to OWL. While an ontology is (or, to my mind, should be) a computational artifact, it’s a bit different from a program; the main thing is that it doesn’t run, so it doesn’t have the functional test that a program does. This is not to say that an ontology is not an application-dependent entity. It can be, but even then it needs to have a program built on it.
One of the upshots of this is that a narrative justification for an ontology is fairly important; currently, we spend far too long on mailing lists, arguing about ontology terms and, to my mind, not enough of this is reflected in the final outcome. If, on the other hand, we moved to a situation where adding a new concept was equivalent to writing a paper, we might have less of this. Discussion would be a bit more focussed; besides which, most scientists are experienced with writing and reviewing papers, so we’d just be better at it.
For this to happen productively, though, the paper has to become, itself, a computational artifact. It’s no good having documentation that has to be kept in sync with the ontology; we will just end up with multiple versions, and will never quite know what we are talking about. My discussions about BFO have shown me this: do we mean the OWL, the definitions in the OWL, the papers or what? We should be able to generate both readable documentation and computational OWL at the same time. In short, literate programming.
Now, I know that Bijan Parsia has been investigating this also, but I wanted to think a little bit about how it would fit into my environment.
One thought was to get the system working within asciidoc which I am using to generate these pages. This turned out to be simple enough; take, for instance, this definition for BiologicalFunction.
Class: BiologicalFunction
    Annotations:
        rdfs:comment
            "Definition: A biological function is a realizable entity that inheres in continuant
            which is realized in an activity, and where the homologous structure(s) of
            individuals of closely related species (or identical species) fulfil this
            same biological function."
    SubClassOf:
        Function
Asciidoc uses source-highlight for its syntax highlighting. I had to add a bit of config (which, annoyingly, needs to be placed into the main install directory for source-highlight, rather than in a user space dot-directory).
Unfortunately, this is not going to be as good as you might hope for printed documentation. The obvious solution here is to aim at LaTeX. I think that I am going to have a quick go at producing something like this, inspired by Literate Haskell. Basically, I need three tags which look like this:
\begin{owl}
Class: Thing
\end{owl}
\ignore{
\begin{owl}
Class: BoringOWL
\end{owl}
}
\begin{notowl}
Clazz: BrokenOwl
\end{notowl}
The first copes with OWL that should appear both in the documentation and code (that is most of it). The second covers OWL that should appear just in the code; the Haskell example is for a “help” function; I suspect that this is rarely needed for OWL. The final example appears just in documentation; it would be useful for anti-examples (“Don’t do this!!!”). My plan would be to pre-process the LaTeX just using regexps, nothing complex, to dump the OWL to a file, mostly because I don’t know how to get LaTeX to do it. Meanwhile, these two macros would just be defined in terms of the Listings package (which means writing yet another syntax highlighting set of regexps, oh dear).
Well, this is okay, but has two problems: first, it means writing OWL inside LaTeX, which means that editor support is going to be rubbish; second, what if I want to blog AND print a document? My solution to this is to move my ontologies to being multi-file based. As far as I can tell, Manchester OWL is order independent (except for the header). So the plan would be to write multiple files, each containing a few concepts:
function/header.omn
function/function.omn
function/biological_function.omn
function/artifactual_function.omn
Generating a complete Manchester syntax file from this would be easy (more or less, just run cat). This could be supported within LaTeX by adding some include macros. Again, this is trivial to do with the listings package.
\owl{function.omn}
\ignore{\owl{help.omn}}
\noowl{broken.omn}
Likewise, asciidoc supports it using include macros. I shall give this a go next week. I shall produce a document describing the axiomatisation for function in OWL that started all of this off.
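For the "just run cat" step, here is a minimal sketch in Python. It relies on my assumption above that Manchester syntax is order independent apart from the header, so the header goes first and everything else follows in sorted order. The function name `build_ontology` and the default `header.omn` file name are only illustrative.

```python
from pathlib import Path


def build_ontology(directory, header_name="header.omn"):
    """Concatenate every .omn file in a directory, header first.

    Hypothetical helper; assumes frame order does not matter after the header.
    """
    directory = Path(directory)
    header = directory / header_name
    # All remaining files, sorted so that the output is reproducible.
    rest = sorted(p for p in directory.glob("*.omn") if p.name != header_name)
    parts = [p.read_text(encoding="utf-8").rstrip("\n") for p in [header, *rest]]
    # A blank line between files keeps the frames visually separate.
    return "\n\n".join(parts) + "\n"
```

Calling `build_ontology("function")` on the layout listed earlier would return the full ontology as one string, ready to be written to a single `.omn` file.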
PS Just finished this, and found out that blogpost stripped off all my nice syntax highlighting. Took a bit of effort but (hopefully) it should all be back in again now.