Ontology Building with Emacs
I have just started to build an ontology, and I have to admit that it has been a while since I last did this; I think that the last time was when writing a paper about function (n.d.a), so I was interested to see how it would work. I have been engaged in discussions recently about syntactic aspects of OWL (n.d.b). The main reason for this is my long-held belief in the need for editing tools that work at the syntactic level; this allows us to plug in to the enormous body of programming tools supporting building, collaborative development, versioning and so on. So, I decided to build the entire thing using Emacs; the nature of the ontology also meant that I wanted to reboot my long-neglected attempts to bring literate development to ontologies (n.d.c). While it is not a large ontology, I did manage 60 classes in an afternoon, so I am quite pleased with the results.
My basic working environment is as follows:
Emacs: for editing
omn-mode.el: providing basic OWL Manchester syntax support
pabbrev.el: dynamic and automatic abbreviation support
Protege: for viewing the ontology, and running reasoners
As an environment, this works quite well now. Although I have tried it
before, Protege seems to work much better now when (mis)-used as a
display environment. First, when loading a file into Protege it gives a
report of errors, but has a nice “reload” button so that I can now fix
the errors (or at least try to). Second, after an ontology has been
loaded, Protege will now detect that the file has changed and offer to
reload it. In general, it works quite well. There are still some
issues — I have not had time to work this out reproducibly enough for a
bug report — but there are times when Protege breaks, particularly when
I change the file and break the syntax. I can live with this — while
restarting Protege is slow in computer time with the Java loading, it
doesn’t require lots of clicking to get back to where I started (“Open
Recent”), so it’s quick for the user. Finally, and most importantly of
all, Protege’s Manchester syntax parser now seems to support comment
characters correctly; at least in my hands it treats “#” as a comment
character.
Using Emacs as an editing environment over Manchester syntax has some considerable advantages over using Protege raw. I am very keyboard-centric while Protege is very mouse-centric; just moving backward and forward to class definitions with incremental search is much better than in Protege. Simple things like search and replace, especially with regexps, just happen naturally in Emacs, and there is no equivalent in Protege.
I wrote omn-mode.el a long time ago now; I don’t remember when, although
the last update to its original Subversion repository is from 2005
according to the $Date$ keyword inside the file. omn-mode is based on
generic.el, and it is starting to get a little stretched for this now; I
should move it to using the normal define-derived-mode functionality.
However, this does reflect what it is best at, which is syntax
highlighting. I had to fiddle with this a bit, to add support for some
extra keywords and just make it more consistent. I am working on the
basis that everything should be syntax-highlighted; although this makes
Manchester syntax files a little garish, it helps get the syntax
correct. I also improved some of the regexps: so " some " has been
changed to "\\<some\\>", so that the keyword is matched on word
boundaries rather than on surrounding spaces.
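To make the difference between the two patterns concrete, here is a small sketch in Python rather than Emacs Lisp. Python's \b is standing in for the Emacs \< and \> word-boundary markers, and the example lines and the function name are invented for illustration; none of this is code from omn-mode.el.

```python
import re

# Old-style pattern: the keyword needs a literal space on both sides.
SPACE_DELIMITED = re.compile(r" some ")
# Word-boundary pattern: \b stands in for the Emacs \< and \> markers.
WORD_BOUNDED = re.compile(r"\bsome\b")


def lines_matching(pattern, lines):
    """Return the lines in which the pattern is found at least once."""
    return [line for line in lines if pattern.search(line)]


# Hypothetical fragments of Manchester syntax, one per line.
EXAMPLE_LINES = [
    "hasPart some Wheel",  # keyword in the middle of a line
    "hasPart some",        # keyword at the end of a line
    "some Wheel",          # keyword at the start of a line
    "awesome Thing",       # "some" only as part of a longer word
]

print(lines_matching(SPACE_DELIMITED, EXAMPLE_LINES))
print(lines_matching(WORD_BOUNDED, EXAMPLE_LINES))
```

The space-delimited pattern only finds the keyword in the first line, while the word-boundary pattern also finds it at the start and end of a line; neither is fooled by "awesome".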
As Protege is now doing comments, I have added proper support for these
also, although as a comment character “#” is a bit irritating, given
that it is also a valid part of a URL. So I fudged a bit here and used
“# ” (a hash followed by a space) as a two-character comment start. This
means that multiple comment characters such as “##”, in the style that I
use in Lisp, are not going to work. But fixing the situation would be
much, much harder. It seems to me that “#” is not an ideal choice; a
Lisp-like “;” would be better. I think, technically, you can find a “;”
in a URL, but I have never seen one.
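The two-character fudge can be sketched outside Emacs as follows. This is an illustration in Python under my own assumptions, not the syntax-table code in omn-mode.el: the function name split_comment and the example IRI are hypothetical, and the rule is simply "a hash followed by a space starts a comment".

```python
# Assumption: a comment starts at the first hash that is followed by a space.
COMMENT_START = "# "


def split_comment(line):
    """Split a line into (code, comment); comment is None if there is none."""
    index = line.find(COMMENT_START)
    if index == -1:
        return line, None
    return line[:index], line[index + len(COMMENT_START):]


# The hash inside an IRI is followed by a name, not a space, so it is kept.
print(split_comment("Class: <http://example.org/onto#Wheel>"))
# A hash followed by a space starts a comment.
print(split_comment("Class: Wheel # a round thing"))
# Doubled comment characters without a space are not recognised at all.
print(split_comment("##note"))
```

The rule is cheap because it needs no knowledge of where IRIs begin and end, which is the harder problem that a proper fix would have to solve. In this sketch a hash followed by a space inside a quoted annotation value would still be cut as a comment.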
While the previous indentation engine (based purely on the previous line) worked surprisingly well, I have also improved this now. Actually Manchester syntax is surprisingly easy to indent reasonably well; an ontology is essentially a bag of axioms, so it has relatively little structure at the syntactic level, which means that my engine only uses three indentation levels.
Finally, I’ve also updated the mode to recognise both ":" and "_" as word constituents, for reasons that should become clear.
pabbrev.el is my own dynamic “as you type” abbreviation expansion
package, and it is still the nicest thing that I have written for Emacs.
I use it every day (and am using it now). I did notice one minor bug
with it which was making it misbehave, but the main change comes from
the update to omn-mode’s syntax table. pabbrev.el now expands prefixes
and terms as a single word. For me, at least, this seems to work well.
The change to the syntax table will also affect other dynamic
abbreviation packages, including dabbrev and hippie-expand.
Similarly, underscore_separated_terms should expand.
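The effect of the syntax table change on completion can be illustrated with a toy model in Python. This is not how pabbrev.el works internally; it only shows what changes when ":" and "_" count as part of a word. The two regular expressions stand in for the old and new word definitions, and the obo: prefix and class names are made up for the example.

```python
import re

# Stand-in for a word definition where ":" and "_" separate words.
DEFAULT_WORD = re.compile(r"[A-Za-z0-9]+")
# Stand-in for the updated definition: ":" and "_" are word constituents.
EXTENDED_WORD = re.compile(r"[A-Za-z0-9:_]+")


def completions(prefix, text, word_pattern):
    """Return the sorted, distinct words in text that extend prefix."""
    words = set(word_pattern.findall(text))
    return sorted(w for w in words if w.startswith(prefix) and w != prefix)


# Hypothetical ontology fragment with prefixed, underscore-separated names.
TEXT = "Class: obo:cell_nucleus\n    SubClassOf: obo:cell_part\n"

# With the extended definition, a prefixed term completes as one unit.
print(completions("obo:cell", TEXT, EXTENDED_WORD))
# With the default definition, only the fragments are known as words.
print(completions("obo:cell", TEXT, DEFAULT_WORD))
print(completions("nuc", TEXT, DEFAULT_WORD))
```

One side effect visible in this model is that keywords such as "Class:" also pick up their trailing colon as part of the word.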
The combination of these changes means that, for me anyway, Emacs is now
quite a capable Manchester syntax editor. I have only really touched
here on the things that I have written, but editing OWL ontologies at
the textual level also really does open up many possibilities. Having an
environment that I control is also useful. I would like to extend
Manchester syntax to support semantics-free identifiers (n.d.b). I can
now make an initial implementation of this, using a pre-processor to
unwind the Alias: definitions to produce a “real” Manchester syntax file.
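As a rough idea of what such a pre-processor might look like, here is a sketch in Python. The concrete form of an Alias: definition is not fixed anywhere above, so everything about the format here is an assumption: one definition per line, written as "Alias: readable_name real_identifier". The function name unwind_aliases and the ex: identifiers are likewise hypothetical.

```python
import re

# Assumed format, not a defined syntax: "Alias: <readable name> <identifier>".
ALIAS_LINE = re.compile(r"^\s*Alias:\s+(\S+)\s+(\S+)\s*$")
# Tokens are runs of letters, digits, ":" and "_".
TOKEN = re.compile(r"[A-Za-z0-9:_]+")


def unwind_aliases(source):
    """Drop the Alias: lines and replace each alias with its identifier."""
    aliases = {}
    body = []
    for line in source.splitlines():
        match = ALIAS_LINE.match(line)
        if match:
            aliases[match.group(1)] = match.group(2)
        else:
            body.append(line)

    def substitute(token_match):
        token = token_match.group(0)
        return aliases.get(token, token)

    return "\n".join(TOKEN.sub(substitute, line) for line in body)


SOURCE = (
    "Alias: Wheel ex:EX_0000001\n"
    "Alias: Car ex:EX_0000002\n"
    "Class: Car\n"
    "    SubClassOf: hasPart some Wheel\n"
)

print(unwind_aliases(SOURCE))
```

This version substitutes blindly, so an alias that also appears inside a full IRI, a comment or an annotation string would be rewritten too; a real implementation would need to skip those.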
———. n.d.a. https://dx.doi.org/10.1186/2041-1480-1-S1-S4.
———. n.d.c. https://www.russet.org.uk/blog/1213.