I will now list a number of technical buzzwords relevant to the Marginalia annotation implementation, along with an explanation of the relevance of each. This covers the major architectural choices of the project, and may be useful to other developers who want to use my code or develop similar technology.

Firefox & the W3C Range Object

This implementation uses the W3C Range object to determine the location in the page of user-selected text when creating annotations. The Range object is only fully supported by the Mozilla family of browsers, but it is a standard and may be adopted elsewhere in future. I have also implemented special code (i.e. a big hack) that extracts the location of the selection from Internet Explorer, something IE is not designed to provide.

Pointers (No XPointer)

Each annotation is associated with a location in the document. This text range is stored as a path to the start of a block-level element in the document, plus an offset beyond that point. Though the specified point follows the start of the referenced element, it is not necessarily within it. This is essential so that pointers can be compared and ordered without reference to the original document. As for the offset, it is measured in words in order to circumvent problems with inconsistent handling of whitespace between browsers. For example, /2/15.7 specifies the 8th character of the 15th word following the start of the second paragraph in the document.
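
To make the ordering property concrete, here is a minimal sketch of parsing and comparing such pointers without consulting the document. It is not the Marginalia code. The function names parsePointer and comparePointers are hypothetical, and the layout (every component but the last is a path step, the last is word.character) is inferred from the single /2/15.7 example above.

```javascript
// Parse a pointer such as "/2/15.7" into its parts.
// Assumed layout, inferred from the one example in the text: every
// component but the last is a step in the element path, and the last
// component is "<word>.<character>".
function parsePointer(pointer) {
  const match = /^((?:\/\d+)+)\/(\d+)\.(\d+)$/.exec(pointer);
  if (!match) {
    throw new Error("Not a valid pointer: " + pointer);
  }
  return {
    path: match[1].slice(1).split("/").map(Number),
    word: Number(match[2]),
    char: Number(match[3]),
  };
}

// Order two pointers using only the pointer strings themselves.
// Returns a negative number, zero, or a positive number, so it can be
// passed directly to Array.prototype.sort.
function comparePointers(a, b) {
  const pa = parsePointer(a);
  const pb = parsePointer(b);
  const steps = Math.min(pa.path.length, pb.path.length);
  for (let i = 0; i < steps; i++) {
    if (pa.path[i] !== pb.path[i]) return pa.path[i] - pb.path[i];
  }
  // Simplification: if one path is a prefix of the other, the shorter
  // path sorts first. A real implementation may need a stricter rule.
  if (pa.path.length !== pb.path.length) {
    return pa.path.length - pb.path.length;
  }
  if (pa.word !== pb.word) return pa.word - pb.word;
  return pa.char - pb.char;
}
```

Because the comparison is numeric at every level, a list of pointer strings can be sorted with pointers.sort(comparePointers), and /10/1.0 correctly follows /2/1.0.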

I chose not to use the W3C's (rather complex) XPointer specification because a) it does not provide a suitable method of specifying a unique character position in a document, and b) XPointers cannot be ordered without reference to the target document. (It might have been possible to extend XPointer, but I had to pick my battles, and XPointer is neither simple nor widely adopted.) However, elements are referenced using a format based on XPointer's child sequences.

Javascript, DOM & CSS (AJAX)

The implementation of text highlighting and the placement of annotations in the margin adjacent to highlighted text necessitate the extensive use of Javascript and the DOM. A further advantage to this approach is that there are minimal changes to the annotated page: adding annotation support to a page requires only a few CSS classes, an empty ordered list, and the inclusion of several Javascript files. This also holds out the promise of a Greasemonkey or similar implementation in future.

Microformats

The Javascript code searches the page for a simple structure, defined by CSS classes. Once it has found this, it extracts the information it needs about that "post" using CSS classes as a guide. The default class names are based on the Atom syndication format, which is undergoing standardization at microformats.org. I have discussed the advantages of this approach in my blog.

REST

There are four annotation operations: getting a list of annotations, adding an annotation, deleting an annotation, and modifying an annotation. This is a perfect match for the REST model. The annotation Javascript code communicates over XMLHttpRequest with a service, currently implemented in PHP. I tried using nice URLs for the service, and there is some support in the application, but due to the design of the applications I was integrating with I have focused on less-nice URLs.
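
As an illustration of why the four operations fit REST, the sketch below maps each one to a conventional HTTP method. It only builds a description of the request and does not send anything. The function name buildAnnotationRequest, the query parameters url and id, and the service path used in the usage note are all assumptions for illustration, not the actual Marginalia service interface.

```javascript
// Map each annotation operation to an HTTP request description.
// The query-string style mirrors the "less-nice URLs" mentioned above;
// the parameter names "url" and "id" are hypothetical.
function buildAnnotationRequest(serviceUrl, operation, params = {}) {
  switch (operation) {
    case "list":
      // Fetch the annotations attached to one annotated resource.
      return {
        method: "GET",
        url: serviceUrl + "?url=" + encodeURIComponent(params.url),
        body: null,
      };
    case "add":
      // Create a new annotation; the server assigns its identifier.
      return { method: "POST", url: serviceUrl, body: params.annotation };
    case "modify":
      // Replace the content of an existing annotation.
      return {
        method: "PUT",
        url: serviceUrl + "?id=" + encodeURIComponent(params.id),
        body: params.annotation,
      };
    case "delete":
      return {
        method: "DELETE",
        url: serviceUrl + "?id=" + encodeURIComponent(params.id),
        body: null,
      };
    default:
      throw new Error("Unknown operation: " + operation);
  }
}
```

For example, buildAnnotationRequest("/annotate.php", "delete", { id: 43 }) describes a DELETE request to /annotate.php?id=43. In a browser, the result could be passed to XMLHttpRequest's open and send methods.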

Annotea, XPointer & RDF

This may be the direction this code should evolve in, but for now it's just too complex. Annotea uses XPointer, which I have decided against, although my pointer format is inspired by it. As for RDF, I may add support one of these days. However, I have chosen Atom as the native protocol, as it brings with it the immediate benefits of syndication.

How Highlighting Works

I have been asked how I highlight arbitrary HTML, even across element boundaries. To do this, I insert one or more <em> elements into the DOM around the highlighted text. The result looks like the following:

<p>One two <em class="annotation a43">three</em></p>
<p><em class="annotation a43">four</em> five six</p>

It's not really pretty, but it's the only way to overlay highlighting on a tree structure like HTML without extending the browser. In order to prevent the code from creating an invalid document (e.g. by inserting em tags inside tr tags), the implementation checks against the HTML content model. For now, HTML 4.01 Transitional is the model used, because Moodle and other applications tend to allow users to take advantage of deprecated tags.
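
The idea of splitting one highlight into several em elements can be sketched on a simplified model. Here each paragraph is a plain string rather than a DOM subtree, and the range is given as block indexes and character offsets. This is an assumption made to keep the example short; the real code works on DOM nodes and checks the content model. The names highlightBlocks and escapeHtml are hypothetical.

```javascript
// Escape the characters that would otherwise be read as markup.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Wrap the part of each paragraph that falls inside the range in its
// own <em>, so that no <em> ever crosses a paragraph boundary.
// blocks: array of paragraph texts
// range:  { startBlock, startOffset, endBlock, endOffset }
// id:     annotation id, used to build the second class name
function highlightBlocks(blocks, range, id) {
  const open = '<em class="annotation a' + id + '">';
  return blocks.map((text, i) => {
    if (i < range.startBlock || i > range.endBlock) {
      return "<p>" + escapeHtml(text) + "</p>"; // untouched paragraph
    }
    // The first block starts mid-text, the last block ends mid-text,
    // and any block in between is highlighted in full.
    const from = i === range.startBlock ? range.startOffset : 0;
    const to = i === range.endBlock ? range.endOffset : text.length;
    return (
      "<p>" +
      escapeHtml(text.slice(0, from)) +
      open +
      escapeHtml(text.slice(from, to)) +
      "</em>" +
      escapeHtml(text.slice(to)) +
      "</p>"
    );
  });
}
```

Called with the paragraphs "One two three" and "four five six", a range from offset 8 of the first block to offset 4 of the second, and id 43, this returns the two lines of markup shown above. Every em carries the same class, so all pieces of one annotation can be styled or removed together.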

Paragraph Linking

The OJS implementation allows users to create hyperlinks to block-level elements (usually paragraphs) in other journal articles. The usual way to implement this would be by attaching an HTML id attribute to each potential link target. But this isn't feasible for OJS as the plug-in is not involved in the server-side output of article text. Javascript could add the ids when the page loads, but this would be very slow for long documents.

Instead, Marginalia uses an XPointer-like form of the fragment identifier in the URL (e.g. #/3/2/1/9). When the user clicks a link to a journal article, Javascript on the destination page checks the fragment identifier to see whether it is an XPointer-style reference to a specific element on the page. If so, it scrolls the browser to that point and flashes a box around it.
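
The check and lookup described here might look roughly like the following. This is a sketch, not the Marginalia source. The function names are hypothetical, steps are assumed to be 1-based as in XPointer child sequences, and the root element that the first step is relative to is an assumption left to the caller.

```javascript
// Return the child sequence as an array of numbers if the fragment
// looks like "#/3/2/1/9", or null for ordinary fragments such as "#top".
function parseElementFragment(fragment) {
  if (!/^#(\/[1-9]\d*)+$/.test(fragment)) return null;
  return fragment.slice(2).split("/").map(Number);
}

// Walk down from root, taking the n-th child element at each step.
// Works on anything with an indexable "children" property, which
// includes DOM elements as well as the plain objects used for testing.
function resolveChildSequence(root, steps) {
  let node = root;
  for (const step of steps) {
    const children = node.children || [];
    node = children[step - 1]; // steps are assumed to be 1-based
    if (!node) return null; // the path does not exist in this document
  }
  return node;
}
```

In the browser, the steps would come from parseElementFragment(window.location.hash), and a non-null result from resolveChildSequence could then be scrolled to with the standard scrollIntoView method. Because nothing is added to the page ahead of time, the cost is paid only when a link is actually followed.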

Server- vs. Client-Based Annotation

There are numerous client-based annotation systems that plug into the browser. Some are commercial; others, like Annotea, are not. Yet I've never seen one of these in active use. There is great resistance to installing client-based software. It's also usually browser- and platform-specific. (My code is really standards-specific: if and when more browsers support the standards, it will work in them too. In the meantime it can be adapted and degrade gracefully.) Furthermore, the capability of annotating any web page prevents these systems from annotating objects smaller than a page. My system can annotate an individual forum post and retrieve those annotations regardless of whether it is displayed in a different context or other page content around it changes. Finally, I didn't have the resources to develop client-based software. Having said all of this, it may be possible to integrate the two approaches, perhaps using Greasemonkey or the like.

Smartcopy

Smartcopy automatically adds citation information to copy-paste operations from the browser window. For example, it can add the title of the source document along with a link whenever text is copied. This is achieved through a Firefox hack: when the user clicks the mouse button, the extra information is inserted into the document, but a CSS style makes it hidden. When the text is pasted, the CSS rule that hid it is missing (or, if it is pasted into an application with style support, the style is stripped), so the extra information becomes visible. When the user clicks elsewhere, the hidden information is deleted. This degrades gracefully to an ordinary copy-paste operation if the browser doesn't support the feature. It is limited in that it cannot wrap the selected text, only prefix it, so marking the text as a blockquote (for example) is impossible. The smartcopy information is extracted from the source document according to the CSS classes described under Microformats above.

PHP

I don't like it. Magic quotes give me the willies. More importantly, I'm tremendously unimpressed by the weak support for XML and Unicode. However, I have faith in Worse Is Better: PHP is widely adopted, and will (I hope) mature in time. But I don't have to like it.

I would like to thank BC Campus for funding this effort and for allowing its release under the GNU General Public License, and Simon Fraser University for its support of the project through Dr. Andrew Feenberg's Applied Communication and Technology Lab in the School of Communication and through the Learning and Instructional Development Centre. Open Journal Systems support and additional features were made possible by Dr. Rick Kopak's "Navigating Information Spaces" project at the University of British Columbia, funded by the Social Sciences and Humanities Research Council of Canada. The UNDESA "Africa i-Parliaments Action Plan" project provided support for edit actions, the per-paragraph multiuser support, and a number of other improvements.
