A Review of James Cummings’ “The Text Encoding Initiative and the Study of Literature”

James Cummings is a digital medievalist at Oxford University, specialising in TEI XML.  His article on “The Text Encoding Initiative and the Study of Literature” may be found here.

Cummings begins with a well-grounded description of what the TEI is and why it was founded.  He notes that the TEI has existed since before the web was formed, and so “its recommendations have influenced the development of a number of web standards, most notably XML and XML-related standards”.  His article is not a complete history of the TEI, nor is it a general introduction.  Instead it serves to sample “some of the history, a few of the issues and some of the methodological assumptions” of the TEI.

Cummings goes on to give a general description of the content and structure of the TEI Guidelines.  This seems rather unnecessary, as a quick glance at the TEI’s website reveals the same information.  The main body of the article deals with the technological and theoretical background of the TEI.  It begins with a description of the TEI’s early manifesto, drawn up at a conference at Poughkeepsie in 1987.  This is quite interesting, as it allows the reader not only to trace the evolution of the TEI, but also to recognise areas of weakness or under-development.  According to Cummings, institutions such as the Oxford Text Archive and the University of Virginia’s Electronic Text Center have greatly assisted in firmly establishing the TEI’s standards for text encoding and preservation.

Text Encoding Model

One of the main benefits of the TEI, as Cummings points out, is that it is “driven by the needs of its members, but also directed by […] the technologies it employs”.  It evolves according to necessity.  The TEI draws on a diverse community of disciplines, resulting in a general encoding structure that can be adapted for basic or specialised modules.  It is very much community-based and continually adapts to its users’ needs: “That the nature of the TEI is to be directed by the needs of its users is not surprising given that it is as a result of the need for standardisation and interoperability that the TEI was formed”.  Cummings goes on to note that the Guidelines have made the elements “more applicable to a greater number of users”.

However, Cummings also points out the disadvantages of such an approach.  He believes that it leads to “methodological inequality”, where specialised markup is used for some projects while others require only more general methods.  Cummings believes that the solution to this problem is the development of “rigorous local encoding guidelines”.

Cummings makes an interesting series of statements towards the centre of his article:

It is needless to say that many involved with the earliest efforts to create systems of markup for computer systems were not literary theorists, but this is not the case with the development of the TEI, which has often benefited from rigorous debate on the very nature of what constitutes a text (McGann 2001: 187).  While the history of textual markup obviously pre-dates computer systems, its application to machine-readable text was partly influenced by simultaneous developments in literary theory and the study of literature.

While these facts may seem obvious to Cummings, they would not be so to someone with no previous knowledge of this area.  For this reason it seems to me that Cummings is writing for his peers rather than a more general audience.  However, a readership with expertise in TEI would find his introduction very basic and perhaps a bit pointless.

The article then goes on to hypothesise that New Criticism may have influenced the application of markup to digital text.  I think it would have been interesting if Cummings had dwelt on this point a bit more, but he brushes over it rather quickly.

Cummings believes that the TEI has greatly advanced our understanding of what a text is.  This is a bit far-fetched considering many people have never even heard of the TEI, but Cummings’ description of the hierarchy of texts and their overlapping structure is well elucidated.

Cummings spends much of the remainder of the article quoting from the TEI Guidelines, which makes for a rather monotonous read.  Overall, I think he makes some good points but takes a long time to get to his main one, which is that the TEI is not a perfect system but, with compromise, it makes the digital representation of texts much easier.

Images from http://it.wikipedia.org/wiki/Text_Encoding_Initiative; http://scripts.sil.org/cms/scripts/page.php?item_id=IWS-Chapter01