Vassar College Text Encoding Initiative (TEI)

#Vassar College Text Encoding Initiative (TEI)#

The text of the TEI Guidelines is rich in examples. There is also a samples page on the TEI wiki, which gives examples of real-world projects that expose their underlying TEI. TEI allows texts to be marked up syntactically at any level of granularity, or at a mixture of granularities. For example, a paragraph (p) can be marked up into sentences (s) and clauses (cl). The Guidelines suggest a variety of options for representing this sort of data.
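To make the sentence and clause markup concrete, here is a minimal Python sketch that reads a paragraph marked up with p, s and cl and lists the clauses of each sentence. The sample text and the helper name `outline` are invented for illustration, and the fragment omits the TEI namespace that a full TEI document would declare.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one paragraph (p) holding two sentences (s),
# each divided into clauses (cl). Not taken from any real TEI project.
SAMPLE = """
<p>
  <s><cl>The clerk opened the ledger,</cl>
     <cl>and the room fell silent.</cl></s>
  <s><cl>Nobody spoke.</cl></s>
</p>
"""

def outline(paragraph_xml):
    """Return one list of clause strings per sentence in the paragraph."""
    root = ET.fromstring(paragraph_xml)
    result = []
    for sentence in root.iter("s"):
        clauses = []
        for clause in sentence.iter("cl"):
            # Join all text inside the clause and normalise whitespace.
            clauses.append(" ".join("".join(clause.itertext()).split()))
        result.append(clauses)
    return result

sentences = outline(SAMPLE)
for number, clauses in enumerate(sentences, start=1):
    print(number, clauses)
```

The same traversal works at a coarser or finer granularity: dropping the cl level, or adding word-level elements inside each clause, only changes which element names are iterated.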

The TEI defines an XML-based file format for exchanging texts. To let projects tailor that format, it provides a sophisticated customization mechanism known as ODD. In addition to documenting and describing each TEI tag, an ODD specification specifies its content model and other usage constraints, which may be expressed using Schematron. TEI Lite is an example of such a customization: it is a manageable selection from the extensive set of elements available in the full TEI Guidelines.

As an XML-based format, TEI cannot directly deal with overlapping markup and non-hierarchical structures.
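The limitation on overlapping markup can be seen with any XML parser. The sketch below, using Python's standard library, shows a sentence (s) that starts in one verse line (l) and ends in the next being rejected as not well-formed, while a version that marks line beginnings with empty lb milestone elements parses. The sample strings and the helper `is_well_formed` are illustrative assumptions, and milestones are only one of several workarounds.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# The sentence opens inside the first line and closes inside the second,
# so the s and l elements overlap instead of nesting.
OVERLAPPING = "<lg><l><s>The night was long,</l><l>and cold.</s></l></lg>"

# Assumed workaround: keep the sentence as the container and record each
# line beginning with an empty milestone element (lb).
MILESTONES = "<p><s><lb/>The night was long,<lb/>and cold.</s></p>"

print(is_well_formed(OVERLAPPING))
print(is_well_formed(MILESTONES))
```

The milestone version keeps one hierarchy (the sentence) as real elements and reduces the other (the lines) to point markers, which is the trade-off this kind of workaround makes.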

The TEI's effort was completely successful, and the TEI Guidelines are now widely accepted as the standard interchange format for textual data. Mylonas and Renear also note that the TEI has accomplished two other major achievements: it has produced a powerful new data description language (which is influencing the development of new [...]

The format differs from other well-known open formats for text (such as HTML and OpenDocument) in that it is primarily semantic rather than presentational: the semantics and interpretation of every tag and attribute are specified. There are some 500 different textual components and concepts (word, sentence, character, glyph, person, etc.); each is grounded in one or more academic disciplines, and examples are given.

The standard is split into two parts: a discursive textual description with extended examples and discussion, and a set of tag-by-tag definitions. Schemata in most of the modern formats (DTD, RELAX NG and W3C Schema) are generated automatically from the tag-by-tag definitions. A number of tools support the production of the Guidelines and their application to specific projects. Most users of the format do not use the complete range of tags, but produce a customisation using a project-specific subset of the tags and attributes defined by the Guidelines.

A number of special tags are used to circumvent restrictions imposed by the underlying Unicode: glyph, to allow representation of characters that do not qualify for Unicode inclusion, and choice, to overcome the required strict linearity.
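As a rough illustration of how choice relaxes strict linearity, the Python sketch below stores two alternative readings of one word side by side and flattens the paragraph to either reading. It assumes the common pairing of sic (the text as it stands) and corr (a correction) inside choice; the sample sentence and the helper name `reading` are invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one misspelled word encoded with both readings.
SAMPLE = "<p>The <choice><sic>recieved</sic><corr>received</corr></choice> text.</p>"

def reading(elem, prefer):
    """Flatten elem to plain text, keeping only the `prefer` alternative
    (for example "sic" or "corr") inside each choice element."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == "choice":
            # Keep only the preferred alternative, skip the others.
            for alternative in child:
                if alternative.tag == prefer:
                    parts.append(reading(alternative, prefer))
        else:
            parts.append(reading(child, prefer))
        # Text following the child element belongs to the parent.
        parts.append(child.tail or "")
    return "".join(parts)

root = ET.fromstring(SAMPLE)
original = reading(root, "sic")
corrected = reading(root, "corr")
print(original)
print(corrected)
```

Both readings live at the same point in the text stream, so a single encoded document can yield a diplomatic or a corrected rendering without duplicating the surrounding text.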

Mylonas and Renear introduce a volume of selected papers from the Text Encoding Initiative 10th Anniversary Conference, held at Brown University in November 1997. The Text Encoding Initiative (TEI) was launched in 1987 and sponsored by the Association for Computers and the Humanities, the Association for Literary and Linguistic Computing, and the Association for Computational Linguistics. Its original objective was the development of an interchange language for textual data.