
Results of Our Work
For the past three years, the Cultural Heritage Language Technologies consortium has researched the most effective ways to apply technologies and techniques from computational linguistics, natural language processing, and information retrieval to the challenges faced by students and scholars working with texts written in Greek, Latin, and Old Norse. Although the project was formally organized into workpackages, our work is better understood in terms of four major thematic areas: 1) providing access to primary source materials that are often rare and fragile, 2) helping readers understand texts written in difficult languages, 3) enabling researchers to conduct new types of scholarship, and 4) preserving digital resources for the future.
Our research has produced concrete results in each of these areas. We have:
- Provided Access to Rare and Fragile Source Materials
  - Created corpora of texts written in Early-Modern Latin and Old Icelandic;
  - Created transcriptions, images, and hand-coded texts of over 60,000 words of Newton MSS in fully searchable form;
  - Created a digital library environment that presents high-resolution images of pages from rare and fragile printed books and manuscripts alongside automatically generated hypertexts, so that users can read the texts and see the pages on which they originally appeared;
- Helped Readers Understand Texts Written in Difficult Languages
- Enabled Scholars to Conduct New Types of Scholarship
  - Developed an innovative visualization tool that clusters search results into conceptual categories;
  - Used computational techniques to build a word-profile tool that integrates statistical data, information from different reference works, and citations of the passages where words appear into a single interface, which a team of scholars is using to write the first new Greek-English lexicon in more than one hundred years;
  - Created tools for the computational study of style, including tools to discover the common subjects and objects of Greek and Latin verbs, the relative frequency of different grammatical forms, and the distribution of grammatical forms in texts;
- Worked to Preserve Digital Resources for the Future
  - Worked to develop digital library architectures and protocols for resource discovery and metadata sharing in affiliated digital libraries;
  - Negotiated a free, open-access agreement with Cambridge University Press for an electronic edition of the Greek-English lexicon, to be published online simultaneously with the print edition;
  - Explored ways that these tools can be used and shared across cooperating digital library systems.