Old Norse Morphological Analyzer Overview

The analyzer is a computer program that takes normalized Old Norse words as input and outputs likely declension tables. It consists of two programs. The first analyzer accepts only headwords from the complete Zoega (1957) dictionary. The second analyzer accepts Old Norse words in any declined form. Trial programs can be found online at http://ecampusdev.humnet.ucla.edu/curban. The programs are written in Perl and use a MySQL database to retrieve grammatical information about headwords and translations. The project runs on an Apache server under Linux for increased portability.

The first analyzer builds declension tables according to regular rules of sound change and declension paradigms. The current database contains the Zoega dictionary and declension tables for the various word forms (including irregular forms) in the Fornrit normalization. The parser first looks up the given headword in the dictionary. Based on the grammatical information given in Zoega, it suggests a likely word root. Next, it queries the database for possible declension endings. Finally, it applies the relevant sound changes and outputs the results. A sample output looks like the following:

The second analyzer accepts any declined Old Norse word as input. Based on the input's ending and vowel structure, it produces all possible headwords for the input. These headwords are then fed into the first parser. The program outputs those declension tables produced by the first parser that contain the original input word. In other words, the two parser programs interact to make the search more efficient. The same sample output looks like the following (minus the translation).

The second parser inspects the declined input word 'börnum'. It then deduces the headword 'barn' from the word's ending '-um' and its vowel pattern. The headword 'barn' is then fed into the first parser to yield the above declension table.
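
The interaction between the two parsers can be illustrated with a minimal sketch. It is not the project's code: the actual programs are written in Perl and query a MySQL database, whereas this sketch uses Python and small in-memory tables. All names (NEUTER_STRONG, LEXICON, decline, candidate_headwords, analyze) are hypothetical. It covers a single strong neuter paradigm and reduces sound change to a simplified u-umlaut rule (a becomes ö).

```python
# Illustrative sketch only: all names and data below are hypothetical.

# Endings of one strong neuter paradigm, keyed by (number, case).
NEUTER_STRONG = {
    ("sg", "nom"): "", ("sg", "acc"): "", ("sg", "dat"): "i", ("sg", "gen"): "s",
    ("pl", "nom"): "", ("pl", "acc"): "", ("pl", "dat"): "um", ("pl", "gen"): "a",
}

# Cells that take u-umlaut even though their ending shows no 'u'.
UMLAUT_WITHOUT_U = {("pl", "nom"), ("pl", "acc")}

# Stand-in for the dictionary lookup: headword -> (root, paradigm).
LEXICON = {"barn": ("barn", NEUTER_STRONG)}


def u_umlaut(root):
    """Simplified sound change: turn the last 'a' of the root into 'ö'."""
    i = root.rfind("a")
    return root if i < 0 else root[:i] + "ö" + root[i + 1:]


def decline(headword):
    """First analyzer: headword -> declension table."""
    root, paradigm = LEXICON[headword]
    table = {}
    for cell, ending in paradigm.items():
        umlauted = "u" in ending or cell in UMLAUT_WITHOUT_U
        stem = u_umlaut(root) if umlauted else root
        table[cell] = stem + ending
    return table


def candidate_headwords(form):
    """Guess headwords from the ending and vowel pattern of a declined form."""
    candidates = set()
    for ending in set(NEUTER_STRONG.values()):
        if form.endswith(ending):
            stem = form[: len(form) - len(ending)]
            candidates.add(stem)                    # vowel left as is
            candidates.add(stem.replace("ö", "a"))  # u-umlaut undone
    return candidates


def analyze(form):
    """Second analyzer: keep only the tables that contain the input form."""
    results = {}
    for headword in sorted(candidate_headwords(form)):
        if headword in LEXICON:
            table = decline(headword)
            if form in table.values():
                results[headword] = table
    return results


if __name__ == "__main__":
    for headword, table in analyze("börnum").items():
        print(headword)
        for (number, case), word in table.items():
            print(f"  {number} {case}: {word}")
```

In this sketch the second step deliberately over-generates candidates and relies on the first step to filter them: only a candidate whose generated table contains the original input survives.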

Parser performance is verified using Bower (1994) and Gordon (1956) declension tables. At this point, the system performs well for nouns and adjectives. The slightly lower performance with verbs stems mainly from the larger number of exceptional verbs and from ambiguous grammatical information in our source dictionary.
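
One simple way to run such a check is to compare a generated table against a reference table cell by cell. The sketch below is an assumption about how this could be done, not a description of the project's procedure. The function name table_accuracy is hypothetical, and the reference data is the standard declension of 'barn' typed in for illustration, not quoted from either reference work.

```python
# Illustrative sketch only: table_accuracy and the data below are hypothetical.

def table_accuracy(generated, reference):
    """Return the fraction of reference cells that the generated table matches."""
    if not reference:
        raise ValueError("reference table is empty")
    matches = sum(
        1 for cell, form in reference.items() if generated.get(cell) == form
    )
    return matches / len(reference)


# Reference declension of 'barn', keyed by (number, case).
reference = {
    ("sg", "nom"): "barn", ("sg", "acc"): "barn",
    ("sg", "dat"): "barni", ("sg", "gen"): "barns",
    ("pl", "nom"): "börn", ("pl", "acc"): "börn",
    ("pl", "dat"): "börnum", ("pl", "gen"): "barna",
}

# A generated table with one deliberate error (missing u-umlaut in pl nom).
generated = dict(reference)
generated[("pl", "nom")] = "barn"

if __name__ == "__main__":
    print(table_accuracy(generated, reference))  # 7 of 8 cells match: 0.875
```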

Most Recent Achievements

At this stage, the parser recognizes all words from the Zoega dictionary. In addition, the database tables of exceptional word forms have been expanded, so the parser now recognizes more irregular forms. Code improvements in the sound change routines have also contributed to increased parsing performance.

Work on the second parser (accepting declined words, not only headwords) has progressed. A trial program is now available online for most Old Norse verbs, nouns, and adjectives.

Next Steps

In the coming year, the performance of the second parser will be improved. In addition, the dictionary will be expanded with a focus on irregular forms. In the near future, the parsers will be linked to an XML document so that clicking an individual word in the text displays its morphological information.