<?xml version="1.0" encoding="utf-8"?>
<?oxygen RNGSchema="../../common/schema/DHQauthor-TEI.rng" type="xml"?>
<?oxygen SCHSchema="../../common/schema/dhqTEI-ready.sch"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0"
     xmlns:dhq="http://www.digitalhumanities.org/ns/dhq"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:cc="http://web.resource.org/cc/">
   <teiHeader>
      <fileDesc>
         <titleStmt>
            <title>What Your Teacher Told You is True: Latin Verbs Have Four Principal Parts </title>
            <author>Raphael Finkel</author>
            <dhq:authorInfo>
               <dhq:author_name>Raphael <dhq:family>Finkel</dhq:family>
               </dhq:author_name>
               <dhq:affiliation>University of Kentucky</dhq:affiliation>
               <email>raphael@cs.uky.edu</email>
               <dhq:bio>
                  <p>Raphael Finkel received a PhD from Stanford University in 1976 in the area of Robotics. He was a faculty member of the University of Wisconsin - Madison from 1976 to 1987.  He has been a professor of computer science at the University of Kentucky in Lexington since 1987.  His early research involves distributed data structures, distributed algorithms, and distributed operating systems.</p>
                  <p>Recent projects include formalizing natural-language morphology with
      default inheritance hierarchies, designing and implementing a web-based scheme for students to work on organic chemistry homework, and using constraints to generate puzzles like Sudoku, to model an advice-giving scenario, and to build and solve logic puzzles.</p>
                  <p>Dr. Finkel has published over 50 articles in refereed journals and conferences and has produced over 50 technical reports.  He has written two textbooks: <title rend="italic">An Operating Systems Vade Mecum</title>, (Prentice-Hall, 1988), and <title rend="italic">Advanced Programming Language Design</title> (Benjamin-Cummings, 1996).  He is also a coauthor of <title rend="italic">The Hacker's Dictionary</title> (Harper and Row, 1983).</p>
               </dhq:bio>
            </dhq:authorInfo>
            <author>Gregory Stump</author>
            <dhq:authorInfo>
               <dhq:author_name>Gregory <dhq:family>Stump</dhq:family>
               </dhq:author_name>
               <dhq:affiliation>University of Kentucky</dhq:affiliation>
               <email>gstump@uky.edu</email>
               <dhq:bio>
                  <p>Gregory T. Stump is Professor of English &amp; Linguistics at the University of Kentucky.  He earned his Ph.D. in Linguistics from the Ohio State University in 1981.  His areas of research specialization include morphological theory, the Indo-Iranian languages, and the Breton language.  In recent years, his work has focussed on the development of Paradigm Function Morphology, a realizational theory of inflection in which paradigms are taken to be central to the definition of a language’s morphology; on the use of principal-part analysis as a basis for morphological typology; and on the grammar of the Shughni language.  He is the author of <title rend="italic">Inflectional Morphology: A Theory of Paradigm Structure</title> (Cambridge, 2001) and of numerous articles in linguistics journals and edited volumes.  He is currently serving as review editor of <title rend="italic">Language</title> and as one of the main editors of <title rend="italic">Word Structure</title>.</p>
               </dhq:bio>
            </dhq:authorInfo>
         </titleStmt>
         <publicationStmt>
            <idno type="DHQarticle-id">000032</idno>
            <idno type="volume">003</idno>
            <idno type="issue">1</idno>
            <dhq:articleType>article</dhq:articleType>
            <date when="2009-02-26">26 February 2009</date>
            <availability>
               <cc:License xmlns="http://digitalhumanities.org/DHQ/namespace"
                           rdf:about="http://creativecommons.org/licenses/by-nc-nd/2.5/"/>
            </availability>
         </publicationStmt>
         <sourceDesc>
            <p>Authored for DHQ; migrated from original DHQauthor format</p>
         </sourceDesc>
      </fileDesc>
      <encodingDesc>
         <classDecl>
            <taxonomy xml:id="dhq_keywords">
               <bibl>DHQ classification scheme; full list available in the <ref target="http://www.digitalhumanities.org/dhq/taxonomy.xml">DHQ keyword taxonomy</ref>
               </bibl>
            </taxonomy>
            <taxonomy xml:id="authorial_keywords">
               <bibl>Keywords supplied by author; no controlled vocabulary</bibl>
            </taxonomy>
         </classDecl>
      </encodingDesc>
      <profileDesc>
         <langUsage>
            <language ident="en"/>
         </langUsage>
      </profileDesc>
      <revisionDesc>
          <change when="2008-10-24" who="Melanie Kohnen">Encoding</change>
          <change when="2008-10-27" who="Melanie Kohnen">Encoding</change>
          <change when="2008-11-17" who="Melanie Kohnen">Encoding</change>
         <change when="2008-12-16" who="JHF">Reviewed encoding, fixed errors, made phrase-level encoding of emph, hi, term, code more consistent. Fixed typos and image numbering, revised teaser, added missing glossary refs, removed line-end hyphenation. </change>
         <change when="2009-03-11" who="JHF">Made final changes per Greg Crane</change>
         <change when="2009-01-30" who="CRB">Added publicationStmt element and associated content.</change>
      </revisionDesc>
   </teiHeader>
   <text>
      <front>
         <dhq:abstract>
            <p>We describe two different strategies for generating the morphology of Latin verbs. First, we hand-code default inheritance hierarchies in the KATR formalism, treating inflectional exponents as markings associated with the application of rules by which complex word 
          forms are deduced from simpler roots or stems. The high degree of similarity among verbs of different conjugation classes allows us to formulate general rules; these general rules are, however, sometimes overridden by conjugation-specific rules. This approach allows linguists to gain an appreciation for the structure of verbs, gives teachers a foundation for organizing lessons in morphology, and provides students a technique for generating forms of any verb. Second, we start with a paradigm chart, then automatically remove common parts and redundant morphosyntactic property sets (columns), combine similar conjugations (rows), and generate the KATR theory that produces a complete table of forms for a set of lexemes. This second approach automatically determines principal parts (for Latin, we verify that there are four), groups inflection classes into super-classes, and builds full paradigm charts. 
    </p>
         </dhq:abstract>
         <dhq:teaser>
            <p>Two approaches to analyzing the morphology of Latin verbs</p>
         </dhq:teaser>
      </front>
      <body>
         <head>What Your Teacher Told You is True: Latin Verbs Have Four Principal Parts </head>
         <div>
            <head>Introduction</head>
            <p>Recent research into the nature of morphology has demonstrated the feasibility of two alternative approaches to the definition of a language’s inflectional system. Central to both approaches is the notion of an inflectional paradigm. In general terms, the <term>inflectional paradigm</term> of a lexeme L can be regarded as a set of <ref target="#glossary1">cells</ref>
               <note>See the glossary for a definition of technical terms.</note>, where each cell is the pairing of L with a set 
       of morphosyntactic properties, and each cell has a word form as its realization; for instance, the paradigm of the lexeme <hi rend="italic">walk</hi> includes cells such as <code>&lt;WALK, {3rd singular present indicative}&gt;</code> and <code>&lt;WALK, {past}&gt;</code>,
       whose 
         realizations are the word forms <hi rend="italic">walks</hi> and <hi rend="italic">walked</hi>. 
     </p>
            <p>Given this notion, one approach to the definition of a language’s inflectional system is the <term>realizational</term> approach (see <ptr target="#matthews1972"/>, <ptr target="#zwicky1985"/>, <ptr target="#anderson1992"/>, <ptr target="#corbett1993"/>, <ptr target="#stump2001"/>). In this approach, each 
       word form in a lexeme’s paradigm is deduced from the lexical and morphosyntactic properties of the cell that it realizes by means of a system of 
       morphological rules. For instance, the word form <hi rend="italic">walks</hi> is deduced from 
       the cell &lt;WALK, {3rd singular present indicative}&gt; by means of the rule of <hi rend="italic">-s</hi> suffixation, which applies to the root <hi rend="italic">walk</hi> of the lexeme WALK to express 
         the property set {3rd singular present indicative}. 
     </p>
            <p>An alternative approach to the definition of a language’s inflectional 
       system is the <term>implicative</term> approach (see <ptr target="#blevins2005"/>, <ptr target="#blevins2006"/>, <ptr target="#finkel2008"/>). According to this approach, certain word forms in a lexeme’s 
       paradigm serve as the basis for inferring the paradigm’s other forms. In 
       Old English, for instance, the word form <hi rend="italic">hældon</hi> 
               <q>healed (plural)</q> may be 
       deduced from the word form <hi rend="italic">hælde</hi> 
               <q>healed (3rd singular)</q> in accordance 
       with a general principle that in the inflection of a weak verbal lexeme L, the 
       realization of &lt;L, {3rd singular past indicative}&gt; and that of &lt;L, {plural 
       past indicative}&gt; stand in the relation <hi rend="italic">Xde</hi> ↔ <hi rend="italic">Xdon</hi>.</p>
            <p>Despite their differences, both approaches are capable of generating 
       a language’s inflected forms. We demonstrate this claim for Latin. We 
       first present a realizational analysis of Latin in the KATR language <ptr target="#finkel2002"/>. KATR is based on DATR, a formal language 
       for representing lexical knowledge designed and implemented by Roger 
       Evans and Gerald Gazdar <ptr target="#evans1989"/>. We then present an implicative analysis that uses techniques of abstraction and grouping to derive both a principal-part analysis and a different KATR theory for Latin. </p>
            <p>This research is part of a larger effort aimed at elucidating the morphological structure of natural languages. In particular, we are interested in 
       identifying the ways in which default-inheritance relations describe a language’s morphology as well as the theoretical relevance of the traditional 
       notion of principal parts. </p>
         </div>
         <div>
            <head>Benefits</head>
            <p>As we demonstrate below, the realizational approach leads to a Latin KATR 
       theory that provides a clear picture of the morphology of Latin verbs. Different audiences might find different aspects of it attractive.
     
     <list type="unordered">
                  <item>A <hi rend="bold">linguist</hi> can peruse the theory to gain an appreciation for the structure of Indo-European verbs in general and Latin verbs in particular, 
         with all exceptional cases clearly marked either by morphophonological 
         <ref target="#glossary3">diacritics</ref> or by rules of <ref target="#glossary7">sandhi</ref>, which are segregated from all the other rules. 
       </item>
                  <item>A <hi rend="bold">teacher</hi> of the language can use the theory as a foundation for organizing lessons in morphology.</item>
                  <item>A <hi rend="bold">student</hi> of the language can suggest verb roots and use the theory 
         to generate all the appropriate forms, instead of locating the right 
         paradigm in a book and substituting consonants. 
       </item>
               </list>
            </p>
            <p>The implicative approach that we demonstrate in this paper has several 
       benefits.
     
     <list type="unordered">
                  <item>It automatically determines which forms of a verb could be treated 
         as principal parts. For Latin, we compute that four principal parts 
         suffice.</item>
                  <item>It allows us to group inflection classes into super-classes. For Latin 
         verbs, there are more than four inflection classes if one takes into 
         account variations in such forms as those of the active perfect and 
         passive participle; our grouping method shows that the traditional 
         organization into four conjugations is consistent with super-classes 
         of our more finely detailed set of inflection classes.</item>
                  <item>It generates charts showing the full paradigm of lexemic exemplars; 
         such charts can have pedagogic value.</item>
               </list> 
            </p>
         </div>
         <div>
            <head>A Realizational KATR Theory for Latin</head>
            <p>The purpose of the KATR theory described here is to generate verb forms 
       for Latin, specifically, the realizations of all combinations of the morphosyntactic properties of voice (active/passive), mood (indicative/subjunctive), 
       aspect (imperfective/perfective), tense (present/past/future), number (singular/plural), and person (1/2/3). The combinations form a total of 144 
       <hi rend="bold">morphosyntactic property sets</hi> (MPSs). However, Latin has no future subjunctive, reducing the total to 120 MPSs.</p>
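            <p>The count can be checked by enumerating the property combinations directly. The following Python sketch is only an illustration and is not part of the KATR theory; the variable names are assumptions, while the property values are those listed above.
<eg>
```python
from itertools import product

# Property values as listed in the text; the variable names are illustrative.
VOICES = ["active", "passive"]
MOODS = ["indicative", "subjunctive"]
ASPECTS = ["imperfective", "perfective"]
TENSES = ["present", "past", "future"]
NUMBERS = ["sg", "pl"]
PERSONS = ["1", "2", "3"]

# Every combination of one value per category.
all_mps = list(product(VOICES, MOODS, ASPECTS, TENSES, NUMBERS, PERSONS))

# Latin has no future subjunctive, so drop those combinations.
latin_mps = [m for m in all_mps
             if not (m[1] == "subjunctive" and m[3] == "future")]

print(len(all_mps), len(latin_mps))
```
</eg>
        The 24 excluded sets are the future subjunctive combinations of two voices, two aspects, two numbers, and three persons.</p>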
            <p>Latin verbs consist of a sequence of morphological formatives, arranged 
       in five slots:
     
     <list type="unordered">
                  <item>Root, which realizes the verb’s lexeme, possibly dependent on the 
          aspect. For instance, for the verb <hi rend="italic">laudō</hi>, the root is <hi rend="italic">laud</hi>.</item>
                  <item>Tense marker 1, which realizes part of the verb’s tense, possibly dependent on the mood. For instance, for past perfective subjunctive, 
         this marker is <hi rend="italic">issē</hi>.</item>
                  <item>Tense marker 2, which realizes another part of the verb’s tense. This 
         marker is usually empty. For the past indicative, though, it is <hi rend="italic">ā</hi>, and 
         for the present perfective subjunctive, it is <hi rend="italic">ī</hi>. 
       </item>
                  <item>Person/Number, which realizes a verb’s properties of person and 
         number, possibly dependent on other categories. For example, the 
         marker for the first person singular present indicative is <hi rend="italic">ō</hi>. </item>
                  <item>Voice, which realizes a verb’s voice. It is empty for the active voice, 
         and is usually <hi rend="italic">r</hi> for the passive voice. </item>
               </list> 
            </p>
            <p>To keep our discussion short, we omit the imperative and infinitive forms, 
       although our complete KATR theory includes them without difficulty. For 
       those forms that use a participle (such as the perfective passive), we limit 
       ourselves to a single form, the masculine singular. 
     </p>
            <p>There are four frequently encountered conjugations, distinguished by 
       their theme vowel: first (ā: <hi rend="italic">laudāre</hi>), second (ē: <hi rend="italic">monēre</hi>), third (i: <hi rend="italic">dūcere</hi>, 
       <hi rend="italic">capere</hi>), fourth (ī: <hi rend="italic">audīre</hi>). The third conjugation has two variants; in one (<hi rend="italic">capere</hi>), the theme vowel is more pronounced. 
     </p>
         </div>
         <div>
            <head>The Conjugation-1 Verb laudāre <q>Praise</q>
            </head>
            <figure xml:id="figure1">
               <graphic url="resources/images/finkel_and_stump_2007_fig1.jpg"/>
               <figDesc>A diagram showing a network of nodes for generating forms of verbs</figDesc>
               <dhq:caption>A network of nodes for generating forms of verbs in five conjugations.</dhq:caption>
            </figure>
            <p>A theory in KATR is a network of <ref target="#glossary6">nodes</ref>. The network of nodes constituting our verb morphology theory is partially represented in Figure 1. The organizational principle in this network is hierarchical: The tree structure’s 
       terminal nodes represent individual verbal lexemes, and each of the non-terminal nodes in the tree defines default properties shared by the lexemes 
       that it dominates. 
     </p>
            <p>Each of the nodes in a theory houses a set of <term>rules</term>. We represent the 
       verb <hi rend="italic">laudāre</hi> 
               <q>praise</q> by a node: 
       
       
<eg>Praise: 
  1 &lt;root&gt; == l a u d 
  2 &lt;&gt; == VerbA</eg>     
       
       
       The node, named <code>Praise</code>, has two rules, which we number for discussion purposes only. KATR syntax requires that a node be terminated 
       by a single period (full stop), which we omit here. Our convention is to 
       name the node for a lexeme by a capitalized English word (here <code>Praise</code>) 
       representing its meaning. 
     </p>
            <p>Rule 1 says that a query asking for the root of this verb should produce a 
       four-atom result containing l, a, u, and d. Rule 2 says that all other queries 
       are to be referred to the <code>VerbA</code> node, which we introduce below. 
     </p>
            <p>A <term>query</term> is a list of atoms, such as <code>&lt;root&gt;</code> or <code>&lt;active indicative perfect present 1 sg&gt;</code>, addressed to a node such as <code>Praise</code>. In our 
       theory, the atoms in queries generally represent <term>morphological formatives</term> 
       (such as <code>root, themeVowel</code>), <term>morphosyntactic properties</term> (such as <code>perfect, sg</code>) or <term>surface forms</term> (specific orthographic characters). </p>
            <p>A query addressed to a given node is matched against all the rules 
       housed at that node. A rule <term>matches</term> if all the atoms on its left-hand side 
       match the atoms in the query. A rule can match even if its atoms do not exhaust the entire query. In the case of <code>Praise</code>, a query <code>&lt;root perfect&gt;</code> 
         is matched by Rules 1 and 2; a query <code>&lt;themeVowel&gt;</code> is only matched by 
           Rule 2. 
     </p>
            <p>Left-hand sides expressed with <term>path notation</term> (<code>&lt;pointed brackets&gt;</code>) 
       only match if their atoms match an initial substring of the query. Left-hand 
       sides expressed with <term>set notation</term> (<code>{braces}</code>) match if their atoms are all 
       expressed, in whatever position, in the query. We usually use set notation 
       for queries based on morphological formatives and morphosyntactic properties, where order is insignificant, but path notation for queries based on 
       surface forms, where order is significant. </p>
            <p>When several rules match, KATR picks the best match, that is, the one 
       whose left-hand side <q>uses up</q> the most of the query. This choice embodies Pāṇini’s principle, which entails that if two rules are applicable, the 
       more restrictive rule applies, to the exclusion of the more general rule. We 
       sometimes speak of a rule’s <term>Pāṇini precedence</term>, which is the cardinality of 
       its left-hand side. If a node in a KATR theory houses two applicable rules 
       with the same Pāṇini precedence, we consider that theory malformed.</p>
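            <p>The matching and selection procedure just described can be sketched in Python. This is a simplified illustration, not the KATR implementation: the function names and the representation of a rule as a (name, notation, left-hand side) triple are assumptions, and right-hand sides are ignored.
<eg>
```python
def path_matches(lhs, query):
    """Path notation: the atoms must form an initial substring of the query."""
    return list(query[:len(lhs)]) == list(lhs)

def set_matches(lhs, query):
    """Set notation: every atom must occur somewhere in the query."""
    return set(lhs).issubset(query)

def best_rule(rules, query):
    """Return the name of the matching rule with the highest Pāṇini precedence."""
    matching = []
    for name, kind, lhs in rules:
        test = path_matches if kind == "path" else set_matches
        if test(lhs, query):
            # Precedence is the cardinality of the left-hand side.
            matching.append((len(lhs), name))
    if not matching:
        return None
    top = max(size for size, _ in matching)
    winners = [name for size, name in matching if size == top]
    if len(winners) != 1:
        raise ValueError("malformed theory: tie in Pāṇini precedence")
    return winners[0]

# The two rules of the Praise node; the empty left-hand side is the default.
PRAISE = [("Rule 1", "path", ["root"]), ("Rule 2", "path", [])]

print(best_rule(PRAISE, ["root", "perfect"]))
print(best_rule(PRAISE, ["themeVowel"]))
```
</eg>
        For the query <code>&lt;root perfect&gt;</code>, both rules match, but Rule 1 uses up one atom and Rule 2 none, so Rule 1 is selected; for <code>&lt;themeVowel&gt;</code>, only Rule 2 matches.</p>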
            <p>In our case, Rule 2 of <code>Praise</code> only applies when Rule 1 does not apply, 
       because Rule 1 is always a better match if it applies at all. Rule 2 is called a
       <term>default rule</term>, because it applies by default if no other rule applies. Default 
       rules define a hierarchical relation among some of the nodes in a KATR 
       theory; thus, in the tree structure depicted in <ref target="#figure1">Figure 1</ref>, node X immediately 
       dominates node Y iff Y houses a default rule that refers queries to X.</p>
            <p>KATR generates output based on queries directed to nodes representing individual lexemes. Since these nodes, such as <code>Praise</code>, are not referred to by other nodes, they are called <term>leaves</term>, as opposed to nodes like <code>VerbA</code>, 
       which are called <term>internal nodes</term>. The KATR theory itself indicates the list 
       of queries to be addressed to all leaves. Here is the output that KATR generates for several queries directed to the <code>Praise</code> node. 
     </p>
            <p>
               <table xml:id="table1">
                  <row>
                     <cell>
                        <code>active,indicative,imperfective,present,sg,1</code>
                     </cell>
                     <cell>
                        <code>laudō </code>
                     </cell>
                  </row>
                  <row>
                     <cell>
                        <code>active,indicative,imperfective,past,sg,2</code>
                     </cell>
                     <cell>
                        <code>laudābās</code>
                     </cell>
                  </row>
                  <row>
                     <cell>
                        <code>active,indicative,imperfective,past,sg,3</code>
                     </cell>
                     <cell>
                        <code>laudābat </code>
                     </cell>
                  </row>
                  <row>
                     <cell>
                        <code>active,indicative,imperfective,future,pl,1</code>
                     </cell>
                     <cell>
                        <code>laudābimus </code>
                     </cell>
                  </row>
                  <row>
                     <cell>
                        <code>active,indicative,perfective,present,pl,2</code>
                     </cell>
                     <cell>
                        <code>laudāvistis</code>
                     </cell>
                  </row>
                  <row>
                     <cell>
                        <code>active,indicative,perfective,past,pl,3</code>
                     </cell>
                     <cell>
                        <code>laudāverant </code>
                     </cell>
                  </row>
                  <row>
                     <cell>
                        <code>active,indicative,perfective,future,sg,1</code>
                     </cell>
                     <cell>
                        <code>laudāverō </code>
                     </cell>
                  </row>
                  <row>
                     <cell>
                        <code>active,subjunctive,imperfective,present,sg,2</code>
                     </cell>
                     <cell>
                        <code>laudēs </code>
                     </cell>
                  </row>
                  <row>
                     <cell>
                        <code>active,subjunctive,imperfective,past,sg,3</code>
                     </cell>
                     <cell>
                        <code>laudāret </code>
                     </cell>
                  </row>
                  <row>
                     <cell>
                        <code>active,subjunctive,imperfective,past,pl,1</code>
                     </cell>
                     <cell>
                        <code>laudārēmus </code>
                     </cell>
                  </row>
                  <row>
                     <cell>
                        <code>active,subjunctive,perfective,present,pl,2</code>
                     </cell>
                     <cell>
                        <code>laudāverītis </code>
                     </cell>
                  </row>
                  <row>
                     <cell>
                        <code>passive,indicative,imperfective,present,pl,3</code>
                     </cell>
                     <cell>
                        <code>laudantur </code>
                     </cell>
                  </row>
                  <row>
                     <cell>
                        <code>passive,indicative,imperfective,past,sg,1</code>
                     </cell>
                     <cell>
                        <code>laudābar</code>
                     </cell>
                  </row>
                  <row>
                     <cell>
                        <code>passive,indicative,imperfective,past,sg,2</code>
                     </cell>
                     <cell>
                        <code>laudābāris </code>
                     </cell>
                  </row>
                  <row>
                     <cell>
                        <code>passive,indicative,imperfective,future,sg,3</code>
                     </cell>
                     <cell>
                        <code>laudābitur</code>
                     </cell>
                  </row>
                  <row>
                     <cell>
                        <code>passive,indicative,perfective,present,pl,1</code>
                     </cell>
                     <cell>
                        <code>laudātī sumus</code>
                     </cell>
                  </row>
                  <row>
                     <cell>
                        <code>passive,indicative,perfective,past,pl,2</code>
                     </cell>
                     <cell>
                        <code>laudātī erātis</code>
                     </cell>
                  </row>
                  <row>
                     <cell>
                        <code>passive,indicative,perfective,future,pl,3</code>
                     </cell>
                     <cell>
                        <code>laudātī erunt</code>
                     </cell>
                  </row>
               </table>
       
       Rule 1 of <code>Praise</code> illustrates the strategy we term <term>provisioning</term> 
                <ptr target="#finkel2007b"/>: It provides information (here, the letters of the verb’s 
        root) needed by but not provided by more general nodes (here, <code>VerbA</code> and 
        the nodes to which it, in turn, refers). 
     </p>
            <p>We refer to the individual segments of a morphological form by means 
       of particular atoms: 
       
       <list type="unordered">
                  <item>
                     <code>themeVowel</code> is the vowel usually found after the root of a verb. </item>
                  <item>The atoms <code>stemImperfective, stemPerfective</code>, and <code>stemParticiple</code> are the verb stems, usually including the theme vowel, that precede suffixes marking tense, number, person, mood, and aspect. </item>
               </list>
            </p>
         </div>
         <div>
            <head>The <code>VerbA</code> Node </head>
            <p>We now turn to the <code>VerbA</code> node, to which the <code>Praise</code> node refers.
     
       
        <eg>VerbA: 
          1 &lt;themeVowel&gt; == ā 
          2 &lt;stemImperfective&gt; == "&lt;root&gt;" &lt;themeVowel&gt;
          3 &lt;stemPerfective&gt; == &lt;stemImperfective&gt; v
          4 &lt;stemParticiple&gt; == &lt;stemImperfective&gt; t
          5 &lt;&gt; == Verb:&lt;conj1&gt;
        </eg>
       
       As with the <code>Praise</code> node, <code>VerbA</code> defers most queries to its parent, in this case the node called <code>Verb</code>, as Rule 5 indicates. 
     </p>
            <p>Most of these rules are for provisioning. Rule 1 answers the 
       <code>themeVowel</code> query. We sometimes address this query to leaf nodes; the <code>Praise</code> node defers it to <code>VerbA</code>, which answers the query.</p>
            <p>Rules 2-4 provision the three stems. The quoted path <code>"&lt;root&gt;"</code> in the 
        right-hand side of Rule 2 directs a new query to the node to which the original query 
        was first addressed, in our case, <code>Praise</code>, which produces the four atoms 
        <code>l a u d</code>. The non-quoted path <code>&lt;themeVowel&gt;</code> directs a new query to the 
          current node, that is, <code>VerbA</code>, where it is resolved by Rule 1. The right-hand side of 
          Rule 2 is therefore equivalent to <code>l a u d ā</code>. Similarly, the right-hand side of Rule 3 is equivalent to <code>l a u d ā v</code>, and the right-hand side of Rule 4 to <code>l a u d ā t</code>. 
     </p>
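            <p>The difference between quoted and non-quoted paths can be illustrated with a small Python sketch. It is a deliberately reduced model, not the KATR evaluator: queries are single atoms, the dictionary layout and the labels <code>local</code> and <code>global</code> are assumptions made for the illustration, and the <code>Verb</code> node is omitted, so only the stem and theme-vowel queries are answered.
<eg>
```python
# Each node maps a query atom to a right-hand side; "" marks the default rule.
# A right-hand-side item is a literal atom, ("local", atom) for a non-quoted
# path, or ("global", atom) for a quoted path.
THEORY = {
    "Praise": {
        "root": ["l", "a", "u", "d"],
        "": "VerbA",  # default: refer the query to VerbA
    },
    "VerbA": {
        "themeVowel": ["ā"],
        "stemImperfective": [("global", "root"), ("local", "themeVowel")],
        "stemPerfective": [("local", "stemImperfective"), "v"],
        "stemParticiple": [("local", "stemImperfective"), "t"],
    },
}

def evaluate(node, atom, origin=None):
    """Evaluate a one-atom query at a node, remembering the original node."""
    origin = origin or node
    rules = THEORY[node]
    if atom not in rules:
        # No specific rule: follow the default rule to the parent node.
        return evaluate(rules[""], atom, origin)
    result = []
    for item in rules[atom]:
        if isinstance(item, tuple):
            scope, sub = item
            # Quoted paths restart at the original node; others stay here.
            target = origin if scope == "global" else node
            result.extend(evaluate(target, sub, origin))
        else:
            result.append(item)
    return result

print(" ".join(evaluate("Praise", "stemPerfective")))
```
</eg>
        The query for the perfective stem is passed from <code>Praise</code> to <code>VerbA</code>; the quoted root query returns to <code>Praise</code>, while the theme-vowel query is answered locally.</p>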
            <p>Rule 5 is a default rule, directing its query to the Verb node, with the 
       atom <code>conj1</code> prepended to the query. Therefore, queries addressed to <code>Verb</code> contain not only morphosyntactic markers (such as <q>present passive</q>) but 
       also informational markers (<q>conj1</q>). 
     </p>
            <p>By way of contrast, we also present the <code>VerbIO</code> node, which applies to <hi rend="italic">i</hi>-stem third-conjugation verbs such as <hi rend="italic">facere</hi> and <hi rend="italic">capere</hi>.
    
     <eg>VerbIO: 
       1 {themeVowel} == i 
       2 {themeVowel past imperfective } == I 
       3 {themeVowel 2 sg present imperfective passive 
       indicative} == I 
       4 {themeVowel imperative} == I 
       5 {themeVowel infinitive} == I 
       6 &lt;stemImperfective&gt; == "&lt;root&gt;" &lt;themeVowel&gt; 
       7 &lt;stemPerfective&gt; == AEE:&lt;"&lt;root&gt;"&gt; 
       8 &lt;stemParticiple&gt; == "&lt;root&gt;" t 
       9 &lt;&gt; == Verb:&lt;conj3io&gt;     
     </eg>
     
     Rules 2-5 introduce the strategy we call <emph>overriding</emph>, answering a query 
     that is usually answered by a more general node in order to provide specific results for this situation. The left-hand sides of these rules use braces 
     instead of angle brackets, indicating that the order of appearance of the 
     atoms is irrelevant for matching the rules. The atom I that appears on 
     the right-hand sides of these rules is a <ref target="#glossary5">morphophoneme</ref> that either disappears or converts to an <hi rend="italic">i</hi> or an <hi rend="italic">e</hi>, depending on surrounding context, during a postprocessing step. 
     </p>
            <p>Rule 7 introduces the <emph>lookup</emph> strategy, by which particular information 
       is obtained by reference to a special-purpose node. It directs a query such 
       as <code>f a c</code> to the <code>AEE</code> node to convert the <code>a</code> to <code>ē</code>, so the perfective stem of 
       <hi rend="italic">facere</hi> becomes <hi rend="italic">fēc</hi>. Here is that lookup node:
     
<eg>AEE: 
  1 &lt;$letter#1 a $letter#2&gt; == $letter#1 ē $letter#2      
</eg>
     
     This node depends on a definition (not shown) that defines what atoms 
      are in the category <q>letter.</q> Rule 1 says that any query beginning with a 
      letter, then the atom <code>a</code>, then any letter, should evaluate to the same two letters, 
        now surrounding <code>ē</code> in place of <code>a</code>. 
     </p>
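            <p>The effect of the lookup can be mimicked in a few lines of Python. This sketch is an illustration only: the set <code>LETTERS</code> is an assumed stand-in for the definition of the category <q>letter</q> that is not shown, and the function handles only three-atom queries.
<eg>
```python
# Assumed stand-in for the "letter" category, whose definition is not shown.
LETTERS = set("abcdefghijklmnopqrstuvwxyz")

def aee(query):
    """Mimic Rule 1 of AEE for a three-atom query: letter, a, letter."""
    if (len(query) == 3 and query[0] in LETTERS
            and query[1] == "a" and query[2] in LETTERS):
        return [query[0], "ē", query[2]]
    return None  # no rule matches

print(" ".join(aee(["f", "a", "c"])))
```
</eg>
        Applied to the root of <hi rend="italic">facere</hi>, the function yields the atoms of the perfective stem <hi rend="italic">fēc</hi>.</p>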
         </div>
         <div>
            <head>The <code>Verb</code> Node </head>
            <p>Queries addressed to <code>Praise</code> are generally deferred to its parent, <code>VerbA</code>, which then defers them further to <code>Verb</code>. 
       
<eg>Verb:  
       1 {$conj34 1 sg future/present imperfective 
           indicative/subjunctive} =+= &lt;&gt; 
       2 {future subjunctive} == ! 
       3 {perfective passive} == Sandhi:&lt;"&lt;stemParticiple&gt;" 
           AdjSuffix:&lt;nominative masculine&gt; wordEnd&gt; , 
           ToBe:&lt;imperfective active&gt; 
       4 &lt;&gt; == Sandhi:&lt;StemAspect SuffixTense1 SuffixTense2 
           SuffixPersonVoice wordEnd&gt;   
</eg>
       
       Rule 1 reflects the future indicative to the present subjunctive in verbs 
       of conjugations 3 and 4 (abbreviated by the value <code>$conj34</code>) in the first singular imperfective. This rule is quite specialized, applying, for instance, 
       to <hi rend="italic">dūcam</hi> 
               <q>I will lead / may I lead.</q> Without this rule, we would produce <hi rend="italic">dūcēo</hi> 
               <q>I will lead.</q> 
            </p>
            <p>Rule 2 indicates that there is no result for a query involving future subjunctive forms; Latin does not have these forms. 
     </p>
            <p>Rules 3 and 4 reflect to the <code>Sandhi</code> node as a postprocessing step after 
       assembling the components of a verb. The rule with the widest applicability, Rule 4, is the default rule. It combines the results of queries directed to the four nodes <code>StemAspect</code>, <code>SuffixTense1</code>, <code>SuffixTense2</code>, and <code>SuffixPersonVoice</code>, along with the marker <code>wordEnd</code> for postprocessing.  Rule 3 generates forms for the perfective passive, which involve a participle, an adjectival ending, and a specific form of the verb <hi rend="italic">esse</hi>, which has its own node <code>ToBe</code>.</p>
         </div>
         <div>
            <head>Auxiliary Nodes</head>
            <p>The <code>Verb</code> node invokes several auxiliary nodes.
       
       <eg>StemAspect: 
         1 {imperfective} =+= "&lt;stemImperfective&gt;" 
         2 {perfective} =+= "&lt;stemPerfective&gt;" </eg>
       
       The stem depends on the aspect: it is either the imperfective 
       or the perfective stem. The <code>=+=</code> notation preserves all the elements of the 
       query path, including those otherwise removed by matching the left-hand 
       side of the rules. 
     </p>
            <p>
               <eg>SuffixTense1: 
         1 &lt;&gt; == 
         2 {conj1 present imperfective subjunctive} == ē 
         3 {present imperfective subjunctive} == ā 
         4 {perfective} == e r 
         5 {past perfective subjunctive} == i s s ē 
         6 {present perfective indicative} == I 
         7 {3 pl present perfective indicative} == ē r 
         8 {3 pl present perfective subjunctive} == e r 
         9 {past imperfective indicative} == b 
         10 {future imperfective indicative} == b 
         11 {past imperfective indicative conj3io} == i ē b 
         12 {past imperfective indicative conj4} == ē b 
         13 {future imperfective indicative $conj34} == 
         14 {future imperfective indicative conj4} == ē 
         15 {future imperfective indicative conj3io} == ē 
         16 {past imperfective subjunctive} == r ē 
       </eg>
       
       The <code>SuffixTense1</code> node contains most tense information. It ranges 
       from very specific rules, like Rule 14, to fairly general rules, such as Rule 4, 
       which is overridden by more specific Rules 5-8. 
     </p>
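            <p>The way a more specific rule overrides a more general one can be sketched in Python. This is only an illustration under a stated assumption: a brace rule matches when all of its atoms occur in the query, and among the matching rules the one with the most atoms wins. The names <code>RULES</code> and <code>evaluate</code> are hypothetical, and only a subset of the <code>SuffixTense1</code> rules is included (those without variables such as <code>$conj34</code>).</p>
            <eg>
```python
# A subset of the SuffixTense1 rules, keyed by rule number.
# Each left-hand side is a set of atoms; the right-hand side is the suffix.
RULES = {
    1: (frozenset(), ""),
    4: (frozenset({"perfective"}), "e r"),
    5: (frozenset({"past", "perfective", "subjunctive"}), "i s s ē"),
    6: (frozenset({"present", "perfective", "indicative"}), "I"),
    7: (frozenset({"3", "pl", "present", "perfective", "indicative"}), "ē r"),
    9: (frozenset({"past", "imperfective", "indicative"}), "b"),
}


def evaluate(query):
    """Return (rule number, suffix) for the most specific matching rule.

    Assumption: a rule matches when all of its atoms occur in the query,
    and the matching rule with the most atoms wins.
    """
    atoms = set(query.split())
    matching = [
        (len(lhs), number, rhs)
        for number, (lhs, rhs) in RULES.items()
        if lhs.issubset(atoms)
    ]
    size, number, rhs = max(matching)
    return number, rhs


print(evaluate("3 pl present perfective indicative active"))  # (7, 'ē r')
print(evaluate("1 sg past perfective indicative active"))     # (4, 'e r')
```
            </eg>
            <p>In the first query, Rules 1, 4, 6, and 7 all match, and Rule 7 wins because it names the most atoms; in the second, only Rules 1 and 4 match, so the general Rule 4 applies.</p>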
            <p>
               <eg>SuffixTense2: 
         1 {past indicative} == ā 
         2 {present perfective subjunctive} == ī 
         3 &lt;&gt; == 
       </eg>
       
       The second tense suffix is usually empty (Rule 3), but it is occasionally 
       either <hi rend="italic">ā</hi> or <hi rend="italic">ī</hi>. 
     </p>
            <p>
               <eg>SuffixPersonVoice: 
         1 {2 sg} == I SuffixVoice SuffixPerson 
         2 {2 pl passive} == I m i n ī 
         3 &lt;&gt; == SuffixPerson SuffixVoice </eg>
       
       The suffix for person and voice is occasionally quite specific, as in the 
       second person plural passive. In the second person singular, the voice suffix (<hi rend="italic">r</hi> for the passive) precedes the person suffix, as in <hi rend="italic">laudāris</hi> 
               <q>you (sg) are 
       praised,</q> whereas the voice suffix usually follows the person suffix, as in 
       <hi rend="italic">laudāmur</hi> 
               <q>we are praised.</q> 
            </p>
            <p>
               <eg>SuffixVoice: 
         1 {passive} == r 
         2 {passive 3} == u r 
         3 &lt;&gt; == 
       </eg>
     
     In general, there is no suffix for voice (Rule 3). However, 
       there is a suffix for the passive voice, which is generally <hi rend="italic">r</hi> (Rule 1) but sometimes <hi rend="italic">ur</hi> (Rule 2). 
     </p>
            <p>
               <eg>SuffixPerson: 
         1 {1 sg} == m 
         2 {1 sg present imperfective indicative} == ō
         3 {1 sg future indicative} == ō 
         4 {1 sg present perfective indicative} == ī 
         5 &lt;&gt; == SuffixalVowel Desinence</eg>
       
       The personal suffix is usually a vowel and a 
       <ref target="#glossary2">desinence</ref> (Rule 5), but 
       the first person singular is exceptional, with a general suffix <hi rend="italic">m</hi> (Rule 1) but 
       sometimes <hi rend="italic">ō</hi> or <hi rend="italic">ī</hi>. 
     </p>
            <p>
               <eg>SuffixalVowel: 
         1 {future} == I 
         2 {present imperfective indicative} == I 
         3 {past imperfective subjunctive} == I 
         4 {3 pl +2} == u 
         5 {3 pl future perfective active} == I 
         6 &lt;&gt; == 
       </eg>
       
       The vowel has several forms; sometimes it is the morphophoneme <code>I</code> 
       as in <hi rend="italic">laudābit</hi> 
               <q>he will praise,</q> but sometimes <hi rend="italic">u</hi>, as in <hi rend="italic">laudābunt</hi> 
               <q>they will 
       praise.</q> 
            </p>
            <p>
               <eg>Desinence: 
         1 {2 sg} == I s 
         2 {2 sg present perfective indicative} == s t ī 
         3 {3 sg} == t 
         4 {1 pl} == m u s 
         5 {2 pl} == t i s 
         6 {2 pl present perfective indicative} == s &lt;2 pl&gt; 
         7 {3 pl} == n t 
       </eg>
       
       The desinence provides the final consonants, typically marking person 
       and number, but occasionally influenced by aspect and tense (Rules 2 and 
       6). 
     </p>
         </div>
         <div>
            <head>The Sandhi Node</head>
            <p>After we assemble the entire verb, we apply language-specific sandhi rules 
       to account for phonological alterations.
     
       <eg>Sandhi: 
         1 &lt;wordEnd&gt; == 
         2 &lt;$letter&gt; == $letter &lt;&gt; 
         3 &lt;s r wordEnd&gt; == &lt;r wordEnd&gt; 
         4 &lt;I&gt; == &lt;i&gt; 
         5 &lt;$unroundedVowel I&gt; == &lt;$unroundedVowel&gt; 
         6 &lt;I r&gt; == &lt;e r&gt; 
         7 &lt;I $longUnroundedVowel&gt; == &lt;$longUnroundedVowel&gt; 
         8 &lt;I e wordEnd&gt; == e % canIe =&gt; cane 
         9 &lt;i i&gt; == &lt;i&gt; % cap i i ē bam -&gt; capieebam 
         10 &lt;ā ō&gt; == &lt;ō&gt; 
         11 &lt;ā ē&gt; == &lt;ē&gt; 
         12 &lt;ē ō&gt; == &lt;e ō&gt; 
         13 &lt;ē ā&gt; == &lt;e ā&gt; % moneeām -&gt; moneām 
         14 &lt;ī ō&gt; == &lt;i ō&gt; % audīoo -&gt; audioo 
         15 &lt;ī ē&gt; == &lt;i ē&gt; % audīees -&gt; audiees 
         16 &lt;ī ā&gt; == &lt;i ā&gt; % audīām -&gt; audiām -&gt; 
         17 &lt;ī ū&gt; == &lt;i ū&gt; % audīūnt -&gt; audiūnt 
         18 &lt;ū $vowel&gt; == &lt;u $vowel&gt; % frūctū-um -&gt; 
             frūctu-um; cornūa-&gt; 
         19 &lt;$longVowel $nonSibilantConsonant wordEnd&gt; ==  
             &lt;Shorten:&lt;$longVowel&gt; $nonSibilantConsonant wordEnd&gt; 
         20 &lt;$longVowel $stop#1 $stop#2&gt; == 
             &lt;Shorten:&lt;$longVowel&gt; $stop#1 $stop#2&gt; 
         21 &lt;$longUnroundedVowel u&gt; == 
             &lt;$longUnroundedVowel&gt; % dūceebāunt -&gt; 
             dūceebānt -&gt; .. 
         22 &lt;c s&gt; == &lt;x&gt; 
         23 &lt;g s&gt; == &lt;c s&gt; 
         24 &lt;$consonant r wordEnd&gt; == &lt;$consonant e r 
             wordEnd&gt;
       </eg>
     
     Unlike other nodes, <code>Sandhi</code> works strictly from left to right, dealing
     with a few atoms at a time. The first rule removes the <code>wordEnd</code> marker 
     if that is all that is left. The other rules simplify the beginning of the remaining string of letters and then use angle brackets to direct the modified 
     string back to the <code>Sandhi</code> node. 
     </p>
            <p>Rule 2 is quite general: if no more specific rule applies, it takes a single 
       letter from the query as output, and directs the remainder of the query 
       back to <code>Sandhi</code>. Rule 3 converts forms like <hi rend="italic">*dūcimusr</hi> to <hi rend="italic">*dūcimur</hi>. Rules 4-8 deal with the morphophoneme <code>I</code>, typically converting it to <hi rend="italic">i</hi> (Rule 4), 
       but sometimes converting it to <hi rend="italic">e</hi> or removing it entirely. Rules 9-18 deal 
       with two vowels in a row; typically, the second vowel is retained, and the 
       first either disappears or shortens. Rules 19 and 20 shorten long vowels 
       in certain contexts by appealing to the <code>Shorten</code> node, which we omit here. 
       Finally, Rules 22-24 introduce spelling rules, which we include under the 
       rubric of sandhi. </p>
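            <p>The left-to-right behavior of <code>Sandhi</code> can be imitated with a short Python sketch. It models only Rules 1-5, 10, and 22, tries the longer patterns first, and assumes a particular membership for the unrounded vowels, which the listing above does not spell out. The names <code>rewrite</code>, <code>sandhi</code>, and <code>UNROUNDED</code> are hypothetical, and the input must end with the <code>wordEnd</code> marker.</p>
            <eg>
```python
# Assumption: which atoms count as unrounded vowels is a guess made for
# this sketch; the category definition is not shown in the listing.
UNROUNDED = {"a", "e", "i", "ā", "ē", "ī"}


def rewrite(atoms):
    """Try the specific rules on the front of the atom list.

    Returns the rewritten list, or None if no specific rule applies.
    Only Rules 3, 4, 5, 10, and 22 are modeled here.
    """
    if atoms[:3] == ["s", "r", "wordEnd"]:                 # Rule 3
        return ["r", "wordEnd"] + atoms[3:]
    if len(atoms) >= 2 and atoms[0] in UNROUNDED and atoms[1] == "I":
        return [atoms[0]] + atoms[2:]                      # Rule 5
    if atoms[:2] == ["ā", "ō"]:                            # Rule 10
        return ["ō"] + atoms[2:]
    if atoms[:2] == ["c", "s"]:                            # Rule 22
        return ["x"] + atoms[2:]
    if atoms[:1] == ["I"]:                                 # Rule 4
        return ["i"] + atoms[1:]
    return None


def sandhi(atoms):
    """Process atoms left to right, a few at a time."""
    output = []
    atoms = list(atoms)
    while atoms != ["wordEnd"]:                # Rule 1 ends the process
        changed = rewrite(atoms)
        if changed is not None:
            atoms = changed                    # direct the result back to Sandhi
        else:
            output.append(atoms.pop(0))        # Rule 2: emit a single letter
    return "".join(output)


print(sandhi("d ū c i m u s r wordEnd".split()))  # dūcimur
print(sandhi("l a u d ā ō wordEnd".split()))      # laudō
print(sandhi("d ū c I t wordEnd".split()))        # dūcit
```
            </eg>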
         </div>
         <div>
            <head>Strategies for Building KATR Theories</head>
            <p>We have been applying KATR to natural-language morphology for several years. In addition to Latin, we have built a complete morphology of 
       Hebrew verbs <ptr target="#finkel2007b"/>, large parts of Sanskrit (and other 
       related languages), and smaller studies of Bulgarian, Swahili, Georgian, 
       Lingala, Spanish, Polish, and Turkish. KATR allows us to represent morphological rules for these languages with great elegance.</p>
            <p>Writing specifications in KATR is not easy. KATR is capable of representing elegant theories, but arriving at those theories requires considerable effort. Early choices color the structure of the resulting theory, and the author must often discard attempts and rethink how to represent the target morphology. The hardest choice is often whether to model a form by introducing a sandhi rule or a formative rule. An example is the <hi rend="italic">-mur</hi> suffix that 
       marks first person plural passive. We choose to model this ending as <emph>mus 
       + r</emph> and to reduce the result by sandhi. We could have introduced instead 
       a rule in the <code>Desinence</code> node: 
       
       <eg>4.5 {1 pl passive} == m u</eg>
       
       and let the <hi rend="italic">r</hi> appear due to the <code>SuffixVoice</code> node. In this case, we prefer 
       not to introduce a special rule in <code>Desinence</code>, partially because it looks so 
       similar to the existing Rule 4, and therefore seems unparsimonious, and 
       partially because we hypothesize that historically there really was some 
       form <hi rend="italic">*-musr</hi> that eventually elided the two final consonants. 
     </p>
         </div>
         <div>
            <head>An Implicative KATR Theory for Latin</head>
            <div>
               <head>The Paradigm Chart </head>
               <p>We start this analysis by presenting a paradigm of word forms for Latin 
         verbs. Table 2 displays a subset of the entire paradigm that covers only 
         the present indicative active forms. The roots of lexemes are abstracted 
         away from this paradigm. </p>
               <p>
                  <table xml:id="table2">
                     <row role="label">
                        <cell>CONJ</cell>
                        <cell>PrIAc1s</cell>
                        <cell>PrIAc2s</cell>
                        <cell>PrIAc3s</cell>
                        <cell>PrIAc1p</cell>
                        <cell>PrIAc2p</cell>
                        <cell>PrIAc3p</cell>
                     </row>
                     <row>
                        <cell>TEMPLATE</cell>
                        <cell>4S1C</cell>
                        <cell>1S1Cs</cell>
                        <cell>1S1Ct</cell>
                        <cell>4S1Cmus</cell>
                        <cell>1S1Ctis</cell>
                        <cell>4S1Cnt</cell>
                     </row>
                     <row>
                        <cell>cIa</cell>
                        <cell>ō</cell>
                        <cell>ā</cell>
                        <cell>ā</cell>
                        <cell>ā</cell>
                        <cell>ā</cell>
                        <cell>ā</cell>
                     </row>
                     <row>
                        <cell>cIb</cell>
                        <cell>ō</cell>
                        <cell>ā</cell>
                        <cell>ā</cell>
                        <cell>ā</cell>
                        <cell>ā</cell>
                        <cell>ā</cell>
                     </row>
                     <row>
                        <cell>cIc</cell>
                        <cell>ō</cell>
                        <cell>ā</cell>
                        <cell>ā</cell>
                        <cell>ā</cell>
                        <cell>ā</cell>
                        <cell>ā</cell>
                     </row>
                     <row>
                        <cell>cIIa</cell>
                        <cell>eō</cell>
                        <cell>ē</cell>
                        <cell>ē</cell>
                        <cell>ē</cell>
                        <cell>ē</cell>
                        <cell>ē</cell>
                     </row>
                     <row>
                        <cell>cIIb</cell>
                        <cell>eō</cell>
                        <cell>ē</cell>
                        <cell>ē</cell>
                        <cell>ē</cell>
                        <cell>ē</cell>
                        <cell>ē</cell>
                     </row>
                     <row>
                        <cell>cIIc</cell>
                        <cell>eō</cell>
                        <cell>ē</cell>
                        <cell>ē</cell>
                        <cell>ē</cell>
                        <cell>ē</cell>
                        <cell>ē</cell>
                     </row>
                     <row>
                        <cell>cIId</cell>
                        <cell>eō</cell>
                        <cell>ē</cell>
                        <cell>ē</cell>
                        <cell>ē</cell>
                        <cell>ē</cell>
                        <cell>ē</cell>
                     </row>
                     <row>
                        <cell>cIIe</cell>
                        <cell>eō</cell>
                        <cell>ē</cell>
                        <cell>ē</cell>
                        <cell>ē</cell>
                        <cell>ē</cell>
                        <cell>ē</cell>
                     </row>
                     <row>
                        <cell>cIIIa</cell>
                        <cell>ō</cell>
                        <cell>i</cell>
                        <cell>i</cell>
                        <cell>i</cell>
                        <cell>i</cell>
                        <cell>i</cell>
                     </row>
                     <row>
                        <cell>cIIIb</cell>
                        <cell>ō</cell>
                        <cell>i</cell>
                        <cell>i</cell>
                        <cell>i</cell>
                        <cell>i</cell>
                        <cell>i</cell>
                     </row>
                     <row>
                        <cell>cIIIc</cell>
                        <cell>ō</cell>
                        <cell>i</cell>
                        <cell>i</cell>
                        <cell>i</cell>
                        <cell>i</cell>
                        <cell>i</cell>
                     </row>
                     <row>
                        <cell>cIIId</cell>
                        <cell>ō</cell>
                        <cell>i</cell>
                        <cell>i</cell>
                        <cell>i</cell>
                        <cell>i</cell>
                        <cell>i</cell>
                     </row>
                     <row>
                        <cell>cIIIe</cell>
                        <cell>iō</cell>
                        <cell>i</cell>
                        <cell>i</cell>
                        <cell>i</cell>
                        <cell>i</cell>
                        <cell>i</cell>
                     </row>
                     <row>
                        <cell>cIIIf</cell>
                        <cell>ō</cell>
                        <cell>∅</cell>
                        <cell>∅</cell>
                        <cell>∅</cell>
                        <cell>∅</cell>
                        <cell>∅</cell>
                     </row>
                     <row>
                        <cell>cIIIs</cell>
                        <cell>um</cell>
                        <cell>∅</cell>
                        <cell>∅</cell>
                        <cell>u</cell>
                        <cell>∅</cell>
                        <cell>u</cell>
                     </row>
                     <row>
                        <cell>cIVa</cell>
                        <cell>iō</cell>
                        <cell>ī</cell>
                        <cell>ī</cell>
                        <cell>ī</cell>
                        <cell>ī</cell>
                        <cell>ī</cell>
                     </row>
                     <row>
                        <cell>cIVb</cell>
                        <cell>iō</cell>
                        <cell>ī</cell>
                        <cell>ī</cell>
                        <cell>ī</cell>
                        <cell>ī</cell>
                        <cell>ī</cell>
                     </row>
                     <row>
                        <cell>cIVc</cell>
                        <cell>iō</cell>
                        <cell>ī</cell>
                        <cell>ī</cell>
                        <cell>ī</cell>
                        <cell>ī</cell>
                        <cell>ī</cell>
                     </row>
                     <row>
                        <cell>cIVd</cell>
                        <cell>iō</cell>
                        <cell>ī</cell>
                        <cell>ī</cell>
                        <cell>ī</cell>
                        <cell>ī</cell>
                        <cell>ī</cell>
                     </row>
                     <dhq:caption>Latin paradigm fragment (6 of the 92 columns)</dhq:caption>
                  </table>
               </p>
               <p>We have expanded the traditional four conjugations into 19 conjugations. They are mostly distinguished by forms not shown here, particularly by the perfect indicative active. For instance, a cIa verb such as <hi rend="italic">iuvāre</hi> 
                  <q>help</q> forms the perfect stem by lengthening the middle vowel, as in <hi rend="italic">iūvī</hi>
                  <q>I helped,</q> whereas a cIb verb such as <hi rend="italic">laudāre</hi> 
                  <q>praise</q> simply adds <hi rend="italic">āvī</hi> to the stem to form the perfect first-person singular. For completeness, we even include conjugation cIIIs for exceptional verbs such as <hi rend="italic">esse</hi> 
                  <q>be</q> and <hi rend="italic">posse</hi> 
                  <q>be able.</q>
               </p>
               <p>The line marked <code>CONJ</code> simply lists the morphosyntactic property sets 
         for our convenience. 
       </p>
               <p>The line marked <code>TEMPLATE</code> indicates parts of the chart that are constant 
         within each column. For instance, the second-person singular is marked 
         with <code>1S1Cs</code>. The <code>1S</code> part means <q>place the first stem here.</q> We choose to 
         make the first stem the present stem. Next, <code>1C</code> means <q>place the first entry in the column here.</q> Our Latin charts have only a single entry per column 
         in each row. When an entry is empty, we mark it with <hi rend="italic">∅</hi>. Finally, <code>s</code> means 
         <q>place the letter <hi rend="italic">s</hi> here.</q> All Latin verbs follow this strategy for building 
         the present indicative active second-person singular forms. </p>
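               <p>Template expansion can be illustrated with a short Python sketch. The function <code>expand</code> and its arguments are hypothetical names chosen for this illustration; the sketch produces forms before any sandhi rules are applied.</p>
               <eg>
```python
import re


def expand(template, stems, entries):
    """Expand a TEMPLATE cell such as '1S1Cs' into a word form.

    stems maps stem numbers to strings; entries lists the row's entries
    for this column (the Latin charts have exactly one per column).
    """
    parts = []
    for number, kind, literal in re.findall(r"(\d+)([SC])|(.)", template):
        if kind == "S":                      # 'nS': place the n-th stem here
            parts.append(stems[int(number)])
        elif kind == "C":                    # 'nC': place the n-th entry here
            entry = entries[int(number) - 1]
            parts.append("" if entry == "∅" else entry)
        else:                                # any other character is literal
            parts.append(literal)
    return "".join(parts)


print(expand("1S1Cs", {1: "laud"}, ["ā"]))    # laudās
print(expand("1S1Ctis", {1: "fer"}, ["∅"]))   # fertis
```
               </eg>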
               <p>Our chart requires that each verb have five stems, possibly identical: present, perfect, supine, present first person, and past subjunctive. For <hi rend="italic">iuvāre</hi> <q>help,</q> these stems are <hi rend="italic">iuv</hi>, <hi rend="italic">iū</hi>, <hi rend="italic">iū</hi>, <hi rend="italic">iuv</hi>, and <hi rend="italic">iuv</hi>, respectively. For <hi rend="italic">esse</hi> <q>be,</q> the stems are <hi rend="italic">es</hi>, <hi rend="italic">fu</hi>, <hi rend="italic">es</hi>, <hi rend="italic">s</hi>, and <hi rend="italic">es</hi>, respectively.</p>
               <p>It often happens that a conjugation regularly refers one stem to another. For instance, class cIa refers the fourth and fifth stems to the first; class cIIIs refers the third to the first. (For consistency, we always refer stems to earlier-numbered stems.) We represent stem referrals as an addendum to our chart, as in Table 3.</p>
               <p>
                  <table xml:id="table3">
                     <row>
                        <cell>REFER</cell>
                        <cell>cIa 4 , 5 -&gt; 1</cell>
                     </row>
                     <row>
                        <cell>REFER</cell>
                        <cell>cIb 2 - 5 -&gt; 1</cell>
                     </row>
                     <row>
                        <cell>REFER</cell>
                        <cell>cIc 2 - 5 -&gt; 1</cell>
                     </row>
                     <row>
                        <cell>REFER</cell>
                        <cell>cIIa 2 - 5 -&gt; 1</cell>
                     </row>
                     <row>
                        <cell>REFER</cell>
                        <cell>cIIb 2 - 5 -&gt; 1</cell>
                     </row>
                     <row>
                        <cell>REFER</cell>
                        <cell>cIIc 2 - 5 -&gt; 1</cell>
                     </row>
                     <row>
                        <cell>REFER</cell>
                        <cell>cIId 4 , 5 -&gt; 1 ; 3 -&gt; 2</cell>
                     </row>
                     <row>
                        <cell>REFER</cell>
                        <cell>cIIe 4 , 5 -&gt; 1</cell>
                     </row>
                     <row>
                        <cell>REFER</cell>
                        <cell>cIIIa 2 - 5 -&gt; 1</cell>
                     </row>
                     <row>
                        <cell>REFER</cell>
                        <cell>cIIIb 2 - 5 -&gt; 1</cell>
                     </row>
                     <row>
                        <cell>REFER</cell>
                        <cell>cIIIc 2 - 5 -&gt; 1</cell>
                     </row>
                     <row>
                        <cell>REFER</cell>
                        <cell>cIIId 2 - 5 -&gt; 1</cell>
                     </row>
                     <row>
                        <cell>REFER</cell>
                        <cell>cIIIe 3 - 5 -&gt; 1</cell>
                     </row>
                     <row>
                        <cell>REFER</cell>
                        <cell>cIIIf 4 - 5 -&gt; 1</cell>
                     </row>
                     <row>
                        <cell>REFER</cell>
                        <cell>cIIIs 3 -&gt; 1</cell>
                     </row>
                     <row>
                        <cell>REFER</cell>
                        <cell>cIVa 3 - 5 -&gt; 1</cell>
                     </row>
                     <row>
                        <cell>REFER</cell>
                        <cell>cIVb 2 - 5 -&gt; 1</cell>
                     </row>
                     <row>
                        <cell>REFER</cell>
                        <cell>cIVc 2 - 5 -&gt; 1</cell>
                     </row>
                     <row>
                        <cell>REFER</cell>
                        <cell>cIVd 2 - 5 -&gt; 1</cell>
                     </row>
                     <dhq:caption>Latin stem referrals</dhq:caption>
                  </table>
               </p>
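               <p>Stem referrals of this form are straightforward to resolve mechanically. The following Python sketch parses a referral specification and fills in a verb's missing stems; the function names are hypothetical, and the sketch assumes that an explicitly listed stem takes precedence over a referral.</p>
               <eg>
```python
def parse_referrals(spec):
    """Parse a REFER specification such as '4 , 5 -> 1 ; 3 -> 2'.

    Returns a dict mapping each referring stem number to its target.
    """
    referrals = {}
    for clause in spec.split(";"):
        sources, target = clause.split("->")
        target = int(target)
        if "-" in sources:                      # a range such as '2 - 5'
            low, high = (int(n) for n in sources.split("-"))
            numbers = range(low, high + 1)
        else:                                   # a list such as '4 , 5'
            numbers = [int(n) for n in sources.split(",")]
        for number in numbers:
            referrals[number] = target
    return referrals


def resolve(stems, referrals):
    """Fill in referred stems from a verb's explicitly listed stems."""
    full = dict(stems)
    for number in sorted(referrals):
        # Referrals always point to earlier-numbered stems, so processing
        # in ascending order finds each target already filled in.
        # Assumption: an explicitly listed stem is never overwritten.
        full.setdefault(number, full[referrals[number]])
    return full


cIa = parse_referrals("4 , 5 -> 1")
print(resolve({1: "iuv", 2: "iū", 3: "iū"}, cIa))
# {1: 'iuv', 2: 'iū', 3: 'iū', 4: 'iuv', 5: 'iuv'}
```
               </eg>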
               <p>We then represent the lexicon by indicating the conjugation and stems of each verb as another addendum to the chart, as in Table 4.
       </p>
               <p>
                  <table xml:id="table4">
                     <row>
                        <cell>LEXEME</cell>
                        <cell>help</cell>
                        <cell>cIa</cell>
                        <cell>1:iuv</cell>
                        <cell>2:iū</cell>
                        <cell>3:iū</cell>
                     </row>
                     <row>
                        <cell>LEXEME</cell>
                        <cell>bathe</cell>
                        <cell>cIa</cell>
                        <cell>1:lav</cell>
                        <cell>2:lāv</cell>
                        <cell>3:lau</cell>
                     </row>
                     <row>
                        <cell>LEXEME</cell>
                        <cell>stand</cell>
                        <cell>cIa</cell>
                        <cell>1:st</cell>
                        <cell>2:stet</cell>
                        <cell>3:sta</cell>
                     </row>
                     <row>
                        <cell>LEXEME</cell>
                        <cell>give</cell>
                        <cell>cIa</cell>
                        <cell>1:d</cell>
                        <cell>2:ded </cell>
                        <cell>3:da</cell>
                     </row>
                     <row>
                        <cell>LEXEME</cell>
                        <cell>praise</cell>
                        <cell>cIb</cell>
                        <cell>1:laud</cell>
                     </row>
                     <row>
                        <cell>LEXEME</cell>
                        <cell>rattle</cell>
                        <cell>cIc</cell>
                        <cell>1:crep </cell>
                     </row>
                     <row>
                        <cell>LEXEME</cell>
                        <cell>destroy</cell>
                        <cell>cIIa</cell>
                        <cell>1:dēl</cell>
                     </row>
                     <row>
                        <cell>LEXEME</cell>
                        <cell>mourn</cell>
                        <cell>cIIb</cell>
                        <cell>1:lūg</cell>
                     </row>
                     <row>
                        <cell>LEXEME</cell>
                        <cell>order</cell>
                        <cell>cIIb</cell>
                        <cell>1:iub</cell>
                     </row>
                     <row>
                        <cell>LEXEME</cell>
                        <cell>warn</cell>
                        <cell>cIIc</cell>
                        <cell>1:mon</cell>
                     </row>
                     <row>
                        <cell>LEXEME</cell>
                        <cell>see</cell>
                        <cell>cIId</cell>
                        <cell>1:vid</cell>
                        <cell>2:vīd</cell>
                     </row>
                     <row>
                        <cell>LEXEME</cell>
                        <cell>arouse</cell>
                        <cell>cIIe</cell>
                        <cell>1:ci</cell>
                        <cell>2:cī</cell>
                        <cell>3:ci</cell>
                     </row>
                     <row>
                        <cell>LEXEME</cell>
                        <cell>decide</cell>
                        <cell>cIIIa</cell>
                        <cell>1:dēcern</cell>
                     </row>
                     <row>
                        <cell>LEXEME</cell>
                        <cell>nourish</cell>
                        <cell>cIIIb</cell>
                        <cell>1:al</cell>
                     </row>
                     <row>
                        <cell>LEXEME</cell>
                        <cell>lead</cell>
                        <cell>cIIIc</cell>
                        <cell>1:dūc</cell>
                     </row>
                     <row>
                        <cell>LEXEME</cell>
                        <cell>attach</cell>
                        <cell>cIIId</cell>
                        <cell>1:fīg</cell>
                     </row>
                     <row>
                        <cell>LEXEME</cell>
                        <cell>take</cell>
                        <cell>cIIIe</cell>
                        <cell>1:cap</cell>
                        <cell>2:cēp</cell>
                     </row>
                     <row>
                        <cell>LEXEME</cell>
                        <cell>carry</cell>
                        <cell>cIIIf</cell>
                        <cell>1:fer</cell>
                        <cell>2:tul</cell>
                        <cell>3:lāt</cell>
                     </row>
                     <row>
                        <cell>LEXEME</cell>
                        <cell>be</cell>
                        <cell>cIIIs</cell>
                        <cell>1:es</cell>
                        <cell>2:fu</cell>
                        <cell>4:s</cell>
                        <cell>5:es</cell>
                     </row>
                     <row>
                        <cell>LEXEME</cell>
                        <cell>be able</cell>
                        <cell>cIIIs</cell>
                        <cell>1:potes</cell>
                        <cell>2:possu</cell>
                        <cell>4:poss</cell>
                        <cell>5:pos</cell>
                     </row>
                     <row>
                        <cell>LEXEME</cell>
                        <cell>go</cell>
                        <cell>cIIIs</cell>
                        <cell>1:i</cell>
                        <cell>2:i</cell>
                        <cell>4:e </cell>
                        <cell>5:ī</cell>
                     </row>
                     <row>
                        <cell>LEXEME</cell>
                        <cell>come</cell>
                        <cell>cIVa</cell>
                        <cell>1:ven </cell>
                        <cell>2:vēn</cell>
                     </row>
                     <row>
                        <cell>LEXEME</cell>
                        <cell>hear</cell>
                        <cell>cIVb</cell>
                        <cell>1:aud</cell>
                     </row>
                     <row>
                        <cell>LEXEME</cell>
                        <cell>leap</cell>
                        <cell>cIVc</cell>
                        <cell>1:sal</cell>
                     </row>
                     <row>
                        <cell>LEXEME</cell>
                        <cell>bind</cell>
                        <cell>cIVd</cell>
                        <cell>1:vinc</cell>
                     </row>
                     <dhq:caption>Latin lexicon</dhq:caption>
                  </table>
        
         
               </p>
               <p>The final section of the paradigm chart expresses rules of sandhi, as shown in Table 5.</p>
               <p>
                  <table xml:id="table5">
                     <row role="label">
                        <cell>CLASS</cell>
                        <cell>finalStop r m t nt</cell>
                     </row>
                     <row>
                        <cell>SANDHI</cell>
                        <cell>s s | =&gt; s % esst =&gt; est</cell>
                     </row>
                     <row>
                        <cell>SANDHI</cell>
                        <cell>s r =&gt; s s % esret =&gt; esset</cell>
                     </row>
                     <row>
                        <cell>SANDHI</cell>
                        <cell>s b =&gt; r % esbam =&gt; eram</cell>
                     </row>
                     <row>
                        <cell>SANDHI</cell>
                        <cell>ā\s*[:finalStop:] | =&gt; a $1</cell>
                     </row>
                     <row>
                        <cell>SANDHI</cell>
                        <cell>ī\s*[:finalStop:] | =&gt; i $1</cell>
                     </row>
                     <row>
                        <cell>SANDHI</cell>
                        <cell>ē\s*[:finalStop:] | =&gt; e $1</cell>
                     </row>
                     <row>
                        <cell>SANDHI</cell>
                        <cell>ō\s*[:finalStop:] | =&gt; o $1</cell>
                     </row>
                     <row>
                        <cell>SANDHI</cell>
                        <cell>ū\s*[:finalStop:] | =&gt; u $1</cell>
                     </row>
                     <row>
                        <cell>SANDHI</cell>
                        <cell>g s =&gt; x % lugsi =&gt; luxi</cell>
                     </row>
                     <row>
                        <cell>SANDHI</cell>
                        <cell>b s =&gt; s s % iubsī =&gt; iussī</cell>
                     </row>
                     <row>
                        <cell>SANDHI</cell>
                        <cell>b t =&gt; s s % iubtum =&gt; iussum</cell>
                     </row>
                     <row>
                        <cell>SANDHI</cell>
                        <cell>g t =&gt; c t % lugtum =&gt; luctum</cell>
                     </row>
                     <row>
                        <cell>SANDHI</cell>
                        <cell>c s =&gt; x % ducsi =&gt; duxi</cell>
                     </row>
                     <row>
                        <cell>SANDHI</cell>
                        <cell>d s =&gt; s % vīdsum =&gt; vīsum</cell>
                     </row>
                     <row>
                        <cell>SANDHI</cell>
                        <cell>rn t =&gt; rt % dēcerntum =&gt; dēcertum</cell>
                     </row>
                     <dhq:caption>Latin sandhi rules</dhq:caption>
                  </table>
         
         
               </p>
               <p>Some of these rules truly express sandhi, such as the rules for shortening long vowels before a final stop. Others are merely spelling rules, such as the one converting <hi rend="italic">cs</hi> to <hi rend="italic">x</hi>. We introduce a few others to make our paradigm more regular, such as the one changing <hi rend="italic">sb</hi> to <hi rend="italic">r</hi>. In many cases, a comment starting with % indicates the situation that led us to introduce the rule.</p>
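               <p>As a rough illustration of how such rules operate, the following Python sketch applies a handful of the rules of Table 5 to a word given as a list of morphemes. It is not the KATR implementation: the function name <code>apply_sandhi</code>, the space-separated representation of morpheme boundaries, the rule ordering, and the segmentations used as input are all assumptions made for the example.</p>
               <eg>
```python
import re

# Members of the finalStop class from Table 5 (longest alternative first).
FINAL_STOP = "nt|r|m|t"

# Long vowels and their shortened counterparts.
SHORT = {"ā": "a", "ē": "e", "ī": "i", "ō": "o", "ū": "u"}

# A few consonant rules from Table 5, written as (pattern, replacement).
# A space stands for a morpheme boundary, as in the table.
CONSONANT_RULES = [
    ("s b", "r"),    # es bam becomes eram
    ("g s", "x"),    # lug sī becomes luxī
    ("c s", "x"),    # dūc sī becomes dūxī
    ("b s", "s s"),  # iub sī becomes iussī
    ("b t", "s s"),  # iub tum becomes iussum
    ("g t", "c t"),  # lug tum becomes luctum
]


def apply_sandhi(morphemes):
    """Join morphemes, apply the rules in order, and drop the boundaries."""
    word = " ".join(morphemes)
    for pattern, replacement in CONSONANT_RULES:
        word = word.replace(pattern, replacement)
    # Shorten a long vowel that stands before a word-final stop.
    for long_vowel, short_vowel in SHORT.items():
        pattern = long_vowel + r"\s*(" + FINAL_STOP + r")$"
        word = re.sub(pattern, short_vowel + r"\1", word)
    return word.replace(" ", "")


print(apply_sandhi(["laud", "ā", "t"]))    # laudat
print(apply_sandhi(["laud", "ā", "mus"]))  # laudāmus
print(apply_sandhi(["es", "bam"]))         # eram
```
               </eg>
               <p>In this sketch the vowel is shortened in <hi rend="italic">laudat</hi> because <hi rend="italic">t</hi> is a word-final stop, whereas <hi rend="italic">laudāmus</hi> is left unchanged.</p>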
            </div>
            <div>
               <head>Deriving the Essence of the Paradigm</head>
               <p>Before analyzing the paradigm, we reduce it to its essence. The first reduction removes identical conjugations. This situation does not arise in Latin, but it does in French, where 71 of the 149 conjugations listed in the Larousse Dictionary become redundant once we express them in our paradigm format, and another 11 are identical except for their stem-referral pattern.</p>
               <p>The next step is to remove redundant columns (morphosyntactic property sets, or MPSs). For example, although the second and third person verb forms are different in the present indicative active, the columns associated with those forms are identical. The differences are all covered by the template and sandhi rules. Of the 92 MPSs in Latin, there are only 14 unique ones.</p>
               <p>The last step is to remove essentially identical columns. Two columns are essentially identical if there is a one-to-one and onto mapping between the <ref target="#glossary4">exponences</ref> (cell contents) found in those columns. For example, the future perfect indicative active 2nd singular MPS is essentially the same as another MPS. Our analysis reduces the chart from 92 MPSs to 9 important ones, which we call <hi rend="bold">distillations</hi>.</p>
               <p>We call the reduced paradigm chart the <hi rend="bold">essence</hi>. The essence does not contain actual exponences; we substitute unique symbols instead. Table 6 presents the essence of the Latin paradigm. An entry like <code>e4_2</code> means that this distillation is based on the fourth MPS (which happens to be <code>PrIAc1p</code>) and has the second exponence found in that MPS (which happens to be ē).</p>
               <p>
        
                  <table xml:id="table6">
                     <row>
                        <cell>cIa</cell>
                        <cell>e1_1</cell>
                        <cell>e2_1</cell>
                        <cell>e4_1</cell>
                        <cell>e13_1</cell>
                        <cell>e25_1</cell>
                        <cell>e37_1</cell>
                        <cell>e55_1</cell>
                        <cell>e58_1</cell>
                        <cell>e92_1</cell>
                     </row>
                     <row>
                        <cell>cIb</cell>
                        <cell>e1_1</cell>
                        <cell>e2_1</cell>
                        <cell>e4_1</cell>
                        <cell>e13_1</cell>
                        <cell>e25_1</cell>
                        <cell>e37_2</cell>
                        <cell>e55_1</cell>
                        <cell>e58_2</cell>
                        <cell>e92_2</cell>
                     </row>
                     <row>
                        <cell>cIc</cell>
                        <cell>e1_1</cell>
                        <cell>e2_1</cell>
                        <cell>e4_1</cell>
                        <cell>e13_1</cell>
                        <cell>e25_1</cell>
                        <cell>e37_3</cell>
                        <cell>e55_1</cell>
                        <cell>e58_2</cell>
                        <cell>e92_3</cell>
                     </row>
                     <row>
                        <cell>cIIa</cell>
                        <cell>e1_2</cell>
                        <cell>e2_2</cell>
                        <cell>e4_2</cell>
                        <cell>e13_2</cell>
                        <cell>e25_2</cell>
                        <cell>e37_4</cell>
                        <cell>e55_2</cell>
                        <cell>e58_3</cell>
                        <cell>e92_4</cell>
                     </row>
                     <row>
                        <cell>cIIb</cell>
                        <cell>e1_2</cell>
                        <cell>e2_2</cell>
                        <cell>e4_2</cell>
                        <cell>e13_2</cell>
                        <cell>e25_2</cell>
                        <cell>e37_5</cell>
                        <cell>e55_2</cell>
                        <cell>e58_3</cell>
                        <cell>e92_1</cell>
                     </row>
                     <row>
                        <cell>cIIc</cell>
                        <cell>e1_2</cell>
                        <cell>e2_2</cell>
                        <cell>e4_2</cell>
                        <cell>e13_2</cell>
                        <cell>e25_2</cell>
                        <cell>e37_3</cell>
                        <cell>e55_2</cell>
                        <cell>e58_3</cell>
                        <cell>e92_3</cell>
                     </row>
                     <row>
                        <cell>cIId</cell>
                        <cell>e1_2</cell>
                        <cell>e2_2</cell>
                        <cell>e4_2</cell>
                        <cell>e13_2</cell>
                        <cell>e25_2</cell>
                        <cell>e37_1</cell>
                        <cell>e55_2</cell>
                        <cell>e58_3</cell>
                        <cell>e92_5</cell>
                     </row>
                     <row>
                        <cell>cIIe</cell>
                        <cell>e1_2</cell>
                        <cell>e2_2</cell>
                        <cell>e4_2</cell>
                        <cell>e13_2</cell>
                        <cell>e25_2</cell>
                        <cell>e37_6</cell>
                        <cell>e55_2</cell>
                        <cell>e58_3</cell>
                        <cell>e92_1</cell>
                     </row>
                     <row>
                        <cell>cIIIa</cell>
                        <cell>e1_1</cell>
                        <cell>e2_3</cell>
                        <cell>e4_3</cell>
                        <cell>e13_2</cell>
                        <cell>e25_3</cell>
                        <cell>e37_1</cell>
                        <cell>e55_3</cell>
                        <cell>e58_4</cell>
                        <cell>e92_1</cell>
                     </row>
                     <row>
                        <cell>cIIIb</cell>
                        <cell>e1_1</cell>
                        <cell>e2_3</cell>
                        <cell>e4_3</cell>
                        <cell>e13_2</cell>
                        <cell>e25_3</cell>
                        <cell>e37_3</cell>
                        <cell>e55_3</cell>
                        <cell>e58_4</cell>
                        <cell>e92_1</cell>
                     </row>
                     <row>
                        <cell>cIIIc</cell>
                        <cell>e1_1</cell>
                        <cell>e2_3</cell>
                        <cell>e4_3</cell>
                        <cell>e13_2</cell>
                        <cell>e25_3</cell>
                        <cell>e37_5</cell>
                        <cell>e55_3</cell>
                        <cell>e58_4</cell>
                        <cell>e92_1</cell>
                     </row>
                     <row>
                        <cell>cIIId</cell>
                        <cell>e1_1</cell>
                        <cell>e2_3</cell>
                        <cell>e4_3</cell>
                        <cell>e13_2</cell>
                        <cell>e25_3</cell>
                        <cell>e37_5</cell>
                        <cell>e55_3</cell>
                        <cell>e58_4</cell>
                        <cell>e92_5</cell>
                     </row>
                     <row>
                        <cell>cIIIe</cell>
                        <cell>e1_3</cell>
                        <cell>e2_3</cell>
                        <cell>e4_3</cell>
                        <cell>e13_3</cell>
                        <cell>e25_4</cell>
                        <cell>e37_1</cell>
                        <cell>e55_4</cell>
                        <cell>e58_5</cell>
                        <cell>e92_1</cell>
                     </row>
                     <row>
                        <cell>cIIIf</cell>
                        <cell>e1_1</cell>
                        <cell>e2_4</cell>
                        <cell>e4_4</cell>
                        <cell>e13_2</cell>
                        <cell>e25_3</cell>
                        <cell>e37_1</cell>
                        <cell>e55_5</cell>
                        <cell>e58_1</cell>
                        <cell>e92_6</cell>
                     </row>
                     <row>
                        <cell>cIIIs</cell>
                        <cell>e1_4</cell>
                        <cell>e2_4</cell>
                        <cell>e4_5</cell>
                        <cell>e13_4</cell>
                        <cell>e25_5</cell>
                        <cell>e37_1</cell>
                        <cell>e55_6</cell>
                        <cell>e58_6</cell>
                        <cell>e92_0</cell>
                     </row>
                     <row>
                        <cell>cIVa</cell>
                        <cell>e1_3</cell>
                        <cell>e2_5</cell>
                        <cell>e4_6</cell>
                        <cell>e13_3</cell>
                        <cell>e25_4</cell>
                        <cell>e37_1</cell>
                        <cell>e55_4</cell>
                        <cell>e58_5</cell>
                        <cell>e92_1</cell>
                     </row>
                     <row>
                        <cell>cIVb</cell>
                        <cell>e1_3</cell>
                        <cell>e2_5</cell>
                        <cell>e4_6</cell>
                        <cell>e13_3</cell>
                        <cell>e25_4</cell>
                        <cell>e37_7</cell>
                        <cell>e55_4</cell>
                        <cell>e58_5</cell>
                        <cell>e92_7</cell>
                     </row>
                     <row>
                        <cell>cIVc</cell>
                        <cell>e1_3</cell>
                        <cell>e2_5</cell>
                        <cell>e4_6</cell>
                        <cell>e13_3</cell>
                        <cell>e25_4</cell>
                        <cell>e37_3</cell>
                        <cell>e55_4</cell>
                        <cell>e58_5</cell>
                        <cell>e92_1</cell>
                     </row>
                     <row>
                        <cell>cIVd</cell>
                        <cell>e1_3</cell>
                        <cell>e2_5</cell>
                        <cell>e4_6</cell>
                        <cell>e13_3</cell>
                        <cell>e25_4</cell>
                        <cell>e37_5</cell>
                        <cell>e55_4</cell>
                        <cell>e58_5</cell>
                        <cell>e92_1</cell>
                     </row>
                     <dhq:caption>Essence of the Latin paradigm </dhq:caption>
                  </table>
         
       
               </p>
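               <p>The column reductions described in this section can be sketched in a few lines of Python. The sketch below is only an illustration under stated assumptions: the function names <code>canonical</code> and <code>essence</code> are ours, the three-row chart is a made-up toy rather than the Latin paradigm, and we assume that exponences are numbered in order of first appearance. Two columns are essentially identical exactly when they induce the same partition of the conjugations, so numbering each column by first appearance detects both identical and essentially identical columns.</p>
               <eg>
```python
def canonical(column):
    """Number each exponence by order of first appearance, starting at 1."""
    first_seen = {}
    for value in column:
        first_seen.setdefault(value, len(first_seen) + 1)
    return tuple(first_seen[value] for value in column)


def essence(chart):
    """Reduce a chart (conjugation name mapped to one exponence per MPS).

    Columns related by a one-to-one mapping share a canonical form, so only
    the first MPS showing each canonical form is kept as a distillation.
    """
    names = list(chart)
    width = len(chart[names[0]])
    kept = {}  # canonical column mapped to the 1-based index of its first MPS
    for mps in range(width):
        column = [chart[name][mps] for name in names]
        kept.setdefault(canonical(column), mps + 1)
    return {
        name: [f"e{mps}_{pattern[row]}" for pattern, mps in kept.items()]
        for row, name in enumerate(names)
    }


# Toy chart with four MPSs; the exponences are illustrative only.
toy_chart = {
    "cIa": ["ō", "ā", "ā", "ī"],
    "cIb": ["ō", "ā", "ā", "āvī"],
    "cIIa": ["eō", "ē", "ē", "uī"],
}
print(essence(toy_chart))
```
               </eg>
               <p>In the toy chart, the second and third MPSs are essentially identical to the first, so only two distillations remain, labeled <code>e1</code> and <code>e4</code>.</p>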
            </div>
            <div>
               <head>Principal Parts</head>
               <p>Intuitively, a set of principal parts is a minimal subset of MPSs so that if one knows the exponences of those MPSs for a particular verb, one can deduce the verb's conjugation, from which one can deduce all the other MPSs of the verb. The practical utility of principal parts for language pedagogy has long been recognized. Generations of Latin students have learned that each verb in Latin has four principal parts (present indicative active first person singular, active infinitive, perfect indicative active first person singular, perfect passive participle (neuter nominative singular)). If one knows <hi rend="italic">laudō</hi>, <hi rend="italic">laudāre</hi>, <hi rend="italic">laudāvī</hi>, <hi rend="italic">laudātum</hi>, one knows enough to place the verb in conjugation 1, from which one can determine the exponences of all the other MPSs by reference to the paradigm chart.</p>
               <p>We can define several kinds of principal-part systems <ptr target="#finkel2007a"/>.
         
         <list>
                     <item>
                        <hi rend="bold">Static</hi>: a set of MPSs that applies for all verbs in the chart. Given the exponences of a verb for those MPSs, one can deduce its conjugation. Static principal parts are equivalent to the traditional understanding.</item>
                     <item>
                        <hi rend="bold">Adaptive</hi>: a tree of MPSs. Given the exponence of a verb for the MPS at the root of the tree, one can select an appropriate subtree and recurse. The leaves of the tree are conjugations.</item>
                     <item>
                        <hi rend="bold">Dynamic</hi>: a set of {MPS, exponence} pairs for each conjugation. If a verb agrees with a set of pairs, it belongs to the associated conjugation.</item>
                  </list>
               </p>
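               <p>To make the static definition concrete, the following Python sketch searches for the smallest sets of distillations whose exponences tell all conjugations apart. It is a brute-force illustration, not the program discussed in this article; the name <code>static_principal_parts</code> is hypothetical, and the data are a four-conjugation, four-distillation excerpt of Table 6, so the result applies to the excerpt only and not to the full Latin paradigm.</p>
               <eg>
```python
from itertools import combinations

# Excerpt of Table 6: exponence index for four distillations.
ESSENCE_EXCERPT = {
    "cIa":   {"e1": 1, "e2": 1, "e37": 1, "e92": 1},
    "cIb":   {"e1": 1, "e2": 1, "e37": 2, "e92": 2},
    "cIIa":  {"e1": 2, "e2": 2, "e37": 4, "e92": 4},
    "cIIIa": {"e1": 1, "e2": 3, "e37": 1, "e92": 1},
}


def distinguishes(essence, columns):
    """True if no two conjugations agree on all the given columns."""
    signatures = {tuple(row[c] for c in columns) for row in essence.values()}
    return len(signatures) == len(essence)


def static_principal_parts(essence):
    """Return every smallest column set that separates all conjugations."""
    columns = list(next(iter(essence.values())))
    for size in range(1, len(columns) + 1):
        found = [combo for combo in combinations(columns, size)
                 if distinguishes(essence, combo)]
        if found:
            return found
    return []


print(static_principal_parts(ESSENCE_EXCERPT))
# [('e2', 'e37'), ('e2', 'e92')]
```
               </eg>
               <p>For this excerpt, no single distillation suffices, and two alternative pairs do; the exhaustive search is practical here because the essence has only nine distillations.</p>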
               <p>We have built a program that takes the essence of a paradigm and computes its principal-part systems. For Latin, we find that there are, in fact, four static principal parts, with ten variations, as shown in Table 7. It is a bit surprising that the infinitive does not figure in any of these variations. However, the paradigm from which we calculate this result places the infinitive MPS almost at the end. It turns out that the exponences for the infinitive are identical to the exponences for the imperfect subjunctive active first person singular. For <hi rend="italic">laudō</hi>, for instance, both show <code>ā</code>, one using that exponence to form the infinitive <hi rend="italic">laudāre</hi>, and the other to form the subjunctive <hi rend="italic">laudārem</hi>. In turn, the MPS for the imperfect subjunctive active first person singular is essentially identical to the MPS for the present indicative active second person singular. Therefore, the first variation in Table 7 is equivalent to the traditional set of principal parts.</p>
               <p>
                  <table xml:id="table7">
                     <row>
                        <cell>1</cell>
                        <cell> Present 1 sg, Present 2 sg, Perfect 1 sg, Supine</cell>
                     </row>
                     <row>
                        <cell>2</cell>
                        <cell> Present 1 sg, Present 1 pl, Perfect 1 sg, Supine</cell>
                     </row>
                     <row>
                        <cell>3</cell>
                        <cell> Present 2 sg, Imperfect 1 sg, Perfect 1 sg, Supine</cell>
                     </row>
                     <row>
                        <cell>4</cell>
                        <cell> Present 2 sg, Future perfect 1 sg, Perfect 1 sg, Supine</cell>
                     </row>
                     <row>
                        <cell>5</cell>
                        <cell> Present 2 sg, Perfect 1 sg, Present subj 1 sg, Supine</cell>
                     </row>
                     <row>
                        <cell>6</cell>
                        <cell> Present 2 sg, Perfect 1 sg, Present subj 1 pl, Supine</cell>
                     </row>
                     <row>
                        <cell>7</cell>
                        <cell> Present 1 pl, Imperfect 1 sg, Perfect 1 sg, Supine</cell>
                     </row>
                     <row>
                        <cell>8</cell>
                        <cell> Present 1 pl, Future perfect 1 sg, Perfect 1 sg, Supine</cell>
                     </row>
                     <row>
                        <cell>9</cell>
                        <cell> Present 1 pl, Perfect 1 sg, Present subj 1 sg, Supine</cell>
                     </row>
                     <row>
                        <cell>10</cell>
                        <cell> Present 1 pl, Perfect 1 sg, Present subj 1 pl, Supine</cell>
                     </row>
                     <dhq:caption>Latin static principal parts (indicative active unless otherwise marked)</dhq:caption>
                  </table>
         
         
               </p>
               <p>Our program computes one of the possible trees representing adaptive principal parts, as shown in Figure 2, which also shows a representative verb from each conjugation. Some conjugations can be determined with only two adaptive principal parts. For instance, if the present second person singular form uses <code>ās</code> and the perfect first person singular form uses <code>ī</code>, then the conjugation is cIa, as in <hi rend="italic">iuvāre</hi> 
                  <q>help.</q> Others require three adaptive principal parts, such as <hi rend="italic">dūcere</hi> 
                  <q>lead.</q> However, no conjugation needs four principal parts. This analysis shows that the most important distinction, the one at the top of the tree, is based on the present indicative active second person singular form, which, as we noted above, is essentially the same as the active infinitive form.</p>
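               <p>One simple way to build such a tree, offered here only as a sketch and not as the procedure our program follows, is a greedy split: at each node, pick the distillation that shows the most distinct exponences among the remaining conjugations, then recurse on each group. The function name <code>adaptive_tree</code> and the greedy criterion are assumptions, and the data are a small excerpt of Table 6.</p>
               <eg>
```python
# Excerpt of Table 6: exponence index for four distillations.
ESSENCE_EXCERPT = {
    "cIa":   {"e1": 1, "e2": 1, "e37": 1, "e92": 1},
    "cIb":   {"e1": 1, "e2": 1, "e37": 2, "e92": 2},
    "cIIa":  {"e1": 2, "e2": 2, "e37": 4, "e92": 4},
    "cIIIa": {"e1": 1, "e2": 3, "e37": 1, "e92": 1},
}


def adaptive_tree(essence, names=None):
    """Return a conjugation name (leaf) or a pair (distillation, branches)."""
    if names is None:
        names = list(essence)
    if len(names) == 1:
        return names[0]
    columns = list(essence[names[0]])
    # Greedy choice: the column with the most distinct exponences
    # (the first such column wins a tie).
    best = max(columns, key=lambda c: len({essence[n][c] for n in names}))
    branches = {}
    for name in names:
        branches.setdefault(essence[name][best], []).append(name)
    if len(branches) == 1:
        raise ValueError("remaining conjugations are identical")
    return (best, {value: adaptive_tree(essence, group)
                   for value, group in branches.items()})


print(adaptive_tree(ESSENCE_EXCERPT))
# ('e2', {1: ('e37', {1: 'cIa', 2: 'cIb'}), 2: 'cIIa', 3: 'cIIIa'})
```
               </eg>
               <p>On this excerpt the root of the tree is <code>e2</code>, and cIa is reached after consulting two distillations, while cIIa and cIIIa need only one.</p>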
               <p>
                  <figure xml:id="figure2">
                     <graphic url="resources/images/finkel_and_stump_2007_fig2.jpg"/>
                     <figDesc>Latin adaptive principal parts</figDesc>
                     <dhq:caption>Latin adaptive principal parts (all indicative active)</dhq:caption>
                  </figure>
               </p>
               <p>We also compute dynamic principal parts. Table 8 displays one set for each conjugation; in general, each conjugation has several variations. This analysis shows that many conjugations can be completely determined by a single principal part. For example, if the perfect indicative active 1 sg form of a verb is <hi rend="italic">-āvī</hi>, the verb is in conjugation cIb. Only one conjugation, cIIIc, requires three exponences to determine all its forms.</p>
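               <p>The dynamic definition can likewise be illustrated with a short Python sketch. For one target conjugation, it looks for the smallest set of {MPS, exponence} pairs on which every other conjugation disagrees somewhere. The name <code>dynamic_principal_parts</code> is hypothetical, the search is a plain brute-force one, and the data are an excerpt of Table 6, so the sets found need not match those of the full paradigm.</p>
               <eg>
```python
from itertools import combinations

# Excerpt of Table 6: exponence index for four distillations.
ESSENCE_EXCERPT = {
    "cIa":   {"e1": 1, "e2": 1, "e37": 1, "e92": 1},
    "cIb":   {"e1": 1, "e2": 1, "e37": 2, "e92": 2},
    "cIIa":  {"e1": 2, "e2": 2, "e37": 4, "e92": 4},
    "cIIIa": {"e1": 1, "e2": 3, "e37": 1, "e92": 1},
}


def dynamic_principal_parts(essence, target):
    """Return one smallest list of (distillation, exponence) pairs that
    only the target conjugation satisfies, or None if there is none."""
    row = essence[target]
    columns = list(row)
    others = [r for name, r in essence.items() if name != target]
    for size in range(1, len(columns) + 1):
        for combo in combinations(columns, size):
            # Every other conjugation must differ on some chosen column.
            if all(any(other[c] != row[c] for c in combo) for other in others):
                return [(c, row[c]) for c in combo]
    return None


for name in ESSENCE_EXCERPT:
    print(name, dynamic_principal_parts(ESSENCE_EXCERPT, name))
```
               </eg>
               <p>Within the excerpt, cIb is identified by its perfect (<code>e37</code>) alone, whereas cIa needs two pairs.</p>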
               <p>
                  <table xml:id="table8">
                     <row>
                        <cell>cIa</cell>
                        <cell>Present 2 sg, Present subjunctive 1 pl</cell>
                     </row>
                     <row>
                        <cell>cIb</cell>
                        <cell>Perfect 1 sg</cell>
                     </row>
                     <row>
                        <cell>cIc</cell>
                        <cell>Present subjunctive 1 pl, Supine</cell>
                     </row>
                     <row>
                        <cell>cIIa</cell>
                        <cell>Perfect 1 sg</cell>
                     </row>
                     <row>
                        <cell>cIIb</cell>
                        <cell>Present 1 sg, Perfect 1 sg</cell>
                     </row>
                     <row>
                        <cell>cIIc</cell>
                        <cell>Present 1 sg, Supine</cell>
                     </row>
                     <row>
                        <cell>cIId</cell>
                        <cell>Present 1 sg, Perfect 1 sg</cell>
                     </row>
                     <row>
                        <cell>cIIe</cell>
                        <cell>Perfect 1 sg</cell>
                     </row>
                     <row>
                        <cell>cIIIa</cell>
                        <cell>Perfect 1 sg, Present subjunctive 1 sg</cell>
                     </row>
                     <row>
                        <cell>cIIIb</cell>
                        <cell>Perfect 1 sg, Present subjunctive 1 sg</cell>
                     </row>
                     <row>
                        <cell>cIIIc</cell>
                        <cell>Perfect 1 sg, Present subjunctive 1 sg, Supine</cell>
                     </row>
                     <row>
                        <cell>cIIId</cell>
                        <cell>Present subjunctive 1 sg, Supine</cell>
                     </row>
                     <row>
                        <cell>cIIIe</cell>
                        <cell>Present 1 sg, Present 2 sg</cell>
                     </row>
                     <row>
                        <cell>cIIIf</cell>
                        <cell>Present 1 pl</cell>
                     </row>
                     <row>
                        <cell>cIIIs</cell>
                        <cell>Present 1 sg</cell>
                     </row>
                     <row>
                        <cell>cIVa</cell>
                        <cell>Present 2 sg, Perfect 1 sg</cell>
                     </row>
                     <row>
                        <cell>cIVb</cell>
                        <cell>Perfect 1 sg</cell>
                     </row>
                     <row>
                        <cell>cIVc</cell>
                        <cell>Present 2 sg, Perfect 1 sg</cell>
                     </row>
                     <row>
                        <cell>cIVd</cell>
                        <cell>Present 2 sg, Perfect 1 sg</cell>
                     </row>
                     <dhq:caption>Latin dynamic principal parts (indicative active unless otherwise marked)</dhq:caption>
                  </table>
   
         
               </p>
            </div>
            <div>
               <head>Grouping</head>
               <p>Computing the adaptive principal parts produces one way to see the interrelation of the conjugations, as shown in <ref target="#figure2">Figure 2</ref>. We can compute the interrelation in a more direct way by using an algorithm based on Huffman encoding <ptr target="#huffman2007"/>. We define the distance between two conjugations as the number of distillations on which they disagree. We repeatedly find the two conjugations of minimum distance, delete them from the set of conjugations, combine them, and insert the result, a pseudo-conjugation, back into the set of conjugations. A pseudo-conjugation has a compound value for those distillations where the two conjugations disagree. We consider a compound value to be of distance 0 from any superset or subset.</p>
               <p>This algorithm leads to multiple possible analyses, because we may be able to choose among several minimum pairs. Each analysis is a taxonomic tree. Figure 3 shows one tree that our program produces. The entries like <code>e13_2</code> refer to the essence of Table 6. They show in what way each node in the tree is distinguished from its siblings. For instance, conjugations cIIb and cIIe are distinguished by <code>e37</code>, which is perfect indicative active first person singular.</p>
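               <p>The merging procedure described above can be sketched as follows. This is a minimal Python illustration under our own assumptions, not the program that produced Figure 3: each value is held as a set so that compound values arise by union, ties are broken by taking the first minimum pair, and the names <code>distance</code> and <code>group</code> are hypothetical. The rows are four conjugations taken from Table 6.</p>
               <eg>
```python
from itertools import combinations

# Four rows of Table 6 (distillations e1, e2, e4, e13, e25, e37, e55, e58, e92).
ROWS = {
    "cIa":   [1, 1, 1, 1, 1, 1, 1, 1, 1],
    "cIb":   [1, 1, 1, 1, 1, 2, 1, 2, 2],
    "cIIa":  [2, 2, 2, 2, 2, 4, 2, 3, 4],
    "cIIIa": [1, 3, 3, 2, 3, 1, 3, 4, 1],
}


def distance(a, b):
    """Count distillations whose value sets are not nested in either direction."""
    return sum(1 for x, y in zip(a, b)
               if not (x.issubset(y) or y.issubset(x)))


def group(rows):
    """Repeatedly merge the closest pair; return the tree as nested pairs."""
    nodes = {name: [frozenset([v]) for v in row] for name, row in rows.items()}
    while len(nodes) != 1:
        a, b = min(combinations(nodes, 2),
                   key=lambda pair: distance(nodes[pair[0]], nodes[pair[1]]))
        # The pseudo-conjugation carries the union of values in each column.
        merged = [x | y for x, y in zip(nodes.pop(a), nodes.pop(b))]
        nodes[(a, b)] = merged
    return next(iter(nodes))


print(group(ROWS))
# ('cIIa', ('cIIIa', ('cIa', 'cIb')))
```
               </eg>
               <p>In this small run, cIa and cIb merge first (distance 3), cIIIa then joins them (distance 6), and cIIa joins last.</p>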
               <p>
                  <figure xml:id="figure3">
                     <graphic url="resources/images/finkel_and_stump_2007_fig3.jpg"/>
                     <figDesc>Latin conjugation groups</figDesc>
                     <dhq:caption>Latin conjugation groups </dhq:caption>
                  </figure>
               </p>
               <p>Figure 3 verifies that the usual conjugation nomenclature is reasonable. All three cI conjugations are close to each other, although cIa is a slight outlier. Conjugations cIIIa–d are very close to each other, but cIIIf (<hi rend="italic">ferre</hi>) is significantly farther away. Conjugation cIIIs (<hi rend="italic">esse</hi>) is still farther. Strangely, conjugation cIIIe (<hi rend="italic">capere</hi>) is grouped with the cIV conjugations; apparently, <hi rend="italic">i</hi>-stem third conjugation verbs share more with the fourth conjugation than with the third.</p>
            </div>
            <div>
               <head>Generating a KATR Theory</head>
               <p>We can generate a KATR theory directly from the paradigm of <ref target="#table2">Table 2</ref>, the stem-referral rules of <ref target="#table3">Table 3</ref>, and the lexicon of <ref target="#table4">Table 4</ref>. We can take advantage of the grouping (<ref target="#figure3">Figure 3</ref>) to generate a fairly compact KATR theory. <ref target="#figure4">Figure 4</ref> shows a fragment of the computed KATR theory.</p>
               <p>The <code>Help</code> node introduces the three stems required by conjugation cIa. It refers all other requests to the <code>CONJcIa</code> node, which refers the remaining stems to the first stem. It also provisions the version of distillation <code>e58</code> to have variant 1. It refers other requests to a chain of grouping nodes, here shown as <code>Join12</code>, <code>Join15</code>, <code>Join17</code>, and <code>Join18</code>, each of which provisions some distillations and hands off other requests to the next node in the chain. Finally, node <code>Join18</code> refers all morphological queries to <code>EXPAND</code>. This node, of which we show only a small piece, combines the result of referring to nodes <code>MPS1</code> through <code>MPS92</code>, for each one looking up the appropriate exponence. <code>MPS1</code>, which generates the present indicative active first person singular form, invokes node <code>T02</code> with a parameter that depends on the value of the <code>e1</code> distillation. Finally, the <code>T02</code> node looks up the appropriate stem (in this case, stem 4) and combines it with the given ending.</p>
               <p>  
                  <figure xml:id="figure4">
                     <graphic url="resources/images/finkel_and_stump_2007_fig4.jpg"/>
                     <figDesc>Automatically generated KATR theory</figDesc>
                     <dhq:caption>Automatically generated KATR theory (fragment) </dhq:caption>
                  </figure>           
               </p>
               <p>Table 9 shows some of the output that KATR generates for this theory. We use such output to verify that we have correctly captured the original paradigm in our chart of <ref target="#table2">Table 2</ref>.</p>
               <p>
                  <table xml:id="table9">
                     <row>
                        <cell>iuvō</cell>
                        <cell>iuvās</cell>
                        <cell>iuvat</cell>
                        <cell>iuvāmus</cell>
                        <cell>iuvātis</cell>
                        <cell>iuvant</cell>
                     </row>
                     <row>
                        <cell>laudō</cell>
                        <cell>laudās</cell>
                        <cell>laudat</cell>
                        <cell>laudāmus</cell>
                        <cell>laudātis</cell>
                        <cell>laudant</cell>
                     </row>
                     <row>
                        <cell>moneō</cell>
                        <cell>monēs</cell>
                        <cell>monet</cell>
                        <cell>monēmus</cell>
                        <cell>monētis</cell>
                        <cell>monent</cell>
                     </row>
                     <row>
                        <cell>dūcō</cell>
                        <cell>dūcis</cell>
                        <cell>dūcit</cell>
                        <cell>dūcimus</cell>
                        <cell>dūcitis</cell>
                        <cell>dūcunt</cell>
                     </row>
                     <row>
                        <cell>sum</cell>
                        <cell>es</cell>
                        <cell>est</cell>
                        <cell>sumus</cell>
                        <cell>estis</cell>
                        <cell>sunt</cell>
                     </row>
                     <dhq:caption>Output of automatically generated KATR theory (fragment)</dhq:caption>
                  </table>                       
         
          
         
               </p>
            </div>
         </div>
         <div>
            <head>Conclusions</head>
            <p>This exercise demonstrates that both the realizational and the implicative approaches to defining language morphology lead to effective descriptions, as evidenced by the KATR theories they produce. An advantage of the realizational approach is that it allows us to apply language-specific knowledge and insight to create a default inheritance hierarchy that captures the morphological structure of the language, with slots pertaining to different morphosyntactic properties. However, as we have noted elsewhere <ptr target="#finkel2007b"/>, writing KATR specifications requires considerable effort. Early choices color the structure of the resulting theory, and the author must often discard attempts and rethink how to represent the target morphology. We have built KATR theories for verbs in Hebrew, Slovak, Polish, Spanish, and Lingala (a Bantu language of the Congo), as well as for parts of Hungarian, Sanskrit, and Pali.</p>
            <p>The implicative approach is much more automatic. One still needs to construct the initial paradigm manually, decide how many stems are needed (five for Latin, 15 for French), and abstract as much information as possible into the templates for each MPS. After that, automatic methods reduce the paradigm chart to its essence, group conjugations, and generate an effective KATR theory. These steps take at most a few seconds to complete (only one second on a 1.8GHz Intel Pentium running Linux). It is a simple, albeit tedious, matter to verify that all the forms the KATR theory generates are accurate. The KATR theory itself is fairly compact, taking advantage of grouping. However, it is about twice the size of the hand-built theory (measured in characters). More importantly, it does not clearly delineate the slots of the exponences. It is therefore somewhat less satisfying and less informative than the KATR theory we build manually following the realizational approach. We have applied the implicative approach to French (both as spelled and as pronounced), Hebrew, and Yiddish, as well as some lesser-known languages, such as Comaltepec Chinantec (Oto-Manguean, spoken in Oaxaca, Mexico), Fur (Nilo-Saharan, in Darfur), and Sora (Austro-Asiatic, India).</p>
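            <p>As a rough illustration of what reducing a paradigm chart can involve, the following Python sketch groups MPSs whose columns split the inflection classes in the same way, so that one representative per group would suffice. It is a minimal sketch under stated assumptions, not the procedure used to generate the KATR theory: the toy chart (present indicative active endings of the four regular conjugations) and the names CHART, MPS_LABELS, partition, and distinct_columns are all hypothetical.</p>

```python
from collections import defaultdict

# Toy paradigm chart (illustrative only, not the article's data):
# inflection class -> ending for each MPS, present indicative active.
MPS_LABELS = ["1sg", "2sg", "3sg", "1pl", "2pl", "3pl"]
CHART = {
    "conj1": ["ō", "ās", "at", "āmus", "ātis", "ant"],
    "conj2": ["eō", "ēs", "et", "ēmus", "ētis", "ent"],
    "conj3": ["ō", "is", "it", "imus", "itis", "unt"],
    "conj4": ["iō", "īs", "it", "īmus", "ītis", "iunt"],
}


def partition(chart, column):
    """Return how one MPS column splits the inflection classes into groups."""
    groups = defaultdict(list)
    for cls in sorted(chart):
        groups[chart[cls][column]].append(cls)
    return frozenset(tuple(members) for members in groups.values())


def distinct_columns(chart, labels):
    """Group MPS labels whose columns split the classes identically."""
    by_partition = defaultdict(list)
    for column, label in enumerate(labels):
        by_partition[partition(chart, column)].append(label)
    return sorted(by_partition.values())


COLUMN_GROUPS = distinct_columns(CHART, MPS_LABELS)
print(COLUMN_GROUPS)
```

            <p>On this toy chart, the second person singular and all three plural cells each distinguish all four conjugations, so any one of them can stand for the whole group, while the first and third person singular each merge a different pair of conjugations.</p>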
            <p>The implicative approach also has the advantage that it allows us to analyze the principal parts of the language based solely on the exponences in the paradigm chart. We have taken advantage of that ability elsewhere to characterize languages based on properties of their principal parts <ptr target="#finkel2007a"/>. For example, Latin, along with Sanskrit, but in strong contradistinction to Comaltepec Chinantec, has a very orthogonal set of principal parts: Each principal part tends to govern a disjoint set of MPSs.</p>
            <p>As Latin developed into the Romance languages, its scheme of conjugations and principal parts evolved. Initial investigation of French, following the same implicative approach as that shown here for Latin, shows that 9 static principal parts are needed to distinguish the 67 distinct conjugations. Ignoring spelling and considering only pronunciation, we have been able to reduce this total to 7 principal parts distinguishing 35 conjugations. This investigation continues.</p>
         </div>
         <div>
            <head>Acknowledgments</head>
            <p>We would like to thank Lei Shen and Suresh Thesayi, who were instrumental in implementing our Java™ version of KATR. Nancy Snoke assisted in implementing our Perl/Prolog version.</p>
            <p>This work was partially supported by the National Science Foundation under Grants IIS-0097278 and IIS-0325063 and by the University of Kentucky Center for Computational Science. Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the funding agencies.</p>
         </div>
         <div>
            <head>Glossary</head>
            <list>
               <item>
                  <label>Cell</label>: A position in a full table of word forms, where the row represents the inflection class (such as conjugation 1), and the column represents a set of morphosyntactic properties (such as first person singular present indicative active).</item>
               <item>
                  <label>Desinence</label>: An inflectional ending, usually added to a stem according to its syntactic context. For example, <hi rend="italic">amat</hi> 
                  <q>he/she loves</q> has the desinence <hi rend="italic">-t</hi>. </item>
               <item>
                  <label>Diacritic</label>: A marker of a particular morphophonological property. For example, the fact that a verb is in conjugation 4 is a diacritic.</item>
               <item>
                  <label>Exponence</label>: The contents of a cell for a given lexeme, such as <hi rend="italic">amat</hi>.</item>
               <item>
                  <label>Morphophoneme</label>: A phonological unit whose phonemic expression depends on its context. For example, in our Latin KATR theory, we use I as the morphophoneme in conjugation 3 (i-stem). It is expressed as the phoneme <hi rend="italic">i</hi> (as in <hi rend="italic">capiō</hi> 
                  <q>I grab</q>), is expressed as the phoneme <hi rend="italic">e</hi> (before <hi rend="italic">r</hi>, as in <hi rend="italic">capere</hi> 
                  <q>to grab</q>), or disappears entirely (such as before <hi rend="italic">ī</hi>, as in <hi rend="italic">cēpī</hi> 
                  <q>I have grabbed</q>).</item>
               <item>
                  <label>Node</label>: A set of rules in a KATR theory to which a query is directed. The particular rule to apply depends on the query. Some nodes refer to others, leading to a hierarchical node structure.</item>
               <item>
                  <label>Sandhi</label>: Rules of euphony or spelling. For example, <hi rend="italic">ēō</hi> is pronounced <hi rend="italic">eō</hi>, as in <hi rend="italic">videō</hi> 
                  <q>I see,</q> and <hi rend="italic">cs</hi> is spelled <hi rend="italic">x</hi>, as in <hi rend="italic">dūxī</hi> 
                  <q>I have led.</q>
               </item>
            </list>
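            <p>The two sandhi examples above can be read as ordered rewrite rules applied to an underlying form. The following Python sketch assumes that plain substring replacement is adequate for these two cases. The rule table, the function name apply_sandhi, and the underlying forms passed to it are hypothetical, and the code is not KATR notation.</p>

```python
# Hypothetical rule table built only from the two glossary examples:
# each pair is (underlying sequence, surface sequence).
SANDHI_RULES = [
    ("ēō", "eō"),  # a long e shortens before ō
    ("cs", "x"),   # the sequence cs is spelled x
]


def apply_sandhi(form):
    """Apply each rewrite rule in order to an underlying form."""
    for pattern, replacement in SANDHI_RULES:
        form = form.replace(pattern, replacement)
    return form


# Assumed underlying forms for "I see" and "I have led".
print(apply_sandhi("vidēō"))
print(apply_sandhi("dūcsī"))
```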
         </div>
      </body>
      <back>
         <listBibl>
            <bibl xml:id="anderson1992" label="Anderson 1992"> Anderson, S. R. 1992 <title rend="italic">A-Morphous Morphology</title>, Cambridge University Press.</bibl>
            <bibl xml:id="blevins2005" label="Blevins 2005"> Blevins, J. P. 2005 <title rend="quotes">Word-based Declensions in Estonian,</title> in G. Booij &amp; J. van Marle (eds.), <title rend="italic">Yearbook of Morphology 2005</title>, Springer, Dordrecht, pp. 1–25.</bibl>
            <bibl xml:id="blevins2006" label="Blevins 2006"> Blevins, J. P. 2006 <title rend="quotes">Word-based Morphology,</title> 
               <title rend="italic">Journal of Linguistics</title> 42: 531–573.</bibl>
            <bibl xml:id="corbett1993" label="Corbett 1993"> Corbett, G. G. &amp; Fraser, N. M. 1993
       <title rend="quotes">Network Morphology: A DATR Account of Russian Nominal Inflection,</title> 
               <title rend="italic">Journal of Linguistics</title> 29: 113–142.</bibl>
            <bibl xml:id="evans1989" label="Evans 1989"> Evans, R. &amp; Gazdar, G. 1989 <title rend="quotes">Inference in DATR,</title> <title rend="italic">Proceedings of the Fourth Conference of the European Chapter of the Association for Computational Linguistics</title>, Manchester, pp. 66–71.</bibl>
            <bibl xml:id="finkel2002" label="Finkel 2002">Finkel, R., Shen, L., Stump, G. &amp;
             Thesayi, S. 2002 <title rend="quotes">KATR: A Set-based Extension of DATR,</title> Technical Report 346-02, University of Kentucky Department of Computer Science, Lexington, KY. <ref target="ftp://ftp.cs.uky.edu/cs/techreports/346-02.pdf">ftp://ftp.cs.uky.edu/cs/techreports/346-02.pdf</ref>.</bibl>
            <bibl xml:id="finkel2007a" label="Finkel 2007a"> Finkel, R. A. &amp; Stump, G. T.
     2007a <title rend="quotes">Principal Parts and Morphological Typology,</title> 
               <title rend="italic">Morphology</title> 17: 39–75.</bibl>
            <bibl xml:id="finkel2007b" label="Finkel 2007b"> Finkel, R. &amp; Stump, G. 2007b
               <title rend="quotes">A Default Inheritance Hierarchy for Computing Hebrew Verb Morphology,</title> 
               <title rend="italic">Literary and Linguistic Computing</title> 22(2): 117–136. dx.doi.org/10.1093/llc/fqm004.</bibl>
            <bibl xml:id="finkel2008" label="Finkel 2008">Finkel, R. &amp; Stump, G. 2008
                 <title rend="quotes">Principal Parts and Degrees of Paradigmatic Transparency,</title> in J. P. Blevins &amp; J. Blevins (eds.), <title rend="italic">Analogy in Grammar: Form and Acquisition</title>, Oxford University Press, Oxford.</bibl>
            <bibl xml:id="huffman2007" label="Huffman 2007"> Huffman coding (2007). <ref target="http://en.wikipedia.org/wiki/Huffman_coding">http://en.wikipedia.org/wiki/Huffman_coding</ref>
            </bibl>
            <bibl xml:id="matthews1972" label="Matthews 1972"> Matthews, P. H. 1972 <title rend="italic">Inflectional Morphology</title>, Cambridge University Press.</bibl>
            <bibl xml:id="stump2001" label="Stump 2001"> Stump, G. T. 2001 <title rend="italic">Inflectional Morphology</title>, Cambridge University Press, Cambridge, England.</bibl>
            <bibl xml:id="zwicky1985" label="Zwicky 1985"> Zwicky, A. M. 1985 <title rend="quotes">How to Describe Inflection,</title> 
               <title rend="italic">Proceedings of the 11th Annual Meeting of the Berkeley Linguistics Society</title>, pp. 372–386.</bibl>
         </listBibl>
      </back>
   </text>
</TEI>
