Linking Example-Based and Rule-Based Machine Translation

Michael Carl, Catherine Pease and Oliver Streiter
Institut für Angewandte Informationsforschung, Martin-Luther Str. 14
66111 Saarbrücken, Germany
(carl,cath,oliver)@iai.uni-sb.de

January 20, 1999

Abstract

Two machine translation paradigms are combined in order to improve the quality of translation and to make the MT system easier to tune to the needs of different users. We investigate the consequences of this linkage and determine what kinds of linguistic entities (syntactic constructions, lexicographic types, collocations etc.) can be dynamically transferred between the different components without introducing new translation errors.

1 Introduction

All of the individual MT approaches which have been employed so far have their strengths and weaknesses, and it is unlikely that a new, "ideal" approach will be proposed and implemented on a sizeable scale in the foreseeable future. Substantial progress in the field can therefore be achieved only by combining the strengths of different approaches. A discussion of the expected benefits of such a linkage in terms of recall, coverage, adaptability, translation quality, reliability and user-orientation is beyond the scope of this paper and can be found in (Carl et al., 1998).

Starting from this assumption, we investigated how different MT paradigms can be integrated in one framework. The paradigms we consider here are rule-based MT (RBMT) and example-based MT (EBMT). Linkage experiments have been carried out using the RBMT system CAT2 and the EBMT system EDGAR. More information about these systems than can be provided here can be found in (Streiter, 1996), (Streiter, 1998) and (Carl, 1998).
2 CAT2

CAT2 is a unification-based MT formalism used at IAI for the development of various MT-related applications. In order to explain the operation of the system, we follow the translation of an illustrative (although fictive) German sentence into English step by step:

Der Sprachwissenschaftler hat bei der Arbeit große Angst vor ununterscheidbaren Morphemen.

Syntactic and semantic analysis is performed following HPSG-like schemes of composition. Besides the syntactic functions, semantic roles are identified (a=agent, t=theme, g=goal). In our example, Angst (fear) is recognized as the predicative noun of the support verb haben (have), der Sprachwissenschaftler (the linguist) as the person who experiences the fear and ununterscheidbare Morpheme (undistinguishable morphemes) as the content of the fear.

[Figure: analysis tree of the German sentence, with edge labels f, g, t, mod and pred; leaf nodes: d, sprache, haben, rel, d, arbeiten, gross, angst, rel, unterscheiden, morphem]

Before transfer, function words (determiners, case-marking prepositions, degree words, auxiliaries, magnifiers etc.) are structurally removed. Pronouns are introduced as internal arguments of modifier relations. Support verbs (e.g. haben) and copula verbs are replaced by the elements bearing the argument structure (e.g. Angst).

[Figure: structure before transfer, with edge labels a, t, g and mod; leaf nodes: sprache, angst, rel, pro, arbeiten, pro, unterscheiden, pro, morphem]

The lexical transfer successively triggers different translational options until an overall integration of these options becomes possible. After transfer, support verbs and copulative verbs (here be) are inserted. Word order is rearranged and function words are generated as necessary.
Lexical transfer rules triggered for the example sentence:

  {lex=sprachwissenschaftler} => {lex=language, lex=linguist}
  {lex=angst}                 => {lex=anxious, lex=anxious}
  {lex=bei}                   => {lex=rel, lex=during}
  {lex=angst}                 => {lex=pro, lex=?}
  {lex=arbeit}                => {lex=operate, lex=operation}
  {lex=someone}               => {lex=pro, lex=someone}
  {lex=ununterscheidbar}      => {lex=differ, lex=different/differ/differently}
                                 {lex=distinguish, lex=distinguishable}
  {lex=someone}               => {lex=pro, lex=someone}
  {lex=morphem}               => {lex=morpheme, lex=morpheme}

[Figure: structure after transfer, with edge labels a, t, g and mod; leaf nodes: language, anxious, rel, pro, operate, pro, distinguish, pro, morpheme]

[Figure: generated English structure, with edge labels f, g, t, mod and pred; leaf nodes: d, langua., be, deg, anxi., rel, morph., pro, be, deg, distin., rel, d, operation]

Resulting translation: "The linguist is very anxious about morphemes that are not distinguishable during the operation."
3 EDGAR

EDGAR is an experimental EBMT system which decomposes and "generalises" a morphologically analyzed input sentence by matching it against a case base (henceforth CB). In this CB the morphological analysis of translation examples is stored together with translation templates which are automatically generalized from the translation examples. The generation of translation templates is seen as a sort of example-driven grammar induction. In the absence of a complete match of a sentence and a reference translation, a simple subject-verb-object sentence, for example, can be translated using a subject-verb-object template via linguistic generalisation. The generalized input sentence is then specified (i.e. the correct linguistic information is gathered) and "refined" in the target language.

Each example is divided into two feature sets: fixed and variable features. The fixed features include the lexical description (i.e. the lemmatised form of the word) lu, gender g and type (i.e. syntactic category) c of the input. The variable features describe the morpho-syntactic features; these include type c, tense tns, verbal type vtyp, number nb and prepositional form pform, as shown in the table below.

  lexical features:        lu, g, c
  morphological features:  c, tns, vtyp, nb, pform

Generalisation consists in replacing sub-sequences of an example with reductions (i.e. variables constrained by a set of features). Reductions in translation templates disregard the fixed features while keeping track of the variable ones. For instance, from the French/English translation examples (1) and (2) below a generalised case (2g) can be inferred.

       French expression              English expression
  1    (ski)_NOUN                  →  (ski)_NOUN
  2    (station de ski)_NOUN       →  (ski station)_NOUN
  2g   (station de X_NOUN)_NOUN    →  (X_NOUN station)_NOUN

The translation template (2g) will match a number of chunks such as station de sport, station de taxi etc., where the fillers of the slot X_NOUN are constrained by a set of features to be shared with ski, e.g. the feature NOUN.
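The way template (2g) licenses new chunks can be illustrated with a minimal sketch. It is not EDGAR's implementation: it works on surface strings rather than on morphological analyses, the function name `translate_with_2g` and the dictionary `NOUN_LEXICON` are hypothetical, and the lexicon entries for sport and taxi are assumptions added for illustration (only ski comes from example 1).

```python
import re

# Hypothetical noun lexicon. Only "ski" -> "ski" is taken from example (1);
# the entries for "sport" and "taxi" are assumptions for illustration.
NOUN_LEXICON = {"ski": "ski", "sport": "sport", "taxi": "taxi"}

# Template (2g): (station de X_NOUN)_NOUN -> (X_NOUN station)_NOUN
SOURCE_PATTERN = re.compile(r"^station de (\w+)$")
TARGET_TEMPLATE = "{x} station"


def translate_with_2g(chunk):
    """Return the English chunk if template (2g) matches, else None."""
    match = SOURCE_PATTERN.match(chunk)
    if match is None:
        return None
    filler = match.group(1)
    # Slot constraint: the filler must share the feature NOUN,
    # approximated here by membership in the noun lexicon.
    if filler not in NOUN_LEXICON:
        return None
    return TARGET_TEMPLATE.format(x=NOUN_LEXICON[filler])
```

Under these assumptions, `translate_with_2g("station de taxi")` yields the chunk taxi station, while a chunk whose filler is not a known noun is left for other cases or for the rule-based component.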
In the absence of fully matching cases, sequences such as station de sport and station de taxi would be translated into sport station, taxi station etc. More than one reduction within a translation template is possible if different sequences can be reduced. The generalisation 3a, for instance, has the reductions X_{DP NOM} and Y_{DP ACC}.

  3a.  (X_{DP NOM} eat Y_{DP ACC})_S  →  (X_{DP NOM} essen Y_{DP ACC})_S
  3b.  (X_{DP NOM} eat Y_{DP ACC})_S  →  (Y_{DP ACC} essen X_{DP NOM})_S

During translation, the morphological analysis of a new sentence is matched against the examples in the CB. Those sequences of the sentence which match one or more translation examples are reduced to one node. The newly created reduction keeps track of the number(s) i of the matching example(s) and of their properties, such as NOUN or ACC. Whereas the properties of the matching examples are visible in the generalisation, the number(s) i only serve in the refinement step to determine the internal structure of the target language chunk to be generated. The input sentence, thus generalised, is then matched (again) against the CB until no more reductions can be performed or the entire sentence is reduced to one node only.

Depending on the type of the example, different features are percolated into the external constraints of the reduced nodes, as shown in the following table.

  phrase type            tag type   external constraints
  adverbial phrase       ADV        -
  adjective phrase       A          nb, case
  noun phrase            NOUN       nb, case
  determiner phrase      DP         nb, case, spec
  prepositional phrase   PP         nb, case, spec, pform
  sentence               S          tns, vform

4 Linkage of CAT2 and EDGAR

In the CAT2-EDGAR experiment we linked CAT2 dynamically to EDGAR in such a way that EDGAR comes into play first after the morphological analysis and before the syntactic analysis performed by CAT2, and secondly, during generation, after the syntactic generation and before the morphological generation. In such an architecture, EDGAR serves CAT2 as an intelligent multi-word and phrase translation front end, whereas CAT2 performs the translation of linguistic structures which are beyond the capabilities of EDGAR.
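To make the front-end role of EDGAR more concrete, the repeated matching against the CB that produces the reduced nodes handed over to CAT2 can be sketched roughly as follows. This is only an illustration under strong assumptions: the three toy cases are modelled on CB1, CB4G and CB9 of Section 5, the names `Node`, `CASE_BASE`, `reduce_once` and `generalise` are hypothetical, matching is done on word forms, and the morphological features (case, number etc.) that EDGAR percolates are left out.

```python
from typing import NamedTuple


class Node(NamedTuple):
    tag: str      # phrase type of the reduced chunk, e.g. "NOUN" or "DP"
    case_id: str  # number(s) of the matching case(s), e.g. "4g=1"


# Toy case base (an assumption): (case number, source pattern, tag).
# "<NOUN>" stands for a slot that accepts an already reduced NOUN node.
CASE_BASE = [
    ("1", ["man"], "NOUN"),
    ("4g", ["the", "<NOUN>"], "DP"),
    ("9", ["on", "the", "table"], "PP"),
]


def matches(item, pattern_item):
    """A slot matches a reduced node of that tag; a word matches a word."""
    if pattern_item.startswith("<"):
        return isinstance(item, Node) and item.tag == pattern_item[1:-1]
    return isinstance(item, str) and item.lower() == pattern_item


def reduce_once(items):
    """Reduce the first matching sequence to one node, or return None."""
    for case_id, pattern, tag in CASE_BASE:
        n = len(pattern)
        for start in range(len(items) - n + 1):
            window = items[start:start + n]
            if all(matches(i, p) for i, p in zip(window, pattern)):
                fillers = [i.case_id for i in window if isinstance(i, Node)]
                label = "=".join([case_id] + fillers)
                return items[:start] + [Node(tag, label)] + items[start + n:]
    return None


def generalise(words):
    """Match against the case base until no more reductions are possible."""
    items = list(words)
    while True:
        reduced = reduce_once(items)
        if reduced is None:
            return items
        items = reduced
```

With this toy case base, `generalise("The man puts the book on the table".split())` leaves the words puts, the and book unreduced between a DP node labelled 4g=1 and a PP node labelled 9, which mirrors the shape of the generalisation given for Example 4 in Section 5.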
EDGAR uses only morphological and syntactic information of translation examples, which can be acquired automatically; as a consequence, it is easy to tune and to extend to new domains. CAT2, on the other hand, focuses on the syntactic and semantic principles which underlie the languages involved. When EDGAR fails to find an appropriate translation example, CAT2 comes into play and covers the text with a default translation. If a user prefers a translation different from the default translation, a suitable translation example can be added to the CB. Subsequent translations will then use this translation example instead of the default translation.

Depending on the contents of the CB, the interaction of EDGAR and CAT2 may take one of the following shapes:

- EDGAR segments the entire input text into autonomous chunks. In this case, the (reduced) chunks need not pass through CAT2 at all.

- EDGAR cannot find any segment to be reduced in the input text. In this case, the source text is transmitted to CAT2 to be processed as usual.

- EDGAR can partially reduce the input text by matching it against the CB. In this case, both the reduced chunks and the remaining unrecognised text elements are transmitted to CAT2. In generation, EDGAR re-generates only those target language parts that it has reduced previously.

In this architecture, an MT system adapts itself dynamically to the data which the user enters into the CB and to the texts encountered: while a complete match of cases in a sentence converts the system into a translation memory (TM), in the next sentence the system may return to a purely rule-based treatment, or combine the two approaches.

As for the chunks obtained from EDGAR, they remain "lexically sealed" for CAT2. This means that CAT2 considers the translation units (TUs) that come from EDGAR as single nodes, disregarding their internal lexical structure. CAT2 may or may not assign some grammatical features to the target side of the chunk in order to guide adaptation. The lexical content of these units remains unchanged and thus does not affect their reliability.
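The three shapes of interaction described in this section amount to a simple case distinction over the output of the chunking step. The following sketch is an assumption about how such a dispatch could look, not a description of the actual interface between the two systems; the function name `interaction_shape` and the representation of the input as (text, is_reduced) pairs are hypothetical.

```python
def interaction_shape(items):
    """Classify a chunked sentence given as (text, is_reduced) pairs."""
    reduced = [is_reduced for _, is_reduced in items]
    if reduced and all(reduced):
        # Everything was covered by the case base: CAT2 can be bypassed.
        return "EDGAR only"
    if not any(reduced):
        # Nothing was reduced: CAT2 processes the source text as usual.
        return "CAT2 only"
    # Reduced chunks and unrecognised elements both go to CAT2.
    return "EDGAR and CAT2"
```

For instance, a sentence chunked as `[("The man", True), ("puts", False)]` falls under the third shape, whereas a sentence reduced to a single sealed node falls under the first.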
With sealed chunks, CAT2 is in turn bound to operate faster and more robustly, if for no other reason than that it has fewer units to handle.

5 Example Translations

In this section we show how the hybrid EDGAR-CAT2 system translates some phrase types, given the following CB:
  CB1    (man)_NOUN                ↔  (Mann)_NOUN
  CB2    (newspaper)_NOUN          ↔  (Zeitung)_NOUN
  CB3    (a man)_DP                ↔  (Ein Mann)_DP
  CB3G   (a X_NOUN)_DP             ↔  (Ein X_NOUN)_DP
  CB4    (The newspaper)_DP        ↔  (Die Zeitung)_DP
  CB4G   (The X_NOUN)_DP           ↔  (Der X_NOUN)_DP
  CB5    (The old man)_DP          ↔  (Der alte Mann)_DP
  CB5G   (The old X_NOUN)_DP       ↔  (Der alte X_NOUN)_DP
  CB6    (for the man)_PP          ↔  (für den Mann)_PP
  CB6G   (for X_DP)_PP             ↔  (für X_DP)_PP
  CB7    (The old women)_DP        ↔  (die alten Frauen)_DP
  CB8    (secretary of state)_NOUN ↔  (Staatsminister)_NOUN
  CB9    (on the table)_PP         ↔  (auf dem Tisch)_PP
  CB10   (day after day)_ADV       ↔  (Tag für Tag)_ADV
  CB11   (The man reads the newspaper every day.)_S  ↔  (Der Mann liest jeden Tag die Zeitung.)_S

As for notational conventions, we underline units which EDGAR recognises and reduces to one node during the chunking step (C). The resulting generalisation (G) is translated by CAT2 into German. The reduced nodes in CAT2's translation output (T) are then refined by EDGAR (R).

Example 1

  C: The old man is selling the secretary of state's car.
  G: X^{6}_{DP NOM ACC DEF SG} is selling Y^{5g=8}_{DP GEN DEF SG} car.
  T: X^{6}_{DP NOM DEF SG} verkauft den Pkw Y^{5g=8}_{DP GEN DEF SG}.
  R: Der alte Mann verkauft den Pkw des Staatsministers.

The chunk The old man matches CB5 (the old man)_DP and is reduced into the node X^{6}_{DP NOM ACC DEF SG}. The chunk the secretary of state's is recognised in two successive steps of generalisation. On a first level, secretary of state's matches CB8 (secretary of state)_NOUN and is reduced into the node Y^{8}_{NOUN GEN SG}. Notice that state and state's differ in their morphological analysis only by the case feature: while state's can only be genitive, state cannot. As outlined above, the case feature, here having the value GEN, is taken
from the matching chunk and percolated into the reduction. On a second level of generalisation, the chunk the X^{8}_{NOUN GEN SG} matches CB4G (the X_NOUN)_DP. Given the CB, no more reductions can be computed. The resulting generalisation is thus passed to CAT2 for translation. Because only the first constituent can be the subject of the sentence, CAT2 is able to disambiguate the case features. CAT2 parses the Y node as a pre-nominal modifier, which in German can be realised as a post-nominal genitive. CAT2 recognises the progressive is selling and translates it into the German present tense verkauft. The resulting structure is then, again, passed to EDGAR for specification and refinement of the reduced nodes.

Example 2

  C: The old men sell cars.
  G: X^{6;7}_{DP NOM ACC DEF PLU} sell cars.
  T: X^{6;7}_{DP NOM DEF PLU} verkaufen Autos.
  R: Die alten Männer verkaufen Autos.

In contrast to Example 1, the chunk The old men matches CB5 (The old man)_DP only lexically, due to the different number of men and man. The chunk is, however, identical with CB7 (The old women)_DP with respect to morphological features, and so both CB6 and CB7 are used as reference translations. CAT2 translates the remaining items of X^{6;7}_{DP NOM ACC DEF PLU} sell cars. and dictates the case NOM for the reduced chunk. EDGAR then merges the lexical and the morphological features of the target language reference translations and refines the merged chunk according to the dictated case.

Example 3

  C: The old woman is waiting for the old man.
  G: X^{7;6}_{DP NOM ACC DEF SG} is waiting Y^{6g=6}_{PP NOM ACC DEF SG}
  T: X^{7;6}_{DP NOM DEF SG} wartet Y^{6g=6}_{PP ACC DEF SG auf}
  R: Die alte Frau wartet auf den alten Mann.
The chunk The old woman is found in a similar way to The old men in Example 2. Morphological features are matched onto CB5 (The old man)_DP and lexical features are matched onto CB7 (The old women)_DP. For the old man is chunked in two generalisation steps. First, the old man is matched onto CB5, which yields the reduction X^{5}_{DP NOM ACC DEF SG}. In a second step of generalisation, the chunk matches CB6G (for X_DP)_PP. A sentence reduced to four nodes is passed to CAT2. CAT2 translates the progressive is waiting into the simple German present tense, as in Example 1. Further, the node Y^{6g=6}_{PP NOM ACC DEF SG}, which represents the prepositional phrase for the old man, is assigned the semantic role THEME as an argument of wait. The German translation requires for the theme of this verb the preposition auf (i.e. warten auf) and the accusative case. When refining the target language side of CB6G (für X_DP)_PP, EDGAR replaces the preposition für with the preposition auf, based on the information provided by CAT2. In this way the correct preposition can be assigned to an argument PP.

Example 4

  C: The man puts the book on the table.
  G: X^{4g=1}_{DP NOM ACC DEF SG} puts the book Y^{9}_{PP NOM ACC DEF SG}.
  T: X^{4g=1}_{DP NOM ACC DEF SG} stellt das Buch Y^{9}_{PP ACC DEF SG}.
  R: Der Mann stellt das Buch auf den Tisch.

The chunk The man is reduced to one node in two generalisation steps: first man is reduced based on CB1, and then CB4G matches the entire chunk. On the table has a complete match in CB9. The reduced sentence is then translated in CAT2, where the node Y receives the semantic role LOCATION and the case ACC. In contrast to the role THEME, no specific preposition is dictated by CAT2. The default preposition is thus taken from the CB while specifying the target example.

Example 5
  C: Day after day the man buys a newspaper.
  G: X^{10}_{ADV} Y^{4g=1}_{DP NOM ACC DEF SG} buys Z^{3g=2}_{DP NOM ACC INDEF SG}.
  T: X^{10}_{ADV} Y^{4g=1}_{DP NOM DEF SG} buys Z^{3g=2}_{DP ACC INDEF SG}.
  R: Tag für Tag kauft der Mann eine Zeitung.

Day after day is recognised as an adverbial phrase ADV, and the man and a newspaper are recognised as determiner phrases DP, as they respectively match CB10, CB4G/1 and CB3G/2. CAT2 translates the unreduced items in the sentence and generates an appropriate word order of the elements in the target language.

Example 6

  C: The man reads the newspaper every day.
  G: X^{11}_{S}
  T: X^{11}_{S}
  R: Der Mann liest jeden Tag die Zeitung.

There is a complete match for the whole sentence in CB11, and there is no unreduced part left for CAT2 to translate. The entire translation is thus taken from the CB with no adaptation required.

6 Conclusion

The translation examples in the previous section show a possible linkage of an RBMT and an EBMT system: a prerequisite for the fruitful integration of EDGAR and CAT2 is an appropriate adaptation ability of both systems. As shown in the example translations, in the refinement step EDGAR performs adaptation on agreement features in determiner and prepositional phrases and replaces prepositions in a prepositional phrase according to the values dictated by CAT2. Such a minimal adaptation capacity is required if sub-sentential translation examples (such as determiner or noun phrases) are used at one time in the subject position and at another time in the object position or, as in Examples 3 and 4, are at one time an argument of the verb and at another time a modifier. CAT2, on the other hand, needs to adapt to differently chunked input sentences and accordingly dictate the appropriate adaptation features to EDGAR. In one extreme, EDGAR recognises none of the words in the input sentence and the entire translation process is carried out by CAT2; in the other, the whole sentence can be recognised as one chunk, as in Example 6.
In this case the reduction is just passed through CAT2, which needs to add nothing to it. This architecture ensures an optimal interaction of the two components, where the full reliability of the EBMT component is enhanced by the adaptive power of the RBMT system, which ensures a high coverage even if the text is beyond the scope of the EBMT component.

References

Michael Carl, Leonid L. Iomdin, and Oliver Streiter. 1998. Towards dynamic linkage of example-based and rule-based machine translation. In Proceedings of the ESSLLI '98 Machine Translation Workshop.

Michael Carl. 1998. A constructivist approach to Machine Translation. In Proceedings of NeMLaP3/CoNLL98, pages 247–256, Sydney.

Oliver Streiter. 1996. Linguistic Modeling for Multilingual Machine Translation. Informatik. Shaker Verlag, Aachen.

Oliver Streiter. 1998. A semantic description language for multilingual NLP. Paper presented at the Tuscan Word Centre–Institut für Deutsche Sprache Workshop on Multilingual Lexical Semantics, 19-21 June 1998. URL: http://www.iai.uni-sb.de/en/cat-doc.html.