Machine Translation 2008, Lecture 2 (F2)
Translation difficulties and translation strategies
Ambiguity in the source language
  poäng -> point, points, credit, credits
  var:
    verb -> was, were
    pronoun -> each
    adverb -> where
    adjective -> every
    noun -> pus
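Ambiguity in the source language, cont.
  anta:
    anta [någon] (till utbildning), i.e. [someone] (to a course of study) -> admit
    anta [att], i.e. [that ...] -> suppose
  kunna:
    vara i stånd (be in a position to) -> be able to
    ha kunskap om (have knowledge of) -> know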
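The entries for poäng and var on the ambiguity slides can be pictured as a dictionary keyed by part of speech. The following is a minimal sketch, not a description of any real system: the tag names (`verb`, `pron`, `adv`, `adj`, `noun`) and the function `candidates` are assumptions made for this illustration.

```python
# Toy lexicon built from the slide entries; the POS tag names are assumptions.
LEXICON = {
    "poäng": {"noun": ["point", "points", "credit", "credits"]},
    "var": {
        "verb": ["was", "were"],
        "pron": ["each"],
        "adv": ["where"],
        "adj": ["every"],
        "noun": ["pus"],
    },
}

def candidates(word, pos=None):
    """Return English candidates for a Swedish word.

    With a POS tag the choice narrows to that reading; without one,
    every reading is returned and the ambiguity remains.
    """
    senses = LEXICON.get(word, {})
    if pos is not None:
        return list(senses.get(pos, []))
    return [t for translations in senses.values() for t in translations]

print(candidates("var", "adv"))  # one reading left
print(candidates("var"))         # all readings: still ambiguous
```

Note that a POS tag resolves var almost completely, but not poäng: all four candidates share one part of speech, so choosing among them needs context or domain knowledge.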
Variation in the target language
  Vid avslutad kurs
    On completion of the course      173,000
    After completion of the course    74,000
    Having completed the course,      25,900
    After finishing the course        25,400
    After completed course               636
    After a completed course             192
  A literal, word-for-word translation does not fit in any case (*At completed course).
Lexical translation choices
  på -> on/of/in/
    baserad på -> based on
    exempel på -> example of
    studenter på programmet -> students in the program
  redogöra för -> account for, describe
Grammatical differences
  Sv. Efter avslutad kurs förväntas studenten ha grundläggande kunskaper om dynamiken i atmosfären.
  En. On completion of the course, the student is expected to have basic knowledge of the dynamics of the atmosphere.
Basic strategies
  direct translation
  rule-based translation
  transfer
  interlingua
  example-based translation
  statistical translation
  hybrids
The Vauquois triangle
  http://www1.cs.columbia.edu/~julia/jmchapters/ch24.pdf
Direct translation
  no complete intermediary sentence structure
  translation proceeds in a number of steps, each step dedicated to a specific task
  the most important component is the bilingual dictionary
  typically general language
  problems with
    ambiguity
    inflection
    word order and other structural shifts
Simplistic approach
  sentence splitting
  tokenisation
  handling capital letters
  dictionary look-up and lexical substitution
  heuristics for handling ambiguities
  copying unknown words, digits, signs of punctuation etc.
  formal editing
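The steps of the simplistic approach can be sketched as a small pipeline. This is an illustrative toy under stated assumptions, not an actual system: the dictionary `TOY_DICT`, its three entries and all function names are invented for the example, and the ambiguity heuristic is reduced to storing a single default sense per word.

```python
import re

# Toy bilingual dictionary (assumption). Each word has one default sense,
# which stands in for the "heuristics for handling ambiguities" step.
TOY_DICT = {
    "studenten": "the student",
    "har": "has",
    "poäng": "credits",
}

def split_sentences(text):
    """Sentence splitting: break after ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def tokenise(sentence):
    """Tokenisation: words and digit strings, with punctuation as separate tokens."""
    return re.findall(r"\w+|[^\w\s]", sentence)

def translate_sentence(sentence):
    out = []
    for tok in tokenise(sentence):
        # Handling capital letters: look up the lower-cased form.
        target = TOY_DICT.get(tok.lower())
        # Unknown words, digits and punctuation are copied unchanged.
        out.append(target if target is not None else tok)
    text = " ".join(out)
    # Formal editing: reattach punctuation and capitalise the first letter.
    text = re.sub(r"\s+([.,!?;:])", r"\1", text)
    return text[:1].upper() + text[1:]

def translate(text):
    return " ".join(translate_sentence(s) for s in split_sentences(text))

print(translate("Studenten har 30 poäng."))  # The student has 30 credits.
```

Nothing in the pipeline looks beyond a single token, which is why such a design runs into the problems listed for direct translation: ambiguity, inflection and word order.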
Advanced classical approach (Tucker 1987)
  source text dictionary look-up and morphological analysis
  identification of homographs
  identification of compound nouns
  identification of noun and verb phrases
  processing of idioms
Advanced approach, cont.
  processing of prepositions
  subject-predicate identification
  syntactic ambiguity identification
  synthesis and morphological processing of target text
  rearrangement of words and phrases in target text
Feasibility of the direct translation strategy
  Is it possible to carry out the direct translation steps as suggested by Tucker with sufficient precision without relying on a complete sentence structure?
Systran
  System Translation
  developed in the US by Peter Toma
  first version 1969 (Ru-En)
  EC bought the rights to Systran in 1976
  currently 18 language pairs
  first sv-en version in 2003
  http://babelfish.altavista.com/
Systran, cont.
  more than 1,600,000 dictionary units
  20 domain dictionaries
  in daily use by EC translators and administrators of the European institutions
  originally a direct translation strategy (see H&S)
  today more of a transfer-based strategy
Ex. 1: fairly good translation / Systran sv-en
  Sv. "Enskilda företagare som inte bildat bolag klassificeras hit."
  En. "Individual entrepreneurs that have not formed companies are classified here."
  The system has identified bildat as a perfect tense form and correctly translates it as have formed, with the negation not in the right place.
Ex. 2: word order problem / Systran sv-en
  Sv. "När byarna kontaktades hade de inte ens utsatts för influensa."
  En. "When the villages were contacted had they not even been exposed to flu."
  The system has not identified the subject and the predicate of the main clause and therefore keeps the Swedish verb-before-subject order, which is wrong in English.
Ex. 3: ambiguity problem / Systran sv-en
  Sv. "Vad kan vi lära av Arrawetestammen?"
  En. "What can we faith of the Arawete?"
  The system does not find the connection between kan and lära and thus fails to recognize lära as a verb.
Ex. 4: ambiguity problem / Systran sv-en
  Sv. "Extrapoleringen går till så här."
  En. "The extrapolation goes to so here."
  The system does not recognize the phrasal verb gå till and thus incorrectly translates word by word.
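One common way to avoid the word-by-word failure in Ex. 4 is to store multiword units in the dictionary and prefer the longest match. The sketch below is a hedged illustration and does not show how Systran works. The single-word entries mirror the output quoted in Ex. 4, while the multiword entries and their English glosses ("is done", "like this") are assumptions added for the example, as is the function name `lookup`.

```python
# Single-word entries mirror the word-by-word output quoted in Ex. 4.
# The two multiword entries and their glosses are assumptions for illustration.
LEXICON = {
    ("extrapoleringen",): "the extrapolation",
    ("går",): "goes",
    ("till",): "to",
    ("så",): "so",
    ("här",): "here",
    ("går", "till"): "is done",
    ("så", "här"): "like this",
}
MAX_LEN = max(len(key) for key in LEXICON)

def lookup(tokens, use_multiword=True):
    """Greedy left-to-right dictionary look-up, longest match first."""
    out, i = [], 0
    limit = MAX_LEN if use_multiword else 1
    while i < len(tokens):
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(limit, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            if key in LEXICON:
                out.append(LEXICON[key])
                i += n
                break
        else:
            # No entry of any length: copy the unknown token unchanged.
            out.append(tokens[i])
            i += 1
    return " ".join(out)

tokens = "Extrapoleringen går till så här".split()
print(lookup(tokens, use_multiword=False))  # the extrapolation goes to so here
print(lookup(tokens))                       # the extrapolation is done like this
```

A contiguous match of this kind only helps when verb and particle are adjacent. When other words intervene, some structural analysis is still needed.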
Systran Linguistic Resources
  Dictionaries
  POS Definitions
  Inflection Tables
  Decomposition Tables
  Segmentation Dictionaries
  Disambiguation Rules
  Analysis Rules
Systran Processing Steps
  Analysis
    Lookup
    Compound decomposition
    Disambiguation
    Syntactic analysis
    Compound expansion
  Sentence transfer
    Initial target structure
    Lookup
    Default transfer of attributes
    Structure transformation
Systran Processing Steps (cont.)
  Sentence synthesis
    Structure transformation
    Inflection look-up
    Surface transformation
Motivations for transfer-based translation
  lexical ambiguity
  structural differences
  See further Ingo 91
Example 1
  Sv. Fyll på olja i växellådan.
  En. Fill gearbox with oil.
  (from the Scania corpus)
    fyll på -> fill
    obj -> adv
    adv -> obj
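The function swap in Example 1 can be written as a single transfer rule over a small sentence structure. The sketch below is illustrative only: the dictionary-based representation, the field names (`pred`, `obj`, `adv`, `prep`, `noun`) and the functions `transfer` and `generate` are assumptions, and the lexical entries are taken from the sentence pair in Example 1.

```python
# Lexical transfer entries taken from the sentence pair in Example 1.
LEX = {"fyll på": "fill", "olja": "oil", "växellådan": "gearbox"}

def transfer(source):
    """Map a Swedish 'fyll på' structure to an English 'fill' structure.

    The rule swaps grammatical functions: the Swedish object becomes an
    English adverbial (a with-phrase), and the noun of the Swedish
    adverbial becomes the English object.
    """
    return {
        "pred": LEX[source["pred"]],
        "obj": LEX[source["adv"]["noun"]],
        "adv": {"prep": "with", "noun": LEX[source["obj"]]},
    }

def generate(target):
    """Linearise the target structure as an English imperative sentence."""
    words = [
        target["pred"],
        target["obj"],
        target["adv"]["prep"],
        target["adv"]["noun"],
    ]
    sentence = " ".join(words)
    return sentence[0].upper() + sentence[1:] + "."

# Assumed analysis of "Fyll på olja i växellådan."
source = {
    "pred": "fyll på",
    "obj": "olja",
    "adv": {"prep": "i", "noun": "växellådan"},
}
print(generate(transfer(source)))  # Fill gearbox with oil.
```

The rule operates on grammatical functions rather than on word positions, which is the kind of structural difference a purely word-based direct strategy cannot express.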