Integration of Content Optimization Software into the Machine Translation Workflow
Ben Gottesman
Acrolinx
What is Acrolinx?
Acrolinx is Content Optimization Software. It helps authors make their text
- more correct,
- more consistent,
- and easier to translate.
Role of Acrolinx in taraXÜ:
- pre-edit translation source
- post-edit translation output
- develop resources to process and check Chinese text
- Multilingual Term Extraction
Pre- and post-editing for MT
- Pre-editing: rules for making content more easily machine-translatable
  - e.g. keep two verb parts together: The user must shut down the system swiftly.
  - because statistical MT has trouble with long-distance dependencies
- Post-editing: rules for correcting errors that are typical of MT systems
  - e.g. correct reflexive verb usage: Aber verwirren Sie sich nicht.
  - many errors can be attributed to syntactic differences between source and target language
Pre- and post-editing for MT
Business value:
- Many existing Acrolinx customers use Acrolinx in combination with translation.
- Typically, such customers use Acrolinx to improve source texts prior to translating them.
  - When you improve the source text directly, you improve many translations indirectly.
  - Source: DE → Translation: EN, Translation: FR, Translation: ZH
- Many of these customers already use MT in their translation workflow, or wish to start doing so.
- The rules we developed and the expertise we gained in taraXÜ put us in a better position to serve these customers' needs.
Example: Pre-editing with Acrolinx
1. Compose content in your preferred editing tool.
2. Check your content in the editing tool with the Acrolinx plugin.
3. Machine-translate your content.
Before pre-editing:
  Source: The user must shut the system swiftly down.
  MT output: Der Benutzer muss das System heruntergefahren schnell nach unten.
After pre-editing:
  Source: The user must shut down the system swiftly.
  MT output: Der Benutzer muss das System herunterfahren schnell.
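The "keep two verb parts together" rule can be illustrated with a minimal Python sketch. This is not the Acrolinx rule formalism; the lexicon `PHRASAL_VERBS`, the function names, and the `max_gap` window are all hypothetical choices made for illustration. The sketch flags a verb whose particle appears later in the sentence and proposes a rewrite with the particle moved next to the verb.

```python
import re

# Hypothetical mini-lexicon (verb -> particle); not taken from Acrolinx.
PHRASAL_VERBS = {"shut": "down", "switch": "off", "turn": "on"}


def detokenize(tokens):
    """Join tokens with spaces and remove the space before punctuation."""
    return re.sub(r"\s+([.,;:!?])", r"\1", " ".join(tokens))


def flag_split_phrasal_verbs(sentence, max_gap=6):
    """Return (verb, particle, suggestion) for each verb separated from its particle."""
    tokens = re.findall(r"\w+|[^\w\s]", sentence)
    findings = []
    for i, tok in enumerate(tokens):
        particle = PHRASAL_VERBS.get(tok.lower())
        if particle is None:
            continue
        # Verb and particle already adjacent: nothing to flag.
        if i + 1 < len(tokens) and tokens[i + 1].lower() == particle:
            continue
        # Search a limited window after the verb for the stranded particle.
        for j in range(i + 2, min(len(tokens), i + 2 + max_gap)):
            if tokens[j].lower() == particle:
                rewritten = (tokens[:i + 1] + [tokens[j]]
                             + tokens[i + 1:j] + tokens[j + 1:])
                findings.append((tok, particle, detokenize(rewritten)))
                break
    return findings


for verb, particle, suggestion in flag_split_phrasal_verbs(
        "The user must shut the system swiftly down."):
    print(f"'{verb} ... {particle}' is split; suggested: {suggestion}")
```

A production rule would presumably rely on POS tags and parsing rather than a word list, since a plain lexicon cannot tell a particle from a preposition or adverb.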
Chinese
Support for checking Chinese developed from scratch within taraXÜ
- not just for pre- and post-editing but in general
Components:
- segmentation, tokenization, POS-tagging, morphology
- spell-checking
  - 9634 erroneous spellings can be flagged
- 70 grammar rules, 76 style rules
Business value:
- opens up the Chinese market to Acrolinx
  - new customers
  - also several existing customers want to check Chinese text
Multilingual Term Extraction
Identify technical terms and their translations in multilingual texts
Example segments:
  German: Die Spannungsversorgung für die Elektronik wird vom Speisegerät G526 sichergestellt.
  English: The voltage supply for the electronics is maintained by the power supply unit G526.
  German: Spannungsversorgung für interne Speisung (X3e)
  English: Power supply for internal supply (X3e)
  German: Unterspannung in der Stromversorgung
  English: Undervoltage in the power supply
Extracted terms:
  German: Spannungsversorgung, Stromversorgung
  English: voltage supply, power supply
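The slides do not say how the MTE tool links terms across languages. As a rough illustration only, one common heuristic scores candidate pairs by how often they co-occur in aligned segments, for example with the Dice coefficient. In the Python sketch below, the segments are those from the slide, while the per-segment candidate lists, the name `dice_links`, and the choice of Dice scoring are assumptions.

```python
from collections import Counter
from itertools import product

# Candidate terms per aligned segment pair (German, English).
# Segments follow the slide; the candidate lists are hand-picked assumptions.
ALIGNED_CANDIDATES = [
    (["Spannungsversorgung", "Elektronik", "Speisegerät"],
     ["voltage supply", "electronics", "power supply unit"]),
    (["Spannungsversorgung", "Speisung"],
     ["power supply", "internal supply"]),
    (["Unterspannung", "Stromversorgung"],
     ["undervoltage", "power supply"]),
]


def dice_links(aligned):
    """Score each co-occurring (source, target) term pair with the Dice coefficient."""
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src_terms, tgt_terms in aligned:
        src_set, tgt_set = set(src_terms), set(tgt_terms)
        src_count.update(src_set)                     # segments containing each source term
        tgt_count.update(tgt_set)                     # segments containing each target term
        pair_count.update(product(src_set, tgt_set))  # segments containing both
    return {
        (s, t): 2 * n / (src_count[s] + tgt_count[t])
        for (s, t), n in pair_count.items()
    }


scores = dice_links(ALIGNED_CANDIDATES)
for (de, en), score in sorted(scores.items(), key=lambda kv: -kv[1])[:5]:
    print(f"{de} <-> {en}: {score:.2f}")
```

With only three segments the scores are noisy, which is one reason a human validation step over the extracted candidates is useful.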
Multilingual Term Extraction
MTE results can contribute to the translation workflow:
- extracted terms are used to build up terminology
- human validators specify preferred/dispreferred variants from among extracted same-language synonyms
  - these can guide future writing or translations
- cross-language links can guide future translations (human/machine)
Business value for Acrolinx:
- have sold MTE as a service to several customers
- MTE offering has helped attract new customers
- some customers are interested in purchasing the MTE tool itself
Summary
taraXÜ outcomes for Acrolinx:
- new checking rules
  - pre-editing, post-editing
  - SMT, RBMT
  - English, German, Chinese
- Multilingual Term Extraction tool