Tackling interoperability issues within UIMA workflows
Nicolas Hernandez
LINA (CNRS - UMR 6241), University of Nantes
2 rue de la Houssinière, B.P., NANTES Cedex 3, France
[email protected]

Abstract

One of the major issues with any workflow management framework is component interoperability. In this paper we are concerned with the Apache UIMA framework. We address the problem by considering separately the development of new components and the integration of existing tools. For the former objective, we propose an API to generically handle type system (TS) objects by their names, using reflection, in order to make components TS-independent. For the latter, we distinguish the case of aggregating heterogeneous TS-dependent UIMA components from the case of integrating non-UIMA-native third-party tools. We propose a mapper component to aggregate TS-dependent UIMA components, a component to wrap command-line third-party tools, and a set of components to connect various markup languages with the UIMA data structure. Finally, we present two situations where these solutions were effectively used: training a POS tagger from a treebank, and embedding an external POS tagger in a workflow. Our approach aims at providing quick development solutions.

Keywords: UIMA, interoperability, type system, data serialization format, software component integration

1. Introduction

Over the last few years, there has been growing interest in the Apache Unstructured Information Management Architecture (UIMA) (Ferrucci and Lally, 2004) as a software solution to manage unstructured information. In comparison with GATE (Cunningham, 2002), its most notable difference is probably that it was initiated more recently, and not by researchers but by an industrial actor (IBM) with strong engineering and design skills. GATE has the advantages of having long been used in the Natural Language Processing (NLP) community and of offering a wide range of analysis components and tools.
NLTK also remains an interesting framework, in particular because of its numerous integrated third-party tools and data resources, and because of its programming language, Python, which is well adapted to handling text material and quick development. From the NLP researcher's point of view, Apache UIMA is an attractive solution for at least two reasons (see (Hernandez et al., 2010) for more): first, it dissociates the engineering middleware problems from the NLP issues and takes charge of many engineering needs such as workflow deployment, data transmission and data serialization; second, it offers a programming framework for defining and managing the NLP objects present in analysis tasks (such as creating or retrieving the annotations of a given type). One of the major issues with any workflow management framework is component interoperability. UIMA components only exchange data, so the structure of the shared data is what ensures interoperability. UIMA offers mechanisms to freely define one's own data structure and the means to handle it afterwards. This may lead to interoperability problems, since anyone can design their own domain model to represent the same concepts. For example, word, mot or token can be different names for the same type of information. In addition, as shown in Figure 1, a piece of information, such as the part-of-speech (POS) value Noun of a word, can be represented in several ways. In (1), it is the value of a POS feature of a word annotation. In (2), it is the value of some feature of a POS annotation at the same offsets as a word annotation. And in (3), it is itself the type of an annotation covering the text offsets of the word. In UIMA jargon, the definition of the data structure is called the type system (TS).
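The three modelings above can be sketched as plain Java classes. The sketch below is purely illustrative (these are not UIMA-generated types), but it makes concrete how the same fact can live in three different domain models:

```java
// Illustrative sketches of the three representations in Figure 1.
// These are plain Java classes, not UIMA type definitions.

// (1) The POS value is a feature of the word annotation itself.
class Word { public int begin, end; public String pos; }

// (2) The POS value is held by a separate POS annotation spanning
//     the same offsets as a plain word annotation.
class PlainWord { public int begin, end; }
class PosAnnotation { public int begin, end; public String value; }

// (3) The POS value is itself the annotation type covering the word.
class NounAnnotation { public int begin, end; }

public class PosModels {
    public static void main(String[] args) {
        // The same fact (the word at offsets 0-5 is a noun), three ways:
        Word w1 = new Word();
        w1.begin = 0; w1.end = 5; w1.pos = "Noun";

        PlainWord w2 = new PlainWord();
        w2.begin = 0; w2.end = 5;
        PosAnnotation p2 = new PosAnnotation();
        p2.begin = 0; p2.end = 5; p2.value = "Noun";

        NounAnnotation w3 = new NounAnnotation();
        w3.begin = 0; w3.end = 5;

        System.out.println(w1.pos.equals(p2.value)); // prints true
    }
}
```

A component written against model (1) cannot read annotations produced under model (2) or (3) without a converter, which is precisely the interoperability problem addressed in this paper.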
In this paper, we address the interoperability issue by considering separately the development of new components and the integration of existing tools. For the former objective, we propose an API to generically handle TS objects by their names, using reflection, in order to make components TS-independent (Section 3.1.). For the latter, we distinguish the case of aggregating heterogeneous TS-dependent UIMA components from the case of integrating non-UIMA-native third-party tools. We propose a mapper component to aggregate TS-dependent UIMA components (Section 3.2.), a component to wrap command-line third-party tools (Section 3.3.), and a set of components to connect various markup languages with the UIMA data structure (Section 3.4.). Finally, we present two situations where these solutions were effectively used: training a POS tagger from a treebank (Section 4.1.), and embedding an external POS tagger in a workflow (Section 4.2.). Our approach aims at providing quick development solutions. This work is part of the efforts for building a French-speaking community around UIMA (Hernandez et al., 2010).
Figure 1: Various data model definitions.

2. Background

The question of interoperability for sharing language resources and technology is the concern of several related initiatives such as the CLARIN project and the META-NET network. Various kinds of related interoperability aspects can be considered: the «domain models», which provide definitions of the types of the data elements that constitute a domain, as well as a description of how these types are structured in the domain; the «data serialization formats», which are used to store or remotely transmit the primary data and its associated metadata; and the «metamodels», which are abstractions of domain models.

2.1. UIMA concepts

In March 2009, the UIMA metamodel was adopted as a standard by the OASIS consortium. «The specification defines platform-independent data representations and interfaces for text and multi-modal components or services. The principal objective of the UIMA specification is to support interoperability among components or services.» The Common Analysis Structure (CAS) is the data structure exchanged between UIMA components. It includes the data, the subject of analysis, called the Artefact, and the metadata, in general simply called the Annotations, which describe the data. The annotations are structured into Views (e.g. an HTML document can have the following views: the HTML structure, the extracted text, and a translation of the latter), with which they are directly associated. The annotations are made of a feature structure and are stored in a CAS index. The definition of an annotation structure is called the Type System (TS) and consists of an implementation of a domain model. A UIMA workflow is made of three types of components: the Collection Reader (CR), which imports the data to process (for example from the Web or from the file system) and turns it into a CAS;
The Analysis Engines (AE), which actually process the data (including but not restricted to NLP analysis tasks) and whose processing produces the annotations; and lastly the CAS Consumer (CC), which exports the annotations (for example to a database or to an XML representation of the analysis results). The UIMA framework handles the effective transmission of the data, either as objects between components deployed on the same computer, or as XML streams through (a)synchronous web services. In addition, the UIMA framework comes with components which make it possible to serialize the exchanged data into the XML Metadata Interchange (XMI) format, the OMG's XML standard for exchanging Unified Modeling Language (UML) metadata.

2.2. Serialization formats for exchanging data between third party tools

As part of ISO TC37 SC4, (Ide and Suderman, 2009) defend the idea that the GrAF (Graph Annotation Framework) format, the XML serialization of the LAF (Linguistic Annotation Framework) metamodel (Romary and Ide, 2004), can serve as a «pivot» for transducing between serialization formats. They show that it is possible to convert the information from GrAF to a UIMA CAS. This is made possible mainly because both underlying metamodels are based on a graph structure. Nevertheless, the conversion is not straightforward: since the UIMA CAS defines typed feature structures, the process requires external knowledge sources to specify the types of the UIMA features. Nor is the outcome completely bijective: the process requires the definition of additional UIMA features to make the graph structure of GrAF explicit and consequently to reverse the process. As a consequence, an arbitrary UIMA CAS cannot be transduced into GrAF without an adaptation of its structure.
As a matter of fact, the XMI format remains an appealing solution to store and exchange UIMA CAS.

2.3. Main trend for tackling the type system interoperability problem

The solutions proposed to tackle the domain model (i.e. type system) interoperability problem have been similar to those proposed for the XML language interoperability problem. The main trend has been to define a common tool- and domain-free TS. In fact, several TS emerged: the CCP metamodel's TS (Verspoor et al., 2009), the DKPro TS at Darmstadt University (Gurevych et al., 2007), the Julie lab's TS (Hahn et al., 2007) and the U-Compare project's TS (Kano et al., 2009; Thompson et al., 2011). The first consists of a simple annotation hierarchy where the domain semantics is captured through pointers into external resources. The others roughly consist of an abstract hierarchy of NLP concepts covering the
various linguistic analysis levels. In particular, (Thompson et al., 2011) defend the adoption of the Apache UIMA framework and the integration of the U-Compare system within the META-SHARE infrastructure, an initiative of META-NET for sharing language resources and technology on a range of European languages. In practice, these type systems are still used separately. As we mentioned in (Hernandez et al., 2010), existing TS should be reused as often as possible; but in our opinion, distinct TS will always exist, and it will always be necessary to develop software converters, either to ensure compatibility with existing TS-dependent components or to fit specific problem requirements. The U-Compare project, for example, comes with some ad hoc TS converters which turn the CCP, OpenNLP and Apache TS into the U-Compare TS. It also offers some converters from the U-Compare TS to OpenNLP's.

3. Handling the UIMA framework more easily and generically

Below we present the projects we developed for this purpose. They are made of libraries and UIMA components.

3.1. uima-common: Re-using common UIMA code

uima-common aims at assembling common and generic code snippets that can usefully be reused in several distinct UIMA developments (such as AEs or any applications). It is mainly made of two parts: a UIMA utilities library and a generic AE implementation. The library defines methods to generically handle the various UIMA object types (i.e. view, annotation, feature), referring to them by string names. It centralizes redundant code, in particular for parsing collections of these objects and for getting and setting them. The generic AE can be used by extending its class. It makes it possible to develop TS-independent AEs and so to handle the processed views and annotations generically. This is achieved by specifying the names of the handled views and annotations as parameters. In addition, the AE follows a common analysis template: it performs some process (e.g.
does it start with an upper-case letter?) on some annotations (e.g. the tokens) covered by some others (e.g. the sentences) which are only present in some views (e.g. the extracted text), simply by overriding the right methods. The processing result can be stored in a new view or a new annotation (it is also possible to set or update the feature value of an annotation), simply by setting their names as parameters. In some ways it can be compared with uimaFIT, but uima-common is more centered on the internal development of components, while uimaFIT is more dedicated to the development of applications, with the ability to perform dynamic configuration and instantiation of components and workflows. The implementation is heavily based on the java.lang.reflect API.

3.2. uima-mapper: Mapping between UIMA objects

uima-mapper aims at tackling two issues: mapping UIMA objects (annotations and features from distinct type systems and views) and recognizing annotation patterns. Both issues are indeed two views of the same process, performed on the fly by declarative rules. In NLP, rule-based analysis is one of the two main approaches to processing documents; the other is based on machine learning. The development of graph matching systems presents two difficulties: 1. designing a simple and intuitive language for expressing rules, and 2. implementing an effective engine to process the rules over graphs. uima-common offers some solutions for making newly developed UIMA components TS-independent. Nevertheless, many available components assume the names of the view to work with and of the annotations to process or create. uima-mapper addresses this problem by inserting, between two components that use the same concepts defined in distinct TS, a specific AE able to translate annotations from the former TS into the latter. The AE only requires the rule file paths as parameters.
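The name-based access on which uima-common relies boils down to java.lang.reflect. The following minimal sketch (the class and method names are illustrative stand-ins, not the uima-common API) shows how a feature can be read and written knowing only its name as a string, which is what makes a component independent of compile-time type definitions:

```java
import java.lang.reflect.Field;

// Illustrative stand-in for an annotation class; not an actual UIMA type.
class TokenAnnotation {
    public int begin;
    public int end;
    public String postag;
}

public class ReflectiveAccess {
    // Set a feature of an annotation object knowing only its name as a
    // string, the way uima-common refers to views, annotations and
    // features by name rather than by compiled types.
    static void setFeature(Object annotation, String featureName, Object value)
            throws ReflectiveOperationException {
        Field f = annotation.getClass().getField(featureName);
        f.set(annotation, value);
    }

    static Object getFeature(Object annotation, String featureName)
            throws ReflectiveOperationException {
        return annotation.getClass().getField(featureName).get(annotation);
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        TokenAnnotation token = new TokenAnnotation();
        // "postag" could come from a component parameter, not from code.
        setFeature(token, "postag", "NOUN");
        System.out.println(getFeature(token, "postag")); // prints NOUN
    }
}
```

The real UIMA API resolves names through the type system (Type and Feature objects) rather than Java fields, but the principle is the same: the names are data, so they can be supplied as component parameters.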
A rule declares edit operations (such as creation) by specifying constraints over an annotation pattern and over the annotations it contains. For the creation operation, it is possible to set features with values imported from the matched pattern. Currently the pattern can contain only one annotation and only the creation operation is supported; as a matter of fact, this AE was first a proof of concept. Respecting the semantic coherence of the transformation is up to the user. The implementation is based on W3C XPath as the language for declaring constraints over the source annotations, and on Apache JXPath as the engine which processes the constraints. GATE's JAPE (Thakker et al., 2009; Cunningham et al., 2011) and Unitex (Paumier, 2003) offer capabilities for defining regular expressions over annotations. Efforts have been made to allow the embedding of GATE pipelines within UIMA, and similarly with Unitex (Meunier et al., 2009). Even if embedding such tools can be a solution for mapping UIMA annotations, the approach seems too complex and resource-consuming for the need. Indeed, it is necessary to configure both these tools and the embedding; in addition to writing the mapping rules between UIMA objects, the user must also supply rules defining how to map between UIMA and GATE/Unitex annotations. To our knowledge, there currently exist two native UIMA projects whose goal is to offer capabilities for recognizing UIMA annotation patterns in a UIMA workflow: the zanzibar project and the TextMarker component (Kluegl
et al., 2009). The former project has not been upgraded since March; the current version is hardly stable. And although patterns can be defined, the constraints over the annotations are quite limited in expressiveness (i.e. only the presence of a feature value can be specified). On the other hand, the TextMarker component is a very appealing project because of the rich expressiveness it aims at offering. It is now further developed and hosted at Apache UIMA; there is no stable release of that version yet, and it is recommended to use the previous stable version, which remains very complex to use and quite dependent on the Eclipse environment. In this previous version, we encountered the problem that the TextMarker engine allowed only one annotation of the same type starting at a given offset. This limitation prevented us from using it, since we could not combine it with the uima-connectors XML2CAS AE presented below (see Section 3.4.). The XML2CAS AE produces as many annotations of the same type as there are attributes in an XML element; there was no way to express a TextMarker rule selecting a specific annotation of this type, as TextMarker always considered the first one at this offset in the annotation index.

3.3. uima-shell: Integrating non-native UIMA third party tools

uima-shell offers a way to run a shell command over a CAS element, view or annotation, and to store the result either as a new view or as an annotation. It mainly aims at running, within a UIMA workflow, external third-party tools available on the command line. These tools must perform their processing by taking their input as a file-name argument or on standard input (stdin), and must produce their result on standard output (stdout). The specified CAS element to analyse is turned into a file argument, which is accessed by specifying the commands and argument tokens which precede the file argument (i.e.
the PreCommand parameter) as well as the commands and argument tokens which follow the file argument (i.e. the PostCommand parameter). Under a Linux system, if the command to run takes its input from standard input, then the PreCommand parameter will be set to cat and the PostCommand parameter will start with a pipe character followed by the command. If the command takes its input as an argument at a specific position, then the PreCommand parameter will be set to the command and the first arguments, and the PostCommand parameter to the last arguments. In any case, it is possible to set several commands before and after the file argument. It is also possible to specify environment variables (i.e. the EnvironmentVariable parameter) which will be available to the process running the commands. Together with the uima-connectors project (see Section 3.4.), this component aims at solving interoperability issues when dealing with non-native UIMA tools. The implementation is based on the Martini Shell API.

3.4. uima-connectors: Connecting various text markup languages with UIMA

uima-connectors mainly aims at offering solutions to bridge some markup languages and the UIMA CAS. The Apache UIMA Tika project aims at detecting and extracting metadata and structured text content from documents of various MIME types. In comparison, uima-connectors is more dedicated to performing mappings from/to text formats to/from the CAS, providing solutions for handling general markup languages such as the eXtensible Markup Language (XML), Comma-Separated Values (CSV), or whitespace-tokenized texts with one sentence per line, as well as applications of these formats such as the Message Understanding Conference (MUC), Apache OpenNLP, or CoNLL (B-I-E) formats. Similar facilities are offered by other platforms such as U-Compare, the GrAF software, and GATE.
The aim of uima-connectors is to offer as generic a solution as possible, to avoid having to develop ad hoc solutions or to extend existing ones. The basic idea is that there are recurrent situations where the metamodels of these formats (XML, CSV...) can be aligned with the CAS. In practice, the solutions could be collection readers, analysis engines or CAS consumers. We preferentially adopt an approach in terms of AEs, which makes it possible to plug in at any point of a workflow by specifying the view to process. These components can complete the wrapping performed by the uima-shell component, but they can also be used at the beginning or at the end of a processing workflow to import from or export to specific serialization formats. Examples of uima-connectors include the CSV2CAS and XML2CAS AEs. The CSV2CAS AE offers various ways to create or update annotations with CSV-like formatted information. The type of the annotation to handle is given as a parameter. If the annotation type exists in a specified view, the AE updates the annotations; the number of CSV lines is then assumed to equal the number of annotation instances to update. The AE makes it possible to set the correspondence between feature names and column ranks. The XML2CAS AE transduces the information from the XML tree structure to the CAS by creating annotations over the text spans delimited by the begin and end tags of XML elements. The types of the created annotations are predefined and represent the XML element and attribute nodes: XMLElementAnnotation and XMLAttributeAnnotation. The XMLAttributeAnnotation type, for example, comes with dedicated features giving its name (the attributename feature), its value (the attributevalue feature) and the name of the element to which it belongs (the elementname feature). As
annotations, both have begin, end and coveredtext features. This AE offers a way to process the XML structure of any XML document without having to define new UIMA types. In practice, we use uima-mapper afterwards to turn the generic types into the specific input annotation types of the following AEs.

4. Examples of use cases

Below we present two real use cases where we used the components presented in the previous section.

4.1. Mapping annotations: From FTB to HMMPOSTaggerTrainer

In this section we present a situation where a mapping AE is useful to connect two AEs which handle distinct annotations for the same concept. We illustrate this use case with the task of building a statistical model for a tagger from the French Treebank (FTB) corpus (Abeillé and Barrier, 2004). We use the Apache HMM tagger trainer (apache-addons:HMMPOSTaggerTrainer) to build an HMM model from the data. The workflow is represented in Figure 2.

Figure 2: Mapping annotations: From FTB to HMMPOSTaggerTrainer.

<SENT nb="8000">
  <w cat="adv" mph="adv" lemma="tout au plus">
    <w catint="adv">tout</w>
    <w catint="p">au</w>
    <w catint="d"/>
    <w catint="adv">plus</w>
  </w>
  <NP>
    <w lemma="un" cat="d" mph="d-ind-fp">des</w>
    <w cat="a" mph="a-qual-fp" lemma="petit">petites</w>
    <w cat="n" mph="n-c-fp" lemma="chose">choses</w>
  </NP>
  <VPinf>
    <w cat="p" mph="p" lemma="à">à</w>
    <VN>
      <w cat="v" mph="v--w" lemma="changer">changer</w>
    </VN>
    <PP>
      <w cat="p" mph="p" lemma="sur">sur</w>
      <NP>
        <w cat="d" mph="d-def-fs" lemma="le">l'</w>
        <w cat="n" mph="n-c-fs" lemma="intégration">intégration</w>
      </NP>
    </PP>
  </VPinf>
  <w cat="ponct" mph="ponct-s" lemma=".">.</w>
</SENT>

Figure 3: Example of annotations from the lmf3_08000_08499ep.xd.cat.xml file of the FTB. Some attributes have been removed or renamed for readability.

The FTB is available for research purposes in an XML format. Figure 3 shows an example of annotations at various analysis levels. For this use case, we are interested in the XML w element, which marks the words. The XML cat, lemma and mph attributes describe respectively the POS, the lemma and some morphological information about the words. The w element is used both for simple words and for multiword expressions; embedded w elements do not have a cat attribute but a catint attribute (see the first occurrence of the w element). The uima-connectors XML2CAS AE can process XML content and produce annotations for the specified XML nodes. As described in Section 3.4., this AE defines its own TS. For each word XML annotation, the AE will create a CAS XMLElementAnnotation for the element w and an XMLAttributeAnnotation for each of its attributes. For instance, the XML annotation <w cat="a" mph="a-qual-fp" lemma="petit">petites</w> will produce three XMLAttributeAnnotation annotations; the detail for the XMLElementAnnotation is not given here. Table 1 shows the feature names (first line) and the corresponding values of each of the three created XMLAttributeAnnotation annotations. The begin and end features and values are not shown to keep the table simple; all three annotations share the same begin and end values.

elementname  attributename  attributevalue  coveredtext
w            cat            A               petites
w            lemma          petit           petites
w            mph            A-qual-fp       petites

Table 1: Feature names and values of the three XMLAttributeAnnotation annotations created for the attributes of the XML annotation <w cat="a" mph="a-qual-fp" lemma="petit">petites</w>.

For our training purpose, let us consider that we want to work with the w elements which are simple words or parts of a multiword, but not a multiword itself. We are thus interested in the XMLAttributeAnnotation annotations which satisfy the following criteria: either cat or catint as the value of the attributename feature, and w as the value of the elementname feature.
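To make this selection concrete, the following standalone sketch applies an analogous XPath expression directly to the source XML with the JDK's built-in XPath engine. Note that this is an approximation: the actual uima-mapper rule evaluates its constraint with JXPath over the annotation feature structures, not over the raw XML, and the whitespace filter used here to exclude the multiword expression anticipates the filtering criterion discussed next:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class SelectSimpleWords {
    // Count w elements that carry a cat or catint attribute and whose
    // covered text contains no whitespace (simple words or multiword parts).
    static int countSimpleWords(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        String expr = "//w[(@cat or @catint)"
                    + " and not(contains(normalize-space(.), ' '))]";
        NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, doc, XPathConstants.NODESET);
        return hits.getLength();
    }

    public static void main(String[] args) throws Exception {
        // Reduced fragment of the FTB example from Figure 3; the outer w
        // element is a multiword expression and must be excluded.
        String xml = "<SENT>"
                + "<w cat=\"adv\" lemma=\"tout au plus\">"
                + "<w catint=\"adv\">tout</w> <w catint=\"p\">au</w></w>"
                + "<w cat=\"a\" lemma=\"petit\">petites</w>"
                + "</SENT>";
        System.out.println(countSimpleWords(xml)); // tout, au, petites -> 3
    }
}
```

The outer w element is rejected because its normalized text, "tout au", contains a space, while "tout", "au" and "petites" all satisfy the constraint.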
In addition, in order to distinguish the XMLAttributeAnnotation annotations which correspond to simple words from those which correspond to embedding w elements (both have cat as the value of the attributename feature), we decide to filter out the latter based on the presence of a whitespace character in the covered text. Figure 4 shows an example of a uima-mapper rule implementing this definition. The rule declares that, for each XMLAttributeAnnotation annotation, if the XPath
<rule id="XMLAttributeAnnotationW2ApacheTokenAnnotation"
      description="In the FTB, select the w elements which are simple words or parts of a multiword but not one">
  <pattern>
    <patternElement type="fr.univnantes.lina.uima.connectors.types.XMLAttributeAnnotation">
      <constraint>.[(@elementname='w') and ((@attributename='cat') or (@attributename='catint')) and not(contains(@coveredtext, ' '))]</constraint>
      <create type="org.apache.uima.TokenAnnotation">
        <setFeature name="postag" value="normalize-space(./@attributevalue)"/>
      </create>
    </patternElement>
  </pattern>
</rule>

Figure 4: Example of a uima-mapper rule for mapping UIMA annotations.

constraint is satisfied, then a TokenAnnotation is created at the same offsets and its postag feature is set with a value resulting from another XPath expression applied to the matched annotation. This example gives an idea of the expressive power of XPath, which allows constraints to be expressed with many kinds of operators (e.g. boolean, comparison, regular expression...) and functions (e.g. string, mathematical...).

4.2. Integrating a third party tool: TreeTagger

In this section we show how to process UIMA CAS elements with command-line tools and how to set UIMA CAS elements with the produced results. In particular, we describe how to integrate a command-line tool with CSV-like input and output formats. We illustrate this use case by setting the POS and lemma features of some CAS token annotations with the processing results of an external POS tagger. As a third-party tool example, we use Schmid's POS tagger (Schmid, 1994), also known as «TreeTagger», which takes one word per line and produces POS and lemma annotations as tab-separated values beside each input word (e.g. petites ADJ petit). Both input and output formats can be considered CSV-like, the input format being a special case with a single column. We also use the Apache whitespace tokenizer (apache-addons:wst), which segments text into word token annotations.
The workflow is represented in Figure 5. Each component works on a specified input view and stores its result in a specified output view. The raw text of a document is present in the view _InitialView at the beginning of the processing. The CAS2CSV and CSV2CAS AEs are part of the uima-connectors project; they take charge of the conversion from/to CSV-like formats to/from UIMA CAS elements. The shell AE refers to the uima-shell AE and allows the integration of TreeTagger.

Figure 5: Integrating a third party tool: TreeTagger.

The processing proceeds as follows. The Apache whitespace tokenizer performs the segmentation and stores the token annotations (i.e. TokenAnnotation) in its working input view _InitialView; the POS tag and lemma features are not yet set. The CAS2CSV AE takes as parameters the names of an annotation type (i.e. TokenAnnotation) and of the features (i.e. coveredtext) to process. For each annotation instance present in the given _InitialView, it creates a CSV-formatted line with the values of the specified features at specified column ranks. The result is stored in the CAS2CSV_View. The uima-shell AE is set with the following parameter values: the EnvironmentVariable parameter declares TT_HOME=/path/to/application/home/tree-tagger, and the PreCommand and PostCommand parameters declare respectively the values cat and | ${TT_HOME}/bin/tree-tagger ${TT_HOME}/lib/french.par -token -lemma -sgml. The CSV2CAS AE parses the TT_View. Thanks to the given name of an annotation type (i.e. TokenAnnotation) and the associations of column ranks with feature names specified as parameters (e.g. 1 -> postag; 2 -> lemma), the process updates each annotation instance of the given _InitialView with information coming from the CSV-formatted content, each line corresponding to an annotation. The POS tag and lemma features are thus set.
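The shell invocation performed by the uima-shell AE can be approximated outside UIMA by the following sketch, which writes the view content to a temporary file and runs PreCommand <file> PostCommand through a shell. The class and method names are illustrative, and a simple tr command stands in for the TreeTagger call, which would require the tool to be installed:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ShellWrap {
    // Minimal stand-in for what uima-shell does: dump a view's text to a
    // file, run "preCommand <file> postCommand" through a shell, and
    // return the standard output. (Illustrative only; the real component
    // also handles environment variables and error streams.)
    static String runOverText(String text, String preCommand, String postCommand)
            throws IOException, InterruptedException {
        Path tmp = Files.createTempFile("view", ".txt");
        Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
        String cmd = preCommand + " " + tmp + " " + postCommand;
        Process p = new ProcessBuilder("sh", "-c", cmd).start();
        String out = new String(p.getInputStream().readAllBytes(),
                StandardCharsets.UTF_8);
        p.waitFor();
        Files.delete(tmp);
        return out;
    }

    public static void main(String[] args) throws Exception {
        // PreCommand "cat" feeds the file to stdin; the PostCommand pipes
        // it into a tagger substitute (here: upper-casing with tr).
        String tagged = runOverText("petites\n", "cat", "| tr a-z A-Z");
        System.out.print(tagged); // prints PETITES
    }
}
```

With a real TreeTagger installation, the PostCommand string would instead be the pipe into ${TT_HOME}/bin/tree-tagger with its parameter file, exactly as configured in the workflow above.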
It is interesting to note that the apache-addons:wst and CAS2CSV components could simply be removed and functionally replaced by a command-line tokenizer such as perl -ne 'chomp; s/(\p{IsAlnum})(\p{IsPunct})/$1 $2/g; s/(\p{IsPunct})(\p{IsAlnum})/$1 $2/g; s/ /\n/g; print' in the PostCommand parameter, before the TreeTagger command.

5. Conclusion

The uima-common project is a common basis for all the UIMA components we develop, such as uima-mapper, uima-shell and uima-connectors. It was also used, for example, to develop a wrapper for the Java implementations of the C99 and TextTiling segmentation algorithms written by Freddy Choi
(the uima-text-segmenter project). All the presented components aim at tackling the problem of interoperability within a UIMA workflow. They were developed and used for building workflows, in particular for importing and connecting an XML version of the French Treebank corpus (Abeillé and Barrier, 2004) to trainer systems such as the Apache HMM tagger and the OpenNLP MaxEnt preliminary processing tools (Boudin and Hernandez, 2012). As perspectives, we primarily want to publicize these solutions, to motivate people to use them and to participate in the development of these projects. Concerning our short-term developments, we plan to concentrate our efforts on uima-mapper. We are considering ways to build UIMA annotation indexes which could efficiently support the implementation of recognition mechanisms for matching annotation patterns. We would also like to evaluate the performance of the various existing solutions in terms of expressiveness and resource consumption. uima-connectors will also evolve. As mentioned previously, we are more interested in generic approaches than in ad hoc ones. Currently, we offer a solution to import into the CAS the information that can be inferred from the tree structure of XML formats (e.g. two elements can be considered to be in relation because one of them encloses the other). Since several XML formats, such as GrAF, represent graph structures through common mechanisms (e.g. pairs of id/idref attributes), we are interested in offering a way to specify these mechanisms generically and to align them with elements of the CAS structure, in order to be able to handle any XML document representing graphs. Concerning uima-shell, the current version does not address security problems such as the injection of malicious code or the exhaustion of system resources while running the commands.
Indeed, the commands, which are intended to be run, are assumed to be performed by an allowed user as well as to be safe for the running system. We believe that the component should not be restricted because it runs counter to what it aims to do. And in addition, we believe that it can not be restricted because it is impossible to a priori enumerate all the unsafe use cases. The best solutions to prevent these risks are to run the vulnerable process in «jail» environments with restricted rights and resources (i.e. to configure correctly the embedding application servers and operating systems). 6. Acknowledgements This work was financially supported by La région Pays de la Loire under the DEPART project projet-depart.org. I also would like to express my gratitude to the reviewers for there insightful comments uima-text-segmenter /05/construire-des-modelisations-du-french. html 7. References Anne Abeillé and Nicolas Barrier Enriching a french treebank. In Actes de la conférence LREC, Lisbonne. Florian Boudin and Nicolas Hernandez Détection et correction automatique d erreurs d annotation morphosyntaxique du french treebank. In TALN. Hamish Cunningham, Diana Maynard, Kalina Bontcheva, Valentin Tablan, Niraj Aswani, Ian Roberts, Genevieve Gorrell, Adam Funk, Angus Roberts, Danica Damljanovic, Thomas Heitz, Mark A. Greenwood, Horacio Saggion, Johann Petrak, Yaoyong Li, and Wim Peters Text processing with gate. University of Sheffield Department of Computer Science. ISBN , April 15th. Hamish Cunningham Gate, a general architecture for text engineering. Computers and the Humanities, 36: David Ferrucci and Adam Lally Uima: an architectural approach to unstructured information processing in the corporate research environment. Natural Language Engineering, 10(3-4): Iryna Gurevych, Max Mühlhäuser, Christof Müller, Jürgen Steimle, Markus Weimer, and Torsten Zesch and Darmstadt knowledge processing repository based on uima. 
In Proceedings of the First Workshop on Unstructured Information Management Architecture at the Biannual Conference of the Society for Computational Linguistics and Language Technology, Tübingen, Germany.
Udo Hahn, Ekaterina Buyko, Katrin Tomanek, Scott Piao, John McNaught, Yoshimasa Tsuruoka, and Sophia Ananiadou. 2007. An annotation type system for a data-driven NLP pipeline. In Proceedings of the Linguistic Annotation Workshop (The LAW at ACL 2007), Prague, Czech Republic, June 28-29. Stroudsburg, PA: Association for Computational Linguistics.
Nicolas Hernandez, Fabien Poulard, Matthieu Vernier, and Jérôme Rocheteau. 2008. Building a French-speaking community around UIMA, gathering research, education and industrial partners, mainly in Natural Language Processing and Speech Recognizing domains. In Proceedings of the LREC 2008 Workshop "New Challenges for NLP Frameworks", La Valletta, Malta, May.
Nancy Ide and Keith Suderman. 2009. Bridging the gaps: interoperability for GrAF, GATE, and UIMA. In Proceedings of the Third Linguistic Annotation Workshop, ACL-IJCNLP '09, pages 27-34, Stroudsburg, PA, USA. Association for Computational Linguistics.
Yoshinobu Kano, Luke McCrohon, Sophia Ananiadou, and Jun'ichi Tsujii. 2009. Integrated NLP evaluation system for pluggable evaluation metrics with extensive interoperable toolkit. In Proceedings of the Workshop on Software Engineering, Testing, and Quality Assurance for Natural Language Processing (SETQA-NLP 2009), pages 22-30, Boulder, Colorado, June. Association for Computational Linguistics.
Peter Kluegl, Martin Atzmueller, and Frank Puppe. 2009. TextMarker: A tool for rule-based information extraction. In Christian Chiarcos, Richard Eckart de Castilho,
and Manfred Stede, editors, Proceedings of the Biennial GSCL Conference 2009, 2nd Workshop. Gunter Narr Verlag.
Frédéric Meunier, Philippe Laval, Gaëlle Recourcé, and Sylvain Surcin. Kwaga : une chaîne UIMA d'analyse de contenu des mails. In First French-speaking meeting around the framework Apache UIMA at the 10th Libre Software Meeting, University of Nantes, July.
Sébastien Paumier. A Time-Efficient Token Representation for Parsers. In Proceedings of the EACL Workshop on Finite-State Methods in Natural Language Processing.
Laurent Romary and Nancy Ide. 2004. International Standard for a Linguistic Annotation Framework. Natural Language Engineering, 10(3-4), September.
Helmut Schmid. 1994. Probabilistic part-of-speech tagging using decision trees. In Proceedings of the Conference on New Methods in Language Processing, Manchester, UK.
Dhaval Thakker, Taha Osman, and Phil Lakin. GATE JAPE grammar tutorial, February 27.
Paul Thompson, Yoshinobu Kano, John McNaught, Steve Pettifer, Teresa Attwood, John Keane, and Sophia Ananiadou. 2011. Promoting interoperability of resources in META-SHARE. In Proceedings of the Workshop on Language Resources, Technology and Services in the Sharing Paradigm, pages 50-58, Chiang Mai, Thailand, November. Asian Federation of Natural Language Processing.
Karin Verspoor, William Baumgartner Jr., Christophe Roeder, and Lawrence Hunter. 2009. Abstracting the types away from a UIMA type system. In 2nd UIMA Workshop at Gesellschaft für Sprachtechnologie und Computerlinguistik (GSCL) Tagung, Germany, October.