ANNIC: Annotations in Context. Niraj Aswani, Valentin Tablan, Thomas Heitz. University of Sheffield
2 ANNIC Motivation
- Need for a corpus analysis tool
- Useful for authoring IE patterns for rules
ANNIC is an IR engine that can search over:
- Document content
- Meta-data (annotation types, features and values), for example: Person.gender == male
3 ANNIC
- is based on Apache Lucene technology
- can index any document supported by GATE
- is integrated in GATE as a Searchable Serial Datastore (SSD)
- has an advanced GUI that provides:
  - a view of annotation mark-up over the matched patterns
  - an interactive way of developing new patterns, e.g. title followed by noun that is always in upper case?
  - annotation statistics
4 How does it work?
- Integrated in GATE as a Searchable Serial Datastore (SSD)
- Initialization
  - Where to store
  - What to index and what to exclude
  - Context boundary (e.g. restricted to sentence or paragraph boundaries)
- Index actions are linked with datastore actions
  - When a document is saved, it is indexed, or re-indexed if already indexed
  - When a document is deleted, it is deleted from the index
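The link between datastore actions and index actions can be illustrated with a small sketch. This is not GATE's implementation: the class name `SearchableDatastoreSketch`, its methods, and the whitespace-based term index are all assumptions made for illustration.

```python
class SearchableDatastoreSketch:
    """Toy datastore that keeps a term index in step with stored documents.

    Hypothetical illustration only; it does not mirror the GATE API.
    """

    def __init__(self):
        self.documents = {}  # doc_id -> text
        self.index = {}      # term -> set of doc_ids containing it

    def _unindex(self, doc_id):
        # Drop every index entry that points at this document.
        for doc_ids in self.index.values():
            doc_ids.discard(doc_id)

    def save(self, doc_id, text):
        # Saving an already stored document re-indexes it from scratch.
        if doc_id in self.documents:
            self._unindex(doc_id)
        self.documents[doc_id] = text
        for term in text.lower().split():
            self.index.setdefault(term, set()).add(doc_id)

    def delete(self, doc_id):
        # Deleting a document also removes it from the index.
        if self.documents.pop(doc_id, None) is not None:
            self._unindex(doc_id)

    def search(self, term):
        return sorted(self.index.get(term.lower(), set()))


store = SearchableDatastoreSketch()
store.save("d1", "Mr Symond works for Creative Arts")
first = store.search("works")    # d1 is found
store.save("d1", "Mr Symond lives in LA")
second = store.search("works")   # stale entry is gone after re-indexing
store.delete("d1")
third = store.search("la")       # nothing left after deletion
```

The point of the design is that callers never touch the index directly; every change goes through `save` or `delete`, so the index cannot drift from the stored documents.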
5 Query Language
- JAPE pattern syntax
- String within quotes or without quotes, e.g. ubuntu
- {AnnotationType}, e.g. {Person}
- {AnnotationType == string}, e.g. {Organization == University of Sheffield}
- {AT.featureName == value}, e.g. {Person.gender == male}
- {AT.feature == value, AT.feature == value}, e.g. {Token.orth == upperinitial, Token.length == 3}
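To make the basic term forms concrete, here is a minimal sketch of how such a query could be split into terms. It is not ANNIC's parser: the function names and the tuple layout are hypothetical, and it handles only plain strings and `{...}` patterns, not operators or grouping.

```python
import re

# One query term: a {...} annotation pattern, a "quoted string", or a bare word.
TERM = re.compile(r'\{(?P<annot>[^{}]*)\}|"(?P<quoted>[^"]*)"|(?P<bare>[^\s{}"]+)')


def parse_constraint(text):
    """Split 'Type.feature == value' into (type, feature, value)."""
    left, sep, right = text.partition("==")
    value = right.strip() if sep else None
    ann_type, dot, feature = left.strip().partition(".")
    return (ann_type, feature if dot else None, value)


def parse_terms(query):
    """Return the query as a list of ('string', s) and ('annotation', [...]) terms."""
    terms = []
    for match in TERM.finditer(query):
        if match.group("annot") is not None:
            # Comma-separated constraints inside one pair of braces.
            parts = match.group("annot").split(",")
            terms.append(("annotation", [parse_constraint(p) for p in parts]))
        elif match.group("quoted") is not None:
            terms.append(("string", match.group("quoted")))
        else:
            terms.append(("string", match.group("bare")))
    return terms


for example in [
    "ubuntu",
    "{Person}",
    "{Organization == University of Sheffield }",
    "{Token.orth == upperinitial, Token.length == 3 }",
]:
    print(example, "->", parse_terms(example))
```

Values are kept as strings, and a value that itself contains a comma would be split incorrectly; a real parser would need proper quoting rules.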
6 Query Language
- Kleene operators + and *, but they need to be quantified
  - {Person}{Token}*3{Organization}: find all Person and Organization annotations within up to 3 tokens of each other
- Logical (OR) operator
  - {A}({B} | {C}) = ({A}{B}) | ({A}{C})
- Order and presence of query terms is very important
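Because the Kleene operators are bounded, a quantified term stands for a finite set of plain term sequences. The sketch below spells that out. It assumes that `*n` means zero to n occurrences and `+n` means one to n occurrences; the function names are hypothetical and this is not ANNIC's code.

```python
def expand_quantifier(term, op, limit):
    """List the term sequences a bounded Kleene operator stands for."""
    if op not in ("*", "+"):
        raise ValueError("op must be '*' or '+'")
    if limit < 1:
        raise ValueError("limit must be at least 1")
    shortest = 0 if op == "*" else 1  # assumed reading of * and +
    return [[term] * count for count in range(shortest, limit + 1)]


def expand_query(before, term, op, limit, after):
    """Expand one quantified term into plain alternative sequences."""
    return [before + middle + after
            for middle in expand_quantifier(term, op, limit)]


def distribute_or(prefix, alternatives):
    """Rewrite a prefix followed by an OR group as one sequence per alternative."""
    return [prefix + alternative for alternative in alternatives]


# {Person}{Token}*3{Organization} as four plain sequences.
for sequence in expand_query(["{Person}"], "{Token}", "*", 3, ["{Organization}"]):
    print(" ".join(sequence))

# {A} followed by the choice of {B} or {C}.
print(distribute_or(["{A}"], [["{B}"], ["{C}"]]))
```

Seen this way, the order and presence of terms matters because each expanded sequence is matched as a strict sequence of adjacent terms.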
7 DEMO!
8 Hands-on exercise
- Populate a corpus with documents
- Process with ANNIE, setting the output of all PRs to the ANNIC annotation set
- Create a searchable datastore, supplying the needed parameters
- Store the corpus there
- Go to the search tab on the datastore
- Enter some sample queries: {Person}
- Check what annotations are around (e.g. {Organization})
- Expand the pattern to find people near Organizations
9 Index Generation: Approach I
Based on start offsets

Text:         Mr  Symond  works  for  Creative  Arts  in  LA
Tokens:       T1  T2      T3     T4   T5        T6    T7  T8
Annotations:  Title, LastName, Person, Organization, Location

Token stream:
T1  Person, Title
T2  LastName
T3
T4
T5  Organization
T6
T7
T8  Location

{Title} {LastName} works for {Organization}                       T  T
{Person} {LastName} works for {Organization}                      T  F
{Title} {LastName} works for ({Token})+3 {Location}               T  T
{Title} {LastName} works for {Organization} {Token} {Location}    F  T
10 Index Generation: Approach II
Based on end offsets

Text:         Mr  Symond  works  for  Creative  Arts  in  LA
Tokens:       T1  T2      T3     T4   T5        T6    T7  T8
Annotations:  Title, LastName, Person, Organization, Location

Token stream:
T1  Title
T2  LastName, Person
T3
T4
T5
T6  Organization
T7
T8  Location

{Title} {LastName} works for {Organization}                       F  T
{Person} {LastName} works for {Organization}                      F  F
{Title} {LastName} works for ({Token})+3 {Location}               T  T
{Title} {LastName} works for {Organization} {Token} {Location}    F  T
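The two single-offset schemes can be compared with a small simulation. This is a sketch under stated assumptions, not ANNIC's Lucene-based code: each annotation is placed at one token position (its start or its end), a query is matched as terms at consecutive positions, and `({Token})+3` is read as one to three Token terms. Under those assumptions, and reading the first T/F column of each table as the result the index returns, the sketch reproduces that column for both approaches.

```python
TOKENS = ["Mr", "Symond", "works", "for", "Creative", "Arts", "in", "LA"]
# (type, first token, last token), using the 1-based positions T1..T8.
ANNOTATIONS = [
    ("Title", 1, 1), ("LastName", 2, 2), ("Person", 1, 2),
    ("Organization", 5, 6), ("Location", 8, 8),
]


def build_index(use_end_offset):
    """Map each token position to the set of terms stored there."""
    index = {pos: {"Token", word} for pos, word in enumerate(TOKENS, start=1)}
    for ann_type, start, end in ANNOTATIONS:
        index[end if use_end_offset else start].add(ann_type)
    return index


def matches(index, terms):
    """True if the terms occur at consecutive positions somewhere."""
    last_start = len(index) - len(terms) + 1
    return any(
        all(term in index[start + i] for i, term in enumerate(terms))
        for start in range(1, last_start + 1)
    )


PREFIX = ["Title", "LastName", "works", "for"]
QUERIES = [
    [PREFIX + ["Organization"]],
    [["Person", "LastName", "works", "for", "Organization"]],
    # ({Token})+3 expanded to one, two or three Token terms (assumption).
    [PREFIX + ["Token"] * n + ["Location"] for n in (1, 2, 3)],
    [PREFIX + ["Organization", "Token", "Location"]],
]


def run(use_end_offset):
    """Evaluate the four example queries against one indexing scheme."""
    index = build_index(use_end_offset)
    return [any(matches(index, terms) for terms in alternatives)
            for alternatives in QUERIES]


print("start offsets:", run(False))
print("end offsets:  ", run(True))
```

Both schemes go wrong for annotations that span more than one token, such as Person (T1 to T2) and Organization (T5 to T6), because a single position cannot say where the next term should begin.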
11 Index Generation: Approach III
Based on start + end offsets

Annotations and features:
Token (Mr):       string = Mr, orth = upperinitial, root = mr, pos = NNP
Token (Symonds):  string = Symonds, orth = upperinitial, root = symonds, pos = NNP
Person:           gender = male

Term                                                          Start Offset  End Offset
Token, Token.string == Mr, Token.orth == upperinitial,
  Token.root == mr, Token.pos == NNP                          1             1
Person, Person.gender == male                                 1             2
Token, Token.string == Symonds, Token.orth == upperinitial,
  Token.root == Symonds, Token.pos == NNP
12 Index Generation: Approach III
Based on start + end offsets

Text:         Mr  Symond  works  for  Creative  Arts  in  LA
Tokens:       T1  T2      T3     T4   T5        T6    T7  T8
Annotations:  Title, LastName, Person, Organization, Location

Token stream:
T1  Person.eo=T2, Title.eo=T1
T2  LastName.eo=T2
T3
T4
T5  Organization.eo=T6
T6
T7
T8  Location.eo=T8
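Storing the end offset next to each term lets a matcher jump to the position just after an annotation instead of assuming every term covers one token. The following sketch shows that idea on the example sentence. It is an illustration, not ANNIC's implementation; the data layout and function names are assumptions, and `({Token})+3` is again read as one to three Token terms.

```python
TOKENS = ["Mr", "Symond", "works", "for", "Creative", "Arts", "in", "LA"]
# (type, first token, last token), using the 1-based positions T1..T8.
ANNOTATIONS = [
    ("Title", 1, 1), ("LastName", 2, 2), ("Person", 1, 2),
    ("Organization", 5, 6), ("Location", 8, 8),
]


def build_index():
    """Map each start position to a list of (term, end position) pairs."""
    index = {pos: [("Token", pos), (word, pos)]
             for pos, word in enumerate(TOKENS, start=1)}
    for ann_type, start, end in ANNOTATIONS:
        index[start].append((ann_type, end))
    return index


def match_from(index, terms, start):
    """True if the terms match back to back, beginning at `start`."""
    if not terms:
        return True
    for term, end in index.get(start, []):
        # The next term must begin right after this one ends.
        if term == terms[0] and match_from(index, terms[1:], end + 1):
            return True
    return False


def matches(index, terms):
    return any(match_from(index, terms, start) for start in index)


PREFIX = ["Title", "LastName", "works", "for"]
QUERIES = [
    [PREFIX + ["Organization"]],
    [["Person", "LastName", "works", "for", "Organization"]],
    [PREFIX + ["Token"] * n + ["Location"] for n in (1, 2, 3)],
    [PREFIX + ["Organization", "Token", "Location"]],
]

index = build_index()
results = [any(matches(index, terms) for terms in alternatives)
           for alternatives in QUERIES]
print(results)
```

With both offsets available, {Person} followed by {LastName} no longer matches, since Person already covers T2, while {Organization} {Token} {Location} does, since the Token is looked up at T7 rather than T6.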
13 Search Optimization
Query: {Title} {LastName} works for {Organization} {Token} {Location}
- Parse the query into N sub-queries such that every sub-query matches the expression ({Token})* {Non-Token}
  - Q1 = {Title}, Q2 = {LastName}, Q3 = works for {Organization}, Q4 = {Token} {Location}
- Q2 is searched only within the result set of Q1
  - If Q1 returns 3 hits H1, H2 and H3, three queries are formed for Q2
  - Q2.so = H1.eo + 1, H2.eo + 1, H3.eo + 1
- Q3 is searched only within the result set of Q2
  - If Q2 says only H1 and H3 are correct
  - Q3.so = H1.eo + 1, H3.eo + 1
- Q4 is searched only within the result set of Q3
  - If Q3 says only H1 is valid
  - Q4.so = H1.eo
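The sub-query strategy can be sketched as follows. The code splits a query so that each sub-query ends with a non-Token term, then searches each sub-query only at the positions that follow the previous sub-query's hits (start offset = end offset + 1). It is a simplified illustration, not ANNIC's code: the index layout, the function names, and the rule that plain words count as Token terms are assumptions.

```python
TOKENS = ["Mr", "Symond", "works", "for", "Creative", "Arts", "in", "LA"]
ANNOTATIONS = [
    ("Title", 1, 1), ("LastName", 2, 2), ("Person", 1, 2),
    ("Organization", 5, 6), ("Location", 8, 8),
]


def is_token_term(term):
    # Assumption: plain words and {Token...} patterns count as Token terms.
    return not term.startswith("{") or term.startswith("{Token")


def split_subqueries(terms):
    """Cut the query after every non-Token term: ({Token})* {Non-Token}."""
    subqueries, current = [], []
    for term in terms:
        current.append(term)
        if not is_token_term(term):
            subqueries.append(current)
            current = []
    if current:
        subqueries.append(current)  # trailing Token-only terms
    return subqueries


def build_index():
    """Map each start position to (term, end position) pairs."""
    index = {pos: [("{Token}", pos), (word, pos)]
             for pos, word in enumerate(TOKENS, start=1)}
    for ann_type, start, end in ANNOTATIONS:
        index[start].append(("{%s}" % ann_type, end))
    return index


def match_at(index, terms, start):
    """End offsets of every match of `terms` that begins at `start`."""
    if not terms:
        return {start - 1}
    ends = set()
    for term, end in index.get(start, []):
        if term == terms[0]:
            ends |= match_at(index, terms[1:], end + 1)
    return ends


def search(index, terms):
    """Return end offsets of full matches, narrowing hits per sub-query."""
    subqueries = split_subqueries(terms)
    if not subqueries:
        return set()
    hits = set()
    for start in index:  # Q1 may start anywhere
        hits |= match_at(index, subqueries[0], start)
    for subquery in subqueries[1:]:
        # Qn.so = H.eo + 1 for each surviving hit H of the previous sub-query.
        hits = {end for eo in hits for end in match_at(index, subquery, eo + 1)}
        if not hits:
            break
    return hits


QUERY = ["{Title}", "{LastName}", "works", "for",
         "{Organization}", "{Token}", "{Location}"]
print(split_subqueries(QUERY))
print(search(build_index(), QUERY))
```

The benefit is that later sub-queries are never run over the whole index; a hit that fails at one stage is dropped and costs nothing afterwards.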