Comparison between historical population archives and decentralized databases


1 Comparison between historical population archives and decentralized databases
Marijn Schraagen, Dionysius Huijsmans
Leiden Institute of Advanced Computer Science (LIACS), Leiden University, The Netherlands
LaTeCH Workshop 2013

2 Research subject
Historical databases have increasingly become digitized
  Census data, civil registry, church records, trade records, ...
  Millions of interrelated records: historical social networks
  However, network structure is not given
Alternative data sources: personal and local archives
  Family trees, legal archives, ...
  Small amount of information
  Relations between records generally indicated and verified
Research goal: combine the information from different sources

3 Outline
  1. Introduction
  2. Matching
  3. Verification
  4. Application
  5. Conclusion

4 Motivation
Links between (historical) records are important for a wide range of applications
  Data Mining: graph traversal algorithms, community detection
  Humanities: migration patterns, family size, occupational development
  Linguistics: stability of spelling, morphology, phonetics
  Onomastics: name inheritance, geographical name distribution

5 Overview
First match records from databases X and Y, then identify complementary or conflicting links.
Diagram: birth record X1 is linked (La) to death record X2, and birth record Y1 is linked (Lb) to death record Y2; X1 is matched against Y1 and X2 against Y2, and links La and Lb are compared.
Example: if X1 = Y1 but X2 ≠ Y2, then either La or Lb or both are wrong.
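The consistency rule in the example can be sketched in a few lines of Python. This is only an illustration: the function name `links_agree`, the dictionary-based representation of matches, and the record identifiers are assumptions made for the sketch and do not come from the slides.

```python
def links_agree(matches, link_x, link_y):
    """Compare a link in database X with a link in database Y.

    matches: dict mapping a record id in X to its matched record id in Y.
    link_x:  (x1, x2), e.g. a birth record linked to a death record in X.
    link_y:  (y1, y2), the link to compare against in Y.

    Returns None if the first records do not match (nothing to compare),
    True if the second records match as well, and False otherwise.
    """
    (x1, x2), (y1, y2) = link_x, link_y
    if matches.get(x1) != y1:
        return None
    return matches.get(x2) == y2


# Hypothetical identifiers: X1 matches Y1, X2 matches Y2, X3 matches Y9.
matches = {"X1": "Y1", "X2": "Y2", "X3": "Y9"}
consistent = links_agree(matches, ("X1", "X2"), ("Y1", "Y2"))
conflicting = links_agree(matches, ("X1", "X3"), ("Y1", "Y2"))
print(consistent, conflicting)
```

A result of False corresponds to the case on the slide: the first records are the same, the second records differ, so at least one of the two links must be wrong.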

6 Data formats
Large-scale historical databases
  Syntax usually structured: XML, SQL, comma-separated
  Occasionally structured natural language is used
  Semantics generally based on events: birth, marriage, baptism, change of ownership
  Exception: census records
Family databases
  Syntax often the legacy Gedcom format: hierarchical level numbers and tags
  Semantics generally based on individuals and families

7 Example historical databases
Genlias civil certificate database
  Official registration of birth, marriage and death
  The Netherlands, million certificates (events)
Gedcom family archive
  Hand-compiled from various sources
  Mostly northern part of the Netherlands, 1600-now
  1750 records (individuals and families)
Overlap: 1100 events, of which 600 births

8 Data formats example
Civil certificate
  Type: birth certificate
  Serial number: 176
  Date:
  Place: Wonseradeel
  Child: Sierk Rolsma
  Father: Sjoerd Rolsma
  Mother: Agnes Weldring
Family archive
  FAM
  INDI
  1 NAME Agnes/Welderink/
  INDI
  1 NAME Sierk/Rolsma/
  1 BIRT
  2 DATE 16 MAY 1883


10 Parser
Grammar
  birth → [FAM:CHIL]:child, father, mother.
  child → bdate, bplace, name.
  father → [FAM:HUSB]:name.
  mother → [FAM:WIFE]:name.
  bdate → [INDI:BIRT:DATE].
  bplace → [INDI:BIRT:PLAC].
  name → [INDI:NAME].
Family archive
  FAM
  INDI
  1 NAME Agnes/Welderink/
  INDI
  1 NAME Sierk/Rolsma/
  1 BIRT
  2 DATE 16 MAY 1883
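To make the grammar concrete, the following Python sketch converts individual- and family-based Gedcom records into an event-based birth record with the fields child, father, mother, bdate and bplace. It is a hand-written illustration, not the authors' parser: the function names, the cross-reference ids (@I1@, @F1@, ...), and the father's entry in the sample are assumptions added so that the example is self-contained.

```python
def parse_gedcom(text):
    """Parse Gedcom lines into {xref: {tag_path: [values]}}."""
    records, current, path = {}, None, []
    for line in text.strip().splitlines():
        parts = line.split(maxsplit=2)
        if not parts:
            continue  # ignore blank lines
        level = int(parts[0])
        if level == 0:
            # Level-0 lines such as "0 @I1@ INDI" open a new record.
            record_type = parts[2] if len(parts) > 2 else ""
            current = records.setdefault(parts[1], {"_type": record_type})
            path = []
            continue
        tag = parts[1]
        value = parts[2] if len(parts) > 2 else ""
        # Keep the tags of the enclosing levels, e.g. BIRT:DATE at level 2.
        path = path[: level - 1] + [tag]
        current.setdefault(":".join(path), []).append(value)
    return records


def first(record, key):
    """Return the first value stored under key, or None if absent."""
    values = record.get(key, [])
    return values[0] if values else None


def birth_events(records):
    """Build one event-based birth record per [FAM:CHIL] entry."""
    events = []
    for fam in records.values():
        if fam["_type"] != "FAM":
            continue
        father = records.get(first(fam, "HUSB"), {})
        mother = records.get(first(fam, "WIFE"), {})
        for child_ref in fam.get("CHIL", []):
            child = records.get(child_ref, {})
            events.append({
                "child": first(child, "NAME"),
                "father": first(father, "NAME"),
                "mother": first(mother, "NAME"),
                "bdate": first(child, "BIRT:DATE"),
                "bplace": first(child, "BIRT:PLAC"),
            })
    return events


# Sample data: names and birth date follow the slide; ids are hypothetical.
SAMPLE = """
0 @I1@ INDI
1 NAME Sierk/Rolsma/
1 BIRT
2 DATE 16 MAY 1883
0 @I2@ INDI
1 NAME Sjoerd/Rolsma/
0 @I3@ INDI
1 NAME Agnes/Welderink/
0 @F1@ FAM
1 HUSB @I2@
1 WIFE @I3@
1 CHIL @I1@
"""

events = birth_events(parse_gedcom(SAMPLE))
print(events[0])
```

Fields that are absent from the archive, such as the birth place here, come out as None, so a later matching step can decide how to treat missing values.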

11 Record similarity measure
The parser provides uniform data for matching two records using similarity requirements for selected fields.
Example: birth certificate similarity
  Out of the four names of child and mother, at least two names are exactly equal.
  The year of birth is equal, or the difference in year of birth is within a small margin and the edit distance between the names is below some threshold.
If multiple candidates for matching a record are found, then the candidate with the smallest edit distance is selected.
Note that the definition is domain specific.
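The birth certificate similarity can be sketched as follows. The slides do not give the margin or the threshold, so the values of `YEAR_MARGIN` and `MAX_DISTANCE` below are placeholders chosen for illustration, and the field names and the summing of edit distances over the four names are likewise assumptions of this sketch.

```python
NAME_FIELDS = ("child_first", "child_last", "mother_first", "mother_last")
YEAR_MARGIN = 2    # assumption: the slides only say "a small margin"
MAX_DISTANCE = 3   # assumption: the slides only say "some threshold"


def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def name_distance(r, s):
    """Total edit distance over the four names (an assumed aggregation)."""
    return sum(edit_distance(r[f], s[f]) for f in NAME_FIELDS)


def is_birth_match(r, s):
    """Apply both requirements of the birth certificate similarity."""
    if sum(r[f] == s[f] for f in NAME_FIELDS) < 2:
        return False
    year_gap = abs(r["year"] - s["year"])
    if year_gap == 0:
        return True
    return year_gap <= YEAR_MARGIN and name_distance(r, s) < MAX_DISTANCE


def best_match(record, candidates):
    """Among matching candidates, pick the one with the smallest distance."""
    matching = [c for c in candidates if is_birth_match(record, c)]
    return min(matching, key=lambda c: name_distance(record, c), default=None)


certificate = {"child_first": "Sierk", "child_last": "Rolsma",
               "mother_first": "Agnes", "mother_last": "Weldring",
               "year": 1883}
archive = {"child_first": "Sierk", "child_last": "Rolsma",
           "mother_first": "Agnes", "mother_last": "Welderink",
           "year": 1883}

print(is_birth_match(certificate, archive))
```

For the Rolsma records, three of the four names are exactly equal and the years agree, so the pair matches without the edit distance being consulted.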

12 Matching example
Birth certificate similarity
  Out of the four names of child and mother, at least two names are exactly equal.
  The year of birth is equal, or the difference in year of birth is within a small margin and the edit distance between the names is below some threshold.
Civil certificate
  Date:
  Child: Sierk Rolsma
  Mother: Agnes Weldring
Family archive
  Date: 16 MAY 1883
  Child: Sierk Rolsma
  Mother: Agnes Welderink
Three out of four names equal (Sierk, Rolsma, Agnes), year of birth equal (1883): match

13 Matching results
Birth certificate similarity
  Out of the four names of child and mother, at least two names are exactly equal.
  The year of birth is equal, or the difference in year of birth is within a small margin and the edit distance between the names is below some threshold.
Birth matches: 361/611 (59%)
  Civil certificate database still in digitization phase
  Family database contains many peripheral individuals for which parent names and birth date are unknown
  Similarity measure could be improved
Cf. results for marriage certificate matching: 154/176 (88%)

14 Verification
Ideal case: gold standard
  Generally not available for historical databases
Large variation in domain and data quality
  Performance of matching algorithms obtained on one database is not indicative of performance on other databases
  Unlike, e.g., newspaper archives, archives, co-author networks, ...
Possible solution: internal verification

15 Internal verification
A similarity measure does not necessarily use all record fields for matching
Unused fields can provide a support level for a match
Example: the birth similarity measure used person names and year of birth
  Location, exact date of birth, and serial number can be used for verification
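A minimal sketch of internal verification is given below. It checks the fields that the similarity measure did not use and counts how many of them agree. The field names, the treatment of missing values, the simple count used as support level, and the date values in the example are all assumptions of this sketch; the slides do not specify how support is computed.

```python
VERIFICATION_FIELDS = ("serial", "location", "date")


def support(record_x, record_y):
    """Per-field agreement: True, False, or None when a value is missing."""
    result = {}
    for field in VERIFICATION_FIELDS:
        a, b = record_x.get(field), record_y.get(field)
        result[field] = None if a is None or b is None else a == b
    return result


def support_level(record_x, record_y):
    """Number of verification fields on which the two records agree."""
    return sum(1 for agrees in support(record_x, record_y).values() if agrees)


# Serial number and place follow the example certificate; the dates are
# hypothetical values added for illustration.
certificate = {"serial": 176, "location": "Wonseradeel", "date": "1883-05-16"}
archive = {"serial": None, "location": "Wonseradeel", "date": "1883-05-16"}

print(support(certificate, archive), support_level(certificate, archive))
```

Keeping missing values separate from disagreements matters here, because a family archive often lacks a field such as the serial number without that being evidence against the match.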

16 Verification results
(Table; columns: serial, location, date, dist, birth, marriage; final row: total)

17 Interpretation of support categories
(Table; columns: serial, location, date, dist, mean % unique; interpretation labels: ok, likely ok, manual check, incorrect; total: 361)

18 Application: link comparison
First match records from databases X and Y, then identify complementary or conflicting links.
Diagram: record X1 is linked (La) to record X2, and record Y1 is linked (Lb) to record Y2; X1 is matched against Y1 and X2 against Y2, and links La and Lb are compared.
Application: compare links from Gedcom family archive (given) to links between civil certificates (computed)
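One way to carry out this comparison over whole link sets is sketched below. Links from the family archive are translated to certificate identifiers through the record matches and then compared with the computed certificate links. The function name, the category names, and the identifiers are hypothetical; the slides describe the comparison only at the level of the diagram.

```python
def compare_links(archive_links, certificate_links, matches):
    """Classify links after translating archive ids to certificate ids.

    archive_links:     iterable of (a1, a2) pairs given in the family archive.
    certificate_links: iterable of (c1, c2) pairs computed between certificates.
    matches:           dict mapping an archive record id to a certificate id.
    """
    translated, untranslatable = set(), set()
    for a1, a2 in archive_links:
        if a1 in matches and a2 in matches:
            translated.add((matches[a1], matches[a2]))
        else:
            # At least one endpoint has no matched certificate.
            untranslatable.add((a1, a2))
    certificate_links = set(certificate_links)
    return {
        "confirmed": translated & certificate_links,
        "archive_only": translated - certificate_links,
        "certificates_only": certificate_links - translated,
        "untranslatable": untranslatable,
    }


# Hypothetical identifiers: A* are archive records, C* are certificates.
matches = {"A1": "C1", "A2": "C2", "A3": "C3"}
archive_links = [("A1", "A2"), ("A1", "A3"), ("A4", "A5")]
certificate_links = [("C1", "C2"), ("C1", "C7")]

result = compare_links(archive_links, certificate_links, matches)
print(result)
```

Links found on only one side are the complementary or conflicting cases that need inspection, and links with an unmatched endpoint cannot be compared at all.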

19 Visualization tool
(Figure: link tree with person names as node labels)
A tool is developed to explore the link tree
Red and blue: matched certificates have differences

20 Visualization tool
(Figure: link tree with person names as node labels)
Only red or blue: marriage from family archive without match in civil certificates, or vice versa

21 Visualization tool
(Figure: link tree with person names as node labels)
Records F19 and are a false negative match

22 Visualization tool
(Figure: link tree with person names as node labels)
Records F122, F123, F124 are outside of the civil certificate timeframe

23 Summary
Combining information from different databases in the same domain
  Syntactic and semantic parsing of records based on individuals to records based on events
  Matching using domain-specific similarity measures
  Match validation using additional record fields
  Application: visualization of link comparison

24 Future work
Scale up to more and larger databases
  Crowdsourcing is particularly suited to obtain data
Refine matching procedure
Public release of visualization tool

25 Acknowledgment This work is part of the research programme LINKS, which is financed by the Netherlands Organisation for Scientific Research (NWO), grant The authors would like to thank Tom Altena for the use of his Gedcom database.

Required Employment D Documents Document Options for Ve erifying Eligibility Legal S Spouse Eligibility requirements:

Required Employment D Documents Document Options for Ve erifying Eligibility Legal S Spouse Eligibility requirements: Required Employment Documents Below is a list of eligibility rules and documents required to verify the eligibility of each dependent. In some cases, at least TWO forms of documentation aree required.

More information

KEY TIPS FOR CENSUS SUCCESS

KEY TIPS FOR CENSUS SUCCESS KEY TIPS FOR CENSUS SUCCESS We're going to begin with tips that apply whichever census you're searching, and whichever site you're using. * Always allow for mistakes - for example, it's rarely advisable

More information

Open Data in the Netherlands - opportunities for innovation. Bob Coret Gaenovium 7 October 2014

Open Data in the Netherlands - opportunities for innovation. Bob Coret Gaenovium 7 October 2014 Open Data in the Netherlands - opportunities for innovation Bob Coret Gaenovium 7 October 2014 Open data defined - http://opendefinition.org/ Open data can be freely used, modified, and shared by anyone

More information

Natural Language to Relational Query by Using Parsing Compiler

Natural Language to Relational Query by Using Parsing Compiler Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 3, March 2015,

More information

JENNIFER TARDELLI, MA, LPC, NCC PSYCHOTHERAPY WOMEN S ISSUES

JENNIFER TARDELLI, MA, LPC, NCC PSYCHOTHERAPY WOMEN S ISSUES Personal History/Life Script Questionnaire What is your main reason for coming into therapy? What is your difficulty or personal problem? Give a recent example of when and how this problem occurred and

More information

Practice Direction 14C Reports by the Adoption Agency or Local Authority

Practice Direction 14C Reports by the Adoption Agency or Local Authority Practice Direction 14C Reports by the Adoption Agency or Local Authority This Practice Direction supplements FPR Part 14, rule 14.11(3) Matters to be contained in reports 1.1 The matters to be covered

More information

Mining the Software Change Repository of a Legacy Telephony System

Mining the Software Change Repository of a Legacy Telephony System Mining the Software Change Repository of a Legacy Telephony System Jelber Sayyad Shirabad, Timothy C. Lethbridge, Stan Matwin School of Information Technology and Engineering University of Ottawa, Ottawa,

More information

This is a publication of the Netherlands Ministry of Foreign Affairs. FAQ Same-sex marriage 2010

This is a publication of the Netherlands Ministry of Foreign Affairs. FAQ Same-sex marriage 2010 This is a publication of the Netherlands Ministry of Foreign Affairs FAQ Same-sex marriage 2010 1 1. Same-sex marriage 2. Alternative types of partnership 3. Conversion 4. Recognition abroad 5. Non-Dutch

More information

The VONK Ancestral line of Dirk Arie Vonk (1910-1986)

The VONK Ancestral line of Dirk Arie Vonk (1910-1986) The VONK Ancestral line of Dirk Arie Vonk (1910-1986) By Richard L. Tolman, Ph. D. First Generation 1. Pieter van der Vonk was born abt 1660 of Hazerswoude, Zuid-Holland, Netherlands. 1 2 i. SIJMON VAN

More information

Business Process Discovery

Business Process Discovery Sandeep Jadhav Introduction Well defined, organized, implemented, and managed Business Processes are very critical to the success of any organization that wants to operate efficiently. Business Process

More information

Search and Information Retrieval

Search and Information Retrieval Search and Information Retrieval Search on the Web 1 is a daily activity for many people throughout the world Search and communication are most popular uses of the computer Applications involving search

More information

INSTRUCTIONS FOR COMPLETING THE PETITION TO CORRECT A BIRTH CERTIFICATE

INSTRUCTIONS FOR COMPLETING THE PETITION TO CORRECT A BIRTH CERTIFICATE INSTRUCTIONS FOR COMPLETING THE PETITION TO CORRECT A BIRTH CERTIFICATE About the Petition Who can file a petition to amend a birth certificate? You can only apply to amend a birth certificate if you are

More information

Follow your family using census records

Follow your family using census records Census records are one of the best ways to discover details about your family and how that family changed every 10 years. You ll discover names, addresses, what people did for a living, even which ancestor

More information

CHAPTER 1 INTRODUCTION

CHAPTER 1 INTRODUCTION CHAPTER 1 INTRODUCTION 1.1 Research Motivation In today s modern digital environment with or without our notice we are leaving our digital footprints in various data repositories through our daily activities,

More information

Visualization methods for patent data

Visualization methods for patent data Visualization methods for patent data Treparel 2013 Dr. Anton Heijs (CTO & Founder) Delft, The Netherlands Introduction Treparel can provide advanced visualizations for patent data. This document describes

More information

GEO-VISUALIZATION SUPPORT FOR MULTIDIMENSIONAL CLUSTERING

GEO-VISUALIZATION SUPPORT FOR MULTIDIMENSIONAL CLUSTERING Geoinformatics 2004 Proc. 12th Int. Conf. on Geoinformatics Geospatial Information Research: Bridging the Pacific and Atlantic University of Gävle, Sweden, 7-9 June 2004 GEO-VISUALIZATION SUPPORT FOR MULTIDIMENSIONAL

More information

Simplifying e Business Collaboration by providing a Semantic Mapping Platform

Simplifying e Business Collaboration by providing a Semantic Mapping Platform Simplifying e Business Collaboration by providing a Semantic Mapping Platform Abels, Sven 1 ; Sheikhhasan Hamzeh 1 ; Cranner, Paul 2 1 TIE Nederland BV, 1119 PS Amsterdam, Netherlands 2 University of Sunderland,

More information

ESTATE PLANNING INFORMATION

ESTATE PLANNING INFORMATION ESTATE PLANNING INFORMATION I PERSONAL AND FAMILY DATA A. Husband Husband's Name: (First) (Middle) (Last) Social Security Number Home Phone Employer Business Address Business Phone Place of Birth: U.S.

More information

ALIAS: A Tool for Disambiguating Authors in Microsoft Academic Search

ALIAS: A Tool for Disambiguating Authors in Microsoft Academic Search Project for Michael Pitts Course TCSS 702A University of Washington Tacoma Institute of Technology ALIAS: A Tool for Disambiguating Authors in Microsoft Academic Search Under supervision of : Dr. Senjuti

More information

Deposit Identification Utility and Visualization Tool

Deposit Identification Utility and Visualization Tool Deposit Identification Utility and Visualization Tool Colorado School of Mines Field Session Summer 2014 David Alexander Jeremy Kerr Luke McPherson Introduction Newmont Mining Corporation was founded in

More information

Levels of legal consequences of marriage, cohabitation and registered partnership for different-sex and same-sex partners:

Levels of legal consequences of marriage, cohabitation and registered partnership for different-sex and same-sex partners: C O M P A R A T I V E O V E R V I E W Levels of legal consequences of marriage, cohabitation and registered partnership for different-sex and same-sex partners: Comparative overview by Kees Waaldijk 1

More information

Data ownership within governance: getting it right

Data ownership within governance: getting it right Data ownership within governance: getting it right Control your data An Experian white paper Data Ownership within Governance : Getting it right - 1 Table of contents 1. Introduction 03 2. Why is data

More information

L A W ON ELECTRONIC DOCUMENT I. GENERAL PROVISIONS. Scope of the Law

L A W ON ELECTRONIC DOCUMENT I. GENERAL PROVISIONS. Scope of the Law L A W ON ELECTRONIC DOCUMENT I. GENERAL PROVISIONS Scope of the Law Article 1 This Law shall regulate the conditions and manner of handling of electronic document in legal transactions, administrative,

More information

Expanding the CASEsim Framework to Facilitate Load Balancing of Social Network Simulations

Expanding the CASEsim Framework to Facilitate Load Balancing of Social Network Simulations Expanding the CASEsim Framework to Facilitate Load Balancing of Social Network Simulations Amara Keller, Martin Kelly, Aaron Todd 4 June 2010 Abstract This research has two components, both involving the

More information

Making the most of your conference poster. Dr Krystyna Haq Graduate Education Officer Graduate Research School

Making the most of your conference poster. Dr Krystyna Haq Graduate Education Officer Graduate Research School Making the most of your conference poster Dr Krystyna Haq Graduate Education Officer Graduate Research School Why present a conference poster? Why present a conference poster? communicate a message (your

More information

Ancestors of Mildred A. Slaugh

Ancestors of Mildred A. Slaugh Ancestors of Mildred A. Slaugh Generation 1 1. Mildred A. Slaugh, daughter of Dick Jan Slagh and Clara Raak was born on 22-10-1903 in Michigan, USA. She died. She married Dick D. Wiersema, son of David

More information

Graph-Based Linking and Visualization for Legislation Documents (GLVD) Dincer Gultemen & Tom van Engers

Graph-Based Linking and Visualization for Legislation Documents (GLVD) Dincer Gultemen & Tom van Engers Graph-Based Linking and Visualization for Legislation Documents (GLVD) Dincer Gultemen & Tom van Engers Demand of Parliaments Semi-structured information and semantic technologies Inter-institutional business

More information

HP Quality Center. Upgrade Preparation Guide

HP Quality Center. Upgrade Preparation Guide HP Quality Center Upgrade Preparation Guide Document Release Date: November 2008 Software Release Date: November 2008 Legal Notices Warranty The only warranties for HP products and services are set forth

More information

Nowcasting of significant convection by application of cloud tracking algorithm to satellite and radar images

Nowcasting of significant convection by application of cloud tracking algorithm to satellite and radar images Nowcasting of significant convection by application of cloud tracking algorithm to satellite and radar images Ng Ka Ho, Hong Kong Observatory, Hong Kong Abstract Automated forecast of significant convection

More information

Ancestors of Pleuntje (Pearl) van der Maarel

Ancestors of Pleuntje (Pearl) van der Maarel Ancestors of Pleuntje (Pearl) van der Maarel Generation 1 1. Pleuntje (Pearl) van der Maarel, daughter of Cornelis van der Maarel and Dina Adriana van Gerven was born on 10-8-1907 in Schiedam. She died

More information

A GUIDE TO WRITING YOUR LIFE STORY. Identifying information: Name and maiden name, date and place of birth:

A GUIDE TO WRITING YOUR LIFE STORY. Identifying information: Name and maiden name, date and place of birth: 1 A GUIDE TO WRITING YOUR LIFE STORY You are welcome to answer the following questions here or write your own essay on separate paper. If you would like to complete this on your computer ask your trainer

More information

APPLICATION TO AMEND CERTIFICATE OF BIRTH

APPLICATION TO AMEND CERTIFICATE OF BIRTH APPLICATION TO AMEND CERTIFICATE OF BIRTH STATE OF LOUISIANA DHH/OPH/Vital Records Packet 18, Rev 08/04 Applicant s Name: Last First Middle Street Address: City: Tel No State: Zip Code: Signature: Relationship

More information

Towards Software Configuration Management for Test-Driven Development

Towards Software Configuration Management for Test-Driven Development Towards Software Configuration Management for Test-Driven Development Tammo Freese OFFIS, Escherweg 2, 26121 Oldenburg, Germany [email protected] Abstract. Test-Driven Development is a technique where

More information

Do Code Clones Matter?

Do Code Clones Matter? Elmar Juergens, Florian Deissenboeck, Benjamin Hummel, Stefan Wagner Do Code Clones Matter? May 22 nd, 2009 31st ICSE, Vancouver 1 Code Clone 2 Agenda Related Work Empirical Study Detection of inconsistent

More information

THE BACHELOR S DEGREE IN SPANISH

THE BACHELOR S DEGREE IN SPANISH Academic regulations for THE BACHELOR S DEGREE IN SPANISH THE FACULTY OF HUMANITIES THE UNIVERSITY OF AARHUS 2007 1 Framework conditions Heading Title Prepared by Effective date Prescribed points Text

More information

GCE. Computing. Mark Scheme for January 2011. Advanced Subsidiary GCE Unit F452: Programming Techniques and Logical Methods

GCE. Computing. Mark Scheme for January 2011. Advanced Subsidiary GCE Unit F452: Programming Techniques and Logical Methods GCE Computing Advanced Subsidiary GCE Unit F452: Programming Techniques and Logical Methods Mark Scheme for January 2011 Oxford Cambridge and RSA Examinations OCR (Oxford Cambridge and RSA) is a leading

More information

How To Teach English To Other People

How To Teach English To Other People TESOL / NCATE Program Standards STANDARDS FOR THE ACCREDIATION OF INITIAL PROGRAMS IN P 12 ESL TEACHER EDUCATION Prepared and Developed by the TESOL Task Force on ESL Standards for P 12 Teacher Education

More information

2014/02/13 Sphinx Lunch

2014/02/13 Sphinx Lunch 2014/02/13 Sphinx Lunch Best Student Paper Award @ 2013 IEEE Workshop on Automatic Speech Recognition and Understanding Dec. 9-12, 2013 Unsupervised Induction and Filling of Semantic Slot for Spoken Dialogue

More information

WHITE PAPER HOW TO REDUCE RISK, ERROR, COMPLEXITY AND DRIVE COSTS IN THE ACCOUNTS PAYABLE PROCESS

WHITE PAPER HOW TO REDUCE RISK, ERROR, COMPLEXITY AND DRIVE COSTS IN THE ACCOUNTS PAYABLE PROCESS WHITE PAPER HOW TO REDUCE RISK, ERROR, COMPLEXITY AND DRIVE COSTS IN THE ACCOUNTS PAYABLE PROCESS Based on a benchmark study of 250 companies with a total of more than 900 billion euro in Accounts Payable

More information

COCOVILA Compiler-Compiler for Visual Languages

COCOVILA Compiler-Compiler for Visual Languages LDTA 2005 Preliminary Version COCOVILA Compiler-Compiler for Visual Languages Pavel Grigorenko, Ando Saabas and Enn Tyugu 1 Institute of Cybernetics, Tallinn University of Technology Akadeemia tee 21 12618

More information

05-12-2011 Framework Contract no: DI/06691-00 Authors: P. Wauters, K. Declercq, S. van der Peijl, P. Davies

05-12-2011 Framework Contract no: DI/06691-00 Authors: P. Wauters, K. Declercq, S. van der Peijl, P. Davies Ref. Ares(2011)1311113 Ares(2012)149022-09/02/2012 06/12/2011 Study on cloud and service oriented architectures for e- government Final report summary 05-12-2011 Framework Contract no: DI/06691-00 Authors:

More information

Domain Adaptive Relation Extraction for Big Text Data Analytics. Feiyu Xu

Domain Adaptive Relation Extraction for Big Text Data Analytics. Feiyu Xu Domain Adaptive Relation Extraction for Big Text Data Analytics Feiyu Xu Outline! Introduction to relation extraction and its applications! Motivation of domain adaptation in big text data analytics! Solutions!

More information

Foundations of Information Management

Foundations of Information Management Foundations of Information Management - WS 2012/13 - Juniorprofessor Alexander Markowetz Bonn Aachen International Center for Information Technology (B-IT) Data & Databases Data: Simple information Database:

More information

STEPS FOR APPLYING FOR FINANCIAL AID

STEPS FOR APPLYING FOR FINANCIAL AID STEPS FOR APPLYING FOR FINANCIAL AID 1. Determine if you are a dependent student or an independent student. A student is only an independent student if he or she is: a. 24 or older at the beginning of

More information

BIRTH CERTIFICATES FOR THE STATE OF ARIZONA

BIRTH CERTIFICATES FOR THE STATE OF ARIZONA VIP SERVICES 2012 Louisiana Street Houston, Texas 77002 713-659-8472 1-800-856-8472 Fax 713-659-3767 Website: www.vippassports.com Email: [email protected] BIRTH CERTIFICATES FOR THE STATE OF ARIZONA

More information

RULES ALABAMA STATE BOARD OF HEALTH ALABAMA DEPARTMENT OF PUBLIC HEALTH CHAPTER 420-7-1 VITAL STATISTICS REVISED: FEBRUARY 2014

RULES ALABAMA STATE BOARD OF HEALTH ALABAMA DEPARTMENT OF PUBLIC HEALTH CHAPTER 420-7-1 VITAL STATISTICS REVISED: FEBRUARY 2014 RULES OF ALABAMA STATE BOARD OF HEALTH ALABAMA DEPARTMENT OF PUBLIC HEALTH CHAPTER 420-7-1 VITAL STATISTICS REVISED: FEBRUARY 2014 1 RULES OF ALABAMA STATE BOARD OF HEALTH ALABAMA DEPARTMENT OF PUBLIC

More information

Learning Translation Rules from Bilingual English Filipino Corpus

Learning Translation Rules from Bilingual English Filipino Corpus Proceedings of PACLIC 19, the 19 th Asia-Pacific Conference on Language, Information and Computation. Learning Translation s from Bilingual English Filipino Corpus Michelle Wendy Tan, Raymond Joseph Ang,

More information

Report to the Council of Australian Governments. A Review of the National Identity Security Strategy

Report to the Council of Australian Governments. A Review of the National Identity Security Strategy Report to the Council of Australian Governments A Review of the National Identity Security Strategy 2012 Report to COAG - Review of the National Identity Security Strategy 2012 P a g e i Table of contents

More information

Figure 1: Architecture of a cloud services model for a digital education resource management system.

Figure 1: Architecture of a cloud services model for a digital education resource management system. World Transactions on Engineering and Technology Education Vol.13, No.3, 2015 2015 WIETE Cloud service model for the management and sharing of massive amounts of digital education resources Binwen Huang

More information

OPERATING ENGINEERS TRUST FUNDS

OPERATING ENGINEERS TRUST FUNDS OPERATING ENGINEERS TRUST FUNDS 1640 South Loop Road Alameda, CA 94502 P.O. Box 23190 Oakland, CA 94623-0190 Telephone (510) 433-4422 or (510) 271-0222 or Claims Department (800) 251-5013 Pension Department

More information

AMERICAN GENEALOGY: HOME STUDY COURSE

AMERICAN GENEALOGY: HOME STUDY COURSE AMERICAN GENEALOGY: HOME STUDY COURSE Syllabus NGS AMERICAN GENEALOGY HOME STUDY COURSE SYLLABUS Copyright 2009 National Genealogical Society 3108 Columbia Pike, Suite 300 Arlington, Virginia 22204-4370

More information

Building a Question Classifier for a TREC-Style Question Answering System

Building a Question Classifier for a TREC-Style Question Answering System Building a Question Classifier for a TREC-Style Question Answering System Richard May & Ari Steinberg Topic: Question Classification We define Question Classification (QC) here to be the task that, given

More information

Analysis and Forecasting for Own Source Revenues in the Municipality of

Analysis and Forecasting for Own Source Revenues in the Municipality of Swiss Kosovo Local Governance and Decentralization Support LOGOS Analysis and Forecasting for Own Source Revenues in the Municipality of KLLOKOT This report was prepared by RECURA Financials for the LOGOS

More information

KNOWLEDGE ORGANIZATION

KNOWLEDGE ORGANIZATION KNOWLEDGE ORGANIZATION Gabi Reinmann Germany [email protected] Synonyms Information organization, information classification, knowledge representation, knowledge structuring Definition The term

More information

Selecting a Taxonomy Management Tool. Wendi Pohs InfoClear Consulting #SLATaxo

Selecting a Taxonomy Management Tool. Wendi Pohs InfoClear Consulting #SLATaxo Selecting a Taxonomy Management Tool Wendi Pohs InfoClear Consulting #SLATaxo InfoClear Consulting What do we do? Content Analytics Strategy and Implementation, including: Taxonomy/Ontology development

More information

USER MODELLING IN ADAPTIVE DIALOGUE MANAGEMENT

USER MODELLING IN ADAPTIVE DIALOGUE MANAGEMENT USER MODELLING IN ADAPTIVE DIALOGUE MANAGEMENT Gert Veldhuijzen van Zanten IPO, Center for Research on User-System Interaction, P.O. Box 213, 5600 MB Eindhoven, the Netherlands [email protected]

More information

Data collection architecture for Big Data

Data collection architecture for Big Data Data collection architecture for Big Data a framework for a research agenda (Research in progress - ERP Sense Making of Big Data) Wout Hofman, May 2015, BDEI workshop 2 Big Data succes stories bias our

More information

Clustering Connectionist and Statistical Language Processing

Clustering Connectionist and Statistical Language Processing Clustering Connectionist and Statistical Language Processing Frank Keller [email protected] Computerlinguistik Universität des Saarlandes Clustering p.1/21 Overview clustering vs. classification supervised

More information

Applying for a passport from outside the UK Supporting Documents

Applying for a passport from outside the UK Supporting Documents Applying for a from outside Supporting Documents Group 1: Your application will be delayed if you don t include all your supporting. If we have to write to you for missing, or additional, you ll need to

More information

Improving Your Use of FamilySearch: Data Cleanup Strategies Geoffrey D. Rasmussen [email protected]

Improving Your Use of FamilySearch: Data Cleanup Strategies Geoffrey D. Rasmussen Geoff@LegacyFamilyTree.com Improving Your Use of FamilySearch: Data Cleanup Strategies Geoffrey D. Rasmussen [email protected] Researchers should be careful when publishing/sharing information from their family file with

More information
