ehg New Trends in e-Humanities Amsterdam
- Phebe Nichols
- 8 years ago
Transcription
1 ehg New Trends in e-Humanities Amsterdam
2 Overview 1) Dialect geography 2) A unified structure for Dutch dialect dictionary data 3) Dialectgebieden in Brabant. Geografische clustering op basis van de ruwe lexicale gegevens van het Woordenboek van de Brabantse Dialecten 4) Visualization as a Research Tool for Dialect Geography Using a Geobrowser
3 1 Dialect geography
4 Dialect geography What dialect differences are there between villages or towns? What dialect areas are there? What relations are there between dialect areas and other geographic data?
5 Dialect geography Dictionary of the Brabantic Dialects (WBD)
8 2 A unified structure for Dutch dialect dictionary data Folkert de Vriend, Lou Boves, Henk van den Heuvel, Roeland van Hout, Joep Kruijsen, Jos Swanenberg (2006). In: Proceedings of The fifth international conference on Language Resources and Evaluation (LREC 2006), Genoa, Italy, pp
9 Dialect geography Dictionary of the Brabantic Dialects (WBD), Dictionary of the Limburgian Dialects (WLD), Dictionary of the Flemish Dialects (WVD), Dictionary of the Zeelandish Dialects (WZD)
10 Total research area
11 Form based dictionary: Dictionary of the Zeelandish Dialects (WZD)
12 Sense based dictionary: Dictionary of the Brabantic Dialects (WBD)
13 Map from Dictionary of the Brabantic Dialects (WBD)
14 Unification: Map showing unified data from WBD, WLD, WZD, WVD
15 Unification: Map showing unified data from WBD and WLD (CLARIN COAVA (Cornips et al. 2011)) Frog (Kikker)
16 Towards a unified structure based on standards All dialect dictionary projects use the same core data types: form, sense and location.
17 WBD
18 WVD
19 Dictionary of the Zeelandish Dialects (WZD)
20 Mapping core data onto the LMF core model Although the data organisation is either form based or sense based, the core data types have the same heterarchical relation
21 LMF XML implementation for core data
<LexicalResource name="unified Dialect Lexicon">
  <Lexicon name="wbd">
    <LexicalEntry>
      <Form>ulling</Form>
      <Sense>Fret</Sense>
      <Location>Rosmalen</Location>
    </LexicalEntry>
  </Lexicon>
  <Lexicon name="wvd">
    <LexicalEntry>
      <Form>voejerkuil</Form>
      <Sense>Groenvoerkuil</Sense>
      <Location>K136a</Location>
    </LexicalEntry>
  </Lexicon>
  <Lexicon name="wzd">
    <LexicalEntry>
      <Form>aerdwurm</Form>
      <Sense>dauwworm</Sense>
      <Location>Z.eil.</Location>
    </LexicalEntry>
  </Lexicon>
</LexicalResource>
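As an illustration of how such a unified resource could be consumed, the following minimal Python sketch reads the three core data types back out of the XML. It embeds a well-formed copy of the slide's example; the function name `read_entries` and the tuple layout are assumptions made for this sketch, not part of the LMF standard or of the authors' tooling.

```python
import xml.etree.ElementTree as ET

# Well-formed copy of the example from the slide (element names as shown there).
LMF_XML = """
<LexicalResource name="unified Dialect Lexicon">
  <Lexicon name="wbd">
    <LexicalEntry><Form>ulling</Form><Sense>Fret</Sense><Location>Rosmalen</Location></LexicalEntry>
  </Lexicon>
  <Lexicon name="wvd">
    <LexicalEntry><Form>voejerkuil</Form><Sense>Groenvoerkuil</Sense><Location>K136a</Location></LexicalEntry>
  </Lexicon>
  <Lexicon name="wzd">
    <LexicalEntry><Form>aerdwurm</Form><Sense>dauwworm</Sense><Location>Z.eil.</Location></LexicalEntry>
  </Lexicon>
</LexicalResource>
"""


def read_entries(xml_text):
    """Return one (lexicon, form, sense, location) tuple per LexicalEntry."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for lexicon in root.findall("Lexicon"):
        for entry in lexicon.findall("LexicalEntry"):
            rows.append((
                lexicon.get("name"),
                entry.findtext("Form").strip(),
                entry.findtext("Sense").strip(),
                entry.findtext("Location").strip(),
            ))
    return rows


entries = read_entries(LMF_XML)
for row in entries:
    print(row)
```

Because every dictionary is mapped onto the same three elements, one loop covers WBD, WVD and WZD entries alike, regardless of whether the source dictionary was form based or sense based.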
22 Additional information needed for specifying core data types. Form: Type (Lexical, Phonetic), Alphabet (IPA, Genoveva, Latin). Sense: Type (Concept, Meaning). Location: Type (Placename, Area, Kloeke code). Standardisation: where possible, convert data to the same standard (e.g. location: longitude/latitude). Minimal: map type and alphabet labels to ISOcat.
23 Additional information needed for classifying core data types. Form: Class: phonetic forms have a lexical classification in WBD, WLD and WVD. Sense: Class: taxonomy. Location: Class: geopolitical taxonomy. Unified classifications can be used to provide access to the unified data.
24 Conclusions All dictionaries share the same core data types. A unified structure built around these core data types will enable: using the data of the different dictionaries as one huge dataset; organising the data based on either sense, form, or location. This will enable different perspectives on the data.
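The idea of organising the same data by sense, form, or location can be sketched in a few lines of Python. The triples below reuse the three entries from the LMF example plus one invented row (marked in the code) so that grouping has something to merge; `organise_by` is a hypothetical helper, not an existing tool.

```python
from collections import defaultdict

# (form, sense, location) triples. The first three come from the LMF example;
# the last one is invented purely to show two forms sharing one sense.
TRIPLES = [
    ("ulling", "Fret", "Rosmalen"),
    ("voejerkuil", "Groenvoerkuil", "K136a"),
    ("aerdwurm", "dauwworm", "Z.eil."),
    ("fret", "Fret", "K136a"),  # invented illustration row
]

FIELDS = {"form": 0, "sense": 1, "location": 2}


def organise_by(triples, field):
    """Group the triples by one of the three core data types."""
    index = FIELDS[field]
    grouped = defaultdict(list)
    for triple in triples:
        grouped[triple[index]].append(triple)
    return dict(grouped)


by_sense = organise_by(TRIPLES, "sense")        # sense-based perspective
by_location = organise_by(TRIPLES, "location")  # location-based perspective
print(by_sense["Fret"])
print(by_location["K136a"])
```

The same list yields a sense-based, form-based, or location-based view simply by changing the grouping key, which is the kind of perspective switch the unified structure is meant to allow.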
25 3 Dialectgebieden in Brabant. Geografische clustering op basis van de ruwe lexicale gegevens van het Woordenboek van de Brabantse Dialecten [Dialect areas in Brabant. Geographic clustering based on the raw lexical data of the Dictionary of the Brabantic Dialects] Folkert de Vriend, Jos Swanenberg, Roeland van Hout (2007). In: Taal en Tongval, themanummer 20, Dialectlexicografie, pp
26 Introduction Computational analysis of the data that were collected for WBD, using cluster analysis. Cluster analysis is defined in Jain and Dubes (1988) as the process of classifying objects into subsets that have meaning in the context of a particular problem. In a dialect geographic context the problem can be finding dialect areas. Aim: to see if we could find detailed dialect patterns in Brabant based on lexical data only. The dialect patterns that we found were compared to the dialect map of Belemans and Goossens (2000).
27 Belemans and Goossens (2000): Represents a traditional view on the classification of the Brabant area. Based on qualitative analyses of several types of data.
28 Data selection Only a subselection of the WBD data could be used: Part III of WBD. Data collected in the Nijmeegse enquête (the Nijmegen survey). These were collected for the whole research area and not just for subareas of Brabant. Only the data for the core data types: concept, lexical form and location. Resulting data selection: a data matrix with 614,941 lexical forms for 4229 concepts and 639 locations.
29 Method 1) Lexical distances were computed for all location pairs in the data set (RuG/L04). 2) Using these lexical distances, the locations were grouped together using cluster analysis (RuG/L04). 3) To interpret the resulting groups of locations, these were converted to a KML symbol map (cartographic software developed by Meertens and Radboud University). 4) This symbol map was overlaid onto the map of Belemans and Goossens so that (mis)matches between the two maps could be visually inspected (Geobrowser).
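Steps 1 and 2 can be illustrated with a small, self-contained Python sketch. It does not reproduce RuG/L04: the distance measure (share of shared concepts with differing forms) and the average-linkage clustering are stand-ins chosen for brevity, and the locations, concepts and forms are invented toy data.

```python
from itertools import combinations

# Invented toy data: location -> {concept: lexical form}.
DATA = {
    "A": {"frog": "kikker", "ferret": "ulling", "worm": "pier"},
    "B": {"frog": "kikker", "ferret": "ulling", "worm": "worm"},
    "C": {"frog": "puit", "ferret": "fret", "worm": "worm"},
    "D": {"frog": "puit", "ferret": "fret", "worm": "pier"},
}


def lexical_distance(forms_a, forms_b):
    """Share of shared concepts whose forms differ; None if nothing is shared."""
    shared = forms_a.keys() & forms_b.keys()
    if not shared:
        return None
    differing = sum(forms_a[c] != forms_b[c] for c in shared)
    return differing / len(shared)


def distance_matrix(data):
    """Step 1: one distance per unordered pair of locations."""
    return {
        frozenset((a, b)): lexical_distance(data[a], data[b])
        for a, b in combinations(sorted(data), 2)
    }


def cluster(data, k):
    """Step 2: average-linkage agglomerative clustering down to k clusters.

    Assumes every pair has a distance (no None values).
    """
    dist = distance_matrix(data)
    clusters = [{loc} for loc in sorted(data)]

    def linkage(c1, c2):
        values = [dist[frozenset((a, b))] for a in c1 for b in c2]
        return sum(values) / len(values)

    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters


print(cluster(DATA, 2))
```

Note that `cluster` breaks as soon as one pairwise distance is missing, which is the same limitation the study ran into with its sparse data.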
30 Data characteristics that were problematic for the method The Nijmeegse enquête covered the whole research area, but a lexical form was not recorded for every concept in every location. Result: the data matrix was a huge gatenkaas (Swiss cheese): 83.3% of the cells in the data matrix were empty. The distance matrix was also a gatenkaas, since often no distance could be calculated for a pair of locations. This was problematic for the cluster algorithms we used, since they cannot deal with missing distances.
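A short sketch of the two symptoms described here, using an invented 3 x 3 toy matrix: the share of empty cells, and the location pairs for which no distance can be computed because they share no concept. Function names and data are assumptions for illustration only.

```python
from itertools import combinations

# Invented toy matrix: (concept, location) -> recorded lexical forms.
CONCEPTS = ["frog", "ferret", "worm"]
LOCATIONS = ["A", "B", "C"]
MATRIX = {
    ("frog", "A"): ["kikker"],
    ("ferret", "B"): ["ulling"],
    ("worm", "B"): ["pier"],
    ("worm", "C"): ["worm"],
}


def empty_share(matrix, concepts, locations):
    """Fraction of concept x location cells without any lexical form."""
    filled = sum(1 for c in concepts for loc in locations if matrix.get((c, loc)))
    return 1 - filled / (len(concepts) * len(locations))


def pairs_without_distance(matrix, concepts, locations):
    """Location pairs that share no concept, so no distance is defined."""
    missing = []
    for a, b in combinations(locations, 2):
        shared = [c for c in concepts if matrix.get((c, a)) and matrix.get((c, b))]
        if not shared:
            missing.append((a, b))
    return missing


print(round(empty_share(MATRIX, CONCEPTS, LOCATIONS), 3))
print(pairs_without_distance(MATRIX, CONCEPTS, LOCATIONS))
```

In the toy matrix five of nine cells are empty, and two of the three location pairs end up without a distance, so holes in the data matrix translate directly into holes in the distance matrix.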
31 Solution 1 Create plausible lexical distances for the empty cells using the lexical distance to locations that are geographically near (imputation). Cluster analysis set to yield nine clusters showed that it was possible to find dialect areas for Brabant based on lexical data.
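The slide does not spell out the imputation rule, so the following Python sketch shows only one plausible reading: a missing distance between a and b is estimated from the known distances to b of the k locations geographically nearest to a. Coordinates, distances and the function `impute` are all invented for this illustration.

```python
import math

# Invented planar coordinates (km) and lexical distances; None marks a gap.
COORDS = {"A": (0, 0), "B": (1, 0), "C": (10, 0), "D": (11, 0)}
DIST = {
    frozenset(("A", "B")): 0.2,
    frozenset(("A", "C")): None,  # the gap to be imputed
    frozenset(("A", "D")): 0.9,
    frozenset(("B", "C")): 0.8,
    frozenset(("B", "D")): 0.9,
    frozenset(("C", "D")): 0.1,
}


def impute(a, b, dist, coords, k=1):
    """Estimate dist(a, b) as the mean distance to b of a's k nearest neighbours.

    Only neighbours with a known distance to b are considered (assumed rule).
    """
    candidates = [
        n for n in coords
        if n not in (a, b) and dist.get(frozenset((n, b))) is not None
    ]
    nearest = sorted(candidates, key=lambda n: math.dist(coords[a], coords[n]))[:k]
    if not nearest:
        return None
    return sum(dist[frozenset((n, b))] for n in nearest) / len(nearest)


print(impute("A", "C", DIST, COORDS))
```

Here A borrows the value of its neighbour B, which illustrates how imputation builds geographic smoothness into the distances before any clustering takes place.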
32 But Although the resulting dialect maps showed some general resemblance to Belemans and Goossens (2000), the results were not very satisfactory. The dialect maps contained clusters that overlapped heavily, as well as clusters covering the entire research area.
33 6 overlapping clusters
34 3 clusters covering entire research area
35 Solution 2 Strongly reduce the percentage of empty cells in the data matrix by completely removing all concepts and locations with few or no lexical forms. Result: the data matrix was reduced from 614,941 lexical forms to 100,277 lexical forms. The percentage of empty cells in the data matrix was reduced from 83.3% to 20.3%. Now a distance could be calculated for every pair of locations (without using imputation).
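A pruning step of this kind might look as follows in Python. The thresholds and the repeat-until-stable loop are assumptions of this sketch (the slide does not state how sparse rows and columns were selected), and the cells are invented toy data.

```python
# Invented toy data: the set of (concept, location) cells that hold a form.
CELLS = {
    ("frog", "A"), ("frog", "B"), ("frog", "C"),
    ("worm", "A"), ("worm", "B"),
    ("ferret", "C"),
}


def prune(cells, min_locations, min_concepts):
    """Drop sparse concepts and locations until the matrix is stable.

    A concept is kept if it has forms in at least min_locations locations;
    a location is kept if it has forms for at least min_concepts concepts.
    """
    cells = set(cells)
    while True:
        per_concept, per_location = {}, {}
        for concept, location in cells:
            per_concept[concept] = per_concept.get(concept, 0) + 1
            per_location[location] = per_location.get(location, 0) + 1
        kept = {
            (concept, location)
            for concept, location in cells
            if per_concept[concept] >= min_locations
            and per_location[location] >= min_concepts
        }
        if kept == cells:
            return cells
        cells = kept


dense = prune(CELLS, min_locations=2, min_concepts=2)
print(sorted(dense))
```

Removing the sparse concept "ferret" leaves location C with a single concept, so C is dropped in the next pass; this is why the sketch iterates rather than filtering once.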
36 Result Requiring the cluster analysis to return nine clusters resulted in a close resemblance to the nine main areas of the dialect map of Belemans and Goossens (2000). Also, the result did not contain clusters covering the entire research area anymore.
37 Final result based on 100,277 lexical forms
38 Conclusions We could find detailed dialect patterns in Brabant based on lexical data only. These detailed dialect patterns resembled the map of Belemans and Goossens (2000) closely. But for our computational method the dataset had to be manipulated extensively. Relevant from an e-humanities perspective: the WBD dataset was collected with a typical humanities aim in mind: collecting as much variation as possible. When searching for general patterns using cluster analysis, the gatenkaas (Swiss cheese) character of the data matrix suddenly became a problem.
39 4 Visualization as a Research Tool for Dialect Geography Using a Geobrowser Folkert de Vriend, Lou Boves, Roeland van Hout, Jos Swanenberg (2011). In: Literary and Linguistic Computing, 26(1), pp
40 Basic research chain for conducting dialect geography research Applies to dialect dictionary projects as well as to dialect atlas projects. Modelled very much as a pipeline with a unidirectional data and process flow.
41 Visualization as a research tool Visualization of map data does not have to be a static and final stage in the research chain. The basic research chain can be extended with support for using visualization as a research tool. What do we mean by that?
42 Shneiderman's mantra Shneiderman's mantra for designing advanced information visualization interfaces (Shneiderman, 1996) can also be applied to map data. It formulates the basic principles as: overview first, zoom and filter, then details on demand. It can be regarded as a set of minimum requirements for using visualization as a research tool. A research chain that meets the requirements of Shneiderman's mantra will need some form of dynamic visualization.
43 Incorporation of independent diatopical data (1) Support for hypotheses about the processes underlying patterns in dialect variation might be found in independent diatopical data. Shattered block (Weijnen 1977)
44 Incorporation of independent diatopical data (2) By combining different types of non-linguistic diatopical data with dialect data, one is able to explore hypotheses about relations between the structures found in the data sets. The basic research chain should be extended with the ability to combine visualizations of dialect data with visualizations of independent diatopical data, for example by overlaying maps. Ideally, research into relations between such independent diatopical and dialect data already starts in the interpretation stage. Therefore, in the extended research chain (next slide) incorporation of independent diatopical data is an additional input to the geographic interpretation stage.
45 Extended research chain with support for visualization as a research tool Original unidirectional basic research chain is turned into an architecture that supports an iterative process flow that aids exploration of multiple hypotheses about the data.
46 Support for visualization as a research tool We checked to what extent tools that are already available for dialect geography research support visualization as a research tool. The two main European workbenches for dialect geography research (RuG/L04 and VDM) are very sophisticated, but in 2010 they did not offer much support for using visualization as a research tool. Geobrowsers (like Google Earth or NASA World Wind) do support using visualization as a research tool: they fully adhere to Shneiderman's visual information-seeking mantra. For overlaying maps with independent diatopical data, Google Earth offers easy-to-use built-in tools.
47 Demo geobrowser (Google Earth)
48 Conclusions With dynamic visualization and the ability to incorporate independent data, the role of the map changes from a static presentation of research findings into a research tool that can be used to gain new insights about (dialect geographic) data. A first step for existing tools for computational analysis of dialect geographic data towards the full extended research chain described would be to make them more interoperable with geobrowsers.
49 CLARIN MIGMAP (Bloothooft et al): KML output
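To make the notion of KML output concrete, here is a minimal Python sketch that writes clustered locations as KML placemarks, which a geobrowser such as Google Earth can load. It is not the MIGMAP or Meertens/Radboud software; place names, coordinates and cluster numbers are invented, and `clusters_to_kml` is a hypothetical helper. Only standard KML 2.2 elements are used (Document, Placemark, name, description, Point, coordinates).

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def clusters_to_kml(points):
    """Serialise (name, longitude, latitude, cluster_id) tuples as KML text."""
    ET.register_namespace("", KML_NS)  # emit KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    document = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for name, lon, lat, cluster_id in points:
        placemark = ET.SubElement(document, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
        ET.SubElement(placemark, f"{{{KML_NS}}}description").text = f"cluster {cluster_id}"
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML expects longitude first, then latitude.
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")


# Invented example locations with made-up coordinates and cluster labels.
POINTS = [
    ("Place A", 5.0, 51.5, 1),
    ("Place B", 5.2, 51.6, 1),
    ("Place C", 4.4, 51.1, 2),
]

kml_text = clusters_to_kml(POINTS)
print(kml_text)
```

Saving `kml_text` to a `.kml` file gives a layer that can be overlaid on other maps in a geobrowser, which is the kind of interoperability the conclusions call for.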
More informationA Statistical Spatial Framework to Inform Regional Statistics
A Statistical Spatial Framework to Inform Regional Statistics Martin Brady & Gemma Van Halderen Australian Bureau of Statistics, Canberra, Australia Corresponding Author: m.brady@abs.gov.au Abstract Statisticians
More informationConnecting Segments for Visual Data Exploration and Interactive Mining of Decision Rules
Journal of Universal Computer Science, vol. 11, no. 11(2005), 1835-1848 submitted: 1/9/05, accepted: 1/10/05, appeared: 28/11/05 J.UCS Connecting Segments for Visual Data Exploration and Interactive Mining
More informationVisualization methods for patent data
Visualization methods for patent data Treparel 2013 Dr. Anton Heijs (CTO & Founder) Delft, The Netherlands Introduction Treparel can provide advanced visualizations for patent data. This document describes
More informationCLARIN-NL Third Call: Closed Call
CLARIN-NL Third Call: Closed Call CLARIN-NL launches in its third call a Closed Call for project proposals. This called is only open for researchers who have been explicitly invited to submit a project
More informationPentaho Data Mining Last Modified on January 22, 2007
Pentaho Data Mining Copyright 2007 Pentaho Corporation. Redistribution permitted. All trademarks are the property of their respective owners. For the latest information, please visit our web site at www.pentaho.org
More informationIdentifying Patterns in DNS Traffic
Identifying Patterns in DNS Traffic Pieter Lexis System and Network Engineering Thu, Jul 4 2013 Reflection and Amplification Attacks DNS abused as DDoS Tool Spamhaus hit with 300 Gigabit/second DDoS Reflected
More informationDATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS
DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS 1 AND ALGORITHMS Chiara Renso KDD-LAB ISTI- CNR, Pisa, Italy WHAT IS CLUSTER ANALYSIS? Finding groups of objects such that the objects in a group will be similar
More informationWEB-BASED VISUAL EXPLORATION AND ERROR DETECTION IN LARGE DATA SETS: ANTARCTIC ICEBERG TRACKING DATA AS A CASE
WEB-BASED VISUAL EXPLORATION AND ERROR DETECTION IN LARGE DATA SETS: ANTARCTIC ICEBERG TRACKING DATA AS A CASE Connie A. Blok blok@itc.nl Ulanbek Turdukulov turdukulov@itc.nl Barend Köbben Juan Luis Calle
More informationQuick and Easy Web Maps with Google Fusion Tables. SCO Technical Paper
Quick and Easy Web Maps with Google Fusion Tables SCO Technical Paper Version History Version Date Notes Author/Contact 1.0 July, 2011 Initial document created. Howard Veregin 1.1 Dec., 2011 Updated to
More informationPerformance Metrics for Graph Mining Tasks
Performance Metrics for Graph Mining Tasks 1 Outline Introduction to Performance Metrics Supervised Learning Performance Metrics Unsupervised Learning Performance Metrics Optimizing Metrics Statistical
More informationEasily add Maps and Geo Analytics in MicroStrategy
Easily add Maps and Geo Analytics in MicroStrategy Agenda Introduction Configure to use Maps in MicroStrategy MicroStrategy Geo Analysis Capabilities and Examples Key Takeaways and Q&A Why Geospatial Analysis
More informationTo introduce software process models To describe three generic process models and when they may be used
Software Processes Objectives To introduce software process models To describe three generic process models and when they may be used To describe outline process models for requirements engineering, software
More informationCI6227: Data Mining. Lesson 11b: Ensemble Learning. Data Analytics Department, Institute for Infocomm Research, A*STAR, Singapore.
CI6227: Data Mining Lesson 11b: Ensemble Learning Sinno Jialin PAN Data Analytics Department, Institute for Infocomm Research, A*STAR, Singapore Acknowledgements: slides are adapted from the lecture notes
More informationAzure Machine Learning, SQL Data Mining and R
Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:
More informationFastStats & Dashboard Product Overview
FastStats & Dashboard Product Overview Guide for Clients July 2011 Version 1 Matrix FastStats Overview Matrix believes that FastStats is an ideal analytics tool for UK Mortgage lenders. Matrix FastStats
More informationwww.thevantagepoint.com
Doing More with Less: How efficient analysis can improve your vantage point on information Nils Newman Director of New Business Development Search Technology newman@searchtech.com PIUG Workshop Topics
More informationCompiling a Dictionary of an Unwritten Language: A Noncorpus-based
Compiling a Dictionary of an Unwritten Language: A Noncorpus-based Approach Jacques van Keymeulen, Department of Dutch Linguistics, Ghent University, Belgium (jacques.vankeymeulen@ugent.be) Abstract: In
More informationBetween voicing and aspiration
Workshop Maps and Grammar 17-18 September 2014 Introduction Dutch-German dialect continuum Voicing languages vs. aspiration languages Phonology meets phonetics Phonetically continuous, phonologically discrete
More informationClassify then Summarize or Summarize then Classify
Classify then Summarize or Summarize then Classify DIMACS, Rutgers University Piscataway, NJ 08854 Workshop Honoring Edwin Diday held on September 4, 2007 What is Cluster Analysis? Software package? Collection
More informationA GIS BASED GROUNDWATER MANAGEMENT TOOL FOR LONG TERM MINERAL PLANNING
A GIS BASED GROUNDWATER MANAGEMENT TOOL FOR LONG TERM MINERAL PLANNING Mauro Prado, Hydrogeologist - SRK Consulting, Perth, Australia Richard Connelly, Principal Hydrogeologist - SRK UK Ltd, Cardiff, United
More information