Semantic Word Clouds
1 Semantic Analysis in Language Technology: Semantic Word Clouds. Marina Santini, Department of Linguistics and Philology, Uppsala University, Uppsala, Sweden. Spring 2016
2 Previous lecture: Ontologies
3 Semantic Web & Ontologies: The goal of the Semantic Web is to allow web information and services to be more effectively exploited by humans and automated tools. Essentially, the focus of the Semantic Web is to share data instead of documents. This data must be meaningful both for humans and for machines (i.e. automated tools and web applications). Q: How are we going to represent meaning and knowledge on the web? A: Via annotation. Knowledge is represented in the form of rich conceptual schemas/formalisms called ontologies. Therefore, ontologies are the backbone of the Semantic Web. Ontologies give formally defined meanings to the terms used in annotations, transforming them into semantic annotations.
4 Ontologies are concepts that are hierarchically organized: Tree of Porphyry, III AD; WordNet, XXI AD (see Lect 5, e.g. similarity measures)
5 Reasoning: RDF/OWL vs Databases (and other data structures). OWL axioms behave like inference rules rather than database constraints. Class: Phoenix SubClassOf: isPetOf only Wizard. Individual: Fawkes Types: Phoenix Facts: isPetOf Dumbledore. Fawkes is said to be a Phoenix and to be the pet of Dumbledore, and it is also stated that only a Wizard can have a pet Phoenix. In OWL, this leads to the implication that Dumbledore is a Wizard. That is, if we were to query the ontology for instances of Wizard, then Dumbledore would be part of the answer. In a database setting the schema could include a similar statement about the Phoenix class, but in this case it would be interpreted as a constraint on the data: adding the fact that Fawkes isPetOf Dumbledore without Dumbledore being already known to be a Wizard would lead to an invalid database state, and such an update would therefore be rejected by a database management system as a constraint violation.
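The contrast on this slide can be sketched in a few lines of Python. This is a toy illustration only: it hand-codes the single Phoenix axiom and is not an OWL reasoner or a real database engine. The triple layout and the function names `infer_wizards` and `check_constraint` are assumptions made for the example.

```python
# Toy illustration: one hand-coded axiom, not an OWL reasoner or a DBMS.
# Facts are (subject, predicate, object) triples; this layout is an assumption.
FACTS = {
    ("Fawkes", "type", "Phoenix"),
    ("Fawkes", "isPetOf", "Dumbledore"),
}


def _phoenixes(facts):
    """Return every individual stated to be a Phoenix."""
    return {s for (s, p, o) in facts if p == "type" and o == "Phoenix"}


def infer_wizards(facts):
    """OWL-style reading: 'Phoenix SubClassOf isPetOf only Wizard' is an
    inference rule, so the owner of any Phoenix is inferred to be a Wizard."""
    phoenixes = _phoenixes(facts)
    inferred = set(facts)
    for (s, p, o) in facts:
        if p == "isPetOf" and s in phoenixes:
            inferred.add((o, "type", "Wizard"))
    return inferred


def check_constraint(facts):
    """Database-style reading: the same statement is a constraint, so a
    Phoenix owner not already known to be a Wizard is reported as a violation."""
    phoenixes = _phoenixes(facts)
    wizards = {s for (s, p, o) in facts if p == "type" and o == "Wizard"}
    return [
        (s, p, o)
        for (s, p, o) in sorted(facts)
        if p == "isPetOf" and s in phoenixes and o not in wizards
    ]
```

With the facts above, `infer_wizards(FACTS)` adds the triple stating that Dumbledore is a Wizard, whereas `check_constraint(FACTS)` reports the `isPetOf` fact as a violation.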
6 So, what is an ontology for us? An ontology is a FORMAL, EXPLICIT specification of a SHARED conceptualization. Machine-readable. Consensual knowledge. Concepts, properties, relations, functions, constraints and axioms are explicitly defined. Abstract model and simplified view of some phenomenon in the world that we want to represent. Studer, Benjamins, Fensel. Knowledge Engineering: Principles and Methods. Data and Knowledge Engineering. 25 (1998). An ontology is an explicit specification of a conceptualization. Gruber, T. A Translation Approach to Portable Ontology Specifications. Knowledge Acquisition. Vol
7 How to build an ontology: Generally speaking (and roughly said), when designing an ontology, four main components are used: 1. Classes 2. Relations 3. Axioms 4. Instances
8 Practical Activity: emotions. Your remarks: Emotions are ambiguous, e.g. happiness can also be ill-directed. The polarity of some emotions cannot be assessed. Etc. Classes, Relations, Axioms, Instances, etc.
9 Occupational psychology (Wikipedia): Industrial and organizational psychology (also known as I-O psychology, occupational psychology, work psychology, WO psychology, IWO psychology and business psychology) is the scientific study of human behavior in the workplace and applies psychological theories and principles to organizations and individuals in their workplace. I-O psychologists are trained in the scientist-practitioner model. I-O psychologists contribute to an organization's success by improving the performance, motivation, job satisfaction, occupational safety and health as well as the overall health and well-being of its employees. An I-O psychologist conducts research on employee behaviors and attitudes, and how these can be improved through hiring practices, training programs, feedback, and management systems.
10 In summary: Why build an ontology? To share a common understanding of the structure of information among people or machines. To make domain assumptions explicit. Often based on controlled vocabulary. To analyze domain knowledge. To enable reuse of domain knowledge.
11 Ontologies and Tags: Ontologies and tagging systems are two different ways to organize the knowledge present on the Web. The first has a formal foundation that derives from description logic and artificial intelligence; domain experts decide the terms. The other is simpler: it integrates heterogeneous content and is based on the collaboration of users in Web 2.0 (user-generated annotation).
12 Folksonomies: Tagging facilities within Web 2.0 applications have shown how it might be possible for user communities to collaboratively annotate web content, and create simple forms of ontology via the development of loosely hierarchically organised sets of tags, often called folksonomies.
13 Folksonomy = Social Tagging: Folksonomies (also known as social tagging) are user-defined metadata collections. Users do not deliberately create folksonomies and there is rarely a prescribed purpose, but a folksonomy evolves when many users create or store content at particular sites and identify what they think the content is about. Tag clouds pinpoint the frequency of certain tags.
14 A common way to organize tags is in tag clouds
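As a minimal sketch of how a tag cloud can reflect tag frequency, the snippet below counts tags and maps each count linearly to a font size. The function name `tag_font_sizes`, the point-size range and the linear scaling are assumptions chosen for illustration, not something prescribed by the slides.

```python
from collections import Counter


def tag_font_sizes(tags, min_pt=10, max_pt=40):
    """Map each tag to a font size that grows linearly with its frequency.

    Assumes `tags` is a non-empty list of user-assigned tag strings.
    """
    counts = Counter(tags)
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1  # avoid division by zero when all counts are equal
    return {
        tag: round(min_pt + (n - lo) / span * (max_pt - min_pt))
        for tag, n in counts.items()
    }
```

For example, `tag_font_sizes(["python", "python", "python", "nlp", "nlp", "ontology"])` gives the most frequent tag the largest size and the rarest tag the smallest.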
15 Automatic folksonomy construction: The collective knowledge expressed through user-generated tags has great potential. However, we need tools to efficiently aggregate data from large numbers of users with highly idiosyncratic vocabularies and invented words or expressions. Many approaches to automatic folksonomy construction combine tags using statistical methods... Ample space for improvement.
16 Ontology, taxonomy, folksonomy, etc.: Many different definitions. A good summary and interpretation is here: http:// ontologies
17 Today: We will talk more generally about word clouds
18 Further Reading: Semantic Similarity from Natural Language and Ontology Analysis, by Sébastien Harispe, Sylvie Ranwez, Stefan Janaqi, and Jacky Montmain. Synthesis Lectures on Human Language Technologies, May 2015, Vol. 8, No. 1. The two state-of-the-art approaches for estimating and quantifying semantic similarities/relatedness of semantic entities are presented in detail: the first one relies on corpora analysis and is based on Natural Language Processing techniques and semantic models, while the second is based on more or less formal, computer-readable and workable forms of knowledge such as semantic networks, thesauri or ontologies.
19 Previous lecture: the end
20 Acknowledgements: This presentation is based on the following paper: Barth et al. (2014). Experimental Comparison of Semantic Word Clouds. In Experimental Algorithms, Volume 8504 of the series Lecture Notes in Computer Science, pp Link: https:// Some slides have been borrowed from Sergey Pupyrev.
21 Today: Experiments on semantics-preserving word clouds, in which semantically related words are close to each other.
22 Outline: What is a Word Cloud? 3 early algorithms. 3 new algorithms. Metrics & Quantitative Evaluation.
23 Word Clouds: Word clouds have become a standard tool for abstracting, visualizing and comparing texts. We could apply the same or similar techniques to the huge amounts of tags produced by users interacting in social networks.
24 Comparison & conceptualization tool: Word clouds as a tool for conceptualizing documents. Cf. Ontologies. Ex: 2008, comparison of speeches: Obama vs McCain. Cf. Lect 10: Extractive summarization & Abstractive summarization.
25 Word Clouds and Tag Clouds are often used to represent importance among terms (e.g. band popularity) or serve as a navigation tool (e.g. Google search results).
26 The Problem: How to compute semantics-preserving word clouds in which semantically related words are close to each other?
27 Wordle http:// Practical tools, like Wordle, make word cloud visualization easy. They offer an appealing way to SUMMARIZE text. Shortcoming: they do not capture the relationships between words in any way, since word placement is independent of context.
28 Many word clouds are arranged randomly (look also at the scattered colours)
29 Patterns and Vicinity/Adjacency: Humans are spontaneous pattern-seekers: if they see two words close to each other in a word cloud, they spontaneously think they are related.
30 In Linguistics and NLP: This natural tendency to link spatial vicinity to semantic relatedness is exploited as evidence that words are semantically related or semantically similar. Remember? "You shall know a word by the company it keeps" (Firth, J. R. 1957:11)
31 So, it makes sense to place such related words close to each other (look also at the color distribution)
32 Semantic word clouds have higher user satisfaction compared to other layouts
33 All recent word cloud visualization tools aim to incorporate semantics in the layout...
34 ...but none of them provides any guarantee about the quality of the layout in terms of semantics
35 Early algorithms: Force-Directed Graph. Most of the existing algorithms are based on force-directed graph layout. Force-directed graph drawing algorithms are a class of algorithms for drawing graphs in an aesthetically pleasing way. Attractive forces between pairs of words reduce empty space. Repulsive forces ensure that words do not overlap. A final force preserves semantic relations between words. Some of the most flexible algorithms for calculating layouts of simple undirected graphs belong to a class known as force-directed algorithms. Such algorithms calculate the layout of a graph using only information contained within the structure of the graph itself, rather than relying on domain-specific knowledge. Graphs drawn with these algorithms tend to be aesthetically pleasing, exhibit symmetries, and tend to produce crossing-free layouts for planar graphs.
36 Newer Algorithms: rectangle representation of graphs. Vertex-weighted and edge-weighted graph: The vertices of the graph are the words. Their weights correspond to some measure of importance (e.g. word frequencies). The edges capture the semantic relatedness of pairs of words (e.g. co-occurrence). Their weights correspond to the strength of the relation. Each vertex can be drawn as a box (rectangle) with dimensions determined by its weight. The realized adjacency is the sum of the edge weights for all pairs of touching boxes. The goal is to maximize the realized adjacencies.
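The realized adjacency described on this slide can be computed directly from a layout. The sketch below assumes boxes are axis-aligned rectangles given as `(x, y, width, height)` and that two boxes "touch" when they share a boundary segment of positive length; the names `touches` and `realized_adjacency` and the tolerance `eps` are illustrative choices.

```python
from itertools import combinations


def touches(a, b, eps=1e-9):
    """True if two axis-aligned boxes (x, y, w, h) share a boundary segment
    of positive length (corner-only contact does not count)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    x_overlap = min(ax + aw, bx + bw) - max(ax, bx)
    y_overlap = min(ay + ah, by + bh) - max(ay, by)
    side_by_side = abs(x_overlap) <= eps and y_overlap > eps
    stacked = abs(y_overlap) <= eps and x_overlap > eps
    return side_by_side or stacked


def realized_adjacency(boxes, weights):
    """Sum the edge weights over all pairs of touching boxes.

    boxes:   word -> (x, y, w, h)
    weights: (word, word) -> similarity; either key order is accepted.
    """
    total = 0.0
    for u, v in combinations(sorted(boxes), 2):
        if touches(boxes[u], boxes[v]):
            total += weights.get((u, v), weights.get((v, u), 0.0))
    return total
```

A layout algorithm in this setting would try to position the boxes so that this sum is as large as possible.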
37 Purpose of the experiments that are shown here: Semantics preservation in terms of closeness/vicinity/adjacency
38 Example: A contact of 2 boxes is a common boundary. The contact of two boxes is interpreted as semantic relatedness. The contact of 2 boxes can be calculated, so the adjacency can be computed and evaluated.
39 Preprocessing: 1) Term Extraction 2) Ranking 3) Similarity/Dissimilarity Computation
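The three preprocessing steps can be sketched as a small pipeline. Everything specific here is an assumption made for illustration: the tiny stopword list, ranking by raw frequency, and the use of Jaccard similarity over sentence co-occurrence are simple stand-ins and not necessarily the choices made in the experiments.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "and", "in", "is", "of", "the", "to"}


def extract_terms(sentences):
    """Step 1: lowercase, tokenise and drop stopwords, sentence by sentence."""
    return [
        [w for w in re.findall(r"[a-z]+", s.lower()) if w not in STOPWORDS]
        for s in sentences
    ]


def rank_terms(tokenised, top_n):
    """Step 2: rank terms by frequency (ties broken alphabetically)."""
    counts = Counter(w for sent in tokenised for w in sent)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, _ in ranked[:top_n]]


def jaccard_matrix(tokenised, terms):
    """Step 3: similarity of two terms = Jaccard overlap of the sets of
    sentences they occur in. Terms are assumed to come from `tokenised`,
    so every set is non-empty."""
    where = {t: {i for i, sent in enumerate(tokenised) if t in sent} for t in terms}
    return {
        (a, b): len(where[a] & where[b]) / len(where[a] | where[b])
        for a in terms
        for b in terms
    }
```

A dissimilarity matrix can then be obtained as one minus each similarity value.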
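40 Similarity/dissimilarity matrix
41 Lect 6: Repetition. $\cos(\vec{v}, \vec{w}) = \frac{\vec{v} \cdot \vec{w}}{|\vec{v}|\,|\vec{w}|} = \frac{\vec{v}}{|\vec{v}|} \cdot \frac{\vec{w}}{|\vec{w}|} = \frac{\sum_{i=1}^{N} v_i w_i}{\sqrt{\sum_{i=1}^{N} v_i^2}\,\sqrt{\sum_{i=1}^{N} w_i^2}}$. Which pair of words is more similar? [Table: counts of the context words large, data, computer for the target words apricot, digital, information] cosine(apricot, information) = .16; cosine(digital, information) = .58; cosine(apricot, digital) = 0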
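The cosine computation can be checked with a short script. The count vectors below are an assumption: the table on the slide did not survive transcription, so these values over the context words (large, data, computer) were chosen only because they reproduce the three results quoted on the slide.

```python
from math import sqrt


def cosine(v, w):
    """Cosine of the angle between two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(v, w))
    norm = sqrt(sum(a * a for a in v)) * sqrt(sum(b * b for b in w))
    return dot / norm if norm else 0.0


# Assumed counts over (large, data, computer); not taken from the slide itself.
COUNTS = {
    "apricot": (2, 0, 0),
    "digital": (0, 1, 2),
    "information": (1, 6, 1),
}
```

Under these assumed counts, digital and information form the most similar pair, while apricot and digital share no context words at all.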
42 Lect 06: Other possible similarity measures
43 Input - Output: The input for all algorithms is a collection of n rectangles, each with a fixed width and height proportional to the rank of the word, and a similarity/dissimilarity matrix. The output is a set of non-overlapping positions for the rectangles.
44 Early Algorithms: 1. Wordle (Random) 2. Context-Preserving Word Cloud Visualization (CPWCV) 3. Seam Carving
45 Wordle → Random: The Wordle algorithm places one word at a time in a greedy fashion, i.e. aiming to use space as efficiently as possible. First the words are sorted by weight/rank in decreasing order. Then for each word in the order, a position is picked at random.
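The random placement described above can be sketched as follows. Only the ordering by weight and the random choice of position come from the slide; drawing a fresh random position until the word no longer overlaps an already placed one is an assumption of this sketch, as are the function names, the canvas size and the retry limit.

```python
import random


def overlaps(a, b):
    """True if two axis-aligned boxes (x, y, w, h) overlap in their interiors."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def random_layout(sizes, weights, canvas=(100.0, 100.0), seed=0, max_tries=10000):
    """Place words one at a time, heaviest first, at random free positions.

    sizes:   word -> (width, height)
    weights: word -> importance (e.g. frequency)
    Returns word -> (x, y, width, height).
    """
    rng = random.Random(seed)
    placed = {}
    for word in sorted(sizes, key=lambda w: -weights[w]):
        w, h = sizes[word]
        for _ in range(max_tries):
            box = (rng.uniform(0, canvas[0] - w), rng.uniform(0, canvas[1] - h), w, h)
            # Assumption: retry with a new random position on overlap.
            if not any(overlaps(box, other) for other in placed.values()):
                placed[word] = box
                break
        else:
            raise RuntimeError(f"could not place {word!r} without overlap")
    return placed
```

Because the similarity matrix is never consulted, related words end up near each other only by chance, which is the shortcoming the semantic algorithms address.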
46 1: Random
47 2: Random
48 3: Random
49 4: Random
50 5: Random
51 6: Random
52 Context-Preserving Word Cloud Visualization (CPWCV): First, a dissimilarity matrix is computed and Multidimensional Scaling (MDS) is performed. Multidimensional Scaling (MDS) aims at detecting meaningful underlying dimensions in the data. Second, an effort is made to create a compact layout.
53 1: Context-Preserving
54 2: Context-Preserving: repulsive force
55 3: Context-Preserving: attractive force
56 Seam Carving: Basically, an algorithm for image resizing. It was invented at Mitsubishi's
57 1: Seam Carving
58 2: Seam Carving: space is divided into regions
59 3: Seam Carving: empty paths trimmed out iteratively
60 4: Seam Carving
61 5: Seam Carving
62 6: Seam Carving: space divided into regions
63 7: Seam Carving
64 3 New Algorithms: 1. Inflate and Push 2. Star Forest 3. Cycle Cover
65 Inflate-and-Push: Simple heuristic method for word layout, which aims to preserve semantic relations between pairs of words. Based on: 1. Heuristics: scaling down all word rectangles by some constant; 2. Computing MDS (multidimensional scaling) on the dissimilarity matrix; 3. Iteratively increasing the size of the rectangles by 5% (i.e. inflating the words); 4. When words overlap, applying a force-directed algorithm to push words away.
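The four steps above can be sketched roughly as below. Several simplifications are assumptions of this sketch: the MDS step is taken as already done (its output is passed in as `centers`), and the force-directed push is replaced by a plain pairwise separation along the axis of smaller overlap. The parameter names and defaults (`shrink`, `growth`, `max_rounds`) are illustrative.

```python
from itertools import combinations


def inflate_and_push(centers, sizes, shrink=0.2, growth=1.05, max_rounds=500):
    """Rough sketch of inflate-and-push.

    centers: word -> (cx, cy), assumed to come from MDS on the dissimilarity matrix
    sizes:   word -> (width, height) at full size
    Returns word -> (cx, cy) for the full-size rectangles.
    """
    pos = {w: [float(x), float(y)] for w, (x, y) in centers.items()}
    scale = shrink  # step 1: start from scaled-down rectangles
    while scale < 1.0:
        scale = min(1.0, scale * growth)  # step 3: inflate by 5%, capped at full size
        for _ in range(max_rounds):  # step 4: push until no overlap remains
            moved = False
            for a, b in combinations(sorted(pos), 2):
                ox = (sizes[a][0] + sizes[b][0]) * scale / 2 - abs(pos[a][0] - pos[b][0])
                oy = (sizes[a][1] + sizes[b][1]) * scale / 2 - abs(pos[a][1] - pos[b][1])
                if ox > 1e-9 and oy > 1e-9:
                    # Separate along the axis where the overlap is smaller.
                    axis, amount = (0, ox) if ox <= oy else (1, oy)
                    sign = 1.0 if pos[a][axis] <= pos[b][axis] else -1.0
                    pos[a][axis] -= sign * amount / 2
                    pos[b][axis] += sign * amount / 2
                    moved = True
            if not moved:
                break
    return {w: (x, y) for w, (x, y) in pos.items()}
```

Since the words start from their MDS positions and are moved only as far as needed to resolve overlaps, semantically related words tend to stay close to each other. With many words the bounded push loop may stop before every overlap is resolved, so this sketch gives no guarantee of an overlap-free result.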
66 Inflate: starting point
67 Inflate: scaling down
68 Inflate: semantically related words are placed close to each other. Inflate the words (5%) iteratively.
69 Inflate: push words: repulsive force to resolve overlaps
70 Inflate: final stage
71 Star Forest: A star is a tree in which one central vertex is adjacent to all the other vertices (its leaves). A star forest is a forest whose connected components are all stars.
72 Repetition: trees and graphs. A tree is a special form of graph, i.e. a minimally connected graph having only one path between any two vertices. In a graph there can be more than one path between nodes, and the paths (edges) can be uni-directional or bi-directional.
73 Three steps: 1. Extracting the star forest: partition a graph into disjoint stars 2. Realising a star: build a word cloud for every star 3. Pack all the stars together
74 Star Forest: star = tree. 1. Extract stars greedily from a dissimilarity matrix → disjoint stars = star forest 2. Compute the optimal stars, i.e. the best set of words to be adjacent 3. Attractive force to get a compact layout
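The first step, greedy extraction of disjoint stars, might look like the sketch below. It works on a similarity matrix rather than a dissimilarity matrix, and it treats two words as connected only when their similarity exceeds a `threshold`; both choices, like the function name `greedy_star_forest`, are assumptions made to keep the example small.

```python
def greedy_star_forest(sim, threshold=0.0):
    """Partition the words into disjoint stars, greedily.

    sim: word -> {word -> similarity}, assumed symmetric.
    Repeatedly picks as center the remaining word with the largest total
    similarity to its remaining neighbours, then takes those neighbours
    as leaves. Returns a list of (center, leaves) pairs.
    """
    remaining = set(sim)
    stars = []
    while remaining:
        def total_weight(v):
            return sum(
                sim[v][u] for u in remaining if u != v and sim[v][u] > threshold
            )

        center = max(sorted(remaining), key=total_weight)
        leaves = sorted(
            u for u in remaining if u != center and sim[center][u] > threshold
        )
        stars.append((center, leaves))
        remaining -= {center, *leaves}
    return stars
```

Each resulting star can then be laid out on its own (center in the middle, leaves touching it) before the stars are packed together.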
75 Cycle Cover: This algorithm is based on a similarity matrix. First, a similarity path is created. Then, the optimal level of compactness is computed.
76 Quantitative Metrics: 1. Realized Adjacencies: how close are similar words to each other? 2. Distortion: how distant are dissimilar words? 3. Uniform Area Utilization: uniformity of the distribution (overpopulated vs sparse areas in the word cloud) 4. Compactness: how well utilized is the drawing area? 5. Aspect Ratio: width and height of the bounding box 6. Running Time: execution time
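Two of these metrics are easy to illustrate on a finished layout. The definitions below are assumptions for the sketch: compactness is taken as the fraction of the bounding box covered by word rectangles (assuming they do not overlap), and aspect ratio as bounding-box width divided by height; the experiments may normalise these differently.

```python
def bounding_box(boxes):
    """Width and height of the smallest axis-aligned box around all (x, y, w, h)."""
    x0 = min(x for x, y, w, h in boxes)
    y0 = min(y for x, y, w, h in boxes)
    x1 = max(x + w for x, y, w, h in boxes)
    y1 = max(y + h for x, y, w, h in boxes)
    return x1 - x0, y1 - y0


def compactness(boxes):
    """Fraction of the bounding box covered by words (assumes no overlaps)."""
    width, height = bounding_box(boxes)
    used = sum(w * h for x, y, w, h in boxes)
    return used / (width * height)


def aspect_ratio(boxes):
    """Width divided by height of the bounding box."""
    width, height = bounding_box(boxes)
    return width / height
```

For a layout such as `[(0, 0, 4, 2), (4, 0, 2, 2), (0, 2, 3, 1)]` the bounding box is 6 by 3, so the aspect ratio is 2 and five sixths of the area is used.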
77 2 datasets: (1) WIKI, a set of 112 plain-text articles extracted from the English Wikipedia, each consisting of at least 200 distinct words (2) PAPERS, a set of 56 research papers published in conferences on experimental algorithms (SEA and ALENEX) in
78 Cycle Cover wins
79 Seam Carving wins
80 Random wins
81 Inflate wins
82 Random and Seam Carving win
83 All ok except Seam Carving
84 Demo
85 The end
Opportuni)es and Challenges of Textual Big Data for the Humani)es
Opportuni)es and Challenges of Textual Big Data for the Humani)es Dr. Adam Wyner, Department of Compu)ng Prof. Barbara Fennell, Department of Linguis)cs THiNK Network Knowledge Exchange in the Humani)es
More informationANALYTICAL TECHNIQUES FOR DATA VISUALIZATION
ANALYTICAL TECHNIQUES FOR DATA VISUALIZATION CSE 537 Ar@ficial Intelligence Professor Anita Wasilewska GROUP 2 TEAM MEMBERS: SAEED BOOR BOOR - 110564337 SHIH- YU TSAI - 110385129 HAN LI 110168054 SOURCES
More informationExtrac'ng People s Hobby and Interest Informa'on from Social Media Content
Extrac'ng People s Hobby and Interest Informa'on from Social Media Content Thomas Forss, Shuhua Liu and Kaj- Mikael Björk Dept of Business Administra?on and Analy?cs Arcada University of Applied Sciences
More informationData Warehousing. Yeow Wei Choong Anne Laurent
Data Warehousing Yeow Wei Choong Anne Laurent Databases Databases are developed on the IDEA that DATA is one of the cri>cal materials of the Informa>on Age Informa>on, which is created by data, becomes
More informationXML, Seman9c Web and Content Analy9cs
XML, Seman9c Web and Content Analy9cs XML Prague Pre- conference 2014 Felix Sasaki DFKI / W3C Fellow 1 What do you need to follow this session? Ideal: a computer with internet access, to be able to provide
More informationIns+tuto Superior Técnico Technical University of Lisbon. Big Data. Bruno Lopes Catarina Moreira João Pinho
Ins+tuto Superior Técnico Technical University of Lisbon Big Data Bruno Lopes Catarina Moreira João Pinho Mo#va#on 2 220 PetaBytes Of data that people create every day! 2 Mo#va#on 90 % of Data UNSTRUCTURED
More informationOntology and automatic code generation on modeling and simulation
Ontology and automatic code generation on modeling and simulation Youcef Gheraibia Computing Department University Md Messadia Souk Ahras, 41000, Algeria youcef.gheraibia@gmail.com Abdelhabib Bourouis
More informationCS 5150 So(ware Engineering Evalua4on and User Tes4ng
Cornell University Compu1ng and Informa1on Science CS 5150 So(ware Engineering Evalua4on and User Tes4ng William Y. Arms Usability: The Analyze/Design/Build/Evaluate Loop Analyze requirements Design User
More informationSBML SBGN SBML Just my 2 cents. Alice C. Villéger COMBINE 2010
SBML SBGN SBML Just my 2 cents Alice C. Villéger COMBINE 2010 Disclaimer Fuzzy talk work in progress last minute slides Someone else has been working on very similar stuff and should really have been talking
More informationCloud Data Management System (CDMS)
Cloud Management System (CMS) Wiqar Chaudry Solu9ons Engineer Senior Advisor CMS Overview he OpenStack cloud data management system features a canonical data modeling framework designed to broker context
More informationCS 5614: (Big) Data Management Systems. B. Aditya Prakash Lecture #18: Dimensionality Reduc7on
CS 5614: (Big) Data Management Systems B. Aditya Prakash Lecture #18: Dimensionality Reduc7on Dimensionality Reduc=on Assump=on: Data lies on or near a low d- dimensional subspace Axes of this subspace
More informationPerformance Management. Ch. 9 The Performance Measurement. Mechanism. Chiara Demar8ni UNIVERSITY OF PAVIA. mariachiara.demar8ni@unipv.
UNIVERSITY OF PAVIA Performance Management Ch. 9 The Performance Measurement Mechanism Chiara Demar8ni mariachiara.demar8ni@unipv.it Master in Interna+onal Business and Economics Defini8on Performance
More informationSeman&c Web: Benefits For Clinical Decision Support At The Bedside. Emory Fry, MD SemTechBiz 2013
Seman&c Web: Benefits For Clinical Decision Support At The Bedside Emory Fry, MD SemTechBiz 2013 Clinical Decision Support (CDS) A system providing knowledge and person specific or popula8on informa8on
More informationData Mining. Supervised Methods. Ciro Donalek donalek@astro.caltech.edu. Ay/Bi 199ab: Methods of Computa@onal Sciences hcp://esci101.blogspot.
Data Mining Supervised Methods Ciro Donalek donalek@astro.caltech.edu Supervised Methods Summary Ar@ficial Neural Networks Mul@layer Perceptron Support Vector Machines SoLwares Supervised Models: Supervised
More informationDoing Big Data Projects: What s the Best Team Process Methology?
Doing Big Data Projects: What s the Best Team Process Methology? October 2015 1 Executive Summary What s the Best Team Process Methology? September 2015 2 Executive Summary What s the Best Team Process
More informationNetwork Maps for End Users: Collect, Analyze, Visualize and Communicate Network Insights with Zero Coding
Network Maps for End Users: Collect, Analyze, Visualize and Communicate Network Insights with Zero Coding A project from the Social Media Research Founda8on: h:p://www.smrfounda8on.org About Me Introduc8ons
More informationUsing Social Media to Drive Recommender Systems for Mobile Apps. - GRP Presenta=on - Jovian Lin (A0026542M)
Using Social Media to Drive Recommender Systems for Mobile Apps - GRP Presenta=on - Jovian Lin (A0026542M) Structure of Presenta=on Introduc=on Why Recommender Systems (RS)? Problems in Recommending Our
More informationFUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT MINING SYSTEM
International Journal of Innovative Computing, Information and Control ICIC International c 0 ISSN 34-48 Volume 8, Number 8, August 0 pp. 4 FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT
More informationSemantic Search in Portals using Ontologies
Semantic Search in Portals using Ontologies Wallace Anacleto Pinheiro Ana Maria de C. Moura Military Institute of Engineering - IME/RJ Department of Computer Engineering - Rio de Janeiro - Brazil [awallace,anamoura]@de9.ime.eb.br
More informationINCREMENTAL, APPROXIMATE DATABASE QUERIES AND UNCERTAINTY FOR EXPLORATORY VISUALIZATION. Danyel Fisher Microso0 Research
INCREMENTAL, APPROXIMATE DATABASE QUERIES AND UNCERTAINTY FOR EXPLORATORY VISUALIZATION Danyel Fisher Microso0 Research Exploratory Visualiza9on Ini9al Query Process query Get a response Change parameters
More informationIbis: Scaling Python Analy=cs on Hadoop and Impala
Ibis: Scaling Python Analy=cs on Hadoop and Impala Wes McKinney, Budapest BI Forum 2015-10- 14 @wesmckinn 1 Me R&D at Cloudera Serial creator of structured data tools / user interfaces Mathema=cian MIT
More informationKeeping Pace with Big Data
- A Data Mining Perspec>ve Huan Liu, Tempe, AZ hep://www.public.asu.edu/~huanliu NSF Workshop on Big Data Analy6cs for Infrastructure and Building Resilience and Sustainability, Beijing, China Sept 19-20,
More informationDTCC Data Quality Survey Industry Report
DTCC Data Quality Survey Industry Report November 2013 element 22 unlocking the power of your data Contents 1. Introduction 3 2. Approach and participants 4 3. Summary findings 5 4. Findings by topic 6
More informationThe Development of a Strategic Planning Framework for VCU s College of Humani?es and Sciences
The Development of a Strategic Planning Framework for VCU s College of Humani?es and Sciences Data Analysis and Representa?on Interpreta?on U?liza?on Why are we here? During the fall 0 CHS retreat, Dean
More informationInformation Services for Smart Grids
Smart Grid and Renewable Energy, 2009, 8 12 Published Online September 2009 (http://www.scirp.org/journal/sgre/). ABSTRACT Interconnected and integrated electrical power systems, by their very dynamic
More informationRun$me Query Op$miza$on
Run$me Query Op$miza$on Robust Op$miza$on for Graphs 2006-2014 All Rights Reserved 1 RDF Join Order Op$miza$on Typical approach Assign es$mated cardinality to each triple pabern. Bigdata uses the fast
More informationSemantic Interoperability
Ivan Herman Semantic Interoperability Olle Olsson Swedish W3C Office Swedish Institute of Computer Science (SICS) Stockholm Apr 27 2011 (2) Background Stockholm Apr 27, 2011 (2) Trends: from
More informationHow To Understand The Big Data Paradigm
Big Data and Its Empiricist Founda4ons Teresa Scantamburlo The evolu4on of Data Science The mechaniza4on of induc4on The business of data The Big Data paradigm (data + computa4on) Cri4cal analysis Tenta4ve
More informationResearch at the Department of Computer Science and Software Engineering. Professor Yong Yue BEng, PhD, CEng, FIET, FIMechE 17 October 2014
Research at the Department of Computer Science and Software Engineering Professor Yong Yue BEng, PhD, CEng, FIET, FIMechE 17 October 2014 Research Areas Ar%ficial intelligence Robo%cs Data mining Image
More informationProtec'ng Communica'on Networks, Devices, and their Users: Technology and Psychology
Protec'ng Communica'on Networks, Devices, and their Users: Technology and Psychology Alexey Kirichenko, F- Secure Corpora7on ICT SHOK, Future Internet program 30.5.2012 Outline 1. Security WP (WP6) overview
More informationI. INTRODUCTION NOESIS ONTOLOGIES SEMANTICS AND ANNOTATION
Noesis: A Semantic Search Engine and Resource Aggregator for Atmospheric Science Sunil Movva, Rahul Ramachandran, Xiang Li, Phani Cherukuri, Sara Graves Information Technology and Systems Center University
More informationAssociate Professor, Department of CSE, Shri Vishnu Engineering College for Women, Andhra Pradesh, India 2
Volume 6, Issue 3, March 2016 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Special Issue
More informationNodes, Ties and Influence
Nodes, Ties and Influence Chapter 2 Chapter 2, Community Detec:on and Mining in Social Media. Lei Tang and Huan Liu, Morgan & Claypool, September, 2010. 1 IMPORTANCE OF NODES 2 Importance of Nodes Not
More informationReverse Engineering of Relational Databases to Ontologies: An Approach Based on an Analysis of HTML Forms
Reverse Engineering of Relational Databases to Ontologies: An Approach Based on an Analysis of HTML Forms Irina Astrova 1, Bela Stantic 2 1 Tallinn University of Technology, Ehitajate tee 5, 19086 Tallinn,
More informationData Integra*on in a Networked World. Karl Aberer EPFL karl.aberer@epfl.ch h@p://lsir.epfl.ch/ h@p://www.mics.ch/
Data Integra*on in a Networked World Karl Aberer EPFL karl.aberer@epfl.ch h@p://lsir.epfl.ch/ h@p://www.mics.ch/ Overview Mo*va*on: Seman*c Interoperability Peer- to- peer Data Integra*on Mapping Discovery
More informationWeb Services and Development of Semantic Applications
Web Services and Development of Semantic Applications Trish Whetzel Outreach Coordinator THE NATIONAL CENTER FOR BIOMEDICAL ONTOLOGY Na#onal Center for Biomedical Ontology Mission To create software for
More informationBusiness Analysis Center of Excellence The Cornerstone of Business Transformation
February 20, 2013 Business Analysis Center of Excellence The Cornerstone of Business Transformation John E. Parker, CEO Enfocus Solutions Inc. www.enfocussolutions.com 0 John E. Parker (Introduc3on) President
More informationCMMI for High-Performance with TSP/PSP
Dr. Kıvanç DİNÇER, PMP Hace6epe University Implemen@ng CMMI for High-Performance with TSP/PSP Informa@on Systems & SoFware The Informa@on Systems usage has experienced an exponen@al growth over the past
More informationCitationBase: A social tagging management portal for references
CitationBase: A social tagging management portal for references Martin Hofmann Department of Computer Science, University of Innsbruck, Austria m_ho@aon.at Ying Ding School of Library and Information Science,
More informationCSER & emerge Consor.a EHR Working Group Collabora.on on Display and Storage of Gene.c Informa.on in Electronic Health Records
electronic Medical Records and Genomics CSER & emerge Consor.a EHR Working Group Collabora.on on Display and Storage of Gene.c Informa.on in Electronic Health Records Brian Shirts, MD, PhD University of
More informationProcessing of Mix- Sensi0vity Video Surveillance Streams on Hybrid Clouds
Processing of Mix- Sensi0vity Video Surveillance Streams on Hybrid Clouds Chunwang Zhang, Ee- Chien Chang School of Compu2ng, Na2onal University of Singapore 28 th June, 2014 Outline 1. Mo0va0on 2. Hybrid
More informationHow to write a Bachelor s Thesis in Cogni4ve and Decision Sciences? Gilles Du4lh
How to write a Bachelor s Thesis in Cogni4ve and Decision Sciences? Gilles Du4lh Who I Am Gilles Du4lh, 32 Psychology at University of Amsterdam Master Psychological Methods Got my PhD in mathema4cal psychology
More informationCONTENTS. Introduc on 2. Undergraduate Program 4. BSC in Informa on Systems 4. Graduate Program 7. MSC in Informa on Science 7
1 1 2 CONTENTS Introducon 2 Undergraduate Program 4 BSC in Informaon Systems 4 Graduate Program 7 MSC in Informaon Science 7 MSC in Health Informacs 13 2 3 Introducon The School of Informaon Science at
More informationData Management within Land Use Division
Data Management within Land Use Division Goals and func8on of the Land Use Division. Brief overview of GLIS. Database management problems. Conclusions. Primary goals of the division To provide informa8on
More informationSome Security Challenges of Cloud Compu6ng. Kui Ren Associate Professor Department of Computer Science and Engineering SUNY at Buffalo
Some Security Challenges of Cloud Compu6ng Kui Ren Associate Professor Department of Computer Science and Engineering SUNY at Buffalo Cloud Compu6ng: the Next Big Thing Tremendous momentum ahead: Prediction
More informationData Exploration Data Visualization
Data Exploration Data Visualization What is data exploration? A preliminary exploration of the data to better understand its characteristics. Key motivations of data exploration include Helping to select
More informationSDN- based Mobile Networking for Cellular Operators. Seil Jeon, Carlos Guimaraes, Rui L. Aguiar
SDN- based Mobile Networking for Cellular Operators Seil Jeon, Carlos Guimaraes, Rui L. Aguiar Background The data explosion currently we re facing with has a serious impact on current cellular networks
More informationWelcome! Accelera'ng Pa'ent- Centered Outcomes Research and Methodological Research. Andrea Heckert, PhD, MPH Program Officer, Science
Accelera'ng Pa'ent- Centered Outcomes Research and Methodological Research Emily Evans, PhD, MPH Program Officer, Science Andrea Heckert, PhD, MPH Program Officer, Science June 22, 2015 Welcome! Emily
More informationCS 4604: Introduc0on to Database Management Systems. B. Aditya Prakash Lecture #5: En-ty/Rela-onal Models- - - Part 1
CS 4604: Introduc0on to Database Management Systems B. Aditya Prakash Lecture #5: En-ty/Rela-onal Models- - - Part 1 Announcements- - - Project Goal: design a database system applica-on with a web front-
More informationInterna'onal Standards Ac'vi'es on Cloud Security EVA KUIPER, CISA CISSP EVA.KUIPER@HP.COM HP ENTERPRISE SECURITY SERVICES
Interna'onal Standards Ac'vi'es on Cloud Security EVA KUIPER, CISA CISSP EVA.KUIPER@HP.COM HP ENTERPRISE SECURITY SERVICES Agenda Importance of Common Cloud Standards Outline current work undertaken Define