New Frontiers of Automated Content Analysis in the Social Sciences
Symposium on the New Frontiers of Automated Content Analysis in the Social Sciences
University of Zurich, July 1-3

Abstract

Automated Content Analysis (ACA) is one of the key fields of methodological innovation in the social sciences, not least because there is a growing need to analyze the increasing number of digitally available text collections. Our goal is to bring together computational linguists and social scientists in order to improve the dialogue between the two research communities and to exploit mutual benefits for the advancement of ACA in the social sciences. More precisely, our program pairs social scientists and computational linguists into thematically coherent sessions, which are related to event analysis, trend identification, text classification, text scaling and sentiment detection. This setup should enable social scientists to gain insights into the sophisticated methodological instruments of computational linguistics in order to enhance their analyses. Computational linguists, in contrast, have the opportunity to apply their concepts and instruments to the vast array of research questions debated in the social sciences.

The conference is jointly organized by the Swiss National Center of Competence Research in Democracy, the Stein Rokkan Chair of the European University Institute, as well as the Department of Political Science at the University of Zurich.

Organisers
Prof. Gerold Schneider and Dr. des. Bruno Wueest (NCCR Democracy)
Prof. Hanspeter Kriesi (European University Institute)
Prof. Silja Häusermann (University of Zurich)
Speakers and topics

A: Social sciences
B: Computational linguistics

Introductory keynote (July 1, 17:00-18:00, UZH main building, KOL-H-317)
B. Kathleen McKeown

Session 1 Extracting Complex Relational Data
Chair: Jasmine Lorenzini
A. Alexander Hanna and Pamela Oliver: Automated Coding of Protest Event Data
B. Peter Makarov and Klaus Rothenhäusler: Towards Automated Protest Event Analysis

Session 2 Retrieving Events from Large-Scale Data
Chair: Bruno Wueest
A. Wouter van Atteveldt: Using grammatical clauses for social and semantic network analysis
B. Vasileios Lampos: Extracting Interesting Concepts from Large-Scale Textual Data

Session 3 Trend Identification
Chair: Swen Hutter
A. Bruno Wueest: Taking care of time dependency and theoretical mismatch in topic models of political attention
B. Michael Amsler and Gerold Schneider: Data-Driven and Linguistically Motivated Trend Identification

Session 4 Enhancing Text Classification
Chair: Silja Häusermann
A. Nils Weidmann and Mihai Croicu: Improving the Selection of News Reports for Event Coding Using Ensemble Classification
B. Jordan Boyd-Graber: Interactive Topic Modeling for Labeling and Making Sense of Large Corpora

Session 5 Data-driven vs. Annotation-driven Text Mining
Chair: Thomas Kurer
A. Martin Wettstein and Werner Wirth: Semi-automated content analysis of news texts
B. Andrew Salway: Some possibilities and limits of data-driven content analysis

Session 6 Actor-level Sentiment
Chair: Gerold Schneider
A. Martin Haselmayer and Marcelo Jenny: Dictionary-based Sentiment Analysis with Crowdcoding
B. Jochen Leidner: A Critical Analysis of Sentiment Analysis

Session 7 Text Scaling / Document-level sentiment
Chair: Hanspeter Kriesi
A. Will Lowe: Scaling things we can count
B. Ralf Steinberger: Observing trends in multilingual media analysis

Closing keynote (July 3, 16:30-17:30, UZH main building, KOL-H-317)
A. Justin Grimmer
Outline

As much as ACA is on the verge of becoming a standard tool for social scientists, scholars still dispute its promises and pitfalls. Hence, existing approaches to analyzing unstructured text data, which were mainly developed in computational linguistics, need to be amended and adapted to the specific requirements of social scientific studies. To achieve this, we bring together computational linguists and social scientists who share interests in the analysis of large-scale text data. The conference is structured into seven thematic sessions, which are accompanied by an introductory and a closing keynote speech.

Prof. Kathleen McKeown will provide the introductory talk from the perspective of computational linguistics. The focus of her presentation will lie on the potential of computational linguistics for the social sciences. While possible applications seem abundant, there may well be paramount challenges for the integration of computational linguistic approaches into social scientific research frameworks.

The closing talk will be given by Prof. Justin Grimmer. His talk will sum up the most important findings of the conference and give an outlook on the most likely advancements in the social scientific application of ACA in the near future.

Session 1 Extracting Complex Relational Data

Events such as the eruption of political protest or hostilities in armed conflicts are the unit of enquiry of many social scientific analyses. Obviously, the conceptual and operational specifications of what constitutes an event vary significantly. However, what all event analyses have in common is that a combination of several individual indicators is necessary to specify an event. On the most basic level, events are usually defined by the relation of an action, a date, and a location.
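As a minimal sketch of this basic event definition, the record below ties an action, a date, and a location together and treats actors and goals as optional additions. The class name `ProtestEvent`, its field names, and the `key` helper are hypothetical and chosen for illustration only; they are not taken from any of the systems presented at the symposium.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass(frozen=True)
class ProtestEvent:
    """Hypothetical event record: an action linked to a date and a location."""
    action: str
    day: date
    location: str
    # Further indicators that are frequently added to the basic triple.
    actors: Tuple[str, ...] = ()
    goal: Optional[str] = None

    def key(self) -> Tuple[str, date, str]:
        """Case-insensitive (action, date, location) triple identifying the event."""
        return (self.action.lower(), self.day, self.location.lower())
```

Keeping the defining triple separate from the optional indicators makes it explicit which fields must be linked before a mention counts as an event at all.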
When working with large-scale text data, this relation mining task of linking the single indicators to an event is far from trivial, especially since further indicators such as the goals of the action and the actors involved are frequently added. Hence, one of the major challenges of automated event analyses is to generate models that allow one to extract events defined as compounds of single indicators. We have invited two research teams (Hanna and Oliver; Makarov and Rothenhäusler), who will report on their progress in dealing with this challenge. Both teams are in the process of creating a system for the automated recognition of political protest events, the former dealing with protest events in the US and the latter in Europe.

Session 2 Retrieving Events from Large-Scale Data

The output that social scientists need from event analyses is information on the actual occurrence of events, and not only the number of mentions of these events in the data. The insights on the recognition of events in session 1 thus have to be enriched with approaches to aggregating the single event instances found in the data. Indeed, if event data is retrieved largely through automated procedures, two challenging problems for the retrieval of events arise. First, an aggregative model needs to be able to distinguish between reports belonging to the same event and reports covering different events. Second, there most certainly is bias in how frequently the data source contains information on particular events. The most pressing issue here is how these biases can be assessed and controlled for. This session includes two presentations (van Atteveldt; Lampos) approaching such questions from different perspectives.

Session 3 Trend Identification

This session deals with models to explore corpora in which documents have a sequential order. Agenda research, i.e. the study of attention to political topics over time, is a prominent research area in the social sciences where such corpora are used. Serial correlation in these corpora can be both a curse and a blessing. On the one hand, time-specific dynamics in textual data can be directly used to identify trends. On the other hand, the general evolution of language over time needs to be taken into account in studies measuring time-invariant concepts such as topic categories. While this may complicate tracking topics, short-term linguistic changes, particularly the introduction of new terms and multi-word units, are equally a useful instrument.
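One way to picture how newly introduced terms can signal a trend is to compare vocabularies across consecutive time slices. The sketch below is an illustration under simple assumptions (whitespace tokenization, period labels that sort chronologically, and a hypothetical helper named `new_terms_by_period`); it is not the approach of any presenter.

```python
from collections import Counter
from typing import Dict, List


def new_terms_by_period(
    docs_by_period: Dict[str, List[str]], min_count: int = 2
) -> Dict[str, List[str]]:
    """For each period, in sorted order, list the terms that occur at least
    min_count times in that period and were not seen in any earlier period."""
    seen = set()
    result = {}
    for period in sorted(docs_by_period):
        # Count lower-cased whitespace tokens over all documents of the period.
        counts = Counter(
            token
            for doc in docs_by_period[period]
            for token in doc.lower().split()
        )
        result[period] = sorted(
            term
            for term, n in counts.items()
            if n >= min_count and term not in seen
        )
        # Every token of this period counts as known vocabulary from now on.
        seen.update(counts)
    return result
```

Because no earlier data exists for the first period, every sufficiently frequent term is reported as new there; in practice that first slice would serve as a baseline rather than as a finding.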
The presenters in this session (Gilardi, Wueest and Giovanoli; Amsler and Schneider) will, thus, propose and evaluate approaches that deal with trends in different ways.
Session 4 Enhancing Text Classification

This session will present and discuss new approaches to classify textual content. Classification is one of the most frequent tasks in content analysis, including in the social sciences. An important issue in this area of research is the frequent mismatch between the researcher's theoretical expectations and the results of unsupervised text classifications. While inductively generated text classifications are statistically sound, they often deviate considerably from the researcher's conception of the structure of the data. Supervised classifications, in contrast, may suffer from poor predictive robustness if the classes strongly confound the statistical properties of the data. The first presentation by Boyd-Graber and Hu discusses a specific model that reconciles the potential conflict between theoretical expectations and statistical predictions. The second presentation by Weidmann and Croicu presents an application and extensive evaluation of a supervised classification on a large newswire corpus.

Session 5 Data-driven vs. Annotation-driven Text Mining

The participants of this session (Wettstein and Wirth; Salway) are invited to engage with a fundamental question of content analysis in the social sciences, namely whether we should approach text mining in a deductive or rather an inductive way. Most social scientists expect manual approaches to the quantification of content to remain indispensable for some tasks, at least in the near future. The question thus arises whether and how computational models can support human-generated data collections. An opposite perspective argues for a largely data-driven content analysis. The idea here is to automatically augment representations of text content until results come close to the concepts social scientists want to explore. We expect that the comparison of these two perspectives will lead to a particularly fruitful exchange on the possibilities and constraints of automated content analyses.
Session 6 Actor-level Sentiment

The identification of tonality in language is essential for many social scientific research questions, above all for analyses of political rhetoric and discourse. For many such applications, however, sentiment measures are only valuable if they can be attributed to political actors. In most cases, this involves the detection of sentiment at the level of statements and a model relating this sentiment to the speakers communicating them. Among the pressing questions for this session are thus a) how tonality can be measured at the level of single statements such as sentences and speech acts and b) how this tonality can be related to speakers and addressees so that information on the intensity of political conflict can be generated. Presenters in this session are Haselmayer and Jenny as well as (tba).

Session 7 Text Scaling / Document-level sentiment

Research in the social sciences has already brought forward an impressive array of approaches that aim to locate text on latent scales such as ideological dimensions or document-level sentiment. These efforts have developed largely independently from similar advances in computational linguistics, which means the potential for an interdisciplinary exchange seems especially large in this area. The presentation by Lowe will provide the most recent advancements in this field from the social scientific perspective. Ralf Steinberger will complement the session by showing how computational linguists generalize such approaches to the study of trends over time, across different languages, and in different media.
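To make the idea of locating text on a latent scale more concrete, the sketch below scores words from reference texts with known positions and then averages those word scores over a new text. It is loosely inspired by reference-text scoring approaches such as Wordscores and is an assumption made for illustration; the program does not name the methods the speakers will present. The function names `word_scores` and `scale_text` and the whitespace tokenization are hypothetical simplifications, and the sketch omits any rescaling of the resulting text scores.

```python
from collections import Counter
from typing import Dict, Optional


def word_scores(
    reference_texts: Dict[str, str], reference_positions: Dict[str, float]
) -> Dict[str, float]:
    """Score each word as the mean of the reference positions, weighted by the
    word's relative frequency in each reference text."""
    rel_freq = {}
    for name, text in reference_texts.items():
        tokens = text.lower().split()
        counts = Counter(tokens)
        rel_freq[name] = {w: c / len(tokens) for w, c in counts.items()}
    vocab = set().union(*rel_freq.values())
    scores = {}
    for w in vocab:
        # total > 0 because every word in vocab occurs in some reference text.
        total = sum(freqs.get(w, 0.0) for freqs in rel_freq.values())
        scores[w] = sum(
            freqs.get(w, 0.0) / total * reference_positions[name]
            for name, freqs in rel_freq.items()
        )
    return scores


def scale_text(text: str, scores: Dict[str, float]) -> Optional[float]:
    """Mean score of the scored words in a text, or None if none are scored."""
    tokens = [t for t in text.lower().split() if t in scores]
    if not tokens:
        return None
    return sum(scores[t] for t in tokens) / len(tokens)
```

Words that occur equally often on both sides of the scale receive a score near the midpoint of the reference positions, so a text's estimated position is driven by the words that discriminate between the reference texts.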
More informationCrime Pattern Analysis
Crime Pattern Analysis Megaputer Case Study in Text Mining Vijay Kollepara Sergei Ananyan www.megaputer.com Megaputer Intelligence 120 West Seventh Street, Suite 310 Bloomington, IN 47404 USA +1 812-330-01
More informationConquering the Astronomical Data Flood through Machine
Conquering the Astronomical Data Flood through Machine Learning and Citizen Science Kirk Borne George Mason University School of Physics, Astronomy, & Computational Sciences http://spacs.gmu.edu/ The Problem:
More informationHonorary Fellow of the Amsterdam School of Communication Research (ASCoR), University of Amsterdam, The Netherlands
Klaus Schönbach Chair of General Communication Science, Department of Communication, University of Vienna, Austria Honorary Professor of Zeppelin University, Friedrichshafen, Germany Honorary Fellow of
More informationInformation Need Assessment in Information Retrieval
Information Need Assessment in Information Retrieval Beyond Lists and Queries Frank Wissbrock Department of Computer Science Paderborn University, Germany frankw@upb.de Abstract. The goal of every information
More informationInternet of Things, data management for healthcare applications. Ontology and automatic classifications
Internet of Things, data management for healthcare applications. Ontology and automatic classifications Inge.Krogstad@nor.sas.com SAS Institute Norway Different challenges same opportunities! Data capture
More informationNatural Language to Relational Query by Using Parsing Compiler
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 3, March 2015,
More informationThis Symposium brought to you by www.ttcus.com
This Symposium brought to you by www.ttcus.com Linkedin/Group: Technology Training Corporation @Techtrain Technology Training Corporation www.ttcus.com Big Data Analytics as a Service (BDAaaS) Big Data
More informationHow to prepare and submit a proposal for EARLI 2015
How to prepare and submit a proposal for EARLI 2015 If you intend to contribute to the scientific programme of EARLI 2015, you have to choose between various conference formats, which are introduced in
More informationThe Six Critical Considerations of Social Media Threat Intelligence
The Six Critical Considerations of Social Media Threat Intelligence Every day, angry rhetoric and hints of potential danger flow though streams of social media data. Some of these threats may affect your
More informationMiracle Integrating Knowledge Management and Business Intelligence
ALLGEMEINE FORST UND JAGDZEITUNG (ISSN: 0002-5852) Available online www.sauerlander-verlag.com/ Miracle Integrating Knowledge Management and Business Intelligence Nursel van der Haas Technical University
More informationStudy program International Communication (120 ЕCTS)
Study program International Communication (120 ЕCTS) Faculty Cycle Languages, Cultures and Communications Postgraduate ECTS 120 Offered in Skopje Description of the program The International Communication
More informationNSF Workshop on Big Data Security and Privacy
NSF Workshop on Big Data Security and Privacy Report Summary Bhavani Thuraisingham The University of Texas at Dallas (UTD) February 19, 2015 Acknowledgement NSF SaTC Program for support Chris Clifton and
More informationSentiment Analysis on Big Data
SPAN White Paper!? Sentiment Analysis on Big Data Machine Learning Approach Several sources on the web provide deep insight about people s opinions on the products and services of various companies. Social
More informationThe Knowledge Sharing Infrastructure KSI. Steven Krauwer
The Knowledge Sharing Infrastructure KSI Steven Krauwer 1 Why a KSI? Building or using a complex installation requires specialized skills and expertise. CLARIN is no exception. CLARIN is populated with
More informationWhy are Organizations Interested?
SAS Text Analytics Mary-Elizabeth ( M-E ) Eddlestone SAS Customer Loyalty M-E.Eddlestone@sas.com +1 (607) 256-7929 Why are Organizations Interested? Text Analytics 2009: User Perspectives on Solutions
More informationVolume 2, Issue 12, December 2014 International Journal of Advance Research in Computer Science and Management Studies
Volume 2, Issue 12, December 2014 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online at: www.ijarcsms.com
More informationAssessing speaking in the revised FCE Nick Saville and Peter Hargreaves
Assessing speaking in the revised FCE Nick Saville and Peter Hargreaves This paper describes the Speaking Test which forms part of the revised First Certificate of English (FCE) examination produced by
More informationAzure Machine Learning, SQL Data Mining and R
Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:
More informationRethinking Sentiment Analysis in the News: from Theory to Practice and back
1 Rethinking Sentiment Analysis in the News: from Theory to Practice and back Alexandra Balahur 1,2, Ralf Steinberger 1 1 European Commission Joint Research Centre 2 University of Alicante, Department
More informationClustering Connectionist and Statistical Language Processing
Clustering Connectionist and Statistical Language Processing Frank Keller keller@coli.uni-sb.de Computerlinguistik Universität des Saarlandes Clustering p.1/21 Overview clustering vs. classification supervised
More informationContext Aware Predictive Analytics: Motivation, Potential, Challenges
Context Aware Predictive Analytics: Motivation, Potential, Challenges Mykola Pechenizkiy Seminar 31 October 2011 University of Bournemouth, England http://www.win.tue.nl/~mpechen/projects/capa Outline
More informationBing Liu. Web Data Mining. Exploring Hyperlinks, Contents, and Usage Data. With 177 Figures. ~ Spring~r
Bing Liu Web Data Mining Exploring Hyperlinks, Contents, and Usage Data With 177 Figures ~ Spring~r Table of Contents 1. Introduction.. 1 1.1. What is the World Wide Web? 1 1.2. ABrief History of the Web
More informationComputer-Based Text- and Data Analysis Technologies and Applications. Mark Cieliebak 9.6.2015
Computer-Based Text- and Data Analysis Technologies and Applications Mark Cieliebak 9.6.2015 Data Scientist analyze Data Library use 2 About Me Mark Cieliebak + Software Engineer & Data Scientist + PhD
More information