Competitiveness Grant, Business Cases Report: Daniel Hardt

Business Cases: Workshops on Big Data and Language Technology

Copenhagen Workshop: Business Case Summaries

LT in Business: How a Jeopardy-winning Machine can Make the World a Better Place. Kim Escherich (IBM). Presented the technology behind the system that created a sensation by winning the game show Jeopardy, by exploiting massive computer power and advanced AI and NLP technology. Discussed a variety of IBM plans for exploiting the technology in areas such as Health Care, Finance, and Smart Cities.

Using Social Media to Restore Trust. Thomas Heilskov, Head of Online Media and PR, Danske Bank. Presented several public relations crises experienced by Danske Bank in social media, including the public outcry over the launch of a policy of dividing customers into segments based on their value to the bank. He described the need for Buzz Monitoring, Sentiment Analysis and other technologies exploiting NLP.

New York Workshop: Business Case Summaries

Big Data and Language Technology. Bernardo Huberman, Research Director, HP. Summarized a variety of research directions being pursued at HP's research labs, including results on successfully predicting box office receipts for major Hollywood films, and studying the motivation for postings of online reviews.

Automatic Question Answering and Information Extraction for the Real-Time Web. Mohamed Al Tantawy, Co-Founder, Agolo. Presented the technology and business model behind a New York-based startup that exploits NLP technology stemming from research at Columbia University.
Competitiveness Grant, Workshop Conclusions Report: Daniel Hardt

Conclusions: Workshops on Big Data and Language Technology

General Conclusions

The Danish workshop demonstrated that there are many compelling business cases involving Language Technology with Danish firms. In particular, this was made abundantly clear in presentations by Infomedia, Danske Bank, Telia, Mærsk, and IBM Denmark. Breakout group discussions provided more detail on specific challenges facing these companies, and how they can be addressed with Language Technology.

The U.S. workshop provided similarly compelling examples of the challenges facing many U.S. firms, and the relevance of Language Technology to these challenges. These examples ranged from new companies like Agolo, where Language Technology is central to their entire business plan, to iconic U.S. companies like Wrigley, facing new challenges in analyzing massive amounts of text data. Below we discuss different ways we are beginning to address these issues.

Current Initiatives

Sentiment Analysis Workshop

The technique of Sentiment Analysis (automatically identifying positive or negative sentiments in a given text) emerged as a technology of great interest for many of the firms in both workshops. We will be hosting a practical, hands-on workshop in which participants will build Sentiment Analysis systems, compare results, analyze different approaches, and attempt to advance the state of the art. There will be a particular focus on the Danish language. The Workshop will culminate in a high-profile public presentation in late February 2014.

Infomedia

The media monitoring firm Infomedia deals with large amounts of text in virtually all of its major activities: it selects relevant texts for its clients, analyzes those texts for sentiment, and provides translations and summaries. All of these are extremely active areas in Language Technology research.
We have established research initiatives with Infomedia, have already performed initial analyses of some of their data, and have built a proof-of-concept Sentiment Analysis system that compares favorably with their own manually produced Sentiment Analysis. In early 2014 we intend to apply for an Industrial Ph.D. for Julie Wulff, a participant in the two workshops. We have also made plans to scale up the analysis and development work already undertaken.

Larger Research Project

We are exploring possibilities for a larger-scale research project, focusing on some of the main challenges identified in the workshops, such as the following:

Sentiment Analysis: providing reliable information about sentiment in text data in a way that is relevant for a firm's strategic interests. Danske Bank provides a clear illustration of the need for this.

General Text Analytics: Language Technology provides techniques for automatically analyzing, filtering and organizing the vast amounts of text data that today's firms need to respond to. A central challenge here is to fine-tune these techniques in ways that address the actual strategic interests of firms.

Impact: why do postings have the impact that they do? This is a question of great interest to any organization where a social media presence is a priority. It is a topic pursued in the workshops with The Danish National Gallery and The Danish Cancer Society, and we have developed an array of techniques for investigating this topic, which has become crucial to so many organizations.
Competitiveness Grant, Research Report: Daniel Hardt

Research Report: Workshops on Big Data and Language Technology

Copenhagen Workshop: Research Presentation Summaries

Detecting Situational Influence in Online Discussion. Kathleen McKeown. Director of Institute for Data Sciences and Engineering, Columbia University. Presented research using machine learning techniques to automatically identify the most influential participants in online discussions.

Automatic Identification of Fake Online Reviews. Claire Cardie. Professor of Computer Science, Cornell University. Investigated the problem of identifying fake online reviews, a growing problem for many sites, such as TripAdvisor. The problem is shown to be quite difficult for human judges; automatic techniques achieved human-level accuracy.

Manifestations of Power in Written Interaction. Owen Rambow. Research Scientist, Center for Computational Learning Systems, Columbia University. Used the ENRON corpus to investigate types of power that can be observed and automatically identified in dialogues.

New York Workshop: Research Presentations

The Art of Impact. Ida Sofie Brolund and Julie Wulff. Research Assistants, CBS. Investigated factors influencing impact of Facebook postings for two organizations: The National Gallery of Denmark and The Danish Cancer Society.

Reordering Without Limits. Jakob Elming, Assistant Professor, Copenhagen University. Presented results on a new approach to word order variations in Statistical Machine Translation.

Sentiment Analysis and Business Strategy at Danske Bank. Anders Boje Larsen, Master's Student, CBS. Developed techniques for acquiring Facebook data relevant to Danske Bank's strategic business interests, and performed various forms of analysis, including sentiment analysis and the linking of Facebook data with internal Danske Bank data.