RAI ACTIVE NEWS: INTEGRATING KNOWLEDGE IN NEWSROOMS
M. Montagnuolo and A. Messina
RAI Radiotelevisione Italiana, Italy

ABSTRACT

In the modern digital age, methodologies for professional Big Data analytics are a strategic necessity: they make information resources available to professional journalists and media producers more effectively and efficiently, and they enable new forms of production such as data-driven journalism. The challenge lies in collecting, connecting and presenting heterogeneous content streams that are accessible through different sources, such as digital TV, the Internet, social networks and media archives, and published through different media modalities, such as audio, speech, text and video, in an organic and semantics-driven way. Rai Active News is a portal for professional information services that addresses these challenges with a uniform and holistic approach. At the core of the system is a set of artificial intelligence techniques and advanced statistical tools that automate tasks such as information extraction and multimedia content analysis, aimed at discovering semantic links between resources and providing users with text, graphics and video news organized according to their individual interests.

INTRODUCTION

The exponential growth in the availability of digital resources is enabling new forms of content creation, sharing and delivery. Methodologies for aggregating and presenting heterogeneous content are needed to make these resources effective and easily available to end users. Here, the challenge lies in collecting, connecting and presenting data streams from different media sources, e.g. television, press and the Internet, and of different media types such as audio, speech, text and video. Rai Active News is a portal for professional information services that addresses these challenges with a uniform and holistic approach.
At the core of the system is a set of artificial intelligence techniques and advanced statistical tools that automate tasks such as information extraction and multimedia content analysis, aimed at discovering semantic links between resources and providing users with text, graphics and video news organized according to their individual interests. The system allows users to define customized search profiles that are automatically and dynamically updated with relevant content found in the monitored information sources, which include Web feeds, television channels, specialized circuits such as the Eurovision News Exchange Network (EVN), and legacy archives. The system also provides a recommendation service, based on the analysis of social activity in blogs, which dynamically models the user's interests and exploits them to recommend appropriate media content.
This paper presents the underlying infrastructure and technology on which Rai Active News is built. The paper is organized as follows. First, we give an overview of the challenges and opportunities for knowledge integration in modern news production workflows. Then we describe the architecture developed to address these needs, as well as the services built on top of it. Finally, we conclude with considerations for future work.

CHALLENGES FOR KNOWLEDGE INTEGRATION IN NEWS PRODUCTION

News production is a complex and dynamic process which requires creating content at a fast pace and using a wide variety of media, including written texts, images, speech and videos. In the past, the life cycle of news items (from sourcing of content to distribution and consumption of products) was typically linear and isolated; nowadays we are dealing with more dynamic and interactive ways to produce, publish and consume news items. Thanks to the proliferation of social networks and open data sources, not only professional journalists but also individuals are taking part in the so-called data journalism phenomenon. Data journalism is the process of collecting, filtering and structuring big data for storytelling and reporting. In this context the event, i.e. any relevant fact happening at some time and place, and the topic, i.e. real-world entities or things like people, organizations, places or themes, become the basic units around which content is produced and organized. A topic can include events that are closed in time as well as events that are still open. As an example, the topic of a naval disaster might contain events about the shipwreck, the rescue operations or the ship demolition. The task of data journalism is to extract, track and visualize such hidden information from available data.
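The relationship between topics and events described above can be sketched as a small data model. This is only an illustrative sketch: the class names `Event` and `Topic`, their fields, and the dates in the example are assumptions made here and are not taken from the Active News system.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Event:
    """A relevant fact happening at some time and place."""
    title: str
    start: date
    end: Optional[date] = None  # None means the event is still open

    @property
    def is_open(self) -> bool:
        return self.end is None


@dataclass
class Topic:
    """A real-world entity or theme that groups related events."""
    name: str
    events: List[Event] = field(default_factory=list)

    def open_events(self) -> List[Event]:
        """Return the events of this topic that have not been closed yet."""
        return [e for e in self.events if e.is_open]


# Illustrative data only: the dates are invented for the example.
disaster = Topic("naval disaster", [
    Event("shipwreck", date(2020, 1, 1), date(2020, 1, 1)),
    Event("rescue operations", date(2020, 1, 1), date(2020, 1, 10)),
    Event("ship demolition", date(2020, 6, 1)),  # no end date: still open
])
open_titles = [e.title for e in disaster.open_events()]
print(open_titles)
```

Keeping the end date optional lets a single topic hold both closed and still-open events, which is the distinction the text draws.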
Machine learning, data mining and semantic Web techniques can be used to gather and analyze data in an unsupervised fashion, thus greatly improving productivity and efficiency through all the steps of the process. As an example, the News Storyline Ontology is a generic model to describe and organize the stories told by news organizations.

News aggregation aims at structuring content from newswires and broadcasts into topics. Publicly accessible aggregators, such as Google News or Yahoo! News, are becoming popular because they present the most recent news automatically aggregated and organized by, e.g., number of sources and categories, thus saving users the time they would otherwise spend finding information manually. A study of the impact of news aggregators on Internet news consumption is presented by Athey and Mobius (1). On the other hand, a critical problem with news aggregators is that the volume of generated information can overwhelm users. The challenge is to help users select and track not everything that is produced but just what they really need. To this purpose, recommendation engines aim to suggest potentially interesting content to users based on their historical activity and implicit or explicit feedback. Liu et al (2) present a personalized news recommendation system based on profiles learned from user activity in Google News. Similarly, Li et al (3) describe a system that selects articles from Yahoo! News.

Although the cited works are interesting attempts to realize intelligent and personalized news consumption, they still present drawbacks that hinder their application to massive production environments. In particular, most of the approaches presented so far lack temporal depth and semantic organization of data, so it is difficult for users to consume information effectively according to preferred criteria such as time span, information correlation and trends. To overcome this drawback, Leban et al (4) describe a system to identify world events from multilingual news articles. Detected events are searchable by means of dates, entities and graphs. However, only textual resources from the Web are considered. As a consequence, processing is based on unimodal and single-domain resources only, i.e. further modalities such as speech or visual content and different domains such as TV broadcasts or digital libraries are not considered. On the other hand, logically integrating content from multiple modalities and from different service providers can enhance user satisfaction during content consumption.

ACTIVE NEWS SYSTEM ARCHITECTURE

The Rai Active News architecture is shown in Figure 1. The pipeline is made of four modules: data ingestion, which is responsible for collecting news items from the registered data sources; data preprocessing, where several multimodal content analysis techniques are used to extract valuable information from the input streams; data clustering and warehousing, where news items about the same event are aggregated and semantically annotated; data index and access, where the news events and related metadata are made available for browsing, searching, retrieval and export. The overall architecture was developed with open-source tools and integrated with in-house software.

Figure 1 - Pipeline for the Active News system.

Data Ingestion

Input data are ingested from RSS feeds, users' blogs, DTT channels and other services such as the Rai archives and the EVN platform. The complexity introduced here lies primarily in the need to detect and process heterogeneous information automatically and in real time, i.e. 24 hours/day and 365 days/year, while fulfilling the users' expectations in terms of functionality and quality.
The RSS feeds input pipeline collects Web articles from about 60 national press agencies and online newspapers, processing more than 2,000 articles per day. At the time of writing, about 4,500,000 articles had been processed and stored in a PostgreSQL database running on a Linux server. An RSS feed contains a list of news article URLs and some associated metadata, such as title, author, description, publication date, enclosed elements, etc. Articles that either are new or, if already present in the database, have a more recent publication date are added to a list of article URLs to be downloaded. Title, enclosed images and publication date are also stored in the database, if included in the corresponding RSS item. Blog posts are collected from the preferred blog feeds submitted by registered users.

The daily programming of the major national TV channels is acquired from the DTT broadcasts and monitored to detect newscast programmes. Video elements indicating the starting and ending points of newscasts are used as visual prototypes to be automatically searched for in the acquired TV streams. Metadata about the detected newscasts, e.g. broadcaster name, channel, start date/time, end date/time and title, are also stored in the database.

The legacy archive input stream is connected to the Rai internal multimedia database management system, to collect metadata about historical Rai television and radio programmes. The last input source in the current version of Active News is the EVN, the European Broadcasting Union (EBU) distribution platform for sharing audiovisual news footage and metadata between its member organizations.

Data Pre-processing

Ingested content is processed by content analysis techniques in order to characterize its textual and audiovisual features analytically. The Web articles linked by the tracked RSS item URLs are periodically fetched and downloaded. The process is governed by a statistical priority queue scheduler in order to balance the number of items per feed at each processing cycle. The complete HTML pages pass through a cleaning step that removes undesired content such as scripts, footers, styles, advertisements, etc. The cleaning process is based on the jusText tool, whose outputs are plain text files containing the main content of the news articles. The Apache OpenNLP toolkit is then used to perform sentence detection, tokenization, part-of-speech (POS) tagging and named entity detection on the extracted texts. Automatic classification is applied to categorize the articles according to the journalistic taxonomy used by Rai archivists. All this information is stored in the database as part of the RSS item metadata.
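The selection rule used in the RSS ingestion step (an item is queued for download when it is new, or when it carries a more recent publication date than the stored copy) can be sketched as follows. This is a minimal sketch under stated assumptions: `FeedItem`, `select_for_download` and the in-memory `stored` dictionary are hypothetical names standing in for the parsed RSS items and the database lookup, and the URLs are placeholders.

```python
from datetime import datetime
from typing import Dict, Iterable, List, NamedTuple


class FeedItem(NamedTuple):
    """Hypothetical view of one RSS item: article URL and publication date."""
    url: str
    published: datetime


def select_for_download(items: Iterable[FeedItem],
                        stored: Dict[str, datetime]) -> List[str]:
    """Return URLs that are new or newer than the stored publication date."""
    to_fetch = []
    for item in items:
        known = stored.get(item.url)  # None when the URL was never seen
        if known is None or item.published > known:
            to_fetch.append(item.url)
    return to_fetch


# Assumed stand-in for the database: URL -> stored publication date.
stored = {
    "https://example.org/a": datetime(2015, 5, 1, 9, 0),
    "https://example.org/c": datetime(2015, 5, 1, 8, 0),
}
feed = [
    FeedItem("https://example.org/a", datetime(2015, 5, 1, 9, 0)),   # unchanged
    FeedItem("https://example.org/b", datetime(2015, 5, 1, 10, 0)),  # new
    FeedItem("https://example.org/c", datetime(2015, 5, 1, 11, 0)),  # updated
]
queue = select_for_download(feed, stored)
print(queue)
```

Comparing publication dates rather than article bodies keeps the check cheap, since it only needs the metadata already present in the feed.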
Similarly, blog posts by users are converted to an RSS format and treated accordingly.

Newscast programmes are automatically segmented into stories. Story boundary detection is based on the probabilistic cluster model originally proposed by Di Iulio and Messina (5), because of its demonstrated robustness against different newscast editorial formats. The core algorithm adopts a multimodal approach based on merging information from (audio) speaker clustering and (visual) shot clustering. Detected stories are transcribed by multilingual speech recognition software. A timestamp is associated with each transcribed word, so that it is possible to find the exact position of each spoken word within the story timeline. TV news stories are classified using the same taxonomy as the Web articles, thus providing additional semantic information that can be used for further filtering and data mining.

Data Clustering and Warehousing

The data clustering module implements a bottom-up approach to identify events and topics. First, RSS items and TV news stories about the same event are grouped together. Since news items about the same event are typically produced and published or broadcast in close time proximity, a time window is used to limit the overall amount of processed data. At the core of the method is the hierarchical, asymmetric co-clustering algorithm described by Messina and Montagnuolo (6). Each RSS item is represented by the set of nouns and proper nouns extracted by the POS tagging process from its title and body content. This word set is matched with the speech transcriptions of the TV news stories, resulting in a vector in which each element is the relatedness of a TV news story to the RSS item. Based on the computed vector, a pseudo-semantic affinity (6) is calculated to cluster RSS items. Because of the asymmetric nature of the algorithm, cause and effect relations between the clustered elements can be defined. As a result, an event is modeled as a directed graph; nodes represent RSS items labeled by the titles of the corresponding Web articles, and edges are relations between them, such as entailments (i.e. inbound arrows) and equivalences (i.e. bi-directional arrows). Node sizes and edges are determined based on (6). Of particular interest is the in-degree, which denotes the representativeness of the RSS item w.r.t. the event. This criterion is based on the observation that the higher the in-degree of a node, the higher the number of nodes that are semantically entailed by it, so that the corresponding article content is expected to be the most complete w.r.t. the semantics of the cluster. Event title and description are determined according to this criterion.

As an example, the Expo Milano 2015 inauguration concert took place in Piazza del Duomo on the night of 30th April. This was one of the events scheduled to celebrate the opening of the Universal Exposition. Figure 2 shows how the event is represented in the Active News system. The body of the representative article (i.e. the green node in the graph) and some of the included TV stories are also shown. Information about detected events, e.g. included RSS items and news stories, overall categories and entities, is stored in the database as hypermedia dossiers.

Figure 2 - Graph-based representation of an event. Nodes are Web articles. Edges are relations between them, weighted based on their shared TV news.

In order to group events related to the same topic, we perform a second-level clustering based on the nouns and named entities in the event clusters. The method is based on the assumption that the more nouns, persons, organizations and locations are shared among events, the higher the probability that they refer to the same topic. Figure 3 shows an example. The overall topic is the opening of Expo Milano 2015. Nodes in the graph are related events labeled with their representative title. In particular, three main groups of events can be identified, namely the opening ceremony show, the speech of the Italian Prime Minister, and the riots that occurred in the city. Among the opening ceremony nodes, the yellow node is related to the concert, as already detailed in Figure 2. Co-occurring nouns and entities are also shown. The clustering process runs every two hours, thus resulting in more than 10 daily updates. This service level agreement was judged to be acceptable by a panel of Rai's journalists and editors.
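The two ideas above, picking the representative article of an event by in-degree and grouping events into topics by shared nouns and entities, can be illustrated with a short sketch. It is not the co-clustering algorithm of (6): the Jaccard overlap, the 0.3 threshold, the single-link grouping and all function names are assumptions chosen only to make the reasoning concrete.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple


def representative(edges: Iterable[Tuple[str, str]]) -> str:
    """Return the node with the highest in-degree.

    An edge (src, dst) is an arrow pointing into dst, read here as
    "src is entailed by dst". Expects at least one edge.
    """
    indegree = Counter(dst for _src, dst in edges)
    return indegree.most_common(1)[0][0]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of terms two events have in common (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_events(terms: Dict[str, Set[str]],
                 threshold: float = 0.3) -> List[Set[str]]:
    """Single-link grouping: events sharing enough terms join the same topic."""
    parent = {name: name for name in terms}

    def find(x: str) -> str:
        # Follow parent links to the group root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    names = list(terms)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if jaccard(terms[a], terms[b]) >= threshold:
                parent[find(a)] = find(b)

    groups: Dict[str, Set[str]] = {}
    for name in names:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Invented example data: article ids and event term sets are placeholders.
edges = [("a2", "a1"), ("a3", "a1"), ("a1", "a3")]
best = representative(edges)

events = {
    "concert": {"expo", "milano", "concert", "duomo"},
    "ceremony": {"expo", "milano", "ceremony", "opening"},
    "football": {"match", "goal", "league"},
}
topics = group_events(events)
print(best, topics)
```

In this example the two Expo events share two of six distinct terms, which is above the assumed threshold, so they end up in one topic while the unrelated event stays alone.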
Figure 3 - Example of a topic and related events.

The data warehouse (DW) module allows users to create customized statistical and analytical reports about the detected events. The system was designed based on the Dimensional Fact Model (DFM) principle, integrating data available from the internal database with external data, such as TV audience scores/categories and social network ratings, e.g. likes, dislikes, shares, etc. Cross-provider statistics, e.g. correlations between TV channels and RSS providers, are also identified. These statistics are stored in a PostgreSQL database and are accessible through a Web-based dashboard.

Personalized multimedia news recommendations are generated based on the users' interests. Personal interests are inferred by latent semantic analysis of the users' blog posts, assuming that what they write can be interpreted as a reasonably good approximation of their interests. Further details on the core algorithm are given in Di Massa et al (7). Generated recommendations are distributed according to the MPEG-21 User Description (MPEG-21 UD) standard, thus ensuring interoperability and enabling the seamless integration of heterogeneous data sources including user-generated content, i.e. blogs, and professional information items, i.e. online newspapers, press services and TV channels. Experiments with a panel of users demonstrated the quality of the method, achieving a precision of about 91% for the proposed recommendations.

Data Index and Access

Data processed and generated by the Active News system are maintained in two DBMSs, i.e. the core database and the DW database. Different full-text search indexes are constantly updated to provide a scalable, flexible and fast search and retrieval interface to input/output data and metadata. RESTful APIs and an HTML5 Web portal were developed to support search, filtering, browsing and export of data. The search interface enables users to look for information covering international, national or local news. Listed on the home page is a selection of search profiles to which users can subscribe (see Figure 5). New profiles can be created by entering a name for the new profile and a list of concepts (entities, keywords or phrases). Once a profile is selected, or a new one is created, the interface shows the list of search results organized in different tabs, each corresponding to one search index. The list of results can be filtered by date and sorted by date or relevance to the profile. Figure 4 shows the TV news search results for the profile Expo Milano 2015 in the last year.

Figure 5 - Active News home page.

In order to give the global picture of the context of the profile, different panels are displayed. The panels are interactive, so that, when one is selected, data in the visualization is filtered and updated accordingly. Information in each panel can be exported in CSV, XML or JSON format, enabling integration with the most common content management systems. The timeline shows the distribution of the news over time. Peaks of information correspond to related events (see e.g. the peak around May 1st, 2015, the opening day of the exposition). The visualization of categories shows the categorization of the profile news w.r.t. the adopted taxonomy.

Figure 4 - Example of search results for the profile Expo Milano 2015.
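The timeline panel described above amounts to counting news items per day and looking for peaks. A minimal sketch follows; the function names `timeline` and `peak_day` and the sample counts are assumptions for illustration, not part of the portal's API or of the data behind Figure 4.

```python
from collections import Counter
from datetime import date
from typing import Iterable, List, Tuple


def timeline(dates: Iterable[date]) -> List[Tuple[date, int]]:
    """Count news items per day, sorted chronologically."""
    return sorted(Counter(dates).items())


def peak_day(dates: Iterable[date]) -> date:
    """Return the day with the most news items (expects at least one date)."""
    return max(timeline(dates), key=lambda pair: pair[1])[0]


# Invented publication dates for a profile; the counts are placeholders.
published = (
    [date(2015, 4, 30)] * 2
    + [date(2015, 5, 1)] * 5
    + [date(2015, 5, 2)]
)
series = timeline(published)
peak = peak_day(published)
print(series, peak)
```

The sorted list of (day, count) pairs is also a convenient shape for export, since it maps directly onto rows of a CSV file or a JSON array.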
The most important entities are shown according to their relevance w.r.t. the profile (see Figure 6). The geolocation panel shows the coverage of the profile news around the world (see Figure 7).

Figure 6 - Locations and organizations for the profile Expo Milano 2015.

Figure 7 - World coverage of the profile news.

CONCLUSIONS

This paper described a novel architecture called Active News. The system uses multimodal content sources to aggregate, link and organize data by news topics. Semantic and statistical information about aggregated content is provided, thus addressing the requirements of modern data-driven journalism applications. The use of open source tools and the complete automation of the processing pipeline minimized the overall costs of the system, while meeting the required service levels. Research plans for the future include the integration of multilingual functionalities, the creation of a knowledge base to represent and share topic information and relations, and further retrieval facilities, such as visual search.

REFERENCES

1. Athey, S. and Mobius, M., The impact of News Aggregators on Internet News Consumption: the Case of Localization. Working paper, Harvard University.
2. Liu, J., Dolan, P. and Pedersen, E. R., Personalized news recommendation based on click behaviour. IUI '10: Proc. of the 14th Intl. Conf. on Intelligent user interfaces. pp. 31 to
3. Li, L., Chu, W., Langford, J. and Schapire, R. E., A contextual-bandit approach to personalized news article recommendation. Proc. of the 19th Intl. Conf. on World wide web (WWW '10), pp. 661 to
4. Leban, G., Fortuna, B., Brank, J. and Grobelnik, M., Event registry: learning about world events from news. Proc. of the companion publication of the 23rd Intl. Conf. on World wide web companion (WWW Companion '14), pp. 107 to
5. Di Iulio, M. and Messina, A., Use of Probabilistic Clusters Supports for Broadcast News Segmentation. DEXA Workshops, pp. 600 to
6. Messina, A. and Montagnuolo, M., Heterogeneous data co-clustering by pseudosemantic affinity functions. Proc. of the 2nd Italian information retrieval workshop.
7. Di Massa, R., Montagnuolo, M. and Messina, A., Implicit news recommendation based on user interest models and multimodal content analysis. In Proc. of the 3rd Intl. Workshop on Automated information extraction in media production. pp. 33 to 38.
Client Overview. Engagement Situation. Key Requirements
Client Overview Our client is one of the leading providers of business intelligence systems for customers especially in BFSI space that needs intensive data analysis of huge amounts of data for their decision
SIPAC. Signals and Data Identification, Processing, Analysis, and Classification
SIPAC Signals and Data Identification, Processing, Analysis, and Classification Framework for Mass Data Processing with Modules for Data Storage, Production and Configuration SIPAC key features SIPAC is
131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10
1/10 131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom
IBM Social Media Analytics
IBM Analyze social media data to improve business outcomes Highlights Grow your business by understanding consumer sentiment and optimizing marketing campaigns. Make better decisions and strategies across
Pragmatic Web 4.0. Towards an active and interactive Semantic Media Web. Fachtagung Semantische Technologien 26.-27. September 2013 HU Berlin
Pragmatic Web 4.0 Towards an active and interactive Semantic Media Web Prof. Dr. Adrian Paschke Arbeitsgruppe Corporate Semantic Web (AG-CSW) Institut für Informatik, Freie Universität Berlin [email protected]
Big Data for Investment Research Management
IDT Partners www.idtpartners.com Big Data for Investment Research Management Discover how IDT Partners helps Financial Services, Market Research, and Investment Management firms turn big data into actionable
ABSTRACT The World MINING 1.2.1 1.2.2. R. Vasudevan. Trichy. Page 9. usage mining. basic. processing. Web usage mining. Web. useful information
SSRG International Journal of Electronics and Communication Engineering (SSRG IJECE) volume 1 Issue 1 Feb Neural Networks and Web Mining R. Vasudevan Dept of ECE, M. A.M Engineering College Trichy. ABSTRACT
Information Technology Web Solution Services
Information Technology Web Solution Services Icetech 940 West North Avenue Baltimore, Maryland 21217 Tel: 410.225.3117 Fax: 410.225.3120 www. Icetech. net Hubzone Copyright @ 2012 Icetech, Inc. All rights
Understanding Web personalization with Web Usage Mining and its Application: Recommender System
Understanding Web personalization with Web Usage Mining and its Application: Recommender System Manoj Swami 1, Prof. Manasi Kulkarni 2 1 M.Tech (Computer-NIMS), VJTI, Mumbai. 2 Department of Computer Technology,
Digital Asset Management. Content Control for Valuable Media Assets
Digital Asset Management Content Control for Valuable Media Assets Overview Digital asset management is a core infrastructure requirement for media organizations and marketing departments that need to
SPATIAL DATA CLASSIFICATION AND DATA MINING
, pp.-40-44. Available online at http://www. bioinfo. in/contents. php?id=42 SPATIAL DATA CLASSIFICATION AND DATA MINING RATHI J.B. * AND PATIL A.D. Department of Computer Science & Engineering, Jawaharlal
ISSN: 2348 9510. A Review: Image Retrieval Using Web Multimedia Mining
A Review: Image Retrieval Using Web Multimedia Satish Bansal*, K K Yadav** *, **Assistant Professor Prestige Institute Of Management, Gwalior (MP), India Abstract Multimedia object include audio, video,
Analysis of Data Mining Concepts in Higher Education with Needs to Najran University
590 Analysis of Data Mining Concepts in Higher Education with Needs to Najran University Mohamed Hussain Tawarish 1, Farooqui Waseemuddin 2 Department of Computer Science, Najran Community College. Najran
Leveraging Global Media in the Age of Big Data
WHITE PAPER Leveraging Global Media in the Age of Big Data Introduction Global media has the power to shape our perceptions, influence our decisions, and make or break business reputations. No one in the
User Guide to the Content Analysis Tool
User Guide to the Content Analysis Tool User Guide To The Content Analysis Tool 1 Contents Introduction... 3 Setting Up a New Job... 3 The Dashboard... 7 Job Queue... 8 Completed Jobs List... 8 Job Details
Available online at www.sciencedirect.com Available online at www.sciencedirect.com. Advanced in Control Engineering and Information Science
Available online at www.sciencedirect.com Available online at www.sciencedirect.com Procedia Procedia Engineering Engineering 00 (2011) 15 (2011) 000 000 1822 1826 Procedia Engineering www.elsevier.com/locate/procedia
USER GUIDE Chapter 20 Using Podcasts. Schoolwires Academic Portal Version 4.1
USER GUIDE Chapter 20 Schoolwires Academic Portal Version 4.1 TABLE OF CONTENTS Introduction... 1 Adding a New Podcast Page... 3 Adding a New Episode... 5 Supported File Types... 5 What is an MP3 File?...
User Interface Design for a Content-aware Mobile Multimedia Application: An Iterative Approach 1)
User Interface Design for a Content-aware Mobile Multimedia Application: An Iterative Approach 1) Zahid Hussain, Martin Lechner, Harald Milchrahm, Sara Shahzad, Wolfgang Slany, Martin Umgeher, Thomas Vlk,
MLg. Big Data and Its Implication to Research Methodologies and Funding. Cornelia Caragea TARDIS 2014. November 7, 2014. Machine Learning Group
Big Data and Its Implication to Research Methodologies and Funding Cornelia Caragea TARDIS 2014 November 7, 2014 UNT Computer Science and Engineering Data Everywhere Lots of data is being collected and
Context Aware Predictive Analytics: Motivation, Potential, Challenges
Context Aware Predictive Analytics: Motivation, Potential, Challenges Mykola Pechenizkiy Seminar 31 October 2011 University of Bournemouth, England http://www.win.tue.nl/~mpechen/projects/capa Outline
Big Data Text Mining and Visualization. Anton Heijs
Copyright 2007 by Treparel Information Solutions BV. This report nor any part of it may be copied, circulated, quoted without prior written approval from Treparel7 Treparel Information Solutions BV Delftechpark
Bayesian networks - Time-series models - Apache Spark & Scala
Bayesian networks - Time-series models - Apache Spark & Scala Dr John Sandiford, CTO Bayes Server Data Science London Meetup - November 2014 1 Contents Introduction Bayesian networks Latent variables Anomaly
Collaborative Open Market to Place Objects at your Service
Collaborative Open Market to Place Objects at your Service D6.2.1 Developer SDK First Version D6.2.2 Developer IDE First Version D6.3.1 Cross-platform GUI for end-user Fist Version Project Acronym Project
ELPUB Digital Library v2.0. Application of semantic web technologies
ELPUB Digital Library v2.0 Application of semantic web technologies Anand BHATT a, and Bob MARTENS b a ABA-NET/Architexturez Imprints, New Delhi, India b Vienna University of Technology, Vienna, Austria
Business Intelligence in E-Learning
Business Intelligence in E-Learning (Case Study of Iran University of Science and Technology) Mohammad Hassan Falakmasir 1, Jafar Habibi 2, Shahrouz Moaven 1, Hassan Abolhassani 2 Department of Computer
Semantically Enhanced Web Personalization Approaches and Techniques
Semantically Enhanced Web Personalization Approaches and Techniques Dario Vuljani, Lidia Rovan, Mirta Baranovi Faculty of Electrical Engineering and Computing, University of Zagreb Unska 3, HR-10000 Zagreb,
Database Marketing, Business Intelligence and Knowledge Discovery
Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski
Selecting a Taxonomy Management Tool. Wendi Pohs InfoClear Consulting #SLATaxo
Selecting a Taxonomy Management Tool Wendi Pohs InfoClear Consulting #SLATaxo InfoClear Consulting What do we do? Content Analytics Strategy and Implementation, including: Taxonomy/Ontology development
City Data Pipeline. A System for Making Open Data Useful for Cities. [email protected]
City Data Pipeline A System for Making Open Data Useful for Cities Stefan Bischof 1,2, Axel Polleres 1, and Simon Sperl 1 1 Siemens AG Österreich, Siemensstraße 90, 1211 Vienna, Austria {bischof.stefan,axel.polleres,simon.sperl}@siemens.com
Draft Response for delivering DITA.xml.org DITAweb. Written by Mark Poston, Senior Technical Consultant, Mekon Ltd.
Draft Response for delivering DITA.xml.org DITAweb Written by Mark Poston, Senior Technical Consultant, Mekon Ltd. Contents Contents... 2 Background... 4 Introduction... 4 Mekon DITAweb... 5 Overview of
Bridging CAQDAS with text mining: Text analyst s toolbox for Big Data: Science in the Media Project
Bridging CAQDAS with text mining: Text analyst s toolbox for Big Data: Science in the Media Project Ahmet Suerdem Istanbul Bilgi University; LSE Methodology Dept. Science in the media project is funded
Social Media Monitoring: Engage121
Social Media Monitoring: Engage121 User s Guide Engage121 is a comprehensive social media management application. The best way to build and manage your community of interest is by engaging with each person
SURVEY REPORT DATA SCIENCE SOCIETY 2014
SURVEY REPORT DATA SCIENCE SOCIETY 2014 TABLE OF CONTENTS Contents About the Initiative 1 Report Summary 2 Participants Info 3 Participants Expertise 6 Suggested Discussion Topics 7 Selected Responses
Twitter Analytics: Architecture, Tools and Analysis
Twitter Analytics: Architecture, Tools and Analysis Rohan D.W Perera CERDEC Ft Monmouth, NJ 07703-5113 S. Anand, K. P. Subbalakshmi and R. Chandramouli Department of ECE, Stevens Institute Of Technology
Data processing goes big
Test report: Integration Big Data Edition Data processing goes big Dr. Götz Güttich Integration is a powerful set of tools to access, transform, move and synchronize data. With more than 450 connectors,
Recommender Systems: Content-based, Knowledge-based, Hybrid. Radek Pelánek
Recommender Systems: Content-based, Knowledge-based, Hybrid Radek Pelánek 2015 Today lecture, basic principles: content-based knowledge-based hybrid, choice of approach,... critiquing, explanations,...
Web Mining using Artificial Ant Colonies : A Survey
Web Mining using Artificial Ant Colonies : A Survey Richa Gupta Department of Computer Science University of Delhi ABSTRACT : Web mining has been very crucial to any organization as it provides useful
Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization
Turban, Aronson, and Liang Decision Support Systems and Intelligent Systems, Seventh Edition Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization
Increase Agility and Reduce Costs with a Logical Data Warehouse. February 2014
Increase Agility and Reduce Costs with a Logical Data Warehouse February 2014 Table of Contents Summary... 3 Data Virtualization & the Logical Data Warehouse... 4 What is a Logical Data Warehouse?... 4
Data Mining System, Functionalities and Applications: A Radical Review
Data Mining System, Functionalities and Applications: A Radical Review Dr. Poonam Chaudhary System Programmer, Kurukshetra University, Kurukshetra Abstract: Data Mining is the process of locating potentially
