Why are Organizations Interested?



SAS Text Analytics
Mary-Elizabeth ("M-E") Eddlestone
SAS Customer Loyalty
M-E.Eddlestone@sas.com
+1 (607) 256-7929

Why are Organizations Interested?
Text Analytics 2009: User Perspectives on Solutions and Providers, Seth Grimes

Integration of Risk Management
Source: Global Risk Management Survey, Sixth Edition, Deloitte, 2009

Unstructured and Semi-structured Data: The Dark Matter for IT
[Pie chart with segments labeled unstructured data, structured data, and semi-structured data; values shown: 25%, 5%, 70%]

Text Analytics Basics
Most everything people do with electronic documents falls into one of four classes:
1. Compose, publish, manage, and archive
2. Index and search
3. Categorize and classify according to metadata and contents
4. Summarize and extract information
Source: Text Analytics 2009, Seth Grimes, An Alta Plana Research Study

How do you know?
"Vice Chancellor Samuel Ray Jones of North Carolina State University announced that his left arm had been severed accidently in a bazaar incident as he left his vehicel."

SAS Text Analytics
- Information Organization and Access
- Predictive Modeling, Discover Trends and Patterns
Products: SAS Enterprise Content Categorization, SAS Ontology Management, SAS Text Miner, SAS Sentiment Analysis

SAS Text Analytics: Integration of Text Mining, Sentiment Analysis and Content Categorization

SAS Text Miner
- Explore large volumes of text: Concept Linking, Clustering
- Merge with structured data for: Segment Profiling, Prediction
- Natural Language Processing: Part-of-speech tagging, Stemming, Tokenization, Phrase Recognition, Entity Extraction
- 30+ languages

SAS Sentiment Analysis
- Identifies overall and feature-level sentiment
- Combines statistical models and business rules
- Automatically scores sentiment of new documents

SAS Content Categorization
- Adds metadata to content for easier search and retrieval
- Builds taxonomies through a rules engine
- Automatically categorizes incoming documents

Language Detection

Language              Frequency  Percent  Cumulative Frequency  Cumulative Percent
Arabic                        2     0.07                     2                0.07
Chinese (simplified)          5     0.18                     9                0.33
Danish                        3     0.11                    12                0.44
Dutch                        20     0.73                    32                1.16
English                    2398    86.98                  2430               88.14
French                       19     0.69                  2449               88.83
German                       20     0.73                  2469               89.55
Italian                      10     0.36                  2479               89.92
Japanese                     32     1.16                  2511               91.08
Korean                       35     1.27                  2546               92.35
Norwegian                     2     0.07                  2548               92.42
Polish                        1     0.04                  2549               92.46
Portuguese                  131     4.75                  2680               97.21
Spanish                      75     2.72                  2755               99.93
Swedish                       2     0.07                  2757              100

What is Content Categorization?
- Often used in conjunction with enabling better SEARCH. More relevant search is facilitated by creating taxonomies for content, associating metadata with the content, and automating the process to increase findability.
- Consistency with automation: content tagging is often manual, redundant, and error-prone.
- Classifying, tracking, and reporting of topics:
  - How many documents were classified in these topic areas? Or mention these people or places?
  - How many times are drugs mentioned with these side-effects? Is this changing over time?
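The reporting questions above (how many documents fall in each topic area) can be illustrated with a minimal rule-based tagger. This is only a sketch: the topic names, keyword rules, and sample documents are invented for illustration, and nothing here reflects how SAS Content Categorization actually defines or evaluates its rules.

```python
import re
from collections import Counter

# Hypothetical taxonomy: topic -> keywords that trigger it (illustrative only).
RULES = {
    "banking": {"loan", "mortgage", "account"},
    "sports": {"match", "league", "goal"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def categorize(text, rules=RULES):
    """Return the sorted list of topics whose keywords appear in the text."""
    tokens = set(tokenize(text))
    return sorted(topic for topic, keywords in rules.items() if tokens & keywords)


def topic_report(docs, rules=RULES):
    """Count how many documents were tagged with each topic."""
    counts = Counter()
    for doc in docs:
        counts.update(categorize(doc, rules))
    return counts


docs = [
    "I opened a new account and applied for a loan.",
    "The league match ended with a late goal.",
    "My mortgage payment was late after the match.",
]
report = topic_report(docs)
for topic, n in sorted(report.items()):
    print(f"{topic}: {n} document(s)")
```

A document may receive several topics, so the per-topic counts can sum to more than the number of documents. Tracking the same counts per time period would address the "is this changing over time?" question.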

ECC Example - New York Times Topics Pages
- Automatically organize your content
- Increase Search Engine Optimization ranking
- Topics
- Automatic entities extraction
- Automatic categorization

Social Media = Noisy Data
Actual Content of Data Provided by Major Bank
[Chart categories: Retailer, Arts/Sports, Romanian, Jobs, Phishing, Actually About Bank]
Only 38% of the records pulled about the bank had anything to do with banking. Almost 58% of records were definitely in Romanian; this number could be as high as 90%, however.

Reporting on Categories
[Two slides with this title; figures not reproduced]

Ontology

Definition
- A mapping of relationships
- Way of organizing information across different fields or classification systems
- Means of creating shared vocabulary and generating consistency across units
- Integrating a collection of taxonomies

Potential Business Uses
- Consolidation of vocabularies across departments
- Mergers and acquisitions
- Enhancement of search
- Additional structure with metadata

Complexity of Ontologies
Ontologies range from simple taxonomies to highly tangled networks including constraints associated with concepts and relationships.

Light-weight
- concepts
- is-a hierarchy among concepts
- relations between concepts

Heavy-weight
- cardinality constraints
- taxonomy of relations
- axioms (restrictions)

Example: People Ontology
[Figure not reproduced]

SAS Ontology Management
An ontology is a formal representation of a set of concepts within a domain and the relationships between those concepts.

Ontologies contain:
- Classes are groups of objects: Categories or Concepts from Content Categorization projects.
- Slots are the metadata attributes: they link the rules across the concepts and are universal to each taxonomy, regardless of which project the taxonomy is stored in.
- Instances are specific objects: assigned to the various Categories and Concepts.
- Value Restrictions: allowable values of attributes and relationships (of slots).

Example: An Ontology for Dogs
- Classes: dog, poodle, terrier, collie, pit bull, chihuahua, ...
- Slots: fur color, fur length, size, number of legs, region of origin, ...
- Value restrictions (on slots):
  - Fur length = short, medium, long
  - Number of legs <= 4
- Instances: Lassie, Petey, Gidget

The Cat
The Vet and Grandma associate different views with the concept "cat".
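The dog ontology above maps naturally onto a small data structure. The following is a minimal sketch, not a representation of how SAS Ontology Management stores ontologies: the `Slot` and `Instance` types and the `violations` helper are hypothetical names, the assignment of Lassie to the collie class is an assumption (the slide does not say which class each instance belongs to), and "Rex" is an invented counterexample.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Optional


@dataclass(frozen=True)
class Slot:
    """A metadata attribute with optional value restrictions."""
    name: str
    allowed: Optional[FrozenSet[str]] = None  # enumerated values, if any
    maximum: Optional[int] = None             # numeric upper bound, if any

    def accepts(self, value) -> bool:
        """Check a value against whichever restrictions are defined."""
        if self.allowed is not None and value not in self.allowed:
            return False
        if self.maximum is not None:
            if not isinstance(value, int) or value > self.maximum:
                return False
        return True


@dataclass
class Instance:
    """A specific object assigned to a class, with slot values."""
    name: str
    cls: str
    values: Dict[str, object] = field(default_factory=dict)


# Classes and value restrictions taken from the slide.
CLASSES = {"dog", "poodle", "terrier", "collie", "pit bull", "chihuahua"}
SLOTS = {
    "fur length": Slot("fur length", allowed=frozenset({"short", "medium", "long"})),
    "number of legs": Slot("number of legs", maximum=4),
}


def violations(instance):
    """Return a list of restriction violations for one instance."""
    problems = []
    if instance.cls not in CLASSES:
        problems.append(f"unknown class: {instance.cls}")
    for slot_name, value in instance.values.items():
        slot = SLOTS.get(slot_name)
        if slot is None:
            problems.append(f"unknown slot: {slot_name}")
        elif not slot.accepts(value):
            problems.append(f"invalid value for {slot_name}: {value!r}")
    return problems


# Assumed class assignment for Lassie; "Rex" is a made-up invalid instance.
lassie = Instance("Lassie", "collie", {"fur length": "long", "number of legs": 4})
rex = Instance("Rex", "terrier", {"fur length": "fluffy", "number of legs": 5})
print(violations(lassie))
print(violations(rex))
```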

What is Sentiment Analysis?
A process that identifies, analyzes, and interprets the attitudes, opinions, and emotions in digital content.
- Statistical
- Rules-based
- Hybrid: leverages both advanced analytics and human expertise

How is Sentiment Analysis Used?
Often Sentiment Analysis is used in conjunction with evaluating the customer experience.

Unstructured Text
- Surveys
- Call center notes
- Social data
- Chat sessions

Hotel Experience
- Service, Value, Bathroom, Beds, Room Size, Lobby, Concierge, Restaurants, Check In / Out, Fitness Center

Structured Data
- Area of the Country: North, East, South, West
- Traveler Type: Business, Personal
- Hotel Type: Luxury, Standard, Economy
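To make the rules-based approach concrete, here is a minimal sketch that scores a hotel review overall and per feature. It covers only the rules side, not the statistical or hybrid approaches, and it is not SAS Sentiment Analysis: the polarity lexicon, the clause-splitting rule, and the sample review are all assumptions made for illustration. The feature names are a subset of the hotel features listed above.

```python
import re
from collections import defaultdict

# Hypothetical polarity lexicon (illustrative only).
POLARITY = {
    "great": 1, "clean": 1, "friendly": 1,
    "dirty": -1, "slow": -1, "rude": -1,
}
# A few of the hotel features named on the slide.
FEATURES = {"bathroom", "beds", "lobby", "concierge", "restaurants"}


def score_review(text):
    """Return (overall score, {feature: score}) for one review.

    Each clause is scored by summing the polarity of its words; that
    clause score is credited to every feature the clause mentions.
    """
    overall = 0
    by_feature = defaultdict(int)
    # Split on sentence punctuation and on the contrastive word "but".
    for clause in re.split(r"[.;!?]|\bbut\b", text.lower()):
        tokens = re.findall(r"[a-z]+", clause)
        clause_score = sum(POLARITY.get(t, 0) for t in tokens)
        overall += clause_score
        for t in tokens:
            if t in FEATURES:
                by_feature[t] += clause_score
    return overall, dict(by_feature)


review = "The lobby was great but the bathroom was dirty. The concierge was friendly."
overall, features = score_review(review)
print(overall, features)
```

The per-feature scores could then be joined to structured fields such as traveler type or hotel type, which is the kind of combination the slide describes.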

SAS Text Miner
Text Mining is the process of analyzing a corpus of documents, through Natural Language Processing and statistical methods, to uncover topics hidden within the documents.

Two General Goals of Text Mining

Exploration: uncovering hidden themes and key concepts
- Concept Linking
- Clustering

Prediction
- Classification: identify which input variables are most influential to the value of a target variable
- Scoring: derive a model or set of rules that produces a predicted target value for a given set of inputs

Identify and count word occurrences
[Figure not reproduced]

What are Concept Links?
The strength of association of two terms is computed and visually represented as a Concept Link.
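The two steps above, counting word occurrences and computing a strength of association between terms, can be sketched as follows. The slides do not say which association measure SAS Text Miner uses, so this example substitutes a simple Jaccard co-occurrence score (documents containing both terms divided by documents containing either). The sample documents and function names are invented for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_counts(docs):
    """Count total occurrences of each term across all documents."""
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc))
    return counts


def link_strength(docs, a, b):
    """Jaccard co-occurrence: docs with both terms / docs with either term."""
    with_a = {i for i, d in enumerate(docs) if a in tokenize(d)}
    with_b = {i for i, d in enumerate(docs) if b in tokenize(d)}
    either = with_a | with_b
    return len(with_a & with_b) / len(either) if either else 0.0


docs = [
    "engine noise at high speed",
    "engine stalls and noise returns",
    "brake noise when stopping",
    "engine warning light",
]
counts = term_counts(docs)
print(counts.most_common(2))
print(link_strength(docs, "engine", "noise"))
```

A score near 1 means the two terms almost always appear together; a score of 0 means they never share a document.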

What are Clusters?
Clustering involves finding groups of documents that are more similar to each other than they are to the rest of the documents in the collection. Once the clusters are determined, examining the words that occur in the cluster reveals the focus of the cluster.

SAS Sentiment Analysis Workbench
- Creates word or phrase clouds
- Data exploration and visualization
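As an illustration of the clustering idea above, the sketch below groups documents with a simple single-pass "leader" scheme over bag-of-words cosine similarity, then lists the most frequent terms in each cluster. This is a stand-in technique chosen for brevity, not the algorithm SAS Text Miner uses; the stopword list, the 0.3 threshold, and the sample documents are assumptions.

```python
import math
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "a", "and", "was", "is", "to", "of", "in"}


def vectorize(text):
    """Bag-of-words term counts, ignoring stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine(u, v):
    """Cosine similarity between two term-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def leader_cluster(docs, threshold=0.3):
    """Assign each document to the first cluster whose leader (first
    member) is similar enough; otherwise start a new cluster."""
    vectors = [vectorize(d) for d in docs]
    clusters = []  # each cluster is a list of document indices
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def top_terms(docs, members, n=2):
    """Most frequent terms across the documents of one cluster."""
    total = Counter()
    for i in members:
        total.update(vectorize(docs[i]))
    return [term for term, _ in total.most_common(n)]


docs = [
    "room was clean and the beds were comfortable",
    "check in was slow and the lobby was crowded",
    "clean room with comfortable beds",
    "slow check in at a crowded lobby",
]
clusters = leader_cluster(docs)
for members in clusters:
    print(members, top_terms(docs, members))
```

Leader clustering depends on document order and on the threshold, so it is best read as a demonstration of "similar documents end up together" rather than a production method.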

Compare Sentiment of Specific Features of Your Products vs the Competition
Output from SAS Sentiment Analysis can be input to SAS BI for greater depth and flexibility of reporting.

The Synergy of SAS Text Analytics
The value of the individual SAS Text Analytics solutions is greatly enhanced when the solutions are used together to gain even greater insight.

Examples:
- Enhancing the value of topics discovered and defined in SAS Enterprise Content Categorization by adding sentiment to them
- Enhancing predictive modeling by adding sentiments discovered using SAS Sentiment Analysis

SAS Sentiment Analysis and SAS Content Categorization Used in Conjunction
- Taxonomies can be highly customized for each customer to ensure best alignment and accuracy
- SAS Content Categorization can be used to further clean, filter, and organize the raw data
- SAS measures both document-level and attribute-level sentiment using a hybrid of statistical and rules-based methods

Predict Sentiment or NPS Scores Using New Sentiment Variables
[Figure not reproduced]

Decision Tree With Sentiment Variables
New variables derived from SAS Sentiment Analysis turned out to be highly predictive in the decision tree, adding more lift.

SAS Text Analytics
- Information Organization and Access
- Predictive Modeling, Discover Trends and Patterns
Products: SAS Enterprise Content Categorization, SAS Ontology Management, SAS Text Miner, SAS Sentiment Analysis

Thank you for being a valued SAS customer!