Sentiment Analysis on Big Data




SPAN White Paper
Sentiment Analysis on Big Data: Machine Learning Approach

Several sources on the web provide deep insight into people's opinions on the products and services of various companies. Social networking sites such as Facebook, Twitter, Google+, LinkedIn and YouTube, along with blogs and discussion forums, carry loud messages from users who voice their opinions openly. The data captured from these sites is usually unstructured and huge in volume, and analyzing such massive content manually is a tedious task.

This is where SPAN gains a competitive edge through its solution, in which data from all the above sources on the web, going back multiple years, can be collected and processed to derive concise results. Our real-time search technology enables us to extract a complete body of expressions from many users across these sources, simultaneously, on any given subject.

This white paper describes techniques by which the sentiments of users on multiple social forums can be analyzed and used to gain meaningful and actionable insight.

SPAN Prediction Engine

SPAN's Prediction Engine on big data retrieves posts, comments or tweets about a company or a product to obtain predictive insights into the consumer thought process. To analyze data and quantify the moods of individuals on social forums, the tool uses an algorithm developed specifically to analyze sentiments from social media conversations on a massive scale. The Prediction Engine examines the data from social sources, scores each post, comment or tweet on the sentiment scale, and classifies the text by sentiment as positive, negative, campaign, reply or query.

Instances: Social Media Sources

[Figure: SPAN's Prediction Engine and its five output categories: Positive, Negative, Reply, Campaign, Query]

The following post expresses negative sentiment and is classified as negative:
"What a horrible company with a horrible customer service and horrible attitudes."

On the other hand, this tweet is classified as positive:
"Nina was incredibly helpful and definitely made me a lot happier with your service."

A post can be classified as a campaign if it is posted as an advertisement by the company:
"Try out our new 4G data plan for this month http://sampleurl.com/xjkk"

The same applies to the other categories, reply and query. In response to a question on customer satisfaction, a typical reply would be:
"I found the service to be good, and prompt."

A consumer may want to inquire about a service center, with a query like:
"Where is the service center nearest to my location?"

When each post carries an expression that places it in one of the above categories, it becomes possible to compute a statistical model that captures and quantifies how people feel about something, as expressed in these social forums.

A variety of sentiment analysis methods exist for analyzing all types of content from general news sources and other public data sources. The SPAN Prediction Engine achieves a relatively high accuracy: a 78 percent agreement rate with manually reviewed content. For comparison, even human reviewers typically agree with each other only about 80 percent of the time.

Our tool processes sentiments for every single post on social forums, allowing the application to separate the mood around a particular product from changes in the overall mood of the moment. If the sentiment for Product X is low on a Monday morning, is it because people are unhappy with the product, or because the sentiment for all terms is more negative on that Monday morning? Using the SPAN Prediction Engine, you can analyze the general mood patterns of individuals to determine the true sentiment for any specific term.
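One simple way to picture the separation of product mood from overall mood is to subtract a baseline. The following is a minimal sketch, not SPAN's method: it assumes every post already carries a numeric sentiment score (the paper does not state the scale, so the range -1 to 1, the function name adjusted_sentiment and the sample numbers are all illustrative assumptions).

```python
from statistics import mean


def adjusted_sentiment(term_scores, all_scores):
    """Mean score for one term minus the mean score of all posts
    collected in the same time window (the overall mood baseline)."""
    if not term_scores or not all_scores:
        raise ValueError("both score lists must be non-empty")
    return mean(term_scores) - mean(all_scores)


# Hypothetical scores in [-1, 1] for one Monday morning.
product_x_scores = [-0.4, -0.2, -0.3]
all_post_scores = [-0.3, -0.3, -0.2, -0.4, -0.3]

delta = adjusted_sentiment(product_x_scores, all_post_scores)
# A delta close to zero means Product X is no more negative than
# everything else posted in that window.
print(f"baseline-adjusted sentiment: {delta:+.3f}")
```

A clearly negative result would point to dissatisfaction with the product itself, while a result near zero would suggest that the low score reflects the general mood of that period.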

Machine Learning

Machine learning deals with the construction and study of intelligent systems that identify changes in the data at hand and adjust their algorithms to accommodate new findings. For example, a machine learning system could be made to adapt constantly, based on buyer opinion, to rate health, life or automotive insurance policies with respect to coverage, duration, premium, benefits, popularity and so on. For an insurance service provider, this offers a high degree of success in selling its products. Ratings based on buyer sentiment can be used to recommend a policy that meets the expectations of a buyer.

When we gather large volumes of direct or indirect opinions, views, interests and perspectives, we need to apply learning algorithms to generalize or establish new points of interest. Machine learning poses many scientific and engineering challenges. The statistics of the collected data shift rapidly in real time, and so do the features of interest and the views expressed. Hence, the machine learning algorithms need to be continuously adaptive. For increased reliability, the statistical models need to be applied across multiple algorithms to obtain consolidated results.

The machine learning algorithms used to perform the sentiment analysis described in this paper are supervised learning algorithms. As the learning engine receives a continuous stream of inputs (training data), its prediction accuracy increases. The learning engine is generic in nature and can be used for a variety of applications and across multiple domains.

Sentiment Analysis

For analysis purposes, the SPAN Prediction Engine was applied over extended time periods across all the social media data, isolating only those conversations that referenced a telecom service provider. This enabled us to understand how people actually felt when the company released a product or raised its tariff for existing customers.

We compared the SPAN Prediction Engine's output for these posts across different social media, and also quantified the volume of keywords related to the company or its products. As the accompanying graph shows, the volume of negative sentiment expressed on social forums on a daily basis exceeded the positive sentiment recorded for that month. The graph represents the trend of negative comments posted in the month in which the telecom company released a service.
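The paper does not name the supervised algorithm behind the learning engine. As an illustration only, the sketch below swaps in a small multinomial Naive Bayes classifier with add-one smoothing, chosen because it can keep learning as labeled posts arrive. The class name IncrementalNaiveBayes, the tokenizer and the tiny training set (the five example posts quoted in this paper) are assumptions made for the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class IncrementalNaiveBayes:
    """Multinomial Naive Bayes that can be updated one labeled post at a time."""

    def __init__(self):
        self.label_counts = Counter()            # posts seen per label
        self.word_counts = defaultdict(Counter)  # token counts per label
        self.vocabulary = set()

    def learn(self, text, label):
        """Add one manually labeled post to the model."""
        tokens = tokenize(text)
        self.label_counts[label] += 1
        self.word_counts[label].update(tokens)
        self.vocabulary.update(tokens)

    def predict(self, text):
        """Return the label with the highest log-probability for the text."""
        if not self.label_counts:
            raise ValueError("model has not seen any training data")
        tokens = tokenize(text)
        total_posts = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, post_count in self.label_counts.items():
            score = math.log(post_count / total_posts)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                # Add-one smoothing keeps unseen words from zeroing the score.
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Training data: the example posts quoted in this paper.
model = IncrementalNaiveBayes()
model.learn("What a horrible company with a horrible customer service "
            "and horrible attitudes.", "negative")
model.learn("Nina was incredibly helpful and definitely made me a lot "
            "happier with your service.", "positive")
model.learn("Try out our new 4G data plan for this month", "campaign")
model.learn("I found the service to be good, and prompt.", "reply")
model.learn("Where is the service center nearest to my location?", "query")

print(model.predict("horrible service"))
print(model.predict("where is the nearest center"))
```

Because learn only updates counters, newly labeled posts can be folded in at any time without retraining from scratch, which is one way to approximate the continuously adaptive behavior described above. A production system would need far more training data than five posts.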

Basic Building Blocks in Sentiment Analysis

Training Sets: Fetched from HDFS; posts and tweets are labeled manually based on the nature of their sentiments.
Data Pre-Processing: Training sets and input sources are processed with NLP.
Sentiment Analysis: A model is built from the training set; it predicts sentiments for posts and comments from the input source.
Sentiment Scores: The prediction output from the previous step is shown in reports.
Input Data Source (Big Data): Data source in HDFS; posts and tweets are fetched for the product or company.
Lexicons and Linguistic Resources: Libraries to carry out NLP.

Implementation Model

[Figure: Implementation model using Hadoop. Components shown: Unstructured Data, Structured Data, Learning Engine, Reporting Engine, Report, Visualization, User Web Portal, Product, Services]

Statistical Model

A statistical model was built from thousands of training sets, which were tagged manually with precision. This model was then applied to the next set of social feeds about the telecom company from different social sources, which enabled us to determine the sentiments with an accuracy rate of 78 percent. Percentages above 60 are considered acceptable in predictive analytics, given that most sentiment analysis models tag sentiments in only three categories: positive, negative and neutral. We have further divided neutral sentiments into reply, query and campaign.
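The accuracy figure quoted above is an agreement rate between predicted and manually assigned labels. The sketch below shows how such a rate could be computed, and how the five categories could be collapsed to the usual three for comparison with other models. The function names and the sample labels are hypothetical and are not taken from SPAN's tooling.

```python
NEUTRAL_SUBTYPES = {"reply", "query", "campaign"}


def collapse_to_three(label):
    """Map the five categories onto positive / negative / neutral."""
    return "neutral" if label in NEUTRAL_SUBTYPES else label


def agreement_rate(predicted, manual):
    """Fraction of posts where the predicted label equals the manual label."""
    if len(predicted) != len(manual):
        raise ValueError("label lists must have the same length")
    if not manual:
        raise ValueError("label lists must be non-empty")
    matches = sum(p == m for p, m in zip(predicted, manual))
    return matches / len(manual)


# Hypothetical labels for four posts.
predicted = ["positive", "negative", "query", "reply"]
manual = ["positive", "negative", "reply", "reply"]

five_way = agreement_rate(predicted, manual)
three_way = agreement_rate(
    [collapse_to_three(p) for p in predicted],
    [collapse_to_three(m) for m in manual],
)
print(f"five-category agreement:  {five_way:.0%}")
print(f"three-category agreement: {three_way:.0%}")
```

In this toy sample the third post is a disagreement under five categories (query versus reply) but an agreement under three, since both labels collapse to neutral. This illustrates why a five-category score is a stricter measure than a three-category one.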

As depicted in the accompanying graph, the daily volumes of negative posts and queries expressed on these social forums were found to be correlated. The graph depicts the correlation between negative comments and queries, while the reply section stays at the lower end. It indicates that a rise in the percentage of queries accompanies a spike in the negative graph. The graph was validated by checking the company's social media page for user responses.

This depiction gives a company a number of insights for determining the ideal time to post a campaign about its new offerings, so as to obtain more positive responses than negative or neutral ones. The next graph shows the time of day when most customers are active, which is mostly late at night. Activity spikes at 8 PM and keeps rising until midnight, which indicates that a company should post a campaign or an ad about a new product between 8 PM and midnight.

Beyond the ideal time to post a campaign or an ad, you would also know the top influencers and the words people use most in their conversations, which helps in understanding what users of different age groups expect from a product or service. The final image shows the top influencers and the words used in such conversations.
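The two observations above, the correlation between daily negative posts and queries and the evening peak in activity, can both be computed from per-post labels and timestamps. The sketch below is illustrative only: it uses Pearson's correlation coefficient, which the paper does not name, and all counts and timestamps are invented for the example.

```python
import math
from collections import Counter
from datetime import datetime


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def posts_per_hour(timestamps):
    """Count posts by hour of day (0-23)."""
    return Counter(ts.hour for ts in timestamps)


# Hypothetical daily counts for one week.
negative_per_day = [12, 30, 18, 45, 22, 40, 15]
queries_per_day = [10, 26, 15, 41, 20, 33, 12]
print(f"negative vs. query correlation: "
      f"{pearson(negative_per_day, queries_per_day):.2f}")

# Hypothetical post timestamps.
timestamps = [
    datetime(2015, 3, 2, 9, 15),
    datetime(2015, 3, 2, 20, 5),
    datetime(2015, 3, 2, 20, 40),
    datetime(2015, 3, 2, 22, 10),
    datetime(2015, 3, 3, 20, 55),
]
busiest_hour, count = posts_per_hour(timestamps).most_common(1)[0]
print(f"busiest hour: {busiest_hour}:00 with {count} posts")
```

A coefficient near 1 would support the reading that queries and negative posts rise together, although correlation alone does not show that one drives the other.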

Conclusion

With millions of conversations occurring on social media each day, the science of extracting relevant data and using statistics to quantify how people express themselves has become a rapidly evolving discipline. There are significant advantages to identifying correlations between social sentiment and product marketing when you can apply search techniques to social data and extract only those conversations related to your company or product. When sentiment analysis is applied to such a focused set of conversations over longer durations, it gives precise outcomes that open up prospective avenues for a company to enhance the value of its product and service portfolio. SPAN's analytical solution provides additional results as they become available and allows for deeper R&D, thereby improving an organization's overall capabilities.

For more information on our entire range of solutions and related offerings, get in touch with: sales@spanservices.com

About SPAN

SPAN is an established software services company offering comprehensive IT services since 1994. Our clients include Fortune 1000 companies, software firms (ISVs) and tech start-ups. SPAN's offshore development centers in India are certified for ISO 9001:2008 and ISO 27001:2005 and appraised at CMMI Maturity Level 5 and PCMM Maturity Level 5. SPAN has a global footprint with offices in the U.S., Singapore and India, and group offices in Europe. There are multiple offshore development centers in Bangalore and Chandigarh, India. SPAN is ranked #7 among the Best IT Employers in India by a leading IT publication. SPAN's Relationship Management (RM) Model is a well-defined, yet flexible framework, which provides ongoing business ... wholly owned by the largest Nordic IT services major, EVRY (www.evry.com).

www.spansystems.com

USA Headquarters
SPAN Systems Corporation
1425 Greenway Drive, Suite 490
Irving, Texas 75038
Phone: 972-514-1113 / 1-888-SPAN-SYS
Fax: 972-514-1109

India Headquarters
SPAN Infotech (India) Pvt. Ltd.
18/2, Vani Vilas Road, Basavanagudi
Bangalore 560 004, India
Phone: +91-80-40219600
Fax: +91-80-40219632

Copyright 2015 by SPAN. All rights reserved. The contents of this document are protected by copyright law and international treaties. SPAN acknowledges the proprietary rights of the trademarks and product names of other companies mentioned in this document. The reproduction or distribution of this document or any portion thereof, in any form or by any means, without the prior written permission of SPAN is prohibited.