Do You Need to be a Data Scientist to Analyze Text? Fern Halper


1 Do You Need to be a Data Scientist to Analyze Text? Fern Halper TDWI Research Director for Advanced Analytics May 29, 2013

2 Sponsor

3 Speakers: Fern Halper, Research Director, Advanced Analytics, TDWI; Judson Chase, Principal Industry Solution Consultant, TIBCO Spotfire; Rik Tamm-Daniels, VP of Technology, Channels and Alliances, Attivio

4 Agenda: Text analytics overview; Trends and use cases for text analytics; Data science and text analytics

5 Text is everywhere

6 What is Text Analytics? Text analytics is the process of analyzing unstructured text, extracting relevant information, and transforming it into structured information that can be leveraged in various ways

7 What is extracted? Terms, entities, facts, concepts, sentiment

8 Text Analytics Basics: An Example. "I bought this phone during your March promotion. Is there something wrong with the battery?" "I can't get the light on my phone to work. I need help." "I'm having problems with my phone. The backlight is poor." Analysis: I/person bought this phone/thing during your March/date promotion. Is there something wrong/negative with the battery/thing?

9 Text Analytics Basics: Example

ID   Entity  Issue      Sentiment
XXY  Phone   battery    unhappy
XYX  Phone   light      neutral
XXX  Phone   backlight  unhappy
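
The step from the free-text comments on slide 8 to the structured rows on slide 9 can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the word lists `ISSUE_TERMS` and `NEGATIVE_TERMS`, the function name `analyze`, and the pairing of each ID with a comment are hypothetical and do not come from the presentation. Commercial text analytics tools rely on much richer linguistic resources than a keyword lookup.

```python
import re

# Hypothetical mini-lexicons, invented for this sketch.
ISSUE_TERMS = ["backlight", "battery", "light"]
NEGATIVE_TERMS = {"wrong", "poor"}

def analyze(record_id, text):
    """Turn one free-text comment into a structured row."""
    words = re.findall(r"[a-z']+", text.lower())
    entity = "Phone" if "phone" in words else None
    # First issue term that appears as a whole word, if any.
    issue = next((term for term in ISSUE_TERMS if term in words), None)
    sentiment = "unhappy" if NEGATIVE_TERMS & set(words) else "neutral"
    return {"id": record_id, "entity": entity, "issue": issue, "sentiment": sentiment}

# Assumed pairing of the slide's IDs with the slide's three comments.
comments = {
    "XXY": "I bought this phone during your March promotion. "
           "Is there something wrong with the battery?",
    "XYX": "I can't get the light on my phone to work. I need help.",
    "XXX": "I'm having problems with my phone. The backlight is poor.",
}

rows = [analyze(record_id, text) for record_id, text in comments.items()]
for row in rows:
    print(row)
```

Under these assumptions the three rows carry the same entity, issue, and sentiment values as the table on slide 9. Once text is in this shape it can be counted, filtered, and charted like any other structured data.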

10 DISCOVERY

11 Text Analytics Usage is Increasing [Chart: usage today vs. in three years, on a 0% to 60% scale, for advanced data visualization, big data analytics, predictive analytics, social media analytics, and text analytics]

12 Poll Question: Are you currently using text as part of your analytics?

13 Trend 1: Text being used with other data. Marrying Data Sources: External Text Data; Internal Text Data; Other Data
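
One way to picture "marrying" data sources is a join between rows extracted from text and records that are already structured. The sketch below is a minimal illustration, not a description of any product shown in the webinar: the `accounts` records, the `segment` field, and the `marry` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical structured records (for example from a CRM), keyed by ID.
accounts = {
    "XXY": {"segment": "consumer"},
    "XYX": {"segment": "business"},
    "XXX": {"segment": "consumer"},
}

# Hypothetical output of a text analytics step.
text_rows = [
    {"id": "XXY", "issue": "battery", "sentiment": "unhappy"},
    {"id": "XYX", "issue": "light", "sentiment": "neutral"},
    {"id": "XXX", "issue": "backlight", "sentiment": "unhappy"},
]

def marry(text_rows, accounts):
    """Attach structured attributes to each text-derived row by shared ID."""
    return [{**row, **accounts.get(row["id"], {})} for row in text_rows]

combined = marry(text_rows, accounts)

# Example question that neither source can answer alone:
# how many unhappy comments come from each customer segment?
unhappy_by_segment = Counter(
    row.get("segment", "unknown")
    for row in combined
    if row["sentiment"] == "unhappy"
)
print(unhappy_by_segment)
```

The join itself is trivial; the analytic value comes from asking questions that span both sources.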

14 Trend 2: Unified Access Platforms. Diagram labels: Analytics Platform; Hadoop; Apps; DW; Content Server; Web Server

15 Trend 3: Operationalizing Part of Business Process. Source: SBConsulting

16 Analytics Options: Do it yourself; Statistician/Share; Outsource/Cloud; Operationalize/Automate it

17 Use Case: Voice of the Customer

18 Use Case: Threat and Fraud

19 Use Case: Social Media/Competitive Intelligence

20 Use Case: Problem Analysis

21 The Data Scientist and Text Analytics vs.

22 Data Scientist Skills: Data management; Computer Science/Software Engineer; Curiosity/Creativity/Discipline; Statistics/Math; Business Acumen/Communication; NLP? Taxonomy

23 Do you need to be a Data Scientist for Text Analytics? It depends... On the business problem you're trying to solve; on the analytic skills in your organization; on the tools and technologies you have at your disposal

24 It depends... Maybe Yes: How many disparate data sources are involved? What is the frequency? How much custom code do you need to write? Will the extracted text be used in an advanced model? What are you doing with the results?

25 A sentiment extractor

26 ...Maybe No: Are you analyzing data you understand? Is there a taxonomy available? Has the data been structured? Are the tools user friendly?
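
The question about an available taxonomy matters because a ready-made taxonomy reduces tagging to a lookup that requires little specialist skill. The following sketch assumes a tiny, hypothetical two-concept taxonomy (`TAXONOMY`) and a hypothetical helper `tag`; neither comes from the presentation or from any specific tool.

```python
import re
from collections import Counter

# Hypothetical taxonomy: concept -> terms that signal it.
TAXONOMY = {
    "Power": {"battery", "charger", "charging"},
    "Display": {"backlight", "light", "screen"},
}

def tag(text, taxonomy=TAXONOMY):
    """Return the sorted list of concepts whose terms appear in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(concept for concept, terms in taxonomy.items() if words & terms)

docs = [
    "Is there something wrong with the battery?",
    "I can't get the light on my phone to work.",
    "The backlight is poor.",
]

# Count how often each concept is mentioned across the documents.
counts = Counter(concept for doc in docs for concept in tag(doc))
print(counts)
```

Building and maintaining the taxonomy is the hard part; applying one that already exists is closer to a business-analyst task than a data-science one.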

27 Another sentiment tool

28 Minimum skill set required: Critical thinker; Understand data; Curiosity; Disciplined approach to problem solving; Business acumen; Communication skills

29 Attivio and Spotfire = Actionable Insights Extracting Actionable Insights from Structured and Unstructured Content Spotfire Industry Analytics Group Judson Chase Copyright TIBCO Software Inc. All rights reserved. TIBCO Confidential & Proprietary Information.

30 Enterprise Analytics Architecture. Enterprise Analytics Platform: Measure; Diagnose; Model & Predict; Operationalize; Automate. Driving a Sustainable Competitive Advantage. Hadoop Repository; Text Analytics; Analytic Data Engines; Enterprise Data Warehouses; Event Processing. AUDIO & VIDEO; IMAGES; TEXT; WEB & SOCIAL; MACHINE LOGS; CRM; 3RD-PARTY DATA; ERP

31 Enterprise Analytics Architecture. Enterprise Analytics Platform: Measure; Diagnose; Model & Predict; Operationalize; Automate. Driving a Sustainable Competitive Advantage. Hadoop Repository; Text Analytics; Analytic Data Engines; Enterprise Data Warehouses; Event Processing. AUDIO & VIDEO; IMAGES; TEXT; WEB & SOCIAL; MACHINE LOGS; CRM; 3RD-PARTY DATA; ERP

32 VISUALIZE STRUCTURED DATA & UNSTRUCTURED CONTENT simultaneously, contextually, graphically

33 TAP INTO UNSTRUCTURED CONTENT: Key phrase extraction; Entity extraction; Sentiment analysis; Document classification; Ontology-based tagging. FORMS; CMS; SOCIAL; SHAREPOINT

34 DRIVING INSIGHTS ACROSS MANY USERS & INDUSTRIES

35 Demo. "If you can establish a common visual language for data, you can radically upgrade the use of that data to drive decision-making and action. The best case I can cite for this argument is Procter & Gamble, which has institutionalized data visualization as a primary management tool, working with visual analytics software vendor TIBCO Spotfire." Thomas H. Davenport, world-renowned author & business analytics expert, Harvard Business Review blog, 2013

36 Questions?

37 Contact Information. If you have further questions or comments: Fern Halper, TDWI; Judson Chase, TIBCO; Rik Tamm-Daniels, Attivio