Business Analytics in Customer-Centric Organizations
1 Big Data vs. big Data? Business Analytics in Customer-Centric Organizations. Dr. Ilan Sasson, 30/3/2015
2 Objectives. What is Big Data Analytics? What is Data Science? DDD and its importance in customer- and service-oriented organizations. The variety of data sources. Basic concepts in creating Data Products. Recommendations and approaches for building analytical capabilities. Data scientists, and why this might interest... you? Fundamental concepts in data mining. Managing business-analytics-driven projects (CRISP-DM). Data Privacy and why it matters. Code examples: Data/Text mining in R.
3 Trend of Google searches for Big Data and Data Science over time, showing the popularity of the terms. Data Science: the connective tissue between big data processing technologies and data-driven decision-making (DDD) (Provost & Fawcett, 2013).
4 Terminology. Data-Driven Decision-Making (DDD) refers to the practice of basing decisions on the analysis of data, rather than purely on intuition (Provost & Fawcett, 2013). Data Science is a set of fundamental principles that support the extraction of information and knowledge from data. It involves principles, processes, and techniques for understanding phenomena via the (automated) analysis of data. Big Data Technologies are used to process and handle big data, and include preprocessing prior to implementing data mining techniques. The new approach to Business Analytics.
5 Why do we really care? DDD affects firm performance: the more data-driven a firm is, the more productive it is, with a 4%-6% increase, and DDD is highly correlated with higher ROI, ROE, asset utilization and market value (Brynjolfsson et al., Strength in numbers: How does data-driven decision making affect firm performance, 2013, MIT). Use of big data technologies correlates with significant additional productivity growth: a 3% increase in productivity over the average firm (Tambe P., Big data know-how and business value, 2012, NYU). Competitive advantage: What can I now do that I couldn't do before, or do better than I could do before?
6 Three principles of the new era of computing. Data will be the basis of competitive intelligence for any organization: companies, government entities, cities and individuals. Data in this new era is not a limited resource. Changing how we make decisions: decisions will be based not on intuition or past experience, but on predictive analytics. Changing how we create value: organizations, private and public, will become social enterprises. Changing how we deliver value: success will depend upon the ability to create products and services for individuals, not market segments.
7 Big Data Everywhere! Lots of data is being collected and warehoused: transactional data; web data and e-commerce; purchases at department/grocery stores; bank/credit card transactions; social networks; multimedia content; scientific data; network sensors; mobile phones; user-generated content; Internet of Things. Data is becoming the new currency, a vital natural resource. Datafication: taking all aspects of life and turning them into data (The rise of big data, Foreign Affairs).
8 What to do with these data? Aggregation and statistics: data warehouse, OLAP. Indexing, searching, and querying: keyword-based search, pattern matching (RDF/XML). Knowledge discovery: data mining, text mining, graph mining, statistical modeling. Big Data, big assumptions: collecting and using a lot of data rather than small samples (N = All); accepting messiness in your data; giving up on knowing the causes.
9 Big Data Use Cases. Big Data can play a significant economic role in private commerce, the public sector, and national economies.
10 big Data: The enterprise perspective. Enterprise data is big, but it is not Google-big. [Diagram. IT-oriented classic BI: OLTP, ETL, OLAP, Dash-bored. Boundary. Business-oriented Big Data Warehouse: OLTP / dark data / log / social / web, ETL, augmented DWH + extreme-scale analytics.]
11 The existing solution: DWH. What are the best-selling types of banking products? What is the distribution of revenue by banking product? What is the distribution of expenses by headquarters unit? What is the profitability by product across the time dimension and the branch dimension? In which product types is there a seasonal trend?
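The reporting questions on this slide are all aggregations of known measures over known dimensions. A minimal sketch of that kind of roll-up follows; the fact rows, the dimension names and the helper functions `roll_up` and `share_by` are hypothetical and invented for illustration, not taken from the presentation.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, branch, quarter, revenue). Invented for illustration.
SALES = [
    ("mortgage", "north", "Q1", 120.0),
    ("mortgage", "south", "Q1", 80.0),
    ("credit_card", "north", "Q1", 50.0),
    ("credit_card", "north", "Q2", 70.0),
    ("mortgage", "north", "Q2", 100.0),
]

# Position of each dimension inside a fact row.
DIMENSIONS = {"product": 0, "branch": 1, "quarter": 2}


def roll_up(rows, *dims):
    """Sum revenue grouped by the requested dimensions (an OLAP-style roll-up)."""
    positions = [DIMENSIONS[d] for d in dims]
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[p] for p in positions)
        totals[key] += row[3]
    return dict(totals)


def share_by(rows, dim):
    """Revenue distribution over one dimension, as fractions of the grand total."""
    totals = roll_up(rows, dim)
    grand_total = sum(totals.values())
    return {key[0]: value / grand_total for key, value in totals.items()}


if __name__ == "__main__":
    print(roll_up(SALES, "product"))             # revenue by product
    print(roll_up(SALES, "product", "quarter"))  # revenue by product over time
    print(share_by(SALES, "product"))            # revenue distribution by product
```

Every question here is posed in advance by the analyst, which is what distinguishes this kind of reporting from the discovery-oriented questions on the next slide.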
12 The problem space? Who are the most promising prospective customers for a loan above 300,000 NIS? How can the process of handling credit approval for a new customer be shortened? What are the characteristics of a churning customer? What are the characteristics of a profitable customer? Which new products should be offered to existing customers? How can processes in the organization be made more efficient?
13 The solution space: from Business Intelligence to Business Analytics; from DWH/OLAP to large-scale data/text mining. Discovery-based analysis: an analytical process based on discovery; tools/algorithms that operate on the data space and expose hidden patterns; clustering, prediction and association processes; unsupervised machine learning; processes that require a large historical database. Verification-based analysis: an analytical process based on verification; the user posits some hypothesis; confirmation/refutation techniques are applied; user-driven processes; the ability to make correct assumptions, the choice of tools, and the interpretation of results. The two are complementary processes.
14 Big Data Architecture & Pipeline. [Architecture diagram. Data sources (internal/external): streams, network/sensor, Internet of Things, video/audio. Components: information ingestion, data integration, stream processing, master data, Unified Information Access (UIA), landing area, zone & archive (raw data, structured data, unstructured data). Exploration, analytics and discovery (descriptive, predictive, prescriptive, operative): real-time analytics, entity analytics, text analytics, data mining, machine learning, complex event processing, intelligence analysis, decision management, BI & predictive analytics, reporting & discovery, business processes.]
15 Data-Analytic Thinking. One of the most critical aspects of data science is the support of data-analytic thinking throughout the organization. Data-oriented business environment: a basic understanding of the fundamental principles, in order to assess and envision opportunities accurately (data-analytics projects). Professional advantage in being able to interact competently (data-analytics team). Business units must interact with the data science team (domain knowledge). Data science projects require close interaction with the business people responsible for decision making.
16 Conveying the message. Data mining is moving from the research arena into the pragmatic world of business. There is a continuous effort to refine algorithms and come up with new ones. Now, with new developments in algorithms and architecture, small-scale development teams can build large-scale projects. Practical data mining weighs the trade-offs between the most advanced and accurate model and the costs and complexity of a real-world business environment. New analytics tools and platforms make data mining much easier and more powerful for people at all levels of expertise. The Hadoop-based computing ecosystem is evolving rapidly, making projects with very large-scale datasets much more affordable.
17 The Ladder Approach. Build a foundation: learn to think analytically (data mining models, visualization, statistics etc.); develop a strategy and road map based on business needs (pick a theme); C-level management engagement (presentation); adopt a step-by-step process (problem definition to results: CRISP-DM); pick and learn a tool (R, Python etc.); practice on small datasets. Build a portfolio: deliverable POCs and pilot projects (3-5); quick wins; practice on small datasets; write up findings (storytelling). Deliver solutions: adopt technology infrastructure (HDFS, MapReduce, NoSQL, Spark SQL etc.); ongoing revisions of models (data products); continue to apply advanced analytics. Business scope & deliverables.
18 Rethinking the Business & IT Model. Data management and business analytics are core business competencies: the business owns the data; recognize analytics as a business-driven and business-owned process; technology is an enabler. Shift to business configurable and controlled: acknowledge the difference between software development and business analytics; redefine the IT support model to enable the business to acquire, assess, analyze, test, and deploy analytical outcomes. Change the IT funding and financial model: the current infrastructure model is geared towards legacy and transactional platforms; recognize analytics as a business-driven and business-owned process; technology is an enabler. Slide source: MetLife presentation, IBM Big Data conference, October 2013.
19 Big Data Adoption. Outlining a work plan. Examining a scenario (one or more for implementation). A 10-session course on Big Data, Data Science & Data Mining. An 8-session Data-Analytic Thinking course for business people. Establishing a «360» team: R&D team, infrastructure & operations, business unit analysts, business IT support team.
20 The Data Journey. What is done in the organization today: 1. OLAP reports over the product, marketing and pricing dimensions. 2. Data mining models?... Internal Data: operational information that exists in the data warehouse. New Internal Data (Dark Data): existing information that is not defined in the data warehouse: structured information, emails, textual information (agents, appraisers..). Estimate: 80% of the information in the organization is unstructured and unmodeled, and is therefore not available for analysis with the existing, traditional tools. New External Data: information from external sources: the internet, competitors, social networks, location-based cellular information, telematics, sensors and more. Data Management before Business Analytics. In the first stage we will not boil the ocean... Big Data doesn't have to be big; it can be managed and built incrementally. Big Data may or may not include social media (eventually it will). Big Data may or may not include external data (eventually it will). Sometimes information is good enough.
21 Data Products. Motivation: turning data assets into products and services. A data product is an algorithm, software, application, presentation or reproducible report based on data analytics. A data product is the production output of statistical analysis, data mining, text mining, AI etc. Initially online companies: search algorithms (Google), similar offerings (Amazon), recommendations for people you may know (Facebook). "A data product is a product that facilitates an end goal through the use of data." (DJ Patil) Developing and launching data products, particularly if you are an offline business, won't be second nature... Data-as-a-Service (DaaS): a cloud strategy used to facilitate the accessibility of business-critical data in a well-timed, protected and affordable manner; a B2B "renting" data service.
22 The Model Assembly Line. [Diagram stages: Do you have the data? Do you own the data? Data quality? Business model. Type of analysis. Competitive advantage.] Do you have the data? Do you own the data? (legal issues; consider anonymized personal data) Is it high-quality and useful data? Do you have a business model? (bundling, selling, free) What types of analysis are you offering? (descriptive analytics vs. predictive analytics) Do you have differentiation or competitive advantage? (proprietary vs. commodity data)
23 The Model Assembly Line: A case study of DaaS, cellular companies. [Diagram stages: Do you have the data? Do you own the data? Data quality? Business model. Type of analysis. Competitive advantage.] Location-based information on city centers; target customers: social application developers. Geographic segmentation: city center, shopping areas, entertainment areas, business centers. Data update frequency: daily/weekly/monthly. Data accessibility: online/batch. Customer classification: business, private. Communication type: SMS, voice. Location-based information on main traffic arteries; target customers: municipalities and government planning institutions. Geographic segmentation: city, suburb. Road type: intercity expressway, urban road, highway. Data update frequency: daily/weekly/monthly. Data accessibility: online/batch. Pricing models. Volume-based model: quantity-based pricing (amount), pay-per-call (PPCall). Data-type-based model: based on the type or attribute of data. Subscription-based model: an unlimited amount of data.
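The pricing models named on this slide can be compared with simple arithmetic. The sketch below is only an illustration: the usage profile, the unit prices and the function names are all hypothetical, and the slide groups quantity-based pricing and pay-per-call together under the volume-based model.

```python
# Hypothetical tariffs and usage; every number here is invented for illustration.

def volume_based(records, price_per_record):
    """Quantity-based pricing: pay for the amount of data delivered."""
    return records * price_per_record


def pay_per_call(calls, price_per_call):
    """Pay-per-call pricing: pay for each request made to the data service."""
    return calls * price_per_call


def data_type_based(records_by_type, price_by_type):
    """Data-type pricing: each type or attribute of data has its own unit price."""
    return sum(count * price_by_type[kind] for kind, count in records_by_type.items())


def subscription(monthly_fee):
    """Subscription pricing: a flat fee for an unlimited amount of data."""
    return monthly_fee


def cheapest(quotes):
    """Return the (model, cost) pair with the lowest cost."""
    return min(quotes.items(), key=lambda item: item[1])


# One month of usage by a hypothetical customer, priced under each model.
QUOTES = {
    "volume": volume_based(records=10_000, price_per_record=2),
    "pay_per_call": pay_per_call(calls=500, price_per_call=30),
    "data_type": data_type_based(
        {"location": 8_000, "voice": 2_000}, {"location": 1, "voice": 5}
    ),
    "subscription": subscription(monthly_fee=16_000),
}

if __name__ == "__main__":
    for model, cost in QUOTES.items():
        print(model, cost)
    print("cheapest:", cheapest(QUOTES))
```

Which model is cheapest depends entirely on the usage profile: a customer making few large requests favors pay-per-call, while a heavy user favors the subscription.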
24 Implementation Approaches. The Full Service Approach: relying on a 3rd party to develop and maintain the model. The Full Control Approach: in-house model development and deployment. The Consultant Approach: hybrid methodology.
25 Implementation Approaches: The Full Service Approach (relying on a 3rd party to develop and maintain the model). Pros: the ideal solution for companies that are resource constrained; the ideal solution for companies lacking technical and analytics staff; the model development can rely on expertise provided by the vendor; the quickest path to implementation. Cons: reliance on the vendor to provide a solution without any independent review; not being able to make changes to the model directly; internal staff is not trained to ensure attainment of desired results.
26 Implementation Approaches: The Full Control Approach (in-house model development and deployment). Pros: the ideal solution for companies with analytics and IT resources; helps to protect IP in the case of a novel idea or product; offers the most flexibility in making revisions or customizations to the model. Cons: the firm can't take advantage of any data or expertise accumulated by vendors and consultants; if a fundamental modeling error has been made, it may never be discovered; historically the slowest path to deployment, with successful implementations measured in years(?).
27 Implementation Approaches: The Consultant Approach (hybrid methodology). Build your own core competencies coupled with high-end data science consultancy. Pros: the ideal solution for companies lacking depth in their analytics department, but who have available resources in systems and IT; there is a built-in independent review phase in this approach; companies are able to make changes directly to the model as needed. Cons: if companies lack internal technical or analytical resources, they may be at the mercy of the vendor in the future should a model update or revision be needed; some companies attempt to update vendor models but lack in-depth knowledge of the modeling techniques used, and as a result may inadvertently make fundamental modeling errors; requires continuous management attention.
28 Roles in Data Science. Data Scientist: applied statistician × computer scientist. Skills: computer science, math, statistics, machine learning, domain expertise, communication and presentation skills, data visualization. No one person can be the perfect data scientist: a team? "Data Scientist (noun): better at statistics than any software engineer and better at software engineering than any statistician." (Josh Wills) A shortage of 140,000 to 190,000 people with deep analytical skills, as well as 1.5 million managers and analysts to analyze big data (McKinsey, 2011).
29 Data Scientist. Skills required to exploit big data: skills to work with business stakeholders to understand the business issue and context; analytical and decision modeling skills for discovering relationships within data and proposing patterns; data management skills to build the relevant dataset used for the analysis. A broad combination of soft and technical skills. Sample of program offerings: DB: Databases; BI: Business Intelligence, Data Warehousing; ST: Advanced-Level Statistics; BA: Business Analytics, Web Analytics; DM: Data Mining, Machine Learning, Text Mining, Natural-Language Processing; BD: Big Data Technologies, Visualization; KM: Knowledge Management, Social-Web Analysis. Cosmologists of the digital universe.
30 Building Models: Introduction. A model captures the knowledge exhibited by the data and encodes it in some language; no model can perfectly represent the real world. Automatic or semi-automatic extraction of knowledge that is interesting, non-trivial, implicit, previously unknown and potentially useful. Forecasting what may happen in the future. Classifying items into groups by recognizing patterns. Clustering items into groups based on their attributes. Associating what events are likely to occur together. Sequencing what events are likely to lead to later events.
31 Building Models: Introduction. Models fall into two categories of data mining: descriptive and predictive. Predictive tasks: use some variables to predict unknown or future values of other variables. Descriptive tasks: find human-interpretable patterns that describe the data. Supervised learning; unsupervised learning; meta learning (ensemble learners).
32 Types of Data Mining Tasks. Many business problems have one of these DM tasks as an important component. Unsupervised: Affinity grouping (a.k.a. associations, market-basket analysis): What items are commonly purchased together? Similarity matching: What other companies are like our best small business customers? Description/profiling: What does normal behavior look like? Clustering: Do my customers form natural groups? Supervised: Predictive modeling (including causal modeling & link prediction): Will customer X churn next month/default on her loan? How much would prospect X spend? Who might be good friends on our social networking site?
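The affinity-grouping question ("What items are commonly purchased together?") can be illustrated by counting item pairs across baskets. This is a minimal sketch, not a full association-rule miner such as Apriori; the baskets and the function names `pair_support` and `confidence` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, invented for illustration.
BASKETS = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]


def pair_support(baskets):
    """Support of each item pair: the fraction of baskets that contain both items."""
    counts = Counter()
    for basket in baskets:
        # Sorting makes each pair appear in one canonical (alphabetical) order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    total = len(baskets)
    return {pair: count / total for pair, count in counts.items()}


def confidence(baskets, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent: P(consequent | antecedent)."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)


if __name__ == "__main__":
    support = pair_support(BASKETS)
    for pair, value in sorted(support.items(), key=lambda item: -item[1]):
        print(pair, round(value, 2))
    print("diapers -> beer:", round(confidence(BASKETS, "diapers", "beer"), 2))
```

No target variable is supplied here, which is why the slide files this task under unsupervised learning.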
33 Data Mining vs. Deployment
34 Merging Traditional & Big Data approaches
35 Merging Traditional & Agile approaches. Time to market: slow process. Disconnect between the business people (consumers) and IT people (producers). The overall cost is high. Breaking down the walls: a discovery process and not a traditional SW development project. Business owns the data.
36 Codification of the Process. Extracting useful knowledge from data to solve business problems can be treated systematically by following a process with reasonably well-defined stages. CRISP-DM: the Cross-Industry Standard Process for Data Mining (Shearer, 2000). A structured process with critical points: human intuition; high-powered analytical tools. A well-understood process that places a structure on a problem, which still involves art: science + craft + creativity + common sense.
37 CRISP-DM. [Process diagram with callouts.] This process diagram makes explicit the fact that iteration is the rule rather than the exception: it is not a linear process. Callouts: The point of actually using your results. Preparatory activity: what data? where is the data? accuracy and reliability of the data. Both mathematical and logical. The most substantial components (65%): time-consuming and labor-intensive.
38 CRISP-DM. Business Understanding: a creative problem formulation: what is the problem? Think carefully about the use scenario and the actual business need. What exactly do we want to do? How exactly would we do it? What parts of this use scenario constitute possible data mining models? Data Understanding: it is important to understand the strengths and limitations of the data. Historical data often are collected for purposes unrelated to the current business problem. Estimating the costs and benefits of each data source: data having varying degrees of reliability; cost of acquiring the data; data manipulation; data quality.
39 CRISP-DM. Data Preparation: pre-processing tasks: data conversions; data transformations (e.g., normalization, scaling etc.); missing values, outliers; redundant or non-informative features (i.e., feature selection, between-predictor correlations); dimensionality reduction techniques (e.g., PCA, SVD). Modeling: the primary place where data mining techniques are applied to the data. It is important to have some understanding of the fundamental ideas of data mining, including the sorts of techniques, algorithms and tuning parameters. Evaluation: the purpose of the evaluation stage is to assess the data mining results rigorously and to gain confidence that they are valid and reliable before moving on. Measuring model performance and generalization.
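Three of the pre-processing tasks listed under Data Preparation can be sketched in a few lines: filling missing values, min-max normalization, and checking the correlation between two predictors. This is an illustrative sketch under simple assumptions (mean imputation, Pearson correlation); the function names are hypothetical and a real project would normally rely on a statistics or machine learning library.

```python
from statistics import mean


def impute_mean(values):
    """Replace missing values (None) with the mean of the observed values.

    Assumes at least one observed value; statistics.mean raises otherwise.
    """
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def pearson(xs, ys):
    """Pearson correlation between two predictors of equal length."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    if spread_x == 0 or spread_y == 0:
        return 0.0  # correlation is undefined for a constant column
    return covariance / (spread_x * spread_y)


if __name__ == "__main__":
    income = impute_mean([10.0, None, 30.0])
    print(income)
    print(min_max_scale(income))
    # A correlation close to 1 or -1 suggests that one predictor is redundant.
    print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```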
40 Basic Principles: Privacy. Collection limitation: data should be obtained lawfully and fairly, while some very sensitive data should not be held at all. Data quality: data should be relevant to the stated purposes, accurate, complete, and up-to-date; proper precautions should be taken to ensure this accuracy. Purpose specification: the purposes for which data will be used should be identified, and the data should be destroyed if it no longer serves the given purpose. Use limitation: use of data for purposes other than specified is forbidden. Source: Organization for Economic Co-operation and Development (OECD), 1980.
41 Data Science Course. Applications and uses of Big Data. A range of data mining models for Predictive and Descriptive Analytics and Exploratory Data Analysis, including among others: Cluster Analysis, Association Analysis, Decision Trees & Random Forest, Support Vector Machine, Neural Networks, Anomaly Detection. Graph mining and Social Network Analysis, with fundamental concepts such as Degree & Degree Distribution, Centrality, Betweenness, Closeness, Centralization and more. NLP-based methods for mining textual data, for Text Categorization and Information Extraction. Fundamental concepts of Information Retrieval and Bag-of-Words representations of textual data. Using the R environment for statistical exploration, mining and presentation of data. Visualization and graphics approaches for applications based on textual data analysis (co-occurrence graphs, neighborhood networks and more). Advanced technologies for data management, and storage and processing architectures. The CRISP-DM model for managing business analytics projects.
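Two of the concepts named in the course outline, the Bag-of-Words representation and node degree in a term co-occurrence graph, can be sketched as follows. The course itself uses R; Python is used here only for illustration, and the tokenizer, the sample documents and the function names are hypothetical.

```python
import re
from collections import Counter, defaultdict
from itertools import combinations


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as terms."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(text):
    """Bag-of-Words: term counts, ignoring word order."""
    return Counter(tokenize(text))


def cooccurrence_degree(documents):
    """Degree of each term in a graph linking terms that share a document."""
    neighbors = defaultdict(set)
    for document in documents:
        terms = sorted(set(tokenize(document)))
        for first, second in combinations(terms, 2):
            neighbors[first].add(second)
            neighbors[second].add(first)
    return {term: len(linked) for term, linked in neighbors.items()}


if __name__ == "__main__":
    print(bag_of_words("Data mining and text mining"))
    print(cooccurrence_degree(["big data", "data mining", "text mining"]))
```

The degree distribution of such a graph gives a first view of which terms are central in a corpus, before measures such as betweenness or closeness are computed.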
42 Why R? R is a free and open source language and environment for statistical computing and graphics. R is already the most popular amongst the leading software for statistical analysis. Key features: it's mature and widely used (NYT); excellent graphics capabilities; highly extensible, with over 4300 user-contributed packages; it's easy to use and has excellent online help and associated documentation: manuals, tutorials, etc. provided by users of R.
43 Big Data is a representation of a process with evolutionary trends: complexity, diversity and specialization. Thank you for listening.
More informationJournée Thématique Big Data 13/03/2015
Journée Thématique Big Data 13/03/2015 1 Agenda About Flaminem What Do We Want To Predict? What Is The Machine Learning Theory Behind It? How Does It Work In Practice? What Is Happening When Data Gets
More informationHealthcare Measurement Analysis Using Data mining Techniques
www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 03 Issue 07 July, 2014 Page No. 7058-7064 Healthcare Measurement Analysis Using Data mining Techniques 1 Dr.A.Shaik
More informationDMDSS: Data Mining Based Decision Support System to Integrate Data Mining and Decision Support
DMDSS: Data Mining Based Decision Support System to Integrate Data Mining and Decision Support Rok Rupnik, Matjaž Kukar, Marko Bajec, Marjan Krisper University of Ljubljana, Faculty of Computer and Information
More informationbirt Analytics data sheet Reduce the time from analysis to action
Reduce the time from analysis to action BIRT Analytics is the newest addition to ActuateOne. This new analytics product is fast and agile, and adds to the already rich Actuate BIRT product lineup the simpleto-use
More informationBig Data. Introducción. Santiago González <sgonzalez@fi.upm.es>
Big Data Introducción Santiago González Contenidos Por que BIG DATA? Características de Big Data Tecnologías y Herramientas Big Data Paradigmas fundamentales Big Data Data Mining
More informationHow the oil and gas industry can gain value from Big Data?
How the oil and gas industry can gain value from Big Data? Arild Kristensen Nordic Sales Manager, Big Data Analytics arild.kristensen@no.ibm.com, tlf. +4790532591 April 25, 2013 2013 IBM Corporation Dilbert
More informationPRIME DIMENSIONS. Revealing insights. Shaping the future.
PRIME DIMENSIONS Revealing insights. Shaping the future. Service Offering Prime Dimensions offers expertise in the processes, tools, and techniques associated with: Data Management Business Intelligence
More informationAdobe Insight, powered by Omniture
Adobe Insight, powered by Omniture Accelerating government intelligence to the speed of thought 1 Challenges that analysts face 2 Analysis tools and functionality 3 Adobe Insight 4 Summary Never before
More informationBig Data and Your Data Warehouse Philip Russom
Big Data and Your Data Warehouse Philip Russom TDWI Research Director for Data Management April 5, 2012 Sponsor Speakers Philip Russom Research Director, Data Management, TDWI Peter Jeffcock Director,
More informationManaging Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database
Managing Big Data with Hadoop & Vertica A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Copyright Vertica Systems, Inc. October 2009 Cloudera and Vertica
More informationData Warehouse Architecture Overview
Data Warehousing 01 Data Warehouse Architecture Overview DW 2014/2015 Notice! Author " João Moura Pires (jmp@di.fct.unl.pt)! This material can be freely used for personal or academic purposes without any
More information<Insert Picture Here> Oracle Retail Data Model Overview
Oracle Retail Data Model Overview The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into
More informationData Virtualization and ETL. Denodo Technologies Architecture Brief
Data Virtualization and ETL Denodo Technologies Architecture Brief Contents Data Virtualization and ETL... 3 Summary... 3 Data Virtualization... 7 What is Data Virtualization good for?... 8 Applications
More informationA Knowledge Management Framework Using Business Intelligence Solutions
www.ijcsi.org 102 A Knowledge Management Framework Using Business Intelligence Solutions Marwa Gadu 1 and Prof. Dr. Nashaat El-Khameesy 2 1 Computer and Information Systems Department, Sadat Academy For
More informationData Mining Techniques
15.564 Information Technology I Business Intelligence Outline Operational vs. Decision Support Systems What is Data Mining? Overview of Data Mining Techniques Overview of Data Mining Process Data Warehouses
More informationCapitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes
Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes Highly competitive enterprises are increasingly finding ways to maximize and accelerate
More informationSURVEY REPORT DATA SCIENCE SOCIETY 2014
SURVEY REPORT DATA SCIENCE SOCIETY 2014 TABLE OF CONTENTS Contents About the Initiative 1 Report Summary 2 Participants Info 3 Participants Expertise 6 Suggested Discussion Topics 7 Selected Responses
More informationDiscovering, Not Finding. Practical Data Mining for Practitioners: Level II. Advanced Data Mining for Researchers : Level III
www.cognitro.com/training Predicitve DATA EMPOWERING DECISIONS Data Mining & Predicitve Training (DMPA) is a set of multi-level intensive courses and workshops developed by Cognitro team. it is designed
More informationCertificate Program in Applied Big Data Analytics in Dubai. A Collaborative Program offered by INSOFE and Synergy-BI
Certificate Program in Applied Big Data Analytics in Dubai A Collaborative Program offered by INSOFE and Synergy-BI Program Overview Today s manager needs to be extremely data savvy. They need to work
More informationHow To Turn Big Data Into An Insight
mwd a d v i s o r s Turning Big Data into Big Insights Helena Schwenk A special report prepared for Actuate May 2013 This report is the fourth in a series and focuses principally on explaining what s needed
More informationWhy big data? Lessons from a Decade+ Experiment in Big Data
Why big data? Lessons from a Decade+ Experiment in Big Data David Belanger PhD Senior Research Fellow Stevens Institute of Technology dbelange@stevens.edu 1 What Does Big Look Like? 7 Image Source Page:
More informationExtending the Enterprise Data Warehouse with Hadoop Robert Lancaster. Nov 7, 2012
Extending the Enterprise Data Warehouse with Hadoop Robert Lancaster Nov 7, 2012 Who I Am Robert Lancaster Solutions Architect, Hotel Supply Team rlancaster@orbitz.com @rob1lancaster Organizer of Chicago
More informationBIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES
BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES Relational vs. Non-Relational Architecture Relational Non-Relational Rational Predictable Traditional Agile Flexible Modern 2 Agenda Big Data
More informationECLT 5810 E-Commerce Data Mining Techniques - Introduction. Prof. Wai Lam
ECLT 5810 E-Commerce Data Mining Techniques - Introduction Prof. Wai Lam Data Opportunities Business infrastructure have improved the ability to collect data Virtually every aspect of business is now open
More informationBIG Data Analytics Move to Competitive Advantage
BIG Data Analytics Move to Competitive Advantage where is technology heading today Standardization Open Source Automation Scalability Cloud Computing Mobility Smartphones/ tablets Internet of Things Wireless
More informationA Visualization is Worth a Thousand Tables: How IBM Business Analytics Lets Users See Big Data
White Paper A Visualization is Worth a Thousand Tables: How IBM Business Analytics Lets Users See Big Data Contents Executive Summary....2 Introduction....3 Too much data, not enough information....3 Only
More informationA Review of Data Mining Techniques
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 4, April 2014,
More informationFoundations of Business Intelligence: Databases and Information Management
Foundations of Business Intelligence: Databases and Information Management Problem: HP s numerous systems unable to deliver the information needed for a complete picture of business operations, lack of
More informationSPATIAL DATA CLASSIFICATION AND DATA MINING
, pp.-40-44. Available online at http://www. bioinfo. in/contents. php?id=42 SPATIAL DATA CLASSIFICATION AND DATA MINING RATHI J.B. * AND PATIL A.D. Department of Computer Science & Engineering, Jawaharlal
More informationICT Perspectives on Big Data: Well Sorted Materials
ICT Perspectives on Big Data: Well Sorted Materials 3 March 2015 Contents Introduction 1 Dendrogram 2 Tree Map 3 Heat Map 4 Raw Group Data 5 For an online, interactive version of the visualisations in
More informationDecision Support Optimization through Predictive Analytics - Leuven Statistical Day 2010
Decision Support Optimization through Predictive Analytics - Leuven Statistical Day 2010 Ernst van Waning Senior Sales Engineer May 28, 2010 Agenda SPSS, an IBM Company SPSS Statistics User-driven product
More informationAnalytics in the Cloud. Peter Sirota, GM Elastic MapReduce
Analytics in the Cloud Peter Sirota, GM Elastic MapReduce Data-Driven Decision Making Data is the new raw material for any business on par with capital, people, and labor. What is Big Data? Terabytes of
More informationOracle Big Data Discovery Unlock Potential in Big Data Reservoir
Oracle Big Data Discovery Unlock Potential in Big Data Reservoir Gokula Mishra Premjith Balakrishnan Business Analytics Product Group September 29, 2014 Copyright 2014, Oracle and/or its affiliates. All
More informationBig Data. Fast Forward. Putting data to productive use
Big Data Putting data to productive use Fast Forward What is big data, and why should you care? Get familiar with big data terminology, technologies, and techniques. Getting started with big data to realize
More informationIBM SPSS Modeler Professional
IBM SPSS Modeler Professional Make better decisions through predictive intelligence Highlights Create more effective strategies by evaluating trends and likely outcomes. Easily access, prepare and model
More informationBig Data, Start Small! Dr. Frank Säuberlich, Director Advanced Analytics (Teradata International) 26 th May 2015
Big Data, Start Small! Dr. Frank Säuberlich, Director Advanced Analytics (Teradata International) 26 th May 2015 Agenda Introduction Big Data And The Emergence Of The Logical Data Warehouse Architecture
More informationIntroduction to Data Mining
Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:
More informationSome Research Challenges for Big Data Analytics of Intelligent Security
Some Research Challenges for Big Data Analytics of Intelligent Security Yuh-Jong Hu hu at cs.nccu.edu.tw Emerging Network Technology (ENT) Lab. Department of Computer Science National Chengchi University,
More informationBig Data Executive Survey
Big Data Executive Full Questionnaire Big Date Executive Full Questionnaire Appendix B Questionnaire Welcome The survey has been designed to provide a benchmark for enterprises seeking to understand the
More informationData Mining is sometimes referred to as KDD and DM and KDD tend to be used as synonyms
Data Mining Techniques forcrm Data Mining The non-trivial extraction of novel, implicit, and actionable knowledge from large datasets. Extremely large datasets Discovery of the non-obvious Useful knowledge
More informationBIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON
BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON Overview * Introduction * Multiple faces of Big Data * Challenges of Big Data * Cloud Computing
More informationChukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84
Index A Amazon Web Services (AWS), 50, 58 Analytics engine, 21 22 Apache Kafka, 38, 131 Apache S4, 38, 131 Apache Sqoop, 37, 131 Appliance pattern, 104 105 Application architecture, big data analytics
More informationBIG DATA: FROM HYPE TO REALITY. Leandro Ruiz Presales Partner for C&LA Teradata
BIG DATA: FROM HYPE TO REALITY Leandro Ruiz Presales Partner for C&LA Teradata Evolution in The Use of Information Action s ACTIVATING MAKE it happen! Insights OPERATIONALIZING WHAT IS happening now? PREDICTING
More informationData Isn't Everything
June 17, 2015 Innovate Forward Data Isn't Everything The Challenges of Big Data, Advanced Analytics, and Advance Computation Devices for Transportation Agencies. Using Data to Support Mission, Administration,
More informationAn Introduction to Advanced Analytics and Data Mining
An Introduction to Advanced Analytics and Data Mining Dr Barry Leventhal Henry Stewart Briefing on Marketing Analytics 19 th November 2010 Agenda What are Advanced Analytics and Data Mining? The toolkit
More informationCS590D: Data Mining Chris Clifton
CS590D: Data Mining Chris Clifton March 10, 2004 Data Mining Process Reminder: Midterm tonight, 19:00-20:30, CS G066. Open book/notes. Thanks to Laura Squier, SPSS for some of the material used How to
More informationUNIFY YOUR (BIG) DATA
UNIFY YOUR (BIG) DATA ANALYTIC STRATEGY GIVE ANY USER ANY ANALYTIC ON ANY DATA Scott Gnau President, Teradata Labs scott.gnau@teradata.com t Unify Your (Big) Data Analytic Strategy Technology excitement:
More informationHow To Make Sense Of Data With Altilia
HOW TO MAKE SENSE OF BIG DATA TO BETTER DRIVE BUSINESS PROCESSES, IMPROVE DECISION-MAKING, AND SUCCESSFULLY COMPETE IN TODAY S MARKETS. ALTILIA turns Big Data into Smart Data and enables businesses to
More informationAdvanced In-Database Analytics
Advanced In-Database Analytics Tallinn, Sept. 25th, 2012 Mikko-Pekka Bertling, BDM Greenplum EMEA 1 That sounds complicated? 2 Who can tell me how best to solve this 3 What are the main mathematical functions??
More information[callout: no organization can afford to deny itself the power of business intelligence ]
Publication: Telephony Author: Douglas Hackney Headline: Applied Business Intelligence [callout: no organization can afford to deny itself the power of business intelligence ] [begin copy] 1 Business Intelligence
More informationBig Data Can Drive the Business and IT to Evolve and Adapt
Big Data Can Drive the Business and IT to Evolve and Adapt Ralph Kimball Associates 2013 Ralph Kimball Brussels 2013 Big Data Itself is Being Monetized Executives see the short path from data insights
More informationSAP Solution Brief SAP HANA. Transform Your Future with Better Business Insight Using Predictive Analytics
SAP Brief SAP HANA Objectives Transform Your Future with Better Business Insight Using Predictive Analytics Dealing with the new reality Dealing with the new reality Organizations like yours can identify
More informationHadoop in the Hybrid Cloud
Presented by Hortonworks and Microsoft Introduction An increasing number of enterprises are either currently using or are planning to use cloud deployment models to expand their IT infrastructure. Big
More informationDATA SCIENCE CURRICULUM WEEK 1 ONLINE PRE-WORK INSTALLING PACKAGES COMMAND LINE CODE EDITOR PYTHON STATISTICS PROJECT O5 PROJECT O3 PROJECT O2
DATA SCIENCE CURRICULUM Before class even begins, students start an at-home pre-work phase. When they convene in class, students spend the first eight weeks doing iterative, project-centered skill acquisition.
More informationData Mining Analytics for Business Intelligence and Decision Support
Data Mining Analytics for Business Intelligence and Decision Support Chid Apte, T.J. Watson Research Center, IBM Research Division Knowledge Discovery and Data Mining (KDD) techniques are used for analyzing
More information