Native Connectivity to Big Data Sources in MSTR 10
Bring All Relevant Data to Decision Makers
Support for More Big Data Sources
Optimized access to your entire big data ecosystem as if it were a single database.
- MapReduce & NoSQL Databases: Elastic Map Reduce, BigInsights, Distribution, HDFS
- Columnar Databases: Redshift
- Data Warehouse Appliances: HANA, Parallel Data Warehouse
- Relational Databases
- Multidimensional Databases: Analysis Services
- SaaS-Based App Data: Google Analytics, Zendesk, Generic Web Services (SOAP, REST), Generic Web Services with OAuth, and many more
- User / Departmental Data: Clipboard, MicroStrategy Dataset
Connectivity to HBase, Cassandra and Spark
HBase
- We connect to HBase via Apache Phoenix.
- The user needs to install the JDBC driver.
- Create a database instance with the Apache Phoenix 4.x JDBC driver.
- Limitation: Phoenix-specific SQL syntax. Some SQL features, such as cross join and union, are not supported.
Cassandra
- We certify the open source JDBC driver for Cassandra.
- Freeform and Query Builder reports are allowed against the source.
Spark
- The connection to Spark is via ODBC, similar to the connection to Hive.
- Users can use the pre-installed ODBC driver for Apache Hive.
- Limitation: same as Hive.
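The HBase bullets above can be made concrete with a small sketch. It assumes the standard Phoenix thick-driver URL form, jdbc:phoenix:<ZooKeeper quorum>:<port>:<znode>. The host names, the default port 2181 and the /hbase znode are placeholder assumptions, and the helper names are hypothetical. The pre-check only mirrors the two limitations named on the slide (cross join, union); it is a naive text scan, not a SQL parser, and it is not part of MicroStrategy or Phoenix.

```python
import re


def phoenix_jdbc_url(zk_hosts, port=2181, znode="/hbase"):
    """Build a Phoenix JDBC URL of the form jdbc:phoenix:<quorum>:<port>:<znode>.

    zk_hosts, port and znode are assumptions about the target cluster.
    """
    if not zk_hosts:
        raise ValueError("at least one ZooKeeper host is required")
    return f"jdbc:phoenix:{','.join(zk_hosts)}:{port}:{znode}"


# Features the slide lists as unsupported ("cross join, union etc.").
# Hypothetical helper table: extend it to match your driver version.
UNSUPPORTED = {
    "CROSS JOIN": re.compile(r"\bCROSS\s+JOIN\b", re.IGNORECASE),
    "UNION": re.compile(r"\bUNION\b", re.IGNORECASE),
}


def unsupported_features(sql):
    """Return the names of flagged features found in a freeform SQL string."""
    return [name for name, pattern in UNSUPPORTED.items() if pattern.search(sql)]


if __name__ == "__main__":
    print(phoenix_jdbc_url(["zk1", "zk2"]))
    print(unsupported_features("SELECT a FROM t UNION SELECT a FROM u"))
```

A check like this could run before a freeform report is submitted, so the author sees the limitation up front instead of a driver error. Because it scans raw text, it can also flag keywords that appear inside string literals.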
How Does MicroStrategy Integrate with Hadoop?
- SQL on Hadoop: MicroStrategy certifies Cloudera Impala, Google BigQuery and Pivotal HAWQ as data sources.
- Apache Hive: MicroStrategy optimizes and certifies Hadoop/Hive as a data source.
- Apache Shark/Spark: MicroStrategy certifies Spark/Shark on HDFS.
- Apache Pig: MicroStrategy also provides a connector to execute Freeform Pig Latin reports.
Usage Patterns for MicroStrategy with Hadoop as a Data Source
1. Visually explore a subject-matter extract in memory, loaded through a one-time query to Hadoop.
2. Run self-service parameterized queries directly against Hadoop.
3. Use model-driven access to Hadoop.
4. Query a multi-source schema model and drill down among Intelligent Cubes, the EDW and Hive.
(Diagram: the four patterns ordered by maturity of data access. Labels: Multi-dimensional Business Model, RDBMS, ETL.)
Making Better Sales Decisions by Getting Insights from Web Logs in Hadoop
Key BI Characteristics:
- INDUSTRY: Entertainment
- BI COMPONENTS: 1 Application; Traditional Reports
- USERS: ~200
- DATABASE: Hadoop, Teradata
- HADOOP DISTRIBUTION: Amazon EMR
- VOLUME OF DATA: Petabytes
- TYPE OF DATA: Log and events data
- APPLICATIONS: Sales Analysis
Business Use and Benefits
- Sales analysis, typically around a new launch in a new region: quick report analysis to understand new accounts, number of hours of viewing, etc.
- Querying and reporting directly from MicroStrategy on logs via Hive.
- Able to make better sales decisions.
Analyzing Online Behavior with Live Queries to Hadoop
Key BI Characteristics:
- INDUSTRY: E-commerce
- BI COMPONENTS: 1 Application; Reports, Dashboards, VI
- USERS: ~200
- DATABASE: Hadoop, Oracle
- HADOOP DISTRIBUTION: Apache
- VOLUME OF DATA: Petabytes
- TYPE OF DATA: Web logs, online behavior
- APPLICATIONS: Sales Analysis
Business Use and Benefits
- Analyzing web logs and online behavior stored in Hadoop.
- Dashboards and VI analysis run against our in-memory cubes, while ad hoc reports run live against Hive.
- End users do not need to code with MapReduce.
- Developers are more productive delivering self-service BI through a tool instead of coding a custom user interface.
Precise Targeting of Digital Ads
Multi-Channel Digital Distribution Provider
Key BI Characteristics:
- INDUSTRY: Electronics and Media
- BI COMPONENTS: 1 Application; Reports, VI, Dashboards
- DATABASE: Hadoop, Hive
- HADOOP DISTRIBUTION: Cloudera Impala
- VOLUME OF DATA: Over 1 billion traffic attribute combinations
- APPLICATIONS: Traffic Attribute Multiplier
Business Use and Benefits
- The Traffic Attribute Multiplier application helps Adconion precisely target its digital ads, shorten the time to prepare and tune models, and improve ad delivery ROI for its customers.
- It leverages MicroStrategy's integration with Impala and the rich visualization library, making results easy for business users to consume.
- Achieved a 2.4% improvement in ad budget spending efficiency.
Tap into Hadoop Natively
- MicroStrategy 9.4.1: MicroStrategy Analytics Platform -> Hive ODBC Connector -> Hadoop distribution (Hive, HDFS)
- MicroStrategy v10: MicroStrategy Analytics Platform -> Big Data Engine (NEW) -> Hadoop (HDFS)
Benefits of the Big Data Engine:
- Higher performance, as we bypass the Hive layer
- Ability to consume unstructured data from Hadoop
- No ODBC overhead
How Does the Big Data Engine Work?
- Big Data Engine (BDE) is a native YARN application that enables direct access to HDFS.
- This component is installed on the Hadoop cluster.
- BDE creates the metadata on the fly when files are selected and imported.
- With the Connect Live option, multi-table data import is not supported.
- In internal testing, BDE was at least 5 times faster than comparable Hive-based technologies.
(Diagram: in the Hadoop cluster, a Big Data Query Engine runs on the Name Node and a Big Data Execution Engine runs on each Data Node. Data partitions are loaded in parallel into a partitioned in-memory cube, or queried through Connect Live.)
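The parallel, partitioned pattern in the diagram can be illustrated with a toy sketch. This is not the Big Data Engine's implementation: the partitions are plain in-memory lists standing in for HDFS data on separate data nodes, the (region, hours) row layout is invented for the example, and the function names are hypothetical. One worker per partition plays the role of an execution engine, and a coordinator merges the partial results.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def aggregate_partition(rows):
    """Stand-in for one per-node execution engine: aggregate one partition.

    Each row is a hypothetical (region, hours_viewed) pair.
    """
    totals = Counter()
    for region, hours in rows:
        totals[region] += hours
    return totals


def build_cube(partitions, max_workers=4):
    """Stand-in for the coordinating query engine.

    Fans the partitions out to workers in parallel, then merges the
    per-partition totals into a single in-memory result.
    """
    cube = Counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for partial in pool.map(aggregate_partition, partitions):
            cube.update(partial)  # Counter.update adds counts key by key
    return dict(cube)


if __name__ == "__main__":
    partitions = [
        [("EU", 2), ("US", 3)],  # rows held by the first data node
        [("EU", 5)],             # rows held by the second data node
    ]
    print(build_cube(partitions))
```

The point of the design is that each partition is processed where it lives and only small partial results travel to the coordinator, which is the general reason a partitioned engine can avoid funnelling every row through a single connection.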
Three Steps for Self-Service Access to Hadoop with Native Connectivity
1. Import data from HDFS directly: web logs, survey/feedback forms, machine-generated data.
2. Cleanse and refine with Data Wrangler: cleanse, refine and transform data from HDFS to make it ready for analysis. Designed for business users.
3. Analyze with Visual Insight: get full insights from Hadoop/HDFS data using Visual Insight.
Demo
Support for Interactive Searches on Unstructured Data
Polaris brings the capability to do full-text search via Apache Solr. This enables users to quickly investigate:
- Website logs
- Application usage
- Surveys and free-form text fields
- Event and error monitoring logs
Steps:
1. Configure the file in Apache Solr.
2. Import the file via data import.
3. Search for keywords via Visual Insight.
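For orientation, the keyword search in the last step corresponds to a request against Solr's standard /select handler. The sketch below only builds such a URL. The base URL, the core name "weblogs" and the field name "text" are assumptions for illustration, the helper is hypothetical, and it does not show how MicroStrategy issues its own Solr requests.

```python
from urllib.parse import urlencode


def solr_select_url(base_url, core, keyword, field="text", rows=10):
    """Build a URL for Solr's /select handler that searches one field.

    base_url, core and field are placeholders for your Solr deployment.
    The keyword is passed through as-is, so characters that are special
    in Solr query syntax are not escaped here.
    """
    params = {"q": f"{field}:{keyword}", "rows": rows, "wt": "json"}
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(solr_select_url("http://localhost:8983/solr", "weblogs", "timeout"))
```

Fetching the resulting URL with any HTTP client returns the matching documents as JSON, which is a quick way to confirm that the file was indexed before searching it from Visual Insight.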
Questions? Q&A