2015 MapR Technologies
MapR: Best Solution for Customer Success
  Best Product
  High Growth
  700+ Customers
  Premier Investors
  Apache Open Source
  2X Growth in Direct Customers
  2X Growth in Annual Subscriptions (ACV)
  140% Dollar-based Net Expansion
  90% Subscription Licenses
  Software Margins
Data Increasingly Stored in Non-Relational Datastores
  (timeline: 1980, 1990, 2000, 2010, 2020)
              | RELATIONAL DATABASES | NON-RELATIONAL DATASTORES
  Volume      | GBs-TBs | TBs-PBs
  Structure   | Structured | Structured, semi-structured and unstructured
  Development | Planned (release cycle = months-years) | Iterative (release cycle = days-weeks)
  Database    | Fixed schema; DBA controls structure | Dynamic / flexible schema; application controls structure
REALITY 2: Hadoop is Being Used to Drive Small, Rapid Decisions
  High Arrival Rate Data: Clickstream, Social media, Sensor data
  Business Impact: Revenue optimization, Risk mitigation, Operational efficiency
What Problem is Drill Solving?
  Making data actionable much faster
  Removing the Hadoop skills barrier
Industry's First Schema-free SQL Engine for Big Data
Business Benefits
  Rapid time-to-value for business analysts: SQL specialists and BI analysts can query any dataset, including complex nested data, instantly, versus waiting several weeks for data preparation by IT.
  Efficiency with easy governance for IT: IT can avoid unnecessary ETL cycles and schema maintenance activities, but still ensure governance through easy-to-deploy granular access controls.
  Accelerated big data adoption for businesses: Organizations can use the existing and large SQL talent base and tools to rapidly discover new business insights from big data.
Apache Drill Brings Flexibility & Performance
  Access to any data type, any data source: Relational; Nested data; Schema-less
  Rapid time to insights: Query data in-situ; No schemas required; Easy to get started
  Integration with existing tools: ANSI SQL; BI tool integration
  Scale in all dimensions: TB-PB of scale; 1000s of users; 1000s of nodes
  Granular security: Authentication; Row/column level controls; De-centralized
Enabling As-It-Happens Business with Instant Analytics
  Governed approach (total time to insight: weeks to months): Hadoop data, Data modeling, Transformation, Data movement (optional), Users; repeated for source data evolution and new business questions
  Exploratory approach (total time to insight: minutes): Hadoop data, Users
Extending Self Service to Schema-free Data
  (chart: Agility & Business Value against Use cases for BI)
  1980s-1990s: IT-Driven BI (IT-created reports, spreadsheets)
  2000s: IT-Driven BI; Self-Service BI (analyst-driven with IT support for ETL)
  Now: IT-Driven BI; Self-Service BI; Schema-Free Data Exploration (analyst-driven with no IT dependency)
Drill's Role in the Enterprise Data Architecture
  Raw data: JSON, CSV, ...
  Optimized data: Parquet
  Centrally-structured data: Schemas in Hive Metastore
  Relational data: Highly-structured data
  Exploration (known and unknown questions)
  Oracle, Teradata
  Hive, Impala, Spark SQL
Access control that scales
  PAM Authentication + User Impersonation
  (diagram: Users; Drill View 1; Drill View 2; Files, HBase, Hive)
  Fine-grained row and column level access control with Drill Views; no centralized security repository required
Granular security permissions through Drill views
  Raw File (/raw/cards.csv). Owner: Admins. Permission: Admins.
    Name | City     | State | Credit Card #
    Dave | San Jose | CA    | 1374-7914-3865-4817
    John | Boulder  | CO    | 1374-9735-1794-9711
  Business Analyst View. Owner: Admins. Permission: Business Analysts.
    Name | City     | State
    Dave | San Jose | CA
    John | Boulder  | CO
  Data Scientist View (/views/maskedcards.csv). Owner: Admins. Permission: Data Scientists.
    Name | City     | State | Credit Card #
    Dave | San Jose | CA    | 1374-1111-1111-1111
    John | Boulder  | CO    | 1374-1111-1111-1111
  Not a physical data copy
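The view-based masking on the "Granular security permissions through Drill views" slide can be illustrated with a small sketch. This is not Drill itself: SQLite (from the Python standard library) stands in as the SQL engine, and the table name `cards`, the view names `analyst_view` and `scientist_view`, and the column name `card_number` are assumptions made for the example. Drill's own view syntax, storage paths, and ownership/permission model are not shown here.

```python
import sqlite3

# SQLite is a stand-in for illustration only; Drill's view syntax and
# its file-permission model differ from what is shown here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cards (name TEXT, city TEXT, state TEXT, card_number TEXT)"
)
conn.executemany(
    "INSERT INTO cards VALUES (?, ?, ?, ?)",
    [
        ("Dave", "San Jose", "CA", "1374-7914-3865-4817"),
        ("John", "Boulder", "CO", "1374-9735-1794-9711"),
    ],
)

# Business analyst view (hypothetical name): the card number column is omitted.
conn.execute("CREATE VIEW analyst_view AS SELECT name, city, state FROM cards")

# Data scientist view (hypothetical name): the first four digits are kept,
# the remaining groups are replaced with a fixed mask.
conn.execute(
    "CREATE VIEW scientist_view AS "
    "SELECT name, city, state, "
    "substr(card_number, 1, 4) || '-1111-1111-1111' AS card_number "
    "FROM cards"
)

analyst_rows = conn.execute("SELECT * FROM analyst_view ORDER BY name").fetchall()
scientist_rows = conn.execute("SELECT * FROM scientist_view ORDER BY name").fetchall()

for row in scientist_rows:
    print(row)
```

Both views are computed from the raw table at query time, which mirrors the slide's point that a view is not a physical data copy.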
Case Studies
Drill works with a wide set of Sources and Tools
  Use cases: Raw Data Exploration; JSON Analytics; DWH Offload
  Sources: {JSON}, Parquet, Text Files; Files, Directories; Hive; HBase
Data Warehouse Offload with Drill & MapR
Ultimately replace existing expensive SQL analytics platform with Hadoop
  OBJECTIVES
    Mine credit card data and compare consumer shopping habits
    Require internal SQL specialists to gain instant access to data at all times
  CHALLENGES
    Want to preserve instant access to data but at a lower price point
    Need a system that is reliable, does not lose data and is fast
    Must be able to leverage the SQL skill sets in the company
  SOLUTION
    Apache Drill allows interactive analysis on large datasets, with MapR as the underlying platform that meets scale, reliability and data protection needs
    SQL users did not have to learn Pig, HiveQL or any other language, and continue to use Tableau and SQuirreL on top of Drill
  Business Impact Potential
    Hadoop and Drill dramatically reduce the price point to less than $1,000 / TB
    MapR platform with Drill delivers reliability and performance for the end users
    Leverage existing BI and SQL skill sets on Hadoop without retraining
Telecom OEM application with Drill & MapR
Leverage Drill's JSON capabilities to create revenue-generating IoT services
  OBJECTIVES
    Offer service to mobile operators to proactively monitor and improve their subscriber experience
    Instant availability of data from diverse and disparate sources
  CHALLENGES
    Data is very diverse and dynamic, using JSON as the key format
    Require interactive, ad-hoc analysis capabilities via standard BI tools such as Tableau and Spotfire
  SOLUTION
    Apache Drill is being used to build the engine for the interactive experience
    Drill allows SQL queries on incoming JSON structures natively without requiring any centralized schema definitions
    Drill connects to all BI tools using standard ODBC connectors
  Business Impact Potential
    Provide new revenue-generating services to mobile operators
    Enable deeper, instant intelligence about the networks and users
    Reduce maintenance costs: no IT intervention required for schema changes
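The idea of querying incoming JSON without a centralized schema definition can be sketched in a few lines of plain Python. This is only an illustration of the schema-on-read concept, not Drill's engine or API: the sample records, the field names (`subscriber`, `cell.signal`, `device`), and the helpers `get_path` and `project` are all hypothetical.

```python
import json

# Hypothetical incoming records; the second record has a field the first
# lacks, and lacks one the first has. No schema is declared anywhere.
raw_lines = [
    '{"subscriber": "a1", "cell": {"id": 17, "signal": -71}}',
    '{"subscriber": "b2", "cell": {"id": 9}, "device": "phone"}',
]


def get_path(record, path):
    """Walk a dotted path through nested dicts; return None if any key is absent."""
    value = record
    for key in path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def project(lines, paths):
    """Parse each JSON line and select the requested paths, like a SELECT list."""
    return [tuple(get_path(json.loads(line), p) for p in paths) for line in lines]


rows = project(raw_lines, ["subscriber", "cell.signal", "device"])
for row in rows:
    print(row)
```

Structure is discovered per record at read time, so a field that appears or disappears in the incoming data yields a null in the result instead of requiring a schema change up front.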
Drill Benefits - Recap
  Business Analyst, Data scientists, VP of Hadoop Dev., Director of BI & Analytics, Enterprise architect
  Business users | Technical IT
  Self-service access to Hadoop data from BI tools | Drive Hadoop adoption in company
  Agility with no IT intervention | Enable better/new BI in raw, real time and new data types
  Interactive performance | Reduce cost of traditional systems
Q & A
  Engage with us! @mapr, maprtech, mapr-technologies, MapR, maprtech
  nitin@mapr.com