Big Data Analytics: Energy Industry
Douglas Moore, Principal Consultant, Architect
June 2013
Agenda
- Why Big Data in Energy?
- Imagine Overview
  - Use Cases
  - Readiness Analysis
  - Architecture
  - Development Capability
- Q&A
CONFIDENTIAL
Leading Provider of Innovative Big Analytics Services
Building Modern Analytics Solutions to Monetize Big Data Investments
- IMAGINE: Strategy and
- ILLUMINATE: Training and Education
- IMPLEMENT: Hands-On Data Science and Data Engineering
Electricity Markets: Big Data Heaven
- 10s of millions of meters, 5-minute intervals [1]
- Evolving nature of energy markets: spot energy markets
- Demand Response: PJM Interconnection agreed to have 9 percent of its projected electricity needs met by load shedders [2]
- Multiple services: supply, capacity, phase, standby, renewables
- Complex interaction: physical, financial, data
- V2G: "[The vehicle-to-grid cars] essentially act as batteries for the grid" [3]
Image courtesy of duke.edu
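To give a feel for the scale behind "10s of millions of meters" read every 5 minutes, the short sketch below counts interval reads per day. The fleet sizes are illustrative assumptions, not figures from the slide; only the 5-minute interval comes from the source.

```python
MINUTES_PER_DAY = 24 * 60


def reads_per_day(meters: int, interval_minutes: int) -> int:
    """Return the number of interval reads a meter fleet produces per day."""
    if interval_minutes <= 0 or MINUTES_PER_DAY % interval_minutes != 0:
        raise ValueError("interval must divide evenly into a day")
    return meters * (MINUTES_PER_DAY // interval_minutes)


# Hypothetical fleet sizes standing in for "10s of millions of meters".
ASSUMED_FLEETS = (10_000_000, 50_000_000)

for fleet in ASSUMED_FLEETS:
    total = reads_per_day(fleet, interval_minutes=5)
    print(f"{fleet:,} meters at 5-minute intervals: {total:,} reads per day")
```

Each meter contributes 288 reads per day at this interval, so the daily total grows linearly with fleet size.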
Electricity Markets
Cool, midmorning day in New England: $30 swing
Predictive Capabilities
Current Challenges in the Industry
Data-related challenges:
- Growing volume, quality and timeliness of data from smart grid, power meters, generators/infrastructure, building management systems, etc.
- Predicting grid outages & reliability
- High cost for Energy Efficiency services
Technical challenges:
- Data silos
- Data quality challenges
- Costly data storage
- Rigid architecture
Data volumes will only continue to grow: 90M homes to employ home automation. [1]
Smart meters will increase the number of reads from 6 million to 18 billion per year for a mid-size utility. [2]
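One way to sanity-check the jump from 6 million to 18 billion reads per year is the rough calculation below. The slide does not state the meter count or the read intervals, so the 500,000 meters, monthly manual reads, and 15-minute smart meter intervals used here are all assumptions chosen for illustration.

```python
# All three inputs are assumptions; the slide gives only the before/after totals.
ASSUMED_METERS = 500_000              # hypothetical mid-size utility
MANUAL_READS_PER_METER_YEAR = 12      # assumed: one manual read per month
SMART_READS_PER_METER_DAY = 24 * 4    # assumed: one read every 15 minutes
DAYS_PER_YEAR = 365

manual_reads = ASSUMED_METERS * MANUAL_READS_PER_METER_YEAR
smart_reads = ASSUMED_METERS * SMART_READS_PER_METER_DAY * DAYS_PER_YEAR
growth_factor = smart_reads // manual_reads

print(f"Manual reads per year: {manual_reads:,}")
print(f"Smart meter reads per year: {smart_reads:,}")
print(f"Growth factor: {growth_factor:,}x")
```

Under these assumptions the manual total is 6 million and the smart meter total is about 17.5 billion, which is in line with the slide's rounded figures.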
POLL
Imagine Project Overview
Objective: develop a Big Data strategy, architecture, and roadmap (cross-section of use cases from across the business units)
Timeline: 2 weeks, 1 week, 1 week, 1 week
Phases: Big Data Strategy, Readiness Analysis, Key Recommendations
Activities and deliverables:
- Use Case Prioritization
- Use Case Definition
- Gap Analysis & Readiness Assessment
- Architecture Design
- Capability Definition
- 24-Month Executive Presentation
Strategy | Readiness | Key Recommendations
Engaging the Business
[Figure: Priority, Discover, Validate, Use Cases]
Use Cases
[Figure: use case callouts, including "reduce setback after 9PM"; values shown: 50 MW, 40 MW, 45 MW, 150 MW]
Big Data Readiness
Big Data Area: Capabilities
- Reports & Analytics: Standard Reports, Decision Support and Business Ad Hoc Analysis, Data Science Analysis, Collaboration Tools
- Applications: Development Stack, Application Integration, Extensibility, Scalability, Personalization, Application Performance, Workflow Management
- Data: Unstructured Data Ingestion, Structured Data Ingestion, Data Availability, Security and Privacy, Retention, Meta Data Management, Data Publication, Data Volumes, ETL/Data Transformation, Data Latency, Data Veracity
- Infrastructure: Hosting Strategy, Cluster Workload Planning and Allocation, Backup & Recovery, Availability, Security, Integration, Disaster Recovery, Capacity Management & Monitoring
- People & Resources: Production Group, System Admin, Cluster Admin, Software Developers, Data Scientists, Business Analysts, QA, Training / Mentorship
Recommendations:
- Hire big data engineers, and expose the existing engineering team to Big Data technologies such as Hadoop, Hive, and Pig through big data workshops.
- Transition data analysts to Big Data-compliant technologies such as R and Python.
- Empower data scientists with ad hoc, web-based visualization tools such as RStudio Shiny.
Architecture
Governance
The following is a high-level list of focus areas that the big data governing body may consider when developing processes and policies for the new Big Data technology.
Data Classification
- PII or other sensitive data
- Data transmission security requirements
- Data partitioning / segmentation
- Data access controls and auditing
- Historical data retention requirements / availability
Technology / Architecture
- New technology introduction
- Enterprise versus Business Unit Requirements
Data Stewardship
- System of record
- Data Services and SLAs
- Global Access Controls
Resource allocation
Skills Transformation
Build on current Data, Analyst, and Developer roles to embrace new Big Data tools & skills.

Database Administrator -> Big Data Administrator
Prior experience:
- Diverse system environments
- Application performance management
- Systems appreciation
- Metrics-focused
New skills:
- Management & monitoring tools
- Metrics
- Automation for scale
- Lower-level workload tuning

Top-Tier Statistician, Math Modeler -> Data Science
New skills:
- Introduction to Hadoop
- New tools for data manipulation
- Variety of new models
- Challenging top-down approaches
- Working with unstructured data
- Bottom-up pattern discovery
- Efficient programming at scale
- Large-scale Machine Learning

Data Architect -> Data Architect (Big Data Modeling)
Prior experience:
- Data-focused: digging into details
- Diverse database environments
- Deep domain knowledge
- Familiarity with unstructured data
- Hybrid db and non-db systems
New skills:
- Data modeling for unstructured data
- Alternative tools and documentation
- Languages and APIs (Hive, Pig, M/R)
- Process Models (M/R, Key/Value)
- Streaming data processing
- Lower-level optimizations

DevOps, Developers -> Big Data Engineer
New skills:
- Processing models (MapReduce, Key/Value)
- Data modeling
- Schemas for unstructured data
- Languages/APIs (Hive, Pig, M/R)
- User Defined Functions
- Streaming Data Processing
- Work process from small to full-scale
- Investigating approaches
- Manual optimization
Big Data
- Use Cases: Data Quality, Predictive Grid Failures, Automated EE Identification
- Data: Interval, ERP, BMS, 3rd Party
- Architecture: Real-Time Ingestion
- Governance, Metadata, Versioning, Tagging
- People & Organization: Staffing & Training, Build Data Science Practice
POLL
Questions?