Intelligent Operation Analysis and Application of Power Big Data
BaiQing DIAO
2014.9.15
Contents
Ⅰ Introduction of Power Big Data
Ⅱ Application of Technology Framework
Ⅲ Power Big Data Quality Management
Ⅳ Data Mining and Visualization
Introduction of Power Big Data - Types of Business
The company's operation data covers every kind of business information in the power grid.
Transmission: condition monitoring for transmission lines, HVDC, flexible
Transformation: intelligent power station
Distribution: distribution automation, MDS management
Utilization: electricity consumption information gathering, intelligent buildings, electric vehicles
Dispatch: EMS, OMS
Resources guarantee: HR, science and technology
Introduction of Power Big Data - Business System
As business systems are deployed and applied, data volume increases sharply and data types multiply. Data storage, handling, and value mining face higher requirements, and a unified data platform and unified data management are needed.
Business systems:
ERP (HR system, financial system, materials system): data volume grows from TB to PB
Manufacturing system: stores equipment, maintenance, and order information; data volume reaches TB scale
Dispatching system: broad coverage, huge data volume, high data processing performance
Marketing system: data volume increases from TB to PB; data from the customer service center reaches PB scale
Data platforms:
Structured data platform: data warehouse volume at TB scale; data volume grows by TB every month
Mass/real-time data platform: gathers information from electricity consumption and transformation systems; data storage at TB scale
Unstructured data platform: the amount of data reaches 10 million in existing business systems; total data storage at TB scale
GIS data platform: includes graphic data, attribute data, and topological data; GIS data storage at TB scale
Requirements:
Data volume grows from TB to PB, which requires high storage performance and extensibility
Intelligent and refined business requires real-time processing and handles high data processing complexity
Data mining capability across businesses and platforms needs to improve
Introduction of Power Big Data - Business Data
Power big data includes structured data, unstructured data, mass/real-time data, and GIS data.
Structured data (relational data): data from business applications, used for horizontal sharing, vertical cascading, data analysis, etc.
Unstructured data (file data): office documents, pictures, XML, HTML, video, and audio.
Mass/real-time data (timestamped data): generated by the acquisition system and the condition monitoring system.
GIS data: graphic data, attribute data, and topological data.
Introduction of Power Big Data - Data Attribute
4V: Volume, Variety, Velocity, Value
3E: Energy, Exchange, Empathy
Energy: data factory, efficiency, power plant, power to data
Exchange: visualization, real-time, two-way interaction
Empathy: reflecting requirements, crossing boundaries, ecological benefit
Application of Technology Framework
Data application: BI/reports, retrieval/visualization, functional applications, professional applications, analytics applications, predictive analytics
Data services platform:
Data management and service: metadata management, model management, data service, data quality
Data mining and analysis: data mining, data analysis
Data calculation: stream computing, parallel computing, data retrieval
Data storage: relational database, NoSQL, distributed file system
Data integration and governance: data integration, data governance
Data source: structured data, unstructured data, mass/real-time data, GIS data
Key Technology 1 - Data Integration
Technologies such as ETL and OGG extract the structured, unstructured, mass/real-time, and GIS data distributed across business systems.
Data source: structured data, unstructured data, mass/real-time data, GIS data
Data extraction: ETL, SQL, WebService, copy, OPC
Data association: data filtering, data standardization, feature extraction, metadata association
Data quality check: integrity, accuracy, timeliness
Target data: structured data, unstructured data, mass/real-time data, GIS data
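The standardization and quality-check steps of this pipeline can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields (`meter_id`, `ts`, `kwh`), the 24-hour freshness limit, and the rule that a negative reading counts as inaccurate are all hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Reading:
    """Common target record (hypothetical field names)."""
    meter_id: str
    timestamp: datetime
    kwh: Optional[float]


def standardize(raw: dict) -> Reading:
    """Data standardization: map one raw source row to the common record."""
    value = raw.get("kwh")
    return Reading(
        meter_id=str(raw["meter_id"]).strip().upper(),
        timestamp=datetime.fromisoformat(raw["ts"]),
        kwh=float(value) if value not in (None, "") else None,
    )


def quality_issues(reading: Reading, now: datetime,
                   max_age: timedelta = timedelta(hours=24)) -> List[str]:
    """Data quality check: return the failed dimensions for one record."""
    issues = []
    if reading.kwh is None:
        issues.append("integrity")    # required value is missing
    elif reading.kwh < 0:
        issues.append("accuracy")     # assumed rule: readings cannot be negative
    if now - reading.timestamp > max_age:
        issues.append("timeliness")   # record arrived too late (assumed limit)
    return issues
```

Records that return an empty list would be loaded into the target store, while the others would be routed to data quality handling.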
Key Technology 2 - Data Storage
Data storage capacity is improved with infrastructure such as relational database clusters, distributed real-time databases, and distributed file systems.
Platforms: structured data management platform, unstructured data management platform, mass/real-time data management platform, GIS data management platform
Infrastructure: relational database cluster (RDB, RDB, RDB), distributed real-time database, distributed file system
Components: database management, in-memory database, metadata management, access control, NoSQL database, redundancy strategy, cloud storage
Key Technology 3 - Data Calculation
Data calculation capacity is improved with technologies such as parallel processing, stream computing, and indexing.
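As a small illustration of the stream computing idea, the sketch below updates a moving average each time a new measurement arrives, instead of recomputing over stored history. The function name and the window size are hypothetical. A production system would use a stream processing platform rather than a single Python generator.

```python
from collections import deque


def sliding_mean(stream, window):
    """Yield the mean of the last `window` values as each new value arrives."""
    buf = deque(maxlen=window)
    total = 0.0
    for value in stream:
        if len(buf) == window:
            total -= buf[0]      # oldest value is about to be evicted
        buf.append(value)
        total += value
        yield total / len(buf)
```

For example, `list(sliding_mean([10, 20, 30, 40], 2))` gives `[10.0, 15.0, 25.0, 35.0]`.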
Key Technology 4 - Data Mining
Data mining technology is used to build business analysis models and find business value.
Data mining technology: multi-dimensional analysis, parallel computing, clustering, association rules, MapReduce, semantic engines, classification, in-memory computing, time series, regression analysis
Data mining models: state grid analysis model, distribution load forecast model, electricity sales forecast model, user behavior model, purchasing structure optimization model
Analytical applications: power grid state tracking, distribution load forecasting, electricity sales analysis, user behavior analysis, intelligent knowledge base
Key Technology 5 - Data Application and Presentation
Data presentation uses different forms, such as charts, GIS, video, and maps, to display data analysis results on PCs, large screens, tablets, and mobile devices.
Layers shown: display vector, display form, analysis and mining, data source
Analysis and mining: clustering analysis, predictive analytics, association analysis, variance analysis
Data source: structured data, unstructured data, mass/real-time data, GIS data
Key Technology 6 - Data Management and Services
A data service platform is established to manage data storage, calculation, and data mining in a unified way.
Data quality management platform: interface verification management, code verification management, monitoring class management, data quality statistics, verification scene management, data interface monitoring, data quality report, verification result board, data access monitoring, data schedule management, data stream monitoring, automatic dispatch
Data service management: connection service, access service, data model management, data operations management
Metadata management and master data management: structured data, unstructured data, GIS data, mass/real-time data, metadata, master data
Integrated enterprise-layer management model
Power Big Data Quality Management - Data Link
Power big data comes from various business systems. Organizing the data link clarifies the management responsibility for each point in the link and makes it possible to draw the data distribution and flow map.
Diagram labels: Application, Storage, Stream, Storage, Business
Power Big Data Quality Management - Data Link Monitoring
Monitoring rules are designed for the data link points in creation, calculation, transformation, and circulation, so that data quality issues can be located and closed-loop management can be realized.
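One way to picture such monitoring rules is as a list of checks, each bound to a data link point and to a responsible owner, so that a failed check immediately locates the issue. The sketch below is hypothetical throughout: the rule set, the owner names, and the batch fields (`rows`, `rows_out`, `delay_min`) are assumptions for illustration, not the rules used in the project.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class MonitorRule:
    point: str                      # data link point, e.g. "creation"
    owner: str                      # team responsible for that point (hypothetical)
    description: str
    check: Callable[[Dict], bool]   # returns True when the batch passes


# Hypothetical rules, listed in data link order.
RULES = [
    MonitorRule("creation", "metering team", "batch is not empty",
                lambda b: b["rows"] > 0),
    MonitorRule("transformation", "platform team", "no rows lost in transformation",
                lambda b: b["rows_out"] == b["rows"]),
    MonitorRule("circulation", "platform team", "delivered within 60 minutes",
                lambda b: b["delay_min"] <= 60),
]


def locate_issues(batch: Dict, rules: List[MonitorRule] = RULES) -> List[Tuple[str, str, str]]:
    """Return (point, owner, description) for every failed rule."""
    return [(r.point, r.owner, r.description) for r in rules if not r.check(batch)]
```

Each returned tuple names the link point and the owner to notify, which is the information needed to close the loop once the issue has been fixed and the rule passes again.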
Power Big Data Quality Management - Technical Standard
Develop data quality technical standards covering the data model, data access, application development, and utilization.
Power Big Data Quality Management - Management Standard
Develop data quality management standards covering data connection, alteration, and utilization.
Data Mining - Application Framework
1. Indicator
Mine the historical data of a single indicator to find how the indicator changes over time, and use the indicator to monitor data quality and business transactions. Suitable data mining algorithms: time series, linear fitting.
2. Vertical indicator relationship
Using the historical data of lower-layer indicators as the basis, mine the data for the upper-layer point of concern. Suitable data mining algorithms: regression, association rules, clustering.
3. Horizontal indicator relationship
By studying the relevance between pairs of indicators, build the indicator correlation network diagram and find potential business associations that provide a basis for decisions. Suitable data mining algorithms: correlation analysis, association rules, clustering.
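For the single-indicator case, linear fitting can be sketched as follows: fit a least-squares line to the indicator history and flag the points that deviate from it by more than a tolerance. The function names and the tolerance are hypothetical, and the history must contain at least two values.

```python
def linear_fit(values):
    """Least-squares line y = a + b*t over t = 0..n-1; returns (a, b)."""
    n = len(values)
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    slope = sxy / sxx
    return y_mean - slope * t_mean, slope


def flag_outliers(values, tolerance):
    """Indices whose value differs from the fitted line by more than `tolerance`."""
    a, b = linear_fit(values)
    return [t for t, y in enumerate(values) if abs(y - (a + b * t)) > tolerance]
```

With the made-up history `[10, 12, 14, 40, 18, 20]` and a tolerance of 10, only index 3 is flagged, which is the kind of point a data quality check would then inspect.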
Data Mining 1 - Indicator Example
Short-term Power Load Forecasting based on Fuzzy Information Granulation SVM
Predicted results: [L, R, U] = [595.24, 680.69, 727.26]
Chart data labels: 716 703 690 669 658 648 638 637 644 646 648 635 623 619 602 604 627 637 669 667 669 693 701 698 711 712 727 723 721 708 692 713 716 718 722 710 705 725 706 718 700 670 666 676 678 690 686 723
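The slide does not describe how the granules are built, so the following is only a simplified sketch of the granulation step. It assumes that L, R, and U denote the lower bound, the core, and the upper bound of a triangular fuzzy granule, and it takes them as the minimum, median, and maximum of each window, which is cruder than the optimization-based granulation usually paired with SVM regression. The load values in the usage note are made up.

```python
from statistics import median


def granulate(series, window):
    """Split `series` into full windows and return one (low, r, up) granule per window.

    Simplifying assumption: low = min, r = median, up = max of the window.
    """
    granules = []
    for start in range(0, len(series) - window + 1, window):
        chunk = series[start:start + window]
        granules.append((min(chunk), median(chunk), max(chunk)))
    return granules


def membership(x, low, r, up):
    """Triangular membership of x in the granule (low, r, up)."""
    if x == r:
        return 1.0
    if x <= low or x >= up:
        return 0.0
    if x < r:
        return (x - low) / (r - low)
    return (up - x) / (up - r)
```

For the made-up series `[600, 620, 610, 700, 680, 690]` with a window of 3, the granules are `(600, 610, 620)` and `(680, 690, 700)`. In a forecasting setup of this kind, a separate regression model would typically be trained on each of the three sequences to predict the next granule.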
Data Mining 2 - Vertical Indicator Relationship Example
Power distribution reliability (27 indicators, 6 notes, 21 parameters)
Scene selection: This scene studies the average electricity interruption duration together with 27 indicators, in order to analyze the weights and score of power distribution reliability and to address current business issues.
Indicator set: The analysis is based on the power supply reliability indicators. Historical time-series data of power supply reliability is selected to analyze the relationship between the correlated factors and their weights.
Indicators shown (each with a weight in % to be determined): average outage time of user, power supply reliability rate, user average number of fault interruptions, user average scheduled interruption time, customer average interruption time, transformer fault outage rate, customer average insufficient power supply
Data Mining 2 - Vertical Indicator Relationship Example
Factor analysis yields the factors of the original 27 indicators. Each factor's weight is its variance value divided by the factor total: factor 1 / factor total, factor 2 / factor total, factor 3 / factor total, factor 4 / factor total, ..., factor n / factor total.
Factor values shown: 0.066, 0.052, 0.048, 0.040; factor total: 0.5305
Diagram title: business stability analysis
Diagram labels: power distribution reliability (1), average outage time of user, transformer fault outage rate, user average number of fault interruptions, customer average insufficient power supply, supply radius, service life, cost of depreciation, facility cost
Weights shown in the diagram: 0.124, 0.09, 0.15, 0.09, 0.10, 0.13
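The ratio "factor n / factor total" can be checked with the numbers visible on the slide. The sketch below assumes that 0.5305 is the factor total and that 0.066, 0.052, 0.048, and 0.040 are individual factor values. The slide does not make clear which value belongs to which indicator, so no names are attached.

```python
FACTOR_TOTAL = 0.5305                          # "Factor total value" on the slide
FACTOR_VALUES = [0.066, 0.052, 0.048, 0.040]   # factor values shown on the slide


def factor_weight(value, total):
    """Weight of one factor as its share of the factor total."""
    return value / total


weights = [round(factor_weight(v, FACTOR_TOTAL), 3) for v in FACTOR_VALUES]
```

The first ratio, 0.066 / 0.5305, is approximately 0.124, which agrees with the weight 0.124 shown in the diagram.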
Data Mining 3 - Horizontal Indicator Relationship
The analysis of rules among indicators starts from one or more points of concern and studies the data regularities and business rules between pairs of indicators, so as to construct the indicator correlation network diagram and improve business personnel's understanding of the indicator rules.
Monitoring improvements:
(1) positive and negative correlations between indicators provide a means of monitoring data quality;
(2) positive and negative correlations between business states make it possible to monitor the current integrated service quality.
Analysis improvements:
(1) the strength of the correlation between indicators defines the data analysis range for later thematic analysis;
(2) correlations between indicators reveal potential business rules and provide an analysis basis for business optimization and upgrading.
Diagram labels: Indicator1, Indicator2
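A correlation network of this kind can be sketched by computing the Pearson correlation for every pair of indicators and keeping the pairs whose absolute correlation exceeds a threshold as edges. The threshold of 0.8 and the function names are hypothetical choices, and every indicator series must have non-zero variance.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_edges(indicators, threshold=0.8):
    """Edges (name_a, name_b, r) for indicator pairs with |r| >= threshold."""
    edges = []
    for (name_a, xa), (name_b, xb) in combinations(indicators.items(), 2):
        r = pearson(xa, xb)
        if abs(r) >= threshold:
            edges.append((name_a, name_b, round(r, 3)))
    return edges
```

The sign of `r` on each edge distinguishes positive from negative correlation, and its magnitude gives the strength used to choose the range of later thematic analysis.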
Detailed Monitoring and Data Analysis
City E, power features by area: City E has about 60% of its samples in cluster 3 and about 40% in cluster 1. Over time, city E has recently been moving from cluster 3 to cluster 1.
Chart panels: cluster 1, cluster 2, cluster 3, each covering 2010 to 2013
Sample counts per city as shown (48 samples per city):
City A: 47, 1
City B: 48
City C: 48
City D: 48
City E: 18, 30
City F: 48
City G: 48
City H: 48
City I: 47, 1
City J: 48
City K: 1, 46, 1
City L: 1, 47
City M: 47, 1
City N: 48
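The shares and the trend for city E can be reproduced from cluster labels with a short sketch. The counts 18 and 30 come from the table. The ordering of the labels below (30 samples in cluster 3 followed by 18 in cluster 1) is an assumption chosen to match the stated trend, and the 12-sample "recent" window is hypothetical.

```python
from collections import Counter


def cluster_shares(labels):
    """Share of samples per cluster label."""
    counts = Counter(labels)
    return {cluster: n / len(labels) for cluster, n in counts.items()}


def recent_shift(labels, recent=12):
    """Dominant cluster overall and dominant cluster in the last `recent` samples."""
    overall = Counter(labels).most_common(1)[0][0]
    latest = Counter(labels[-recent:]).most_common(1)[0][0]
    return overall, latest


# Assumed ordering: city E starts in cluster 3 and ends in cluster 1.
city_e = [3] * 30 + [1] * 18
```

Here `cluster_shares(city_e)` gives 0.625 for cluster 3 and 0.375 for cluster 1, close to the rounded 60% and 40% quoted above, and `recent_shift(city_e)` returns `(3, 1)`, a shift from cluster 3 toward cluster 1.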
Big Data Application Technology - Power Load Flow Analysis
A data flow diagram shows how the power load of each area of the enterprise changes over time.
Usage:
(1) each ribbon represents the load of the corresponding city;
(2) ribbon width represents the load value or the city's percentage of the provincial load;
(3) moving the mouse shows the load of any city at any time.
Big Data Application Technology - Treemap Analysis
A treemap shows the detailed data of the various categories.
(1) the area of each block represents the total applied electricity capacity;
(2) the nested layout conveniently shows how multilevel data is composed.
Big Data Application Technology - Bubble Chart Analysis
To assess the electricity capacity increase of different cities, the data for each city is displayed as a bubble chart. Bubble size represents the amount of electricity capacity increase, and the number of bubbles within a block represents the number of categories of capacity increase that the staff have recorded.
Example: The chart records the electricity capacity increase of different cities. The pink part (rural residential electricity) is the largest in every city except city C, and the grey part (general business) is the second largest. The city in the lower right chart has few categories of capacity increase (7 kinds) compared with the other cities.