itesla Project: Innovative Tools for Electrical System Security within Large Areas
Samir ISSAD, RTE France, samir.issad@rte-france.com
PSCC 2014 Panel Session, 22/08/2014
Advanced data-driven modeling techniques for Power Systems
Content
1. itesla Project general presentation
2. Data mining & analysis in the project
3. Offline MC analysis: historical data mining and sampling
4. Security indexes and screening rules with Decision Trees
itesla Project general presentation
A 2012-2016 European funded project under the 7th Framework Programme (FP7).
itesla partners: Transmission System Operators, Universities, Research Centres, Industrial & IT providers.
itesla Project general presentation
To develop a toolbox that will be needed by Transmission System Operators to operate the European power system in the years to come:
1. To model the increasing amount of uncertainties in the decision process
2. To take into account system dynamics in the security assessment
3. To model preventive and corrective actions and take them into account in the decision process
4. To provide a solution to a continuous optimization problem from 2 days ahead to real time under uncertainty
5. To develop an open and interoperable toolbox
[Diagram: Uncertainties, Dynamics and Action recommendation feed the new online security assessment]
Starting point: the existing solution Online External data (forecasts and snapshots) Data acquisition and storage Merging module Contingency screening (static Load-Flow) Synthesis of recommendations for the operator
Upgrade #1: dynamic simulations Online External data (forecasts and snapshots) Data acquisition and storage Merging module Contingency screening (Time domain simulations) Info to the operator about transient stability Offline validation of dynamic models Synthesis of recommendations for the operator
Upgrade #2: uncertainties Online External data (forecasts and snapshots) Data acquisition and storage Use of historical data to build uncertainty patterns Merging module Offline validation of dynamic models Monte Carlo approach using base case + uncertainties Contingency screening (Time domain simulations) Synthesis of recommendations for the operator M contingencies N sampled states = NxM online dynamic simulations to be performed in a 15 min time window
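The N×M figure above is what drives the need for filtering. A back-of-the-envelope sketch, with purely illustrative numbers (the per-simulation time and sample counts below are assumptions, not project figures), shows how quickly the online budget is exhausted:

```python
import math

# Back-of-the-envelope budget for the online Monte Carlo stage.
# All numbers below are illustrative assumptions, not project figures.
n_samples = 100          # N: sampled network states
m_contingencies = 50     # M: contingencies to screen
sim_minutes = 2.0        # assumed wall-clock time of one dynamic simulation
window_minutes = 15.0    # online decision window

total_sims = n_samples * m_contingencies            # N x M simulations
serial_minutes = total_sims * sim_minutes
# Minimum number of parallel workers needed to finish inside the window:
workers_needed = math.ceil(serial_minutes / window_minutes)
print(total_sims, workers_needed)
```

Even modest N and M imply thousands of simulations per 15-minute window, which is why the next upgrade introduces an offline filtering stage.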
Upgrade #3: Filtering process Online Offline External data (forecasts and snapshots) Data acquisition and storage Computation of security rules Sampling of stochastic variables Merging module Elaboration of starting network states Contingency screening (several stages) Impact Analysis (time domain simulations) Offline validation of dynamic models Time domain simulations Synthesis of recommendations for the operator Data mining on the results of simulation
Upgrade #3: Filtering process
Offline workflow properties:
- Not permanently running: only on demand, for offline security rules updates
- Called one day per week
- Uses historical data and data mining techniques to build situations similar to forecasts that are not yet available
- The offline computation platform has much more computation capacity than the online platform, but is only used periodically
Online workflow properties:
- Online does not mean real time, but permanently running
- Analyses forecasts from D-2 to real time
- Requires the results of the offline workflow
- A high filtering rate is expected, to reduce the number of online time domain simulations
- Number of online samples << number of offline sampled cases
Proposed final architecture Online Offline External data (forecasts and snapshots) Offline validation of dynamic models Improvements of defence and restoration plans Data acquisition and storage Merging module Contingency screening (several stages) Time domain simulations Synthesis of recommendations for the operator Computation of security rules Sampling of stochastic variables Elaboration of starting network states Impact Analysis (time domain simulations) Data mining on the results of simulation Anticipate Classify Analyse
Data mining & analysis in the project
Data mining techniques will be widely used to extract knowledge from a Big Data dataset, in particular:
- to model various stochastic variables through analysis of historical data, in order to build realistic samples of network situations
- to analyze correlations between stochastic variables
- to build up criteria to detect unacceptable situations
- to compute confidence intervals of forecast variables, in order to replace the classical security assessment of the best-estimate situation by a more probabilistic approach
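The last bullet can be illustrated with a minimal sketch: an empirical confidence interval for a forecast variable, built from the distribution of historical forecast errors. The error data and forecast value here are hypothetical placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical historical forecast errors for one variable (MW);
# in the project these would come from the historical data store.
errors = rng.normal(0.0, 20.0, size=5000)

forecast = 1000.0  # best-estimate forecast (MW)
# Empirical 95% confidence interval around the best estimate,
# obtained from the 2.5% and 97.5% quantiles of past errors.
lo, hi = forecast + np.percentile(errors, [2.5, 97.5])
```

The security assessment can then cover the whole interval [lo, hi] rather than the single best-estimate value.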
Data mining & analysis in the project
The IT architecture will be chosen to cope with:
- a large volume of data and results of simulations to be processed
- high performance requirements (data mining algorithms, dynamic simulations, etc.)
Different kinds of HPC-type solutions will be investigated.
A full scale IT system will be used during the project to demonstrate the relevance of the chosen solution and the feasibility of the itesla SA approach at the Pan-European level.
Data mining & analysis in the project Data management Data mining services Dynamic simulation Optimizers Graphical interfaces External data (forecasts and snapshots) Data acquisition and storage Computation of security rules Sampling of stochastic variables Merging module Elaboration of starting network states Contingency screening (several stages) Impact Analysis (time domain simulations) Offline validation of dynamic models Improvements of defence and restoration plans Time domain simulations Synthesis of recommendations for the operator Data mining on the results of simulation
Offline MC analysis: historical data mining and sampling
[Architecture diagram, offline part highlighted] Online / Offline: External data (forecasts and snapshots), Data acquisition and storage, Computation of security rules, Sampling of stochastic variables, Merging module, Elaboration of starting network states, Contingency screening (several stages), Impact Analysis (time domain simulations), Offline validation of dynamic models, Improvements of defence and restoration plans, Time domain simulations, Synthesis of recommendations for the operator, Data mining on the results of simulation
Offline MC analysis: historical data mining and sampling
Workflow: Input → Sampling of external variables → Starting point initialisation → Dynamic simulations → Result classification → Extract screening rules → Output
Generate snapshots of external (i.e. not controllable) stochastic variables. Sampling of:
- Load levels (active, reactive)
- Renewable generation capacity (wind, solar, etc.)
- Generator availabilities
Challenges:
- Sample the full range of parameters that can be encountered online (in a future time frame)
- Obtain sufficient sample density to capture system behaviour at all points
Key tasks:
- Extract probability distributions from historical data
- Sample high-dimensional dependent variables (e.g. thousands of load points)
- Use feedback to bias sampling towards high-information regions
8/27/2014
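The first key task, sampling from distributions extracted from historical data, can be sketched with inverse-transform sampling on an empirical distribution. The historical measurements below are synthetic stand-ins:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical historical load measurements (MW) for one load point
hist = np.sort(rng.gamma(shape=5.0, scale=100.0, size=10000))

# Inverse-transform sampling: draw u ~ U(0,1) and map it through the
# empirical quantile function (the inverse of the empirical CDF).
u = rng.uniform(size=2000)
samples = np.quantile(hist, u)
```

This reproduces the marginal distribution of the historical data without assuming any parametric family; dependence between variables is handled separately, by copulas, later in the pipeline.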
Offline MC analysis: historical data mining and sampling
- Data dimensionality: 1,000s of stochastic variables
- Size of historical library: 10,000s of historical measurements per variable
- Non-Gaussian data: non-Gaussian marginals, non-linear dependence. Correlation is not enough!
Offline MC analysis: historical data mining and sampling
Pipeline: Historical data → Principal Component Analysis → Data Clustering → ecdf → Vine Copula Construction (C-Vine Decomposition, Copula Family Selection, Maximum Likelihood Parameter Estimation, Goodness-of-fit Test) → Copula Sampling → ecdf⁻¹ → Back-Projection → Sampled data
Domains traversed: Actual domain (MW) → PC domain (MW) → Rank-uniform domain [0, 1]
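The ecdf / ecdf⁻¹ steps of the pipeline above map each variable into the rank-uniform domain [0, 1] and back. A minimal sketch of this round trip, on synthetic non-Gaussian data:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(scale=3.0, size=1000)   # a non-Gaussian marginal (MW)

# Forward: the empirical CDF maps data to the rank-uniform domain [0, 1].
# rank/(n+1) keeps values strictly inside (0, 1), as copulas require.
ranks = x.argsort().argsort() + 1
u = ranks / (len(x) + 1)

# Backward: the empirical quantile function (ecdf^-1) maps rank-uniform
# values back to the actual domain.
x_back = np.quantile(x, u)
```

In the full pipeline, the copula is fitted and sampled in the [0, 1] domain, and only the backward transform reintroduces the non-Gaussian marginals.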
Offline MC analysis: historical data mining and sampling
PCA is used to reduce the dimension of the data, by only retaining variables that contain significant information. Principal Components are linear combinations of the original data, e.g.:
1: Total system load
2: North-South load variation
Etc.
[Chart: information retained vs. number of Principal Components; about 95% of the information is retained by the first few components]
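The chart above can be reproduced in miniature. The sketch below builds synthetic load data driven by two dominant factors (stand-ins for "total system load" and "North-South variation") and uses PCA, via SVD, to find how many components retain 95% of the variance. All data here are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical load data: 17 load points driven by 2 dominant factors
# (e.g. total system load, North-South variation) plus small noise.
n, d = 2000, 17
factors = rng.normal(size=(n, 2))
loadings = rng.normal(size=(2, d))
X = factors @ loadings + 0.1 * rng.normal(size=(n, d))

# PCA via SVD of the centred data matrix
Xc = X - X.mean(axis=0)
_, s, _ = np.linalg.svd(Xc, full_matrices=False)
explained = (s ** 2) / (s ** 2).sum()

# Smallest number of components retaining 95% of the variance
k = int(np.searchsorted(np.cumsum(explained), 0.95) + 1)
```

Because the data were generated from two factors, a handful of components suffices; the remaining 15 dimensions carry only noise and can be discarded before copula fitting.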
Offline MC analysis: historical data mining and sampling
Clustering techniques are useful in partitioning the observed data according to the different modes of the stochastic parameters.
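A minimal sketch of this partitioning, using Lloyd's k-means algorithm on synthetic two-mode data (the "low wind" and "high wind" regimes below are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(4)
# Hypothetical two operating modes, e.g. low-wind and high-wind regimes,
# described by (total load, wind output) in MW.
low = rng.normal([200.0, 50.0], 15.0, size=(300, 2))
high = rng.normal([800.0, 400.0], 15.0, size=(300, 2))
X = np.vstack([low, high])

# Minimal k-means (Lloyd's algorithm) with k = 2,
# initialised with one point from each end of the dataset.
centers = np.array([X[0], X[-1]], dtype=float)
for _ in range(20):
    # Assign each observation to its nearest centre
    labels = ((X[:, None, :] - centers[None]) ** 2).sum(-1).argmin(1)
    # Move each centre to the mean of its assigned observations
    centers = np.array([X[labels == j].mean(axis=0) for j in range(2)])
```

Each cluster then gets its own marginal distributions and copula, so that multi-modal behaviour is not averaged away.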
Offline MC analysis: historical data mining and sampling
Copula basics: a copula is a multivariate distribution with uniform marginals; by Sklar's theorem, any joint distribution can be decomposed into its marginals and a copula that captures the dependence structure.
Offline MC analysis: historical data mining and sampling
The Gumbel copula is an asymmetric Archimedean copula, exhibiting greater dependence in the positive tail.
[Panels: Gumbel copula samples for theta = 1.5 and theta = 4]
Offline MC analysis: historical data mining and sampling
Copulas are used to capture a wide range of non-linear dependencies while decoupling from the non-Gaussian marginals.
[Scatter plot with fitted pair-copula. Best fit: family Clayton, parameter 0.83]
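Sampling from the fitted Clayton family above can be sketched with the standard Marshall-Olkin construction for Archimedean copulas. The parameter value is taken from the slide; the rest is a generic sampler, not project code:

```python
import numpy as np

theta = 0.83   # Clayton parameter reported by the fit on the slide
n = 20000
rng = np.random.default_rng(5)

# Marshall-Olkin sampling for the Clayton copula:
#   V ~ Gamma(1/theta, 1),  E_i ~ Exp(1),  U_i = (1 + E_i/V)^(-1/theta)
v = rng.gamma(1.0 / theta, size=n)
e = rng.exponential(size=(n, 2))
u = (1.0 + e / v[:, None]) ** (-1.0 / theta)

# Theoretical Kendall's tau for Clayton: tau = theta / (theta + 2) ~ 0.29
```

The resulting pairs (U1, U2) have uniform marginals but lower-tail dependence; applying each variable's inverse ecdf then yields correlated samples in the actual (MW) domain.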
Offline MC analysis: historical data mining and sampling
[Figures: example scatter plots of sampled data compared with historical data]
Offline MC analysis: historical data mining and sampling
In statistical significance testing, the p-value is the probability of obtaining a test statistic at least as extreme as the one actually observed, assuming that the null hypothesis is true.
In previous work, we used the Anderson-Darling test as the goodness-of-fit test for choosing the best pair-copula to fit each pair of variables. However, we only compared the computed Anderson-Darling statistics and chose the smallest one to determine the best copula family. In fact, we should use the p-value instead of the AD statistic to choose the best family for every pair-copula. As the distribution to check against is the chi-square distribution, the p-value can be obtained directly from a chi-square distribution table. In Matlab, the function chi2pval returns the p-value given the statistic and the degrees of freedom. Finally, for k = 65, the p-value of the overall sampling output for this case is about 0.996, which indicates that the technique works very well.
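Outside Matlab, the same upper-tail chi-square p-value has a simple closed form when the number of degrees of freedom is even (for odd degrees of freedom, such as the k = 65 case above, a library routine like scipy.stats.chi2.sf would be used instead). A minimal pure-Python sketch:

```python
from math import exp, factorial

def chi2_pvalue_even_dof(stat, dof):
    """Upper-tail p-value P(X >= stat) for a chi-square variable,
    using the closed form valid for an even number of degrees of freedom:
    P = exp(-x/2) * sum_{i=0}^{dof/2 - 1} (x/2)^i / i!"""
    assert dof % 2 == 0, "closed form holds for even dof only"
    half = stat / 2.0
    return exp(-half) * sum(half ** i / factorial(i) for i in range(dof // 2))
```

For example, at 4 degrees of freedom the 5% critical value is about 9.49, and the function returns a p-value of about 0.05 there, as expected.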
Security indexes and screening rules with Decision Trees
Workflow: Input → Sampling of external variables → Starting point initialisation → Dynamic simulations → Result classification → Extract screening rules → Output
Classify the outcome of each simulation.
Approach: use the dynamic simulation trajectory to compute 5 security indexes measuring different aspects of system performance:
- Overloads
- Over/under voltages
- Small signal stability
- Transient stability
- Voltage stability
Challenges:
- Identifying the most suitable security indexes
- Converting simulator output to security indexes
Key tasks:
- Study of security indexes and their properties
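The conversion from a simulated trajectory to a scalar security index can be illustrated for the overload case. The trajectory, branch rating, and 5% tolerance below are hypothetical; the project's actual index definitions differ:

```python
import numpy as np

# Hypothetical post-contingency trajectory of one branch flow (MVA),
# sampled every 10 ms over 5 s: a damped power swing after a fault.
t = np.linspace(0.0, 5.0, 501)
flow = 950.0 + 120.0 * np.exp(-t) * np.cos(6.0 * t)
rating = 1000.0

# Simple overload index: worst relative excursion above the rating
violation = np.maximum(flow - rating, 0.0) / rating
overload_index = violation.max()

# Binary classification with a hypothetical 5% tolerance
secure = overload_index < 0.05
```

Each of the 5 index families reduces a full time-domain trajectory to a number like this, and the classification of all simulations feeds the rule-extraction step.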
Security indexes and screening rules with Decision Trees
Analysis: extract screening rules to be used by the online platform for security assessment.
Approach:
- Store classification results in a database
- Mine the data to extract rules per contingency
- Rules take the form of decision trees
Challenges:
- Screening rules should be conservative
- Data analysis is performed in many dimensions
Key tasks:
- Comparison of data mining methods
- Communication standard regarding screening rule requirements (WP5)
Security indexes and screening rules with Decision Trees
Building security rules (here in 2D for the sake of clarity; actually 1000+ dimensions)
Security indexes and screening rules with Decision Trees
Building security rules: inequalities on input variables; tree leaves give the security status.
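The rule form above, an inequality on an input variable leading to a secure/insecure leaf, can be sketched with a one-level decision tree (a stump) learned from labelled simulation results. The 2D dataset and the secure region are hypothetical toy data; the project trains full multi-level trees in 1000+ dimensions:

```python
import numpy as np

rng = np.random.default_rng(6)
# Hypothetical 2-D training set: (total load, cross-border flow),
# each point labelled secure (1) / insecure (0) by the impact analysis.
X = rng.uniform(0.0, 1.0, size=(500, 2))
y = (X[:, 0] + 0.5 * X[:, 1] < 1.0).astype(int)   # toy secure region

# One-level decision tree (stump): exhaustively search for the single
# inequality "x_j <= threshold" that best separates secure from insecure.
best_j, best_thr, best_err = 0, 0.0, 1.0
for j in range(X.shape[1]):
    for thr in X[:, j]:
        pred = (X[:, j] <= thr).astype(int)
        err = float((pred != y).mean())
        if err < best_err:
            best_j, best_thr, best_err = j, thr, err

rule = f"secure if x{best_j} <= {best_thr:.2f}"
```

A full tree stacks such inequalities along each path from the root, and the leaf reached by an online operating point gives its predicted security status; making the rules conservative then amounts to biasing the split choice against false "secure" labels.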
Security indexes and screening rules with Decision Trees
Examples of issues encountered:
- Usage of reduced variables? If so, need to store the PCA definitions as part of the security rules
- Convexity constraint?
- Validity domain: usage?
- Feedback for refining the sampling process (importance sampling)
- Attribute selection
- Rule encoding
- Etc.
http://www.itesla-project.eu/
Thank you for your attention!