APPLICATIONS OF DATA MINING TO PREDICT MESOSCALE WEATHER EVENTS (TORNADOES AND CLOUDBURSTS)



International Journal of Computer Engineering and Technology (IJCET)
Volume 6, Issue 7, July 2015, pp. 19-26, Article ID: 50120150607003
Available online at http://www.iaeme.com/currentissue.asp?jtype=ijcet&vtype=6&itype=7
ISSN Print: 0976-6367 and ISSN Online: 0976-6375
IAEME Publication

APPLICATIONS OF DATA MINING TO PREDICT MESOSCALE WEATHER EVENTS (TORNADOES AND CLOUDBURSTS)

Miss Gurbrinder Kaur
Assistant Professor, M.C.A Department, BCIIT, Delhi

ABSTRACT

Over the last decade or so, predicting the weather and climate has emerged as one of the most important areas of scientific research. This is partly because the increasing skill of current weather forecasts has made society more and more dependent on them for a whole range of day-to-day decisions, and partly because climate change is now widely accepted and the realization is growing rapidly that it will affect every person in the world, directly or indirectly.

Keywords: False Alarm Ratio (FAR), Mesocyclone Detection Algorithm (MDA), Numerical Weather Prediction (NWP), Receiver Operating Characteristic (ROC), Probability of Detection (POD).

Cite This Article: Miss Gurbrinder Kaur, Applications of Data Mining To Predict Mesoscale Weather Events (Tornadoes and Cloudbursts). International Journal of Computer Engineering and Technology, 6(7), 2015, pp. 19-26. http://www.iaeme.com/currentissue.asp?jtype=ijcet&vtype=6&itype=7

1. INTRODUCTION

Although considerable progress has been made in the observation, modeling and understanding of tornadoes, issuing warnings and forecasts ahead of time remains a considerable challenge for forecasters. Verification statistics show that warning probability of detection (POD) and lead time have remained at the same level in recent years, with the false alarm ratio (FAR) remaining relatively constant.
This is principally because existing radars and weather detection methodologies suffer from limitations that allow meteorological quantities and associated features to go undetected. New advances are needed in this area if substantial improvements in warning and forecasting accuracy are to take place. The improvements in POD and lead time over the past decades are evident from Figure 1, but FAR has remained relatively constant over the past 20 years at approximately 75%. Because FAR has shown no improvement over the past 20 years, it is not expected to improve in the near future unless new technology is developed.

http://www.iaeme.com/ijcet.asp  editor@iaeme.com

Figure 1. Nationwide tornado warning verification statistics from 1986-2007 as well as NWS goals for new storm-based warnings beginning in 2008: probability of detection (black line with circles), false alarm ratio (red line with squares) and lead time (blue line), with future goals (same with dotted lines). [Data courtesy of B. MacAloney II, National Weather Service Performance Branch, 2008]

2. RELATED WORK

A. Predicting Tornadoes by Applying Data Mining Techniques

In [1], the goal of much of Amy McGovern's research as an associate professor in the School of Computer Science at the University of Oklahoma has been to revolutionize the prediction of tornadoes and other forms of severe weather. The author has done this using artificial intelligence techniques, data mining, machine learning, and storm simulations. The research shows that radars provide an incomplete picture of the atmosphere. Although they can sense the intensity of the precipitation and a single dimension of the wind vector, many other variables, such as the full three-dimensional wind field, pressure and temperature, are important to prediction [2]. The author has developed a unique set of simulations of supercell thunderstorms, which are the most severe type of thunderstorm and cause the most destructive tornadoes. McGovern's models provide the ability to identify spatiotemporal relationships between these regions that can be used to predict severe weather events. Novel data mining models have been developed that make use of the spatiotemporal nature of the data, because neither space nor time can be ignored for weather prediction. Weather is three-dimensional, and the models can identify arbitrary shapes and relationships between the shapes. In [3], McGovern et al. developed spatiotemporal models and applied them to severe weather data.
These models addressed both the spatial and spatiotemporal changes in the data using a relational approach. In their work they also developed a set of high-resolution simulations capable of resolving tornadoes. In [4], V. Lakshmanan, Gregory J. Stumpf and Arthur Witt developed a Mesocyclone Detection Algorithm (MDA) and a near-storm environment (NSE) algorithm at the National Severe Storms Laboratory. The MDA identifies storm-scale circulations that are precursors to tornadoes. Marzban and Stumpf, in [5] and [6], developed a neural network based on the MDA parameters to identify which of the circulations would be tornadic, using a small set of data cases [5]. That work was extended to cover 43 storm days in [7] using a more robust methodology. The neural networks developed in that paper (for both MDA and MDA+NSE inputs) achieve similar Heidke skill scores on the training, validation and independent data sets. The low variability of the Receiver Operating Characteristic (ROC) plots also suggests that these neural networks are robust and not overtrained.

In [8], Indra Adrianto, Theodore B. Trafalis and Valliappa Lakshmanan used support vector machines to predict the location and time of tornadoes. They extended the work of Lakshmanan et al. [7], using a set of 33 storm days and introducing some variations. The objective of the research was to estimate the probability of a tornado event at a particular location within a given time window. They presented a least-squares methodology to estimate shear, quality control of radar reflectivity, morphological image processing to estimate gradients, fuzzy logic to generate compact measures of tornado possibility, and support vector machine classification to generate the final spatiotemporal probability field. The results suggest that the approach might increase the lead time of tornado warnings, since it estimates the probability that there will be a tornado at a particular spatial location in the next 30 minutes, whereas the average lead time of tornado warnings issued by the National Weather Service is currently 18 minutes. The results were thus promising. More spatial inputs could be considered, and other classification methods such as Bayesian SVMs and Bayesian neural networks may improve the results.
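The verification measures used throughout this section (POD, FAR and the Heidke skill score) are all derived from a 2x2 contingency table of warnings against observed events. The following is a minimal sketch under the standard textbook definitions; the class name, function names and the counts are illustrative assumptions, not values from the cited studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContingencyTable:
    """2x2 warning verification counts (field names are illustrative)."""
    hits: int               # warning issued, event observed
    false_alarms: int       # warning issued, no event observed
    misses: int             # no warning, event observed
    correct_negatives: int  # no warning, no event


def pod(t: ContingencyTable) -> float:
    """Probability of detection: share of observed events that were warned."""
    return t.hits / (t.hits + t.misses)


def far(t: ContingencyTable) -> float:
    """False alarm ratio: share of issued warnings that did not verify."""
    return t.false_alarms / (t.hits + t.false_alarms)


def heidke_skill_score(t: ContingencyTable) -> float:
    """Heidke skill score: accuracy relative to random chance (1 = perfect)."""
    a, b, c, d = t.hits, t.false_alarms, t.misses, t.correct_negatives
    numerator = 2.0 * (a * d - b * c)
    denominator = (a + c) * (c + d) + (a + b) * (b + d)
    return numerator / denominator


# Made-up counts, chosen only so that FAR comes out at 75%.
table = ContingencyTable(hits=25, false_alarms=75, misses=10,
                         correct_negatives=890)
print(f"POD={pod(table):.3f}  FAR={far(table):.3f}  "
      f"HSS={heidke_skill_score(table):.3f}")
```

Note that FAR depends only on the warnings that were issued, so a system can raise POD by warning more often while leaving FAR unchanged or worse. This is one reason the two scores are reported together.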
B. Application of Data Mining in Predicting Cloudburst Formation

There is no satisfactory technique for anticipating the occurrence of cloudbursts because of their small scale. A very fine network of radars would be required to detect the likelihood of a cloudburst, and this would be prohibitively expensive. Only the areas likely to receive heavy rainfall can be identified on a short-range scale. A real-life case of a cloudburst has been analyzed using the k-means clustering data mining technique by Kavita in [9]. It is observed that a very large region of relative humidity is an early signal of the formation of a cloudburst. The research demonstrates the derivation of sub-grid-scale weather systems from NWP model output products. Such signals cannot be obtained through the normal Model Output Statistics (MOS) technique. The study has demonstrated that intelligent systems can be a good alternative to unstable MOS. Data mining, especially clustering applied to divergence and relative humidity, can provide an early indication of the formation of a cloudburst. This study is an effort towards providing timely and actionable information on these events using data mining techniques to supplement NWP models, which can be of great benefit to society.

3. PRINCIPLES AND METHODOLOGY OF WEATHER FORECASTING

A. Ensemble Forecasting

A forecast is an estimate of the future state of the atmosphere. It is created by estimating the current state of the atmosphere using observations, and then calculating how this state will evolve in time using a numerical weather prediction computer model. As the atmosphere is a chaotic system, very small errors in its initial state can lead to large errors in the forecast. This means that we can never create a perfect forecast system, because we can never observe every detail of the atmosphere's initial state. Tiny errors in the initial state will be amplified, so there is always a limit to how far ahead we can predict any detail. To test how these small differences in the initial conditions may affect the outcome of the forecast, an ensemble system can be used to produce many forecasts. Instead of running just a single forecast, the computer model is run a number of times from slightly different starting conditions. The complete set of forecasts is referred to as the ensemble, and individual forecasts within it as ensemble members.

Figure 2. Schematic of how the ensemble samples the uncertainty in the forecast.

The notion of ensemble forecasting was first introduced in the studies of Lorenz [10], where he examined initial-state uncertainties and the well-known butterfly effect. Lorenz's study showed that no matter how good the observations or the forecasting techniques are, there is almost certainly an insurmountable limit to how far into the future one can forecast. In ensemble forecasting, the major issue is the removal of the collective errors of the multiple models. The major drawback of the straight-average approach, which assigns an equal weight of 1.0 to each model, is that it may include several poor models, and averaging in these poor models degrades the overall result. To address this problem in ensemble forecasting, Krishnamurti introduced in [11] and [12] a multimodel superensemble technique that shows a major improvement in prediction skill.
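The sensitivity to initial conditions described in this section can be illustrated with the three-variable convection system studied by Lorenz [10]. The sketch below is only a toy stand-in for a weather model: it starts 20 ensemble members from slightly perturbed copies of one initial state and compares the ensemble spread before and after integration. The member count, perturbation size, step size and starting point are arbitrary assumptions chosen for illustration.

```python
import random


def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Time derivatives of the Lorenz (1963) system with its classic parameters."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(state, dt):
    """Advance one state by one fourth-order Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = lorenz63(state)
    k2 = lorenz63(shifted(state, k1, dt / 2))
    k3 = lorenz63(shifted(state, k2, dt / 2))
    k4 = lorenz63(shifted(state, k3, dt))
    return tuple(
        s + dt / 6.0 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def ensemble_spread(members):
    """Standard deviation of the x component across ensemble members."""
    xs = [m[0] for m in members]
    mean = sum(xs) / len(xs)
    return (sum((x - mean) ** 2 for x in xs) / len(xs)) ** 0.5


def run_ensemble(n_members=20, perturbation=1e-3, steps=2500, dt=0.01, seed=0):
    """Integrate perturbed members; return (initial spread, final spread)."""
    rng = random.Random(seed)
    analysis = (1.0, 1.0, 1.0)  # assumed best estimate of the initial state
    members = [
        tuple(v + rng.gauss(0.0, perturbation) for v in analysis)
        for _ in range(n_members)
    ]
    initial = ensemble_spread(members)
    for _ in range(steps):
        members = [rk4_step(m, dt) for m in members]
    return initial, ensemble_spread(members)


initial_spread, final_spread = run_ensemble()
print(f"initial spread: {initial_spread:.2e}, final spread: {final_spread:.2e}")
```

In an operational setting the spread would be examined as a function of lead time: small spread suggests a forecast that is insensitive to the initial uncertainty, while large spread signals low predictability.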
B. Observation and Assimilation of Observational Data

Observations are important to the process of creating forecasts. A huge number of observations recording atmospheric conditions around the world is received every day. The current main sources of observations are surface and marine data, satellites, weather balloons (radiosondes) and aircraft. To use these observations in an operational weather forecasting system, their availability has to be monitored, and they have to be quality controlled and processed into a form that can be used by the computer models and forecasters. Even with the many observations received, we do not have enough information to tell us what the atmosphere is doing at all points on and above the Earth's surface. There are large areas of ocean, inaccessible regions on land and remote levels in the atmosphere where we have very few, or no, observations. To fill in the gaps, we can combine the observations we do have with forecasts of what we expect the conditions in the atmosphere to be. This process is called data assimilation, and it gives us our best estimate of the current state of the atmosphere, which is the first step in producing a weather forecast. Without data assimilation, any attempt to produce reliable forecasts is almost certain to end in failure. Data assimilation research is focused on making the best use of observations using advanced variational and ensemble data assimilation techniques.

C. Numerical Weather Prediction Model

The numerical weather prediction (NWP) process involves assimilation of observations to provide the starting conditions for a numerical weather forecast model. The model is essentially a computer simulation of the processes in the Earth's atmosphere, land surface and oceans that affect the weather. Once current weather conditions are known, the changes in the weather are predicted by the model. Even tiny changes in the atmospheric conditions can lead to drastically different weather patterns after only a short time, so it is vital that the current state of the atmosphere is represented as accurately as possible. This process is highly mathematical, and it takes the supercomputer longer to estimate the current atmospheric state accurately than it does to make the forecast itself.

Weather forecasting entails predicting how the present state of the atmosphere will change. Present weather conditions are obtained from ground observations and from observations by satellites, ships, aircraft, buoys, balloons and weather stations covering the entire planet. This includes information from over the oceans, from the surface (ships and buoys), from high in the atmosphere (satellites) and from below the oceans (a network of special floats called Argo). Creating forecasts is a complex process which is constantly being updated. Weather forecasts made for 12 and 24 hours ahead are typically quite accurate. Forecasts made for two and three days ahead are usually good. Beyond about five days, however, forecast accuracy falls off rapidly. The rate of data generation and storage far exceeds the rate of data analysis. This represents lost opportunities in terms of scientific insights not gained and impacts or adaptation strategies not adequately informed.

D. The Synoptic and Mesoscale Weather Phenomena

The synoptic scale in meteorology describes large-scale weather systems with horizontal scales of the order of 1000 kilometres or more. It corresponds to weather events that occur at low-pressure areas, e.g. extratropical cyclones. The term mesoscale is believed to have been introduced by Ligda in [13], reviewing the use of weather radar, to describe phenomena smaller than the synoptic scale but larger than the microscale, a term that was widely used at the time (and still is) in reference to phenomena having a scale of a few kilometers or less. Several weather events associated with small-scale disturbances, regarded as noise in daily weather analyses, became the focal point of storm researchers in a micro study by Fujita [14]. Meanwhile, the U.S. Weather Bureau defined the mesoscale to be centered between 10 and 100 mi, leading to the publication of a mesometeorological study of squall lines by Fujita [15]. Further, Fujita in [16] found that the diameter of tornadoes rarely exceeds 1000 m or the mesoscale.
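The scale definitions above amount to a simple lookup on horizontal length. The sketch below is a rough illustration only: the text gives "of the order of 1000 kilometres or more" for the synoptic scale and "a few kilometers or less" for the microscale, so the exact cut-offs used here (1000 km and 2 km) and the function name are assumptions.

```python
KM_PER_MILE = 1.609344


def classify_scale(length_km, micro_max_km=2.0, synoptic_min_km=1000.0):
    """Label a horizontal length scale as microscale, mesoscale or synoptic.

    The thresholds are illustrative assumptions, not official definitions.
    """
    if length_km <= 0:
        raise ValueError("length_km must be positive")
    if length_km >= synoptic_min_km:
        return "synoptic"
    if length_km <= micro_max_km:
        return "microscale"
    return "mesoscale"


# Both ends of the 10-100 mi range attributed to the U.S. Weather Bureau
# fall in the mesoscale band under these assumed thresholds.
for miles in (10, 100):
    print(miles, "mi ->", classify_scale(miles * KM_PER_MILE))

# A system 2000 km across, such as a large extratropical cyclone.
print(2000, "km ->", classify_scale(2000.0))
```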

Figure 3. Typical time and space scales of atmospheric motion (Source: DTU University of Denmark)

Figure 4. From large-scale to small-scale forecast (Source: Mesoscale meteorological modeling, University of Denmark)

4. CONCLUSION

While forecasters can identify conditions favorable for major tornado outbreaks several days in advance, short-term forecasting of individual storms, providing additional advance notice, and predicting probable tornado paths remain a challenge. Because of these limitations, weather forecasters strongly need to incorporate additional information to develop a better understanding of the formation of tornadoes.

5. ACKNOWLEDGEMENT

The author would like to express the deepest sense of gratitude to the guide, Dr. Rattan K. Datta, Former Advisor, Department of Science & Technology, Government of India, and currently Director, Mohyal Educational Research Institute of Technology, for his encouragement, guidance and mentoring. Without his support, it would not have been possible to take up research in this challenging field.

REFERENCES

[1] McGovern, Amy and Barto, Andrew G. (2002) Autonomous Discovery of Temporal Abstractions from Interaction with an Environment. Poster presentation at the Symposium on Abstraction, Reformulation, and Approximation (SARA 2002), Volume 2371/2002, pages 338-339.
[2] McGovern, Amy and Hiers, Nathan and Collier, Matthew and Gagne II, David J. and Brown, Rodger A. (2008) Spatiotemporal Relational Probability Trees. Proceedings of the 2008 IEEE International Conference on Data Mining, pages 935-940. Pisa, Italy, 15-19 December 2008.
[3] McGovern, Amy and Gagne II, David John and Troutman, Nathaniel and Brown, Rodger A. and Basara, Jeffrey and Williams, John (2011) Using Spatiotemporal Relational Random Forests to Improve our Understanding of Severe Weather Processes. Statistical Analysis and Data Mining, special issue on the best of the 2010 NASA Conference on Intelligent Data Understanding, Vol. 4, Issue 4, pages 407-429.
[4] Lakshmanan, V., Rabin, R. and DeBrunner, V. (2003a) Multiscale storm identification and forecast, Atmospheric Research, 67-68, 367-380.
[5] Lakshmanan, V., Hondl, K., Stumpf, G., and Smith, T. (2003b) Quality control of weather radar data using texture features and a neural network, in 5th International Conference on Advances in Pattern Recognition (Kolkata, India), IEEE.
[6] Lakshmanan, V., Adrianto, I., Smith, T., and Stumpf, G. (2005a) A spatiotemporal approach to tornado prediction, in Proceedings of 2005 IEEE International Joint Conference on Neural Networks (Montreal, Canada), 3, 1642-1647.
[7] Lakshmanan, V., Stumpf, G., and Witt, A. (2005b) A neural network for detecting and diagnosing tornadic circulations using the mesocyclone detection and near storm environment algorithms, in 21st International Conference on Information Processing Systems (San Diego, CA), American Meteorological Society, CD-ROM, J5.2.
[8] Adrianto, I., Trafalis, T. B., and Lakshmanan, V. (2009) Support vector machines for spatiotemporal tornado prediction, International Journal of General Systems, Volume 38, Issue 7, pages 759-776.
[9] Kavita Pabreja and Rattan K. Datta (2012) A data warehousing and data mining approach for analysis and forecast of cloudburst events using OLAP-based data hypercube, Int. J. of Data Analysis Techniques and Strategies, Vol. 4, No. 1, pp. 57-82.
[10] Lorenz, E. N. (1963) Deterministic non-periodic flow. J. Atmos. Sci. 42, 433-471.
[11] Krishnamurti, T. N., C. M. Kishtawal, T. LaRow, D. Bachiochi, Z. Zhang, C. E. Williford, S. Gadgil, and S. Surendran (1999) Improved weather and seasonal climate forecasts from multimodel superensemble, Science, 285, 1548-1550, doi:10.1126/science.285.5433.1548.
[12] Krishnamurti, T. N., C. M. Kishtawal, Z. Zhang, T. LaRow, D. Bachiochi, C. E. Williford, S. Gadgil, and S. Surendran (2000) Multimodel ensemble forecasts for weather and seasonal climate, J. Clim., 13, 4196-4216, doi:10.1175/1520-0442(2000)0132.0.co.

[13] Ligda, M. G. H., 1951: Radar storm observation. Compendium of Meteorology, T. F. Malone, Ed., Amer. Meteor. Soc., 1265-1282.
[14] Fujita, T. T., 1973: Proposed mechanism of tornado formation from rotating thunderstorms.
[15] Climatological Data, National Summary, 4, 6, 1953, p. 181. Fujita, T., 1950: Microanalytical study of thundernose, Geoph. Mag. of Japan, 22, 2, pp. 71-88.
[16] Fujita, T. T., 1963: Analytical mesometeorology: A review, Meteor. Monogr., 5, No. 27, Amer. Meteor. Soc., 77-125.