Web based NWP model management



Similar documents
Jozef Matula. Visualisation Team Leader IBL Software Engineering. 13 th ECMWF MetOps Workshop, 31 th Oct - 4 th Nov 2011, Reading, United Kingdom

Getting to Know the SQL Server Management Studio

CS3600 SYSTEMS AND NETWORKS

INSIDE. Preventing Data Loss. > Disaster Recovery Types and Categories. > Disaster Recovery Site Types. > Disaster Recovery Procedure Lists

CS420: Operating Systems OS Services & System Calls

High Availability Essentials

irods and Metadata survey Version 0.1 Date March Abhijeet Kodgire 25th

Antelope Enterprise. Electronic Documents Management System and Workflow Engine

How To Use An Easymp Network Projection Software On A Projector On A Computer Or Computer

DiskPulse DISK CHANGE MONITOR

Objectives. Chapter 2: Operating-System Structures. Operating System Services (Cont.) Operating System Services. Operating System Services (Cont.

HIPAA Compliance Use Case

Grid Computing in SAS 9.4 Third Edition

ACS 5.x and later: Integration with Microsoft Active Directory Configuration Example

In the same spirit, our QuickBooks 2008 Software Installation Guide has been completely revised as well.

THE WINDOWS AZURE PROGRAMMING MODEL

Jetico Central Manager. Administrator Guide

BMC Control-M Workload Automation

What s new in TIBCO Spotfire 6.5

Practice Fusion API Client Installation Guide for Windows

How To Install An Aneka Cloud On A Windows 7 Computer (For Free)

NetIQ Advanced Authentication Framework - Administrative Tools. Installation Guide. Version 5.1.0

Personalizing your Access Database with a Switchboard

IceWarp to IceWarp Server Migration

IBRIX Fusion 3.1 Release Notes

How To Use An Easymp Network Projector On A Computer Or Network Projection On A Network Or Network On A Pc Or Mac Or Ipnet On A Laptop Or Ipro Or Ipo On A Powerbook On A Microsoft Computer On A Mini

Xerox 700 Digital Color Press with Integrated Fiery Color Server. Utilities

Microsoft HPC. V 1.0 José M. Cámara (checam@ubu.es)

NASA Workflow Tool. User Guide. September 29, 2010

Information Technology Solutions

Final Report - HydrometDB Belize s Climatic Database Management System. Executive Summary

Introducing: Lync Call Recorder Agent

CLOUD BASED N-DIMENSIONAL WEATHER FORECAST VISUALIZATION TOOL WITH IMAGE ANALYSIS CAPABILITIES

How To Test Your Web Site On Wapt On A Pc Or Mac Or Mac (Or Mac) On A Mac Or Ipad Or Ipa (Or Ipa) On Pc Or Ipam (Or Pc Or Pc) On An Ip

ECFLOW USERS GUIDE. February 2015

WHITE PAPER Citrix XenServer: Virtual Machine Backup. Citrix XenServer. Virtual Machine Backup.

Attix5 Pro Disaster Recovery

VMware Mirage Web Manager Guide

Licensing for BarTender s Automation Editions

Desktop Activity Intelligence

EMC ApplicationXtender Server

Welcome to the new Netop School 6.0 interface!

FactoryTalk Batch Software Suite. PlantPAx. FactoryTalk Batch FactoryTalk eprocedure FactoryTalk Batch Material Manager. Process Automation System

EMC ApplicationXtender Server

Monitoring App V eg Enterprise v6

Outline. Failure Types

Top 10 Information Technology Best Practices for the Architecture, Engineering, and Construction Industry

Compute Cluster Server Lab 3: Debugging the parallel MPI programs in Microsoft Visual Studio 2005

Yiwo Tech Development Co., Ltd. EaseUS Todo Backup. Reliable Backup & Recovery Solution. EaseUS Todo Backup Solution Guide. All Rights Reserved Page 1

Do it Yourself System Administration

Microsoft Diagnostics and Recovery Toolset 7 Evaluation Guide

Sage CRM Connector Tool White Paper

Computer Networks. Chapter 5 Transport Protocols

Enterprise IT is complex. Today, IT infrastructure spans the physical, the virtual and applications, and crosses public, private and hybrid clouds.

Amicus Link Guide: Outlook/Exchange

This Deployment Guide is intended for administrators in charge of planning, implementing and

Network device management solution

This Readme includes information pertaining to Novell Service Desk 7.0.

Copyright 2012 Trend Micro Incorporated. All rights reserved.

Installation Notes for Outpost Network Security (ONS) version 3.2

EasyMP Network Projection Operation Guide

Understanding Task Scheduler FIGURE Task Scheduler. The error reporting screen.

Chapter 2: Getting Started

Advanced Aircraft Analysis 3.6 Network Floating License Installation

ibolt V3.2 Release Notes

Chapter 3 Application Monitors

schedulix!focus The independit schedulix Scheduling System in Data Warehouse Environments

HARDWARE LICENSING (DONGLE) USERS GUIDE

MathCloud: From Software Toolkit to Cloud Platform for Building Computing Services

NSi Mobile Installation Guide. Version 6.2

High-Availability User s Guide v2.00

IT Service Level Management 2.1 User s Guide SAS

Version 4.61 or Later. Copyright 2013 Interactive Financial Solutions, Inc. All Rights Reserved. ProviderPro Network Administration Guide.

ONLINE BACKUP MANAGER MS EXCHANGE MAIL LEVEL BACKUP

Migrating from SharePoint 2007 to SharePoint

Backup Server DOC-OEMSPP-S/6-BUS-EN

Using the CoreSight ITM for debug and testing in RTX applications

SQLBase. Starter Guide

Metalogix SharePoint Backup. Advanced Installation Guide. Publication Date: August 24, 2015

RDS Online Backup Suite v5.1 Brick-Level Exchange Backup

FactoryTalk Gateway Getting Results Guide

Network device management solution.

Workflow Templates Library

Rentavault Online Backup. MS Exchange Mail Level Backup

Syslog Monitoring Feature Pack

Kaseya 2. User Guide. for Network Monitor 4.1

JumpCloud is your Directory-as-a-Service. A fully managed directory to rule your infrastructure whether on-premise or in the cloud.

In order to upload a VM you need to have a VM image in one of the following formats:

Comparison of the High Availability and Grid Options

docs.hortonworks.com

UNICORN 6.4. Administration and Technical Manual

Continuous Data Protection. PowerVault DL Backup to Disk Appliance

SCF/FEF Evaluation of Nagios and Zabbix Monitoring Systems. Ed Simmonds and Jason Harrington 7/20/2009

Programming LEGO NXT Robots using NXC

Transcription:

www.iblsoft.com Web based NWP model management Igor Andruska Numeric Weather Team Michal Weis Managing Director ECMWF Visualisation in Meteorology week 2015 15 th Workshop on Meteorological Operational Systems 29 th September 2015

Motivation - where do we come from IBL develops variety of software for meteorological services: about 40% of ECMWF member/coop NMSs use IBL products operationally Processing flow - Data collection & xfer - Message switching - forwarding to upstream systems - Forecaster s workplace - Integrates a lot of data & tools to create fcst - Visualisation - OGC Web Services - Web weather display - Utilizing OGC Web Services - Tools for forecasters - Widgets for public web

Motivation - workflow & dataflow Visual Weather - operational forecaster's workstation (but not limited to) - mainly integrates & displays observations, remote sensing data, and - models. Used for: for operational forecasting, research, study/universities. Consequently we deal with tasks of forecasters that depend on data (model) availability.

The right schedule

High-level design goals Build a NWP scheduler that can be operated and used without guru-level IT skills for: ad hoc model runs by researchers, students multi-user support regular operational production NWP workflows created from predefined parameterizable functional blocks, just like Lego Example of building blocks: Compute COSMO single domain forecast for initial time <T> and model domain <D> Compute WAVEWATCH III forecast for initial time <T> and domain <D>, take driving wind fields from model <M>

ecflow strengths as we see it Python API - the most important feature for IBL! Hard to live without API nowadays in general. End users often require unpredicted functionality Allows us to implement our own extensions and facades on top of ecflow core Reliable - the most important for end-users (without much skills) Server is extremely stable (never crashed at IBL!) Reliable handling of zombie jobs Smooth recovery after power cuts Support for multi-user environments It is straightforward to run a separate ecflow server instance for each user without any undesired interference among users

ecflow weaknesses #1 Missing possibility to parametrize suite in runtime without coding in Python Parameters: model initial time, forecast range, etc. Built-in commands are bit low level (for non-daily users) To reliably stop a complex suite (kill all running jobs + prevent queued tasks from being executed) with complicated triggering (e.g., nodes triggered when other nodes abort) might require issuing a series of kills instead of just one. This is confusing for non-it users To run a node, users must distinguish between begin, run, re-queue and restart commands. The background is a little bit technical for non-it users

ecflow weaknesses #2 Lack of stable and user friendly UI for monitoring and control ecflowview crashes a lot, complicated UI Output of ecflow_client --get_state CLI command is hard to read No intrinsic support for recursion - (temporal) recursion is very common in NWP, e.g., for: Continuous data assimilation Ocean wave model restarted from the previous run

Building on top of ecflow

Extended suite definitions We extended ecflow suite definition to allow runtime parametrization of tasks: Statements starting with #> Shell-like parameter expansion: Environment variables! Suite arguments % Cross references to other parameters $ Evaluated in runtime when task starts

Recursion Continuous model run: run initialized from forecast of the previous run e.g. continuous data assimilation, warm start of ocean wave model, etc. We must make sure that the chain is always linked, no cold starts e.g. by power outages, hardware failures, maintenance, etc. Need to model this relationship between successive model runs recursion IBL solution: special task in a suite (typically the first one) that checks whether predecessor has been loaded in ecflow. If no, loads it waits until predecessor completes

Macro commands Using Python API, we combined elementary ecflow commands into convenient macro commands with well defined behaviour that cover vast majority of use cases: load-suite - loads suite definition into server, parametrizes suite (e.g., model initial time) and (optionally) runs suite. All in single command run-node - runs suite/family/task regardless of its state (OK, there are few inevitable exceptions). If node is already running there is a switch to stop the node first and then run again run-aborted-tasks - runs only aborted tasks inside family or suite - common operation during recovery of the workflow stop-node - reliably stops suite/family/task in one shot regardless of state of node, its subnodes and triggering scheme delete-suite - deletes suite definition from server + removes all the files associated with the suite on the disk

Monitoring ecflow Command line interface With Python API we created a more user friendly version of ecflow_client --get_state command CLI switch to show/hide triggers labels events meters variables flags Use standard 16 terminal colors to highlight node state Useful for checking remote systems over slow internet lines

Monitoring ecflow Web-based interface Renders state of loaded suites Macro commands Zero-footprint installation

Monitoring ecflow Email alerts 1. Sent automatically when task aborts. Contains error message and Python traceback 2. Sent explicitly when certain conditions occur during processing, like missing crucial observation types for data assimilation

Monitoring model outputs

Monitoring of hardware & system

We will appreciate your comments and welcome further questions. Michal.Weis@iblsoft.com www.iblsoft.com