Application Integration and Workflows with UNICORE 6
Bernd Schuller and the UNICORE team
Jülich Supercomputing Centre, Forschungszentrum Jülich GmbH (member of the Helmholtz Association)
26 November 2009, D-Grid Workshop "Numerische Simulationen" (Numerical Simulations), Göttingen
Outline
- UNICORE overview
- Data management and job model
- Applications in UNICORE
- Workflows
- Demo?!
Slide 2
- Integrated, complete Grid middleware stack
- Easy to install, configure, administer and monitor
- Excellent application and workflow support
- Suited for both high-performance and high-throughput usage
- Wide variety of clients: GUI, command line, APIs
- Java/Perl based; supports common operating systems (OS) and resource management systems (RMS)
  - OS: UNIXes, Mac OS X, Windows
  - RMS: no-batch, Torque, LoadLeveler, LSF, SGE, ...
- Active developers, responsive to user wishes ;-), quick and efficient support
- Open source, see http://www.unicore.eu
Slide 3
Architecture Slide 4
Architecture Slide 5
Architecture Slide 6
Architecture Slide 7
Data management Slide 8
Storage support
- The UNICORE storage management service (SMS) provides an abstract, filesystem-like view of a storage resource
- Common operations
  - mkdir, delete, list directory, etc.
- Initiate file transfers
  - Import/export files from/to the client
  - Send/receive files from/to other servers
- Various backends available
  - Filesystem, Apache Hadoop, iRODS (prototype!)
- Designed for extensibility (e.g. Amazon S3, ...)
Slide 9
Integrated storage management in the UNICORE Rich Client
- Grid browser
- Create files
- Drag and drop from/to the desktop environment
- Copy and paste
- Remote file editing
Slide 10
File transfer options
- Both client-to-server and server-to-server file transfer (FT) available
- Built in: BFT transfer (based on HTTPS)
  - Single open port needed, (almost) full UNICORE security
  - Simple interface (bulk write; read supports byte ranges)
  - Fast (several MB/s)
- Built in: OGSA ByteIO (uses SOAP messages)
  - Single port, full UNICORE security
  - Rich interface (POSIX-like, block read/write, etc.)
  - Slow (~400 kB/s)
- Plus: alternative mechanisms can be plugged in
  - UDT, GridFTP, parallel HTTP, ...
Slide 11
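To get a feeling for what the two built-in rates mean in practice, the following minimal sketch estimates transfer times for a 1 GB file. The ~400 kB/s figure for OGSA ByteIO is from the slide; the 5 MB/s figure for BFT is only an assumed stand-in for "several MB/s", and the file size is an arbitrary example.

```python
def transfer_seconds(size_bytes: int, rate_bytes_per_s: float) -> float:
    """Idealised transfer time: size divided by a constant sustained rate."""
    if rate_bytes_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_bytes / rate_bytes_per_s


FILE_SIZE = 1_000_000_000   # 1 GB, arbitrary example size
BYTEIO_RATE = 400_000       # ~400 kB/s, as quoted on the slide
BFT_RATE = 5_000_000        # ASSUMPTION: 5 MB/s stands in for "several MB/s"

byteio_time = transfer_seconds(FILE_SIZE, BYTEIO_RATE)
bft_time = transfer_seconds(FILE_SIZE, BFT_RATE)

print(f"ByteIO: {byteio_time / 60:.1f} min, BFT: {bft_time / 60:.1f} min")
```

Under these assumptions the ByteIO transfer takes roughly 42 minutes and the BFT transfer a little over 3 minutes, which is why ByteIO is better suited to small files or partial reads.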
Job model Slide 12
Job description
- JSDL 1.0 (OGF standard)
- What to execute
  - Application name / version (mapped to an executable by UNICORE)
  - or: executable path
  - Arguments
  - Environment variables
  - Optional stdin/stdout redirect
- Data staging specification
  - Into the job directory from a URL
  - From the job directory to a URL
- Resources requested (number of CPUs, etc.)
Slide 13
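Job execution (diagram: client with local filespace, Target System Service, job, USpace, remote storage spaces)
1. Submit the job to the Target System Service
   1.1 The Target System Service creates the job
   1.2 The job address is returned to the client
2. Import data from the client's local filespace / stage in data from remote storage spaces into the USpace
3. Start
4. Wait until done
5a. Export data from the USpace to the client
5b. Stage out data from the USpace to remote storage spaces
Slide 14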
Application Integration Slide 15
Application integration
- Application software installed on the target system
  - Add an entry in the UNICORE server configuration (IDB file)
  - Graphical client
    - Generic plugin
    - Application-specific plugin
- User's own executable
  - Upload (import) it to the target system
  - Use it in a shell script
Slide 16
Client plugins / GridBeans
- Application-specific graphical interface
  - Generates the job description from user input
  - Provides a graphical user interface for input data
  - Provides a graphical user interface for output data
- Consist of
  - Job description generation code
  - One or more user interface modules
- Developer's Guide: http://www.unicore.eu/documentation/manuals/gridbeandevelopersguide.pdf
Slide 17
Example: MOPAC
Molecular Orbital PACkage, a semi-empirical quantum chemistry program
Configuration file entry on the server side (IDB file):

    <idb:IDBApplication>
      <idb:ApplicationName>cmopac</idb:ApplicationName>
      <idb:ApplicationVersion>7.0</idb:ApplicationVersion>
      <jsdl:POSIXApplication xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">
        <jsdl:Executable>/usr/local/c9m qsar/bin/mopac</jsdl:Executable>
        <jsdl:Argument>$input</jsdl:Argument>
        <jsdl:Argument>$output</jsdl:Argument>
        <jsdl:Argument>$errlog</jsdl:Argument>
      </jsdl:POSIXApplication>
    </idb:IDBApplication>

Slide 18
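Conceptually, an IDB entry like the one above lets a job refer to an application by name and version, while the server supplies the executable path and fills the argument placeholders from the job's parameters. The sketch below illustrates that idea only; it is not UNICORE's implementation. The `IDB` dictionary, the `resolve` function and the file names are hypothetical, and the path and placeholders are copied from the entry as shown.

```python
from string import Template

# Hypothetical in-memory stand-in for the IDB: (name, version) -> entry.
IDB = {
    ("cmopac", "7.0"): {
        "executable": "/usr/local/c9m qsar/bin/mopac",  # path as shown in the IDB entry
        "arguments": ["$input", "$output", "$errlog"],
    },
}


def resolve(name, version, parameters):
    """Return the argument vector for an application requested by name/version.

    Raises KeyError if the application is unknown or a placeholder
    has no matching parameter.
    """
    entry = IDB[(name, version)]
    arguments = [Template(arg).substitute(parameters) for arg in entry["arguments"]]
    return [entry["executable"], *arguments]


# Hypothetical job parameters supplied by the client.
command = resolve(
    "cmopac",
    "7.0",
    {"input": "molecule.dat", "output": "molecule.out", "errlog": "molecule.err"},
)
print(command)
```

The benefit of this indirection is that users never need to know where the application is installed on a given target system.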
Example: MOPAC - 2
Use it through the client's generic plugin:
Slide 19
Example: MOPAC - 3 Specify input and output in the Generic plugin: Slide 20
Example: MOPAC - 4 application specific plugin Slide 21
Example: MOPAC - 5 Display results (application specific plugin) Slide 22
Example: MOPAC - 6 Display results (application specific plugin) Slide 23
Workflow system Slide 24
Workflow features
- Simple graphs (DAGs)
- Workflow variables
- Loops and control constructs
  - while, for-each, if-else
- Conditions
  - Exit code, file existence, file size, workflow variables
- Clients
  - UNICORE Rich Client
  - Command-line client
Slide 25
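A workflow expressed as a DAG can only run an activity once all of its predecessors have finished. The following minimal sketch shows that ordering constraint with Python's standard `graphlib` module (Python 3.9+). The activity names are hypothetical, and this illustrates the general technique, not the UNICORE workflow engine.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical workflow: each activity maps to the set of activities
# that must finish before it may start.
workflow = {
    "prepare": set(),
    "simulate_a": {"prepare"},
    "simulate_b": {"prepare"},
    "collect": {"simulate_a", "simulate_b"},
}


def execution_order(graph):
    """Return one valid execution order; raises CycleError if not a DAG."""
    return list(TopologicalSorter(graph).static_order())


order = execution_order(workflow)
print(order)

# A graph with a cycle is rejected, which is why plain graphs must be
# acyclic and loops are offered as separate control constructs.
try:
    execution_order({"a": {"b"}, "b": {"a"}})
    cycle_detected = False
except CycleError:
    cycle_detected = True
```

`simulate_a` and `simulate_b` have no dependency on each other, so a workflow engine is free to run them in parallel on different target systems.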
Workflow: for-each loop
- Iterate over files or variables
- Data files can be local or remote
Slide 26
For-each loop
- Very useful for parameter studies and high-throughput computing
- Usage example: high-throughput patent mining, together with Fraunhofer SCAI
  - Executed on JSC's D-Grid cluster JUGGLE
  - Workflows involving thousands of jobs each
  - Has been an excellent test case for finding performance and scalability issues in UNICORE
Slide 27
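The effect of a for-each loop over files can be pictured as expanding one job template into one independent job per file. The sketch below shows that expansion in plain Python. The template layout, the `{file}` placeholder, the application name and the file locations are all hypothetical and do not reflect the actual UNICORE workflow syntax.

```python
def expand_for_each(job_template, files):
    """Create one job description per file from a shared template.

    Every argument containing the hypothetical "{file}" placeholder is
    filled with the current file location.
    """
    jobs = []
    for index, file_url in enumerate(files):
        jobs.append({
            "name": f"{job_template['name']}-{index}",
            "application": job_template["application"],
            "arguments": [arg.format(file=file_url) for arg in job_template["arguments"]],
        })
    return jobs


# Hypothetical template and inputs; files may be local or remote.
template = {
    "name": "mine",
    "application": "patent-miner",
    "arguments": ["--input", "{file}"],
}
inputs = [
    "/data/patents/part-000.xml",
    "http://example.org/patents/part-001.xml",
]

jobs = expand_for_each(template, inputs)
for job in jobs:
    print(job["name"], job["arguments"])
```

Because the generated jobs do not depend on each other, thousands of them can be brokered across sites, which is what makes the construct suitable for high-throughput use.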
Workflow system (architecture diagram)
- Layer labels: scientific clients and applications; authentication; workflow execution; brokering and job management; job execution and data storage; parallel scientific jobs of multiple end users on target systems
- Components shown: portal client (e.g. GridSphere), command-line client, Eclipse-based client; Gateway; UNICORE Workflow Engine, UNICORE Tracing Service, Service Registry; UNICORE Service Orchestrator, Resource Information Service; basic UNICORE services with XNJS + TSI (two instances) on top of a local RMS (e.g. Torque, LL, LSF, etc.); the services run in UNICORE hosting environments
Workflow system summary
- Workflow engine
  - High-level process enactment
  - Pluggable, domain-specific workflow languages
- Service orchestrator
  - Resource brokering based on pluggable strategies
  - Low-level Grid job execution and monitoring
  - Multiple instances can be deployed for scalability
- Tracing service
  - Logs workflow events with timestamps
  - Useful for performance evaluations and reporting
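"Pluggable strategies" for resource brokering can be read as: the orchestrator delegates the choice of target site to an exchangeable function. The following is a minimal sketch of that pattern. The `Site` fields, the two strategies and the site names are hypothetical and are not taken from the UNICORE code base.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Site:
    """Hypothetical view of a target system as seen by a broker."""
    name: str
    free_cpus: int
    queued_jobs: int


Strategy = Callable[[Sequence[Site], int], Site]


def _candidates(sites, cpus_needed):
    """Keep only sites that can satisfy the CPU request."""
    matching = [site for site in sites if site.free_cpus >= cpus_needed]
    if not matching:
        raise LookupError("no site satisfies the resource request")
    return matching


def most_free_cpus(sites, cpus_needed):
    return max(_candidates(sites, cpus_needed), key=lambda site: site.free_cpus)


def shortest_queue(sites, cpus_needed):
    return min(_candidates(sites, cpus_needed), key=lambda site: site.queued_jobs)


def broker(sites, cpus_needed, strategy: Strategy = most_free_cpus):
    """Select a site for a job using whichever strategy is plugged in."""
    return strategy(sites, cpus_needed)


sites = [Site("site-a", free_cpus=64, queued_jobs=10),
         Site("site-b", free_cpus=16, queued_jobs=0)]

print(broker(sites, 8).name)                  # strategy: most free CPUs
print(broker(sites, 8, shortest_queue).name)  # strategy: shortest queue
```

Swapping the strategy changes the placement decision without touching the code that submits and monitors the job.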
UNICORE 6 command-line client (UCC)
- Provides access to all UNICORE 6 functionality from the command line, plus
  - Batch mode for high-throughput job processing
  - Scriptable, easily extensible
Slide 30
Planned for 6.3.0: improved MPI support
- Current state
  - User needs to know the details of the parallel environments at a site (MPI executable, available options, ...)
- Planned
  - Generalised execution environments in UNICORE
    - Examples: MPI, OpenMP, hybrid, ...
    - Configured in the IDB
  - User can select the required environment for execution, set options, etc.
  - (Future: compilation support)
Slide 31
Thank you! Questions? Slide 32