Pluggable Rule Engine

CurateGear 2016

Terrell Russell, Ph.D. (@terrellrussell)
Senior Data Scientist, iRODS Consortium
Renaissance Computing Institute (RENCI), UNC-Chapel Hill


iRODS Consortium

The iRODS Consortium was created to ensure the sustainability of iRODS and to further its adoption and continued evolution. To this end, the Consortium works to:

- standardize the definition, development, and release of iRODS-based data middleware technologies,
- evangelize iRODS among potential users,
- promote new advances in iRODS, and
- expand the adoption of iRODS-based data middleware technologies through the development, release, and support of an open-source, mission-critical, production-level distribution of iRODS.

Current Members: [figure: member logos]

Four Major Areas of Deployment

- Health Care & Life Science
- Oil & Gas
- Media & Entertainment
- Archives & Records Management

Open Source Data Management Middleware

- iRODS enables data discovery using a metadata catalog that describes every file, every directory, and every storage resource in the data grid.
- iRODS automates data workflows, with a rule engine that permits any action to be initiated by any trigger on any server or client in the grid.
- iRODS enables secure collaboration, so users only need to log in to their home grid to access data hosted on a remote grid.
- iRODS implements data virtualization, allowing access to distributed storage assets under a unified namespace, and freeing organizations from getting locked in to single-vendor storage solutions.


Pluggable Rule Engine

- Part of iRODS 4.2, Spring 2016
- Rule Engine Plugins are written in C++
- Allows rules to be written in any language (both interpreted and compiled)
- Multiple rule engines can run concurrently, allowing calls from one language to another

Today's demo:

- (existing) iRODS Rule Language
- JavaScript
- Python

Pluggable Rule Engine

    Rule Engine Plugin      LOC (w/ comments)
    iRODS Rule Language     228
    JavaScript              222
    Python                  252
    Auditing (C++)          157

Defined operations:

- start
- stop
- rule_exists
- exec_rule
- (probably also) prepend_to_rulebase
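One way to picture the four defined operations is the toy model below. It is only a sketch in Python, not the plugin API: real plugins are written in C++, and the `ToyEngine` and `Dispatcher` classes, as well as the "first running engine that has the rule wins" lookup order, are assumptions made for illustration.

```python
class ToyEngine:
    """Hypothetical stand-in for a rule engine plugin (real ones are C++)."""

    def __init__(self, name, rules):
        self.name = name
        self.rules = rules          # rule name -> Python callable
        self.running = False

    def start(self):
        self.running = True

    def stop(self):
        self.running = False

    def rule_exists(self, rule_name):
        return rule_name in self.rules

    def exec_rule(self, rule_name, args, callback):
        return self.rules[rule_name](args, callback)


class Dispatcher:
    """Routes a rule call to the first running engine that defines it."""

    def __init__(self, engines):
        self.engines = engines

    def call(self, rule_name, *args):
        for engine in self.engines:
            if engine.running and engine.rule_exists(rule_name):
                # The dispatcher itself is passed as the callback, so a
                # rule in one engine can call a rule in another engine.
                return engine.exec_rule(rule_name, list(args), self)
        raise LookupError("no engine defines rule: " + rule_name)


log = []


def greet(args, callback):
    log.append("engine A: greet")
    return callback.call("shout", args[0])   # "shout" lives in engine B


def shout(args, callback):
    log.append("engine B: shout")
    return args[0].upper()


engine_a = ToyEngine("A", {"greet": greet})
engine_b = ToyEngine("B", {"shout": shout})
for engine in (engine_a, engine_b):
    engine.start()

dispatcher = Dispatcher([engine_a, engine_b])
result = dispatcher.call("greet", "puppies")
print(result)
```

Passing the dispatcher as the `callback` argument mirrors the shape of the demo rules, where a rule written in one language reaches rules in another through its callback object.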

Policy Enforcement Points (PEPs)

A Policy Enforcement Point (or PEP) is a location in the iRODS server code where an organization can insert policy. Policy takes the form of a set of rules that are executed by the rule engine.

Consider, for example, a rule to transfer ownership of data objects to the project manager when a user is deleted; the trigger (or PEP) is the deletion of the user. Rules could be written to extract metadata or pre-process data whenever a file is uploaded to a storage device. Or, upon access to particular data objects, a rule can create a log of the event, send an email notification to the project manager, or perform any other task that should follow from the access.

We will be using acPostProcForPut(). This is the location in the code that is hit after a file has been uploaded into iRODS. This location has a full picture of the context in which it is executing: it knows the username, the filename, the location on disk, etc.

Policy Enforcement Points (PEPs)

    audit_pep_auth_agent_auth_response_post
    audit_pep_auth_agent_auth_response_pre
    audit_pep_auth_agent_start_post
    audit_pep_auth_agent_start_pre
    audit_pep_database_check_auth_post
    audit_pep_database_check_auth_pre
    audit_pep_database_gen_query_access_control_setup_post
    audit_pep_database_gen_query_access_control_setup_pre
    audit_pep_database_gen_query_post
    audit_pep_database_gen_query_pre
    audit_pep_database_get_rcs_post
    audit_pep_database_get_rcs_pre
    audit_pep_database_mod_data_obj_meta_post
    audit_pep_database_mod_data_obj_meta_pre
    audit_pep_database_reg_data_obj_post
    audit_pep_database_reg_data_obj_pre
    audit_pep_exec_microservice_post
    audit_pep_exec_microservice_pre
    audit_pep_exec_rule_post
    audit_pep_exec_rule_pre
    audit_pep_resource_resolve_hierarchy_post
    audit_pep_resource_resolve_hierarchy_pre
    audit_pep_resource_stat_post
    audit_pep_resource_stat_pre
    audit_pep_resource_write_post
    audit_pep_resource_write_pre
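The names in this list share a visible pattern: a common `audit_pep_` prefix, an operation name, and a `_pre` or `_post` suffix. The short Python sketch below splits a few of them along those lines. The decomposition is inferred from the list itself rather than from iRODS documentation, and `split_pep` is a hypothetical helper, not an iRODS function.

```python
from collections import defaultdict

# A few names copied from the slide's list.
PEP_NAMES = [
    "audit_pep_auth_agent_start_pre",
    "audit_pep_auth_agent_start_post",
    "audit_pep_database_gen_query_pre",
    "audit_pep_database_gen_query_post",
    "audit_pep_resource_write_pre",
    "audit_pep_resource_write_post",
]


def split_pep(name, prefix="audit_pep_"):
    """Split a PEP name into (operation, stage); stage is 'pre' or 'post'."""
    if not name.startswith(prefix):
        raise ValueError("unexpected prefix: " + name)
    operation, _, stage = name[len(prefix):].rpartition("_")
    if stage not in ("pre", "post") or not operation:
        raise ValueError("unexpected suffix: " + name)
    return operation, stage


# Group the stages seen for each operation.
stages_by_operation = defaultdict(set)
for pep in PEP_NAMES:
    operation, stage = split_pep(pep)
    stages_by_operation[operation].add(stage)

for operation in sorted(stages_by_operation):
    print(operation, sorted(stages_by_operation[operation]))
```

Every operation in the slide's list appears with both stages, which suggests that policy can run both before and after the wrapped operation.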

Three Rule Bases, Part 1 - iRODS Rule Language

    # existing iRODS Rule Language - custom.re

    irodsfunc0() {
        writeLine("serverLog", "custom.re - BEGIN - irodsfunc0()");
        writeLine("serverLog", "custom.re -- Hi Curate Gear!");
        writeLine("serverLog", "custom.re - END - irodsfunc0()");
    }

    irodsfunc1(*foo) {
        writeLine("serverLog", "custom.re - BEGIN - irodsfunc1(foo): [*foo]");
        # calls into the Python rule base
        pyfunc1("called from custom.re");
        writeLine("serverLog", "custom.re - END - irodsfunc1(foo)");
    }

    # attach the key=value pairs in *str to the object as metadata
    add_metadata_to_objpath(*str, *objpath, *objtype) {
        msiString2KeyValPair(*str, *kvp);
        msiAssociateKeyValuePairsToObj(*kvp, *objpath, *objtype);
    }

    # return the value of the session variable named *name
    getsessionvar(*name, *output) {
        *output = eval("str($"++*name++")");
    }

Three Rule Bases, Part 2 - JavaScript

    /* Javascript - core.js */

    function jsfunc0(callback) {
        callback.writeLine("serverLog", "JAVASCRIPT - BEGIN - jsfunc0(callback)");
        callback.writeLine("serverLog", "JAVASCRIPT -- Happy New Year!");
        callback.writeLine("serverLog", "JAVASCRIPT - END - jsfunc0(callback)");
    }

    function jsfunc1(foo, callback) {
        callback.writeLine("serverLog", "JAVASCRIPT - BEGIN - jsfunc1(foo, callback)");
        callback.writeLine("serverLog", " - with parameter foo[" + foo + "]");
        try {
            // deliberately call a rule that does not exist
            callback.doesnotexist("nope");
        } catch(e) {
            throw e + " -- ERROR HANDLING FTW!";
        }
        // not reached in the demo: the catch block rethrows
        callback.writeLine("serverLog", "JAVASCRIPT - END - jsfunc1(foo, callback)");
    }

    /**********************************
     * DEMO 1 - Just Print To rodsLog *
     **********************************/
    function XXXacPostProcForPut(callback) {
        callback.writeLine("serverLog", "JAVASCRIPT - BEGIN - acPostProcForPut()");
        callback.irodsfunc0();
        callback.pyfunc0();
        callback.writeLine("serverLog", "JAVASCRIPT - END - acPostProcForPut()");
    }

Three Rule Bases, Part 3 - Python

    # Python - core.py

    import datetime

    def pyfunc0(rule_args, callback):
        callback.writeLine('serverLog', 'PYTHON - BEGIN - pyfunc0(callback)')
        callback.writeLine('serverLog', 'PYTHON -- ' + str(datetime.datetime.now()))
        callback.writeLine('serverLog', 'PYTHON - END - pyfunc0(callback)')

    def pyfunc1(rule_args, callback):
        callback.writeLine('serverLog', 'PYTHON - BEGIN - pyfunc1(rule_args, callback)')
        for arg in (rule_args):
            callback.writeLine('serverLog', 'PYTHON -- arg=[' + arg + ']')
        callback.writeLine('serverLog', 'PYTHON - END - pyfunc1(rule_args, callback)')

    ##########################################
    # DEMO 2 - Parameters and Error Handling #
    ##########################################
    def XXXacPostProcForPut(rule_args, callback):
        callback.writeLine('serverLog', 'PYTHON - BEGIN - acPostProcForPut()')
        callback.irodsfunc1("called from python, apples")
        callback.jsfunc1("called from python, bananas")
        session_vars = ['userNameClient', 'dataSize',]
        for s in session_vars:
            v = callback.getsessionvar(s, 'dummy')[1]
            callback.writeLine('serverLog', s + ' :: ' + v)
        callback.writeLine('serverLog', 'PYTHON - END - acPostProcForPut()')

    #################################
    # DEMO 3 - EXIF Extraction Demo #
    #################################
    import os
    import sys
    from PIL import Image
    from PIL.ExifTags import TAGS

    def XXXacPostProcForPut(rule_args, callback):
        phypath = callback.getsessionvar('filePath', 'dummy')[1]
        callback.writeLine('serverLog', phypath)
        objpath = callback.getsessionvar('objPath', 'dummy')[1]
        callback.writeLine('serverLog', objpath)
        exiflist = []
        # iteritems() is a Python 2 idiom; use items() on Python 3
        for (k, v) in Image.open(phypath)._getexif().iteritems():
            exifpair = '%s=%s' % (TAGS.get(k), v)
            exiflist.append(exifpair)
        # key=value pairs joined with '%', as add_metadata_to_objpath expects
        exifstring = "%".join(exiflist)
        callback.add_metadata_to_objpath(exifstring, objpath, '-d')
        callback.writeLine('serverLog', 'PYTHON - acPostProcForPut() EXIF complete')
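The EXIF demo hands metadata to the rule language as a single string of `key=value` pairs joined by `%`. The sketch below isolates that string-building step so it can be tried without PIL or an iRODS server. `to_kvp_string` and `from_kvp_string` are hypothetical helpers, and the round-trip parser only mirrors the format used in the slide; it makes no claim about how `msiString2KeyValPair` treats edge cases.

```python
def to_kvp_string(pairs, pair_sep="%", kv_sep="="):
    """Join (key, value) pairs into 'k1=v1%k2=v2' form.

    Raises ValueError when a key or value contains a separator that
    would make the joined string ambiguous.
    """
    parts = []
    for key, value in pairs:
        text = str(value)
        if pair_sep in key or kv_sep in key or pair_sep in text:
            raise ValueError("separator found in pair: %r" % ((key, value),))
        parts.append(key + kv_sep + text)
    return pair_sep.join(parts)


def from_kvp_string(text, pair_sep="%", kv_sep="="):
    """Split 'k1=v1%k2=v2' back into a list of (key, value) string pairs."""
    pairs = []
    for part in text.split(pair_sep):
        key, _, value = part.partition(kv_sep)
        pairs.append((key, value))
    return pairs


# Sample values taken from the demo's imeta output.
exif = [
    ("Make", "Canon"),
    ("Model", "Canon EOS REBEL T3i"),
    ("ExposureTime", (1, 40)),
]

exifstring = to_kvp_string(exif)
print(exifstring)
print(from_kvp_string(exifstring))
```

The demo code joins the pairs without any such check, so an EXIF value containing `%` would presumably split into an extra, malformed pair; rejecting or escaping such values before the call is one way to guard against that.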

DEMO - Baseline

iput a file into iRODS

    $ iput puppies.jpg

rodsLog:

    Jan 13 13:00:04 pid:26540 NOTICE: Agent process 30793 started for puser=rods and cuser=rods from #.#.#.#
    Jan 13 13:00:04 pid:30793 NOTICE: readAndProcClientMsg: received disconnect msg from client
    Jan 13 13:00:04 pid:30793 NOTICE: Agent exiting with status = 0
    Jan 13 13:00:04 pid:26540 NOTICE: Agent process 30793 exited with status 0

DEMO 1 - Just Print to rodsLog

overload the acPostProcForPut() policy enforcement point

    # remove prepended XXX in core.js DEMO 1 function name
    function acPostProcForPut(callback) {...}

remove, then iput the same file into iRODS

    $ irm -rf puppies
    $ iput puppies.jpg

rodsLog:

    Jan 13 13:01:34 pid:26540 NOTICE: Agent process 30936 started for puser=rods and cuser=rods from #.#.#.#
    Jan 13 13:01:35 pid:30936 NOTICE: readAndProcClientMsg: received disconnect msg from client
    Jan 13 13:01:35 pid:30936 NOTICE: Agent exiting with status = 0
    Jan 13 13:01:35 pid:26540 NOTICE: Agent process 30936 exited with status 0
    Jan 13 13:01:35 pid:26540 NOTICE: Agent process 30943 started for puser=rods and cuser=rods from #.#.#.#
    Jan 13 13:01:35 pid:30943 NOTICE: writeLine: inString = JAVASCRIPT - BEGIN - acPostProcForPut()
    Jan 13 13:01:35 pid:30943 NOTICE: writeLine: inString = custom.re - BEGIN - irodsfunc0()
    Jan 13 13:01:35 pid:30943 NOTICE: writeLine: inString = custom.re -- Hi Curate Gear!
    Jan 13 13:01:35 pid:30943 NOTICE: writeLine: inString = custom.re - END - irodsfunc0()
    Jan 13 13:01:35 pid:30943 NOTICE: writeLine: inString = PYTHON - BEGIN - pyfunc0(callback)
    Jan 13 13:01:35 pid:30943 NOTICE: writeLine: inString = PYTHON -- 2016-01-13 13:01:35.522276
    Jan 13 13:01:35 pid:30943 NOTICE: writeLine: inString = PYTHON - END - pyfunc0(callback)
    Jan 13 13:01:35 pid:30943 NOTICE: writeLine: inString = JAVASCRIPT - END - acPostProcForPut()
    Jan 13 13:01:35 pid:30943 NOTICE: readAndProcClientMsg: received disconnect msg from client
    Jan 13 13:01:35 pid:30943 NOTICE: Agent exiting with status = 0
    Jan 13 13:01:35 pid:26540 NOTICE: Agent process 30943 exited with status 0

DEMO 2 - Parameters and Error Handling

overload the acPostProcForPut() policy enforcement point

    # remove prepended XXX in core.py DEMO 2 function name
    def acPostProcForPut(rule_args, callback):

remove, then iput the same file into iRODS

    $ irm -rf puppies
    $ iput puppies.jpg

rodsLog:

    Jan 13 13:02:01 pid:26540 NOTICE: Agent process 30986 started for puser=rods and cuser=rods from #.#.#.#
    Jan 13 13:02:01 pid:30986 NOTICE: readAndProcClientMsg: received disconnect msg from client
    Jan 13 13:02:01 pid:30986 NOTICE: Agent exiting with status = 0
    Jan 13 13:02:01 pid:26540 NOTICE: Agent process 30986 exited with status 0
    Jan 13 13:02:01 pid:26540 NOTICE: Agent process 30993 started for puser=rods and cuser=rods from #.#.#.#
    Jan 13 13:02:01 pid:30993 NOTICE: writeLine: inString = PYTHON - BEGIN - acPostProcForPut()
    Jan 13 13:02:01 pid:30993 NOTICE: writeLine: inString = custom.re - BEGIN - irodsfunc1(foo): [called from python, apples]
    Jan 13 13:02:01 pid:30993 NOTICE: writeLine: inString = PYTHON - BEGIN - pyfunc1(rule_args, callback)
    Jan 13 13:02:01 pid:30993 NOTICE: writeLine: inString = PYTHON -- arg=[called from custom.re]
    Jan 13 13:02:01 pid:30993 NOTICE: writeLine: inString = PYTHON - END - pyfunc1(rule_args, callback)
    Jan 13 13:02:01 pid:30993 NOTICE: writeLine: inString = custom.re - END - irodsfunc1(foo)
    Jan 13 13:02:01 pid:30993 NOTICE: writeLine: inString = JAVASCRIPT - BEGIN - jsfunc1(foo, callback)
    Jan 13 13:02:01 pid:30993 NOTICE: writeLine: inString = - with parameter foo[called from python, bananas]
    Jan 13 13:02:01 pid:30993 ERROR: [-] irods/server/re/src/rules.cpp:674:int actiontablelookup(irods::ms_table_entry &, char *) : status [
    [-] irods/server/re/src/irods_ms_plugin.cpp:110:irods::error irods::load_microservice_plugin(ms_table &, const std::string) : s
    [-] irods/lib/core/include/irods_load_plugin.hpp:145:irods::error irods::load_plugin(plugintype *&, const std::string &,
    Jan 13 13:02:01 pid:30993 ERROR: -1102000 -- ERROR HANDLING FTW!
    Jan 13 13:02:01 pid:30993 NOTICE: writeLine: inString = userNameClient :: rods
    Jan 13 13:02:01 pid:30993 NOTICE: writeLine: inString = dataSize :: 95891
    Jan 13 13:02:01 pid:30993 NOTICE: writeLine: inString = PYTHON - END - acPostProcForPut()
    Jan 13 13:02:01 pid:30993 NOTICE: readAndProcClientMsg: received disconnect msg from client
    Jan 13 13:02:01 pid:30993 NOTICE: Agent exiting with status = 0
    Jan 13 13:02:01 pid:26540 NOTICE: Agent process 30993 exited with status 0

DEMO 3 - EXIF Extraction Demo - 1 of 2

overload the acPostProcForPut() policy enforcement point

    # remove prepended XXX in core.py DEMO 3 function name
    def acPostProcForPut(rule_args, callback):

remove, then iput the same file into iRODS

    $ irm -rf puppies
    $ iput puppies.jpg

rodsLog:

    Jan 13 13:02:50 pid:26540 NOTICE: Agent process 31039 started for puser=rods and cuser=rods from #.#.#.#
    Jan 13 13:02:50 pid:31039 NOTICE: readAndProcClientMsg: received disconnect msg from client
    Jan 13 13:02:50 pid:31039 NOTICE: Agent exiting with status = 0
    Jan 13 13:02:50 pid:26540 NOTICE: Agent process 31039 exited with status 0
    Jan 13 13:02:50 pid:26540 NOTICE: Agent process 31046 started for puser=rods and cuser=rods from #.#.#.#
    Jan 13 13:02:50 pid:31046 NOTICE: writeLine: inString = /var/lib/irods/irods/vault/home/rods/puppies.jpg
    Jan 13 13:02:50 pid:31046 NOTICE: writeLine: inString = /tempzone/home/rods/puppies.jpg
    Jan 13 13:02:51 pid:31046 NOTICE: writeLine: inString = PYTHON - acPostProcForPut() EXIF complete
    Jan 13 13:02:51 pid:31046 NOTICE: readAndProcClientMsg: received disconnect msg from client
    Jan 13 13:02:51 pid:31046 NOTICE: Agent exiting with status = 0
    Jan 13 13:02:51 pid:26540 NOTICE: Agent process 31046 exited with status 0

DEMO 3 - EXIF Extraction Demo - 2 of 2

list the newly extracted and associated metadata

    $ imeta ls -d puppies.jpg
    AVUs defined for dataObj puppies.jpg:
    <snip>
    attribute: DateTimeOriginal   value: 2012:10:31 18:37:26   units:
    attribute: ExifImageHeight    value: 518                   units:
    attribute: ExifImageWidth     value: 777                   units:
    attribute: ExposureTime       value: (1, 40)               units:
    attribute: Flash              value: 16                    units:
    <snip>
    attribute: Make               value: Canon                 units:
    attribute: MeteringMode       value: 5                     units:
    attribute: Model              value: Canon EOS REBEL T3i   units:
    attribute: Orientation        value: 1                     units:
    <snip>

verify with third-party jhead

    $ jhead puppies.jpg
    File name    : puppies.jpg
    File size    : 95891 bytes
    File date    : 2016:01:11 22:16:02
    Camera make  : Canon
    Camera model : Canon EOS REBEL T3i
    Date/Time    : 2012:10:31 18:37:26
    Resolution   : 777 x 518
    Flash used   : No
    Focal length : 18.0mm (35mm equivalent: 188mm)
    CCD width    : 3.45mm
    Exposure time: 0.025 s (1/40)
    Aperture     : f/3.5
    ISO equiv.   : 2500
    Whitebalance : Auto
    Metering Mode: pattern
    Exposure     : program (auto)
    JPEG Quality : 94
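The two listings agree even where the values look different. The Python sketch below checks two of those correspondences by hand: the EXIF rational `(1, 40)` against jhead's `0.025 s (1/40)`, and `Flash` value 16 against `Flash used : No` (in the EXIF Flash field, the lowest bit records whether the flash fired). The helper names are hypothetical and the input values are copied from the slide.

```python
from fractions import Fraction


def exposure_seconds(rational):
    """Convert an EXIF (numerator, denominator) pair to an exact fraction."""
    numerator, denominator = rational
    return Fraction(numerator, denominator)


def flash_fired(flash_value):
    """Bit 0 of the EXIF Flash field is set when the flash fired."""
    return bool(flash_value & 0x1)


# Values as stored in the iRODS metadata catalog by the demo rule.
exposure = exposure_seconds((1, 40))
summary = "%.3f s (%d/%d)" % (float(exposure), exposure.numerator, exposure.denominator)
print(summary)

print("Flash used :", "Yes" if flash_fired(16) else "No")

# Width and height from the AVUs, in the order jhead reports them.
resolution = "%d x %d" % (777, 518)
print(resolution)
```

A check like this could run inside the rule itself if normalized values, rather than raw EXIF tuples, were wanted in the catalog.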

Discussion

- Powerful abstraction
- Invites new developers
- Provides migration path
- Requires new documentation
- Requires broader security model
- Requires careful consideration

Questions?

Hao Xu, Jason Coposky, Ben Keller, Terrell Russell (2015). Pluggable Rule Engine Architecture. iRODS User Group Meeting 2015 Proceedings, pp. 29-34.
http://irods.org/wp-content/uploads/2015/09/umg2015_p.pdf

irods.org
github.com/irods
@irods

Terrell Russell
@terrellrussell