Tools for Managing Big Data Analytics on z/OS. Mike Stebner, Joe Sturonas, PKWARE, Inc. Wednesday, March 12, 2014. Session ID 14948. Test link: www.share.org
Introduction: Heterogeneous Analysis. Addressing the process of packaging and transferring z/OS-based information to an off-board analytic platform in an effective, cost-efficient, and secure manner. Which major hurdles can exploitation of advanced System z facilities overcome in this setting?
Introduction: Heterogeneous Analysis. Data Transformation: code page differences (EBCDIC/ASCII); data structures (binary, endian-mode numerics, parsing); portability between dissimilar file system formats; data packaging (multiple discrete components). Data Protection. Data Volume: total raw size; number of exchanges.
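The code page and endian hurdles listed above can be illustrated with a minimal Python sketch. The record layout (six EBCDIC characters followed by a big-endian fullword) and the choice of code page 037 are assumptions made for illustration only; real z/OS data may use other code pages and layouts.

```python
import struct

# Hypothetical 10-byte record: 6 EBCDIC (code page 037) characters followed
# by a 4-byte big-endian signed binary integer (a fullword).
record = "SHARE ".encode("cp037") + struct.pack(">i", 14948)

def parse_record(raw):
    """Translate the text field to a Python string and decode the fullword."""
    name = raw[:6].decode("cp037").rstrip()
    (count,) = struct.unpack(">i", raw[6:10])   # ">" means big-endian
    return name, count

name, count = parse_record(record)

# Reading the same four bytes as little-endian silently yields a wrong value,
# which is why numeric fields cannot be handled as text during transfer.
(misread,) = struct.unpack("<i", record[6:10])
```

The text field and the binary field need different treatment: a blanket EBCDIC-to-ASCII translation of the whole record would corrupt the fullword.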
Finding the Sweet Spot
What is the business impact of selected designs and facilities?
Focus on experiences with System z facilities that help address two areas. Data Transformation: code page differences (EBCDIC/ASCII); data structures (binary, endian numerics, parsing); portability between dissimilar file system formats; data packaging (multiple discrete components). Data Protection: Encryption. Data Volume: Hardware-Assisted Compression; total raw size; number of exchanges.
Data Protection: Data-Centric Encryption using ICSF

Machine        Algorithms Supported           Crypto Hardware
z10-EC 2097    DES, 3DES, AES 128/192/256     CPACF, CEX2C, CEX3C
z10-BC 2098    DES, 3DES, AES 128/192/256     CPACF, CEX2C, CEX3C
z196 2817      DES, 3DES, AES 128/192/256     CPACF, CEX3C
z114 2818      DES, 3DES, AES 128/192/256     CPACF, CEX3C
zEC12 2827     DES, 3DES, AES 128/192/256     CPACF, CEX3C, CEX4C
zBC12 2828     DES, 3DES, AES 128/192/256     CPACF, CEX3C, CEX4C
Application Design: Cryptographic Design Influences. Data Exchange Format: collection with associative constructs; data transport (container format); in-flight and at-rest security; authentication and decryption service availability. Cryptographic Identity and Associated Key Management: dynamic vs. static keys; inter-system key coordination; data recovery (contingency keys). Resource Capacity: timeliness of service.
Key Exposures: The Need for Key Management
[Diagram: Crypto Facilities. Labels: ICSF CKDS & PKDS; RACF/ACF2/Top Secret; Proprietary Certificate Store; OpenPGP Keyrings; LDAP; Application Services (Administration, Certificate, Cryptographic); CEXnC / CPACF / Software Crypto; X.509 Certificates; Public Certificate Authority; Native X.509 Certificates]
Data-Centric Encryption: ICSF Data Encipherment Algorithms. RSA PKI encryption: losing ground for longevity because of the high processing cost of increased key lengths. Symmetric clear key: DES class, AES (128- to 256-bit key strength); may be employed with a passphrase-generated key or a CKDS-stored key. Symmetric protected key (SYMCPACFWRAP). CKDS secure key.
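As a rough illustration of a passphrase-generated symmetric key, the sketch below derives a 256-bit key with PBKDF2-HMAC-SHA256 from the Python standard library. The slides do not say which derivation function the product or ICSF uses, so PBKDF2, the salt size, the iteration count, and the function name `derive_aes256_key` are all assumptions chosen for illustration.

```python
import hashlib
import os

def derive_aes256_key(passphrase, salt, iterations=100_000):
    """Derive a 32-byte (256-bit) key from a passphrase.

    Assumption: PBKDF2-HMAC-SHA256 stands in for whatever derivation
    the real application uses; it is not taken from the slides.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )

salt = os.urandom(16)                      # random salt, stored with the data
key = derive_aes256_key("example passphrase", salt)
```

A key derived this way exists in the clear in application storage, which is the exposure that the clear, protected, and secure key distinctions address.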
Symmetric Key Operational Comparison

Key type     Characterization    Processed by                       Key source
Clear        Fast, but risky     ICSF software or System z CPACF    Passphrase value or ICSF CKDS registered (clear)
Protected    Fast and secure     System z CPACF                     ICSF CKDS registered (encrypted)
Secure       Slow                Cryptographic card                 ICSF CKDS registered (encrypted)
Leverage ICSF CKDS to Protect Passphrase-Derived Keys
Illustrate Registered ICSF CKDS Key Set
CKDS Policy Control: Duplicate Key Value Protection
RACF key ring/certificate with PKDS
Label: MSTEBNERSHARETEST ← RACF Label (r_datalib API access)
Certificate ID: 2QPVweLV4uPFwtXF2fLw8P1A
Status: TRUST
Start Date: 2013/12/17 19:00:25
End Date: 2014/01/18 19:00:24
Serial Number: 10F0F1FF3C718DEE4D24BBEDA47A49D0
Issuer's Name: CN=UTN-USERFirst-Client Authentication and Email.OU=http://www.usertrust.com.O=The USERTRUST Network.L=Salt Lake City.SP=UT.C=US
Subject's Name: mike.stebner@pkware.com.CN=Mike Stebner.OU=Corporate Secure Email.OU=Issued through PKWARE E-PKI Manager.O=PKWARE.648 N PLANKINTON AVE.L=MILWAUKEE.SP=WI.53203.C=US
Key Usage: HANDSHAKE
Key Type: RSA
Key Size: 2048
Private Key: YES
PKDS Label: SHARE2014MSTEBNER ← ICSF PKDS Label (implied access)
What is the business impact of selected designs and facilities?
Inherited OpenPGP Data Flow: Encryption Layer, Compression Layer, Literal Data Layer
Consider the Basic Data Flow: simple copies from phase to phase
Understand OpenPGP Internal Stream Formatting (RFC 2440 or 4880)
OpenPGP Data Flow Overhead: additional data manipulation logic from phase to phase
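One small piece of that per-phase logic is the packet header that OpenPGP wraps around each layer. The sketch below encodes and decodes the new-format body length defined in RFC 4880, section 4.2.2, and builds a Literal Data packet header (tag 11). The function names are hypothetical, and partial body lengths, which streaming implementations rely on, are deliberately left out.

```python
def encode_new_length(n):
    """Encode a body length using RFC 4880 new-format length octets."""
    if n < 0 or n > 0xFFFFFFFF:
        raise ValueError("length out of range")
    if n < 192:                      # one-octet form
        return bytes([n])
    if n <= 8383:                    # two-octet form
        n -= 192
        return bytes([(n >> 8) + 192, n & 0xFF])
    return b"\xff" + n.to_bytes(4, "big")   # five-octet form

def decode_new_length(buf):
    """Return (body_length, octets_consumed) for a new-format length."""
    first = buf[0]
    if first < 192:
        return first, 1
    if first < 224:
        return ((first - 192) << 8) + buf[1] + 192, 2
    if first == 255:
        return int.from_bytes(buf[1:5], "big"), 5
    raise ValueError("partial body lengths are not handled in this sketch")

def literal_packet_header(body_len):
    """New-format packet tag octet (0xC0 | tag) for Literal Data, tag 11."""
    return bytes([0xC0 | 11]) + encode_new_length(body_len)
```

Because each layer (literal, compressed, encrypted) carries its own header and length, data cannot simply be copied from one phase to the next without this bookkeeping.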
[Table: Illustration of Container Format Influence on Encipherment Facilities. Compares Symmetric Keys, X.509 Certificates, and OpenPGP against RACF/ACF/CA-TSS, ICSF PKDS, ICSF CKDS, and FIPS 140-2, rating each combination as Good, Work Required, or Not Available.]
Compression: Why is it important? [Diagram labels: Data acquisition; Application Services; GCP/zIIP/zEDC; Result: Compressed & Encrypted Data on Target Platform] Data is offloaded, encrypted, and compressed.
What Compression Facilities Are Available on System z? Software-based. General CP (e.g. gzip, OpenPGP, PKZIP, zlib): any viable cross-platform compatible algorithm can be chosen for implementation; Deflate (RFC 1951) is a commonly used algorithm that combines LZ77 sliding-dictionary compression with Huffman coding. Software using zIIP offload: executes software routines on a System z9 or later; requires APF authorization to run SRB enclave scheduling; provides economical compression, but may not improve latency.
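For readers who want to see software Deflate in action, here is a minimal round trip of a raw RFC 1951 stream using Python's standard `zlib` module. It runs on any platform and says nothing about zIIP offload; the sample data and function names are made up for illustration.

```python
import zlib

def deflate_raw(data, level=6):
    """Compress to a raw Deflate stream (negative wbits: no zlib/gzip wrapper)."""
    comp = zlib.compressobj(level, zlib.DEFLATED, -15)
    return comp.compress(data) + comp.flush()

def inflate_raw(blob):
    """Decompress a raw Deflate stream."""
    return zlib.decompress(blob, -15)

# Hypothetical sample: highly repetitive record data compresses well.
sample = b"ACCOUNT-RECORD " * 4096
packed = deflate_raw(sample)
restored = inflate_raw(packed)
```

The raw stream is the common denominator; gzip, PKZIP, and OpenPGP each add their own framing around it.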
What Compression Facilities Are Available on System z? Hardware-based. System z CMPSC static-dictionary hardware compression: available since the early 1990s; static dictionary LZ77; limited applicability outside of z/OS. System z Enterprise Data Compression hardware: new with zEC12 and zBC12 systems; PCIe adapter card; implements the Deflate algorithm.
Compression Facility Functional Comparison

Facility                   Requirements
Software on General CP     General CP capacity
Software on zIIP           System z9, zIIP capacity (APF)
CMPSC Static Dictionary    Pre-defined data structures
zEDC                       zEC12/zBC12, z/OS 2.1, zEDC card

Each facility is also rated on Portable and Generalized Compression (legend: Good, Work Required, Not Available).
IBM zEnterprise Data Compression for z/OS and the zEDC Express Feature (I). IBM Announcement; Document Number: ZSB03059USEN. Implements RFC 1951 Deflate compression. When zlib uses zEDC, there can be up to 118X reduction in CPU and up to 24X throughput improvement. One or more PCIe cards servicing multiple partitions (15). Currently supported only under a native z/OS LPAR; check IBM statements of direction. Optimized for larger amounts of data; has configurable minimum size limits (4K floor). PTFs available for z/OS 1.12 and 1.13 to inflate. Also see SMP/E FIXCAT(IBM.Function.ZEDC).
IBM zEnterprise Data Compression for z/OS and the zEDC Express Feature (III). System Use Cases: SMF. Phased roll-out intentions: BSAM/QSAM (infrastructure layer); DFSMSdss/DFSMShsm backup/restore; z/OS Java Technology Edition, Version 7. Detailed SHARE sessions: 15209: Experiences with IBM zAware and zEDC; 15099: zEnterprise Data Compression: What is it and How Do I Use it? (Wed. 4:30 PM); 15080: z/OS zEnterprise Data Compression Usage and Configuration.
IBM zEnterprise Data Compression for z/OS and the zEDC Express Feature (IV). z/OS V2R1.0 MVS Callable Services for HLL (Ch. 13-15). Deflate stream compatible with GZIP, PKZIP, OpenPGP. Checks to determine hardware availability. IBM-provided compatible C library functions: APF-authorized API for single-block compress/inflate; unauthorized zlib interface (streaming data).
IBM zEnterprise Data Compression for z/OS and the zEDC Express Feature (V). z/OS V2R1.0 MVS Callable Services for HLL (Ch. 13-15). Unauthorized zlib interface (streaming data): uses the zlib.net z_stream programming interface (subset); raw Deflate stream or GZIP modes (CRC32 with GZIP); libzz.a include wrapper; controlled by the SAF-protected FACILITY class resource FPZ.ACCELERATOR.COMPRESSION; z/OS UNIX _HZC_COMPRESSION_METHOD environment control variable; may fall back to zlib software routines depending on zEDC requirements, including size limitations; PARMLIB IQPPRMxx DEFMINREQSIZE (4K) and INFMINREQSIZE (16K).
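The streaming GZIP mode with its CRC32 can be approximated off-platform with the standard zlib binding in Python. This is only a sketch of the streaming pattern and the gzip trailer; it does not use the IBM-provided library, and the chunk contents and the `gzip_stream` helper are hypothetical.

```python
import struct
import zlib

def gzip_stream(chunks, level=6):
    """Compress an iterable of byte chunks into gzip-framed output pieces."""
    # wbits = 16 + MAX_WBITS asks zlib for a gzip header and trailer.
    comp = zlib.compressobj(level, zlib.DEFLATED, 16 + zlib.MAX_WBITS)
    for chunk in chunks:
        out = comp.compress(chunk)
        if out:
            yield out
    yield comp.flush()

# Hypothetical input: eight chunks fed one at a time, as a stream would be.
chunks = [b"SMF record payload " * 1000 for _ in range(8)]
blob = b"".join(gzip_stream(chunks))

# The gzip trailer holds CRC32 and the input size, both little-endian.
crc_in_trailer, size_in_trailer = struct.unpack("<II", blob[-8:])
```

The receiving platform can recompute the CRC32 over the inflated data and compare it with the trailer to confirm the payload arrived intact.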
IBM zEnterprise Data Compression: PKWARE Early Test Program Experience. Objective: assess compression using software on GCP, zIIP, and zEDC. zEC12: 5 general CPs, 2 zIIPs, 1 zEDC. Workloads: single system (no LPAR sharing of zEDC); large (1 GB+) linear with multiple parallel (80 concurrent); small (256K) high volume. Metrics: elapsed time; processor time.
zEDC Operations Console Display: General PCIe Status
zEDC Operations Display: zEDC PCIe Adapter Status
zEDC Operational Monitoring (II)
zEDC Processing Characteristics. Multi-tasking with the zlib API is available. The zlib API may not run on the zEDC hardware (per design): different minimum buffer size thresholds for deflate and inflate. Only one level of zEDC Deflate compression; 9 levels available in zlib software. Internal implementations of RFC 1951 Deflate may differ. May experience varying compression ratios (based on level) right around the minimum buffer size restriction.
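To get a feel for how the software levels affect the result, the sketch below compresses one sample at zlib levels 1, 6, and 9 and reports space saving as a percentage. The sample records are invented, and space saving is only one common definition of a compression percentage; the slides do not state which definition their figures use.

```python
import zlib

def space_saving(original, compressed):
    """Percentage of bytes removed: 100 * (1 - compressed / original)."""
    return 100.0 * (1.0 - len(compressed) / len(original))

# Hypothetical sample: about 40 KB of repetitive text records.
sample = b"".join(b"CUST%06d,ACTIVE,MILWAUKEE,WI\n" % i for i in range(1300))

results = {}
for level in (1, 6, 9):
    packed = zlib.compress(sample, level)
    assert zlib.decompress(packed) == sample   # every level must round-trip
    results[level] = round(space_saving(sample, packed), 1)
```

Every level produces a valid Deflate stream that any inflater can read, so a difference in ratio between implementations does not affect interoperability.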
IBM zEnterprise Data Compression: PKWARE Early Test Program Experience, Initial Results Overview (I). zEDC sustained 1 GB+ per second of raw compression. zEDC capacity exceeded application resource constraints: the effects of I/O and application processing prevented saturation of zEDC. Under appropriate conditions, zIIP met or exceeded application performance when compared to zEDC. Optimized zlib C routines showed benefits over the libzz.a wrapper code under some conditions: small files under the minimum buffer size; inflation.
IBM zEnterprise Data Compression: PKWARE Early Test Program Experience, Initial Results Overview (II). ETP limitations of first implementation identified. Buffer allocation issues: buffer release; rejected concurrent requests for the same size buffer. Compression ratio (77% vs. 89% for software implementations).
Effect of Resource Availability: zEDC vs. zIIP
Incorporate Design with Facility: Transactional Example (1.5 MB each)
Summary. The mainframe is typically the source of record for critical business data. Data needs to move off the mainframe quickly, efficiently, and securely. Numerous facilities on z/OS exist to make this quick, efficient, and secure: zIIP, Crypto Express4S, CPACF, zEDC. Proper transformation is critical to reduce hardware dependencies and facilitate long-term viability.