Sustainability Issues in Digital Preservation



Similar documents
Sustainability and Energy Efficiency in Data Centers Design and Operation

Energy Conscious Virtual Machine Migration by Job Shop Scheduling Algorithm

Long term retention and archiving the challenges and the solution

ECO-FRIENDLY COMPUTING: GREEN COMPUTING

Datacenter Efficiency

Leveraging Renewable Energy in Data Centers: Present and Future


Protect Data... in the Cloud

IBM and Dynamic Infrastructure. Doug Neilson, IBM Systems Group May 2009

Bytes and BTUs: Holistic Approaches to Data Center Energy Efficiency. Steve Hammond NREL


Stochastic Model Predictive Control for data centers

The Efficient Enterprise. Juan Carlos Londoño Data Center Projects Engineer APC by Schneider Electric

EMC, VNX and ENERGY EFFICIENCY

H ig h L e v e l O v e r v iew. S te p h a n M a rt in. S e n io r S y s te m A rc h i te ct

1.- L a m e j o r o p c ió n e s c l o na r e l d i s co ( s e e x p li c a r á d es p u é s ).

Green HPC - Dynamic Power Management in HPC


The Curious Case of Database Deduplication. PRESENTATION TITLE GOES HERE Gurmeet Goindi Oracle

Securstore Cloud Backup and Recovery Software.

Oracle Infrastructure Systems Management with Enterprise Manager and Ops Center CON4954

Guideline for Water and Energy Considerations During Federal Data Center Consolidations

June Blade.org 2009 ALL RIGHTS RESERVED

How swift is your Swift? Ning Zhang, OpenStack Engineer at Zmanda Chander Kant, CEO at Zmanda

EMC AVAMAR. a reason for Cloud. Deduplication backup software Replication for Disaster Recovery

Smarter Data Centers. Integrated Data Center Service Management

How To Find A Sweet Spot Operating Temperature For A Data Center

Put the human back in Human Resources.

Understanding, Modelling and Improving the Software Process. Ian Sommerville 1995 Software Engineering, 5th edition. Chapter 31 Slide 1

EMC EXAM - E Backup and Recovery - Avamar Specialist Exam for Storage Administrators. Buy Full Product.

Wikibon Storage Projections to an All-flash Datacenter in 2016

Hierarchical Approach for Green Workload Management in Distributed Data Centers

ADVANCED DEDUPLICATION CONCEPTS. Larry Freeman, NetApp Inc Tom Pearce, Four-Colour IT Solutions

The New Virtualization Management. Five Best Practices

Design Best Practices for Data Centers

Trends in Application Recovery. Andreas Schwegmann, HP

Veeam Best Practices with Exablox

UNDERSTANDING DATA DEDUPLICATION. Thomas Rivera SEPATON

Parallels Virtuozzo Containers

Office of the Government Chief Information Officer. Green Data Centre Practices


A Survey on Deduplication Strategies and Storage Systems

Data Center Technology: Physical Infrastructure

Optimizing Ext4 for Low Memory Environments

A VANTAGE DATA CENTER INFOPAPER. The Open Compute Project + Your Data Center: What you need to know

BACKUP IS DEAD: Introducing the Data Protection Lifecycle, a new paradigm for data protection and recovery WHITE PAPER

REDUCING DATA CENTER POWER CONSUMPTION THROUGH EFFICIENT STORAGE

Hitachi Content Platform as a Continuous Integration Build Artifact Storage System

THE SUMMARY. ARKSERIES - pg. 3. ULTRASERIES - pg. 5. EXTREMESERIES - pg. 9

Overview of Green Energy Strategies and Techniques for Modern Data Centers

International Journal of Applied Science and Technology Vol. 2 No. 3; March Green WSUS

Real-time Compression: Achieving storage efficiency throughout the data lifecycle

Open Data Center Alliance Usage: VIRTUAL MACHINE (VM) INTEROPERABILITY

Open Data Center Alliance Usage: VIRTUAL MACHINE (VM) INTEROPERABILITY IN A HYBRID CLOUD ENVIRONMENT REV. 1.1

Backup & Recovery for VMware Environments with Avamar 6.0

Data Center Infrastructure Innovation

EMC VNXe File Deduplication and Compression

Agent vs. Agent-less auditing

Lessons from the field: Implementing Information Governance and Records Management with Microsoft SharePoint

SOLUTION BRIEF. Resolving the VDI Storage Challenge

REDUCING DATA CENTER POWER CONSUMPTION THROUGH EFFICIENT STORAGE

IT Economics: A Structured Approach to Identify, Measure and Reduce IT Costs. David Merrill Chief Economist Hitachi Data Systems

SoSe 2014 Dozenten: Prof. Dr. Thomas Ludwig, Dr. Manuel Dolz Vorgetragen von Hakob Aridzanjan

Best Practices for Architecting Storage in Virtualized Environments

Nutanix Tech Note. Data Protection and Disaster Recovery

Integrated Development and Assessment of New Thermoplastic High Voltage Power Cable Systems

Data Deduplication in Tivoli Storage Manager. Andrzej Bugowski Spała

Native Data Protection with SimpliVity. Solution Brief

Nutanix Tech Note. Configuration Best Practices for Nutanix Storage with VMware vsphere

Optimizing Storage for Better TCO in Oracle Environments. Part 1: Management INFOSTOR. Executive Brief

Dr.R.Anbuselvi Assistant Professor

Introducing Graves IT Solutions Online Backup System

New Features in PSP2 for SANsymphony -V10 Software-defined Storage Platform and DataCore Virtual SAN

Green Data Centers. Jay Taylor Director Global Standards, Codes and Environment (512)

Harvesting Solar Energy

Introducing: Infrascale VMware Backup

How To Use Softxpand (A Thin Client) On A Pc Or Laptop Or Mac Or Macbook Or Ipad (For A Powerbook)

Measuring Power in your Data Center: The Roadmap to your PUE and Carbon Footprint

Chapter 07. Spare Parts Inventory Management

Transcription:

Sustainability Issues in Digital Preservation Krishna Kant George Mason University National Science Foundation Digital Preservation 2013 Alexandria, VA, July 24, 2013

Can Preservation be Sustainable? Significant environmental footprint. Storage media life-cycle (mfg, distribution, ) Housing & access management of media Processing & IO infrastructure (Data Centers)

What do we want to preserve? Not data, but information Ideally, knowledge K. Kant, Sustainability of Digital Preservation

Sustainability Issues Make knowledge derivation more sustainable Retain only essential data Minimize environmental impact of data centers Remove duplicate, inessential data Storage vs. reprocessing Has sustainability tradeoffs K. Kant, Sustainability of Digital Preservation

Data Center Impact Power consumption rising. Most of it wasted: But impact is more than power Power distribution, Cooling Idle Machines Materials, water, manufacturing, Sustainability perspective Energy doesn t matter, its carbon footprint does

IT LOAD 6% loss 94% efficient ~1% loss in switch gear and conductors 480V 13.2kv 13.2kv 0.3% loss 99.7% efficient 208V 13.2kv 115kv UPS: 2.5MW Generator ~180 Gallons/hour 0.5% loss 99.5% efficient 1.0% loss 99.0% efficient 9-10% distribution loss at power source Lots of earth s resources used (metals, rare earths, ) K. Kant, Sustainability of Digital Preservation

Renewable Energy Powered IT? Limit energy draw from grid Less infrastructure & losses, but variable supply Impact on performance, QoS, SLA, Challenges Variability at multiple time scales Reliability issues Need better power adaptability

Cooling Infrastructure

High Temperature Operation Chiller-less data centers Less energy/materials, but space inefficient High temperature operation of comm. /computing equipment Smaller Toutlet Tinlet Deal with occasionally hitting temp. limits. Need smarter thermal adaptability

Overdesign Huge UPS, Generators, dist. frames, power supplies, fans, Engineered for worst case Huge waste of power, materials, Power Supply & VRs Low utilizations Low efficiency Need Better Power Infra. adaptability

Energy Adaptive Computing Overdesign Rightsizing + smart adaptation Adaptation to energy/power/thermal/cooling limitations. Dynamic adaptation of infrastructure & workloads Need coordination across compute, network & storage.

Data Growth Exponential growth in both generation & retention Data vs. useful data More data More junk (Less information) Duplication, Keep just in case, too lazy to purge, But, can we define useful?

Sustainability Impact of Data Increased power consumption 10-40% of total power Insatiable drive demand Life cycle impact Cumulative impact because of little deletion Not sustainable!

Data Reduction Opportuninites n ai t, e m en o D m tiv e,d ra ter t is cen in a at m d d A me, Ho ct e t, bj sho t ar p ct t, e bj sho O ap B, Sn /D le ct e ot, j b sh A om dm O e, in Sn b da is Fi a je ta tr le ps c /D h t ce a B, ot, nt ti er ve,d O Sn b ep Do Fi a je ar m le ps c tm a /D h t o en in B, t, t, O Sn b Fi a je le ps c /D h t B, ot, H Fi O ap B, Sn /D le Fi O ap B, Sn /D le Fi Administrative Domain Home, data center, department, Object Object Object Snapshot, File/DB, Snapshot, File/DB, Snapshot, File/DB, K. Kant, Sustainability of Digital Preservation

Data Reduction Within an object Within and across administrative domain Compression, compressive sampling, delta encoding, remove bundled VM Deduplication across objects & storage nodes Cloud based storage/access/deduplication Only a few copies (collectively) across nodes Tradeoffs Storage vs. data movement vs. processing Fidelity vs. cost (reduced representations) Similar, filtered, derived, Cross domain access/privacy/security issues

Role of Content Creators Best Practices Send link instead of content Don t create local copy Purge obsolete, defective, unneeded data Data is valuable, meta-data is precious! Designed, not an afterthought! Strong association with data Must reflect data quality Preservation more crucial than for data K. Kant, Sustainability of Digital Preservation

Thank you! K. Kant, Sustainability of Digital Preservation

Other Sustainability Aspects Physical degradation of media Will require keeping media healthy for exponentially increasing data Unsustainable Media obsolescence HW & SW obsolescence Increasing amount of materials to manufacture storage media Unable to power the media Loss of meta-data or inadequate meta-data Have the data but don t know how to use it/ K. Kant, Sustainability of Digital Preservation