Research IT Office
Data Storage Options for Research
By Ashok Mudgapalli, Director of Research IT
Agenda
- Current Research Data Storage
- Current Data Backup Strategies
- Available Storage Solution: Enterprise
- Cloud Storage Solution: BOX
- Offsite Storage Solution: PKI
- Offsite Storage Solution: XSEDE
- Recommended Best Practices
- Science DMZ
- Q&A
Current Research Data Storage
- Local computers and workstations
- External hard drives and portable drives
- Thumb drives
- Departmental storage servers
- Dropbox cloud solution
- External vendor solutions from Google Docs, Amazon, and others
Current Data Backup Strategies
- Minimal to none
- The definition of data backup is not well understood
- Data files are copied to local external hard drives, home office computers, portable hard drives, and departmental servers
- Some researchers send portable hard drives containing research data to collaborators outside the state as part of their backup plan
Available Storage Solution: Enterprise
- It is a HIPAA-compliant environment for PHI data
- The Enterprise Research Storage server is hosted at the UNMC Data Center, and the backup server is located in Lincoln
- Each server has 100 TB of usable space
- Enterprise storage is offered (with replication and backup) at a cost of $900/TB/year
- Each research faculty member is offered a jump-start storage space of 25 GB at no cost
- Backup is performed once a day; replication is done live
Cloud Storage Solution: BOX Cloud
- Expected to be available by the first part of 2014
- Cloud-based solution, suitable for collaborative exchange of research data, files, and documents
- No single file can be larger than 5 GB
- It costs $420/user/year with unlimited space
- It is a HIPAA-compliant environment, and the researcher can store PHI data
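The 5 GB per-file limit above can be checked before data is moved to the cloud. The following is a minimal sketch, not an official tool: the function name `files_over_limit` and the constant `BOX_MAX_FILE_BYTES` are hypothetical, and it assumes "5 GB" means 5 × 1024³ bytes, which the slides do not specify.

```python
from pathlib import Path

# Assumption: "5 GB" is read as 5 * 1024**3 bytes; the slides do not say
# whether the limit is decimal (10**9) or binary (1024**3).
BOX_MAX_FILE_BYTES = 5 * 1024**3


def files_over_limit(root, limit=BOX_MAX_FILE_BYTES):
    """Return sorted (path, size_in_bytes) pairs for files under root larger than limit."""
    oversized = []
    for path in Path(root).rglob("*"):
        if path.is_file():
            size = path.stat().st_size
            if size > limit:
                oversized.append((path, size))
    return sorted(oversized)
```

Running `files_over_limit("my_project_folder")` on a research folder (the folder name is only an illustration) lists the files that would need to be split or stored elsewhere.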
Offsite Storage Solution: PKI (Holland Computing Center)
- Expected to be available by the first part of 2014
- It will be a non-HIPAA-compliant environment, suitable only for non-PHI data
- The primary server will be at PKI, and the backup server will be in Lincoln
- The necessary support and space will be provided by HCC IT
- Each server will have 80 TB of storage space
- The storage space will be offered at $400/TB/year
- Data backup will be enforced
Offsite Storage Solutions: XSEDE
- Extreme Science and Engineering Discovery Environment
- Massive cyberinfrastructure set up using NSF funding ($121 million in grants)
- The infrastructure is spread across the US at 17 supercomputing institutions
- The infrastructure offers HPC, HTC, visualization, and storage services
- All services are offered at NO COST to the researcher
Recommended Best Practices for Research Data Storage
- All research data should have backup copies
- Backing up data to local computers, external hard drives, or portable devices is not considered a true backup
- A true backup involves having a copy of your data at a remote location
- Minimize the use of external hard drives, portable drives, and thumb drives
- If thumb drives need to be used, make sure that you are using encryption-grade drives
Recommended Best Practices for Research Data Storage
- Raw research data can be stored on external hard drives (password protected and encrypted) for a short duration
- Backup data should be stored on a network storage area
- Back up your research data at least once a week
- Avoid storing critical research data on external hard drives and mobile devices
- If an external hard drive fails, it is almost impossible to recover the data intact
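The weekly-backup recommendation and the idea of a verified copy on network storage can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, the directories stand in for a local data folder and a mounted network share, and comparing SHA-256 checksums is one common way to confirm that a copy is intact, not a method prescribed by the slides.

```python
import hashlib
from datetime import datetime, timedelta
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def unverified_files(source_dir, backup_dir):
    """Return relative paths of source files that are missing from the backup or differ."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = backup_dir / rel
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            problems.append(rel)
    return problems


def backup_is_overdue(last_backup, now=None, max_age=timedelta(days=7)):
    """Return True if more than max_age has passed since last_backup (one week by default)."""
    now = now or datetime.now()
    return now - last_backup > max_age
```

An empty list from `unverified_files` means every file in the source folder has a byte-identical copy in the backup location at the time of the check.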
Recommended Best Practices for Research Data Storage
- Store confidential information, including PHI, on computer network drives instead of local drives
- Do not store confidential information or private data on mobile devices unless they are password protected and encrypted
- Finally, NEVER EVER SHARE YOUR PASSWORD WITH ANYONE
| Storage Option | PHI or HIPAA | Cost / year | Availability | Comments |
| Enterprise | Yes | $900 / TB | Yes | Backup / Replication, free 25 GB space |
| PKI | No (only for non-PHI data) | $400 / TB | Jan 2014 | Backup / Replication |
| BOX Cloud | Yes | $420 / user, unlimited space | Jan 2014 | Backup / Replication, no single file larger than 5 GB |
| XSEDE | No | FREE | Yes | No Backup and Replication, consider bandwidth |
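The comparison above can be turned into a rough annual cost estimate. The sketch below uses only the prices listed in the slides; the names `StorageOption`, `enterprise_cost`, and `compare` are hypothetical, and it assumes that Enterprise charges are pro-rated linearly, that the free 25 GB is deducted from the billable amount, and that 1 TB = 1000 GB. None of these billing details are stated in the slides. The 5 GB per-file limit on BOX Cloud and bandwidth considerations for XSEDE are not modeled.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class StorageOption:
    name: str
    phi_allowed: bool
    annual_cost: Callable[[float, int], float]  # (terabytes, users) -> dollars per year


def enterprise_cost(terabytes, users):
    # Assumption: linear pro-rating, free 25 GB deducted, 1 TB = 1000 GB.
    billable_gb = max(0.0, terabytes * 1000 - 25)
    return billable_gb * 900 / 1000


OPTIONS = [
    StorageOption("Enterprise", True, enterprise_cost),
    StorageOption("PKI", False, lambda tb, users: 400 * tb),
    StorageOption("BOX Cloud", True, lambda tb, users: 420 * users),
    StorageOption("XSEDE", False, lambda tb, users: 0),
]


def compare(terabytes, users, needs_phi):
    """Return {option name: annual cost} for options that may hold the data, cheapest first."""
    costs = {
        option.name: float(option.annual_cost(terabytes, users))
        for option in OPTIONS
        if option.phi_allowed or not needs_phi
    }
    return dict(sorted(costs.items(), key=lambda item: item[1]))
```

For example, `compare(2, 3, needs_phi=True)` considers only the two options listed as suitable for PHI data, while `needs_phi=False` ranks all four.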
How Does the Research Data Move?
Conclusion
- Portable storage devices should be used only for short-term data storage; avoid using them to store critical research data
- Always keep a data backup at a remote location
- The Research IT Office offers data storage solutions that are much cheaper than those of outside vendors
- To request research data storage space, please visit our website at http://www.unmc.edu/vcr/data_mgmt.htm