METAARCHIVE & CLOUD COMPUTING
Bill Robbins, System Administrator, MetaArchive Cooperative

Central Server Functions
MetaArchive Central Servers
- LOCKSS caches require a title database
- LOCKSS caches require plugins
- MetaArchive has created a conspectus
- These common needs are hosted on central servers
- Other centralized/common tools are also hosted on the properties server(s): Wiki, Cache Manager
Why Cloud Computing
- Unattached central servers: the central properties server of the cooperative should belong to the whole, not be the property of a single member.
- Make a LOCKSS cache available to prospective members who do not have sufficient technical capacity.
Division of Central Server Functions
- Functions were separated on two cloud servers and some third party service providers.
- One server contains functions & material under the public web address, a.k.a. the WWW server.
- Another server contains functions for network operations, a.k.a. the Admin server.
- A test network is supported, as well as the production network.
WWW Server
- Public Web Site
- Drupal
- Wikis
- MetaWiki
- Inter-Wiki
- Subversion Source Code Control
- Trac Ticketing System
Admin Server
- Title Database - production + test
- Plugin Repository - production + test
- Keystore
- Validate the plugins
- SSL Key Generation
- Nagios - Monitoring
- Cache Manager Live View (If time)
- Conspectus Live View (If time)
- LOCKSS software Repository
Cloud Computing Overview
These are the functions hosted in the Amazon Cloud. Standard server functions such as MySQL and Apache are left out for simplicity.
Third Party Functions
- Email
- XML Code Repository
- Listserv
Roles/Accounts - Discussion
- Referring to new member package, who needs what?
- Different Responsibilities
- Expected Duties
Selection of Amazon Cloud Computing for MetaArchive Properties Server
- Quick Review of Why Cloud Computing
- Desired Features
  - Maximum Flexibility to Server Configuration
  - Predictable Pricing Model
  - Well Established Service Providers
- Selection: Amazon EC2
Amazon Pricing Model
- Finite Virtual Hardware Configurations
- Finite O/S availability, created as AMI packages
- Pay as you go for storage
- Pay as you go for data transfer
- Minimal charges for storage I/O
- Computing is flat rate, with reduced pricing for a one year commitment.
- MetaArchive has a few months of actual usage history
- Cost estimates were quite accurate
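The pricing components listed above (flat-rate computing plus pay-as-you-go storage, data transfer, and storage I/O) can be combined into a rough monthly estimate. The following is only a sketch: the function name, parameter names, and the sample rates in the usage lines are hypothetical placeholders, not Amazon's actual prices or MetaArchive's actual bill.

```python
def estimate_monthly_cost(instances, compute_rate, storage_gb, storage_rate,
                          transfer_gb, transfer_rate,
                          io_millions=0.0, io_rate=0.0):
    """Return a per-category breakdown of one month's charges, in dollars.

    All rates are supplied by the caller; none are real Amazon prices.
    compute_rate is the flat monthly charge per running instance.
    """
    breakdown = {
        "compute": instances * compute_rate,        # flat rate per instance
        "storage": storage_gb * storage_rate,       # pay as you go
        "transfer": transfer_gb * transfer_rate,    # pay as you go
        "storage_io": io_millions * io_rate,        # usually a minimal charge
    }
    breakdown["total"] = sum(breakdown.values())
    return breakdown


if __name__ == "__main__":
    # Hypothetical figures purely for illustration.
    estimate = estimate_monthly_cost(
        instances=3, compute_rate=50.0,
        storage_gb=100, storage_rate=0.5,
        transfer_gb=20, transfer_rate=0.25,
    )
    for category, dollars in estimate.items():
        print(f"{category}: ${dollars:.2f}")
```

Keeping the categories separate makes it easy to compare an estimate against a few months of actual usage, one line item at a time.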
Actual Cost for Central Servers (Sample for three running instances)
3 Server Cost - Monthly
Storage costs
Amazon Flexibility
- AMI Packages
- Start with a Base, then configure as needed.
- Restart a complete image if needed.
Cost Analysis for Using Cloud Servers as a LOCKSS Cache
- Too Pricy
  - Vendors not trying hard enough
  - TBs of storage become too expensive
  - Constant network traffic becomes too expensive
- MetaArchive Caches
  - Replace hardware on a 3-year cycle
  - Latest server < $4,500
    - 16 TB disk
    - 6 GB RAM
    - Quad-core CPU
    - O/S: CentOS (free Red Hat Linux clone)
Cache Price Points
What would it take, cost-wise, to afford running a LOCKSS cache in a cloud computing environment?
- $100/mo.: Definitely
- $150/mo.: Probably
- $200/mo.: Maybe
- $250/mo.: Too expensive
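For comparison, the server figures from the cost analysis slide (under $4,500, replaced on a 3-year cycle) work out to at most $125 per month for hardware alone. The sketch below shows that arithmetic and checks a monthly figure against the price points. Two assumptions are not in the slides: each dollar figure is treated as the upper edge of its band, so anything above $200 counts as too expensive, and power, hosting, bandwidth, and staff time are ignored. The function and variable names are illustrative.

```python
def amortized_monthly_cost(purchase_price, cycle_years):
    """Spread a one-time hardware purchase evenly over its replacement cycle."""
    return purchase_price / (cycle_years * 12)


# Price points from the slide, read as band ceilings (an assumption).
PRICE_POINTS = [
    (100, "Definitely"),
    (150, "Probably"),
    (200, "Maybe"),
]


def verdict(monthly_cost):
    """Return the slide's label for a given monthly cost in dollars."""
    for ceiling, label in PRICE_POINTS:
        if monthly_cost <= ceiling:
            return label
    return "Too expensive"


if __name__ == "__main__":
    hardware = amortized_monthly_cost(4500, 3)  # upper bound: server was < $4,500
    print(f"Owned hardware: ${hardware:.2f}/mo -> {verdict(hardware)}")
```

A cloud-hosted cache would need to land in a similar range, including terabytes of storage and constant network traffic, to compete with owned hardware on this measure.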
Summary of Experience with Amazon EC2
Amazon EC2 Strengths
- RAPID Deployment
- Perfect for Temporary Needs
- Consistent Build Outs
- EZ firewall configuration
Summary of Experience with Amazon EC2
Amazon EC2 Drawbacks
- Storage Capability Poorly Documented
- Simple Storage Too Simple
- Complete examples of servers non-existent (backup, drive mounting, & others)
- Code library for working with storage is not vendor-provided (lack of documentation and working examples)
Summary
- Cloud computing is working well for MetaArchive's central properties servers
- Vendors need to do better in offering a full set of tools for server operations

Contact Info:
Bill Robbins
MetaArchive Cooperative
1230 Peachtree Street, Suite 1900
Atlanta, GA 30309
http://www.metaarchive.org
bill.robbins@metaarchive.org or katherine.skinner@metaarchive.org
Side Note: The Next Level for Open Source Software
Amazon EC2 can be used as a way to bring Open Source projects to the next level. A public server image can be made available to run software developed for LoC, other government agencies, or non-profit institutions. There would be no software to download and/or build, and no additional server configuration needed.