
The OpenStack™ Object Storage system
Deploying and managing a scalable, open-source cloud storage system with the SwiftStack Platform
By SwiftStack, Inc.
contact@swiftstack.com

Contents

Introduction
Introducing Swift
Swift Characteristics
Swift is Scalable
Swift is Extremely Durable
Swift is Open Source Software
Swift is Similar to AWS S3
Swift is Built on Industry-standard Components
Swift Can Be Deployed In-House or As-a-Service
Using Swift
Swift Commands - The Basics
Client Libraries
How Swift Works
Building Blocks
Proxy Servers
The Ring
Zones: Failure Boundaries
Accounts & Containers
Partitions
Replication
How these are all tied together
Swift Cluster Architecture
Access Tier
Storage Nodes
Configuring Networking
Large-Scale Networking
Medium-Scale Networking
Management Network

SwiftStack, Inc. All rights reserved.

Hardware Recommendations
Proxy Nodes
Storage Nodes
Networking
Sizing Your Swift Cluster
Planning a Deployment
Datacenter Facility Planning
Integrations
Monitoring
Administration & Maintenance
Understanding TCO
Rolling Swift Out in Your Organization
Managing Swift with SwiftStack
Like to Learn More?

Introduction

In this era of connected devices, the demands on storage systems are increasing exponentially. Users are producing and consuming more data than ever. Social media, online video, user-uploaded content, gaming, and software-as-a-service applications all contribute to the vast need for easily consumable storage systems that can grow without bounds. To accommodate these demands, storage systems must be able to handle web-scale workloads with many concurrent readers and writers to a data store. Some data is frequently written and retrieved, such as database files and virtual machine images. Other data, such as documents, images, and backups, is generally written once and rarely accessed. Web and mobile data assets also need to be accessible over the web via a URL to support today's web and mobile applications. A one-size-fits-all data storage solution is therefore neither practical nor economical.

Public cloud storage systems have risen to the task of handling web-scale workloads. Cloud storage-as-a-service offerings include Amazon.com's Simple Storage Service (S3) and Rackspace's Cloud Files, both of which have grown tremendously in usage. For instance, in April 2010, Amazon.com reported that 100 billion objects were stored in S3, a 100% growth from the previous year. In October 2011, Amazon.com reported that 566 billion objects were stored in S3. Technology trend analyzers at BuiltWith.com now track more than 60,000 websites that serve content directly from S3.

However, not every organization will use a public storage cloud such as S3, for cost, regulatory, or control reasons. With the OpenStack Object Storage system, also known as Swift, there is now an open-source option for organizations needing a highly durable object storage system that is deployed on industry-standard hardware in their own datacenter. These systems can serve as the core of a private storage cloud or of public storage-as-a-service offerings.

The purpose of this white paper is to help those who are considering deploying an object storage system based on OpenStack Swift. It complements the official Swift documentation, which is available online. Not every topic related to getting Swift up and running in your environment is covered in this white paper, but it provides an overview of the key areas to be aware of, including what Swift is, how it works, how to deploy, manage and monitor Swift with the SwiftStack Platform, how to use Swift, and some general deployment considerations.

Introducing Swift

Swift is a multi-tenant, highly scalable and durable object storage system that is designed to store large amounts of unstructured data at low cost. Highly scalable means that it can scale from a few nodes and a handful of drives to thousands of machines with multiple petabytes of storage. Swift is designed to be horizontally scalable; there is no single point of failure.

Swift is used by Fortune 500 enterprises, web companies and service providers worldwide. It is typically used to store unstructured data such as documents, web content, backups, images and virtual machine snapshots. Originally developed as the engine behind Rackspace Cloud Files, it was open-sourced under the Apache 2 license as part of the OpenStack project. With more than 100 companies and thousands of developers now participating in the OpenStack project, the usage of Swift is increasing rapidly.

Swift is not a traditional file system or a raw block device. Instead, it enables you to store, retrieve and delete objects (with their associated metadata) in containers ("buckets" in Amazon S3 terminology) via a RESTful HTTP API. Developers can either write directly to the Swift API or use one of the many client libraries that exist for all popular programming languages, such as Java, Python, Ruby and C#.

Amazon S3 and Rackspace Cloud Files users should feel very familiar with Swift. Users who have not used an object storage system before will need a different approach and mindset than with a traditional filesystem.

Benefits of Swift for developers include:
- Data can be directly served over the Internet
- RESTful HTTP interface
- Access to storage in minutes, not days
- One multi-tenant storage system for all your apps

- Focus on app development, not infrastructure plumbing
- A rich ecosystem of tools and libraries

Benefits to IT operations teams include:
- Use low-cost, industry-standard servers and disks
- Manage more data and use cases with ease
- Enable new applications quickly
- Highly durable architecture with no single point of failure

Swift Characteristics

Swift can't be mounted like a folder in your operating system. There is no random access within a file's content and there can be multiple concurrent writers, which makes it unsuitable for transactional applications such as traditional relational databases, for which a Storage Area Network (SAN) or Network Attached Storage (NAS) system may be a better fit. Also, since object storage systems don't provide raw data blocks that an operating system can form into a filesystem, Swift is unsuitable for booting an operating system.

The key characteristics and benefits of Swift include:
- All objects have a URL
- All objects have their own metadata
- Developers interact with the object storage system through a RESTful HTTP API
- Object data can be located anywhere in the cluster
- The cluster scales by adding nodes without sacrificing performance, which allows a more cost-effective linear storage expansion compared with fork-lift upgrades
- Data doesn't have to be migrated to an entirely new storage system
- New nodes can be added to the cluster without downtime
- Failed nodes and disks can be swapped out with no downtime
- Runs on industry-standard hardware, such as Dell, HP, Supermicro etc.

Swift is Scalable

To support thousands of concurrent users, today's application architects must take advantage of the latest in distributed architectures, using distributed NoSQL databases (CouchDB, Cassandra, MongoDB), distributed message/queuing systems (ActiveMQ, RabbitMQ) and distributed processing systems like Hadoop. To that end, application architects need their storage system to scale along with their application.

Available space isn't a useful statistic on its own. A key benchmark is the storage system's concurrency. The ability to handle a great number of simultaneous

connections from within a datacenter or across the web is critical to satisfy the needs of applications that are built for web-scale usage.

Swift is designed to have linear growth characteristics. As the system grows in usage and the number of requests increases, performance doesn't degrade. To scale up, the system is designed to grow where needed by adding storage nodes to increase storage capacity, adding proxy nodes as requests increase, and growing network capacity where choke points are detected.

Swift is Extremely Durable

Built on an architecture similar to Amazon S3, which publishes a very high durability figure, Swift is extremely durable. To achieve this level of durability, objects are distributed in triplicate across the cluster. A write must be confirmed in two of the three locations to be considered successful. Auditing processes run to ensure the integrity of data. Replicators run to ensure that a sufficient number of copies are in the cluster. In the event that a device fails, data is replicated throughout the cluster to ensure that three copies remain.

Another feature is the ability to define failure zones. Failure zones allow a cluster to be deployed across physical boundaries, each of which could individually fail. For example, a cluster could be deployed across several nearby data centers, enabling it to survive multiple datacenter failures.

The servers that handle incoming API requests scale up just like any front-end tier for a web application. The system uses a shared-nothing approach and employs the same proven techniques that many web applications have used to provide high availability.

Swift is Open Source Software

Swift is licensed under the permissive Apache 2 open source license. What makes Swift different from most other open source projects is that the software had already been stress-tested in a large-scale production deployment at Rackspace before its first public release.

As an open source project, Swift provides the following benefits to its users:
1. No vendor lock-in: Because Swift is open source, you have the option to work with a variety of providers or to run it as a do-it-yourself project
2. Community support: You can access and share tools, best practices and deployment know-how with other organizations and community members that are using Swift

3. Large ecosystem: With the large number of organizations and developers participating in the OpenStack project, the development velocity and breadth of tools, utilities and services for Swift will only increase over time

As the source code is publicly available, it can be reviewed by many more developers than is the case for proprietary software. This means that potential bugs also tend to be more visible and more rapidly corrected than in proprietary software. In the long term, open generally wins, and Swift might be considered the Linux of storage.

Swift is Similar to AWS S3

Access to the Swift object storage system is entirely through a REST API, which is similar to the Amazon.com S3 API and compatible with the Rackspace Cloud Files API. This means that (a) applications that are currently using S3 can use Swift without major refactoring of the application code and (b) applications that want to take advantage of both private and public cloud storage can do so, as the APIs are comparable.

Since Swift is comparable with public cloud services, developers and systems architects can also take advantage of the rich ecosystem of commercial and open-source tools available for these object storage systems. Clients such as Cyberduck, filers like Nasuni, and programming libraries available in C#, PHP, Perl, Python, Java, and Ruby are just some examples.

Swift is Built on Industry-standard Components

If you look under the hood, Swift is built using mature, standard components such as rsync, MD5, SQLite, memcached, XFS and Python. Swift runs on off-the-shelf Linux distributions such as Ubuntu, unlike most other storage systems, which run on proprietary or highly customized operating systems.

From a hardware perspective, Swift is designed from the ground up to handle failures, so reliability at the individual component level is less critical. Thus, regular desktop drives can be used in a Swift cluster rather than more expensive enterprise drives. Hardware quality and configuration can be chosen to suit the tolerances of the application and the ability to replace failed equipment.

Since commodity hardware can be used with the system, there is no lock-in with any particular storage vendor. This means deployments can continually take advantage of decreasing hardware prices and increasing drive capacity.

Swift Can Be Deployed In-House or As-a-Service

For organizations uncomfortable with outsourcing their data storage to a public cloud storage vendor, Swift can help achieve lower costs and similar performance while retaining greater control over network access, security, and compliance.

Cost is also a major factor for bringing cloud storage in-house. Public cloud storage costs include per-GB pricing plus data transit charges, which can become very expensive.

The network latency to public storage service providers may be unacceptable. A private deployment can provide lower-latency access to storage, as required by many applications. Also, applications may have large volumes of data in flight, which can't go over the public Internet.

For the above reasons, organizations can use Swift to build an in-house storage system that has similar durability and accessibility properties and is compatible with the suites of tools available for public cloud storage systems.

Swift can also be deployed as a public storage-as-a-service. Operators and service providers who want to offer S3-like storage services can get started by offering Swift as a value-added service to their customers. With Swift, it is now cost-effective to build and deploy an object storage cluster for public use.

Swift Is Supported

As an OpenStack project, Swift has the benefit of a rich community, which includes more than 100 participating companies and developers. The following support options are available for Swift:
- Commercial support and tools are available through SwiftStack, which has experience deploying, running and supporting Swift at scale
- Community support is provided through the OpenStack community, where best practices can be shared with other organizations and users that are using Swift
- Swift's documentation is publicly available online

Using Swift

Once deployed, all communication with Swift is done over a RESTful HTTP API. Application developers who'd like to take advantage of Swift for storing content, documents, files, images etc. can use one of the many client libraries that exist for all popular programming languages, including Java, Python, Ruby, C# and PHP. Existing backup, data protection and archiving applications that currently support either Rackspace Cloud Files or Amazon S3 can also use Swift as their storage back-end with minor modifications.

Swift Commands - The Basics

As Swift has a RESTful API, all communication with Swift is done over HTTP, using the HTTP verbs to signal the requested action. A Swift storage URL has four basic parts:
- Base: swift.example.com/v1/
- Account: An account is determined by the auth server when the account is created.
- Container: Containers are namespaces used to group objects within an account.
- Object: Objects are where the actual data is stored in Swift. Object names may contain /, so pseudo-nested directories are possible.

To get a list of all containers in an account, use the GET command on the account URL.

To create a new container, use the PUT command with the name of the new container.

To list all objects in a container, use the GET command on the container URL.
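The four URL parts and the verbs above can be sketched in a few lines of Python. This is an illustration only: the `https` scheme, the account name `AUTH_demo`, the container `photos` and the object name are assumptions made for the example, and `storage_url` is a hypothetical helper, not part of any Swift client library.

```python
from urllib.parse import quote

# Base taken from the text; the https scheme is an assumption.
BASE = "https://swift.example.com/v1"


def storage_url(account, container=None, obj=None):
    """Join base / account / container / object into one storage URL."""
    parts = [BASE, quote(account, safe="")]
    if container is not None:
        parts.append(quote(container, safe=""))
        if obj is not None:
            # "/" stays unescaped so pseudo-nested object names remain readable.
            parts.append(quote(obj, safe="/"))
    return "/".join(parts)


# Verb and URL pairs for the basic operations (all names are hypothetical).
operations = [
    ("GET", storage_url("AUTH_demo")),            # list containers in the account
    ("PUT", storage_url("AUTH_demo", "photos")),  # create a container
    ("GET", storage_url("AUTH_demo", "photos")),  # list objects in the container
    ("PUT", storage_url("AUTH_demo", "photos", "2012/trip/beach.jpg")),  # store an object
]

for verb, url in operations:
    print(verb, url)
```

The helper only builds strings; sending the requests would additionally need an authentication token from the auth server.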

To create a new object, use the PUT command on the object URL.

The POST command is used to change metadata on containers and objects.

Client Libraries

Several client libraries for Swift are available, including cloudfiles libraries for:
- C#/.NET
- Java
- PHP
- Python
- Ruby

In addition, a Ruby library is available through the fog client, and the Fuse client can be used to map a filesystem to Swift. More information on building client libraries for Swift is available in the cloudfiles documentation.
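For readers who prefer to write directly to the API instead of using a client library, the requests in this section might be assembled as follows with Python's standard library. This is a sketch under stated assumptions: the storage URL and token are placeholders, `swift_request` is a hypothetical helper, and nothing is sent over the network. The `X-Auth-Token` header and the `X-Object-Meta-` prefix follow the Swift API's conventions for authentication and custom object metadata.

```python
from urllib.request import Request

# Hypothetical values; a real storage URL and token come from the auth server.
STORAGE_URL = "https://swift.example.com/v1/AUTH_demo"
AUTH_TOKEN = "example-token"


def swift_request(method, path="", headers=None, body=None):
    """Build (but do not send) an authenticated request for a Swift path."""
    merged = {"X-Auth-Token": AUTH_TOKEN}
    merged.update(headers or {})
    url = f"{STORAGE_URL}/{path}" if path else STORAGE_URL
    return Request(url, data=body, headers=merged, method=method)


# PUT stores the object's bytes under container/object.
upload = swift_request("PUT", "photos/beach.jpg", body=b"example image bytes")

# POST changes metadata only; custom object metadata uses the X-Object-Meta- prefix.
tag = swift_request("POST", "photos/beach.jpg",
                    headers={"X-Object-Meta-Location": "Hawaii"})

for req in (upload, tag):
    print(req.get_method(), req.full_url)
```

Passing one of these objects to `urllib.request.urlopen` would perform the call against a real cluster.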

How Swift Works

Building Blocks

The components that enable Swift to deliver high availability, high durability and high concurrency are:
- Proxy Servers: Handle all incoming API requests.
- Rings: Map logical names of data to locations on particular disks.
- Zones: Each Zone isolates data from other Zones. A failure in one Zone doesn't impact the rest of the cluster because data is replicated across the Zones.
- Accounts & Containers: Each Account and Container is an individual database that is distributed across the cluster. An Account database contains the list of Containers in that Account. A Container database contains the list of Objects in that Container.
- Objects: The data itself.
- Partitions: A Partition stores Objects, Account databases and Container databases. It's an intermediate bucket that helps manage locations where data lives in the cluster.

Proxy Servers

The Proxy Servers are the public face of Swift and handle all incoming API requests. Once a Proxy Server receives a request, it determines the storage node based on the URL of the object. The Proxy Servers also coordinate responses, handle failures and coordinate timestamps.

Proxy servers use a shared-nothing architecture and can be scaled as needed based on projected workloads. A minimum of two Proxy Servers should be deployed for redundancy. Should one proxy server fail, the others will take over.

The Ring

The Ring maps Partitions to physical locations on disk. When other components need to perform any operation on an object, container, or account, they need to interact with the Ring to determine its location in the cluster.

The Ring maintains this mapping using zones, devices, partitions, and replicas. Each partition in the Ring is replicated three times by default across the cluster, and the

locations for a partition are stored in the mapping maintained by the Ring. The Ring is also responsible for determining which devices are used for handoff should a failure occur.

[Figure: The Ring maps partitions to physical locations on disk.]

Zones: Failure Boundaries

Swift allows zones to be configured to isolate failure boundaries. Each piece of data resides in multiple zones. At the smallest level, a zone could be a single drive or a grouping of a few drives. If there were five object storage servers, then each server would represent its own zone. Larger deployments would have an entire rack (or multiple racks) of object stores, each representing a zone. The goal of zones is to allow the cluster to tolerate significant outages of storage servers.

As noted earlier, everything in Swift is stored, by default, three times. Three zones may seem sufficient for holding three copies of data, but consider the case when a zone goes down. There would be no fourth zone into which data could be replicated, leaving only two copies of all stored data. Therefore, it is recommended that at least four zones, and preferably five, be deployed.

If a zone goes down, data will be replicated to other zones. Having at least five zones leaves enough margin to accommodate the occasional Zone failure and enough capacity to replicate data across the system.

[Figure: When a disk, node, or zone fails, replica data is distributed to the other zones to ensure there are three copies of the data.]
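The relationship between the Ring, partitions, replicas and zones can be illustrated with a toy model. This is a deliberately simplified sketch, not Swift's ring-builder: it assumes five hypothetical devices (one per zone), a small partition count, round-robin placement, and a partition number taken from the top bits of an MD5 hash of the object path.

```python
import hashlib
import struct

PART_POWER = 8   # 2**8 = 256 partitions; an assumption chosen to keep the example small
REPLICAS = 3     # default replica count described in the text

# Hypothetical devices: one drive in each of five zones.
DEVICES = [{"id": i, "zone": i + 1, "name": f"zone{i + 1}-sdb"} for i in range(5)]


def build_ring():
    """Give every partition REPLICAS devices, each in a different zone."""
    ring = []
    for part in range(2 ** PART_POWER):
        # Round-robin over one-device-per-zone guarantees distinct zones.
        ring.append([DEVICES[(part + r) % len(DEVICES)] for r in range(REPLICAS)])
    return ring


def partition_for(account, container=None, obj=None):
    """Map a logical name to a partition using the top bits of its MD5 hash."""
    path = "/" + "/".join(p for p in (account, container, obj) if p)
    digest = hashlib.md5(path.encode("utf-8")).digest()
    return struct.unpack(">I", digest[:4])[0] >> (32 - PART_POWER)


RING = build_ring()


def nodes_for(account, container=None, obj=None):
    """Primary devices holding the replicas of the named item."""
    return RING[partition_for(account, container, obj)]


def handoffs_for(part):
    """Devices outside the primaries that could take over after a failure."""
    return [d for d in DEVICES if d not in RING[part]]


part = partition_for("AUTH_demo", "photos", "beach.jpg")
print(part, [d["name"] for d in RING[part]], [d["name"] for d in handoffs_for(part)])
```

With only three zones, `handoffs_for` would return an empty list in this model, which mirrors the argument above for deploying at least four and preferably five zones.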

Accounts & Containers

Each account and container is an individual SQLite database that is distributed across the cluster. An account database contains the list of containers in that account. A container database contains the list of objects in that container.

To keep track of object data location, each account in the system has a database that references all its containers, and each container database references each object.

Partitions

A Partition is a collection of stored data, including Account databases, Container databases, and objects. Partitions are core to the replication system.

Think of a Partition as a bin moving throughout a fulfillment center warehouse. Individual orders get thrown into the bin. The system treats that bin as a cohesive entity as it moves throughout the system. A bin full of things is easier to deal with than lots of little things. It makes for fewer moving parts throughout the system.

The system replicators and object uploads/downloads operate on Partitions. As the system scales up, behavior continues to be predictable because the number of Partitions is fixed.

The implementation of a Partition is conceptually simple: a partition is just a directory sitting on a disk with a corresponding hash table of what it contains.

[Figure: Swift partitions contain all data in the system.]

Replication

In order to ensure that there are three copies of the data everywhere, replicators

continuously examine each Partition. For each local Partition, the replicator compares it against the replicated copies in the other Zones to see if there are any differences.

How does the replicator know if replication needs to take place? It does this by examining hashes. A hash file is created for each Partition, which contains hashes of each directory in the Partition. For a given Partition, the hash files for each of the Partition's three copies are compared. If the hashes are different, then it is time to replicate, and the directory that needs to be replicated is copied over.

This is where the Partitions come in handy. With fewer things in the system, larger chunks of data are transferred around (rather than lots of little TCP connections, which is inefficient) and there is a consistent number of hashes to compare.

The cluster has eventually consistent behavior where the newest data wins.

[Figure: If a zone goes down, one of the nodes containing a replica notices and proactively copies data to a handoff location.]

How these are all tied together

To describe how these pieces all come together, let's walk through a few scenarios and introduce the components.

Upload

A client uses the REST API to make an HTTP request to PUT an object into an existing Container. The cluster receives the request. First, the system must figure out where the data is going to go. To do this, the Account name, Container name and Object name are all used to determine the Partition where this object should live.

Then a lookup in the Ring figures out which storage nodes contain the Partitions in

question. The data is then sent to each storage node, where it is placed in the appropriate Partition. A quorum is required: at least two of the three writes must be successful before the client is notified that the upload was successful.

Next, the Container database is updated asynchronously to reflect that there is a new object in it.

Download

A request comes in for an Account/Container/object. Using the same consistent hashing, the Partition name is generated. A lookup in the Ring reveals which storage nodes contain that Partition. A request is made to one of the storage nodes to fetch the object and, if that fails, requests are made to the other nodes.
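The quorum rule for uploads and the fallback behavior for downloads can be sketched as follows. This is a minimal in-memory model, assuming three hypothetical storage nodes represented as dictionaries; the `failing` argument simulates unreachable nodes, and the real proxy server's error handling and handoff logic are not modeled.

```python
def make_nodes(count=3):
    """Create hypothetical storage nodes, each with an empty in-memory store."""
    return [{"name": f"node{i + 1}", "store": {}} for i in range(count)]


def put_object(nodes, name, data, failing=()):
    """Write to every replica node; report success only if a majority acknowledge."""
    acks = 0
    for node in nodes:
        if node["name"] in failing:
            continue  # simulated failure: this node never acknowledges
        node["store"][name] = data
        acks += 1
    quorum = len(nodes) // 2 + 1  # two of three for the default replica count
    return acks >= quorum


def get_object(nodes, name, failing=()):
    """Ask one node at a time and fall back to the next one on failure."""
    for node in nodes:
        if node["name"] in failing:
            continue
        if name in node["store"]:
            return node["store"][name]
    return None


nodes = make_nodes()
ok_one_down = put_object(nodes, "photos/beach.jpg", b"data", failing={"node3"})
ok_two_down = put_object(make_nodes(), "photos/beach.jpg", b"data",
                         failing={"node2", "node3"})
fetched = get_object(nodes, "photos/beach.jpg", failing={"node1"})
print(ok_one_down, ok_two_down, fetched)
```

In this model a write that misses quorum may still leave a copy on one node; in the system described above, the replicators are the mechanism that restores the intended number of copies over time.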

Swift Cluster Architecture

Access Tier

Large-scale deployments segment off an Access Tier. This tier is the Grand Central of the Object Storage system. It fields incoming API requests from clients and moves data in and out of the system. This tier is composed of front-end load balancers, SSL terminators and authentication services, and it runs the (distributed) brain of the object storage system: the proxy server processes.

Having the access servers in their own tier enables read/write access to be scaled out independently of storage capacity. For example, if the cluster is on the public Internet, requires SSL termination and has high demand for data access, many access servers can be provisioned. However, if the cluster is on a private network and is being used primarily for archival purposes, fewer access servers are needed.

As this is an HTTP-addressable storage service, a load balancer can be incorporated into

the access tier.

Typically, this tier comprises a collection of 1U servers. These machines use a moderate amount of RAM and are network I/O intensive. As these systems field each incoming API request, it is wise to provision them with two high-throughput (10GbE) interfaces. One interface is used for 'front-end' incoming requests and the other for 'back-end' access to the object storage nodes to put and fetch data.

Factors to Consider

For most publicly facing deployments, as well as private deployments available across a wide-reaching corporate network, SSL will be used to encrypt traffic to the client. SSL adds significant processing load to establish sessions with clients, so more capacity in the access layer will need to be provisioned. SSL may not be required for private deployments on trusted networks.

Storage Nodes

The next component is the storage servers themselves. Generally, each of the five Zones should have an equal amount of storage capacity. Storage nodes use a reasonable amount of memory and CPU. Metadata needs to be readily available to quickly return objects. The object stores run services not only to field incoming requests from the Access Tier, but also to run replicators, auditors, and reapers. Object stores can be provisioned with a single gigabit or 10 gigabit network interface, depending on expected workload and desired performance.

Currently 2TB or 3TB SATA disks deliver good price/performance value. Desktop-grade drives can be used where there are responsive remote hands in the datacenter, and enterprise-grade drives can be used where this is not the case.

Factors to Consider

Desired I/O performance for single-threaded requests should be kept in mind. This system does not use RAID, so each request for an object is handled by a single disk. Disk performance impacts single-threaded response rates.

To achieve higher apparent throughput, the object storage system is designed with concurrent uploads/downloads in mind. The network I/O capacity (1GbE, bonded 1GbE pair, or 10GbE) should match your desired concurrent throughput needs for reads and writes.

Configuring Networking

Below are two examples of deployments at two scales: larger deployments with a two-tier networking architecture, and smaller deployments with a single networking tier. Note that when a write comes into the proxy server, three times the traffic goes to the object stores to write the three replicas. Systems must be designed to account for the expected read/write traffic.

Large-Scale Networking

Aggregation

A pair of aggregation switches with two links back to the access network / border network is used to connect to two pools of the Access Tier and to each of the five Zone switches that connect the Object Stores. All connections to the Access Tier and the Zones are 10GbE.

Zone Network

Each Zone has a switch to connect itself to the aggregation network. It's possible to use a single, non-redundant switch as the system is designed to sustain a Zone failure.

Depending on the overall concurrency desired, a deployment can use either a 1GbE or a 10GbE network to the object stores.

Medium-Scale Networking

A single network tier is used for smaller deployments. Either 1GbE or 10GbE switches can be used for this purpose, depending on the throughput the cluster is expected to sustain. The Access Tier servers still have two interfaces, and a separate VLAN is created for each: one for front-facing API requests and one for the back-end network connecting the object server Zones.

Management Network

A management network is critical to maintaining the health of the cluster. A separate 1GbE management network is created for IPMI, monitoring, and out-of-band access to every machine in the cluster. However, it is typically possible to use the higher-bandwidth connections during provisioning for operating system installation.

Hardware Recommendations

Swift is designed to store and retrieve whole files via HTTP across a cluster of industry-standard x86 servers and drives, using replication to ensure data reliability and fault tolerance. While this model provides great flexibility (and low cost) from a hardware perspective, it requires some upfront planning, testing and validation to ensure that the hardware you select is suitable not just for Swift itself, but also for the expected workload that you are designing your cluster for. Your operations team may also have opinions on the hardware selection, as they may prefer to work with hardware they are already familiar with.

Proxy Nodes

Proxy nodes use a moderate amount of RAM and are network I/O intensive. Typically, proxy nodes are 1U systems with a minimum of 12 GB RAM. As these systems field each incoming API request, it is wise to provision them with two high-throughput (10GbE) interfaces. One interface is used for 'front-end' incoming requests and the other for 'back-end' access to the object storage nodes to put and fetch data. For small Swift deployments, the storage nodes can serve as proxy nodes.

Storage Nodes

Storage nodes are typically high-density 3U or 4U nodes, each holding a number of SATA disks. These nodes use a reasonable amount of memory and CPU. The storage nodes run services not only to field incoming requests from the proxy nodes, but also replication, auditing and other processes to ensure durability. Storage nodes can be provisioned with a single gigabit or 10GbE network interface, depending on expected workload and desired performance.

For storage nodes, we recommend the following specifications:

CPU: 64-bit x86 CPU (Intel/AMD), quad-core or greater, running at least 2-2.5GHz.

RAM: A good rule of thumb is approximately 1 GB of RAM for each TB of disk. For example, for a node with 24 drives, 36-48GB of RAM should be used. The memory is used for the many processes that field incoming object requests and for XFS inode caching.

Drives: Either 2TB or 3TB 7200 RPM SATA drives, which deliver good price/performance value. Desktop-grade drives can be used where there are responsive remote hands in the data center, and enterprise-grade drives can be used

23 where that is not the case. We don t recommend using green drives. Swift is continuously ensuring data integrity and the power- down functions of green drives may result in excess wear. Extreme container update workload consideration Where the application needs to ingest many millions of files in a single container, it may be necessary to use higher- performing media (RAID 10 with 15k drives or SSDs) for the container indexes. The data set is relatively very small in size, so few space is needed on the higher performing media to store this data. Controller Cards Swift replicates data across zones so there is no need for data redundancy to be provided by the controller. Swift therefore uses standard SATA controller cards without RAID, such as LSI i 6Gb/s SAS / SATA HBA. However, if the controller card requires RAID volumes to be created, set up a RAID 0 group (without striping) for each drive. Network Cards Depending on the use case, single gigabit ethernet (1GbE) on each host may be all that is required. However, it is possible to configure bonded 1GbE or 10GbE if the workload demands it. Networking A typical deployment would have a front- facing access network and a back- end storage network. When designing the network capacity, keep in mind that writes fan- out in triplicate in the storage network. As there are three copies of each object, an incoming write is sent to three storage nodes. Therefore network capacity for writes needs to be considered in proportion to overall workload. Sizing Your Swift Cluster Each node and drive that is added to a Swift cluster will not only provide additional storage capacity, but will also increase the aggregate IO capacity of the cluster as there are more systems and drives available to serve incoming read requests. 
When selecting a specific hardware configuration for your Swift cluster, it is therefore important to determine which configuration provides the best balance of IO performance, capacity and cost for a given workload. For instance, a customer-facing web application with a large number of concurrent users will have a different profile than one that is used primarily for archiving.
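The write fan-out and sizing considerations above lend themselves to a quick back-of-the-envelope calculation. The following is a minimal sketch, not a sizing tool: the function names, the example node and drive counts, and the 80% fill headroom are assumptions made for illustration. Only the three replicas per object and the 3TB drive size come from the text.

```python
REPLICAS = 3  # from the text: three copies of each object


def usable_capacity_tb(nodes, drives_per_node, drive_tb,
                       replicas=REPLICAS, max_fill=0.8):
    """Estimate usable capacity in TB.

    max_fill is an assumed headroom factor (keep drives at most 80% full);
    it is not a figure from this paper.
    """
    raw_tb = nodes * drives_per_node * drive_tb
    return raw_tb * max_fill / replicas


def storage_network_write_gbps(client_write_gbps, replicas=REPLICAS):
    """Each incoming write is sent to `replicas` storage nodes, so
    storage-network write traffic is a multiple of client write traffic."""
    return client_write_gbps * replicas


if __name__ == "__main__":
    # Hypothetical cluster: 5 storage nodes, 12 x 3TB drives each.
    print(usable_capacity_tb(nodes=5, drives_per_node=12, drive_tb=3))
    # Hypothetical ingest rate of 1 Gb/s arriving on the access network.
    print(storage_network_write_gbps(1.0))
```

Running the same functions over a few candidate configurations makes it easier to compare their capacity and network demands before benchmarking them against the real workload.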

Planning a Deployment

Benchmarking and Testing

When planning a Swift deployment, the first step is to identify the application workloads and define the corresponding functional requirements, which will determine how your Swift environment will be designed, configured and deployed.

Benchmarking goes hand-in-hand with designing the right cluster. Some of the areas to consider when conducting your benchmarking include:

1. The spread of file sizes
2. The number of concurrent workers
3. The proportion of creation / read / update / delete operations that is expected

To conduct your benchmarking, set up a cluster pre-populated with a set of data that mimics its eventual state. It is also important to be able to parallelize the benchmarking workloads, as it may take many systems to sustain the aggregate throughput a Swift cluster can deliver.

SwiftStack is available to assist in determining which hardware configuration is best suited to your workload. As part of a SwiftStack deployment we provide a benchmarking test suite to help tune your OpenStack Swift cluster and verify that it can handle the workloads you're about to put on it.

A testing plan also needs to be developed, outlining how the environment will be tested and the overall testing criteria. Some of the areas to address in the testing plan include:

- Swift API calls
- Failure scenarios
- Swift client libraries that will be used

Datacenter Facility Planning

Datacenter space and power requirements are critical areas to plan for in a successful Swift deployment. Deploying a full rack of object stores requires planning with your datacenter facilities team or datacenter vendor. Be sure to meet early with this team to explain your project and plan for:

- Power provisioning
- Cooling
- Physical space requirements
- Networking capacity and planning
- Port layouts
- Rack layouts

SwiftStack can assist with best practices and datacenter facilities planning advice for a successful Swift implementation.

Integrations

When getting Swift up and running in your datacenter, there are several potential integrations with 3rd party systems and services to consider, including:

- Authentication systems, such as LDAP
- Operations support systems
- Billing systems
- External monitoring systems that consume SNMP polling of system information, SNMP traps, etc.
- Content Distribution Networks (CDNs)
- Replication to off-site Swift clusters for additional redundancy

Each of these areas can be integrated with your Swift environment, but the details will differ based on your specific requirements and use case. While this white paper does not cover these areas in any depth, SwiftStack can provide advice and best practices on how to integrate with 3rd party systems and services.

Monitoring

There are many tools available for application developers and IT operations teams to measure the health of applications and servers. While many of these tools are helpful, for a Swift cluster they may be more complex and provide more data than you need. To ensure that users can quickly measure the overall health of their SwiftStack environment, the SwiftStack Platform tracks the key metrics that allow you to quickly determine the status of your overall Swift cluster and of the individual nodes.

For the overall Swift cluster, the key metrics monitored by the SwiftStack Platform are node CPU utilization, top 5 least free disks, disk I/O and network I/O. For individual nodes, the same key metrics are reported, which can be used to tell the overall health of the node.

In addition, external monitoring systems can be configured to consume SNMP polling of system information and SNMP traps provided through the SwiftStack Platform.
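To make the "top 5 least free disks" metric mentioned above concrete, here is a minimal sketch of how such a ranking could be computed. It does not show how the SwiftStack Platform gathers or reports this metric. The function names and the sample data are hypothetical; the only library call used is Python's standard `shutil.disk_usage`.

```python
import shutil
from typing import Dict, Iterable, List, Tuple


def collect_usage(mount_points: Iterable[str]) -> Dict[str, Tuple[int, int]]:
    """Return {mount_point: (total_bytes, free_bytes)} for each path given."""
    usage = {}
    for path in mount_points:
        stats = shutil.disk_usage(path)  # named tuple: total, used, free
        usage[path] = (stats.total, stats.free)
    return usage


def least_free_disks(usage: Dict[str, Tuple[int, int]],
                     count: int = 5) -> List[Tuple[str, float]]:
    """Rank disks by fraction of space free, lowest first, and keep `count`."""
    ranked = sorted(
        ((name, free / total)
         for name, (total, free) in usage.items() if total > 0),
        key=lambda item: item[1],
    )
    return ranked[:count]


if __name__ == "__main__":
    # Hypothetical sample: (total_bytes, free_bytes) per drive.
    sample = {
        "d1": (100, 50), "d2": (100, 5), "d3": (100, 90),
        "d4": (100, 20), "d5": (100, 70), "d6": (100, 10),
    }
    for name, fraction_free in least_free_disks(sample):
        print(f"{name}: {fraction_free:.0%} free")
```

Ranking by the fraction of space free, rather than by absolute bytes, keeps the list meaningful when a cluster mixes 2TB and 3TB drives.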

Administration & Maintenance

The SwiftStack Platform doesn't eliminate the need for a storage administrator, but it significantly simplifies the job of administering and maintaining a Swift environment. Common areas that need to be administered and maintained include performance tuning of the overall cluster, adding and removing cluster capacity, identifying and replacing failed hardware, and applying software updates. While the SwiftStack Platform automates and simplifies many of these tasks, some hands-on administration is still required. Automating these tasks through the SwiftStack Platform is not just a time saver for a storage administrator; it is critical to ensuring system uptime and to avoiding data loss or putting data at risk.

Understanding TCO

Finally, when planning a deployment, understanding the total cost of ownership for the cluster is critical so that all direct and indirect costs are included. Costs should include:

- Design/Development
- Hardware
- Hardware Standup
- Datacenter Space
- Power/Cooling
- Network Access
- Ongoing Software Maintenance and Support
- Monitoring and Operational Support

Rolling Swift Out in Your Organization

If your organization does not already have experience building applications with an object storage system, such as Amazon S3 or Rackspace Cloud Files, it is also critical to train internal development and product integration teams on how to use and take advantage of Swift. Because Swift and other object storage systems require a different approach to application development than a traditional file system, it is important to start this process early.

SwiftStack provides both training and a virtual training appliance with Swift, which can be installed and used on a laptop. This enables developers and IT operations staff to become familiar with how Swift works and how to take advantage of some of its key benefits vis-à-vis a traditional filesystem-based approach, including:

- Self-service provisioning, where developers and users can access storage in minutes
- The ability for end-users to create huge namespaces for applications without needing to segment storage systems
- Access to a wide selection of available tools and libraries that provide integration convenience

Managing Swift with SwiftStack

SwiftStack provides the deployment, management and monitoring plane for Swift, which:

- Drastically simplifies the process of getting Swift up and running in your datacenter
- Deploys Swift on nodes and configures the cluster
- Enables you to start with one node and add nodes as your data grows
- Provides a central management console for your Swift nodes and cluster
- Monitors and alerts for issues in nodes, disks and other resources
- Enables you to easily expand your cluster and tune for performance
- Provides diagnostics for issues, which simplifies support and administration

The SwiftStack Platform incorporates deployment and operational best practices for Swift and provides a single pane of glass for your entire Swift environment.

To get started with SwiftStack, the first step is to download the SwiftStack ISO, consisting of:

- Ubuntu
- OpenStack Swift
- SwiftStack Agents

After you log in to the SwiftStack Platform, it guides you through the process of creating a new cluster, creating accounts and users, installing and provisioning Swift on cluster nodes, formatting drives, configuring zones and the other tasks required to set up your Swift environment. Once your Swift environment has been deployed, the SwiftStack Platform provides the ongoing administration, management and monitoring of your Swift environment.

Like to Learn More?

Swift and SwiftStack offer a real alternative to proprietary object storage systems and are much easier to use than traditional filesystem-based approaches. Swift is provided under the Apache 2 open source license, is highly scalable, extremely durable and runs on industry-standard hardware. Swift also has a compelling set of compatible tools available from third parties and other open source projects. With the SwiftStack Platform, deployment, ongoing management and monitoring can now be done with ease.

If you'd like to learn more about Swift and SwiftStack, contact us at contact@swiftstack.com.

The SwiftStack Team

2012 SwiftStack, Inc. All rights reserved.


More information

Understanding Enterprise NAS

Understanding Enterprise NAS Anjan Dave, Principal Storage Engineer LSI Corporation Author: Anjan Dave, Principal Storage Engineer, LSI Corporation SNIA Legal Notice The material contained in this tutorial is copyrighted by the SNIA

More information

Building Storage as a Service with OpenStack. Greg Elkinbard Senior Technical Director

Building Storage as a Service with OpenStack. Greg Elkinbard Senior Technical Director Building Storage as a Service with OpenStack Greg Elkinbard Senior Technical Director MIRANTIS 2012 PAGE 1 About the Presenter Greg Elkinbard Senior Technical Director at Mirantis Builds on demand IaaS

More information

BlueArc unified network storage systems 7th TF-Storage Meeting. Scale Bigger, Store Smarter, Accelerate Everything

BlueArc unified network storage systems 7th TF-Storage Meeting. Scale Bigger, Store Smarter, Accelerate Everything BlueArc unified network storage systems 7th TF-Storage Meeting Scale Bigger, Store Smarter, Accelerate Everything BlueArc s Heritage Private Company, founded in 1998 Headquarters in San Jose, CA Highest

More information

Enterprise Storage Solution for Hyper-V Private Cloud and VDI Deployments using Sanbolic s Melio Cloud Software Suite April 2011

Enterprise Storage Solution for Hyper-V Private Cloud and VDI Deployments using Sanbolic s Melio Cloud Software Suite April 2011 Enterprise Storage Solution for Hyper-V Private Cloud and VDI Deployments using Sanbolic s Melio Cloud Software Suite April 2011 Executive Summary Large enterprise Hyper-V deployments with a large number

More information

Dell Compellent Storage Center SAN & VMware View 1,000 Desktop Reference Architecture. Dell Compellent Product Specialist Team

Dell Compellent Storage Center SAN & VMware View 1,000 Desktop Reference Architecture. Dell Compellent Product Specialist Team Dell Compellent Storage Center SAN & VMware View 1,000 Desktop Reference Architecture Dell Compellent Product Specialist Team THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL

More information

The deployment of OHMS TM. in private cloud

The deployment of OHMS TM. in private cloud Healthcare activities from anywhere anytime The deployment of OHMS TM in private cloud 1.0 Overview:.OHMS TM is software as a service (SaaS) platform that enables the multiple users to login from anywhere

More information

Amazon EC2 Product Details Page 1 of 5

Amazon EC2 Product Details Page 1 of 5 Amazon EC2 Product Details Page 1 of 5 Amazon EC2 Functionality Amazon EC2 presents a true virtual computing environment, allowing you to use web service interfaces to launch instances with a variety of

More information

CMB 207 1I Citrix XenApp and XenDesktop Fast Track

CMB 207 1I Citrix XenApp and XenDesktop Fast Track CMB 207 1I Citrix XenApp and XenDesktop Fast Track This fast paced course provides the foundation necessary for students to effectively centralize and manage desktops and applications in the datacenter

More information

CSE-E5430 Scalable Cloud Computing Lecture 2

CSE-E5430 Scalable Cloud Computing Lecture 2 CSE-E5430 Scalable Cloud Computing Lecture 2 Keijo Heljanko Department of Computer Science School of Science Aalto University keijo.heljanko@aalto.fi 14.9-2015 1/36 Google MapReduce A scalable batch processing

More information

Keys to Successfully Architecting your DSI9000 Virtual Tape Library. By Chris Johnson Dynamic Solutions International

Keys to Successfully Architecting your DSI9000 Virtual Tape Library. By Chris Johnson Dynamic Solutions International Keys to Successfully Architecting your DSI9000 Virtual Tape Library By Chris Johnson Dynamic Solutions International July 2009 Section 1 Executive Summary Over the last twenty years the problem of data

More information

EMC Virtual Infrastructure for Microsoft Applications Data Center Solution

EMC Virtual Infrastructure for Microsoft Applications Data Center Solution EMC Virtual Infrastructure for Microsoft Applications Data Center Solution Enabled by EMC Symmetrix V-Max and Reference Architecture EMC Global Solutions Copyright and Trademark Information Copyright 2009

More information

ANY SURVEILLANCE, ANYWHERE, ANYTIME

ANY SURVEILLANCE, ANYWHERE, ANYTIME ANY SURVEILLANCE, ANYWHERE, ANYTIME WHITEPAPER DDN Storage Powers Next Generation Video Surveillance Infrastructure INTRODUCTION Over the past decade, the world has seen tremendous growth in the use of

More information

Building Storage Clouds for Online Applications A Case for Optimized Object Storage

Building Storage Clouds for Online Applications A Case for Optimized Object Storage Building Storage Clouds for Online Applications A Case for Optimized Object Storage Agenda Introduction: storage facts and trends Call for more online storage! AmpliStor: Optimized Object Storage Cost

More information

Software-defined Storage Architecture for Analytics Computing

Software-defined Storage Architecture for Analytics Computing Software-defined Storage Architecture for Analytics Computing Arati Joshi Performance Engineering Colin Eldridge File System Engineering Carlos Carrero Product Management June 2015 Reference Architecture

More information

System Requirements. Version 8.2 November 23, 2015. For the most recent version of this document, visit our documentation website.

System Requirements. Version 8.2 November 23, 2015. For the most recent version of this document, visit our documentation website. System Requirements Version 8.2 November 23, 2015 For the most recent version of this document, visit our documentation website. Table of Contents 1 System requirements 3 2 Scalable infrastructure example

More information

Big Data With Hadoop

Big Data With Hadoop With Saurabh Singh singh.903@osu.edu The Ohio State University February 11, 2016 Overview 1 2 3 Requirements Ecosystem Resilient Distributed Datasets (RDDs) Example Code vs Mapreduce 4 5 Source: [Tutorials

More information

EMC Unified Storage for Microsoft SQL Server 2008

EMC Unified Storage for Microsoft SQL Server 2008 EMC Unified Storage for Microsoft SQL Server 2008 Enabled by EMC CLARiiON and EMC FAST Cache Reference Copyright 2010 EMC Corporation. All rights reserved. Published October, 2010 EMC believes the information

More information

Cloud Computing. Adam Barker

Cloud Computing. Adam Barker Cloud Computing Adam Barker 1 Overview Introduction to Cloud computing Enabling technologies Different types of cloud: IaaS, PaaS and SaaS Cloud terminology Interacting with a cloud: management consoles

More information

Introduction to Gluster. Versions 3.0.x

Introduction to Gluster. Versions 3.0.x Introduction to Gluster Versions 3.0.x Table of Contents Table of Contents... 2 Overview... 3 Gluster File System... 3 Gluster Storage Platform... 3 No metadata with the Elastic Hash Algorithm... 4 A Gluster

More information

Ultra-Scalable Storage Provides Low Cost Virtualization Solutions

Ultra-Scalable Storage Provides Low Cost Virtualization Solutions Ultra-Scalable Storage Provides Low Cost Virtualization Solutions Flexible IP NAS/iSCSI System Addresses Current Storage Needs While Offering Future Expansion According to Whatis.com, storage virtualization

More information

OPTIMIZING SERVER VIRTUALIZATION

OPTIMIZING SERVER VIRTUALIZATION OPTIMIZING SERVER VIRTUALIZATION HP MULTI-PORT SERVER ADAPTERS BASED ON INTEL ETHERNET TECHNOLOGY As enterprise-class server infrastructures adopt virtualization to improve total cost of ownership (TCO)

More information

Windows Server 2008 Essentials. Installation, Deployment and Management

Windows Server 2008 Essentials. Installation, Deployment and Management Windows Server 2008 Essentials Installation, Deployment and Management Windows Server 2008 Essentials First Edition. This ebook is provided for personal use only. Unauthorized use, reproduction and/or

More information

Maxta Storage Platform Enterprise Storage Re-defined

Maxta Storage Platform Enterprise Storage Re-defined Maxta Storage Platform Enterprise Storage Re-defined WHITE PAPER Software-Defined Data Center The Software-Defined Data Center (SDDC) is a unified data center platform that delivers converged computing,

More information

Alfresco Enterprise on AWS: Reference Architecture

Alfresco Enterprise on AWS: Reference Architecture Alfresco Enterprise on AWS: Reference Architecture October 2013 (Please consult http://aws.amazon.com/whitepapers/ for the latest version of this paper) Page 1 of 13 Abstract Amazon Web Services (AWS)

More information

Big Data Storage Options for Hadoop Sam Fineberg, HP Storage

Big Data Storage Options for Hadoop Sam Fineberg, HP Storage Sam Fineberg, HP Storage SNIA Legal Notice The material contained in this tutorial is copyrighted by the SNIA unless otherwise noted. Member companies and individual members may use this material in presentations

More information

COSC 6374 Parallel Computation. Parallel I/O (I) I/O basics. Concept of a clusters

COSC 6374 Parallel Computation. Parallel I/O (I) I/O basics. Concept of a clusters COSC 6374 Parallel Computation Parallel I/O (I) I/O basics Spring 2008 Concept of a clusters Processor 1 local disks Compute node message passing network administrative network Memory Processor 2 Network

More information

MaxDeploy Hyper- Converged Reference Architecture Solution Brief

MaxDeploy Hyper- Converged Reference Architecture Solution Brief MaxDeploy Hyper- Converged Reference Architecture Solution Brief MaxDeploy Reference Architecture solutions are configured and tested for support with Maxta software- defined storage and with industry

More information

Highly available, scalable and secure data with Cassandra and DataStax Enterprise. GOTO Berlin 27 th February 2014

Highly available, scalable and secure data with Cassandra and DataStax Enterprise. GOTO Berlin 27 th February 2014 Highly available, scalable and secure data with Cassandra and DataStax Enterprise GOTO Berlin 27 th February 2014 About Us Steve van den Berg Johnny Miller Solutions Architect Regional Director Western

More information

StorReduce Technical White Paper Cloud-based Data Deduplication

StorReduce Technical White Paper Cloud-based Data Deduplication StorReduce Technical White Paper Cloud-based Data Deduplication See also at storreduce.com/docs StorReduce Quick Start Guide StorReduce FAQ StorReduce Solution Brief, and StorReduce Blog at storreduce.com/blog

More information

ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE

ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE Hadoop Storage-as-a-Service ABSTRACT This White Paper illustrates how EMC Elastic Cloud Storage (ECS ) can be used to streamline the Hadoop data analytics

More information

DISTRIBUTED SYSTEMS [COMP9243] Lecture 9a: Cloud Computing WHAT IS CLOUD COMPUTING? 2

DISTRIBUTED SYSTEMS [COMP9243] Lecture 9a: Cloud Computing WHAT IS CLOUD COMPUTING? 2 DISTRIBUTED SYSTEMS [COMP9243] Lecture 9a: Cloud Computing Slide 1 Slide 3 A style of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet.

More information

HDFS Users Guide. Table of contents

HDFS Users Guide. Table of contents Table of contents 1 Purpose...2 2 Overview...2 3 Prerequisites...3 4 Web Interface...3 5 Shell Commands... 3 5.1 DFSAdmin Command...4 6 Secondary NameNode...4 7 Checkpoint Node...5 8 Backup Node...6 9

More information