Design and Implementation of a User-Centered Content Distribution Network
Guillaume Pierre and Maarten van Steen
Vrije Universiteit
De Boelelaan 1081a, 1081 HV Amsterdam, The Netherlands
{gpierre,steen}@cs.vu.nl

Abstract

Replicating Web documents at a worldwide scale can help reduce user-perceived latency and wide-area network traffic. This paper presents the design and implementation of Globule, a platform that allows Web server administrators to organize a decentralized replication service by trading Web hosting resources with each other. Globule automates all aspects of such replication: document replication, selection of the most appropriate replication strategy on a per-document basis, consistency management, and transparent redirection of clients to replicas. To facilitate the transition from a non-replicated server to a replicated one, we designed Globule as a module for the Apache Web server. Converting a Web server should therefore require no more than compiling a new module into Apache and editing a configuration file.

1. Introduction

Large-scale distributed systems often address performance and quality-of-service issues by way of caching and replication. In the Web, content delivery networks (CDNs) such as Akamai and Digital Island have emerged as a viable solution to achieve scalability through replication [1]. In this approach, content is replicated to places where user demand is high; the content itself can vary from simple static pages to bandwidth-demanding video streams. Content delivery networks are quite popular among administrators of large Web sites. However, servers holding open-source content, as well as small businesses, may prefer a cheaper, yet efficient, solution. We believe that such users are willing to contribute their unused resources to the community in exchange for improved performance for their own site.
We propose a decentralized scheme in which the owner of a Web site can agree to host replicas from other sites, and obtain in return the ability to deploy replicas of its own documents at remote places. This allows administrators to independently organize a service similar to that of commercial CDNs. The approach is attractive because it allows one to acquire valuable remote resources in exchange for relatively cheap local resources.

Globule is a user-centered content delivery network that our group is developing [6]. It allows Web servers to host each other's replicated Web documents. To ease integration into existing Web systems, it is designed as a module for the popular Apache server. Making use of our system should require no more than compiling a new module into Apache and editing a configuration file.

This paper presents the design and implementation of Globule, together with performance measurements. Unlike most Web replication systems, Globule does not apply a fixed replication policy to all documents. As we have shown in previous research, there is no single policy that is best in all cases [5]. This is true even for simple Web documents constructed as a static collection of HTML files, images, icons, and so on. As a consequence, Globule contains a multitude of replication policies, and associates each document with the policy that suits it best. This is realized with an object-based approach in which each document is encapsulated in an object that is fully responsible for its own distribution. In other words, each Web document is considered as an object that encapsulates not only its state and operations, but also the implementation of the replication policy by which that state is delivered to clients. This allows a document to monitor its own access patterns and to dynamically select the replication policy that suits it best.
When a change is detected in access patterns, it can re-evaluate its choice and switch policies on the fly [7]. We evaluate the relative performance of Globule versus Apache, and show that Globule has a constant overhead of
about 200 µs per request, which accounts for at most 10% of the total request latency. The throughput of Globule is also acceptable: in the worst case it is between 5% and 17% below that of unmodified Apache. Whereas [6] described the overall architecture of Globule, in this paper we concentrate on the details of its design and implementation, in particular as an Apache module. Our main contribution is to demonstrate how a user-centered CDN can be built from existing components, and that it supports per-document replication policies at virtually no cost.

This paper is organized as follows: Section 2 describes our document and server model; Section 3 details the implementation of Globule; Section 4 presents a performance evaluation; finally, Section 5 discusses related work and Section 6 concludes.

2. The Globule model

Our system is made of servers that cooperate in order to replicate Web documents. This section describes our document and server models.

2.1. Document model

In contrast to most Web servers, we do not consider a Web document and its replicas merely as a collection of files. Instead, we take a more general approach and consider a document as a physically distributed object whose state is replicated across the Internet. All replication mechanisms are hidden from clients behind the object's interfaces. There is one standard interface containing methods such as get() and put() to allow for delivering and modifying a document's content.

The design of Globule is inspired by that of Globe, a platform for large-scale distributed objects [10]. Its main novelty is the encapsulation of issues related to distribution and replication inside the objects. In other words, an object fully controls how, when, and where it distributes and replicates its content.
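The standard get()/put() interface described above can be sketched in C as follows. This is an illustrative outline only: the `Document` structure, its fields, and `make_document` are hypothetical names, not Globule's actual implementation.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch of a Globule-style document object: the object
 * couples its content (state) with the replication policy that
 * delivers it. Names and fields are illustrative only. */
typedef struct Document Document;
struct Document {
    char content[256];            /* replicated state                 */
    const char *policy_name;      /* encapsulated policy choice       */
    /* standard interface: deliver and modify the content */
    const char *(*get)(Document *self);
    void        (*put)(Document *self, const char *new_content);
};

static const char *doc_get(Document *self) {
    /* a real implementation would first run the policy's consistency
     * actions (fetch, compare, invalidate, ...) before delivering */
    return self->content;
}

static void doc_put(Document *self, const char *new_content) {
    /* updates are applied at the primary replica, then propagated
     * according to the document's own replication policy */
    strncpy(self->content, new_content, sizeof self->content - 1);
    self->content[sizeof self->content - 1] = '\0';
}

Document make_document(const char *initial, const char *policy) {
    Document d;
    d.policy_name = policy;
    d.get = doc_get;
    d.put = doc_put;
    doc_put(&d, initial);
    return d;
}
```

In the real system, get() would first execute whatever actions the document's policy object mandates before delivering the content.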
We have shown in previous papers that significant performance improvements can be obtained over traditional replicated Web servers by associating each document with the replication strategy that suits it best [5, 7]. Such per-document replication policies are made possible by the encapsulation of replication issues inside each document.

The selection of the best replication policy for each document is realized internally by way of trace-based simulations. Replicas transmit logs of the requests they received to their master site. At startup, or when a significant change in access patterns is detected, the master re-evaluates its choice of replication strategy. To do so, it extracts the most recent trace records and simulates the behavior of a number of replication policies against this trace. Each simulation outputs performance metrics such as client retrieval time, network traffic, and consistency. The best policy is then chosen from these performance figures using a cost function. More details about these adaptive replicated documents can be found in [7].

2.2. Cooperative servers

One important issue when replicating Web documents is gaining access to computing resources (CPU, disk space, memory, bandwidth, etc.) in several locations worldwide. On the other hand, adding extra resources locally is cheap and easy. Therefore, the idea is to trade cheap local resources for valuable remote ones. Server administrators negotiate for resource peering. The result of such a negotiation is that a secondary server agrees to allocate a given amount of its local resources to host replicas from a primary server. The primary server keeps control of the resources it has acquired: it decides which clients are redirected to which secondary server, which documents are replicated there, and which replication policies are used.
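The final step of the trace-driven selection described above, picking the cheapest policy from simulated metrics with a cost function, might look as follows. The metric names, weights, and the linear form of the cost function are assumptions for illustration; this sketch is not Globule's actual selection code.

```c
#include <float.h>

/* Each candidate policy has been simulated against recent trace
 * records, yielding performance metrics (names are illustrative). */
typedef struct {
    const char *name;
    double mean_latency;   /* client retrieval time (ms)   */
    double traffic;        /* wide-area traffic (KB)       */
    double staleness;      /* fraction of stale deliveries */
} SimResult;

/* Lower cost is better; the weights express priorities among the
 * metrics. A linear combination is an assumption made here. */
static double cost(const SimResult *r,
                   double w_lat, double w_traf, double w_stale) {
    return w_lat * r->mean_latency
         + w_traf * r->traffic
         + w_stale * r->staleness;
}

/* Return the index of the cheapest policy under the given weights. */
int select_policy(const SimResult *results, int n,
                  double w_lat, double w_traf, double w_stale) {
    int best = 0;
    double best_cost = DBL_MAX;
    for (int i = 0; i < n; i++) {
        double c = cost(&results[i], w_lat, w_traf, w_stale);
        if (c < best_cost) { best_cost = c; best = i; }
    }
    return best;
}
```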
Of course, servers may play both primary and secondary roles at the same time: a server may host replicas from another server, and replicate its own content to a third one. We use these terms only to distinguish roles within a given cooperation session. Each document consists of one primary replica located at its primary server, and a number of secondary replicas located at various secondary servers. In our model, all document updates are performed at the primary replica, and then propagated to the secondaries according to the document's own replication policy.

2.3. Security issues

The design of Globule raises two different security issues: securing a secondary server against malicious replicas, and securing a primary server against malicious secondary servers. For a secondary server, hosting a replica from a remote site is acceptable only if the replica cannot interfere with its other operations. In particular, foreign code should be isolated so that it cannot compromise the security of the server. In the case of Globule, no code is ever shipped between servers. All replicated objects belong to the same class, namely the static document class. Every Globule server contains an implementation of this class, so it is not necessary to ship it. This is also true for policy objects: every Globule server contains an implementation of every policy that is likely to be used, so it is enough for a primary
to associate each of its documents with the identifier of the replication policy that should be used.

The other issue is that a primary server must make sure that secondary servers are trustworthy, so that, for example, they do not deliver modified documents to clients. This is a very difficult problem, because the primary cannot control its secondaries' behavior. We currently address this issue by requiring administrators to set up cooperations explicitly with partners whom they trust. We plan to relax this model by allowing servers to negotiate peering relationships autonomously, and by using a trust model to address the associated security risk.

3. System architecture

We first describe the overall architecture of Globule; we then show how this architecture has been implemented as a module for the Apache Web server.

3.1. Internal architecture

Server architecture. Figure 1 shows the general architecture of a Globule server.

[Figure 1. Server Architecture: a Globule server with a monitor resource pool, a primary resource pool, and a resource pool for secondary replicas from site Secondary1.]

The system is based on events: an event can represent an incoming HTTP request, or a variety of internal signals such as a registration with a component that monitors file updates, or a time-triggered alarm. All document replicas and monitors can receive events. Monitors may receive registration events: for example, a document can register with the HeartBeat monitor in order to be sent alarms at a given periodicity. Document replicas can receive HttpRequest events, which represent incoming HTTP requests, as well as alarms sent by monitors.

[Figure 2. Document Replica Architecture: a front end transforms external HTTP requests into HttpRequest events, which the Event Manager delivers to a document's master or slave replica; each replica combines a replication object, a policy object, replica metadata, and replica data, and receives incoming events (HttpRequest, FileUpdate, Alarm) from monitors such as the HeartBeat and FileUpdate monitors.]
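The event delivery just described, with its two-level (resource pool, pool-local name) addressing, can be sketched as follows. The structures and field names are illustrative, not Globule's actual types.

```c
#include <string.h>

/* Minimal sketch of Globule-style two-level event addressing:
 * an address is a resource-pool name plus a pool-local name. */
typedef struct {
    const char *type;   /* "HttpRequest", "FileUpdate", "Alarm", ... */
    const char *pool;   /* destination resource pool                 */
    const char *name;   /* pool-local destination, e.g. a URL        */
} Event;

typedef struct {
    const char *pool;
    const char *name;
    int         received;  /* how many events this receiver got */
} Receiver;

/* The event manager matches the event address against the
 * registered receivers and delivers to the first match. */
int deliver(const Event *e, Receiver *recv, int n) {
    for (int i = 0; i < n; i++) {
        if (strcmp(recv[i].pool, e->pool) == 0 &&
            strcmp(recv[i].name, e->name) == 0) {
            recv[i].received++;
            return 1;   /* delivered */
        }
    }
    return 0;           /* no such receiver */
}
```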
Event receivers that belong together are grouped into logical entities called resource pools. Every Globule server contains two standard resource pools. The first contains monitors, such as the HeartBeat monitor, which sends periodic alarms, and the FileUpdate monitor, which sends alarms when a specific file has been updated. The second contains all objects representing primary replicas of local documents. Globule servers can have an arbitrary number of additional resource pools, each containing the set of secondary replicas belonging to a specific primary server. Figure 1 shows a server that contains a resource pool for secondary replicas from site Secondary1, in addition to the two standard resource pools.

The Event Manager is in charge of delivering events to their destination. Each event is addressed to a specific element of the system using a two-level scheme: an address is made of the name of the destination's resource pool, and a resource-pool-dependent name that identifies the final destination. A front-end component is in charge of transforming incoming HTTP requests into HttpRequest events. Other types of events can be sent by any system component, provided that it supplies a destination address. Some destination addresses are fixed in the system, such as those of monitors. Document addresses are made of the name of their primary server and their URL.

Document replication. The internal architecture of a replica is depicted in Figure 2. The core of a replica is formed by the replication object. This object receives incoming HttpRequest events directed at the replica and ensures that the appropriate actions are taken to allow the replica data to be delivered in the response. Such actions can range from fetching a fresh copy of the document from the primary server, comparing the local copy to the primary's, or registering for or sending invalidations, to simply doing nothing. The nature of the tasks that must be performed before a delivery takes place is dictated by the replication policy object.

There are different kinds of policy objects, each representing a specific policy. All policy objects implement a standard interface used by the (generic) replication object. Basically, this interface allows the replication object to ask the policy object for instructions each time a request is received. Depending on the policy it represents, the policy object then returns a list of actions that must be performed before the delivery can take place. The actions themselves are performed by the replication object.

Globule contains a number of simple policies, such as TTL. This policy allows a secondary replica to deliver a copy to a requester without any consistency check during a fixed amount of time after a fresh copy has been fetched. Once the period has expired, a consistency check is required. To implement this policy, the policy object of each secondary replica simply maintains the date of the last retrieval of the document. When queried, it checks whether the delay has expired, and subsequently either allows the immediate delivery of the local copy or first requests the primary to check for consistency. The policy object located at the primary replica is even simpler: it allows all transfers without any prior action.

A more complex policy is Invalidation. In this scheme, the primary replica sends a message to all registered secondary replicas when the document is updated, so that they drop their outdated copy. To do so, each primary replica must maintain a list of its secondaries. This list is constructed dynamically: when the primary's policy object receives a request from a not-yet-registered secondary, it extracts the secondary's callback address from the request and adds it to its local list.
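The decision logic of the TTL policy object described above can be sketched as follows. The interface is simplified here to return a single action rather than the list of actions used by Globule's real policy interface; the type names are hypothetical.

```c
#include <time.h>

/* Simplified sketch of the TTL policy object's per-request decision. */
typedef enum { DELIVER_LOCAL, CHECK_WITH_PRIMARY } Action;

typedef struct {
    time_t last_fetch;   /* when a fresh copy was last retrieved */
    long   ttl_seconds;  /* validity period of the local copy    */
} TtlPolicy;

/* Called by the (generic) replication object on every request. */
Action ttl_on_request(const TtlPolicy *p, time_t now) {
    if (now - p->last_fetch <= p->ttl_seconds)
        return DELIVER_LOCAL;       /* copy still considered fresh */
    return CHECK_WITH_PRIMARY;      /* expired: consistency check  */
}
```

The primary's side of this policy would be trivial: it allows every transfer without any prior action.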
A similar mechanism is used when a secondary replica is destroyed: a special request is sent to the primary replica to remove the deleted replica from the invalidation list. In addition to maintaining a list of secondaries, the primary replica must also subscribe to events from the FileUpdate monitor. Whenever the file containing the document is updated, the monitor sends a FileUpdate event to the master replica. Upon reception of this event, the replication object asks its policy object for the list of registered secondary replicas, and sends them an invalidation. Keep in mind that this is only one possible implementation: a server may contain several invalidation policies that take different approaches to issues such as unreachable replicas and invalidation propagation among a large number of replicas.

Policy objects can use arbitrary internal state to perform their task. Examples include no state at all, the date of the last consistency check, or a list of secondary replica addresses. Each replica is given a meta-data file to store this internal state. This is necessary when unloading replica objects from memory: objects are then given a chance to save their state, which is restored when the object is brought back into memory. Such state changes take place when the server needs to reclaim memory, or when the server is being stopped. State changes are discussed in more detail in the next section.

Resource management. Maintaining replicated documents requires more resources than hosting regular non-replicated documents. To achieve replication, we use replication and policy objects to handle requests to replicas; replicas are given a meta-data file to store their internal state; and finally, secondary servers must store copies of document replicas. These elements obviously require storage in memory or on disk, which brings us to resource management. Two issues have to be addressed.
First, servers must be able to control the amount of disk storage that they use for secondary replicas. We expect that the negotiation between the administrators of a primary and a secondary server will settle on a disk quota that can be used to host replicas. In many cases, however, this quota will be significantly smaller than the total set of documents that may be replicated there. To address this problem, each resource pool at a secondary server keeps track of the storage space its replicas are using, and maintains its usage below the quota. This is done using standard replacement policies: when the disk usage rises above the quota, the least recently used replicas are simply deleted. Deleting arbitrary secondary replicas is acceptable because they can always be re-created from the primary server. On the other hand, one should not delete primary replicas, since they hold irreplaceable data. We do not consider this a problem: the storage space for holding primary replicas would be used on this server anyway, even if the site were not replicated. This is the reason why primary resource pools are given no disk quota.

The second issue is that of memory management. Both primary and secondary servers must maintain replication and policy objects in memory across multiple requests, to avoid the overhead of (un)loading them at each request. It is therefore necessary to control the number of loaded objects so that they do not exceed the memory capacity of the server. The management of loaded objects is again realized on a per-resource-pool basis: each resource pool is given a maximum number of loadable documents. A replacement mechanism similar to that used for disk resources is in charge of unloading replicas from memory when necessary. Note that these objects can save their state to disk before being unloaded, which allows unloading objects representing primary as well as secondary replicas.

To keep track of the resources used by replicas, resource pools provide replicas with an allocation layer that intercepts all resource-consuming or -releasing requests. In particular, this layer allows replicas to request the creation of new files, as well as to read, write, and release them. Resource pools can thus keep track of their current disk usage while giving replicas full freedom to behave according to various policies. In addition, resource pools keep track of resource ownership. This allows them to prevent replicas from accessing each other's resources, and to release all resources associated with replicas that are being destroyed.

3.2. Globule as an Apache module

The Apache module model. Apache is an HTTP server structured as a set of modules. The original distribution contains a number of modules, but third-party modules can be provided as well, which makes it easy to add new features [3]. The treatment of each request is decomposed into several steps, such as access checking, actually sending a response back to the client, and logging the request. Modules can register handler functions to participate in one or more of these steps. When a request is received, the server runs the registered handlers for each step. A module can accept or decline to process an operation; the server tries all the handlers registered for a step until one accepts to process it. The architecture of Apache provides all the tools necessary to implement a replication module: in particular, Globule can intercept requests before they are served by the standard document delivery modules, to let replication objects check for consistency. Likewise, servers can communicate with each other over HTTP to transfer document copies or invalidations.

MPMs and memory management. One problem that we ran into when implementing Globule concerns the interaction between Apache's multiprocessing and Globule's memory management. Apache implements several strategies to handle concurrent requests, called Multi-Processing Modules (MPMs). Depending on the operating system, the server is automatically compiled with the MPM that works best for it.
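Apache's try-each-handler dispatch, which Globule relies on to intercept requests before the standard delivery modules, can be sketched as follows. This is a simplified model, not the real httpd API; the types and function names are hypothetical.

```c
/* Simplified sketch of Apache's per-step handler chain: each module
 * registers a handler, and the server tries them in order until one
 * accepts the request. The OK/DECLINED convention mirrors Apache's. */
#define DECLINED  -1
#define OK         0

typedef struct { const char *uri; } Request;
typedef int (*Handler)(Request *r);

/* Run the registered handlers for one step of request processing. */
int run_step(Handler *handlers, int n, Request *r) {
    for (int i = 0; i < n; i++) {
        int rc = handlers[i](r);
        if (rc != DECLINED)
            return rc;        /* this handler accepted (or failed) */
    }
    return DECLINED;          /* nobody handled the request */
}

/* Example: a Globule-like handler that would run a replica's
 * consistency actions before accepting delivery of the document. */
static int globule_handler(Request *r) {
    return (r->uri[0] == '/') ? OK : DECLINED;
}
```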
Examples of MPMs include one that handles each request in a separate process, one that runs multiple threads inside a single process, and a hybrid MPM that maintains multiple processes, each of which contains multiple threads. The choice of one MPM over another should be transparent to all other modules. However, Globule is a special case in this respect, because it needs to maintain objects in memory across multiple requests. These objects must be accessible to any request-serving thread or process, even if they were created by another thread or process. The problem is that, by default, memory is shared among the threads of a process, but not among different processes.

The solution is obviously to make use of shared memory. However, the objects that need to be shared are of various sizes, and they can be created or destroyed at any time. This makes it difficult to use simple structures such as an object table allocated in shared memory. This led us to implement our own shared memory management scheme. At startup, Globule creates a chunk of shared memory where shared objects will be stored. It is associated with its own memory allocator, similar to that of the standard C library, except that it (de)allocates pieces of the shared memory chunk.

Client redirection. Like all CDNs, Globule must direct client requests to the replica that can best serve them. It does so by means of DNS redirection: before sending an HTTP request, a client needs to resolve the DNS name of the service. The DNS request eventually reaches the authoritative server for that zone, which is configured to identify the location of the client and return the IP address of the replica closest to it [4]. Globule incorporates such a custom DNS server as part of its implementation. Although Apache was originally designed to handle only the HTTP protocol, versions from 2.0 onwards allow one to write modules that implement other protocols.
It is therefore possible, with minimal changes to Apache, to integrate a DNS server inside Apache. More details on this aspect can be found in [9].

4. Performance evaluation

We evaluate the overhead introduced by Globule on top of regular document delivery by Apache. The experimental setup consists of two dual-processor 1 GHz Pentium III machines connected by 100 Mbit/s Ethernet. The first machine runs Apache, either with or without the Globule module. The second machine is used to send requests to the server. We measure Globule's overhead independently of the effects of specific replication policies by manually selecting a policy that always allows the delivery of the local replica without any prior action. We then compare the resulting performance with that of the same requests sent to an Apache server running without Globule. More complex replication policies may of course introduce additional costs, but these costs are already taken into account as one aspect of the cost/benefit analysis that selects replication policies [7].

We compare the relative performance of Globule and Apache from two points of view: request latency measures reflect the performance as seen by clients, whereas server throughput reflects the performance as seen by servers. We stress that we are interested in measuring the overhead introduced by Globule, and as such concentrate on micro-benchmarks of our Apache implementation. The overhead introduced by Globule as a whole is relevant only when considering the effects of individual replication policies, which has been discussed at length in [5]. For this reason, we did not perform wide-area experiments at this point.

4.1. Request latency

We measure the latency of requests as seen by the client, i.e., the duration between the moment a TCP connection is initiated by the client and the moment the response has been fully received. To reproduce a realistic request access pattern, our client is configured to replay requests taken from the log file of our department's Web server. Document sizes are reproduced as well. To load the server to its maximum capacity, we replay the trace file as fast as possible.

[Figure 3. Request latencies (µs) as a function of document size (bytes), for Globule with object creation, Globule without object creation, and Apache.]

Figure 3 shows the request latencies observed by the client as the document size varies. To make the graph readable, we did not plot the performance of each individual request; instead, we show only the median latency values for requests to documents of each size. Note that because we are using a high-speed network, we are measuring pure server performance. This setup can be considered a worst-case scenario compared to a realistic wide-area setting.

We split the Globule latency measures into two curves. The first concerns the first request that Globule receives for a document, and the second concerns all subsequent requests. When receiving the first request for a given document, Globule must create a replication object, a policy object, and a meta-data file to hold the replica's internal state.
As can be seen in the graph, these operations have a fixed cost of about 400 µs. Note that this cost is incurred only once per document. When no object creation is required, the latency of requests to Globule replicas is approximately 200 µs higher than that of the same requests to unmodified Apache. This fixed cost is mostly due to Globule locating the replication object to which each HttpRequest event must be forwarded. The additional cost of delivering Globule replicas is primarily visible for small documents: when the document size exceeds a few kilobytes, this cost becomes negligible compared to the time needed to actually transfer the document over the network. We expect this latency difference between Globule and Apache to become even smaller when requests are sent over a wide-area network.

4.2. Server throughput

Figure 4 shows the throughput that a Globule server can achieve compared to that of Apache. Throughputs are measured using ab, the standard Apache benchmarking tool. We use this tool to send requests repeatedly to documents of different sizes, and measure the throughput of Apache and Globule under different levels of concurrency. All graphs show a decrease in throughput as document size increases: obviously, it takes more time for a server to deliver a large document than a small one. When there is no concurrency between requests, Globule's throughput is very close to that of Apache: for document sizes below 8 kB, Globule's throughput is about 5% lower than Apache's. Throughputs for larger documents show the same effect as in the request latency measurements: the difference becomes negligible because network transfer costs dominate any additional cost caused by Globule. When we increase the concurrency level, the difference in throughputs becomes larger: it is about 5% for a concurrency level of 1, 9% for a concurrency level of 2, and 17% for a concurrency level of 5.
We explain this degradation by the way Globule allocates shared memory. To prevent race conditions, such allocations are serialized by a global mutex. Apache implements such cross-process and cross-thread mutexes using one interprocess lock plus one intraprocess mutex per process. (Un)locking the global mutex before (respectively after) each shared memory allocation therefore requires (un)locking n + 1 mutexes. We can thus expect shared memory allocations to become slower as the server load increases due to request concurrency.
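The serialized allocation path discussed above can be sketched as a bump allocator over a fixed chunk, guarded by a single mutex. This illustrates the contention point only; it is not Globule's actual allocator, which also supports deallocation and operates on real shared memory rather than a static buffer.

```c
#include <pthread.h>
#include <stddef.h>

/* Every allocation from the (simulated) shared chunk is serialized
 * by one global lock, which becomes the bottleneck under load. */
#define CHUNK_SIZE 4096
static char chunk[CHUNK_SIZE];        /* stands in for shared memory */
static size_t used = 0;
static pthread_mutex_t alloc_lock = PTHREAD_MUTEX_INITIALIZER;

void *shm_alloc(size_t size) {
    void *p = NULL;
    pthread_mutex_lock(&alloc_lock);  /* global serialization point */
    if (used + size <= CHUNK_SIZE) {
        p = chunk + used;             /* bump the high-water mark */
        used += size;
    }
    pthread_mutex_unlock(&alloc_lock);
    return p;                         /* NULL if the chunk is full */
}
```

Compile with -pthread. Under request concurrency, every thread or process serving a request contends on alloc_lock, which matches the throughput degradation observed above.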
[Figure 4. Server throughput (req/sec) as a function of document size (bytes), for Apache and Globule, at concurrency levels 1 (a), 2 (b), and 5 (c).]

As in the latency analysis, these throughput measures reflect a worst-case scenario because they were performed over a high-speed network. In a wide-area environment, the cost of acquiring locks will become negligible compared to the total request latency, and the throughput of Globule will likely be much closer to that of unmodified Apache.

5. Related work

Many systems have been developed to cache or replicate Web documents. The first server-controlled systems were push-caches, where the server was responsible for pushing cached copies close to the users [2]. More recently, content distribution networks (CDNs) have been developed along the same idea [1, 8]. These systems rely on a large set of servers deployed around the world. Consistency is realized by incorporating a hash value of a document's content inside its URL: when a replicated document is modified, its URL is modified as well. This scheme requires changing the hyperlink references to modified documents, too. Consequently, in order to deliver only up-to-date documents to users, this mechanism cannot be used to replicate the HTML documents themselves; only embedded objects such as images and videos are replicated.

Globule presents three major differences with respect to such CDNs. First, since its consistency management is independent of the document naming scheme, it can replicate all types of objects. Second, contrary to CDNs, which use the same consistency policy for all documents, Globule selects consistency policies on a per-document basis, so that each document uses the policy that suits it best.
Finally, the system does not require a single organization to deploy a large number of machines across the Internet: Globule users are encouraged to trade resources with each other, thereby incrementally building a worldwide network of servers at low cost.

6. Conclusion

We have presented the design and implementation of Globule, a user-centered content delivery network. Globule integrates all necessary services into a single tool: document replication, selection of the most appropriate replication strategies on a per-document basis, consistency management, and automatic client redirection. Globule is implemented as a module for the Apache server. This enables administrators of non-replicated Apache servers to switch to replicated documents by simply compiling an extra module into their server and editing a configuration file. Globule will soon be made available under an open-source license.
References

[1] J. Dilley, B. Maggs, J. Parikh, H. Prokop, R. Sitaraman, and B. Weihl. Globally distributed content delivery. IEEE Internet Computing, 6(5):50-58, September-October 2002.
[2] J. Gwertzman and M. Seltzer. The case for geographical push-caching. In Proc. 5th Workshop on Hot Topics in Operating Systems (HotOS-V), Orcas Island, WA, May 1995. IEEE.
[3] B. Laurie and P. Laurie. Apache: The Definitive Guide. O'Reilly & Associates, Sebastopol, CA, 2nd edition, 1999.
[4] P. R. McManus. A passive system for server selection within mirrored resource environments using AS path length heuristics. Technical report, AppliedTheory Communications, Inc., June 1999. mcmanus/proximate.pdf.
[5] G. Pierre, I. Kuz, M. van Steen, and A. S. Tanenbaum. Differentiated strategies for replicating Web documents. Computer Communications, 24(2), January 2001.
[6] G. Pierre and M. van Steen. Globule: a platform for self-replicating Web documents. In Proceedings of the 6th International Conference on Protocols for Multimedia Systems, LNCS 2213, pages 1-11, October 2001.
[7] G. Pierre, M. van Steen, and A. S. Tanenbaum. Dynamically selecting optimal distribution strategies for Web documents. IEEE Transactions on Computers, 51(6), June 2002.
[8] M. Rabinovich and A. Aggarwal. RaDaR: A scalable architecture for a global Web hosting service. In Proceedings of the 8th International World-Wide Web Conference, May 1999.
[9] M. Szymaniak. A DNS-based client redirector for the Apache HTTP server. Master's thesis, Vrije Universiteit, Amsterdam, The Netherlands, July 2002. Available from globule.org/.
[10] M. van Steen, P. Homburg, and A. S. Tanenbaum. Globe: A wide-area distributed system. IEEE Concurrency, 7(1):70-78, January-March 1999.