Performance Analysis of a Query by Example Image Search Method in Peer to Peer Overlays



AGH University of Science and Technology
Faculty of Electrical Engineering, Automatics, Computer Science and Electronics

Ph.D. Thesis

Michał Grega

Performance Analysis of a Query by Example Image Search Method in Peer to Peer Overlays

Supervisor: Prof. dr hab. inż. Zdzisław Papir

AGH University of Science and Technology
Faculty of Electrical Engineering, Automatics, Computer Science and Electronics
Department of Telecommunications
Al. Mickiewicza 30, Kraków, Poland

Copyright © Michał Grega, 2011
All rights reserved
Printed in Poland

Acknowledgements

It is a pleasure to thank the many people who made this thesis possible. First of all, I would like to show my gratitude to my supervisor, Prof. Zdzisław Papir. I am grateful for his guidance, his valuable remarks and his continuous support. Without the kind and warm atmosphere of knowledge exchange and support he created in our research team it would be much harder, if not impossible, for me to finish this work. I am grateful to Dr.-Ing. Nicolas Liebau and his team from the Technische Universität Darmstadt, who advised and supported me during and after the CONTENT project. I would also like to wholeheartedly thank my colleagues, Mikołaj Leszczuk and Lucjan Janowski, who devoted their knowledge and time to discussing my work. Special credit goes also to my friends from our room 309, Kasia Kosek-Szott, Szymon Szott and Piotrek Romaniak. We went together through the PhD course, supporting and helping each other not only in our professional careers but also in the private side of our lives. My warm thoughts go to my parents and their enormous effort, which brought me to this point. Last, but not least, I would like to thank my wife, Kate, for her patience and loving words of support, which allowed me to advance in times of doubt. Thank you!


Abstract

Nowadays the traditional division between content producers and consumers is becoming blurred. A new type of user emerges, the prosumer, who is at the same time the producer and the consumer of multimedia content. This is the direct result of easy access to content creation tools, such as digital cameras and camcorders, the popularity of content manipulation tools, such as open source image editing tools, and open access to multimedia hosting services, such as Google Picasa, YouTube, Flickr and many others. This mass production and availability of content creates a new challenge. It is not enough to create content and make it available on the Internet. It is necessary to allow the content to be easily searched for and accessed by other users. We are used to searching for items in the network with sets of keywords. Users, however, have no incentive to tag each of their movies and photographs with keywords. This is where advanced multimedia search methods, such as Query by Example, can be successfully utilized. Another emerging problem is the economic effort required to set up a new multimedia service. The volume of multimedia data, and the requirements on storage space, computation power and bandwidth, allow only the largest market vendors to easily introduce new multimedia services. An answer to this problem is the departure from the traditional client-server architecture towards a Peer-to-Peer network, in which network upkeep costs are shared among the network users. This dissertation presents work in the cross-domain area of Peer-to-Peer networking and advanced multimedia search methods. The author identifies and solves the problems encountered during the application of the Query by Example search technique in both structured and unstructured Peer-to-Peer overlays.
The author proves that implementation of a Query by Example service in Peer-to-Peer overlays is possible while maintaining the quality offered by a centralized solution and the benefits of Peer-to-Peer overlays at the same time.


Contents

Acknowledgements
Abstract
1 Introduction
   1.1 Motivation and Goal of the Research
   1.2 The Concept of the P2P QbE System
   1.3 Major Research Problems
   1.4 Research Approach
   1.5 Thesis
   1.6 Cooperation and Publications
   1.7 Structure of the Dissertation
2 State of the Art
   2.1 P2P Architectures: Client-Server Architecture; Centralised P2P; Unstructured P2P; Hybrid P2P; Structured P2P; Summary of P2P Architectures
   2.2 The Metadata Management: The Metadata Systems - Dublin Core; The Metadata Systems - MPEG
   2.3 Search Methods and Architectures of Metadata: Classification Based on Input Method; Classification of Search Methods Based on Complexity; Advanced Search Methods and Users; Summary on Classification of Search Methods; The Architectures of Metadata Search
   2.4 Benchmarking: The Benchmarked Parameters; Search Accuracy Benchmarking
   2.5 Simulation Tools
   2.6 Related Work
   2.7 Conclusions
3 Problem Approach
   3.1 Image Database
   3.2 Measurement of QbE Accuracy in Local Database: Measurement Methodology; QbE Methods; Description of the Experiment; Experiment Execution and Results; Experiment Conclusions
   3.3 Application of QbE in an Unstructured P2P Overlay: Implementation of the QbE Service; File Distribution and Popularity; Simulation Setup; Analysis of the Results; Conclusions
   3.4 Application of QbE in a Structured P2P Overlay: Routing in CAN Overlay; Implementation of the QbE Service in the CAN Overlay; Simulation Setup; Results; Conclusions
   3.5 Comparison of QbE in Structured and Unstructured P2P Overlays
   3.6 Research Conclusions
4 Summary

List of Abbreviations

ANSI – American National Standards Institute
CAN – Content Addressable Network
DCEMS – Dublin Core Metadata Element Set
DCMI – Dublin Core Metadata Initiative
DHT – Distributed Hash Table
DS – Description Schemes
ITU – International Telecommunications Union
MP – Mega Pixels
MPEG – Moving Picture Experts Group
NISO – National Information Standards Organisation
OCR – Optical Character Recognition
P2P – Peer-to-Peer
PSTN – Public Switched Telephone Networks
QbE – Query by Example
QoE – Quality of Experience
SATIN – European Doctoral School on Advanced Topics In Networking
TTL – Time To Live
VoIP – Voice over IP
XM – MPEG-7 Experimentation Model


Chapter 1

Introduction

This introductory Chapter presents the goal of the research, the thesis of the dissertation and the methodological approach used to solve the problem stated in the thesis. A discussion of the author's relevant publications and a presentation of the overall dissertation structure is provided as well.

1.1 Motivation and Goal of the Research

The goal of the presented research is to analyse the possibility of developing a content-based Query by Example (QbE) search system for a Peer-to-Peer (P2P) overlay. System performance will be assessed in terms of search accuracy and bandwidth utilization.

A Peer-to-Peer system is a self-organising system consisting of end-systems (called peers) which form an overlay network over the existing network architecture. The structure of the P2P network can significantly differ from the structure of the underlying physical network (Figure 1.1). Peers offer and consume services and resources. They are also characterised by a significant amount of autonomy. Services are exchanged between any participating peers. Such networks are gaining more and more popularity and attention both from users and researchers. On the one hand, this growing interest can be explained by numerous P2P-based applications, ranging from simple file sharing to more sophisticated services such as Voice over IP (VoIP) and online gaming. On the other hand, P2P networking is a challenging topic for researchers because of its distributed architecture, the networked cooperation of the peers and the lack of a central authority (in most P2P network architectures).

This research has been focused on one of the most popular applications of P2P, which is file sharing. File sharing in P2P networks has become significantly popular. A study performed in 2004 by the CacheLogic company gave a

conclusion that "Traffic analysis conducted as a part of a European Tier 1 Service Provider field trial has shown that P2P traffic volumes are at least double that of HTTP during the peak evening periods and as much as tenfold at other times" [59].

Figure 1.1: The concept of the overlay P2P network over an existing physical network

Figure 1.2 shows that in 2006 P2P services in Europe were responsible for 70% of the overall traffic. A more recent study [61] performed by the Canadian company Sandvine shows that P2P traffic is responsible for 35% of the downstream broadband traffic and over 75% of the upstream traffic in North America. A study performed by the Cisco company in 2010 [2] shows that the share of network traffic caused by P2P applications is being surpassed by traffic caused by video streaming (YouTube-like services), but it still is, and will remain, the second largest source of traffic in the networks. While the numbers vary between years and studies, due to different test regions and methodologies, the bottom line remains the same: P2P is responsible for a substantial share of the current global network traffic.

Rapid global growth in multimedia data has created new, previously unknown business opportunities. Data search itself has grown into a business sector which has allowed companies such as Google and Yahoo to become brands recognised worldwide. Multimedia is now rapidly moving into everyday life. Users are not only able to access media via radio, television and the Internet but now also create their own media content. Digital cameras, handheld camcorders, media-enabled mobile

phones and the widespread availability of content creation and editing software, previously available only to professionals, all contribute to this data volume growth. The traditional division of users has to be extended by a new category: the prosumers, who are at the same time the producers and the consumers of multimedia.

Figure 1.2: The percentage of Internet traffic generated by the P2P applications [59]

Data is also being shared within user communities via Web 2.0 services such as YouTube, Google Video and Flickr, and is also used in the media industries. User-generated content is of growing importance to news agencies, as it is often the fastest way to come up with breaking news. Decentralised architectures are becoming more significant in sharing user-created content.

This growth in the amount of multimedia data causes a new problem to emerge: effective search. Published media are rarely accompanied by a comprehensive textual annotation. Most users have no incentive to add even simple few-keyword metadata to their multimedia objects. The usual keyword-based search becomes ineffective, or simply impossible, in many unannotated repositories. This is where Query by Example (QbE) search may become useful, as it requires no textual description of the multimedia object to work.

Another research area of growing importance is Quality of Experience (QoE), a measure of the subjective quality experienced by the user. In the early days of the Internet the satisfaction of users was of negligible importance, although the complex problem of delivering a high-quality service to the end-user was well known from the experience with Public Switched Telephone Networks (PSTN). Nowadays, when the user can choose between different service providers, the service with the highest QoE level is the most successful one. In the presented research the proposed QbE solution will be assessed by QoE measurements.

This research has been devoted to one type of multimedia content: still images. Nevertheless, the research approach and the conclusions drawn from the presented results form a firm foundation for research on advanced search methods for other types of multimedia content such as audio, video and 3D objects.

1.2 The Concept of the P2P QbE System

The basic concept of the P2P QbE system is to support content-based QbE retrieval of images stored in a distributed manner in a P2P file sharing network. A use case of such a system is presented in Figure 1.3. The mechanics of the generic QbE system are as follows: the user provides an example of a searched item to the system in order to retrieve similar objects from the repository. The system calculates, on the client side, the features which describe the example. The features extracted from the example are called low-level metadata. The concept and definition of metadata is provided in Chapter 2.2. Low-level metadata is a numeric vector which represents signal properties of the image. It can be, for example, the dominant colour of the image or an edge histogram. These low-level features can be easily sent over the network, as their size in bytes is usually much smaller than the size of the original media.
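The extraction and comparison of low-level features described above can be sketched in a few lines. The descriptor below is a toy stand-in (the mean RGB colour of an image) for a real MPEG-7 descriptor, and the Euclidean metric is one possible choice of distance; both are illustrative assumptions, not the methods evaluated in the dissertation.

```python
import math

def extract_dominant_colour(pixels):
    """Toy low-level descriptor: the mean RGB of the image, a simplified
    stand-in for the MPEG-7 Dominant Colour descriptor."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))

def descriptor_distance(a, b):
    """Euclidean distance between two feature vectors (one possible metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# The example image and two candidates, as lists of RGB pixels.
example    = [(200, 30, 30), (220, 40, 20)]   # reddish
candidate1 = [(210, 35, 25), (205, 45, 30)]   # also reddish
candidate2 = [(20, 30, 200), (10, 40, 220)]   # bluish

q = extract_dominant_colour(example)
d1 = descriptor_distance(q, extract_dominant_colour(candidate1))
d2 = descriptor_distance(q, extract_dominant_colour(candidate2))
assert d1 < d2  # the reddish candidate ranks as more similar
```

Ranking the candidates by ascending distance yields the result list; the feature tuples, not the pixel data, are what would travel through the overlay.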
The features of different media objects can be numerically compared in order to calculate the similarity between the media items. Figure 1.4 depicts an example of a QbE query. In this case a dominant colour descriptor from the MPEG-7 descriptor set (Chapter 2.2.2) was used to retrieve similar images from a local database. As can be observed, all returned images have a dominant colour tone similar to the provided example, but the content is different.

In the distributed scenario (as depicted in Figure 1.3) the low-level metadata is routed through the P2P overlay to the queried nodes. This is the significant difference between the solution proposed in the dissertation and searching within a local database of images. The queried side is a set of P2P nodes selected by the routing algorithm of the P2P overlay. The

queried side stores the pre-calculated low-level metadata of the possessed media. Upon receiving the low-level metadata of the example, the queried side calculates the distances between the metadata of the example and the stored metadata (under a predefined metric). The queried side returns a list of distances and thumbnails for the most similar objects to the querying side.

Figure 1.3: The use case of the QbE search system

1.3 Major Research Problems

The presented research integrates two recognised techniques: the QbE method of search and the P2P content delivery method. This is the primary source of new research challenges.

The QbE search process, when implemented in a P2P overlay, will require more bandwidth than the traditional keyword query. The extracted features of media items are stored and transmitted in the form of a vector of numbers. While the

volume of this vector is small when compared to the size of the media file, it is still significantly larger than a few-byte keyword. This may cause the network to become congested with queries, especially in the case of unstructured overlays. Another source of network congestion are the thumbnails of media files. In the case of images the thumbnails are very low resolution versions of the images. They allow the user to decide whether a resulting image meets their search criteria.

Figure 1.4: An example of the QbE search principle

It is critical to choose the proper overlay architecture for the search service. Different architectures have different properties and not every architecture may be able to provide the QbE service. The main difference between the QbE retrieval system for a local media database and the QbE search system in a P2P network is that in the case of the local database search all media in the database are available during the search process. In the case of P2P search, the available set of media is only a subset of all media existing in the network. Another effect is caused by the dynamic nature of the network: the participants join and leave the P2P network during its operation. This property of a P2P network is called churn [67]. As churn adds another level of complexity to the presented research problems, and is not the most significant factor influencing the performance of the search service, it was decided to analyse the performance of the QbE application in a P2P overlay in the absence of churn.

Content-based search is resource-consuming as it requires the analysis of the image itself. The extraction of the descriptor values can be done only once for each

image and stored for later use, but this can be a problem for large databases. The comparison of the descriptors requires considerable computing power, as the descriptors are typically one-dimensional vectors of as many as 200 values. A single comparison can be done almost instantly, but the approach does not scale to large local repositories. Here, the great advantage of the distributed computing power of P2P overlays can be utilised. A single node is unlikely to be in possession of more than several thousand images. Search in such a small repository can be performed very quickly, and all the nodes can do it in parallel.

Concluding, the following theoretical and practical problems arise and will be solved in order to introduce the QbE search technique into the P2P architecture:

1. It should be determined which metadata frameworks and QbE techniques are available and which are suitable for use in P2P overlays.

2. It should be recognised which P2P overlays are the best candidates for the introduction of such a service.

3. It should be decided how to compare the performance of the QbE search method in both centralised and P2P environments.

Also, a sufficient number of practical experiments has to be conducted in order to analyse the performance of the QbE search in centralised and P2P environments, and a comparison of the performance in the different environments has to be made.

Search time also has to be taken into account. It is very difficult to assess the search time in the simulation environment. Even if such an assessment is made, it will lack precision. This is due to the nature of the simulation environment, which simplifies the protocol stack and the behaviour of peers. The protocol stack implemented within the peers in the simulator is simplified when compared to a real implementation, in order to allow for resource-efficient simulations.
The peers are implemented in a uniform way in the simulator, which means that each peer has similar resources at its disposal. This is very different from the real networking environment, in which peers differ substantially from each other in terms of available resources and the way these resources are utilised. Also the network links, unlike in the simulator, differ from each other in terms of available bandwidth and latency. In the author's belief the only proper method of accurate search time assessment in this case is implementation and measurement in a real network environment. Such a task is out of the scope of the presented research and is considered one of the possible future research directions.
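Two of the concerns raised in this Chapter, the extra query traffic and the cost of descriptor comparison, can be made concrete with a back-of-envelope calculation. All figures below are illustrative assumptions, not measurements from the dissertation:

```python
# (a) Query traffic: a descriptor-based query vs. a keyword query.
keyword_bytes = 16                  # a short keyword query (assumed)
descriptor_bytes = 200 * 4          # ~200-element feature vector of 32-bit values
print(descriptor_bytes / keyword_bytes)   # each query carries 50x more payload

# (b) Search time: one central scan vs. parallel per-node scans.
t_compare = 5e-6                    # seconds per descriptor comparison (assumed)
corpus = 10_000_000                 # images in the whole system (assumed)
per_node = 5_000                    # images held by a single peer (assumed)
central_scan = corpus * t_compare   # one machine compares everything
p2p_scan = per_node * t_compare     # each peer scans only its own share, in parallel
print(central_scan, p2p_scan)
```

Under these assumptions a central linear scan takes 50 s while each peer finishes its own share in 25 ms; the price is the fifty-fold larger query payload, multiplied by every node the query reaches. Network latency and result aggregation are deliberately not modelled here.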

1.4 Research Approach

The proposed methodological approach is depicted in Figure 1.5. In order to reach the research goal of designing the QbE search system for the P2P overlay, seven research tasks need to be accomplished.

Figure 1.5: The research approach

1. The first research task is a detailed analysis of the existing state of the art of both P2P network architectures and QbE search. In the area of P2P networks the existing architectures (called overlays) will be analysed in order to specify the requirements for the QbE search system. The analysis of multimedia search methods will give an overview of the recent achievements in this research area. Special attention will be paid to the analysis of Query by Example methods for multimedia in local databases, as it is foreseen to adapt the methods used in such databases for the needs of the designed system. In order to create a system based on extraction of metadata it is required to analyse the standards of metadata extraction, comparison and storage. Special attention will also be paid to the analysis of existing, similar systems. The state of the art in both P2P systems and QbE systems is presented in Chapters 2.1, 2.2 and 2.3.

2. The second research task is to develop a method of measurement of the performance of a distributed QbE search service. Both objective and subjective (QoE) metrics will be taken into account. The main objective measures of search performance are accuracy and search time. These aspects have been chosen during the preliminary research as the most relevant.
The existing methods of analysis of search quality are dedicated to multimedia databases. According to the current state of the art there is no popular,

versatile and widely used measurement environment for search services in P2P networks. The methods of subjective quality assessment for video and audio multimedia services are well defined in the International Telecommunications Union (ITU) recommendations. For measurement of the QoE of other multimedia services, including search, there are no standards or recommendations. A set of proposed measurement methods is described in Chapter 3.2.

3. There are many techniques used for image QbE search. Some of these techniques are defined within the MPEG-7 standard [37] and some, such as the PictureFinder software, are proprietary solutions developed by companies and research institutions. These methods yield different results and offer different quality of search results. For the purpose of researching the QbE technique in the P2P environment it is required to identify the method which offers the highest QoE to the service users and is applicable in the scope of the presented research. To identify such a method, a set of QbE methods will be implemented in a local database of images. The performance of these methods will be assessed and the best QbE method will be selected for implementation in the P2P overlay. The measurements of performance in the centralised database will also serve as a reference for comparison of search performance between centralised and distributed QbE systems. This problem is addressed in Chapter 3.2.

4. It is planned to implement the QbE method selected in Task 3 in two architectures of P2P overlays: structured and unstructured overlays. This will allow for comparison of the performance of the service in these two, significantly different, P2P architectures. The unstructured overlays are of simple construction. No complex query routing algorithms are implemented and the search process is often based on flooding the network with search queries.
Such networks are characterised by low construction overhead at the cost of lower search performance. The experiment and its results are covered in Chapter 3.3.

5. Structured overlays are characterised by a more complex construction and implemented routing routines. Thanks to that, higher performance is delivered. Most structured P2P overlays are based on Distributed Hash Tables (DHT). An example of a structured P2P overlay is CAN (Content Addressable Network). For the research purposes the implementation of the QbE service will be done in a simulation environment. Simulation of P2P overlays is the only feasible way to research these network architectures. As scalability is of utmost importance, it is impossible to create a P2P overlay in laboratory conditions. Even such structures as PlanetLab offer at most

approximately a thousand nodes, whereas real P2P overlays are often composed of millions of nodes. The simulation environment will be chosen after an analysis of the existing P2P simulation tools. The experiment and results are described in Chapter 3.4.

6. During the sixth stage of the research the performance of the QbE service implementations in the structured and the unstructured P2P overlay will be analysed. It is foreseen to analyse the performance of the QbE service in terms of accuracy when compared to the centralized solution. Bandwidth consumption will also be covered.

7. The last stage of the research is to compare the performance of the QbE search service in the local image database (Task 3) and in the P2P overlays (Tasks 4 and 5). The comparison of the performance of the local and distributed QbE services will allow to prove the thesis of the dissertation.

1.5 Thesis

The thesis of the dissertation is as follows:

It is possible to use a Query by Example mechanism for a search of images based on their low-level description in the unstructured and structured P2P overlays at accuracy comparable to the centralized solutions.

1.6 Cooperation and Publications

The research presented in this dissertation was partially funded by the CONTENT (Content Networks and Services for Home Users) Network of Excellence (No ). Fruitful cooperation within the CONTENT NoE resulted in the author's participation in the European Doctoral School on Advanced Topics In Networking (SATIN). The results of the research were presented and discussed during the SATIN meetings, both with fellow PhD students and with senior researchers from European universities. The PhD thesis and research were supported by Nicolas Liebau, PhD Eng. from the Technical University Darmstadt, Germany. Furthermore, the PhD research and preparation of the thesis were funded by a national PhD grant (no. N N ).
The subjective tests of the MPEG-7 descriptors were performed for the needs of the eContentplus project GAMA.

During the research of the topics described in this dissertation the author has prepared several publications, which are listed and commented on below in chronological order:

1. "Implementation and Application of MPEG-7 Descriptors in Peer-to-Peer Networks for Search Quality Improvement - Introduction to Research" [17] was presented during the CONTENT PhD workshop in Madrid, Spain. The paper presents the general idea of the system as well as the basic methodological approach to the problem.

2. "Content-based Search for Peer-to-Peer Overlays" [22] was presented during the PhD workshop of the MEDHOCNET conference in Corfu, Greece. It presented a detailed description of the methodology as well as a detailed approach to the problem of benchmarking and measurement of a distributed search service.

3. "Quality of experience evaluation for multimedia services" [21] was presented at a plenary session of the KKRRiT 2008 conference in Wrocław, Poland. The paper described the differences between objective and subjective evaluation methods of multimedia services.

4. "Benchmarking of Media Search based on Peer-to-Peer Overlay Networks" [25] was presented during the INFOSCALE 2008 conference workshop in Vico Equense, Italy. The publication presents the outcomes of the P2P benchmarking activity which was carried out in the CONTENT NoE.

5. "Advanced Multimedia Search in P2P Overlays" [18] was presented during the Students Workshop of the IEEE INFOCOM 2009 conference in Rio de Janeiro, Brazil. The paper presents the basic concept of the P2P QbE system along with a brief summary of the results of the user test which was performed in order to select the optimal QbE method for further research.

6. "Ground-Truth-Less Comparison of Selected Content-Based Image Retrieval Measures" [20] was presented during the UCMEDIA 2009 conference in Venice, Italy.
The paper presents a comparison of different QbE methods assessed in a series of subjective experiments. The results obtained in these experiments are presented in Chapter 3.2.

7. "Wyszukiwanie przez podanie przykładu w sieci nakładkowej protokołu Gnutella" (Query by Example in the Gnutella Protocol Overlay Network) [19] was presented during the KKRRiT 2011 conference in Poznań, Poland. The paper was awarded the second-best young author paper award. The paper presents the results of the implementation of the QbE service in the Gnutella P2P overlay. The results are presented in Chapter 3.3.

1.7 Structure of the Dissertation

Chapter 2 presents the theoretical background of the dissertation. In this Chapter the existing P2P overlays are described and an overview of existing metadata management standards is provided. Afterwards, existing P2P benchmarking methods are briefly described. The Chapter is concluded with an overview of P2P simulation tools and a description of similar research projects.

Chapter 3 reports the results of the practical experiments. First, the performance of different QbE methods is assessed in subjective experiments. Afterwards, the implementation and simulation results of the QbE service in an unstructured and a structured P2P overlay are provided. A comparison of the two implementations is described.

Chapter 4 wraps up the results and the most significant conclusions. Suggestions for further research directions are provided as well.

Chapter 2

State of the Art

The investigated search service is created by applying content-based media retrieval techniques to a P2P file sharing overlay. Content-based media retrieval is a popular research topic; such services are typically created for centralized media databases. Much effort is also undertaken by researchers in the area of P2P overlays. The combination of both techniques will, on the one hand, open new possibilities for content providers and consumers but, on the other hand, is an up-to-date research challenge.

This Chapter describes basic concepts and the current state of the art in both research areas: P2P overlays and content-based image retrieval. An in-depth understanding of these research areas is required in order to identify the new research challenges which result from the combination of the underlying technologies. The description of the architectures of P2P networks is provided in Section 2.1. The existing systems of metadata storage are discussed in Section 2.2. In Section 2.3 the existing content-based retrieval systems are presented. Section 2.4 describes the benchmarking system used for the measurements of the system. It is followed by Section 2.5, which presents the existing software for simulation of P2P overlays. Section 2.6 is devoted to similar research and related work in the area.

2.1 P2P Architectures

The problem of how to store data and how to find it has existed since the first databases were created. This Chapter describes the problems and solutions associated with data storage and retrieval. The division of P2P systems presented in this Chapter is based on [65]. For the sake of simplicity the Chapter will focus on the file sharing service, but all the assumptions and techniques described

can be applied to any kind of service, such as instant messaging, VoIP or social networking.

2.1.1 Client-Server Architecture

The first approach to the mentioned problem is the client-server architecture. A file sharing service following the client-server paradigm consists of two kinds of nodes: a powerful server, whose job is to store the files, keep an up-to-date record of them, reply to queries and, finally, serve the clients with the file download functionality, and the clients, which are computers with very low resources available when compared to the server. The sole role of the clients is to be an interface between the server and the user.

The client-server architecture has numerous drawbacks. It is expensive to set up, as it requires a powerful computer. It is also expensive to support, as it requires a lot of bandwidth to deal with both queries and file downloads. The server creates a single point of failure for the file sharing service. The system does not scale without substantial hardware and bandwidth expenses. On the other hand, there are advantages of such a set-up. The service, being operated from a single server, is easy to control. Due to full control over the service it is also easy to create an economic model for it.

2.1.2 Centralised P2P

The costs and the scalability problems of the client-server architecture caused the first generation of P2P architectures to emerge. Historically, the first solution is the centralised P2P network, such as Napster [55]. The main goal of the creators of centralised P2P was to remove the load caused by the file downloads from the central server. In the centralised P2P architecture the service consists of a central server and nodes. The role of the central server is to maintain the index of the files available throughout the network. The files are kept in the nodes, and the nodes are responsible for supplying the resources required for the download.
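The division of labour in a centralised P2P service (index on the server, files on the nodes) can be pictured with a minimal sketch. The class and method names below are hypothetical, not taken from any real Napster implementation:

```python
class CentralIndex:
    """Napster-style central server: stores only an index of who has what;
    the files themselves never pass through it."""
    def __init__(self):
        self.holders = {}                    # filename -> set of node addresses

    def advertise(self, node, filenames):
        # A joining node announces the files it is willing to share.
        for name in filenames:
            self.holders.setdefault(name, set()).add(node)

    def query(self, filename):
        # The server answers with addresses only; the querying node then
        # downloads directly from one of them, bypassing the server.
        return sorted(self.holders.get(filename, set()))

server = CentralIndex()
server.advertise("10.0.0.1", ["song.mp3", "photo.jpg"])
server.advertise("10.0.0.2", ["song.mp3"])
print(server.query("song.mp3"))    # ['10.0.0.1', '10.0.0.2']
```

The sketch makes the single point of failure visible: if the `server` object disappears, no node can locate anything, even though every file is still held somewhere in the network.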
If a node wants to share a file in the network, it advertises the file and its metadata to the central server. If a node wishes to find and download a file, it sends a query to the central server, which responds with the address (or addresses) of the node (or nodes) holding the requested asset. The querying node then downloads the file directly from the node that has it, bypassing the central server. This architecture has benefits over the client-server architecture. It distributes the most resource-consuming part of the service, namely the file download. The concept of the service is relatively simple, as no sophisticated routing mechanisms are required, which results in a simple implementation. Thanks to the

central indexing server the network can be controlled and an economic model for the service can be created. On the other hand, the indexing server in centralised P2P networks is still a single point of failure. Load balancing, while much better than in the client-server solution, is also still far from perfect: nodes holding files which are popular among users have to sustain a larger load than nodes with unpopular content.

2.1.3 Unstructured P2P

The next step in the evolution of P2P overlays is the unstructured P2P architecture. In this solution there is no need for any central authority; both storage and search are distributed in the network. The network consists only of nodes, which are sometimes referred to as servents [10] (a word blended from "server" and "client"), as in one implementation of the unstructured P2P architecture, the Gnutella protocol in version 0.4. A node does not perform file upload in any form. Its sole responsibilities are to maintain connections with other nodes and to respond to queries for searched files. The search process is based on flooding the network with the query, in the hope that the query propagated through the network will eventually reach the nodes which possess the searched file. This architecture was the first one to remove the single point of failure from the design. If a node disconnects from the network, only its content becomes inaccessible; the network as a whole does not cease to function. The drawback of unstructured P2P systems is the huge signalling overhead required to maintain the connectivity of the nodes. The search process is also very inefficient, as query routing is based on the flooding principle.
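The flooding principle can be illustrated with a minimal sketch. The topology, node API and TTL value below are illustrative assumptions, not the Gnutella wire protocol: each query carries a time-to-live counter, visited nodes are not re-flooded, and every node holding the searched file reports a hit.

```python
from collections import namedtuple

# A toy overlay: who is connected to whom, and who holds which files.
Network = namedtuple("Network", ["topology", "files"])

def flood_query(net, start, wanted, ttl=3):
    """Breadth-first flooding with a TTL; returns the nodes holding `wanted`."""
    hits, visited = [], {start}
    frontier = [(start, ttl)]
    while frontier:
        node, t = frontier.pop(0)
        if wanted in net.files.get(node, ()):
            hits.append(node)               # this node possesses the file
        if t > 0:                           # TTL exhausted: stop forwarding
            for neigh in net.topology.get(node, ()):
                if neigh not in visited:    # avoid re-flooding visited nodes
                    visited.add(neigh)
                    frontier.append((neigh, t - 1))
    return hits

net = Network(
    topology={"A": ["B", "C"], "B": ["D"], "C": [], "D": []},
    files={"D": {"cat.jpg"}},
)
print(flood_query(net, "A", "cat.jpg", ttl=2))   # ['D']
```

Note how the result depends on the TTL: with `ttl=1` the query from "A" never reaches "D", which is exactly the inefficiency of flooding discussed above.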
Due to the totally distributed nature of the unstructured P2P system the administrator of the network loses control over it, which, apart from creating legal problems with digital rights management (also referred to as "piracy"), makes the creation of an economic model for the service very challenging.

2.1.4 Hybrid P2P

Hybrid P2P networks emerged as a natural combination of the design paradigms of centralised P2P systems and unstructured P2P architectures. In hybrid P2P systems the nodes form a two-tier topology. Nodes that have an abundance of resources (computing power and bandwidth) are elevated to the status of super peers (which can also be understood as a kind of local servers).

The super peers are responsible for maintaining the indexes of small chunks of the network (typically up to 100 nodes, as in Gnutella version 0.6 [47]). When a node connects to the network it has to establish a connection with the super peer responsible for its part of the network. After the connection has been established, the node advertises the metadata of the files it wants to share to the super peer. In this way the super peer holds a complete index of the files stored in its part of the network. If a node wants to find a file, it sends a query to its super peer. The super peer checks whether the file is available within the part of the network it indexes. If so, it returns the address of the node which has the searched file. If no result can be found in the super peer's index, the query is propagated to other super peers by flooding, just as in unstructured P2P systems. The benefit of the hybrid P2P architecture over an unstructured one is the reduced amount of signalling required to maintain the network. The removal of the single point of failure is partly preserved: the disconnection of a regular peer does not influence the network, although the disconnection of a super peer requires either the selection of a new super peer or the other super peers taking over the functions of the disconnected one. The main drawback of the hybrid P2P architecture is the uneven load placed on the nodes forming the network. The super peers need to withstand much more traffic and have to commit more computing power than the regular nodes, without any substantial benefit.

2.1.5 Structured P2P

The structured P2P architecture is the most recent approach to the design of P2P systems. The novelty of the concept is the introduction of a logical link between the data (a file) and an address in the addressing space of the P2P overlay. A single peer in the structured P2P overlay is responsible for a chunk of the addressing space.
If a new file is to be placed in the network, it has to be mapped to the addressing space of the network. The mapping is done with a hashing algorithm, for example SHA-1 [11]. After the file has been mapped to an address, the node responsible for the part of the addressing space that the address belongs to has to be informed. There are two general storage strategies [65]. The first is direct storage, in which the file is uploaded to the node responsible for the address linked to the file. The second is indirect storage, in which the responsible node is informed only of the file's metadata and location. Search in the structured P2P overlay is simple: if a node knows the hash of the searched file, it automatically knows the part of the addressing space it needs to query.
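The hash-to-address mapping and the indirect storage strategy can be sketched as follows. This is a simplified illustration, not the routing of any particular DHT: the small address space, the equal node ranges and the function names are assumptions made for the example.

```python
import hashlib
from bisect import bisect_right

def address(key, space=2**16):
    """Map a file key into a small illustrative address space via SHA-1."""
    return int.from_bytes(hashlib.sha1(key.encode()).digest(), "big") % space

def responsible_node(addr, node_starts):
    """node_starts: sorted start addresses of the ranges owned by each node.
    Returns the index of the node whose range covers `addr`."""
    return (bisect_right(node_starts, addr) - 1) % len(node_starts)

nodes = [0, 16384, 32768, 49152]        # four nodes with equal address ranges
addr = address("vacation.jpg")
owner = responsible_node(addr, nodes)
# Indirect storage: node `owner` is told only the metadata and the location
# of vacation.jpg. Lookup is direct: any peer hashing "vacation.jpg"
# computes the same address, hence the same responsible node, with no flooding.
```

This is why search overhead is low: the query is routed straight to one known part of the addressing space rather than broadcast to the whole overlay.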

The benefit of this architecture is a low search overhead. The problem of load balancing is also addressed by this design: if a given piece of content is popular among users, it is enough to increase the number of nodes responsible for that part of the addressing space in order to split the load between them. The main drawback of the structured P2P overlay is its complex construction, which makes the design and deployment of structured P2P networks a challenging task. Nevertheless, several structured P2P overlay designs have been proposed, such as Chord [66] or CAN [60].

2.1.6 Summary of P2P Architectures

P2P systems have been actively developed since at least 1999 as an answer to the challenge of shifting load from the server to the clients. Different architectural solutions offer a different trade-off between the communication overhead and the storage cost per node (Figure 2.1), with the client-server and unstructured P2P solutions at the opposite poles.

Figure 2.1: Storage cost per node versus communication overhead per node (based on [65]); the plot orders client-server, centralised P2P, DHT, hybrid P2P and pure P2P along this trade-off.

In order to analyse the behaviour and performance of the QbE service in P2P overlays, two different P2P architectures were chosen for implementation: Gnutella v. 0.4, which represents the pure P2P paradigm, and the Content Addressable Network (CAN), which represents the DHT-based P2P architecture.

2.2 The Metadata Management

As the term "metadata" was introduced without any formal definition, there are several ways to define what metadata really is. The most common and informal understanding of metadata is "data about data" [68]. This definition provides the best description of the nature of metadata. To give a better understanding of the concept, two additional, complementary definitions can be introduced. A summary report presented by the Committee on Cataloging: Description & Access defines metadata as "structured, encoded data that describe characteristics of information-bearing entities to aid in the identification, discovery, assessment, and management of the described entities" [68]. Additionally, Bulterman gives the following definition: "[Metadata is a set of] optional structured descriptions that are publicly available to explicitly assist in locating objects" [8]. Several metadata systems were introduced as standards. The two most popular are presented here in more detail: Dublin Core and the MPEG-7 standard.

2.2.1 The Metadata Systems: Dublin Core

The Dublin Core¹ metadata standard is a simple (referred to as "the metadata pidgin for digital tourists" [30]) and straightforward standard for information resource description. Preliminary work on the proposal of the standard started in 1995 and is continued by the Dublin Core Metadata Initiative (DCMI). The standard proposal was accepted by the National Information Standards Organisation (NISO), an association accredited by the American National Standards Institute (ANSI) [56]. The Dublin Core has also been published as a Request for Comments (RFC) [70]. The goals of the Dublin Core metadata standard are as follows [30]:

1. The set of metadata for a media file is supposed to be kept as small and simple as possible.

2. The semantics used by the standard are supposed to be well understood and defined. This makes the metadata human-readable.

3.
The standard is supposed to be international, having numerous language versions.

4. The standard is supposed to be easily extensible. The standardising organisation is working on interfaces which will allow other metadata sets to be incorporated into the standard.

¹ The name Dublin Core refers to Dublin, Ohio, USA, where the first initiative of a metadata standard emerged, not to Dublin, Ireland, which is a common misunderstanding.
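As an illustration of the standard's simplicity and human-readability, a minimal Dublin Core record might look as follows. The values are invented for this example; the element names come from the standard's 15-element set, serialised here in the common oai_dc XML binding.

```xml
<!-- Illustrative Simple Dublin Core record (invented values). -->
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Vesuvius Seen from Pompeii</dc:title>
  <dc:creator>Jane Example</dc:creator>
  <dc:subject>volcano; archaeology</dc:subject>
  <dc:date>2010-07-15</dc:date>
  <dc:type>StillImage</dc:type>
  <dc:format>image/jpeg</dc:format>
</oai_dc:dc>
```

Even without knowing the standard, a human content manager can read such a record, which is precisely the design goal stated above.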

There are two levels of the Dublin Core Metadata Element Set (DCMES): the Simple DCMES and the Qualified DCMES. The Simple DCMES consists of a set of 15 high-level elements, such as Title, Subject or Creator. The Qualified DCMES is an extension of the Simple DCMES and introduces three new elements: Audience, Provenance and RightsHolder. Additionally, the Qualified DCMES introduces a set of refinements, or qualifiers, which narrow the semantics of an element. Four rules define the relationship between metadata and its data in the Dublin Core standard [30]:

1. The standard defines a set of elements and element refinements, along with a formal registry. These should be used by content managers as a best practice.

2. The one-to-one principle: Dublin Core compliant metadata describes a single instance of a media item, not a whole class of media. In other words, each media manifestation has its own Dublin Core metadata.

3. The dumb-down principle: a media manager (either a piece of software or a human) should be able to ignore a qualifier and use the value of the underlying element alone. This may lead to a loss of accuracy, but such a case should be handled.

4. The metadata should be constructed in such a way that it can be parsed and used both by software and by human content managers.

The Dublin Core standard has many advantages, such as an overall user-friendliness which allows human content managers to access and understand the metadata. It has, however, a disadvantage which eliminates it as a candidate for the described search system: it does not define methods for the extraction and comparison of low-level metadata. The standard focuses only on high-level metadata. It is worth mentioning that some of the existing P2P networks allow a simple text-based query to be narrowed by the input of metadata. Such existing implementations (e.g.
the Kad implementation in the eMule client) allow, for example, the author or the title of the searched media to be specified. It is also possible to narrow down the search by adding some content-based metadata, such as the bitrate, to the query. These content-based metadata fields are, unfortunately, far too primitive to allow a significant improvement in search quality.

2.2.2 The Metadata Systems: MPEG-7

The MPEG family of standards consists of five well-established groups of standards [53]. MPEG-1 [34] and MPEG-4 [36] are standards for multimedia compression,

storage, production and distribution. MPEG-2 [35] is a standard for audio and video transport for broadcast-quality TV. MPEG-21 [38] is defined as an open framework for multimedia content and is still under development. The MPEG-7 standard was proposed by the Moving Picture Experts Group (MPEG) as an ISO/IEC standard in 2001 [37]. The formal name of the standard is Multimedia Content Description Interface. The main goal of the MPEG-7 standard can be described as "to standardise a core set of quantitative measures of audio-visual features called Descriptors (D), and structures of descriptors and their relationships, called Description Schemes (DS) in MPEG-7 parlance" [53]. The most important requirements set for MPEG-7 are as follows [50]:

Applications: the standard can be applied in many environments and to numerous tasks, ranging from education and tourist information to biomedical applications and storage. One of the most important areas of application of the MPEG-7 standard is search and retrieval services [53].

Media types: the MPEG-7 standard addresses many media types, including images, video, audio and three-dimensional (3D) items. This diversity makes the standard all-purpose and allows the development of complex systems which integrate many types of media content.

Media independence: the MPEG-7 metadata can be separated from the media and stored in a different place, also in multiple copies. This feature makes the standard useful e.g. in P2P environments.

Object-based approach: the MPEG-7 standard structure is object-oriented, which is a desired feature in the development of advanced metadata systems. An object-based approach allows the development of a hierarchy of descriptions in which one description inherits parts of its structure from another.

Abstraction levels: the MPEG-7 standard covers all levels of abstraction of the description of the media.
The lowest, machine-level descriptions of the media include signal-processing features such as spectral features in the case of audio or colour structure descriptors in the case of images. These are utilised for QbE search. On the other hand, MPEG-7 also supports the high, semantic level of description of the media. In the scope of this research the low-level descriptions of the media are utilised.

Extensibility: the structure of MPEG-7 is open to extensions of the basic set of descriptors. This feature is very valuable, as new types of media are constantly emerging and cannot all be covered at the moment of standardisation.
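The use of low-level descriptors for QbE can be sketched in a few lines. This is a hedged illustration, not the MPEG-7 matching procedure: the descriptors are treated as plain feature vectors (a toy 3-bin colour histogram here) and compared with an L1 distance, whereas real MPEG-7 descriptors such as Color Structure have their own standard-specific extraction and matching rules.

```python
def l1_distance(d1, d2):
    """L1 (city-block) distance between two descriptor vectors."""
    return sum(abs(a - b) for a, b in zip(d1, d2))

def qbe_rank(query_desc, repository):
    """repository: {file name: descriptor vector}; closest files rank first."""
    return sorted(repository, key=lambda f: l1_distance(query_desc, repository[f]))

repo = {
    "sunset.jpg":  [0.8, 0.1, 0.1],   # toy 3-bin colour histograms
    "forest.jpg":  [0.1, 0.7, 0.2],
    "seaside.jpg": [0.3, 0.2, 0.5],
}
print(qbe_rank([0.7, 0.1, 0.2], repo))   # sunset.jpg ranks first
```

The query is itself a descriptor extracted from an example image, which is the essence of Query by Example: similarity is computed in descriptor space, not over textual annotations.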

Several terms introduced by the MPEG-7 standard that will be used in the scope of this dissertation are presented below and in Figure 2.2 [54]:

Data: any kind of multimedia content which can be described with the use of the standardised metadata.

Feature: a significant extractable characteristic of the data.

Descriptor: defines the syntax and semantics of the representation of a feature.

Descriptor value: an instance of a descriptor for a particular feature and a particular piece of data.

Description scheme: provides information on the structure of and relations between its parts. The parts of a description scheme can be both descriptors and other description schemes.

Description Definition Language (DDL): an interface which allows the existing descriptors to be extended and new ones to be created.

Figure 2.2: The relations between the parts of the MPEG-7 standard [54]. The figure distinguishes the Description Definition Language, description schemes and descriptors defined in the standard from those defined outside it.

To allow a better understanding of the above definitions, the following example can be given (based on [54]): a description scheme Shot Technical Aspects consists of two descriptors: Lens, which gives information on the focal length of the lens used, and Sensor size, which gives information on the size of the camera sensor in megapixels (MP). A descriptor value for the Lens descriptor can be, for example, 300 mm, and a descriptor value for the Sensor size descriptor can be e.g. 8 MP. In the case of the low-level descriptors the standard defines, in most cases, the way the descriptor value is extracted. The extracted descriptor has the form of

a vector. The standard does not, except for a few cases, define a metric in which the descriptor values can be compared. However, such metrics are known, and methods for calculating the distance between two low-level descriptor values are described in the literature [50]. The MPEG group, apart from providing the standard itself, provides reference software called the eXperimentation Model (XM). This software tool allows the extraction and comparison of descriptor values and can serve as a reference for testing other MPEG-7 based tools. An interesting inconsistency in the standard can be observed here: the MPEG-7 standard does not provide a method of descriptor value comparison, whereas the XM, being a part of the standard, does so. As the MPEG-7 standard is one of the most advanced achievements in metadata management, it is used in the presented research.

2.3 Search Methods and Architectures of Metadata

This section presents a classification of the search methods used for the retrieval of media. The overview of the classification is depicted in Figure 2.3.

2.3.1 Classification Based on Input Method

Most of the existing search systems are based on text search. The user inputs a text string which is then searched for in the repository. The search can be exact (where only identical hits are returned) or for similar hits (for example, single words from the query). For some applications it is useful to employ a fuzzy textual search method. In this kind of search a distance is calculated from the query to the textual strings in the repository. The distance is calculated in a dedicated metric, e.g. the Levenshtein edit distance [44]. This approach allows for a successful search when the user misspells a word. It can also be helpful if the words in the database are spelt incorrectly. The textual content in the repository may have different origins. The repository can simply consist of textual documents, as in the case of Internet search.
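The fuzzy matching just described can be sketched with the classic Levenshtein edit distance, a dynamic-programming count of the insertions, deletions and substitutions needed to turn one string into another:

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (two-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

print(levenshtein("guitar", "gitar"))   # 1: a misspelled query still matches
```

A fuzzy search simply ranks the strings in the repository by this distance to the query, so small misspellings on either side shift a result down the list instead of hiding it entirely.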
In the case of media repositories the textual search can be performed on the file names (the most common case in P2P networks) or on manual annotations assigned by the repository owners. Using this search method we can search for movies with a given actor by typing the name and hoping that the movies in the repository have been tagged with the actor's name. The effectiveness of this way of searching for media therefore depends on the accuracy of the textual description of the media

and is independent of the media itself. To summarise, textual search for media is the easiest but also the most primitive method of search.

Figure 2.3: The classification of search methods. The methods are classified by user input method (textual, query by example), matching method (exact, fuzzy), source data (low-level features, high-level features, semantic), source data origin (automatic annotation, manual annotation) and complexity (single-stage, multi-stage).

Content-based search allows media to be searched based on their contents. A textual description can be automatically extracted from the media, for example by means of Optical Character Recognition (OCR), face recognition or speech recognition. Such content-based, automatically generated metadata can be accessed via the previously described textual search. Now, a textual search for the name of an actor can yield better results if the repository has been processed by a face recognition algorithm. More advanced content-based search techniques utilise the low-level features of the video, such as the dominant colour, a shape histogram or similar. This gives an opportunity to perform search by providing an example. If the repository provides tools for low-level content-based search, then with a Query by Example (QbE) search method it is possible to search for videos or scenes which depict an actor by providing a visual example of this actor. The most advanced search tools utilise semantic information extracted from the media. If a repository provided tools for the extraction of and search within semantic data derived from the content, it would be possible to perform a textual query in the form of a sentence: "I am searching for love scenes with an actor talking to a woman." From the above examples it is easy to notice that the most versatile and useful search system should incorporate all of the above-mentioned search methods.
If such a system existed, it would be possible to perform the query: I am looking

for scenes from movies with the actor from the provided example, who is wearing a blue suit and talking to a woman.

2.3.2 Classification of Search Methods Based on Complexity

Search methods can be classified as single-stage and multi-stage methods. In single-stage search methods the user provides a query and the search process is concluded when the user is provided with a list of relevant results. This method is simple and does not require much input from the user. In some systems the user, after receiving the results, has the possibility to apply filters or perform a further search within the results; this filtering and/or search is performed locally. The other category of search mechanisms is multi-stage search systems. In such systems the first part of the querying process is identical to the single-stage search. However, upon receiving the results the user may refine his query by utilising the received results. The refined query is executed again and is expected to yield better results than the initial query. Multi-stage search systems are supposed to allow for a more effective search. They do, however, require more effort and expertise from the user.

2.3.3 Advanced Search Methods and Users

The natural question which arises here concerns the usefulness of advanced search tools. We have got used to textual search both for documents and for media and may simply not feel the need for more advanced tools. Such tools, however, may be useful both for professionals and for prosumers. Two usage scenarios are provided below. These scenarios have been developed by the author in cooperation with content providers and specialists [24].

Alice, a 28-year-old history teacher, is preparing an ancient history lesson for her class. She wishes to show pictures from Pompeii showing the Vesuvius volcano in the background. She has a problem, however: the only photos she finds on the Internet are thousands of family photos shot at Vesuvius.
She knows that there are many good pictures in the network, but she cannot find an appropriate one straight away, only after browsing hundreds of pictures.

Juan is a 46-year-old violin player and composer. Today, on his way home on the train, he heard a fascinating Celtic tune as someone's mobile ring tone. He remembered the tune perfectly thanks to his musical talent. He would really like to know more about this tune he hums all the time...

Alice runs a peer-to-peer client on her notebook and uploads her photo as a query example. In the returned results she quickly finds an appropriate photo for her lesson.

Juan runs the same client on his computer and hums the tune into the microphone to use it as a query over the content-aware, worldwide, distributed multimedia P2P store. In the results he finds the full song and a concert description, and the music file together with the DRM information needed to retrieve it. He also gets a peer-cast online radio station that is just playing the whole song or did so within the last week. He realises that the peer-cast station plays the kind of Celtic music he likes all the time and finds out that it is run by a Celtic band that wishes to spread its music live to a wide audience over the network. After a few days of listening, he joins the Celtic band community that is running the peer-cast station and shares his own works.

Expert publications stress that users are becoming media producers and consumers at the same time ("prosumers"). Society is moving from a common, unified mainstream market to individual and fragmented tastes [23]. This trend can be depicted as a long-tail effect, as in Figure 2.4. On the left side a vertical bar represents the standard model of multimedia production, where a few broadcast stations produce content for masses of consumers. On the right side of the figure the horizontal bar depicts the huge society of bloggers, photo-bloggers, video-bloggers and podcasters who offer their multimedia productions to narrow groups of focused consumers. These producers are at the same time consumers of the multimedia productions provided to them. This is the Web 2.0 community, which can be referred to as prosumers.

2.3.4 Summary on Classification of Search Methods

The search and retrieval mechanisms can be classified in several ways (Figure 2.3).
The most important conclusion is that the diversity of search mechanisms matches the diversity of content types, and, like the content itself, different retrieval methods are targeted at different end users. Moreover, thanks to easy access to the Internet and the growing popularity of media, the tastes and interests of consumers are becoming more diverse. Search and retrieval technology should keep up with this trend.

2.3.5 The Architectures of Metadata

The traditional approach to storing metadata is to keep it in a single repository along with the data itself. This approach guarantees quick access both to the metadata

and to the content, as both are kept in a single repository. This feature makes the centralised approach useful for local use, but when moving into the area of networking it becomes ineffective: a user searching for a piece of content has at his disposal only one local copy of the metadata accompanying the content itself. This is depicted in Figure 2.5, where {A, B, C...} represent pieces of content and {M_A, M_B, M_C...} the corresponding metadata.

Figure 2.4: Popularity of content versus the number of content consumers [23]. The figure contrasts the one-to-many model (professional photographs, worldwide magazines, photo exhibitions) with the everybody-to-a-few model (regional, local and community newspaper photographs, information portals, blogging and photo sharing) along the axes of the number of content consumers and the number of content creators.

The approach adopted in the presented research is to separate the metadata from the content while preserving the link between them. Moreover, metadata can be copied and distributed, which is not always possible with the content itself due to its high volume or copyright restraints. This leads to easier search and retrieval of the metadata and its linked content. The user has at his disposal a distributed and fuzzy repository of metadata stored in many P2P overlay nodes. The concept is depicted in Figure 2.6.

2.4 Search Benchmarking

According to the English dictionary [69], a benchmark is: "a standardized problem or test that serves as a basis for evaluation or comparison (as of computer system performance)"

Figure 2.5: The traditional approach to content and metadata storage

Figure 2.6: The distributed storage of media and metadata. Pieces of content (A, B, C, D) reside on different P2P overlay nodes, while copies of their metadata (M_A, M_B, M_D, ...) are replicated across the overlay nodes reachable by the user's P2P client.

Search tools, like any other systems, need to be measured in order to assess their widely understood quality. Apart from assessing quality, the results of the benchmarking process allow the proposed or developed system to be compared against similar ones. The requirement for such a comparison is that all the compared sys-


More information

LISP-TREE: A DNS Hierarchy to Support the LISP Mapping System

LISP-TREE: A DNS Hierarchy to Support the LISP Mapping System LISP-TREE: A DNS Hierarchy to Support the LISP Mapping System Loránd Jakab, Albert Cabellos-Aparicio, Florin Coras, Damien Saucez and Olivier Bonaventure 1 Abstract During the last years several operators

More information

Cataloging and Metadata Education: A Proposal for Preparing Cataloging Professionals of the 21 st Century

Cataloging and Metadata Education: A Proposal for Preparing Cataloging Professionals of the 21 st Century Cataloging and Metadata Education: A Proposal for Preparing Cataloging Professionals of the 21 st Century A response to Action Item 5.1 of the Bibliographic Control of Web Resources: A Library of Congress

More information

MSc Thesis. Thesis Title: Designing and optimization of VOIP PBX infrastructure. Naveed Younas Rana. Student ID: 1133670. Supervisor: Dr.

MSc Thesis. Thesis Title: Designing and optimization of VOIP PBX infrastructure. Naveed Younas Rana. Student ID: 1133670. Supervisor: Dr. MSc Thesis Thesis Title: Designing and optimization of VOIP PBX infrastructure By Naveed Younas Rana Student ID: 1133670 Department of computer science and technology University of Bedfordshire Supervisor:

More information

Green-Cloud: Economics-inspired Scheduling, Energy and Resource Management in Cloud Infrastructures

Green-Cloud: Economics-inspired Scheduling, Energy and Resource Management in Cloud Infrastructures Green-Cloud: Economics-inspired Scheduling, Energy and Resource Management in Cloud Infrastructures Rodrigo Tavares Fernandes Instituto Superior Técnico Avenida Rovisco

More information

Emergence and Taxonomy of Big Data as a Service

Emergence and Taxonomy of Big Data as a Service Emergence and Taxonomy of Big Data as a Service Benoy Bhagattjee Working Paper CISL# 2014-06 May 2014 Composite Information Systems Laboratory (CISL) Sloan School of Management, Room E62-422 Massachusetts

More information Coordination Action on Digital Library Interoperability, Best Practices and Modelling Foundations Coordination Action on Digital Library Interoperability, Best Practices and Modelling Foundations Coordination Action on Digital Library Interoperability, Best Practices and Modelling Foundations Funded under the Seventh Framework Programme, ICT Programme Cultural Heritage and Technology Enhanced

More information

Scalability and Performance Management of Internet Applications in the Cloud

Scalability and Performance Management of Internet Applications in the Cloud Hasso-Plattner-Institut University of Potsdam Internet Technology and Systems Group Scalability and Performance Management of Internet Applications in the Cloud A thesis submitted for the degree of "Doktors

More information

All Your Contacts Are Belong to Us: Automated Identity Theft Attacks on Social Networks

All Your Contacts Are Belong to Us: Automated Identity Theft Attacks on Social Networks All Your Contacts Are Belong to Us: Automated Identity Theft Attacks on Social Networks Leyla Bilge, Thorsten Strufe, Davide Balzarotti, Engin Kirda EURECOM Sophia Antipolis, France,,

More information

Brazilian Institute for Web Science Research

Brazilian Institute for Web Science Research ISSN 0103-9741 Monografias em Ciência da Computação n 46/08 Brazilian Institute for Web Science Research Nelson Maculan Carlos José Pereira de Lucena (editors) Departamento de Informática PONTIFÍCIA UNIVERSIDADE

More information

Fast Forward» How the speed of the internet will develop between now and 2020. Commissioned by: NLkabel & Cable Europe. Project: 2013.

Fast Forward» How the speed of the internet will develop between now and 2020. Commissioned by: NLkabel & Cable Europe. Project: 2013. Fast Forward» How the speed of the internet will develop between now and 2020 Commissioned by: NLkabel & Cable Europe Project: 2013.048 Publication number: 2013.048-1262 Published: Utrecht, June 2014 Authors:

More information

Designing workflow systems

Designing workflow systems TECHNISCHE UNIVERSITEIT EINDHOVEN Department of Mathematics and Computing Science MASTER S THESIS An algorithmic approach to process design and a human oriented approach to process automation by Irene

More information


ICT SYSTEMS MARKETING PLAN ICT SYSTEMS MARKETING PLAN by Ladan Mehrabi Graduate Diploma in Business Administration, Simon Fraser University, 2006 Bachelors Degree in Electronics Engineering, Tehran Azad University, 2001 PROJECT

More information

JCR or RDBMS why, when, how?

JCR or RDBMS why, when, how? JCR or RDBMS why, when, how? Bertil Chapuis 12/31/2008 Creative Commons Attribution 2.5 Switzerland License This paper compares java content repositories (JCR) and relational database management systems

More information