A survey of web archive search architectures
|
|
- Jonas Greer
- 2 years ago
- Views:
Transcription
1 A survey of web archive search architectures Miguel Costa, Daniel Gomes (Portuguese Web Francisco Couto, Mário J. Silva (University of Lisbon)
2 The Internet Archive was founded in 1996 Web-archived page of the first Web Archive
3 Web archiving has been growing 77 web archiving initiatives 282 billion web-archived files
4 Web archives must be searchable Users demand Google-like search Searchable means at least full-text search Unsearchable=Useless
5 How to enable web archive search? Our pursued answer since 2001
6 Research on web archiving (2001) A web archive of online publications Project between the University of Lisbon and the National Library of Portugal
7 Research on search engines (2001) Portuguese-web search engine
8 Portuguese Web Archive = Web search + Web archiving
9 Survey about web archiving initiatives (2011) URL search: 89% Meta-data search: 79% Full-text search: 67% (28 initiatives) The knowledge is out there
10 For this survey of web archive search architectures Identified prevalent search architectures Compared main features based on: Available publications (still few) Our experience
11 Portuguese Web Archive Based on NutchWAX Archive-access tools are widely used to support search Full-text search over 1.2B docs at archive.pt
12 Search workflow Machine 1 Query Server Index partition 1 Query Query Broker Machine 2 Query Server Machine N Query Server Index partition 2 Index partition N For large collections, indexes must be partitioned across several machines
13 Time, document partitioning (PWA) document identifiers Advantages Selects time partitions according to query timespan Progressive degradation All computers have all terms One partition fails, remaining respond to query term No index rebuilding required to add new collections lexicon after boat car... you zebra <3,1> Disadvantages <2,1> <5,2> <1,3> <4,2> <2,1> <4,1> xbox <5,1> document-based partition (1-2) document-based partition (3-4) High workload: all document partitions within timespan must be scanned for each query Centralized data center approach
14 Everlast P2P architecture Different types of nodes Crawlers Version directories Indexes Low cost nodes Full-text search Unlimited scalability Tested in laboratory
15 Term, time partitioning (Everlast) document identifiers lexicon after boat car dad <3,1> <2,1> <5,2> <1,3> <4,2> <2,1> <4,1>... <5,1> term-based partition (a-b) term-based partition (c-d) Advantages Robustness of decentralized architecture Lower workload: only one term partition is contacted for each query Disadvantages Index updates to add new collections Term partition unreachable may prevent response to query term Redundancy required Latency due to network
16 Wayback Machine (URL search) Doesn t use inverted indexes Flat sorted files of URLs URL partitioning Advantages High throughput with millions of queries daily Easy to manage: no phd required Disadvantages High communication workload because all queries are broadcasted to all index partitions Limited search features
17 Overall comparison Search requirement Storage and workload scalability Wayback Machine Portuguese Web Archive Everlast High High Very High Service reliability High High Medium Time-aware indexing Performance of response times and throughput No Yes Yes High Very High Medium None is the best, just different. Our objective was to improve documentation about web archive search
18 Food-for-thought
19 Problem for existing web archives Users don t know where to search for past web content Page unavailable means lost forever Dissemination of web archive services is expensive
20 It would be nice to have a single portal for cross-web archive search but Web-archived data is spread Search architectures are different Search technologies are different Interoperability is required
21 How to design a cross-web archive search architecture?
22 Our proposal: Web archive metasearch based on OpenSearch MyWebArchive search OpenSearch Web archive Metasearch Portal OpenSearch client YourWebArchive search OpenSearch Live-web search OpenSearch OpenSearch is a widely supported and simple technology Most web archives use NutchWAX and it supports OpenSearch Portal would be simple and cheap to implement Extremely useful to web users Increase visibility of web archiving initiatives Easily combines live-web with past-web search results
23 Successfully tested by Computer Science students Web applications that gather information about politicians from several sources: Wikipedia, Youtube, Twitter, Portuguese Web Archive
24 Web archive search easily integrated on web browsers
25 Required research to cross-web archive search Cross-web archive ranking algorithms How to rank search results? User interface design How to adequately present results from different sources?
26 Conclusions Web archives must support full-text search Web archive search architectures are different but search interoperability should be a requirement OpenSearch has potential to quickly enable cross-web archive search What do you think?
27 Contact us whenever you like Thanks.
Creating a billion-scale searchable web archive. Daniel Gomes, Miguel Costa, David Cruz, João Miranda and Simão Fontes
Creating a billion-scale searchable web archive Daniel Gomes, Miguel Costa, David Cruz, João Miranda and Simão Fontes Web archiving initiatives are spreading around the world At least 6.6 PB were archived
A Survey of Web Archive Search Architectures
A Survey of Web Archive Search Architectures Miguel Costa 1,2 miguel.costa@fccn.pt Daniel Gomes 1 daniel.gomes@fccn.pt Francisco M Couto 2 fcouto@di.fc.ul.pt Mário J. Silva 3 mjs@inesc-id.pt 1 Foundation
Design and selection criteria for a national web archive
Design and selection criteria for a national web archive Daniel Gomes Sérgio Freitas Mário J. Silva University of Lisbon Daniel Gomes http://xldb.fc.ul.pt/daniel/ 1 The digital era has begun The web is
PASIG San Diego March 13, 2015
Archive-It Archiving & Preserving Web Content PASIG San Diego March 13, 2015 We are a Digital Library Founded in 1996 by Brewster Kahle In 2007 of@icially designated a library by the state of California
Information Search in Web Archives
Information Search in Web Archives Miguel Costa Portuguese Foundation for National Scientific Computing LaSIGE @ Faculty of Sciences, University of Lisbon 1 st International Workshop on Archiving Community
Full Text Search of Web Archive Collections
Full Text Search of Web Archive Collections IWAW2005 Michael Stack Internet Archive stack@ archive.org Internet Archive IA is a 501(c)(3)non-profit Mission is to build a public Internet digital library.
LARGE SCALE INTERNET SERVICES
1 LARGE SCALE INTERNET SERVICES 2110414 Large Scale Computing Systems Natawut Nupairoj, Ph.D. Outline 2 Overview Background Knowledge Architectural Case Studies Real-World Case Study 3 Overview Overview
Automatic Remediation for SOA Web Services. ORION SOAP Proxy
Automatic Remediation for SOA Web Services ORION SOAP Proxy based on ORION SOA 7.0 Abstract Web services are becoming an integral part of the business infrastructure, and the availability and proper functioning
Distributed computing: index building and use
Distributed computing: index building and use Distributed computing Goals Distributing computation across several machines to Do one computation faster - latency Do more computations in given time - throughput
Tools for Web Archiving: The Java/Open Source Tools to Crawl, Access & Search the Web. NLA Gordon Mohr March 28, 2012
Tools for Web Archiving: The Java/Open Source Tools to Crawl, Access & Search the Web NLA Gordon Mohr March 28, 2012 Overview The tools: Heritrix crawler Wayback browse access Lucene/Hadoop utilities:
System Requirement Specification for A Distributed Desktop Search and Document Sharing Tool for Local Area Networks
System Requirement Specification for A Distributed Desktop Search and Document Sharing Tool for Local Area Networks OnurSoft Onur Tolga Şehitoğlu November 10, 2012 v1.0 Contents 1 Introduction 3 1.1 Purpose..............................
Web Archiving and Scholarly Use of Web Archives
Web Archiving and Scholarly Use of Web Archives Helen Hockx-Yu Head of Web Archiving British Library 15 April 2013 Overview 1. Introduction 2. Access and usage: UK Web Archive 3. Scholarly feedback on
Practical Options for Archiving Social Media
Practical Options for Archiving Social Media Content Summary for ALGIM Web-Symposium Presentation 03/05/11 Euan Cochrane, Senior Advisor, Digital Continuity, Archives New Zealand, The Department of Internal
The Case for a Portuguese Web Search Engine
The Case for a Portuguese Web Search Engine Mário J. Gaspar da Silva FCUL/DI and LASIGE/XLDB mjs@di.fc.ul.pt Community Web Community webs have distint social patterns (Gibson, 1998) Community webs can
TABLE OF CONTENTS THE SHAREPOINT MVP GUIDE TO ACHIEVING HIGH AVAILABILITY FOR SHAREPOINT DATA. Introduction. Examining Third-Party Replication Models
1 THE SHAREPOINT MVP GUIDE TO ACHIEVING HIGH AVAILABILITY TABLE OF CONTENTS 3 Introduction 14 Examining Third-Party Replication Models 4 Understanding Sharepoint High Availability Challenges With Sharepoint
Cultural Heritage Institutions, Metadata Aggregators and The Cloud Aleksandra Nowak, Marcin Werla Poznań Supercomputing and Networking Center
Cultural Heritage Institutions, Metadata Aggregators and The Cloud Aleksandra Nowak, Marcin Werla Poznań Supercomputing and Networking Center ECloud and LoCloud are funded by the European Commission's
Web Search by the people, for the people Michael Christen, mc@yacy.net, http://yacy.net
Web by the people, for the people, mc@yacy.net, RMLL 2011 Rencontres Mondiales du Logiciel Libre http://2011.rmll.info Topics What is a decentralized search engine? and why would you use that Architecture
Archive-IT Services Andrea Mills Booksgroup Collections Specialist
Getting Started with Archive-IT Services Andrea Mills Booksgroup Collections Specialist Internet Archive Micro History Text Archive Update Archive-IT Services 1996 The Internet Archive is created, with
Scholarly Use of Web Archives
Scholarly Use of Web Archives Helen Hockx-Yu Head of Web Archiving British Library 15 February 2013 Web Archiving initiatives worldwide http://en.wikipedia.org/wiki/file:map_of_web_archiving_initiatives_worldwide.png
Creating a Billion-Scale Searchable Web Archive
Creating a Billion-Scale Searchable Web Archive Daniel Gomes, Miguel Costa, David Cruz, João Miranda, Simão Fontes Foundation for National Scientific Computing Av. Brasil, 101 1700-066 Lisboa, Portugal
Xiaoling Zhen. Professor: Raimo Kantola Instructor: Jose M. Costa
Xiaoling Zhen Helsinki University of Technology Department of Electrical and Communication Engineering Networking Laboratory Professor: Raimo Kantola Instructor: Jose M. Costa 1 Contents Background Purpose
Hypertable Architecture Overview
WHITE PAPER - MARCH 2012 Hypertable Architecture Overview Hypertable is an open source, scalable NoSQL database modeled after Bigtable, Google s proprietary scalable database. It is written in C++ for
Digital Libraries Division Data Loss Escalation Procedures
Digital Libraries Division Data Loss Escalation Procedures Date: October 2015 Version: 1.0 Contributors: Mark Phillips Assistant Dean for Digital Libraries Ana Krahmer Supervisor, Digital Newspaper Unit
Peer-to-Peer Networks. Chapter 6: P2P Content Distribution
Peer-to-Peer Networks Chapter 6: P2P Content Distribution Chapter Outline Content distribution overview Why P2P content distribution? Network coding Peer-to-peer multicast Kangasharju: Peer-to-Peer Networks
Technical White Paper: Clustering QlikView Servers
Technical White Paper: Clustering QlikView Servers Technical White Paper: Clustering QlikView Servers Clustering QlikView Servers for Resilience and Horizontal Scalability v1.3 QlikView 9 SR3 or above,
Diagram 1: Islands of storage across a digital broadcast workflow
XOR MEDIA CLOUD AQUA Big Data and Traditional Storage The era of big data imposes new challenges on the storage technology industry. As companies accumulate massive amounts of data from video, sound, database,
elearning Content Management Middleware
elearning Content Management Middleware Chen Zhao Helsinki 18.2.2004 University of Helsinki Department of Computer Science Authors Chen Zhao Title elearning Content Management Middleware Date 18.2.2004
www.inovoo.com EMC APPLICATIONXTENDER 8.0 Real-Time Document Management
www.inovoo.com EMC APPLICATIONXTENDER 8.0 Real-Time Document Management 02 EMC APPLICATIONXTENDER 8.0 EMC ApplicationXtender (AX) is a web-based real-time document management system which stores, manages
The XLDB Group at CLEF 2004
The XLDB Group at CLEF 2004 Nuno Cardoso, Mário J. Silva, and Miguel Costa Grupo XLDB - Departamento de Informática, Faculdade de Ciências da Universidade de Lisboa {ncardoso, mjs, mcosta} at xldb.di.fc.ul.pt
Search and Information Retrieval
Search and Information Retrieval Search on the Web 1 is a daily activity for many people throughout the world Search and communication are most popular uses of the computer Applications involving search
Business Software Defined DMS DOCUMENT MANAGEMENT SYSTEM ZETA SOFTWARE
ZETA Business Software Defined DMS DOCUMENT MANAGEMENT SYSTEM Document Management Software Across the globe, billions of paper documents are produced by individuals, small offices and large organizations
If you are not the person responsible for filling it out, you can skip this part and go to the end of the survey.
1. Introduction WHAT THIS SURVEY IS ABOUT: This survey will cover several areas, with the aim of gathering information about the user requirements for the ECLAP portal, both for end users and content partners.
Archiving Systems. Uwe M. Borghoff Universität der Bundeswehr München Fakultät für Informatik Institut für Softwaretechnologie. uwe.borghoff@unibw.
Archiving Systems Uwe M. Borghoff Universität der Bundeswehr München Fakultät für Informatik Institut für Softwaretechnologie uwe.borghoff@unibw.de Decision Process Reference Models Technologies Use Cases
Distributed Architecture of Oracle Database In-memory
Distributed Architecture of Oracle Database In-memory Niloy Mukherjee, Shasank Chavan, Maria Colgan, Dinesh Das, Mike Gleeson, Sanket Hase, Allison Holloway, Hui Jin, Jesse Kamp, Kartik Kulkarni, Tirthankar
Next Generation. School Management System. EduSwift. School Management System. www.eduswift.com
Next Generation School Management System EduSwift School Management System For more information contact: sales@eduswift.com www.eduswift.com ABOUT EduSwift Education system outlines the backbone of every
Plagiarism. Dr. M.G. Sreekumar UNESCO Coordinator, Greenstone Support for South Asia Head, LRC & CDDL, IIM Kozhikode
Digital Rights Management & Plagiarism Dr. M.G. Sreekumar UNESCO Coordinator, Greenstone Support for South Asia Head, LRC & CDDL, IIM Kozhikode Intranet / Internet K-Assets/Objects, Practices, CoP, Collaborative
BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB
BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB Planet Size Data!? Gartner s 10 key IT trends for 2012 unstructured data will grow some 80% over the course of the next
SharePoint Server 2010 Capacity Management: Software Boundaries and Limits
SharePoint Server 2010 Capacity Management: Software Boundaries and s This document is provided as-is. Information and views expressed in this document, including URL and other Internet Web site references,
So today we shall continue our discussion on the search engines and web crawlers. (Refer Slide Time: 01:02)
Internet Technology Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No #39 Search Engines and Web Crawler :: Part 2 So today we
SCALABLE DATA SERVICES
1 SCALABLE DATA SERVICES 2110414 Large Scale Computing Systems Natawut Nupairoj, Ph.D. Outline 2 Overview MySQL Database Clustering GlusterFS Memcached 3 Overview Problems of Data Services 4 Data retrieval
A brief introduction on SharePoint
A brief introduction on SharePoint Raizel Consulting 11/09/2007 SharePoint is an enterprise information portal, from Microsoft, that can be configured to run Intranet, Extranet and Internet sites. SharePoint
Big Data Technology Map-Reduce Motivation: Indexing in Search Engines
Big Data Technology Map-Reduce Motivation: Indexing in Search Engines Edward Bortnikov & Ronny Lempel Yahoo Labs, Haifa Indexing in Search Engines Information Retrieval s two main stages: Indexing process
Search and Real-Time Analytics on Big Data
Search and Real-Time Analytics on Big Data Sewook Wee, Ryan Tabora, Jason Rutherglen Accenture & Think Big Analytics Strata New York October, 2012 Big Data: data becomes your core asset. It realizes its
A Peer-to-Peer Approach to Content Dissemination and Search in Collaborative Networks
A Peer-to-Peer Approach to Content Dissemination and Search in Collaborative Networks Ismail Bhana and David Johnson Advanced Computing and Emerging Technologies Centre, School of Systems Engineering,
Deciding When to Deploy Microsoft Windows SharePoint Services and Microsoft Office SharePoint Portal Server 2003. White Paper
Deciding When to Deploy Microsoft Windows SharePoint Services and Microsoft Office SharePoint Portal Server 2003 White Paper Published: October, 2003 Table of Contents Introduction 4 Relationship between
Invenio: A Modern Digital Library for Grey Literature
Invenio: A Modern Digital Library for Grey Literature Jérôme Caffaro, CERN Samuele Kaplun, CERN November 25, 2010 Abstract Grey literature has historically played a key role for researchers in the field
Web Archiving Tools: An Overview
Web Archiving Tools: An Overview JISC, the DPC and the UK Web Archiving Consortium Workshop Missing links: the enduring web Helen Hockx-Yu Web Archiving Programme Manager July 2009 Shape of the Web: HTML
BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES
BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES Relational vs. Non-Relational Architecture Relational Non-Relational Rational Predictable Traditional Agile Flexible Modern 2 Agenda Big Data
Bigtable is a proven design Underpins 100+ Google services:
Mastering Massive Data Volumes with Hypertable Doug Judd Talk Outline Overview Architecture Performance Evaluation Case Studies Hypertable Overview Massively Scalable Database Modeled after Google s Bigtable
Intelligent Dashboards made Simple! Using Excel Services
Intelligent Dashboards made Simple! Using Excel Services Presented by: Asif Rehmani, SharePoint Server MVP Trainer/Solution Architect asif@sharepointsolutions.com Who am I? One of the founders of Chicago
GRAPHICAL USER INTERFACE, ACCESS, SEARCH AND REPORTING
MEDIA MONITORING AND ANALYSIS GRAPHICAL USER INTERFACE, ACCESS, SEARCH AND REPORTING Searchers Reporting Delivery (Player Selection) DATA PROCESSING AND CONTENT REPOSITORY ADMINISTRATION AND MANAGEMENT
Audit compliance and long-term archiving for SharePoint
Connect to SharePoint Product Info Audit compliance and long-term archiving for SharePoint Connect to SharePoint integrates Microsoft Office SharePoint with the DocuWare integrated document management
Developing Scalable Smart Grid Infrastructure to Enable Secure Transmission System Control
Developing Scalable Smart Grid Infrastructure to Enable Secure Transmission System Control EP/K006487/1 UK PI: Prof Gareth Taylor (BU) China PI: Prof Yong-Hua Song (THU) Consortium UK Members: Brunel University
Index Data's MasterKey Connect Product Description
Index Data's MasterKey Connect Product Description MasterKey Connect is an innovative technology that makes it easy to automate access to services on the web. It allows nonprogrammers to create 'connectors'
Analysis of Web Archives. Vinay Goel Senior Data Engineer
Analysis of Web Archives Vinay Goel Senior Data Engineer Internet Archive Established in 1996 501(c)(3) non profit organization 20+ PB (compressed) of publicly accessible archival material Technology partner
CLOUD COMPUTING. When it's smarter to rent than to buy.. Presented by Anand Tirumani
CLOUD COMPUTING When it's smarter to rent than to buy.. Presented by Anand Tirumani Agenda Cloud Computing: Concepts and Terminologies What is Cloud Computing? Essential Characteristics Service Models
The Laserfiche Rio Advantage. Automate, Optimize and Transform Business Processes. Unlimited document repositories and servers
Automate, Optimize and Transform Business Processes When organizations rely on paper, there often isn t time to re-evaluate how work gets done. Business leaders are forced to spend time on nonessential
Data Discovery on the Information Highway
Data Discovery on the Information Highway Susan Gauch Introduction Information overload on the Web Many possible search engines Need intelligent help to select best information sources customize results
Search Engine Technology and Digital Libraries: Moving from Theory t...
1 von 5 09.09.2010 15:09 Search Engine Technology and Digital Libraries Moving from Theory to Practice Friedrich Summann Bielefeld University Library, Germany Norbert Lossau
High Throughput Computing on P2P Networks. Carlos Pérez Miguel carlos.perezm@ehu.es
High Throughput Computing on P2P Networks Carlos Pérez Miguel carlos.perezm@ehu.es Overview High Throughput Computing Motivation All things distributed: Peer-to-peer Non structured overlays Structured
Digital Libraries and Content Management
Digital Libraries and Content Management Database Research Group, University of Rostock 4th European IBM Content Manager and Media Workshop, September 2002, Essen 0. Overview 1. Content Management Systems
Webcasting vs. Web Conferencing. Webcasting vs. Web Conferencing
Webcasting vs. Web Conferencing 0 Introduction Webcasting vs. Web Conferencing Aside from simple conference calling, most companies face a choice between Web conferencing and webcasting. These two technologies
Taming a Wild Workflow and Domesticating our Digital Data
Taming a Wild Workflow and Domesticating our Digital Data Nancy Allmang, nancy.allmang@nist.gov Paula Deutsch, paula.deutsch@nist.gov Information Services Division Technology Services Project Overview
High Availability and Clustering
High Availability and Clustering AdvOSS-HA is a software application that enables High Availability and Clustering; a critical requirement for any carrier grade solution. It implements multiple redundancy
AlienVault Unified Security Management (USM) 4.x-5.x. Deployment Planning Guide
AlienVault Unified Security Management (USM) 4.x-5.x Deployment Planning Guide USM 4.x-5.x Deployment Planning Guide, rev. 1 Copyright AlienVault, Inc. All rights reserved. The AlienVault Logo, AlienVault,
1st Training Session Berlin, May 15th, 2014
experimental Infrastructures for the Future Internet 1st Training Session Berlin, May 15th, 2014 www.fi-xifi.eu A very brief survey of how to use XIFI and FI-OPS XIFI FOR DEVELOPERS Agenda Introduction
Introduction to Hadoop
Introduction to Hadoop 1 What is Hadoop? the big data revolution extracting value from data cloud computing 2 Understanding MapReduce the word count problem more examples MCS 572 Lecture 24 Introduction
Achieving Mainframe-Class Performance on Intel Servers Using InfiniBand Building Blocks. An Oracle White Paper April 2003
Achieving Mainframe-Class Performance on Intel Servers Using InfiniBand Building Blocks An Oracle White Paper April 2003 Achieving Mainframe-Class Performance on Intel Servers Using InfiniBand Building
Realizing a Vision Interesting Student Projects
Realizing a Vision Interesting Student Projects Do you want to be part of a revolution? We are looking for exceptional students who can help us realize a big vision: a global, distributed storage system
DMS Document Management System
ZETA Solutions for Excellence DMS Document Management System Zeta Doc Store Across the globe, billions of paper documents are produced by individuals, small offices and large organizations every day. Many
EVERYTHING YOU NEED FOR BRANDING ON MULTIPLE CHANNELS
EVERYTHING YOU NEED FOR BRANDING ON MULTIPLE CHANNELS Manage all you rich media content and customer experience simultaneously with DAM for Sitecore EXECUTE YOUR MULTICHANNEL STRATEGY Apps Adobe Adaptive
Document Management System (DMS) Release 4.5 User Guide
Document Management System (DMS) Release 4.5 User Guide Prepared by: Wendy Murray Issue Date: 20 November 2003 Sapienza Consulting Ltd The Acorn Suite, Guardian House Borough Road, Godalming Surrey GU7
Availability Digest. www.availabilitydigest.com. @availabilitydig. HPE Helion Private Cloud and Cloud Broker Services February 2016
the Availability Digest @availabilitydig HPE Helion Private Cloud and Cloud Broker Services February 2016 HPE Helion is a complete portfolio of cloud products and services that offers enterprise security,
Hadoop and Map-Reduce. Swati Gore
Hadoop and Map-Reduce Swati Gore Contents Why Hadoop? Hadoop Overview Hadoop Architecture Working Description Fault Tolerance Limitations Why Map-Reduce not MPI Distributed sort Why Hadoop? Existing Data
Cassandra A Decentralized, Structured Storage System
Cassandra A Decentralized, Structured Storage System Avinash Lakshman and Prashant Malik Facebook Published: April 2010, Volume 44, Issue 2 Communications of the ACM http://dl.acm.org/citation.cfm?id=1773922
Real Time Performance Dashboard for SOA Web Services ORION SOA
Real Time Performance Dashboard for SOA Web Services ORION SOA Abstract The adoption of service-oriented architectures (SOA) has become increasingly prevalent in enterprise IT environments. This web services
Archiving, Indexing and Accessing Web Materials: Solutions for large amounts of data
Archiving, Indexing and Accessing Web Materials: Solutions for large amounts of data David Minor 1, Reagan Moore 2, Bing Zhu, Charles Cowart 4 1. (88)4-104 minor@sdsc.edu San Diego Supercomputer Center
Monitoring Elastic Cloud Services
Monitoring Elastic Cloud Services trihinas@cs.ucy.ac.cy Advanced School on Service Oriented Computing (SummerSoc 2014) 30 June 5 July, Hersonissos, Crete, Greece Presentation Outline Elasticity in Cloud
Everything You Need To Know About Cloud Computing
Everything You Need To Know About Cloud Computing What Every Business Owner Should Consider When Choosing Cloud Hosted Versus Internally Hosted Software 1 INTRODUCTION Cloud computing is the current information
International Journal of Engineering Research & Management Technology
International Journal of Engineering Research & Management Technology March- 2015 Volume 2, Issue-2 Survey paper on cloud computing with load balancing policy Anant Gaur, Kush Garg Department of CSE SRM
Firebird meets NoSQL (Apache HBase) Case Study
Firebird meets NoSQL (Apache HBase) Case Study Firebird Conference 2011 Luxembourg 25.11.2011 26.11.2011 Thomas Steinmaurer DI +43 7236 3343 896 thomas.steinmaurer@scch.at www.scch.at Michael Zwick DI
Architectural Components of Digital Library: A Practical Example Using DSpace
Architectural Components of Digital Library: A Practical Example Using DSpace Sandip Das MS-LIS Student, DRTC ISI Bangalore, sandip@drtc.isibang.ac.in M. Krishnamurty DRTC, ISI Bangalore-560059 mkrishna_murthy@hotmail.com
Easier - Faster - Better
Highest reliability, availability and serviceability ClusterStor gets you productive fast with robust professional service offerings available as part of solution delivery, including quality controlled
The Ultimate in Scale-Out Storage for HPC and Big Data
Node Inventory Health and Active Filesystem Throughput Monitoring Asset Utilization and Capacity Statistics Manager brings to life powerful, intuitive, context-aware real-time monitoring and proactive
Service Description Cloud Storage Openstack Swift
Service Description Cloud Storage Openstack Swift Table of Contents Overview iomart Cloud Storage... 3 iomart Cloud Storage Features... 3 Technical Features... 3 Proxy... 3 Storage Servers... 4 Consistency
PI Cloud Connect Overview
PI Cloud Connect Overview Version 1.0.8 Content Product Overview... 3 Sharing data with other corporations... 3 Sharing data within your company... 4 Architecture Overview... 5 PI Cloud Connect and PI
Implement Hadoop jobs to extract business value from large and varied data sets
Hadoop Development for Big Data Solutions: Hands-On You Will Learn How To: Implement Hadoop jobs to extract business value from large and varied data sets Write, customize and deploy MapReduce jobs to
FEDERATED DATA SYSTEMS WITH EIQ SUPERADAPTERS VS. CONVENTIONAL ADAPTERS WHITE PAPER REVISION 2.7
FEDERATED DATA SYSTEMS WITH EIQ SUPERADAPTERS VS. CONVENTIONAL ADAPTERS WHITE PAPER REVISION 2.7 INTRODUCTION WhamTech offers unconventional data access, analytics, integration, sharing and interoperability
Tools and Services for the Long Term Preservation and Access of Digital Archives
Tools and Services for the Long Term Preservation and Access of Digital Archives Joseph JaJa, Mike Smorul, and Sangchul Song Institute for Advanced Computer Studies Department of Electrical and Computer
josh Archive! The Organization Intelligence Software document archiving substitutive Conservation
josh Archive! The Organization Intelligence Software document archiving substitutive Conservation josh is a software by it Consult and is a registered trademark of it Consult S.r.l. - 2001-2014 all rights
Capitalizing on Emerging Technology: Enhancing the Health Artifact and Image Management Solution
Mr. Alvaro E. Rodriguez, PM Care & Benefits Integrated Systems Solution Delivery Division 2015 Defense Health Information Technology Symposium Capitalizing on Emerging Technology: Enhancing the Health
SofaWare Management Architecture Basics
SofaWare Management Architecture Basics The SofaWare management architecture is made up of several software components. These components are similar to components in FW-1/NG. Some aspects of the SofaWare
Integrating Document Management & GIS
Integrating Document Management & GIS Alex Bain, VP R&D, R7 Solutions August 7, 2003 1 Contents Overview of Document Management Systems Why Document Management? Features Landscape Industry examples of
W H I T E P A P E R E X E C U T I V E S U M M AR Y S I T U AT I O N O V E R V I E W. Sponsored by: EMC Corporation. Laura DuBois May 2010
W H I T E P A P E R E n a b l i n g S h a r e P o i n t O p e r a t i o n a l E f f i c i e n c y a n d I n f o r m a t i o n G o v e r n a n c e w i t h E M C S o u r c e O n e Sponsored by: EMC Corporation
Introduction to Hadoop
1 What is Hadoop? Introduction to Hadoop We are living in an era where large volumes of data are available and the problem is to extract meaning from the data avalanche. The goal of the software tools
Information access through information technology
Information access through information technology 1 Created to support an invited lecture at the International Conference MDGICT 2009 in Tamil Nadu, India, December 2009 by Paul.Nieuwenhuysen@vub.ac.be
Taking full advantage of the medium does also mean that publications can be updated and the changes being visible to all online readers immediately.
Making a Home for a Family of Online Journals The Living Reviews Publishing Platform Robert Forkel Heinz Nixdorf Center for Information Management in the Max Planck Society Overview The Family The Concept
Unit 2 Distributed Systems R.Yamini Dept. Of CA, SRM University Kattankulathur
Unit 2 Distributed Systems R.Yamini Dept. Of CA, SRM University Kattankulathur 1 Introduction to Distributed Systems Why do we develop distributed systems? availability of powerful yet cheap microprocessors
Proceedings of the Federated Conference on Computer Science and Information Systems pp. 737 741
Proceedings of the Federated Conference on Computer Science and Information Systems pp. 737 741 ISBN 978-83-60810-22-4 DCFMS: A Chunk-Based Distributed File System for Supporting Multimedia Communication
Chapter-1 : Introduction 1 CHAPTER - 1. Introduction
Chapter-1 : Introduction 1 CHAPTER - 1 Introduction This thesis presents design of a new Model of the Meta-Search Engine for getting optimized search results. The focus is on new dimension of internet