Interacting with local and remote data repositories using the stashR package
Computational Statistics DOI /s x
ORIGINAL PAPER

Sandrah P. Eckel · Roger D. Peng

Received: 14 March 2007 / Accepted: 8 May 2008
Springer-Verlag 2008

S. P. Eckel (corresponding author) · R. D. Peng
Johns Hopkins Bloomberg School of Public Health, 615 N. Wolfe Street E3527, Baltimore, MD 21205, USA
[email protected]

This work was supported in part by the Johns Hopkins Training Program in the Epidemiology and Biostatistics of Aging (NIA T32 AG00247) and the Faculty Innovation Fund, Johns Hopkins Bloomberg School of Public Health.

Abstract The stashR package (a Set of Tools for Administering Shared Repositories) for R implements a basic versioned key-value style database where character string keys are associated with data values. Using the S4 classes localDB and remoteDB, and associated methods, versioned key-value databases can be either created locally on the user's computer or accessed remotely via the Internet. The stashR package can enhance reproducible research by providing a localDB database format for the caching of computations which can subsequently be stored on the Internet. To reproduce a particular computation, a reader can access the remoteDB database and obtain the associated R objects.

Keywords Reproducible research · Database · Data distribution · Version control

1 Overview

Scientific research is conducted by collecting data, analyzing and summarizing the evidence in the data, and publishing substantive results in a paper. To verify scientific results we must replicate or, at a minimum, reproduce the findings of a previous study. Replication is the act of collecting an independent dataset in a similar manner to the original study and then using the data to address a similar scientific question of interest.
Reproduction of scientific research is the act of using the same data as the original study and performing additional statistical analyses (Gentleman and Temple Lang 2004). Replication is generally considered the highest standard of verification because researchers can address the uncertainty inherent in collecting a data sample from a larger population and improve upon any shortcomings of previous data collection designs. Reproducibility allows researchers to evaluate the sensitivity of results from the initial statistical analysis of a study. Although replication is ideal, in cases such as large epidemiologic studies, reproduction is often the only way to verify the scientific findings of a study (Peng et al. 2006).

For fully reproducible research we need to facilitate prospective analysts' access to the data used for a statistical analysis. Hence we need to be able to distribute potentially large datasets and computations. To disseminate accurate data, we need a system for physical data distribution that manages data updates by automatically synchronizing each prospective analyst's local copy of the data to the remote, master copy of the data and that keeps track of version changes to the data. It follows that we need software to help manage both the distribution and the synchronization and caching of these data.

2 The stashR package

The stashR package extends the filehash package (Peng 2006) to local and remote databases. The filehash package uses a key-value database to allow for interactive work with datasets too large to be loaded into R as a single object. A key-value database is a collection of data files, each indexed by a character string key. One example of a key-value database is a multi-center study consisting of data from four cities (New York, Los Angeles, Chicago and Seattle) where the data for each city are stored in files named ny, la, chicago and seattle, respectively.
In this case, the key-value database allows a researcher to download data from a particular city of interest instead of downloading the entire four-city dataset at once.

The stashR package adds important functionality for data handling and distribution in R. Its contributions include:

- a set of tools for creating a versioned local database for subsequent export to a remotely accessible webserver
- the ability to access versioned remote databases efficiently
- a tool for synchronizing local copies of a database to the remote version
- an abstract interface for interacting with local and remote databases.

The local and remote features of stashR address the need to manage data distribution and synchronize cached data copies for reproducible research.

3 Design rationale

3.1 Repositories

The stashR package is designed for interacting with both local and remote data repositories. A repository is a directory of files containing data and metadata. Each user
interface function in stashR is a generic function with specific methods defined for repository objects of the S4 classes localDB and remoteDB. A localDB repository is created and stored on a user's local disk. A remoteDB repository is a localDB repository that has been stored on a remote webserver.

A localDB repository consists of a root directory that contains a data directory and a text file named version. The version file lists information on the current version of the repository and all previous repository versions. The data directory contains compressed data files labeled according to their corresponding character key and the version number of the data indexed by this key (the key version). Each data file has a corresponding .sig text file that lists the 32-character hexadecimal MD5 checksum obtained by running md5sum() on the data file (see the R package tools for more details), the data file's character key and its key version number.

The local cache associated with a remoteDB repository consists of a root directory containing a data directory and a text file named url that lists the remoteDB repository's URL on a webserver. The data directory contains the cached data and .sig files corresponding to user-specified keys. We do not discuss the .sig files further: they are intended to safeguard against data corruption in the future, but they are not currently used.

3.2 Versioning

The version file in a localDB repository lists information on each version of the repository. The repository version is set to 1 when the first object is inserted. Upon each successive alteration of the data repository (either insertion or deletion of data files associated with character keys), the repository version is incremented by 1. In addition to the repository version, character keys are associated with key version numbers. The first time a value associated with a character key is inserted into the database, this key is associated with the key version 1.
The second time a value is associated with this same key and inserted in the database, the key version is 2. On each successive insertion of a value associated with a given key, the key version increases by 1. If we delete the data associated with a given key, the next insertion of a value associated with this key will increase the key version by 1 from the key version last associated with the key before deletion. We retain copies of the data corresponding to each data insertion, so a user can access any specified version of a repository, regardless of whether the data have been altered or deleted in subsequent versions.

Each line of the version file summarizes a different repository version. A line starts with a repository version number, which is separated by a colon from a space-separated listing of the keys and key versions indexing data in this repository version. For example, a line of the version file might read:

    4:seattle.1 la.2 ny.1

At version 4, the repository contains data files corresponding to the keys seattle, la and ny. Data corresponding to the keys seattle and ny have been inserted only once in the database, since each of these keys is followed by a .1, indicating key version 1. From the .2 following the key la we know that data corresponding to the key la have been inserted twice and that repository version 4 contains key version 2 of the la data.
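The line format described above can be unpacked with a few lines of base R. The following is only a minimal sketch: parse_version_line is a hypothetical helper introduced here for illustration, it is not part of the package, and it assumes that the line lists at least one key and that keys contain no colons or spaces.

```r
# Hypothetical helper (not part of the package): split one version-file
# line, e.g. "4:seattle.1 la.2 ny.1", into the repository version and a
# named integer vector of key versions.
parse_version_line <- function(line) {
  parts <- strsplit(line, ":", fixed = TRUE)[[1]]
  repos_version <- as.integer(parts[1])
  entries <- strsplit(parts[2], " ", fixed = TRUE)[[1]]
  # The key version is the run of digits after the last "." of an entry,
  # so a key that itself contains dots is still handled.
  keys <- sub("\\.[0-9]+$", "", entries)
  key_versions <- as.integer(sub("^.*\\.", "", entries))
  names(key_versions) <- keys
  list(repos_version = repos_version, key_versions = key_versions)
}

parsed <- parse_version_line("4:seattle.1 la.2 ny.1")
parsed$repos_version   # 4
parsed$key_versions    # seattle = 1, la = 2, ny = 1
```

Returning the key versions as a named vector makes it easy to look up the version of a single key, for example parsed$key_versions[["la"]].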
3.3 Caching and synchronization

A key feature of the stashR package is the ability for a user to download and store specified data from a remoteDB repository in a local cache and, at a later date, to synchronize the local cache to the current version of the remote repository. The synchronization feature allows a user to efficiently maintain an up-to-date local cache of remote repositories by downloading updated versions of cached files only when necessary.

When creating a local cache of a remoteDB repository, the user can specify that he or she would always like to access the most current version of the remote repository. Subsequent data synchronization will look for the most recent repository version. The user can also specify access to a specific remote repository version, in which case synchronization is not necessary.

Synchronization of data in a local copy of a remote database is achieved by comparing the key versions of keys associated with the locally cached data to the key versions of the keys associated with data in the most recent version of the remote repository. If the key versions do not match, the new data are downloaded from the remote repository and the key versions are updated. If a key in the local cache no longer corresponds to data in the current version of the remote repository (i.e., the data associated with this key have been deleted in the most current remote repository), then the locally cached copy of these data is deleted.

4 Interface

4.1 Creating a localDB repository or a cache of a remoteDB repository

To create a new instance of the S4 class localDB or remoteDB, call

    new("localDB", dir, name, reposVersion)

or

    new("remoteDB", dir, url, name, reposVersion)

where dir specifies the local directory in which to create the localDB repository or the local cache for the remoteDB repository.
For remoteDB, url gives the URL of the remote repository's root directory and the default reposVersion is -1, which denotes the most current repository version. For remoteDB, specifying a positive integer for reposVersion fixes the repository version to that number. For a new localDB object, reposVersion should be left at the default of -1. The name argument is a label associated with the new localDB or remoteDB object.

4.2 Accessing a localDB or remoteDB database

The user-end interfaces to localDB or remoteDB databases include dbFetch, dbInsert, dbList, dbExists, dbDelete and dbSync. Each interface is a generic function defined in the filehash package and has a specific method for objects of the remoteDB and localDB classes (with the exception of dbSync, which is only available for the remoteDB class). The first argument of all the functions is an object of class remoteDB or localDB.
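The comparison rule of Section 3.3, on which the synchronization interface rests, can be sketched in base R. This is an illustration under stated assumptions and not the package's implementation: plan_sync is a hypothetical helper, and the key versions of the local cache and of the current remote repository version are assumed to be available as named integer vectors.

```r
# Hypothetical helper (not part of the package): decide which cached keys
# need a fresh download and which cached keys should be removed.
plan_sync <- function(local, remote) {
  cached <- names(local)
  still_present <- cached[cached %in% names(remote)]
  # Cached keys whose key version differs from the current remote one
  outdated <- still_present[local[still_present] != remote[still_present]]
  # Cached keys that no longer exist in the current remote version
  deleted <- setdiff(cached, names(remote))
  list(download = outdated, delete = deleted)
}

local  <- c(seattle = 1L, la = 1L, chicago = 1L)
remote <- c(seattle = 1L, la = 2L, ny = 1L)
plan <- plan_sync(local, remote)
plan$download   # "la"
plan$delete     # "chicago"
```

In this sketch the key ny is ignored because it was never cached; only keys that already have a local copy are synchronized.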
The function dbFetch takes a second argument key, a character string key, and returns the associated R object. For objects of the class remoteDB, if the key's data file does not exist in the local cache, then dbFetch downloads the file from the remote repository to the local cache and reads the data file. If the data file exists in the local cache, then dbFetch compares the key's version number in the local cache to the key's version number in the most current version of the remote repository. If the version numbers are the same, dbFetch reads the file from the local cache. Otherwise, dbFetch downloads and reads the updated version of the data file from the remote repository. For objects of the class localDB, dbFetch returns the R object associated with the character string key if the associated data files exist in the local repository; if not, dbFetch returns an error.

The second argument of dbInsert is key, a character string key indexing the R object specified by the third argument, value. For a localDB object, dbInsert increments both the repository version and the key version by 1, and then writes the value to a compressed data file indexed by the specified key and the new key version. Lastly, dbInsert appends to the version file a line summarizing the new version of the repository. A user can overwrite a data value by calling dbInsert with a previously specified character key, but with different data values. However, since at each insertion the key is indexed by a different key version, the overwritten data can be recovered by accessing the repository version associated with the older key version. Calling dbInsert on a remoteDB object returns an error message.

The function dbList takes a localDB or remoteDB object as its only argument and returns the vector of keys associated with the repository version.
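The version bookkeeping described for insertions, together with the deletion rule of Section 3.2, can be illustrated with a toy in-memory model in base R. Everything in this sketch (new_repo, record_version, toy_insert, toy_delete and the field names) is hypothetical and introduced only for illustration; no data files are written and this is not how the package itself is implemented.

```r
# Toy model (not package code): `current` holds the key versions present
# in the latest repository version, `last_version` remembers the highest
# key version ever used per key, and `version_lines` collects the lines a
# version file would contain.
new_repo <- function() {
  list(repos_version = 0L, current = integer(0),
       last_version = integer(0), version_lines = character(0))
}

# Increment the repository version and append a summary line.
record_version <- function(repo) {
  repo$repos_version <- repo$repos_version + 1L
  entries <- if (length(repo$current) > 0) {
    paste0(names(repo$current), ".", repo$current)
  } else {
    character(0)
  }
  line <- paste0(repo$repos_version, ":", paste(entries, collapse = " "))
  repo$version_lines <- c(repo$version_lines, line)
  repo
}

# Insertion: the key version continues from the last one ever used.
toy_insert <- function(repo, key) {
  seen <- key %in% names(repo$last_version)
  previous <- if (seen) repo$last_version[[key]] else 0L
  repo$last_version[key] <- previous + 1L
  repo$current[key] <- previous + 1L
  record_version(repo)
}

# Deletion: the key leaves the current version; its history is kept.
toy_delete <- function(repo, key) {
  repo$current <- repo$current[names(repo$current) != key]
  record_version(repo)
}

repo <- new_repo()
repo <- toy_insert(repo, "seattle")
repo <- toy_insert(repo, "la")
repo <- toy_delete(repo, "la")
repo <- toy_insert(repo, "la")
repo$version_lines
# "1:seattle.1" "2:seattle.1 la.1" "3:seattle.1" "4:seattle.1 la.2"
```

The last line shows the rule from Section 3.2: re-inserting la after its deletion yields key version 2, not 1, because the key version continues from the one last associated with the key.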
The function dbExists takes a second argument key, a vector of character string keys, and returns a logical vector of the same length that specifies whether each element of key is associated with data in the current repository version.

For localDB objects, dbDelete takes a character string key specified in key, increments the repository version by 1, removes the specified key from the list of keys associated with the current repository version, and appends a line to the version file with the new repository version information. The data file associated with this key is not deleted from the local data directory, so it can be accessed later. For remoteDB objects, dbDelete returns an error.

The function dbSync takes a second argument key, a (possibly NULL) character vector of keys, and synchronizes the locally cached data associated with these keys to the data in the current version of the remote repository. If key contains a character string key associated with a data file that has not yet been locally cached, dbSync returns an error message. If key is NULL, then dbSync synchronizes the data for all of the keys associated with the locally cached data files. For keys with out-of-date key versions, dbSync downloads the associated updated files and deletes the associated out-of-date files. For keys that have been deleted from the current remote repository version, dbSync deletes the associated files from the local cache.

5 Examples

We create a new localDB repository locdb in the local directory dir.
    > library(stashR)
    > dir <- file.path(getwd(), "localdbexample")
    > locdb <- new("localDB", dir = dir, name = "localdb Example")

We insert various R objects into locdb and call dbList to see a vector of keys corresponding to the inserted data files.

    > v <- 1:10
    > dbInsert(locdb, key = "vector", value = v)
    > m <- matrix(1:20, 5, 4)
    > dbInsert(locdb, key = "matrix", value = m)
    > d <- data.frame(cbind(id = 1:5, age = c(12, 11,
    +     15, 11, 14), sex = c(1, 1, 0, 1, 0)))
    > dbInsert(locdb, key = "dataframe", value = d)
    > l <- list(v = v, m = m, df = d)
    > dbInsert(locdb, key = "list", value = l)
    > dbList(locdb)
    [1] "vector" "matrix" "dataframe" "list"

We can fetch any of the R objects saved in locdb.

    > dbFetch(locdb, "vector")
     [1]  1  2  3  4  5  6  7  8  9 10

If we delete a data file from locdb, dbList or dbExists can be used to confirm the deletion.

    > dbDelete(locdb, "matrix")
    > dbList(locdb)
    [1] "vector" "dataframe" "list"

The locdb repository is stored as a remote repository on the Internet at:

    > myurl <- " rpeng/
    + remotedbexampleversioning"

To access the data in the remote repository, we first create a remoteDB object called remdb. We will store the local cache of the remote repository in a local directory called remotedbexampleversioning.

    > dir <- file.path(getwd(), "remotedbexampleversioning")
    > remdb <- new("remoteDB", url = myurl, dir = dir,
    +     name = "remotedb Example")

We use dbList to see the keys associated with data in the repository.

    > dbList(remdb)
    [1] "list" "dataframe" "vector" "matrix"
Using dbFetch, we cache and return the data associated with vector.

    trying URL ' rpeng/ remotedbexampleversioning/data/ 6ba8844da718b4a65f60dbfd0d92d6ef.3'
    Content type 'text/plain; charset=utf-8' length 60 bytes
    opened URL
    ==================================================
    downloaded 60 bytes

    trying URL ' rpeng/ remotedbexampleversioning/data/ 6ba8844da718b4a65f60dbfd0d92d6ef.3.SIG'
    Content type 'text/plain; charset=utf-8' length 44 bytes
    opened URL
    ==================================================
    downloaded 44 bytes

    [1]
    > dbFetch(remdb, "vector")
    [1]

6 Discussion

The filehash package allows for interactive access to a (potentially quite large) dataset through a key-value database located on the user's disk. Since stashR is an extension of filehash, it is related to many of the large-data handling packages that are related to the filehash package (Peng 2006). One such package is g.data by David Brahm (2002). A main idea of filehash is that by using keys to index the data, we can efficiently and interactively download only the necessary data objects. The g.data package provides interfaces to delayed-data packages (DDPs), which load data efficiently using the lazy evaluation feature of R. To interface with one of the common relational databases (Oracle, MySQL or SQLite) in R we can use the S4 methods and generics in the DBI package (R Special Interest Group on Databases 2006), along with ROracle, RMySQL or RSQLite.

The deprecated reposTools package (Gentry and Gentleman 2004) on Bioconductor, which creates and interfaces with repositories of R packages, has some features similar to those in stashR. Both reposTools and stashR help an individual create a remote repository in order to distribute the contents to other R users, who are provided with specific methods to interface with the remote repository. Both packages allow a user to download files from the remote repository to a local repository with the possibility of future synchronization.
Unlike reposTools, the stashR package contains the tools to construct local repositories but does not automate the posting of the repository to a webserver. The key difference between the two packages is that reposTools is
intended for repositories of R packages whereas stashR is designed for repositories of key-value databases containing arbitrary R objects.

The stashR package is designed to be part of a larger set of tools for both authors and readers of statistical documents, with the aim of streamlining and enhancing reproducible documents created, for example, with literate programming tools such as Sweave (Leisch 2002). More generally, writers of statistical analyses could store the results of computations in a localDB database along with the code used to generate those results. This database could later be automatically posted as a remoteDB repository on the Internet. Ideally, an author could choose to store remote repositories in a central database used by all statistical researchers or on his or her own webserver. Future tools for readers would facilitate access to R objects (stored in a remoteDB repository) created in reproducible statistical analyses as well as to the associated R code. The R objects used in the computation of each figure or numerical value in a published paper, for example, could be indexed in a stashR database and subsequently reproduced.

References

Brahm DE (2002) Delayed data packages. R News 2(3):
Gentleman R, Temple Lang D (2004) Statistical analyses and reproducible research. Technical Report 2, Bioconductor Project Working Papers
Gentry J, Gentleman R (2004) reposTools: repository tools for R, R package Version
Leisch F (2002) Sweave: Dynamic generation of statistical reports using literate data analysis. In: Härdle W, Rönz B (eds) Compstat Proceedings in computational statistics, pp Physika Verlag, Heidelberg, ISBN
Peng RD (2006) Interacting with data using the filehash package. R News 6(4):
Peng RD, Dominici F, Zeger SL (2006) Reproducible epidemiologic research. Am J Epidemiol 163(9): , doi: /aje/kwj093
R Special Interest Group on Databases (R-SIG-DB) (2006) DBI: R Database Interface, R package Version
SafeGuard Enterprise Tools guide Product version: 5.60 Document date: April 2011 Contents 1 About this guide...3 2 Displaying the system status with SGNState...3 3 Reverting an unsuccessful installation
IBRIX Fusion 3.1 Release Notes
Release Date April 2009 Version IBRIX Fusion Version 3.1 Release 46 Compatibility New Features Version 3.1 CLI Changes RHEL 5 Update 3 is supported for Segment Servers and IBRIX Clients RHEL 5 Update 2
Backup Tab. User Guide
Backup Tab User Guide Contents 1. Introduction... 2 Documentation... 2 Licensing... 2 Overview... 2 2. Create a New Backup... 3 3. Manage backup jobs... 4 Using the Edit menu... 5 Overview... 5 Destination...
Fixity Checks: Checksums, Message Digests and Digital Signatures Audrey Novak, ILTS Digital Preservation Committee November 2006
Fixity Checks: Checksums, Message Digests and Digital Signatures Audrey Novak, ILTS Digital Preservation Committee November 2006 Introduction: Fixity, in preservation terms, means that the digital object
Burst Technology bt-loganalyzer SE
Burst Technology bt-loganalyzer SE Burst Technology Inc. 9240 Bonita Beach Rd, Bonita Springs, FL 34135 CONTENTS WELCOME... 3 1 SOFTWARE AND HARDWARE REQUIREMENTS... 3 2 SQL DESIGN... 3 3 INSTALLING BT-LOGANALYZER...
Backup and Restore with 3 rd Party Applications
Backup and Restore with 3 rd Party Applications Contents Introduction...1 Backup Software Capabilities...1 Backing up a Single Autodesk Vault Site...1 Backup Process...1 Restore Process...1 Backing up
ADSMConnect Agent for Oracle Backup on Sun Solaris Installation and User's Guide
ADSTAR Distributed Storage Manager ADSMConnect Agent for Oracle Backup on Sun Solaris Installation and User's Guide IBM Version 2 SH26-4063-00 IBM ADSTAR Distributed Storage Manager ADSMConnect Agent
Resco CRM Server Guide. How to integrate Resco CRM with other back-end systems using web services
Resco CRM Server Guide How to integrate Resco CRM with other back-end systems using web services Integrating Resco CRM with other back-end systems using web services (Data, Metadata) This document consists
Documentum Content Distribution Services TM Administration Guide
Documentum Content Distribution Services TM Administration Guide Version 5.3 SP5 August 2007 Copyright 1994-2007 EMC Corporation. All rights reserved. Table of Contents Preface... 7 Chapter 1 Introducing
Chapter 13 File and Database Systems
Chapter 13 File and Database Systems Outline 13.1 Introduction 13.2 Data Hierarchy 13.3 Files 13.4 File Systems 13.4.1 Directories 13.4. Metadata 13.4. Mounting 13.5 File Organization 13.6 File Allocation
Chapter 13 File and Database Systems
Chapter 13 File and Database Systems Outline 13.1 Introduction 13.2 Data Hierarchy 13.3 Files 13.4 File Systems 13.4.1 Directories 13.4. Metadata 13.4. Mounting 13.5 File Organization 13.6 File Allocation
Merkle Hash Trees for Distributed Audit Logs
Merkle Hash Trees for Distributed Audit Logs Subject proposed by Karthikeyan Bhargavan [email protected] April 7, 2015 Modern distributed systems spread their databases across a large number
CA Workload Automation Agent for Databases
CA Workload Automation Agent for Databases Implementation Guide r11.3.4 This Documentation, which includes embedded help systems and electronically distributed materials, (hereinafter referred to as the
Raima Database Manager Version 14.0 In-memory Database Engine
+ Raima Database Manager Version 14.0 In-memory Database Engine By Jeffrey R. Parsons, Senior Engineer January 2016 Abstract Raima Database Manager (RDM) v14.0 contains an all new data storage engine optimized
SAS 9.4 Intelligence Platform: Migration Guide, Second Edition
SAS 9.4 Intelligence Platform: Migration Guide, Second Edition SAS Documentation The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2015. SAS 9.4 Intelligence Platform:
EndNote Beyond the Basics
IOE Library Guide EndNote Beyond the Basics These notes assume that you know EndNote basics and are using it regularly. Additional tips and instruction is contained within the guides and FAQs available
Microsoft Exchange 2003 Disaster Recovery Operations Guide
Microsoft Exchange 2003 Disaster Recovery Operations Guide Microsoft Corporation Published: December 12, 2006 Author: Exchange Server Documentation Team Abstract This guide provides installation and deployment
Best Practices for Deploying and Managing Linux with Red Hat Network
Best Practices for Deploying and Managing Linux with Red Hat Network Abstract This technical whitepaper provides a best practices overview for companies deploying and managing their open source environment
Field Name Data Type Description Field Size Format
Data Dictionary: tblvendor (Sample) Field Name Data Type Field Size Format Default Value Input Mask VID Text Unique Vendor Identifier 10 >L????? VName Text Vendor Name 50 VAddress Text Vendor Address 50
Sisense. Product Highlights. www.sisense.com
Sisense Product Highlights Introduction Sisense is a business intelligence solution that simplifies analytics for complex data by offering an end-to-end platform that lets users easily prepare and analyze
Zen Internet. Online Data Backup. Zen Vault Professional Plug-ins. Issue: 2.0.08
Zen Internet Online Data Backup Zen Vault Professional Plug-ins Issue: 2.0.08 Contents 1 Plug-in Installer... 3 1.1 Installation and Configuration... 3 2 Plug-ins... 5 2.1 Email Notification... 5 2.1.1
GraySort and MinuteSort at Yahoo on Hadoop 0.23
GraySort and at Yahoo on Hadoop.23 Thomas Graves Yahoo! May, 213 The Apache Hadoop[1] software library is an open source framework that allows for the distributed processing of large data sets across clusters
Backing Up TestTrack Native Project Databases
Backing Up TestTrack Native Project Databases TestTrack projects should be backed up regularly. You can use the TestTrack Native Database Backup Command Line Utility to back up TestTrack 2012 and later
EMC Documentum Webtop
EMC Documentum Webtop Version 6.5 User Guide P/N 300 007 239 A01 EMC Corporation Corporate Headquarters: Hopkinton, MA 01748 9103 1 508 435 1000 www.emc.com Copyright 1994 2008 EMC Corporation. All rights
Package uptimerobot. October 22, 2015
Type Package Version 1.0.0 Title Access the UptimeRobot Ping API Package uptimerobot October 22, 2015 Provide a set of wrappers to call all the endpoints of UptimeRobot API which includes various kind
There are numerous ways to access monitors:
Remote Monitors REMOTE MONITORS... 1 Overview... 1 Accessing Monitors... 1 Creating Monitors... 2 Monitor Wizard Options... 11 Editing the Monitor Configuration... 14 Status... 15 Location... 17 Alerting...
Integrity Checking and Monitoring of Files on the CASTOR Disk Servers
Integrity Checking and Monitoring of Files on the CASTOR Disk Servers Author: Hallgeir Lien CERN openlab 17/8/2011 Contents CONTENTS 1 Introduction 4 1.1 Background...........................................
You can specify IPv4 and IPv6 addresses while performing various tasks in this feature. The resource
The feature enables the configuration of a Virtual Private Network (VPN) routing and forwarding instance (VRF) table so that the domain name system (DNS) can forward queries to name servers using the VRF
Database Administration with MySQL
Database Administration with MySQL Suitable For: Database administrators and system administrators who need to manage MySQL based services. Prerequisites: Practical knowledge of SQL Some knowledge of relational
Using Object Database db4o as Storage Provider in Voldemort
Using Object Database db4o as Storage Provider in Voldemort by German Viscuso db4objects (a division of Versant Corporation) September 2010 Abstract: In this article I will show you how
Hadoop Distributed File System. Dhruba Borthakur Apache Hadoop Project Management Committee [email protected] [email protected]
Hadoop Distributed File System Dhruba Borthakur Apache Hadoop Project Management Committee [email protected] [email protected] Hadoop, Why? Need to process huge datasets on large clusters of computers
Monitoring Siebel Enterprise
Monitoring Siebel Enterprise eg Enterprise v6 Restricted Rights Legend The information contained in this document is confidential and subject to change without notice. No part of this document may be reproduced
Secure Installation and Operation of Your Xerox Multi-Function Device. Version 1.0 August 6, 2012
Secure Installation and Operation of Your Xerox Multi-Function Device Version 1.0 August 6, 2012 Secure Installation and Operation of Your Xerox Multi-Function Device Purpose and Audience This document
Package RCassandra. R topics documented: February 19, 2015. Version 0.1-3 Title R/Cassandra interface
Version 0.1-3 Title R/Cassandra interface Package RCassandra February 19, 2015 Author Simon Urbanek Maintainer Simon Urbanek This packages provides
SafeGuard Enterprise Web Helpdesk. Product version: 6 Document date: February 2012
SafeGuard Enterprise Web Helpdesk Product version: 6 Document date: February 2012 Contents 1 SafeGuard web-based Challenge/Response...3 2 Installation...5 3 Authentication...8 4 Select the Web Helpdesk
Wave Analytics Data Integration
Wave Analytics Data Integration Salesforce, Spring 16 @salesforcedocs Last updated: April 28, 2016 Copyright 2000 2016 salesforce.com, inc. All rights reserved. Salesforce is a registered trademark of
Connectivity. Alliance Access 7.0. Database Recovery. Information Paper
Connectivity Alliance Access 7.0 Database Recovery Information Paper Table of Contents Preface... 3 1 Overview... 4 2 Resiliency Concepts... 6 2.1 Database Loss Business Impact... 6 2.2 Database Recovery
Fairsail. Implementer. Fairsail to Active Directory Synchronization. Version 1.0 FS-PS-FSAD-IG-201310--R001.00
Fairsail Implementer Fairsail to Active Directory Synchronization Version 1.0 FS-PS-FSAD-IG-201310--R001.00 Fairsail 2013. All rights reserved. This document contains information proprietary to Fairsail
- Suresh Khanal. http://mcqsets.com. http://www.psexam.com Microsoft Excel Short Questions and Answers 1
- Suresh Khanal http://mcqsets.com http://www.psexam.com Microsoft Excel Short Questions and Answers 1 Microsoft Access Short Questions and Answers with Illustrations Part I Suresh Khanal Kalanki, Kathmandu
UNINETT Sigma2 AS: architecture and functionality of the future national data infrastructure
UNINETT Sigma2 AS: architecture and functionality of the future national data infrastructure Authors: A O Jaunsen, G S Dahiya, H A Eide, E Midttun Date: Dec 15, 2015 Summary Uninett Sigma2 provides High
StoreGrid Backup Server With MySQL As Backend Database:
StoreGrid Backup Server With MySQL As Backend Database: Installing and Configuring MySQL on Windows Overview StoreGrid now supports MySQL as a backend database to store all the clients' backup metadata
GDC Data Transfer Tool User s Guide. NCI Genomic Data Commons (GDC)
GDC Data Transfer Tool User s Guide NCI Genomic Data Commons (GDC) Contents 1 Getting Started 3 Getting Started.......................................................... 3 The GDC Data Transfer Tool: An
create-virtual-server creates the named virtual server
Name Synopsis Description Options create-virtual-server creates the named virtual server create-virtual-server [--help] --hosts hosts [--httplisteners http-listeners] [--networklisteners network-listeners]
Gladinet Cloud Backup V3.0 User Guide
Gladinet Cloud Backup V3.0 User Guide Foreword The Gladinet User Guide gives step-by-step instructions for end users. Revision History Gladinet User Guide Date Description Version 8/20/2010 Draft Gladinet
Acronis Backup Advanced Version 11.5 Update 6
Acronis Backup Advanced Version 11.5 Update 6 APPLIES TO THE FOLLOWING PRODUCTS Advanced for Exchange BACKING UP MICROSOFT EXCHANGE SERVER DATA Copyright Statement Copyright Acronis International GmbH,
