CDL's OAI Harvest Architecture Development
Transcription
1 California Digital Library: CDL's OAI Harvest Architecture Development. Presentation, Wednesday, Bill Landis, Metadata Coordinator, CDL
2 Overview
- OAI harvesting infrastructure planning and development thus far (Bill)
  - Developing CDL capacity to harvest, transform/normalize, and ingest metadata-only digital objects into a CF repository
- Overview of results of metadata clustering experiments and possible next steps (Dave)
  - Possibilities for using clustering and classification as a means of enriching CDL digital objects (American West? Others?) with consistent metadata to enhance topical browsing access for end users
- Questions/discussion
3 CDL Common Framework Context
- Target for long-term management of American West metadata objects: Preservation Repository at CDL
- Repository based on OAIS Reference Model
- Records harvested to file system
- Feeder process produces METS object/SIP
- Ingest process for Repository creates AIP, or an enriched/transformed AmWest metadata package
- Access services driven by indexes created from Repository METS objects
4 OAI Harvest Architecture
[Diagram labels: three OAI Providers; U.Mich. OAI Harvester; Harvest Tracking DB; OAI Sets on File Server]
5 Prototype Harvest Tracking DB
- UMich OAI harvester
  - Especially like its ability to log harvest details (time, progress, completion, # of records harvested), automatically fed into Harvest Tracking DB
- Stores
  - Metadata about data providers/institutions
  - Information about who set up harvest for which project (important in CDL environment)
  - Initial harvest dates and re-harvest frequency
- Goals
  - Feed information about an object's harvest into METS record creation process
  - Feed information to CDL's automated scheduling software for future re-harvests
  - Currently OAI specific; we ultimately envision this being folded into CDL Common Framework Tracking DB (along with tracking information for Preservation Repository ingest, Web crawling, etc.)
6 Prototype Harvest Tracking DB
- MySQL database
  - Most information auto-populated during harvesting process
  - Tables for Institution, Data provider, Harvest, Set, User
- phpMyAdmin user interface
  - Supports human-driven (as opposed to automated) interactions with data providers
  - Supports human input of certain field values
    - Public access site URL at Set level if this exists
    - Set-specific or Repository-specific Rights URL when this is not supplied in OAI records or in Identify or ListSets responses; in these cases CDL will add this human-input Rights URL from the Tracking DB during the course of METS-record creation
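The tracking tables described on this slide could be sketched roughly as follows. This is only an illustration under stated assumptions: every table and column name is hypothetical, the User table is omitted for brevity, and SQLite (via Python's standard library) stands in for the MySQL database so that the sketch runs anywhere.

```python
import sqlite3

# Hypothetical schema: every table and column name below is an illustrative
# assumption, and SQLite stands in for the MySQL database named on the slide.
SCHEMA = """
CREATE TABLE institution (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE data_provider (
    id             INTEGER PRIMARY KEY,
    institution_id INTEGER NOT NULL REFERENCES institution(id),
    base_url       TEXT NOT NULL,
    rights_url     TEXT            -- human-input, repository-level
);
CREATE TABLE oai_set (
    id               INTEGER PRIMARY KEY,
    data_provider_id INTEGER NOT NULL REFERENCES data_provider(id),
    set_spec         TEXT NOT NULL,
    public_url       TEXT,         -- human-input, set-level
    rights_url       TEXT          -- human-input, set-level
);
CREATE TABLE harvest (
    id           INTEGER PRIMARY KEY,
    set_id       INTEGER NOT NULL REFERENCES oai_set(id),
    started_at   TEXT NOT NULL,
    completed_at TEXT,
    record_count INTEGER
);
"""


def open_tracking_db():
    """Create an empty in-memory tracking database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def log_harvest(conn, set_id, started_at, completed_at, record_count):
    """Record the details of one harvest run and return its row id."""
    cur = conn.execute(
        "INSERT INTO harvest (set_id, started_at, completed_at, record_count) "
        "VALUES (?, ?, ?, ?)",
        (set_id, started_at, completed_at, record_count),
    )
    return cur.lastrowid


def unharvested_sets(conn):
    """Return the set_spec of every set that has no logged harvest."""
    rows = conn.execute(
        "SELECT s.set_spec FROM oai_set s "
        "LEFT JOIN harvest h ON h.set_id = s.id "
        "WHERE h.id IS NULL ORDER BY s.set_spec"
    )
    return [r[0] for r in rows]
```

Keeping the human-input URLs as nullable columns mirrors the slide's point that most values are auto-populated and only a few are typed in by staff.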
7 Input OAI data provider URL; select OAI verb to send
8 Choose sets for which to request records; edit information about set in Tracking DB
9 In Edit mode, add contextual and other information for later use in normalizing/enriching record.
* In this case LC provides the set's public URL in the set description, but not in each record, so we record it here to be added to our transformed metadata record (AIP).
* We also add the generic rights URL for the set or institution, depending on which is available, for use in case individual records do not contain a dc.rights field.
* Other fields contain data auto-populated by the Tracking DB.
10 Switching to the ListRecords OAI verb request invokes a harvest of selected, previously unharvested sets.
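The verb requests shown on slides 7 through 10 are ordinary OAI-PMH GET requests. As a minimal sketch, assuming a hypothetical base URL and set names, the request URLs could be assembled like this; the helper names are illustrative and are not part of the UMich harvester or the Tracking DB.

```python
from urllib.parse import urlencode


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH GET request URL for the given verb and arguments."""
    params = {"verb": verb}
    params.update(arguments)
    return f"{base_url}?{urlencode(params)}"


def list_records_urls(base_url, set_specs, metadata_prefix="oai_dc"):
    """Return one ListRecords request URL per selected set."""
    return [
        oai_request_url(base_url, "ListRecords",
                        metadataPrefix=metadata_prefix, set=spec)
        for spec in set_specs
    ]
```

A real harvest would also follow any resumptionToken returned in each response, which this sketch leaves out.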
11 DB provides browsing access to:
- OAI verb responses stored in Tracking DB
- Previously harvested sets
- Newly harvested sets
12 Access to individual records through the Tracking DB is both to the harvested XML file and to the <identifier> URL's destination
13 Harvested XML record
14 Identifier URL's destination
15 Metadata Transformation (Feeder/Ingest)
[Diagram labels: OAI Sets; Harvest Tracking DB; OAI Feeder Process (creates METS objects); SIPs; Subset filter; Original Record (SIP); Enhanced Record (AIP); CDL Common Framework (CF) Repository Object. Transforms: Context (context information), DateTx (date/era normalization), ThmbTx (identifier validator/thumbnail generator), GeoTx (geographic place names), GenreTx (type/genre), TitleTx (title generator), future transforms?]
16 Repository
[Diagram labels: OAI Sets; Harvest Tracking DB; Feeder Process; SIPs; DateTx; GeoTx; TypeTx; TitleTx; Repo; Clustering; Index; User Interface; Access Services]
17 Feeder Process for OAI Records
- METS record creation
  - Successfully wrapped and ingested approximately 100,000 metadata-only OAI objects during development/testing in March
  - New CDL Repository service: Catalog added to previously existing Preserve
  - Untouched copy of as-harvested metadata becomes SIP
  - Information from Harvest Tracking DB parked in METS record ("Context" info on previous slide) for downstream use
  - Default rights URL for set or repository as backup
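The "default rights URL as backup" idea can be illustrated with a small sketch. It assumes the harvested record is an oai_dc XML fragment and that the default URL has already been looked up in the Tracking DB; the function name and sample URLs are hypothetical, and this is not the feeder's actual code.

```python
import xml.etree.ElementTree as ET

# Namespace of the unqualified Dublin Core elements used in oai_dc records.
DC_NS = "http://purl.org/dc/elements/1.1/"


def rights_statements(record_xml, default_rights_url=None):
    """Return the record's own dc:rights values, else the default as backup."""
    root = ET.fromstring(record_xml)
    found = [
        el.text.strip()
        for el in root.iter(f"{{{DC_NS}}}rights")
        if el.text and el.text.strip()
    ]
    if found:
        return found
    # No usable dc:rights in the record: fall back to the human-input URL.
    return [default_rights_url] if default_rights_url else []
```

Because the untouched harvested copy is kept as the SIP, a fallback like this would only ever affect the enriched record, never the original.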
18 Repository Ingest Process
- Generic ingest profiles
  - Hope: generic profile will be able to take care of 90+% of records harvested
  - Reality: we'll see
- Profiles are essentially an ordered list of calls to CDL CF utilities for things like
  - Normalization of specific fields
  - Enrichment activities on specific fields
  - Subsetting and removing out-of-scope content for a given collection
- Special profiles for specific sets
  - We hope this is the rare exception, not the rule
  - Ex: The only content dates in a set's records are in the <identifier> string!
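The notion of a profile as an ordered list of utility calls can be sketched as below. Everything here is an assumption for illustration: records are modeled as dictionaries mapping a Dublin Core field name to a list of strings, and the two steps are made-up stand-ins for the CDL CF utilities.

```python
def collapse_whitespace(record):
    """Normalization step: squeeze runs of whitespace in every field value."""
    return {field: [" ".join(value.split()) for value in values]
            for field, values in record.items()}


def in_scope_only(keyword):
    """Subsetting step factory: keep a record only if coverage/subject mention keyword."""
    def step(record):
        text = " ".join(record.get("coverage", []) + record.get("subject", []))
        return record if keyword.lower() in text.lower() else None
    return step


def run_profile(record, profile):
    """Apply each step in order; a step returning None drops the record."""
    for step in profile:
        record = step(record)
        if record is None:
            return None
    return record


# A hypothetical generic profile; a special profile would be a different list.
GENERIC_PROFILE = [collapse_whitespace, in_scope_only("california")]
```

Modeling a special profile as just another list keeps set-specific handling isolated from the generic path.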
19 Ingest Utilities
- Subsetting and disposing of out-of-scope content
  - Improving best practices in OAI data provider community will hopefully make this easier as time goes by, both at set and record levels
  - Software-assisted human analysis of OAI sets/records seems the best way to develop subsetting algorithms, probably tied to specific OAI service provider projects
  - We may need to do this in a human-mediated step post-harvest, but pre-ingest
20 Ingest Utilities
- Date Normalization
  - Coding completed, QA plan in development
  - Stats from last test run: 359,830 records processed, found and normalized at least one date in 279,337 records (76.4%) using <date>, <title>, & <description> fields
- Process
  - Analysis of 360,000 records
    - Roy's web-based analysis tools
    - Spotfire DecisionSite - very large datasets problematic
    - Perl hacking courtesy of Mike
    - Information derived from clustering experiments (probably more useful for geographic and genre information than for dates)
21 Ingest Utilities
- Date Normalization process (cont.)
  - Developing algorithm
    - Steps for normalizing date patterns extracted from <date> fields
    - Steps for extracting date patterns from other fields if no <date> field or if <date> field contains some version of undated/unknown/n.d./s.d., etc.
      - Currently limiting this to <title> and <description> fields
    - Steps for writing out normalized dates in CDL qualified Dublin Core fields in transformed AIP descriptive metadata
  - Documentation, QA testing
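The field-fallback order described above can be sketched as follows. This is a simplified illustration rather than the CDL algorithm: it recognizes only four-digit years from 1500 through 2099, the list of "undated" markers is an assumption, and records are modeled as dictionaries of field name to list of strings.

```python
import re

# Four-digit years from 1500 through 2099 (an assumed, deliberately narrow pattern).
YEAR = re.compile(r"\b(1[5-9]\d\d|20\d\d)\b")
# Values that mean "no date"; the exact list of variants is an assumption.
UNDATED = re.compile(r"^\s*(undated|unknown|n\.\s*d\.|s\.\s*d\.)\s*$", re.IGNORECASE)


def find_years(record):
    """Return (source_field, sorted years) from date, then title, then description."""
    for field in ("date", "title", "description"):
        values = record.get(field, [])
        if field == "date":
            # Skip explicit undated markers so the search moves on to other fields.
            values = [v for v in values if not UNDATED.match(v)]
        years = sorted({int(y) for v in values for y in YEAR.findall(v)})
        if years:
            return field, years
    return None, []
```

For example, `find_years({"date": ["n.d."], "title": ["View of Sacramento, 1849"]})` falls through to the title and reports the year found there. Writing the result into qualified Dublin Core fields is left out of this sketch.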
22 Ingest Utilities
- Other transformations needed to support envisioned interface we'll be testing on target American West user populations beginning in 2006 ("Evaluation" year for AmWest):
  - Work on these ongoing during remainder of 2005, into 2006
  - Date normalization utility work is prototype for process for other utilities
  - Prototype vs. production approaches for CDL CF development?
23 Ingest Utilities
- Higher priority to support evaluation activities:
  - Identifier validator/thumbnail generator
    - All of our user research indicates that end users want thumbnails in results lists; most OAI records don't provide such links
    - We anticipate that this utility will test for validity of links and destination filetypes, grab image files, make low-resolution thumbnail for object (algorithm to be determined), store pointer in METS record
  - Geographical location terms
    - Analysis of 360K records underway
    - Currently planning to transform to TGN hierarchical terms (CDL licensing of TGN from Getty is in process)
    - Will need to prototype some kind of basic, back-end Thesaurus Manager utility with which normalization utility can interact
    - Derived from content of <coverage> and other candidate DC fields
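The first stage of the identifier validator could look something like the sketch below, which only inspects the identifier strings themselves. It is an assumption-laden illustration: the list of image suffixes is made up, nothing is fetched over the network, and a real utility would still need to request each URL to confirm that the link resolves and to check the returned content type before making a thumbnail.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumed list of filetypes worth fetching for thumbnails.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".tif", ".tiff"}


def is_wellformed_http_url(value):
    """True if the identifier parses as an http(s) URL with a host."""
    parts = urlparse(value.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def thumbnail_candidates(identifiers):
    """Keep identifiers that are http(s) URLs whose path ends in an image suffix."""
    candidates = []
    for value in identifiers:
        if not is_wellformed_http_url(value):
            continue
        suffix = PurePosixPath(urlparse(value.strip()).path).suffix.lower()
        if suffix in IMAGE_SUFFIXES:
            candidates.append(value.strip())
    return candidates
```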
24 Ingest Utilities
- Lower priority (can start evaluation activities and work these in during 2006)
  - Genre terms
    - Anticipating 2 hierarchical levels
      - First: subset of DCMI Type Vocabulary
      - Second: subset of AAT derived from analysis of current pool of harvested records
    - Will also interact with a back-end Thesaurus Manager
    - Derived from <type> and other candidate DC fields
  - Title extraction utility
    - When a record has no <title> field, we'll have an algorithm for extracting approx. the first 10 words of the <description> field as a supplied title
      - Approx. 800 of
    - If there is no <description> field, we'll insert an explicit <title>[no Title]</title> field for use in results lists and citations
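The title extraction rule described above is simple enough to sketch directly. The function name and the dictionary record model are assumptions; the 10-word limit and the "[no Title]" placeholder come from the slide.

```python
def supplied_title(record, max_words=10):
    """Return the record's title, else the first words of its description,
    else the explicit "[no Title]" placeholder."""
    titles = [t.strip() for t in record.get("title", []) if t.strip()]
    if titles:
        return titles[0]
    for description in record.get("description", []):
        words = description.split()
        if words:
            return " ".join(words[:max_words])
    return "[no Title]"
```

Cutting on whitespace-separated words keeps the sketch short; a production version might prefer to stop at a sentence boundary.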
25 Ingest Utilities
- Topic terms
  - High priority to support American West evaluation activities beginning in November 2005
  - Several approaches to consider; more in Dave's talk
- Important for American West evaluation activities
  - Feedback from usability testing of hierarchical, faceted browsing approach in American West interface will help CDL assess need for further development of CF support for clustering and classification activities
- Testing of usefulness of 23 High-level Topics
  - Testing 23 topics locally yielded a 15-topic list
  - Felicia/Jane designed online survey that was up for 2 weeks
  - Bill solicited survey takers from K-12 teacher and historian listservs
  - 194 responses
  - Stay tuned for analysis
Ex Libris Rosetta: A Digital Preservation System Product Description
Ex Libris Rosetta: A Digital Preservation System Product Description CONFIDENTIAL INFORMATION The information herein is the property of Ex Libris Ltd. or its affiliates and any misuse or abuse will result
More informationHow collaboration can save [more of] the web: recent progress in collaborative web archiving initiatives
How collaboration can save [more of] the web: recent progress in collaborative web archiving initiatives Anna Perricci Columbia University Libraries Best Practices Exchange November 14, 2013 Overview Web
More informationEncoding Library of Congress Subject Headings in SKOS: Authority Control for the Semantic Web
Encoding Library of Congress Subject Headings in SKOS: Authority Control for the Semantic Web Corey A Harper University of Oregon Libraries Tel: +1 541 346 1854 Fax:+1 541 346 3485 charper@uoregon.edu
More informationFunctional Requirements for Digital Asset Management Project version 3.0 11/30/2006
/30/2006 2 3 4 5 6 7 8 9 0 2 3 4 5 6 7 8 9 20 2 22 23 24 25 26 27 28 29 30 3 32 33 34 35 36 37 38 39 = required; 2 = optional; 3 = not required functional requirements Discovery tools available to end-users:
More informationA guide to the lifeblood of DAM:
A guide to the lifeblood of DAM: Key concepts and best practices for using metadata in digital asset management systems. By John Horodyski. Sponsored by Widen Enterprises and DigitalAssetManagement.com.
More informationHow collaboration can save [more of] the web: recent progress in collaborative web archiving initiatives
How collaboration can save [more of] the web: recent progress in collaborative web archiving initiatives Anna Perricci Columbia University Libraries METRO Conference 2014 January 15, 2014 Overview Web
More informationNext-Generation Technical Services (NGTS) Digital Asset Management System (DAMS) Requirements
Next-Generation Technical Services (NGTS) Digital Asset Management System (DAMS) Requirements Final Report July 20, 2012 Power of Three (POT) #1, Lightning Team #1A Kathleen Cameron, UC San Francisco (Co-chair)
More informationDEVELOPING A VISUAL DIGITAL IMAGE COLLECTION. Calgary Collections 2005: The Changing Collections Environment CLA Pre-Conference June 15, 2005
DEVELOPING A VISUAL DIGITAL IMAGE COLLECTION Heather D Amour Collections Librarian University of Calgary Marilyn Nasserden Fine Arts Librarian University of Calgary Calgary Collections 2005: The Changing
More informationGUIDELINES FOR THE CREATION OF DIGITAL COLLECTIONS
GUIDELINES FOR THE CREATION OF DIGITAL COLLECTIONS Best Practices for Descriptive Metadata This document sets forth guidelines for creating descriptive metadata for items in CARLI Digital Collections (CDC)
More informationDigital Asset Management. Content Control for Valuable Media Assets
Digital Asset Management Content Control for Valuable Media Assets Overview Digital asset management is a core infrastructure requirement for media organizations and marketing departments that need to
More informationERA Challenges. Draft Discussion Document for ACERA: 10/7/30
ERA Challenges Draft Discussion Document for ACERA: 10/7/30 ACERA asked for information about how NARA defines ERA completion. We have a list of functions that we would like for ERA to perform that we
More informationCore Competencies for Visual Resources Management
Core Competencies for Visual Resources Management Introduction An IMLS Funded Research Project at the University at Albany, SUNY Hemalata Iyer, Associate Professor, University at Albany Visual resources
More informationAcronym: Data without Boundaries. Deliverable D12.1 (Database supporting the full metadata model)
Project N : 262608 Acronym: Data without Boundaries Deliverable D12.1 (Database supporting the full metadata model) Work Package 12 (Implementing Improved Resource Discovery for OS Data) Reporting Period:
More informationInteragency Science Working Group. National Archives and Records Administration
Interagency Science Working Group 1 National Archives and Records Administration Establishing Trustworthy Digital Repositories: A Discussion Guide Based on the ISO Open Archival Information System (OAIS)
More informationLibrary metadata, whether in the form of MARC 21
Metadata to Support Next-Generation Library Resource Discovery: Lessons from the extensible Catalog, Phase 1 Jennifer Bowen The extensible Catalog (XC) Project at the University of Rochester will design
More informationK@ A collaborative platform for knowledge management
White Paper K@ A collaborative platform for knowledge management Quinary SpA www.quinary.com via Pietrasanta 14 20141 Milano Italia t +39 02 3090 1500 f +39 02 3090 1501 Copyright 2004 Quinary SpA Index
More informationNotes about possible technical criteria for evaluating institutional repository (IR) software
Notes about possible technical criteria for evaluating institutional repository (IR) software Introduction Andy Powell UKOLN, University of Bath December 2005 This document attempts to identify some of
More informationFlattening Enterprise Knowledge
Flattening Enterprise Knowledge Do you Control Your Content or Does Your Content Control You? 1 Executive Summary: Enterprise Content Management (ECM) is a common buzz term and every IT manager knows it
More informationLong Term Knowledge Retention and Preservation
Long Term Knowledge Retention and Preservation Aziz Bouras University of Lyon, DISP Laboratory France abdelaziz.bouras@univ-lyon2.fr Recent years: How should digital 3D data and multimedia information
More informationImplementing an Integrated Digital Asset Management System: FEDORA and OAIS in Context
Implementing an Integrated Digital Asset Management System: FEDORA and OAIS in Context Paul Bevan DAMS Implementation Manager paul.bevan@llgc.org.uk Structure! Background and overview! OAIS Model! Why
More informationDigital Curation at the National Space Science Data Center
Digital Curation at the National Space Science Data Center DigCCurr2007: Digital Curation In Practice 20 April 2007 Donald Sawyer/NASA/GSFC Ed Grayzeck/NASA/GSFC Overview NSSDC Requirements & Digital Curation
More informationMetadata driven framework for the Canada Research Data Centre Network
Metadata driven framework for the Canada Research Data Centre Network IASSIST 2010 Session A4: DDI3 Tools Pascal Heus, Metadata Technology North America pascal.heus@metadatatechnology.com http://www.metadatatechnology.com
More informationDatabase preservation toolkit:
Nov. 12-14, 2014, Lisbon, Portugal Database preservation toolkit: a flexible tool to normalize and give access to databases DLM Forum: Making the Information Governance Landscape in Europe José Carlos
More informationImplementing SharePoint 2010 as a Compliant Information Management Platform
Implementing SharePoint 2010 as a Compliant Information Management Platform Changing the Paradigm with a Business Oriented Approach to Records Management Introduction This document sets out the results
More informationCOURSE SYLLABUS COURSE TITLE:
1 COURSE SYLLABUS COURSE TITLE: FORMAT: CERTIFICATION EXAMS: 55043AC Microsoft End to End Business Intelligence Boot Camp Instructor-led None This course syllabus should be used to determine whether the
More informationCatDV Pro Workgroup Serve r
Architectural Overview CatDV Pro Workgroup Server Square Box Systems Ltd May 2003 The CatDV Pro client application is a standalone desktop application, providing video logging and media cataloging capability
More informationAdding Robust Digital Asset Management to Oracle s Storage Archive Manager (SAM)
Adding Robust Digital Asset Management to Oracle s Storage Archive Manager (SAM) Oracle's Sun Storage Archive Manager (SAM) self-protecting file system software reduces operating costs by providing data
More informationAssignment 1 Briefing Paper on the Pratt Archives Digitization Projects
Twila Rios Digital Preservation Spring 2012 Assignment 1 Briefing Paper on the Pratt Archives Digitization Projects The Pratt library digitization efforts actually encompass more than one project, including
More informationVerify and Control Headings in Bibliographic Records
OCLC Connexion Browser Guides Verify and Control Headings in Bibliographic Records Last updated: April 2013 6565 Kilgour Place, Dublin, OH 43017-3395. www.oclc.org Revision History Date Section title Description
More informationMBooks: Google Books Online at the University of Michigan Library
MBooks: Google Books Online at the University of Michigan Library Phil Farber, Chris Powell, Cory Snavely University of Michigan Library Information Technology Architecture overview Four basic pieces:
More informationNavigating to Success: Finding Your Way Through the Challenges of Map Digitization
Presentations (Libraries) Library Faculty/Staff Scholarship & Research 10-15-2011 Navigating to Success: Finding Your Way Through the Challenges of Map Digitization Cory K. Lampert University of Nevada,
More informationOCLC Content Management Services John A. Hearty, OCLC Dublin, OH
Proceedings of the 9th Annual Federal Depository Library Conference October 22-25, 2000 OCLC Content Management Services John A. Hearty, OCLC Dublin, OH Agenda Dimensions of Digital Archiving Obstacles
More informationSharePoint 2010 Interview Questions-Architect
Basic Intro SharePoint Architecture Questions 1) What are Web Applications in SharePoint? An IIS Web site created and used by SharePoint 2010. Saying an IIS virtual server is also an acceptable answer.
More informationCONCEPTCLASSIFIER FOR SHAREPOINT
CONCEPTCLASSIFIER FOR SHAREPOINT PRODUCT OVERVIEW The only SharePoint 2007 and 2010 solution that delivers automatic conceptual metadata generation, auto-classification and powerful taxonomy tools running
More informationMetadata Quality Control for Content Migration: The Metadata Migration Project at the University of Houston Libraries
Metadata Quality Control for Content Migration: The Metadata Migration Project at the University of Houston Libraries Andrew Weidner University of Houston, USA ajweidner@uh.edu Annie Wu University of Houston,
More informationDSpace: An Institutional Repository from the MIT Libraries and Hewlett Packard Laboratories
DSpace: An Institutional Repository from the MIT Libraries and Hewlett Packard Laboratories MacKenzie Smith, Associate Director for Technology Massachusetts Institute of Technology Libraries, Cambridge,
More informationEnabling the Big Data Commons through indexing of data and their interactions
biomedical and healthcare Data Discovery Index Ecosystem Enabling the Big Data Commons through indexing of and their interactions 2 nd BD2K all-hands meeting Bethesda 11/12/15 Aims 1. Help users find accessible
More informationArchiving Systems. Uwe M. Borghoff Universität der Bundeswehr München Fakultät für Informatik Institut für Softwaretechnologie. uwe.borghoff@unibw.
Archiving Systems Uwe M. Borghoff Universität der Bundeswehr München Fakultät für Informatik Institut für Softwaretechnologie uwe.borghoff@unibw.de Decision Process Reference Models Technologies Use Cases
More information2009 ikeep Ltd, Morgenstrasse 129, CH-3018 Bern, Switzerland (www.ikeep.com, info@ikeep.com)
CSP CHRONOS Compliance statement for ISO 14721:2003 (Open Archival Information System Reference Model) 2009 ikeep Ltd, Morgenstrasse 129, CH-3018 Bern, Switzerland (www.ikeep.com, info@ikeep.com) The international
More informationLegal Informatics Final Paper Submission Creating a Legal-Focused Search Engine I. BACKGROUND II. PROBLEM AND SOLUTION
Brian Lao - bjlao Karthik Jagadeesh - kjag Legal Informatics Final Paper Submission Creating a Legal-Focused Search Engine I. BACKGROUND There is a large need for improved access to legal help. For example,
More informationSelecting a Taxonomy Management Tool. Wendi Pohs InfoClear Consulting #SLATaxo
Selecting a Taxonomy Management Tool Wendi Pohs InfoClear Consulting #SLATaxo InfoClear Consulting What do we do? Content Analytics Strategy and Implementation, including: Taxonomy/Ontology development
More informationSo today we shall continue our discussion on the search engines and web crawlers. (Refer Slide Time: 01:02)
Internet Technology Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No #39 Search Engines and Web Crawler :: Part 2 So today we
More informationCiteSeer x in the Cloud
Published in the 2nd USENIX Workshop on Hot Topics in Cloud Computing 2010 CiteSeer x in the Cloud Pradeep B. Teregowda Pennsylvania State University C. Lee Giles Pennsylvania State University Bhuvan Urgaonkar
More informationYale University. Report to the Digital Library Federation October 2004 Volume 5, Number 1. Fall 2004. http://www.diglib.org/pubs/news05_01/
Yale University Report to the Digital Library Federation October 2004 Volume 5, Number 1. Fall 2004 http://www.diglib.org/pubs/news05_01/ TABLE OF CONTENTS i. Collections, services, and systems ii. Projects
More informationVilas Wuwongse, Thiti Vacharasintopchai, Neelawat Intaraksa Asian Institute of Technology www.ait.asia
A Common Infrastructure for Digital Contents Vilas Wuwongse, Thiti Vacharasintopchai, Neelawat Intaraksa Asian Institute of Technology www.ait.asia Outline Introduction Issues Proposed Approach A Common
More informationApplying the OAIS standard to CCLRC s British Atmospheric Data Centre and the Atlas Petabyte Storage Service
Applying the OAIS standard to CCLRC s British Atmospheric Centre and the Atlas Petabyte Storage Service Corney, D.R., De Vere, M., Folkes, T., Giaretta, D., Kleese van Dam, K., Lawrence, B. N., Pepler,
More informationNuxeo, an open source platform for content-centric business applications. Stéfane Fermigier, Nuxeo Laurent Doguin, Nuxeo
Nuxeo, an open source platform for content-centric business applications Stéfane Fermigier, Nuxeo Laurent Doguin, Nuxeo Nuxeo, the Company Providing an Open Source Content Management Platform for Business
More informationInformation Access Platforms: The Evolution of Search Technologies
Information Access Platforms: The Evolution of Search Technologies Managing Information in the Public Sphere: Shaping the New Information Space April 26, 2010 Purpose To provide an overview of current
More informationWHY DIGITAL ASSET MANAGEMENT? WHY ISLANDORA?
WHY DIGITAL ASSET MANAGEMENT? WHY ISLANDORA? Digital asset management gives you full access to and control of to the true value hidden within your data: Stories. Digital asset management allows you to
More informationCombining Technologies to Create New Solutions
Combining Technologies to Create New Solutions Vishesh Chachra VP of Emerging Businesses About VTLS Inc. VTLS has been in business for over 25 years VTLS does business in 40 countries. VTLS has offices/wholly
More informationCGHub Web-based Metadata GUI Statement of Work
CGHub Web-based Metadata GUI Statement of Work Mark Diekhans Version 1 April 23, 2012 1 Goals CGHub stores metadata and data associated from NCI cancer projects. The goal of this project
More informationData processing goes big
Test report: Integration Big Data Edition Data processing goes big Dr. Götz Güttich Integration is a powerful set of tools to access, transform, move and synchronize data. With more than 450 connectors,
More informationOPENGREY: HOW IT WORKS AND HOW IT IS USED
OPENGREY: HOW IT WORKS AND HOW IT IS USED CHRISTIANE STOCK christiane.stock@inist.fr INIST-CNRS, France Abstract OpenGrey is a unique repository providing open access to European grey literature references,
More informationInformation and documentation The Dublin Core metadata element set
ISO TC 46/SC 4 N515 Date: 2003-02-26 ISO 15836:2003(E) ISO TC 46/SC 4 Secretariat: ANSI Information and documentation The Dublin Core metadata element set Information et documentation Éléments fondamentaux
More informationCommunity Edition. Master Data Management 3.X. Administrator Guide
Community Edition Talend Master Data Management 3.X Administrator Guide Version 3.2_a Adapted for Talend MDM Studio v3.2. Administrator Guide release. Copyright This documentation is provided under the
More informationIntroduction. Architecture Re-engineering. Systems Consolidation. Data Acquisition. Data Integration. Database Technology
Introduction Data migration is necessary when an organization decides to use a new computing system or database management system that is incompatible with the current system. Architecture Re-engineering
More informationTaking Control of Library Metadata and Websites using the extensible Catalog
Taking Control of Library Metadata and Websites using the extensible Catalog Jennifer Bowen University of Rochester/eXtensible Catalog Organization Code4lib 2010, Asheville, North Carolina Feb.23, 2010
More informationIslandora: An Open Source Institutional Repository Solution. Consortium of MnPALS Libraries Annual Meeting April 2014
Islandora: An Open Source Institutional Repository Solution Consortium of MnPALS Libraries Annual Meeting April 2014 Outline Introduction to Islandora (Linda) Islandora functionality and demo (Alex) SMSU
More information#MMTM15 #INFOARCHIVE #EMCWORLD 1
#MMTM15 #INFOARCHIVE #EMCWORLD 1 1 INFOARCHIVE A TECHNICAL OVERVIEW DAVID HUMBY SOFTWARE ARCHITECT #MMTM15 2 TWEET LIVE DURING THE SESSION! Connect with us: Sign up for a Hands On Lab 6 th May, 1.30 PM,
More informationCISCO ACE XML GATEWAY TO FORUM SENTRY MIGRATION GUIDE
CISCO ACE XML GATEWAY TO FORUM SENTRY MIGRATION GUIDE Legal Marks No portion of this document may be reproduced or copied in any form, or by any means graphic, electronic, or mechanical, including photocopying,
More informationChapter 6 Basics of Data Integration. Fundamentals of Business Analytics RN Prasad and Seema Acharya
Chapter 6 Basics of Data Integration Fundamentals of Business Analytics Learning Objectives and Learning Outcomes Learning Objectives 1. Concepts of data integration 2. Needs and advantages of using data
More informationBest Practices for Structural Metadata Version 1 Yale University Library June 1, 2008
Best Practices for Structural Metadata Version 1 Yale University Library June 1, 2008 Background The Digital Production and Integration Program (DPIP) is sponsoring the development of documentation outlining
More informationWhy archiving erecords influences the creation of erecords. Martin Stürzlinger scopepartner Vienna, Austria
Why archiving erecords influences the creation of erecords Martin Stürzlinger scopepartner Vienna, Austria Electronic Records In a Productive System Created Used Changed Deleted In an Archival System No
More informationSANS Dshield Webhoneypot Project. OWASP November 13th, 2009. The OWASP Foundation http://www.owasp.org. Jason Lam
SANS Dshield Webhoneypot Project Jason Lam November 13th, 2009 SANS Internet Storm Center jason@networksec.org The Foundation http://www.owasp.org Introduction Who is Jason Lam Agenda Intro to honeypot
More informationSharePoint Term Store & Taxonomy Design Harold Brenneman Lighthouse Microsoft Technology Group
SharePoint Term Store & Taxonomy Design Harold Brenneman Lighthouse Microsoft Technology Group Lighthouse Computer Services, All rights reserved Harold Brenneman Consulting Manager MBA, focusing on the
More informationA Java Tool for Creating ISO/FGDC Geographic Metadata
F.J. Zarazaga-Soria, J. Lacasta, J. Nogueras-Iso, M. Pilar Torres, P.R. Muro-Medrano17 A Java Tool for Creating ISO/FGDC Geographic Metadata F. Javier Zarazaga-Soria, Javier Lacasta, Javier Nogueras-Iso,
More informationPublishing Linked Data Requires More than Just Using a Tool
Publishing Linked Data Requires More than Just Using a Tool G. Atemezing 1, F. Gandon 2, G. Kepeklian 3, F. Scharffe 4, R. Troncy 1, B. Vatant 5, S. Villata 2 1 EURECOM, 2 Inria, 3 Atos Origin, 4 LIRMM,
More informationBeyond The Web Drupal Meets The Desktop (And Mobile) Justin Miller Code Sorcery Workshop, LLC http://codesorcery.net/dcdc
Beyond The Web Drupal Meets The Desktop (And Mobile) Justin Miller Code Sorcery Workshop, LLC http://codesorcery.net/dcdc Introduction Personal introduction Format & conventions for this talk Assume familiarity
More informationMicrosoft Project Server 2010 Administrator's Guide
Microsoft Project Server 2010 Administrator's Guide 1 Copyright This document is provided as-is. Information and views expressed in this document, including URL and other Internet Web site references,
More informationInstitutional Repositories: Staff and Skills Set
SHERPA Document Institutional Repositories: Staff and Skills Set University of Nottingham 25 th August 2009 Circulation PUBLIC Mary Robinson University of Nottingham Introduction This document began in
More informationMeeting Increasing Demands for Metadata