CDL's OAI Harvest Architecture Development

1 California Digital Library
CDL's OAI Harvest Architecture Development
Presentation Wednesday,
Bill Landis, Metadata Coordinator, CDL

2 Overview
* OAI harvesting infrastructure planning and development thus far (Bill)
  * Developing CDL capacity to harvest, transform/normalize, and ingest metadata-only digital objects into a CF repository
* Overview of results of metadata clustering experiments and possible next steps (Dave)
  * Possibilities for using clustering and classification as a means of enriching CDL digital objects (American West? Others?) with consistent metadata to enhance topical browsing access for end users
* Questions/discussion

3 CDL Common Framework Context
* Target for long-term management of American West metadata objects: Preservation Repository at CDL
* Repository based on OAIS Reference Model
* Records harvested to file system
* Feeder process produces METS object/SIP
* Ingest process for Repository creates AIP, or an enriched/transformed AmWest metadata package
* Access services driven by indexes created from Repository METS objects

4 OAI Harvest Architecture
[Diagram labels: OAI Provider (x3); U.Mich. OAI Harvester; Harvest Tracking DB; OAI Sets on File Server]

5 Prototype Harvest Tracking DB
UMich OAI harvester
* Especially like its ability to log harvest details (time, progress, completion, # of records harvested), which are automatically fed into Harvest Tracking DB
Stores
* Metadata about data providers/institutions
* Information about who set up harvest for which project (important in CDL environment)
* Initial harvest dates and re-harvest frequency
Goals
* Feed information about an object's harvest into METS record creation process
* Feed information to CDL's automated scheduling software for future re-harvests
* Currently OAI-specific; we ultimately envision this being folded into CDL Common Framework Tracking DB (along with tracking information for Preservation Repository ingest, Web crawling, etc.)

6 Prototype Harvest Tracking DB
* MySQL database
  * Most information auto-populated during harvesting process
  * Tables for: Institution, Data provider, Harvest, Set, User
* phpMyAdmin user interface
  * Supports human-driven (as opposed to automated) interactions with data providers
  * Supports human input of certain field values
    * Public access site URL at Set level if this exists
    * Set-specific or Repository-specific Rights URL when this is not supplied in OAI records or in Identify or ListSets responses; in these cases CDL will add this human-input Rights URL from the Tracking DB during the course of METS-record creation
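The table list and the rights-URL fallback above can be sketched roughly as follows. This is only an illustration: it uses Python's built-in sqlite3 in place of MySQL, and every table name, column name, and the helper `default_rights_url` are assumptions, not CDL's actual schema.

```python
import sqlite3

# Hypothetical schema: names and columns are illustrative, not CDL's.
SCHEMA = """
CREATE TABLE institution (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    rights_url TEXT               -- human-input, repository/institution level
);
CREATE TABLE data_provider (
    id INTEGER PRIMARY KEY,
    institution_id INTEGER NOT NULL REFERENCES institution(id),
    base_url TEXT NOT NULL        -- OAI base URL
);
CREATE TABLE tracking_user (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    project TEXT                  -- who set up the harvest for which project
);
CREATE TABLE oai_set (
    id INTEGER PRIMARY KEY,
    data_provider_id INTEGER NOT NULL REFERENCES data_provider(id),
    set_spec TEXT NOT NULL,
    public_url TEXT,              -- human-input, set level
    rights_url TEXT,              -- human-input, set level
    reharvest_days INTEGER
);
CREATE TABLE harvest (
    id INTEGER PRIMARY KEY,
    set_id INTEGER NOT NULL REFERENCES oai_set(id),
    user_id INTEGER REFERENCES tracking_user(id),
    started_at TEXT,
    completed INTEGER DEFAULT 0,
    records_harvested INTEGER DEFAULT 0
);
"""


def create_tracking_db():
    """Create an in-memory tracking database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def default_rights_url(conn, set_id):
    """Return the set-level rights URL, else the institution-level one."""
    row = conn.execute(
        """SELECT COALESCE(s.rights_url, i.rights_url)
           FROM oai_set s
           JOIN data_provider p ON p.id = s.data_provider_id
           JOIN institution i ON i.id = p.institution_id
           WHERE s.id = ?""",
        (set_id,),
    ).fetchone()
    return row[0] if row else None
```

A METS-creation step could call `default_rights_url` only when a harvested record carries no rights statement of its own.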

7 Input OAI data provider URL. Select OAI verb to send.

8 Choose sets for which to request records. Edit information about set in Tracking DB.

9 In Edit mode, add contextual and other information for later use in normalizing/enriching record.
* In this case LC provides the set's public URL in the set description, but not in each record, so we record it here to be added to our transformed metadata record (AIP).
* We also add the generic rights URL for the set or institution, depending on which is available, for use in case individual records do not contain a dc.rights field.
* Other fields contain data auto-populated by the Tracking DB.

10 Switching to the ListRecords OAI verb request invokes a harvest of selected, previously unharvested sets.
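As a rough illustration of what such a request looks like on the wire, the sketch below builds OAI-PMH ListRecords URLs for selected sets that have not yet been harvested. The function names, the example base URL, and the `oai_dc` default are assumptions for illustration; this is not the UMich harvester's code.

```python
from urllib.parse import urlencode


def list_records_url(base_url, set_spec=None, metadata_prefix="oai_dc",
                     resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token is not None:
        # In OAI-PMH, resumptionToken is an exclusive argument:
        # it must not be combined with metadataPrefix or set.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def urls_for_unharvested(base_url, selected_sets, harvested_sets):
    """Return request URLs for selected sets not already harvested."""
    already = set(harvested_sets)
    return [list_records_url(base_url, s) for s in selected_sets
            if s not in already]
```

For example, `list_records_url("http://example.org/oai", "maps")` yields `http://example.org/oai?verb=ListRecords&metadataPrefix=oai_dc&set=maps`.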

11 DB provides browsing access to:
* OAI verb responses stored in Tracking DB
* Previously harvested sets
* Newly harvested sets

12 The Tracking DB gives access to individual records both as the harvested XML file and at the <identifier> URL's destination.

13 Harvested XML record

14 Identifier URL's destination

15 Metadata Transformation (Feeder/Ingest)
[Diagram labels: OAI Sets; Harvest Tracking DB; OAI Feeder Process (creates METS objects); SIP (x3); Subset filter; Original Record (SIP); Enhanced Record (AIP); CDL Common Framework (CF) Repository Object]
Transforms shown in the diagram:
* Context: Context information
* DateTx: Date/Era normalization
* Thmb Tx: Identifier validator/thumbnail generator
* GeoTx: Geographic place names
* GenreTx: Type/Genre
* TitleTx: Title generator
* Future transforms?

16 [Diagram labels: OAI Sets; Harvest Tracking DB; Feeder Process; SIP (x3); DateTx; GeoTx; TypeTx; TitleTx; Repository (Repo); Clustering; Index; Access Services; User Interface]

17 Feeder Process for OAI Records
* METS record creation
* Successfully wrapped and ingested approximately 100,000 metadata-only OAI objects during development/testing in March
* New CDL Repository service: Catalog added to previously existing Preserve
* Untouched copy of as-harvested metadata becomes SIP
* Information from Harvest Tracking DB parked in METS record ("Context" info on previous slide) for downstream use
  * Default rights URL for set or repository as backup
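The wrapping step can be pictured with the following minimal sketch, which places an untouched harvested record inside a METS `dmdSec` and parks Tracking DB context in an `amdSec`. Where CDL actually stores the context inside the METS record is not stated in these slides, so that placement, the function name `wrap_in_mets`, the section IDs, and the `context` element names are all assumptions.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def _m(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def wrap_in_mets(record_xml, object_id, context=None):
    """Wrap an as-harvested record (an XML string) in a METS element."""
    mets = ET.Element(_m("mets"), {"OBJID": object_id})

    # Untouched copy of the harvested metadata.
    dmd = ET.SubElement(mets, _m("dmdSec"), {"ID": "dmd-harvested"})
    wrap = ET.SubElement(dmd, _m("mdWrap"), {"MDTYPE": "DC"})
    ET.SubElement(wrap, _m("xmlData")).append(ET.fromstring(record_xml))

    # Assumed placement: context from the Tracking DB in amdSec/sourceMD.
    if context:
        amd = ET.SubElement(mets, _m("amdSec"))
        src = ET.SubElement(amd, _m("sourceMD"), {"ID": "src-context"})
        cwrap = ET.SubElement(
            src, _m("mdWrap"),
            {"MDTYPE": "OTHER", "OTHERMDTYPE": "harvest-context"})
        ctx = ET.SubElement(ET.SubElement(cwrap, _m("xmlData")), "context")
        for key, value in context.items():
            # Keys must be valid XML element names (hypothetical fields).
            ET.SubElement(ctx, key).text = value
    return mets
```

Calling `ET.tostring(wrap_in_mets(xml, "obj-1", {"rights_url": "..."}), encoding="unicode")` would serialize the result for inspection.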

18 Repository Ingest Process
* Generic ingest profiles
  * Hope: generic profile will be able to take care of 90+% of records harvested
  * Reality: we'll see
  * Profiles are essentially an ordered list of calls to CDL CF utilities for things like:
    * Normalization of specific fields
    * Enrichment activities on specific fields
    * Subsetting and removing out-of-scope content for a given collection
* Special profiles for specific sets
  * We hope this is the rare exception, not the rule
  * Ex: the only content dates in a set's records are in the <identifier> string!
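The idea of a profile as an ordered list of utility calls might look something like the sketch below. Records are modeled as plain dicts mapping a field name to a list of strings, and all step functions here are hypothetical stand-ins rather than actual CDL CF utilities.

```python
def strip_whitespace(record):
    """Normalization step: trim whitespace from every field value."""
    return {k: [v.strip() for v in vals] for k, vals in record.items()}


def keep_if_subject_contains(term):
    """Subsetting step factory: drop records whose subjects lack `term`."""
    def step(record):
        subjects = " ".join(record.get("subject", [])).lower()
        return record if term.lower() in subjects else None
    return step


def add_default_rights(default_url):
    """Enrichment step factory: supply a rights URL when none is present."""
    def step(record):
        if not record.get("rights"):
            record = {**record, "rights": [default_url]}
        return record
    return step


def run_profile(record, profile):
    """Apply each step in order; a step returning None discards the record."""
    for step in profile:
        record = step(record)
        if record is None:
            return None
    return record


# A "generic" profile is just an ordered list of steps.
GENERIC_PROFILE = [
    strip_whitespace,
    keep_if_subject_contains("california"),
    add_default_rights("http://example.org/rights"),
]
```

Under this design a special profile for a troublesome set would be a different list, reusing most of the same steps and adding or reordering a few.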

19 Ingest Utilities
* Subsetting and disposing of out-of-scope content
  * Improving best practices in the OAI data provider community will hopefully make this easier as time goes by, both at set and record levels
  * Software-assisted human analysis of OAI sets/records seems the best way to develop subsetting algorithms, probably tied to specific OAI service provider projects
  * We may need to do this in a human-mediated step post-harvest, but pre-ingest

20 Ingest Utilities
* Date Normalization
  * Coding completed, QA plan in development
  * Stats from last test run: 359,830 records processed; found and normalized at least one date in 279,337 records (76.4%) using <date>, <title>, & <description> fields
  * Process
    * Analysis of 360,000 records
      * Roy's web-based analysis tools
      * Spotfire DecisionSite - very large datasets problematic
      * Perl hacking courtesy of Mike
    * Information derived from clustering experiments (probably more useful for geographic and genre information than for dates)

21 Ingest Utilities
* Date Normalization process (cont.)
  * Developing algorithm
    * Steps for normalizing date patterns extracted from <date> fields
    * Steps for extracting date patterns from other fields if no <date> field or if <date> field contains some version of undated/unknown/n.d./s.d., etc.
      * Currently limiting this to <title> and <description> fields
    * Steps for writing out normalized dates in CDL qualified Dublin Core fields in transformed AIP descriptive metadata
  * Documentation, QA testing
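A much-simplified sketch of the fallback order described above: try <date> first, skip values that only say undated/unknown/n.d./s.d., then fall back to <title> and <description>. The regular expressions, the 1500-2099 year range, and the returned dict shape are assumptions for illustration; the actual CDL utility handles far more date patterns and writes to qualified Dublin Core fields.

```python
import re

# Values that mean "no usable date" (assumed list, based on the slide).
UNDATED = re.compile(
    r"^\s*\[?(undated|unknown|n\.\s?d\.|s\.\s?d\.)\]?\s*$", re.IGNORECASE)

# Four-digit years from 1500 to 2099, not embedded in longer digit runs.
YEAR = re.compile(r"(?<!\d)(1[5-9]\d{2}|20\d{2})(?!\d)")


def extract_years(text):
    """Return all plausible four-digit years found in a string."""
    return [int(y) for y in YEAR.findall(text)]


def normalize_date(record):
    """Return {'start', 'end', 'source'} from the first field with a year."""
    dates = [d for d in record.get("date", []) if not UNDATED.match(d)]
    candidates = (
        ("date", dates),
        ("title", record.get("title", [])),
        ("description", record.get("description", [])),
    )
    for field, values in candidates:
        years = [y for value in values for y in extract_years(value)]
        if years:
            return {"start": min(years), "end": max(years), "source": field}
    return None  # no date found anywhere
```

Keeping the `source` field makes it possible to report, as in the test-run stats, how many dates came from each field.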

22 Ingest Utilities
* Other transformations needed to support envisioned interface we'll be testing on target American West user populations beginning in 2006 ("Evaluation" year for AmWest):
  * Work on these ongoing during remainder of 2005, into 2006
  * Date normalization utility work is prototype for process for other utilities
  * Prototype vs. production approaches for CDL CF development?

23 Ingest Utilities
* Higher priority to support evaluation activities:
  * Identifier validator/thumbnail generator
    * All of our user research indicates that end users want thumbnails in results lists; most OAI records don't provide such links
    * We anticipate that this utility will test for validity of links and destination filetypes, grab image files, make a low-resolution thumbnail for the object (algorithm to be determined), and store a pointer in the METS record
  * Geographical location terms
    * Analysis of 360K records underway
    * Currently planning to transform to TGN hierarchical terms (CDL licensing of TGN from Getty is in process)
    * Will need to prototype some kind of basic, back-end Thesaurus Manager utility with which normalization utility can interact
    * Derived from content of <coverage> and other candidate DC fields
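The first pass of an identifier validator could be as simple as the sketch below, which only inspects the identifier string: is it an http(s) URL, and does its path look like an image file? This is an assumption-laden stand-in; the utility described above would also need to fetch the link and check the actual response, which is omitted here. The function name and the returned keys are hypothetical.

```python
import mimetypes
from urllib.parse import urlparse


def classify_identifier(identifier):
    """Classify an <identifier> value without making a network request."""
    parsed = urlparse(identifier.strip())
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        # Not a web link (e.g. a local call number or accession number).
        return {"is_url": False, "guessed_type": None,
                "thumbnail_candidate": False}
    # Guess the destination filetype from the path's extension only.
    guessed, _encoding = mimetypes.guess_type(parsed.path)
    return {
        "is_url": True,
        "guessed_type": guessed,
        "thumbnail_candidate": bool(guessed and guessed.startswith("image/")),
    }
```

Identifiers flagged as `thumbnail_candidate` would be the ones worth fetching and downsampling; URLs with no recognizable extension would need a live check before deciding.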

24 Ingest Utilities
* Lower priority (can start evaluation activities and work these in during 2006)
  * Genre terms
    * Anticipating 2 hierarchical levels
      * First: subset of DCMI Type Vocabulary
      * Second: subset of AAT derived from analysis of current pool of harvested records
    * Will also interact with a back-end Thesaurus Manager
    * Derived from <type> and other candidate DC fields
  * Title extraction utility
    * When a record has no <title> field, we'll have an algorithm for extracting approx. the first 10 words of the <description> field as a supplied title
      * Approx. 800 of
    * If no <description> field, we'll insert an explicit <title>[no Title]</title> field for use in results lists and citations
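The title extraction rule above is small enough to sketch directly. The record layout (a dict of field name to list of strings) and the function name `supplied_title` are assumptions; the 10-word cutoff and the "[no Title]" placeholder come from the slide.

```python
def supplied_title(record, max_words=10):
    """Return the record's title, or supply one per the slide's rule."""
    # Use an existing non-empty <title> if there is one.
    titles = [t.strip() for t in record.get("title", []) if t.strip()]
    if titles:
        return titles[0]
    # Otherwise take roughly the first 10 words of a <description>.
    for description in record.get("description", []):
        words = description.split()
        if words:
            return " ".join(words[:max_words])
    # No <title> and no <description>: explicit placeholder.
    return "[no Title]"
```

Splitting on whitespace is the simplest reading of "first 10 words"; a production version might prefer to stop at a sentence boundary when one falls within the limit.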

25 Ingest Utilities
* Topic terms
  * High priority to support American West evaluation activities beginning in November 2005
  * Several approaches to consider; more in Dave's talk
  * Important for American West evaluation activities
    * Feedback from usability testing of hierarchical, faceted browsing approach in American West interface will help CDL assess need for further development of CF support for clustering and classification activities
  * Testing of usefulness of 23 High-level Topics
    * Testing 23 topics locally yielded a 15-topic list
    * Felicia/Jane designed online survey that was up for 2 weeks
    * Bill solicited survey takers from K-12 teacher and historian listservs
    * 194 responses
    * Stay tuned for analysis

Ex Libris Rosetta: A Digital Preservation System Product Description

Ex Libris Rosetta: A Digital Preservation System Product Description Ex Libris Rosetta: A Digital Preservation System Product Description CONFIDENTIAL INFORMATION The information herein is the property of Ex Libris Ltd. or its affiliates and any misuse or abuse will result

More information

How collaboration can save [more of] the web: recent progress in collaborative web archiving initiatives

How collaboration can save [more of] the web: recent progress in collaborative web archiving initiatives How collaboration can save [more of] the web: recent progress in collaborative web archiving initiatives Anna Perricci Columbia University Libraries Best Practices Exchange November 14, 2013 Overview Web

More information

Encoding Library of Congress Subject Headings in SKOS: Authority Control for the Semantic Web

Encoding Library of Congress Subject Headings in SKOS: Authority Control for the Semantic Web Encoding Library of Congress Subject Headings in SKOS: Authority Control for the Semantic Web Corey A Harper University of Oregon Libraries Tel: +1 541 346 1854 Fax:+1 541 346 3485 charper@uoregon.edu

More information

Functional Requirements for Digital Asset Management Project version 3.0 11/30/2006

Functional Requirements for Digital Asset Management Project version 3.0 11/30/2006 /30/2006 2 3 4 5 6 7 8 9 0 2 3 4 5 6 7 8 9 20 2 22 23 24 25 26 27 28 29 30 3 32 33 34 35 36 37 38 39 = required; 2 = optional; 3 = not required functional requirements Discovery tools available to end-users:

More information

A guide to the lifeblood of DAM:

A guide to the lifeblood of DAM: A guide to the lifeblood of DAM: Key concepts and best practices for using metadata in digital asset management systems. By John Horodyski. Sponsored by Widen Enterprises and DigitalAssetManagement.com.

More information

How collaboration can save [more of] the web: recent progress in collaborative web archiving initiatives

How collaboration can save [more of] the web: recent progress in collaborative web archiving initiatives How collaboration can save [more of] the web: recent progress in collaborative web archiving initiatives Anna Perricci Columbia University Libraries METRO Conference 2014 January 15, 2014 Overview Web

More information

Next-Generation Technical Services (NGTS) Digital Asset Management System (DAMS) Requirements

Next-Generation Technical Services (NGTS) Digital Asset Management System (DAMS) Requirements Next-Generation Technical Services (NGTS) Digital Asset Management System (DAMS) Requirements Final Report July 20, 2012 Power of Three (POT) #1, Lightning Team #1A Kathleen Cameron, UC San Francisco (Co-chair)

More information

DEVELOPING A VISUAL DIGITAL IMAGE COLLECTION. Calgary Collections 2005: The Changing Collections Environment CLA Pre-Conference June 15, 2005

DEVELOPING A VISUAL DIGITAL IMAGE COLLECTION. Calgary Collections 2005: The Changing Collections Environment CLA Pre-Conference June 15, 2005 DEVELOPING A VISUAL DIGITAL IMAGE COLLECTION Heather D Amour Collections Librarian University of Calgary Marilyn Nasserden Fine Arts Librarian University of Calgary Calgary Collections 2005: The Changing

More information

GUIDELINES FOR THE CREATION OF DIGITAL COLLECTIONS

GUIDELINES FOR THE CREATION OF DIGITAL COLLECTIONS GUIDELINES FOR THE CREATION OF DIGITAL COLLECTIONS Best Practices for Descriptive Metadata This document sets forth guidelines for creating descriptive metadata for items in CARLI Digital Collections (CDC)

More information

Digital Asset Management. Content Control for Valuable Media Assets

Digital Asset Management. Content Control for Valuable Media Assets Digital Asset Management Content Control for Valuable Media Assets Overview Digital asset management is a core infrastructure requirement for media organizations and marketing departments that need to

More information

ERA Challenges. Draft Discussion Document for ACERA: 10/7/30

ERA Challenges. Draft Discussion Document for ACERA: 10/7/30 ERA Challenges Draft Discussion Document for ACERA: 10/7/30 ACERA asked for information about how NARA defines ERA completion. We have a list of functions that we would like for ERA to perform that we

More information

Core Competencies for Visual Resources Management

Core Competencies for Visual Resources Management Core Competencies for Visual Resources Management Introduction An IMLS Funded Research Project at the University at Albany, SUNY Hemalata Iyer, Associate Professor, University at Albany Visual resources

More information

Acronym: Data without Boundaries. Deliverable D12.1 (Database supporting the full metadata model)

Acronym: Data without Boundaries. Deliverable D12.1 (Database supporting the full metadata model) Project N : 262608 Acronym: Data without Boundaries Deliverable D12.1 (Database supporting the full metadata model) Work Package 12 (Implementing Improved Resource Discovery for OS Data) Reporting Period:

More information

Interagency Science Working Group. National Archives and Records Administration

Interagency Science Working Group. National Archives and Records Administration Interagency Science Working Group 1 National Archives and Records Administration Establishing Trustworthy Digital Repositories: A Discussion Guide Based on the ISO Open Archival Information System (OAIS)

More information

Library metadata, whether in the form of MARC 21

Library metadata, whether in the form of MARC 21 Metadata to Support Next-Generation Library Resource Discovery: Lessons from the extensible Catalog, Phase 1 Jennifer Bowen The extensible Catalog (XC) Project at the University of Rochester will design

More information

K@ A collaborative platform for knowledge management

K@ A collaborative platform for knowledge management White Paper K@ A collaborative platform for knowledge management Quinary SpA www.quinary.com via Pietrasanta 14 20141 Milano Italia t +39 02 3090 1500 f +39 02 3090 1501 Copyright 2004 Quinary SpA Index

More information

Notes about possible technical criteria for evaluating institutional repository (IR) software

Notes about possible technical criteria for evaluating institutional repository (IR) software Notes about possible technical criteria for evaluating institutional repository (IR) software Introduction Andy Powell UKOLN, University of Bath December 2005 This document attempts to identify some of

More information

Flattening Enterprise Knowledge

Flattening Enterprise Knowledge Flattening Enterprise Knowledge Do you Control Your Content or Does Your Content Control You? 1 Executive Summary: Enterprise Content Management (ECM) is a common buzz term and every IT manager knows it

More information

Long Term Knowledge Retention and Preservation

Long Term Knowledge Retention and Preservation Long Term Knowledge Retention and Preservation Aziz Bouras University of Lyon, DISP Laboratory France abdelaziz.bouras@univ-lyon2.fr Recent years: How should digital 3D data and multimedia information

More information

Implementing an Integrated Digital Asset Management System: FEDORA and OAIS in Context

Implementing an Integrated Digital Asset Management System: FEDORA and OAIS in Context Implementing an Integrated Digital Asset Management System: FEDORA and OAIS in Context Paul Bevan DAMS Implementation Manager paul.bevan@llgc.org.uk Structure! Background and overview! OAIS Model! Why

More information

Digital Curation at the National Space Science Data Center

Digital Curation at the National Space Science Data Center Digital Curation at the National Space Science Data Center DigCCurr2007: Digital Curation In Practice 20 April 2007 Donald Sawyer/NASA/GSFC Ed Grayzeck/NASA/GSFC Overview NSSDC Requirements & Digital Curation

More information

Metadata driven framework for the Canada Research Data Centre Network

Metadata driven framework for the Canada Research Data Centre Network Metadata driven framework for the Canada Research Data Centre Network IASSIST 2010 Session A4: DDI3 Tools Pascal Heus, Metadata Technology North America pascal.heus@metadatatechnology.com http://www.metadatatechnology.com

More information

Database preservation toolkit:

Database preservation toolkit: Nov. 12-14, 2014, Lisbon, Portugal Database preservation toolkit: a flexible tool to normalize and give access to databases DLM Forum: Making the Information Governance Landscape in Europe José Carlos

More information

Implementing SharePoint 2010 as a Compliant Information Management Platform

Implementing SharePoint 2010 as a Compliant Information Management Platform Implementing SharePoint 2010 as a Compliant Information Management Platform Changing the Paradigm with a Business Oriented Approach to Records Management Introduction This document sets out the results

More information

COURSE SYLLABUS COURSE TITLE:

COURSE SYLLABUS COURSE TITLE: 1 COURSE SYLLABUS COURSE TITLE: FORMAT: CERTIFICATION EXAMS: 55043AC Microsoft End to End Business Intelligence Boot Camp Instructor-led None This course syllabus should be used to determine whether the

More information

CatDV Pro Workgroup Serve r

CatDV Pro Workgroup Serve r Architectural Overview CatDV Pro Workgroup Server Square Box Systems Ltd May 2003 The CatDV Pro client application is a standalone desktop application, providing video logging and media cataloging capability

More information

Adding Robust Digital Asset Management to Oracle s Storage Archive Manager (SAM)

Adding Robust Digital Asset Management to Oracle s Storage Archive Manager (SAM) Adding Robust Digital Asset Management to Oracle s Storage Archive Manager (SAM) Oracle's Sun Storage Archive Manager (SAM) self-protecting file system software reduces operating costs by providing data

More information

Assignment 1 Briefing Paper on the Pratt Archives Digitization Projects

Assignment 1 Briefing Paper on the Pratt Archives Digitization Projects Twila Rios Digital Preservation Spring 2012 Assignment 1 Briefing Paper on the Pratt Archives Digitization Projects The Pratt library digitization efforts actually encompass more than one project, including

More information

Verify and Control Headings in Bibliographic Records

Verify and Control Headings in Bibliographic Records OCLC Connexion Browser Guides Verify and Control Headings in Bibliographic Records Last updated: April 2013 6565 Kilgour Place, Dublin, OH 43017-3395. www.oclc.org Revision History Date Section title Description

More information

MBooks: Google Books Online at the University of Michigan Library

MBooks: Google Books Online at the University of Michigan Library MBooks: Google Books Online at the University of Michigan Library Phil Farber, Chris Powell, Cory Snavely University of Michigan Library Information Technology Architecture overview Four basic pieces:

More information

Navigating to Success: Finding Your Way Through the Challenges of Map Digitization

Navigating to Success: Finding Your Way Through the Challenges of Map Digitization Presentations (Libraries) Library Faculty/Staff Scholarship & Research 10-15-2011 Navigating to Success: Finding Your Way Through the Challenges of Map Digitization Cory K. Lampert University of Nevada,

More information

OCLC Content Management Services John A. Hearty, OCLC Dublin, OH

OCLC Content Management Services John A. Hearty, OCLC Dublin, OH Proceedings of the 9th Annual Federal Depository Library Conference October 22-25, 2000 OCLC Content Management Services John A. Hearty, OCLC Dublin, OH Agenda Dimensions of Digital Archiving Obstacles

More information

SharePoint 2010 Interview Questions-Architect

SharePoint 2010 Interview Questions-Architect Basic Intro SharePoint Architecture Questions 1) What are Web Applications in SharePoint? An IIS Web site created and used by SharePoint 2010. Saying an IIS virtual server is also an acceptable answer.

More information

CONCEPTCLASSIFIER FOR SHAREPOINT

CONCEPTCLASSIFIER FOR SHAREPOINT CONCEPTCLASSIFIER FOR SHAREPOINT PRODUCT OVERVIEW The only SharePoint 2007 and 2010 solution that delivers automatic conceptual metadata generation, auto-classification and powerful taxonomy tools running

More information

Metadata Quality Control for Content Migration: The Metadata Migration Project at the University of Houston Libraries

Metadata Quality Control for Content Migration: The Metadata Migration Project at the University of Houston Libraries Metadata Quality Control for Content Migration: The Metadata Migration Project at the University of Houston Libraries Andrew Weidner University of Houston, USA ajweidner@uh.edu Annie Wu University of Houston,

More information

DSpace: An Institutional Repository from the MIT Libraries and Hewlett Packard Laboratories

DSpace: An Institutional Repository from the MIT Libraries and Hewlett Packard Laboratories DSpace: An Institutional Repository from the MIT Libraries and Hewlett Packard Laboratories MacKenzie Smith, Associate Director for Technology Massachusetts Institute of Technology Libraries, Cambridge,

More information

Enabling the Big Data Commons through indexing of data and their interactions

Enabling the Big Data Commons through indexing of data and their interactions biomedical and healthcare Data Discovery Index Ecosystem Enabling the Big Data Commons through indexing of and their interactions 2 nd BD2K all-hands meeting Bethesda 11/12/15 Aims 1. Help users find accessible

More information

Archiving Systems. Uwe M. Borghoff Universität der Bundeswehr München Fakultät für Informatik Institut für Softwaretechnologie. uwe.borghoff@unibw.

Archiving Systems. Uwe M. Borghoff Universität der Bundeswehr München Fakultät für Informatik Institut für Softwaretechnologie. uwe.borghoff@unibw. Archiving Systems Uwe M. Borghoff Universität der Bundeswehr München Fakultät für Informatik Institut für Softwaretechnologie uwe.borghoff@unibw.de Decision Process Reference Models Technologies Use Cases

More information

2009 ikeep Ltd, Morgenstrasse 129, CH-3018 Bern, Switzerland (www.ikeep.com, info@ikeep.com)

2009 ikeep Ltd, Morgenstrasse 129, CH-3018 Bern, Switzerland (www.ikeep.com, info@ikeep.com) CSP CHRONOS Compliance statement for ISO 14721:2003 (Open Archival Information System Reference Model) 2009 ikeep Ltd, Morgenstrasse 129, CH-3018 Bern, Switzerland (www.ikeep.com, info@ikeep.com) The international

More information

Legal Informatics Final Paper Submission Creating a Legal-Focused Search Engine I. BACKGROUND II. PROBLEM AND SOLUTION

Legal Informatics Final Paper Submission Creating a Legal-Focused Search Engine I. BACKGROUND II. PROBLEM AND SOLUTION Brian Lao - bjlao Karthik Jagadeesh - kjag Legal Informatics Final Paper Submission Creating a Legal-Focused Search Engine I. BACKGROUND There is a large need for improved access to legal help. For example,

More information

Selecting a Taxonomy Management Tool. Wendi Pohs InfoClear Consulting #SLATaxo

Selecting a Taxonomy Management Tool. Wendi Pohs InfoClear Consulting #SLATaxo Selecting a Taxonomy Management Tool Wendi Pohs InfoClear Consulting #SLATaxo InfoClear Consulting What do we do? Content Analytics Strategy and Implementation, including: Taxonomy/Ontology development

More information

So today we shall continue our discussion on the search engines and web crawlers. (Refer Slide Time: 01:02)

So today we shall continue our discussion on the search engines and web crawlers. (Refer Slide Time: 01:02) Internet Technology Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No #39 Search Engines and Web Crawler :: Part 2 So today we

More information

CiteSeer x in the Cloud

CiteSeer x in the Cloud Published in the 2nd USENIX Workshop on Hot Topics in Cloud Computing 2010 CiteSeer x in the Cloud Pradeep B. Teregowda Pennsylvania State University C. Lee Giles Pennsylvania State University Bhuvan Urgaonkar

More information

Yale University. Report to the Digital Library Federation October 2004 Volume 5, Number 1. Fall 2004. http://www.diglib.org/pubs/news05_01/

Yale University. Report to the Digital Library Federation October 2004 Volume 5, Number 1. Fall 2004. http://www.diglib.org/pubs/news05_01/ Yale University Report to the Digital Library Federation October 2004 Volume 5, Number 1. Fall 2004 http://www.diglib.org/pubs/news05_01/ TABLE OF CONTENTS i. Collections, services, and systems ii. Projects

More information

Vilas Wuwongse, Thiti Vacharasintopchai, Neelawat Intaraksa Asian Institute of Technology www.ait.asia

Vilas Wuwongse, Thiti Vacharasintopchai, Neelawat Intaraksa Asian Institute of Technology www.ait.asia A Common Infrastructure for Digital Contents Vilas Wuwongse, Thiti Vacharasintopchai, Neelawat Intaraksa Asian Institute of Technology www.ait.asia Outline Introduction Issues Proposed Approach A Common

More information

Applying the OAIS standard to CCLRC s British Atmospheric Data Centre and the Atlas Petabyte Storage Service

Applying the OAIS standard to CCLRC s British Atmospheric Data Centre and the Atlas Petabyte Storage Service Applying the OAIS standard to CCLRC s British Atmospheric Centre and the Atlas Petabyte Storage Service Corney, D.R., De Vere, M., Folkes, T., Giaretta, D., Kleese van Dam, K., Lawrence, B. N., Pepler,

More information

Nuxeo, an open source platform for content-centric business applications. Stéfane Fermigier, Nuxeo Laurent Doguin, Nuxeo

Nuxeo, an open source platform for content-centric business applications. Stéfane Fermigier, Nuxeo Laurent Doguin, Nuxeo Nuxeo, an open source platform for content-centric business applications Stéfane Fermigier, Nuxeo Laurent Doguin, Nuxeo Nuxeo, the Company Providing an Open Source Content Management Platform for Business

More information

Information Access Platforms: The Evolution of Search Technologies

Information Access Platforms: The Evolution of Search Technologies Information Access Platforms: The Evolution of Search Technologies Managing Information in the Public Sphere: Shaping the New Information Space April 26, 2010 Purpose To provide an overview of current

More information

WHY DIGITAL ASSET MANAGEMENT? WHY ISLANDORA?

WHY DIGITAL ASSET MANAGEMENT? WHY ISLANDORA? WHY DIGITAL ASSET MANAGEMENT? WHY ISLANDORA? Digital asset management gives you full access to and control of to the true value hidden within your data: Stories. Digital asset management allows you to

More information

Combining Technologies to Create New Solutions

Combining Technologies to Create New Solutions Combining Technologies to Create New Solutions Vishesh Chachra VP of Emerging Businesses About VTLS Inc. VTLS has been in business for over 25 years VTLS does business in 40 countries. VTLS has offices/wholly

More information

CGHub Web-based Metadata GUI Statement of Work

CGHub Web-based Metadata GUI Statement of Work CGHub Web-based Metadata GUI Statement of Work Mark Diekhans Version 1 April 23, 2012 1 Goals CGHub stores metadata and data associated from NCI cancer projects. The goal of this project

More information

Data processing goes big

Data processing goes big Test report: Integration Big Data Edition Data processing goes big Dr. Götz Güttich Integration is a powerful set of tools to access, transform, move and synchronize data. With more than 450 connectors,

More information

OPENGREY: HOW IT WORKS AND HOW IT IS USED

OPENGREY: HOW IT WORKS AND HOW IT IS USED OPENGREY: HOW IT WORKS AND HOW IT IS USED CHRISTIANE STOCK christiane.stock@inist.fr INIST-CNRS, France Abstract OpenGrey is a unique repository providing open access to European grey literature references,

More information

Information and documentation The Dublin Core metadata element set

Information and documentation The Dublin Core metadata element set ISO TC 46/SC 4 N515 Date: 2003-02-26 ISO 15836:2003(E) ISO TC 46/SC 4 Secretariat: ANSI Information and documentation The Dublin Core metadata element set Information et documentation Éléments fondamentaux

More information

Community Edition. Master Data Management 3.X. Administrator Guide

Community Edition. Master Data Management 3.X. Administrator Guide Community Edition Talend Master Data Management 3.X Administrator Guide Version 3.2_a Adapted for Talend MDM Studio v3.2. Administrator Guide release. Copyright This documentation is provided under the

More information

Introduction. Architecture Re-engineering. Systems Consolidation. Data Acquisition. Data Integration. Database Technology

Introduction. Architecture Re-engineering. Systems Consolidation. Data Acquisition. Data Integration. Database Technology Introduction Data migration is necessary when an organization decides to use a new computing system or database management system that is incompatible with the current system. Architecture Re-engineering

More information

Taking Control of Library Metadata and Websites using the extensible Catalog

Taking Control of Library Metadata and Websites using the extensible Catalog Taking Control of Library Metadata and Websites using the extensible Catalog Jennifer Bowen University of Rochester/eXtensible Catalog Organization Code4lib 2010, Asheville, North Carolina Feb.23, 2010

More information

Islandora: An Open Source Institutional Repository Solution. Consortium of MnPALS Libraries Annual Meeting April 2014

Islandora: An Open Source Institutional Repository Solution. Consortium of MnPALS Libraries Annual Meeting April 2014 Islandora: An Open Source Institutional Repository Solution Consortium of MnPALS Libraries Annual Meeting April 2014 Outline Introduction to Islandora (Linda) Islandora functionality and demo (Alex) SMSU

More information

#MMTM15 #INFOARCHIVE #EMCWORLD 1

#MMTM15 #INFOARCHIVE #EMCWORLD 1 #MMTM15 #INFOARCHIVE #EMCWORLD 1 1 INFOARCHIVE A TECHNICAL OVERVIEW DAVID HUMBY SOFTWARE ARCHITECT #MMTM15 2 TWEET LIVE DURING THE SESSION! Connect with us: Sign up for a Hands On Lab 6 th May, 1.30 PM,

More information

CISCO ACE XML GATEWAY TO FORUM SENTRY MIGRATION GUIDE

CISCO ACE XML GATEWAY TO FORUM SENTRY MIGRATION GUIDE CISCO ACE XML GATEWAY TO FORUM SENTRY MIGRATION GUIDE Legal Marks No portion of this document may be reproduced or copied in any form, or by any means graphic, electronic, or mechanical, including photocopying,

More information

Chapter 6 Basics of Data Integration. Fundamentals of Business Analytics RN Prasad and Seema Acharya

Chapter 6 Basics of Data Integration. Fundamentals of Business Analytics RN Prasad and Seema Acharya Chapter 6 Basics of Data Integration Fundamentals of Business Analytics Learning Objectives and Learning Outcomes Learning Objectives 1. Concepts of data integration 2. Needs and advantages of using data

Best Practices for Structural Metadata Version 1 Yale University Library June 1, 2008

Background: The Digital Production and Integration Program (DPIP) is sponsoring the development of documentation outlining

Why archiving erecords influences the creation of erecords. Martin Stürzlinger scopepartner Vienna, Austria

Electronic Records. In a Productive System: Created, Used, Changed, Deleted. In an Archival System: No

SANS Dshield Webhoneypot Project. OWASP November 13th, 2009. The OWASP Foundation http://www.owasp.org. Jason Lam

Jason Lam, SANS Internet Storm Center, jason@networksec.org. Introduction: Who is Jason Lam. Agenda: Intro to honeypot

SharePoint Term Store & Taxonomy Design Harold Brenneman Lighthouse Microsoft Technology Group

Lighthouse Computer Services, All rights reserved. Harold Brenneman, Consulting Manager. MBA, focusing on the

A Java Tool for Creating ISO/FGDC Geographic Metadata

F.J. Zarazaga-Soria, J. Lacasta, J. Nogueras-Iso, M. Pilar Torres, P.R. Muro-Medrano

Publishing Linked Data Requires More than Just Using a Tool

G. Atemezing (1), F. Gandon (2), G. Kepeklian (3), F. Scharffe (4), R. Troncy (1), B. Vatant (5), S. Villata (2). Affiliations: (1) EURECOM, (2) Inria, (3) Atos Origin, (4) LIRMM,

Beyond The Web Drupal Meets The Desktop (And Mobile) Justin Miller Code Sorcery Workshop, LLC http://codesorcery.net/dcdc

Introduction: Personal introduction; Format & conventions for this talk; Assume familiarity

Microsoft Project Server 2010 Administrator's Guide

Copyright: This document is provided as-is. Information and views expressed in this document, including URL and other Internet Web site references,

Institutional Repositories: Staff and Skills Set

SHERPA Document. University of Nottingham, 25th August 2009. Circulation: PUBLIC. Mary Robinson, University of Nottingham. Introduction: This document began in

Meeting Increasing Demands for Metadata

Landmark Technical Paper. Presenter: Janet Hicks, Senior Manager, Strategy and Business Management, Information Management. Presented at

PDS4 and Build 5a Update. Dan Crichton, Emily Law November 2014

PDS4 and Related MC Topics: PDS4 Report and Build 5a (Dan Crichton and Emily Law); IM/DDWG (Steve Hughes); Software (Sean Hardman); Tool Planning

Software Development in the Digital Library Program. Digital Library Brown Bag Tamara Cameron David Jiao Oct. 22, 2004

Outline: Custom Development in the DLP; Overview of Digital Library Program Software

Assessment of RLG Trusted Digital Repository Requirements

Reagan W. Moore, San Diego Supercomputer Center, 9500 Gilman Drive, La Jolla, CA 92093-0505, 01 858 534 5073, moore@sdsc.edu. Abstract: The RLG/NARA trusted

NS DISCOVER 4.0 ADMINISTRATOR'S GUIDE. July, 2015. Version 4.0

Table of Contents: 1 General Information; 1.1 Objective; 1.2 New 4.0 Features Improvements; 1.3 Migrating from 3.x to 4.x; 2

Portal Version 1 - User Manual

V1.0, 07. March 2016. Table of Contents: 1 Introduction; 1.1 Purpose of the Document; 1.2 Reference Documents; 1.3 Terminology

The Analysis of Online Communities using Interactive Content-based Social Networks

Anatoliy Gruzd, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, agruzd2@uiuc.edu

Using LSI for Implementing Document Management Systems Turning unstructured data from a liability to an asset.

White Paper. By Mike Harrison, Director,

Big Data? Definition # 1: Big Data Definition Forrester Research

Big Data? Definition #2: Quote of Tim O'Reilly brings it all home: Companies that have massive amounts of data without massive

Data Management in an International Data Grid Project. Timur Chabuk 04/09/2007

Intro: LHC opened in 2005; several Petabytes of data per year; data created at CERN distributed to Regional Centers all over the

ENTERPRISE DOCUMENTS & RECORD MANAGEMENT

DOCWAY PLATFORM. From the website: OLD XML DOCWAY DETAIL. DOCWAY Platform, based on ExtraWay Technology Native XML Database,

ISLANDORA STAFF USER GUIDE. Version 1.3

July 2014. Table of Contents: Chapter 1: Introduction to Islandora and the Islandora Community; Chapter 2: Introduction to

FROM RELATIONAL TO OBJECT DATABASE MANAGEMENT SYSTEMS

V. Christophides, Department of Computer Science & Engineering, University of California, San Diego; ICS-FORTH, Heraklion, Crete. I) Introduction

Second EUDAT Conference, October 2013 Workshop: Digital Preservation of Cultural Data Scalability in preservation of cultural heritage data

Simon Lambert, Scientific Computing Department, STFC, UK. Types of

This thesaurus is a set of terms for use by any college or university archives in the United States for describing its holdings.

I. Scope. The topical facets are: academic affairs administration classes

Free web-based solution to manage photographs that could be used to manage collection items online if there is a photo of every item.

Review of affordable Collections Database options. Our wish list and needs for the Anna Maria Island Historical Society: Free, or inexpensive; Web-based, cloud storage solution, no server exists at the

CDL Database Administration Framework v. 1.1 August 2009

Contents: Purpose of the framework; Terminology; Database administration conventions; Summary Table; Database design, location, and naming

Preservation Action: What, how and when? Hilde van Wijngaarden Head, Digital Preservation Department National Library of the Netherlands

What is preservation action? Execution of a strategy to regain or improve access to

SAP Business Objects Business Intelligence platform Document Version: 4.1 Support Package 7 2015-11-24. Data Federation Administration Tool Guide

Content: 1 What's new in the...; 2 Introduction to administration

Implementing Open Source Systems for Digital Asset Management and Preservation

Andy Weidner, Drew Krewer, Bethany Scott, Sean Watkins. Texas Conference on Digital Libraries, Austin, TX, May 26, 2016. Overview

Digital Preservation. OAIS Reference Model

Stephan Strodl, Andreas Rauber, Institut für Softwaretechnik und Interaktive Systeme, TU Wien, http://www.ifs.tuwien.ac.at/dp. Aim: OAIS model; Understanding the functionality

Mission-Critical Database with Real-Time Search for Big Data

February 17, 2012. Overview: About MarkLogic; Why MarkLogic; Case Studies; Technology and Features. About MarkLogic: 10 years in business

Summary of Responses to the Request for Information (RFI): Input on Development of a NIH Data Catalog (NOT-HG-13-011)

Key Dates: Release Date: June 6, 2013; Response Date: June 25, 2013. Purpose: This Request

The NERC DataGrid (NDG)

Roy Lowry on behalf of the NDG, BADC and BODC. Ray Cramer, Marta Gutierrez, Kerstin Kleese Van Dam, Venkatasiva Kondapalli, Susan Latham, Bryan Lawrence, Kevin O'Neill, Ag Stephens,

Search and Information Retrieval

Search on the Web is a daily activity for many people throughout the world. Search and communication are most popular uses of the computer. Applications involving search

Web Site Collection Plan. for. Michigan State University Archives & Historical Collections. May 21, 2015

Prepared by: Ed Busch, Michigan State University, buschedw@msu.edu. Contents: Section 1. Overview, Mission,

AHDS Digital Preservation Glossary

Final version prepared by Raivo Ruusalepp, Estonian Business Archives, Ltd. January 2003. Table of Contents: 1. Introduction; 2. Provenance and Format; 3. Scope and

Data Driven Success. Comparing Log Analytics Tools: Flowerfire's Sawmill vs. Google Analytics (GA)

In business, data is everything. Regardless of the products or services you sell or the systems you support,

Integrating VoltDB with Hadoop

The NewSQL database you'll never outgrow. Hadoop is an open source framework for managing and manipulating massive volumes of data. VoltDB is a database for handling high velocity data.
