Archive-IT Services Andrea Mills Booksgroup Collections Specialist
Transcription
1 Getting Started with Archive-IT Services Andrea Mills Booksgroup Collections Specialist
2 Internet Archive Micro History Text Archive Update Archive-IT Services
3 1996 The Internet Archive is created, with the goal to archive and preserve the World Wide Web
4 Book digitization begins at University of Toronto Libraries; Archive-IT begins targeted web archiving services
5 OpenLibrary, TVNews, Audio and Video, Computer Games and Software
6 Updates 10 Years of Digitization
7 A Decade of Collecting: 2.3 million ebooks; 1,250 contributing institutions; 400 sponsors; 2,450 unique texts collections; more than 150 digitization projects currently underway
8 Canadian Libraries
9 Government Publications
10 Social Instagram Flickr
11 Getting Started with Archive-IT Services
12 Archive-IT.org
13 Web Archiving The process of collecting portions of web content, preserving the collections, and then providing access to the archives - for use and re-use.
14 Archive-IT vs. WaybackMachine
15 Archive-IT Services: Web-based application and fully hosted solution, including access and storage (2 copies); tools for selection, scoping and metadata creation; Scope-IT; capture content using 10 different frequencies
16 Types of Content: HTML, text, video, audio, social media, PDF, images, password-protected content, static databases, newspapers. Social Media: Flickr, Twitter, Instagram, Vimeo and Facebook only with Archive-IT
17 Features: Different levels of access for users; browse collections by URL, full-text search (basic and advanced) and metadata search; 9 post-crawl reports for analysis; online Help section, Partner Specialists and Tech Support
18 How Does It Work? Heritrix: web crawler. Umbra: gives the crawler the flexibility to access sites as a browser does. Wayback Machine: access tool for rendering and viewing pages (the web as it was). NutchWAX: search engine for full-text search. SOLR: metadata search.
19 Starting to Collect
20 Big Questions Do you have a Mission/Mandate to Collect? What are the Goals and Objectives for the Collection? Vision for the Collection?
21 Mandate to Collect... What now? Institutional Collection Web Content
22 Goals and Objectives Why is this web archive important? Short-term Vision (3 yrs.) Long Term Vision (10 yrs.)
23 Vision for Collection What will it look like? How will it be used? How will it be managed and maintained?
24 Broad to Specific As of today, Archive-It has collected 8,961,536,030 URLs for 2,643 public collections!
25 Broad Collections Canadian Government Information collected by University of Toronto has 605 seeds
26 Broad Collections Prairie Provinces Politics & Economics collected by University of Alberta has 393 seeds
27 Specific Collections University of Southern California collecting 1 seed
28 Site Closures Aboriginal Canada Portal Closed February 12, 2013
29 10 Years on Mars: Collected by University of Michigan. Goal: capture public perception of the Mars Rovers on their 10th anniversary, and preserve and provide access to that information for the future. Sources: 1. Official government documents; 2. Popular news and science media; 3. Fringe (conspiracy theorizing, alien spotting...)
30 Current Events Ebola Virus Disease Collected by University of Manitoba has 13 seeds
31 Test Account and Practise
32 Test Account: Create a collection, capture content and view the results. Start with five (5) URLs; 1 crawl; archive up to 250,000 webpages
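A test-crawl seed list can be tidied before it is entered, so that duplicates do not use up one of the five starting URLs. The sketch below is only an illustration in Python: the function names and constants are hypothetical, it does not call any Archive-IT interface, and it treats the figures on this slide (5 URLs, 250,000 webpages) as plain numbers rather than as enforced limits.

```python
from urllib.parse import urlsplit

# Figures taken from the test-account slide; treated here as planning
# numbers only (assumption: the service itself applies the real limits).
MAX_TEST_SEEDS = 5
MAX_TEST_PAGES = 250_000


def normalize_seed(url):
    """Return a seed in a consistent form so duplicates can be spotted."""
    url = url.strip()
    if "://" not in url:
        url = "http://" + url  # assume http when no scheme is given
    parts = urlsplit(url)
    normalized = f"{parts.scheme.lower()}://{parts.netloc.lower()}{parts.path or '/'}"
    if parts.query:
        normalized += "?" + parts.query
    return normalized


def plan_test_crawl(seeds):
    """Split seeds into (kept for the test crawl, deferred for later)."""
    unique = []
    for seed in seeds:
        normalized = normalize_seed(seed)
        if normalized not in unique:
            unique.append(normalized)
    return unique[:MAX_TEST_SEEDS], unique[MAX_TEST_SEEDS:]


def pages_remaining(captured):
    """Pages left before the 250,000-webpage figure is reached."""
    return max(0, MAX_TEST_PAGES - captured)
```

Calling `plan_test_crawl` with a draft seed list returns the first five distinct seeds and a second list of seeds to hold back for a later crawl.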
33 Is your seed already in the WaybackMachine? Search both keywords and URLs
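The URL half of this check can also be scripted. The sketch below, in Python, uses the Internet Archive's public Wayback availability endpoint (`https://archive.org/wayback/available`), which returns the closest archived snapshot for a URL as JSON. The helper names are hypothetical, the endpoint covers URL lookup only (not keyword search), and the response shape assumed here should be confirmed against the current API documentation.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query(seed_url, timestamp=None):
    """Build the availability query for a seed (timestamp: YYYYMMDD, optional)."""
    params = {"url": seed_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response):
    """Return the closest snapshot record from a parsed response, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest
    return None


def check_seed(seed_url):
    """Query the endpoint over the network and return the snapshot or None."""
    with urlopen(build_query(seed_url), timeout=30) as reply:
        return closest_snapshot(json.load(reply))
```

A result of `None` from `check_seed` only means no snapshot was reported for that exact URL; it is still worth searching by keyword and checking with colleagues before concluding the site is not archived.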
34 Is the Site Archived Elsewhere? Ask your Colleagues LISTSERVs Registry options?
35 Valuable Experience Attempt to capture all or part of your proposed collection in your test crawl This will help determine Scope, Frequency, QA needs and Subscription level
36 Start Collecting Refer back to Mission, Goals and Vision for collection Repeat
37 Learn More: Download our white paper on the web archiving life cycle. Check out our blog:
