Formats for Exchanging Archival Data: An Introduction to EAD, EAC-CPF, and Archival Metadata Standards
7th International Seminar of Archives from Iberian Tradition, 1 July 2011
Rio de Janeiro, Brazil, June 27th to July 1st, 2011
www.an.gov.br/tradicaoiberica
Michael Rush
  Accessioning Archivist / EAD Coordinator, Beinecke Rare Book and Manuscript Library, Yale University
  Co-Chair, Technical Subcommittee for Encoded Archival Description, Society of American Archivists
Assumptions and Definitions
Assumptions
  Familiarity with some ICA standards: ISAD(G), ISAAR(CPF), ISDF, or ISDIAH
  Awareness of EAD, but little or no experience with it
  Little or no experience with XML
Definitions
  EAD: Encoded Archival Description
  EAC-CPF: Encoded Archival Context - Corporate Bodies, Persons, and Families
  XML: Extensible Markup Language
Development History
EAD Development History
  Berkeley Finding Aid Project (1993-1995)
  EAD Alpha (1996)
  EAD Beta (1996)
  EAD 1.0 (1998)
  EAD 2002 (2002)
  EAD 2002 Schema (2007)
  EAD 2013?
EAC-CPF Development History
  Meeting at Yale University (1998)
  Meeting at University of Toronto (2001)
  EAC Beta (2004)
  EAC-CPF (2010)
Governance and Maintenance
  EAD
    EAD Working Group (1995-2010)
    Technical Subcommittee for EAD (2010- )
  EAC-CPF
    Ad hoc working group (2001-2004)
    EAC Working Group (2007-2011)
    Technical Subcommittee for EAC-CPF (2011- )
  Schema Development Team (2010- )
Design Goals and Applications
EAD Design Goals
  Represent hierarchical structure of finding aids
  SGML, then XML
  Flexibility, to encourage adoption
  Compatibility with ISAD(G)
EAD Applications
  Delivery
  Standardization
  Sharing
  Transmission/Communication
  Repurposing
Example EAD Implementations
  Yale Finding Aid Database
  Online Archive of California (OAC)
  Northwest Digital Archives (NWDA)
  Archives Portal Europe (APEnet)
EAC-CPF Design Goals
  Close compatibility with ISAAR(CPF)
    A change from EAC Beta to current schema
  XML
  Philosophical neutrality
  Relatively simple and straightforward
  Extensible design
  Adaptability to relational database structures
EAC-CPF Applications
  Identity/Authority
  Description
  Relationships
  Aggregation
  Transmission/Communication
Example EAC-CPF Implementation
  The Social Networks and Archival Context Project (SNAC)
Challenges
  Data migration and/or creation
  Establishing encoding best practices
  Delivery
    Indexing and search
    Display
  Data maintenance
  Sharing
Data migration/creation
  Methods
    Hand encoding
    Templates
    Scripting
    Outsourcing
    Export from databases (Archivists' Toolkit, Archon, ICA-AtoM)
  Costs
    Staff time
    Staff training
    Consultant or outsourcing fees
    Software
Encoding Best Practices
  Local
    Yale EAD Encoding Best Practice Guidelines [EAD]
  Consortial
    Northwest Digital Archives Best Practice Guidelines [EAD]
  RLG Best Practice Guidelines for Encoded Archival Description [EAD]
Delivery
  Indexing and search
    No single solution
    Popular tools include:
      XTF (eXtensible Text Framework)
      Fedora Commons Repository Software
  Display
    Transformation via XSLT (Extensible Stylesheet Language Transformations)
      XML --> HTML
      XML --> PDF
    CSS
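The XML --> HTML display step can be illustrated in a few lines. The sketch below is not XSLT: it swaps in Python's standard xml.etree.ElementTree module purely to show the idea of mapping EAD elements onto HTML. The sample finding aid, the function name ead_to_html, and the HTML layout are all assumptions made for illustration, and the sample is far too small to be a valid EAD instance.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical, heavily simplified finding aid (not a complete, valid EAD instance).
EAD_SAMPLE = """<ead>
  <archdesc level="collection">
    <did>
      <unittitle>Example Family Papers</unittitle>
      <unitdate>1900-1950</unitdate>
    </did>
    <scopecontent><p>Correspondence and photographs.</p></scopecontent>
  </archdesc>
</ead>"""


def ead_to_html(ead_xml: str) -> str:
    """Map a few EAD elements onto HTML (a stand-in for an XSLT stylesheet)."""
    root = ET.fromstring(ead_xml)
    title = root.findtext("archdesc/did/unittitle", default="Untitled")
    date = root.findtext("archdesc/did/unitdate", default="")
    paragraphs = [p.text or "" for p in root.findall("archdesc/scopecontent/p")]
    body = "".join("<p>" + escape(text) + "</p>" for text in paragraphs)
    return (
        "<html><body>"
        "<h1>" + escape(title) + "</h1>"
        '<p class="date">' + escape(date) + "</p>"
        + body +
        "</body></html>"
    )


print(ead_to_html(EAD_SAMPLE))
```

In production the same mapping would normally live in an XSLT stylesheet, with CSS handling presentation; the class attribute on the date paragraph is the kind of hook a stylesheet would target.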
Data Maintenance
  File management
  Version control
  Link maintenance
Sharing
  Consortia
  Bulk Aggregators
    ArchiveGrid
  Topical Aggregators
    U.S. National Library of Medicine, History of Medicine Finding Aids Consortium
Related Standards
Related Description Standards
  ICA standards:
    ISAD(G): General International Standard Archival Description, Second Edition
    ISAAR(CPF): International Standard Archival Authority Record for Corporate Bodies, Persons and Families, Second Edition
    ISDF: International Standard for Describing Functions
    ISDIAH: International Standard for Describing Institutions with Archival Holdings
  National description standards:
    DACS: Describing Archives: A Content Standard (USA)
EAD Data Structure
EAD: Basic Structure
  <ead>*
    <eadheader>*
    <archdesc>*
      <dsc>
EAD Header
  <eadheader>*
    <eadid>*
    <filedesc>*
    <profiledesc>
    <revisiondesc>
File Description
  <filedesc>*
    <titlestmt>*
      <titleproper>*
      <author>
    <publicationstmt>
      <publisher>
Profile Description
  <profiledesc>
    <creation> - Creation
    <langusage> - Language Usage
    <descrules> - Descriptive Rules
Revision Description
  <revisiondesc>
    <change> - Change
      <date> - Date
      <item> - Item
EAD: Basic Structure
  <ead>*
    <eadheader>*
    <archdesc>*
      <dsc>
Hierarchical Encoding
  <archdesc>: Top level of description.
  <dsc>: Optional child of <archdesc>. Consists of nested components.
Components
  <c> - Component (Unnumbered)
  Or
  <c01> - Component (First Level)
  <c02> - Component (Second Level)
  Through
  <c12> - Component (Twelfth Level)
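To make the nesting concrete, here is a minimal sketch that parses a tiny finding aid and prints its component hierarchy. The sample document, its titles and identifier, and the helper names walk_components and outline are hypothetical; the sample omits the EAD namespace and has not been validated against the EAD 2002 DTD or schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified finding aid: header, top-level description,
# and two levels of numbered components inside <dsc>.
EAD_SAMPLE = """<ead>
  <eadheader>
    <eadid>example.0001</eadid>
    <filedesc>
      <titlestmt><titleproper>Guide to the Example Papers</titleproper></titlestmt>
    </filedesc>
  </eadheader>
  <archdesc level="collection">
    <did><unittitle>Example Papers</unittitle></did>
    <dsc>
      <c01 level="series">
        <did><unittitle>Correspondence</unittitle></did>
        <c02 level="file">
          <did><unittitle>Letters, 1901</unittitle></did>
        </c02>
      </c01>
      <c01 level="series">
        <did><unittitle>Photographs</unittitle></did>
      </c01>
    </dsc>
  </archdesc>
</ead>"""

# <c> plus the numbered forms <c01> through <c12>.
COMPONENT_TAGS = {"c"} | {f"c{n:02d}" for n in range(1, 13)}


def walk_components(parent, depth=1):
    """Yield (depth, tag, title) for each component nested under parent."""
    for child in parent:
        if child.tag in COMPONENT_TAGS:
            yield depth, child.tag, child.findtext("did/unittitle", default="")
            yield from walk_components(child, depth + 1)


def outline(ead_xml):
    """Return the component hierarchy of a finding aid as a flat list."""
    dsc = ET.fromstring(ead_xml).find("archdesc/dsc")
    return [] if dsc is None else list(walk_components(dsc))


for depth, tag, title in outline(EAD_SAMPLE):
    print("  " * (depth - 1) + f"<{tag}> {title}")
```

Because depth is tracked by the recursion rather than read from the tag name, the same walk works whether a repository encodes with unnumbered <c> or with <c01> through <c12>.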
Levels of Description
Descriptive Elements
  Valid at all levels of description.
  <did> is required at each level of description.
<did>* Descriptive Identification
  Always the first child of <archdesc> and the component elements.
  Wrapper element containing elements with basic identifying information.
  Must have at least one child element.
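The two <did> rules on this slide lend themselves to a quick automated check. The sketch below is an illustration under stated assumptions: the function name did_problems and both sample fragments are hypothetical, namespaces are omitted, and the check mirrors the rules as worded on the slide rather than the full EAD content model, so it is no substitute for validating against the DTD or schema.

```python
import xml.etree.ElementTree as ET

# Elements that represent a level of description: <archdesc> and the components.
DESCRIPTION_TAGS = {"archdesc", "c"} | {f"c{n:02d}" for n in range(1, 13)}


def did_problems(ead_xml):
    """List levels of description whose <did> is missing, misplaced, or empty."""
    problems = []
    for element in ET.fromstring(ead_xml).iter():
        if element.tag not in DESCRIPTION_TAGS:
            continue
        children = list(element)
        if not children or children[0].tag != "did":
            problems.append(f"<{element.tag}>: first child is not <did>")
        elif len(children[0]) == 0:
            problems.append(f"<{element.tag}>: <did> has no child elements")
    return problems


# Hypothetical fragments used only to exercise the check.
GOOD = """<archdesc level="collection">
  <did><unittitle>Example Papers</unittitle></did>
  <dsc><c01><did><unittitle>Series 1</unittitle></did></c01></dsc>
</archdesc>"""

BAD = """<archdesc level="collection">
  <scopecontent/>
  <did/>
  <dsc><c01><did/></c01></dsc>
</archdesc>"""

print(did_problems(GOOD))
print(did_problems(BAD))
```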
<did> Children
  <unitid> - Unit Identification [ISAD(G) 3.1.1]
  <unittitle> - Unit Title [ISAD(G) 3.1.2]
  <unitdate> - Unit Date [ISAD(G) 3.1.3]
  <physdesc> - Physical Description [ISAD(G) 3.1.5]
  <origination> - Origination [ISAD(G) 3.2.1]
  <langmaterial> - Language of the Material [ISAD(G) 3.4.3]
  <note> - Note [ISAD(G) 3.6.1]
  <abstract> - Abstract
  <physloc> - Physical Location
  <materialspec> - Material Specific Details
  <repository> - Repository
  <container> - Container
  <dao> - Digital Archival Object
  <daogroup> - Digital Archival Object Group
<did> Siblings
  <bioghist> - Biography or History [ISAD(G) 3.2.2]
  <custodhist> - Custodial History [ISAD(G) 3.2.3]
  <acqinfo> - Acquisition Information [ISAD(G) 3.2.4]
  <scopecontent> - Scope and Content [ISAD(G) 3.3.1]
  <appraisal> - Appraisal [ISAD(G) 3.3.2]
  <accruals> - Accruals [ISAD(G) 3.3.3]
  <arrangement> - Arrangement [ISAD(G) 3.3.4]
  <accessrestrict> - Conditions Governing Access [ISAD(G) 3.4.1]
  <userestrict> - Conditions Governing Use [ISAD(G) 3.4.2]
  <phystech> - Physical Characteristics and Technical Requirements [ISAD(G) 3.4.4]
  <otherfindaid> - Other Finding Aid [ISAD(G) 3.4.5]
  <originalsloc> - Location of Originals [ISAD(G) 3.5.1]
  <altformavail> - Alternative Form Available [ISAD(G) 3.5.2]
  <relatedmaterial> - Related Material [ISAD(G) 3.5.3]
  <separatedmaterial> - Separated Material [ISAD(G) 3.5.3]
  <bibliography> - Bibliography [ISAD(G) 3.5.4]
  <note> - Note [ISAD(G) 3.6.1]
  <odd> - Other Descriptive Data [ISAD(G) 3.6.1]
  <processinfo> - Processing Information [ISAD(G) 3.7.1]
  <prefercite> - Preferred Citation
  <controlaccess> - Control Access
  <fileplan> - File Plan
  <index> - Index
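The element-to-rule pairings on the preceding slides amount to a crosswalk, which can be held in a simple lookup table. The sketch below is one possible way to use it: it reports which ISAD(G) rules a description touches. Rule numbers follow the second edition of ISAD(G); the function name isadg_coverage and the sample description are hypothetical, and namespaces are omitted.

```python
import xml.etree.ElementTree as ET

# EAD element name -> ISAD(G) 2nd edition rule number.
EAD_TO_ISADG = {
    "unitid": "3.1.1", "unittitle": "3.1.2", "unitdate": "3.1.3",
    "physdesc": "3.1.5", "origination": "3.2.1", "bioghist": "3.2.2",
    "custodhist": "3.2.3", "acqinfo": "3.2.4", "scopecontent": "3.3.1",
    "appraisal": "3.3.2", "accruals": "3.3.3", "arrangement": "3.3.4",
    "accessrestrict": "3.4.1", "userestrict": "3.4.2", "langmaterial": "3.4.3",
    "phystech": "3.4.4", "otherfindaid": "3.4.5", "originalsloc": "3.5.1",
    "altformavail": "3.5.2", "relatedmaterial": "3.5.3",
    "separatedmaterial": "3.5.3", "bibliography": "3.5.4",
    "note": "3.6.1", "odd": "3.6.1", "processinfo": "3.7.1",
}


def isadg_coverage(ead_xml):
    """Map each ISAD(G) rule present in the description to the EAD elements found."""
    covered = {}
    for element in ET.fromstring(ead_xml).iter():
        rule = EAD_TO_ISADG.get(element.tag)
        if rule is not None:
            covered.setdefault(rule, set()).add(element.tag)
    return {rule: sorted(tags) for rule, tags in sorted(covered.items())}


# Hypothetical description used only to exercise the lookup.
SAMPLE = """<archdesc level="collection">
  <did>
    <unitid>MSS 1</unitid>
    <unittitle>Example Papers</unittitle>
    <unitdate>1900-1950</unitdate>
  </did>
  <scopecontent><p>Correspondence.</p></scopecontent>
  <relatedmaterial><p>See also MSS 2.</p></relatedmaterial>
  <separatedmaterial><p>Maps transferred.</p></separatedmaterial>
</archdesc>"""

for rule, tags in isadg_coverage(SAMPLE).items():
    print(rule, ", ".join(tags))
```

Elements without an ISAD(G) equivalent on the slides, such as <abstract> or <prefercite>, are simply ignored by the lookup.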
EAC-CPF Data Structure
EAC-CPF Concepts
  SINGLE IDENTITY: one person (or corporate body or family) with a single identity represented in one EAC-CPF instance. (Most common.)
  MULTIPLE IDENTITY-MANY IN ONE: two or more identities (including official identities), each represented by a distinct description within one EAC-CPF instance. Can be programmatically converted into Multiple Identity-One in Many. (Less common though not rare.)
  MULTIPLE IDENTITY-ONE IN MANY: two or more identities (including official identities), each represented in two or more interrelated EAC-CPF instances. Can be programmatically converted into Multiple Identity-Many in One. (Less common though not rare.)
  COLLABORATIVE IDENTITY: a single identity shared by two or more persons (e.g. a shared pseudonym used in creation of a collaborative work). Use Multiple Identity-One in Many. (Rare.)
  ALTERNATIVE SET: derived EAC-CPF instance that is based on and incorporates two or more alternative EAC-CPF instances for the same entity. To be used by a consortium or a utility providing union access to authority records maintained in two or more systems by two or more agencies. Alternative EAC-CPF instances may be in different languages or in the same language.
Basic Structure
  <eac-cpf>*
    <control>*
    <cpfDescription>
      <identity>
      <description>
      <relations>
Basic Structure
  <control>: identity, creation, maintenance, status, rules and authorities, and sources used to generate the EAC-CPF instance.
  <cpfDescription>: description of the EAC-CPF entity
    <identity>: names
    <description>: formal and informal descriptive elements
    <relations>: relationships to other entities, resources and function descriptions
Philosophical neutrality (1)
  <eac-cpf>
    <control></control>
    <cpfDescription>
      <identity></identity>
      <description></description>
      <relations>
        <cpfRelation></cpfRelation>
        <cpfRelation></cpfRelation>
      </relations>
    </cpfDescription>
  </eac-cpf>
Philosophical neutrality (2)
  <eac-cpf>
    <control></control>
    <multipleIdentities>
      <cpfDescription></cpfDescription>
      <cpfDescription></cpfDescription>
      <cpfDescription></cpfDescription>
    </multipleIdentities>
  </eac-cpf>
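The two layouts above can be told apart programmatically, which is useful when aggregating records from repositories that made different choices. The sketch below is an illustration only: the function name identity_model and the labels it returns are hypothetical, element names use the camelCase spelling of the published EAC-CPF schema, and the EAC-CPF namespace is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Layout (1): one description; other identities live in related instances.
ONE_DESCRIPTION = """<eac-cpf>
  <control/>
  <cpfDescription>
    <identity/>
    <description/>
    <relations><cpfRelation/><cpfRelation/></relations>
  </cpfDescription>
</eac-cpf>"""

# Layout (2): several descriptions wrapped in one instance.
MANY_IN_ONE = """<eac-cpf>
  <control/>
  <multipleIdentities>
    <cpfDescription/>
    <cpfDescription/>
    <cpfDescription/>
  </multipleIdentities>
</eac-cpf>"""


def identity_model(eac_xml):
    """Return (label, number of descriptions) for an EAC-CPF instance."""
    root = ET.fromstring(eac_xml)
    wrapper = root.find("multipleIdentities")
    if wrapper is not None:
        return "many-in-one", len(wrapper.findall("cpfDescription"))
    if root.find("cpfDescription") is not None:
        return "single-description", 1
    return "unknown", 0


print(identity_model(ONE_DESCRIPTION))
print(identity_model(MANY_IN_ONE))
```

A "single-description" result does not by itself distinguish a Single Identity from one member of a One in Many set; that would require inspecting the <cpfRelation> elements for identity relationships.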
<control>
  <recordId>*
  <maintenanceAgency>*
  <maintenanceStatus>*
  <maintenanceHistory>*
  <publicationStatus>
  <languageDeclaration>*
  <sources>*
  <conventionDeclaration>
  <otherRecordId>
  <localControl>
  <localTypeDeclaration>
<cpfDescription>/<identity>
  <entityType>*
  <nameEntry>**
  <nameEntryParallel>**
  <entityId>
  <descriptiveNote>
Basic Name Models
  <nameEntry>
    <part></part>
    <useDates></useDates>
  </nameEntry>
  <nameEntryParallel>
    <nameEntry></nameEntry>
    <nameEntry></nameEntry>
  </nameEntryParallel>
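As a rough illustration of the two name models, the sketch below reads an <identity> element and groups parallel forms of a name together while keeping standalone names separate. The sample names, the language codes, and the helper names entry_text and name_forms are hypothetical; element names use the camelCase spelling of the published EAC-CPF schema, and the EAC-CPF namespace is omitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical identity: one name recorded in two languages (parallel forms)
# plus an earlier standalone name with dates of use.
IDENTITY = """<identity>
  <entityType>corporateBody</entityType>
  <nameEntryParallel>
    <nameEntry xml:lang="eng"><part>Example Society</part></nameEntry>
    <nameEntry xml:lang="por"><part>Sociedade Exemplo</part></nameEntry>
  </nameEntryParallel>
  <nameEntry>
    <part>Example Club</part>
    <useDates><date>1900</date></useDates>
  </nameEntry>
</identity>"""


def entry_text(name_entry):
    """Join the <part> elements of one <nameEntry> into a display string."""
    return " ".join((part.text or "").strip() for part in name_entry.findall("part"))


def name_forms(identity_xml):
    """Return one list per name; parallel forms share a list."""
    groups = []
    for child in ET.fromstring(identity_xml):
        if child.tag == "nameEntry":
            groups.append([entry_text(child)])
        elif child.tag == "nameEntryParallel":
            groups.append([entry_text(e) for e in child.findall("nameEntry")])
    return groups


print(name_forms(IDENTITY))
```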
<cpfDescription>/<description>
  <existDates>
  <function>
  <generalContext>
  <legalStatus>
  <languageUsed>
  <mandate>
  <occupation>
  <place>
  <biogHist>
  <structureOrGenealogy>
  <localDescription>
<cpfDescription>/<relations>
  <cpfRelation>
  <functionRelation>
  <resourceRelation>
  @*RelationType
  <relationEntry>
  <objectBinWrap>
  <objectXMLWrap>
  <date>, <dateRange>, <dateSet>
  <place>
  <descriptiveNote>
<cpfRelation>: relation types
  @cpfRelationType
    identity
    hierarchical
    hierarchical-parent
    hierarchical-child
    temporal
    temporal-earlier
    temporal-later
    family
    associative
<resourceRelation>: relation types
  @resourceRelationType
    creatorOf
    subjectOf
    other
<functionRelation>: relation types
  @functionRelationType
    controls
    owns
    performs
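The controlled values listed on the three relation-type slides can be checked mechanically before records are shared. The sketch below is one way to do so under stated assumptions: the function name invalid_relations and the sample relations are hypothetical, element and attribute names use the camelCase spelling of the published EAC-CPF schema, the namespace is omitted, and a relation without a type attribute is treated as acceptable.

```python
import xml.etree.ElementTree as ET

# Relation element -> (type attribute, allowed values), as listed on the slides.
RELATION_TYPES = {
    "cpfRelation": ("cpfRelationType", {
        "identity", "hierarchical", "hierarchical-parent", "hierarchical-child",
        "temporal", "temporal-earlier", "temporal-later", "family", "associative",
    }),
    "resourceRelation": ("resourceRelationType", {"creatorOf", "subjectOf", "other"}),
    "functionRelation": ("functionRelationType", {"controls", "owns", "performs"}),
}


def invalid_relations(relations_xml):
    """Return (element, value) pairs whose relation type is not an allowed value."""
    problems = []
    for relation in ET.fromstring(relations_xml):
        spec = RELATION_TYPES.get(relation.tag)
        if spec is None:
            continue
        attribute, allowed = spec
        value = relation.get(attribute)
        if value is not None and value not in allowed:
            problems.append((relation.tag, value))
    return problems


# Hypothetical relations; "manages" is deliberately not an allowed value.
RELATIONS = """<relations>
  <cpfRelation cpfRelationType="temporal-earlier">
    <relationEntry>Predecessor Body</relationEntry>
  </cpfRelation>
  <resourceRelation resourceRelationType="creatorOf">
    <relationEntry>Example Papers</relationEntry>
  </resourceRelation>
  <functionRelation functionRelationType="manages">
    <relationEntry>Record keeping</relationEntry>
  </functionRelation>
</relations>"""

print(invalid_relations(RELATIONS))
```

Schema validation would catch the same errors; a script like this is mainly useful for reporting them in bulk across many files.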
Future Development
EAD Revision Timeline
  Comment period complete (October 2010 to February 2011)
  EAD Revision Forum (SAA Annual Meeting, August 2011)
  TS-EAD Working Meeting (March 2012)
  Release draft schema (Fall 2012)
  Second comment period (Winter 2013)
  Finalize schema and documentation (Spring 2013)
  Release revised schema (August 2013)
EAD Revision Goals
  Clarify relationship with EAC-CPF
  Improve interoperability with databases
  Reconsider finding aids as documents or data
  Simplification
    To eliminate unnecessary complexity
    To make implementation easier
  Improve usability
  Enable profiles (schema subsets)
    Data-friendly
    Implementation-friendly (may or may not be the same as data-friendly)
Future EAC Development
  EAC-CPF Implementation Review by 2016
  Companion EAC standards?
    EAC-Functions (EAC-F)?
    EAC-Institutions with Archival Holdings (EAC-IAH)?
Questions?
  michael.rush@yale.edu
  http://twitter.com/mike_rush