Encoding Records in XML: RDF, METS


714: Metadata
Encoding Records in XML: RDF, METS
Margaret E.I. Kipp - kipp@uwm.edu
https://pantherfile.uwm.edu/kipp/public/courses/714

Encoding Metadata in XML
Metadata encoding is the process of expressing metadata records in an encoding scheme such as XML.
Metadata may be stored in two principal forms: internal storage or external storage.
Internal storage: metadata is embedded in the object itself (e.g. metadata in an HTML header).
External storage: metadata is stored separately from the object (a surrogate record, as in many digital libraries).

Internal vs External Storage
Advantages of internal storage:
  the metadata is always with the item, so there is no need to add a separate surrogate record and a link between the two
Advantages of external storage:
  no need to modify the item to accept a metadata record
  allows for situations where the item is not available electronically

Expressing Metadata in HTML/XML
HTML:
  metadata is embedded in the HTML document
  uses <meta name="name" content="content"> and <link rel="property" href="uri"> tags
  may also use the lang attribute
XML:
  uses an XML schema to define the namespace, i.e. the element names (in HTML these are already defined)
  XML files are ideal for storage in databases
  databases are likewise designed from entity-relationship (E-R) models

Expressing Metadata in XML/RDF
RDF (Resource Description Framework) is a standard designed for encoding web metadata for the semantic web.
RDF uses the URI (often a URL) as the mechanism for identifying objects.
Objects may be URIs or constant values (e.g. a date, time or language).
RDF also describes web resources and real-world things such as people, places and concepts, and the relationships between them.

RDF Model
Every statement can be structured as a triple consisting of a subject, a predicate and an object.
Information on the web has no obvious structure to a computer, but structured statements can be encoded in RDF, making them machine readable.
e.g. an encoding could indicate the subject, verb and object of a sentence
Given only "Alice lives in Florida", a computer would not know that Alice is a person who lives in Florida (a US state).

Structured Statements
Alice (person) [subject] -- lives in [verb] -- Florida (US State) [object]
In this simple encoding format:
  round brackets () give information about a term (clarifications)
  square brackets [] indicate the term's grammatical role in the sentence
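The bracket notation above can be mirrored in a small data structure. This is a minimal sketch in Python; the class names `Term` and `Statement` are hypothetical and not part of any RDF library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    value: str  # the term itself, e.g. "Alice"
    kind: str   # the clarification shown in round brackets, e.g. "person"


@dataclass(frozen=True)
class Statement:
    subject: Term
    predicate: str  # the verb phrase linking subject and object
    obj: Term

    def describe(self) -> str:
        # Render the statement in the slide's "subject -- verb -- object" form.
        return (f"{self.subject.value} ({self.subject.kind}) -- "
                f"{self.predicate} -- {self.obj.value} ({self.obj.kind})")


statement = Statement(Term("Alice", "person"), "lives in", Term("Florida", "US State"))
print(statement.describe())
```

Keeping the clarification next to each term is what lets a program tell that "Florida" is a place rather than, say, a person's name.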

RDF Example
Wisconsin (subject) -- has the postal abbreviation (predicate) -- WI (object)
    <?xml version="1.0"?>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:terms="http://purl.org/dc/terms/">
      <rdf:Description rdf:about="urn:xstates:wisconsin">
        <terms:alternative>WI</terms:alternative>
      </rdf:Description>
    </rdf:RDF>
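To see how a program recovers the subject, predicate and object from a record like this one, here is a minimal sketch using Python's standard-library `xml.etree.ElementTree`. It is a plain XML reader rather than a full RDF parser: the helper name `triples` is hypothetical, and it handles only literal-valued properties directly inside `rdf:Description`.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

rdf_xml = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:terms="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="urn:xstates:wisconsin">
    <terms:alternative>WI</terms:alternative>
  </rdf:Description>
</rdf:RDF>"""


def triples(xml_text):
    """Yield (subject, predicate, object) for each literal-valued property."""
    root = ET.fromstring(xml_text)
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree writes qualified names as "{namespace}local";
            # joining the two parts gives the predicate URI.
            predicate = prop.tag[1:].replace("}", "", 1)
            yield (subject, predicate, (prop.text or "").strip())


for triple in triples(rdf_xml):
    print(triple)
```

The predicate comes out as a full URI (namespace plus element name), which is how RDF identifies properties unambiguously.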

RDA/DC in RDF Example
    <?xml version="1.0"?>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:dc="http://purl.org/dc/elements/1.0/"
             xmlns:dcq="http://purl.org/dc/qualifiers/1.0/">
      <rdf:Description about="155192398x (pb)">
        <dc:title>Harry Potter and the philosopher's stone /</dc:title>
        <dc:creator>Rowling, J. K.</dc:creator>
        <dc:format>223 p. : 20 cm.</dc:format>
        <dc:publisher>Raincoast</dc:publisher>
        <dc:publisher>Vancouver, BC</dc:publisher>
        <dc:date>2000</dc:date>
        <dc:identifier>155192398x (pb)</dc:identifier>
        <dc:identifier>9781551923987 (pb)</dc:identifier>
        <dc:language>eng</dc:language>
        <dc:subject>
          <rdf:Description>
            <dcq:subjectqualifier>namepersonal</dcq:subjectqualifier>
            <rdf:value>Potter, Harry (Fictitious character)--Juvenile fiction.</rdf:value>
          </rdf:Description>
        </dc:subject>
        <dc:subject>
          <rdf:Description>
            <dcq:subjectqualifier>topical</dcq:subjectqualifier>
            <rdf:value>Hogwarts School of Witchcraft and Wizardry (Fictitious place)--Juvenile fiction.</rdf:value>
          </rdf:Description>
        </dc:subject>
        <dc:subject>
          <rdf:Description>
            <dcq:subjectqualifier>topical</dcq:subjectqualifier>
            <rdf:value>Wizards--Juvenile fiction.</rdf:value>
          </rdf:Description>
        </dc:subject>
        <dc:relation>Harry Potter ; bk. 2</dc:relation>
        <dc:type>text</dc:type>
      </rdf:Description>
    </rdf:RDF>

Combining Metadata Descriptions
Modules/chunks of metadata records can be combined into a single structure for transmission to other systems.
METS (Metadata Encoding and Transmission Standard) provides a framework for incorporating components from various metadata schemes within one structure.
METS can package descriptive, administrative and structural metadata into one XML document for exchange with other repositories.

METS
METS is an XML Schema which expresses:
  1) the hierarchical structure of digital library objects
  2) the names and locations of the files that constitute those objects
  3) all associated metadata
(Zeng and Qin 2008, p. 200)
Each part of the metadata record may be another record, to which the METS record points.
http://www.loc.gov/standards/mets/

METS Records
  METS header (required)
  descriptive metadata
  administrative metadata
  file section
  structural map (required)
  structural links
  behaviour
http://www.dlib.org/dlib/june06/zeng/06zeng.html

METS Examples
    <mets:mets>
      <mets:dmdSec ID="MODS1">
        <mets:mdWrap MDTYPE="MODS">
          <mets:xmlData>
            <mods:mods version="3.3">
              <mods:titleInfo>
                <mods:title>Great conversations: the pianists</mods:title>
              </mods:titleInfo>...
          </mets:xmlData>
        </mets:mdWrap>
      </mets:dmdSec>
    </mets:mets>
http://www.loc.gov/standards/mets/mets-examples.html
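A METS wrapper is ordinary XML, so the wrapped records can be located with any XML library. Here is a minimal sketch in Python, assuming a completed toy version of the fragment above: the elided parts are closed off, the namespace declarations are added, and a placeholder structural map is included. The helper name `descriptive_sections` is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "mets": "http://www.loc.gov/METS/",
    "mods": "http://www.loc.gov/mods/v3",
}

# Toy document: the slide's fragment, completed only far enough to parse.
mets_xml = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
           xmlns:mods="http://www.loc.gov/mods/v3">
  <mets:dmdSec ID="MODS1">
    <mets:mdWrap MDTYPE="MODS">
      <mets:xmlData>
        <mods:mods version="3.3">
          <mods:titleInfo>
            <mods:title>Great conversations: the pianists</mods:title>
          </mods:titleInfo>
        </mods:mods>
      </mets:xmlData>
    </mets:mdWrap>
  </mets:dmdSec>
  <mets:structMap><mets:div/></mets:structMap>
</mets:mets>"""


def descriptive_sections(xml_text):
    """Return (section ID, wrapped metadata type, title) for each dmdSec."""
    root = ET.fromstring(xml_text)
    found = []
    for section in root.findall("mets:dmdSec", NS):
        wrap = section.find("mets:mdWrap", NS)
        title = section.findtext(".//mods:title", default=None, namespaces=NS)
        found.append((section.get("ID"), wrap.get("MDTYPE"), title))
    return found


print(descriptive_sections(mets_xml))
```

The `MDTYPE` attribute is what tells a consumer which scheme (here MODS) is wrapped inside each descriptive section.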

Multiple Schemas in a Namespace
It is common practice to add elements to DC via another schema, e.g.:
    <record xmlns="http://example.org/learningapp/"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="http://example.org/learningapp/ http://example.org/learningapp/schema.xsd"
            xmlns:dc="http://purl.org/dc/elements/1.1/"
            xmlns:ims="http://www.imsglobal.org/xsd/imsmd_v1p2">
http://dublincore.org/documents/dc-xml-guidelines/

Multiple Schemas using RDF
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:dc="http://purl.org/dc/elements/1.1/"
             xmlns:js="http://js.org/meta/">
      <rdf:Description about="http://js.org/doc/1">
        <dc:title>Metadata sharing and XML</dc:title>
        <dc:creator>John Smith</dc:creator>
        <js:rating>3</js:rating>
      </rdf:Description>
    </rdf:RDF>
http://www.ukoln.ac.uk/interop-focus/gpg/metadata/
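Namespaces are what keep elements from different schemas apart inside one record. The sketch below, in Python with the standard library only, groups the properties of a record like the one above by the namespace they come from. The helper name `properties_by_namespace` is hypothetical, and the record is written with the `rdf` prefix declared so that it parses.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

record = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:js="http://js.org/meta/">
  <rdf:Description rdf:about="http://js.org/doc/1">
    <dc:title>Metadata sharing and XML</dc:title>
    <dc:creator>John Smith</dc:creator>
    <js:rating>3</js:rating>
  </rdf:Description>
</rdf:RDF>"""


def properties_by_namespace(xml_text):
    """Map each namespace URI to the {element name: value} pairs it contributes."""
    root = ET.fromstring(xml_text)
    grouped = defaultdict(dict)
    for description in root:
        for prop in description:
            # Tags look like "{namespace}local"; split them apart.
            namespace, _, local_name = prop.tag[1:].partition("}")
            grouped[namespace][local_name] = prop.text
    return dict(grouped)


for namespace, values in properties_by_namespace(record).items():
    print(namespace, values)
```

A consumer that only understands Dublin Core can read the `dc` group and safely ignore the locally defined `js:rating`.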

Multiple Schemas using METS
METS can be used to store records in multiple metadata schemas for the same object.
e.g. this LC METS record contains both MODS and MARCXML:
http://lcweb2.loc.gov/diglib/ihas/loc.natlib.gottlieb.00011/full.html
http://lcweb2.loc.gov/diglib/ihas/loc.natlib.gottlieb.00011/mets.xml

Parallel Metadata
Used for handling multilingual materials in a digital collection. Two approaches:
  inline parallel metadata: one metadata record includes the terms in every language
  external parallel metadata: a separate metadata record for each language
e.g.
    <dc:subject scheme="lcsh" xml:lang="en">United States History</dc:subject>
    <dc:subject scheme="rvm" xml:lang="fr">Histoire des Etats-Unis</dc:subject>
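With inline parallel metadata, a display layer picks the values for the reader's language by checking `xml:lang`. Here is a minimal Python sketch. The `<record>` wrapper element and the helper name `subjects_in` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
# The xml: prefix is bound to this fixed namespace by the XML specification.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

record = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:subject scheme="lcsh" xml:lang="en">United States History</dc:subject>
  <dc:subject scheme="rvm" xml:lang="fr">Histoire des Etats-Unis</dc:subject>
</record>"""


def subjects_in(xml_text, lang):
    """Return the dc:subject values tagged with the given language code."""
    root = ET.fromstring(xml_text)
    return [
        subject.text
        for subject in root.iter(f"{{{DC}}}subject")
        if subject.get(XML_LANG) == lang
    ]


print(subjects_in(record, "fr"))
```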

In-Class Exercise: Record Creation Create an RDF record for the course textbook or create a METS record containing two different record formats (you can use records you have previously created).

XML Schema Definitions

DTD (Document Type Definition)
The original style for defining the elements in an XML schema.
Defines:
  the elements
  whether each is repeatable
  whether each is required
A DTD is not itself written in XML.
DTD Tutorial: http://www.w3schools.com/dtd/default.asp

Example External DTD
    <!ELEMENT books (book+)>
    <!ELEMENT book (authors,title)>
    <!ELEMENT authors (author+)>   <!-- + means one or more -->
    <!ELEMENT author (#PCDATA)>
    <!ELEMENT title (#PCDATA)>
Specifies a books object which can contain multiple book objects; each book has at least one author and a title.

Example Internal DTD
    <?xml version="1.0"?>
    <!DOCTYPE books [
      <!ELEMENT books (book+)>
      <!ELEMENT book (title, authors)>
      <!ELEMENT authors (author+)>
      <!ELEMENT author (#PCDATA)>
      <!ELEMENT title (#PCDATA)>
    ]>
    <books>
      <book>
        <title>Metadata</title>
        <authors><author>Zeng</author><author>Qin</author></authors>
      </book>
    </books>

Components of a DTD
All XML documents (including HTML and XHTML) are made up of the following building blocks:
  Elements - the elements named in your schema
  Attributes - these refine the elements (e.g. the href attribute in the <a> or anchor tag for URLs)
  Entities - special characters
  PCDATA - character data which will be parsed for special characters or markup
  CDATA - character data which will not be parsed

Declaring Elements in a DTD
Declaring an element:
    <!ELEMENT element-name (#PCDATA)>
    e.g. <!ELEMENT title (#PCDATA)>
Elements which contain other elements:
    <!ELEMENT element-name category>
    or <!ELEMENT element-name (element-content)>
    e.g. <!ELEMENT book (title, author)>
This element has two subelements; in the document they must appear in the order listed in the declaration.

Declaring Elements in a DTD 2
Number of occurrences of an element:
    Once:         <!ELEMENT book (author)>
    One or more:  <!ELEMENT book (author+)>
    Zero or more: <!ELEMENT book (author*)>
    Zero or one:  <!ELEMENT book (author?)>
Choice between two elements:
    <!ELEMENT book (title,author,publisher,(isbn|url))>
    can contain either isbn or url
Tutorial: http://www.w3schools.com/dtd/dtd_elements.asp
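The occurrence indicators behave much like regular-expression operators, which suggests a way to experiment with content models without a validating parser. The sketch below is in Python, whose standard library does not validate against DTDs. It translates an element-only content model into a regular expression over child element names. The helper names `model_to_regex` and `children_match` are hypothetical, and `#PCDATA`, `EMPTY`, `ANY` and mixed content are deliberately not handled.

```python
import re
import xml.etree.ElementTree as ET


def model_to_regex(model):
    """Translate a DTD content model such as "(title,author+)" into a regex."""
    parts = []
    for token in re.findall(r"[A-Za-z_][\w.-]*|[(),|+*?]", model):
        if token == "(":
            parts.append("(?:")          # grouping only, no capture needed
        elif token == ",":
            continue                     # a sequence is plain concatenation
        elif token in {")", "|", "+", "*", "?"}:
            parts.append(token)          # same meaning in DTDs and regexes
        else:
            # Each child name is followed by a space in the sequence string.
            parts.append(f"(?:{re.escape(token)} )")
    return re.compile("".join(parts))


def children_match(element, model):
    """True if the element's children satisfy the content model, in order."""
    sequence = "".join(child.tag + " " for child in element)
    return model_to_regex(model).fullmatch(sequence) is not None


MODEL = "(title,author+,(isbn|url))"
good = ET.fromstring(
    "<book><title>Metadata</title><author>Zeng</author>"
    "<author>Qin</author><isbn>x</isbn></book>"
)
missing_author = ET.fromstring("<book><title/><isbn/></book>")
print(children_match(good, MODEL), children_match(missing_author, MODEL))
```

For real validation a validating parser should be used; this sketch only illustrates how `+`, `*`, `?`, `,` and `|` constrain the order and number of children.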

Declaring Attributes in a DTD
    <!ATTLIST element-name attribute-name attribute-type default-value>
    DTD example: <!ATTLIST unit type CDATA "metric">
    XML example: <unit type="metric" />
You can also specify a list of allowed values as the attribute-type:
    DTD example: <!ATTLIST unit type (metric|imperial) "metric">
Instead of specifying a default-value you can also specify #REQUIRED, #IMPLIED (optional) or #FIXED (value is fixed).
http://www.w3schools.com/dtd/dtd_attributes.asp

Choosing Elements or Attributes
XML does not enforce the choice between elements and attributes.
Normally, attributes should be used for data which is specific to a single metadata element (for example, the language of the summary).
Data which refers to the entire object being described would go in an element.

Example Elements or Attributes
Using attributes:
    <book author="Marcia Zeng" title="Metadata" />
Using subelements:
    <book>
      <author>Marcia Zeng</author>
      <title>Metadata</title>
    </book>
Using subsubelements:
    <book>
      <author>
        <firstname>Marcia</firstname>
        <lastname>Zeng</lastname>
      </author>
    </book>
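The choice affects how consuming code reads the record, not what the record says. A minimal Python sketch, with a hypothetical helper `book_fields`, that accepts either of the first two styles:

```python
import xml.etree.ElementTree as ET


def book_fields(xml_text):
    """Return {'author': ..., 'title': ...} from either encoding style."""
    book = ET.fromstring(xml_text)
    fields = {}
    for name in ("author", "title"):
        # Prefer an attribute if present; otherwise fall back to a subelement.
        value = book.get(name)
        if value is None:
            value = book.findtext(name)
        fields[name] = value
    return fields


as_attributes = '<book author="Marcia Zeng" title="Metadata" />'
as_subelements = "<book><author>Marcia Zeng</author><title>Metadata</title></book>"

print(book_fields(as_attributes))
print(book_fields(as_subelements))
```

Note that an attribute can appear only once per element and cannot hold substructure, so repeatable values (several authors) or structured ones (first and last name) need the subelement form.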

Defining Entities
Entities are special characters or special information; in programming languages these are called constants.
e.g. you could define the base URL for your site and insert it using an entity; then if it changes you only need to update one thing
Declarations:
    Internal declaration: <!ENTITY entity-name "entity-value">
    External declaration: <!ENTITY entity-name SYSTEM "URI/URL">
    Example declaration:  <!ENTITY baseurl "http://www.example.com/">
    XML example:          <url>&baseurl;</url>
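An entity declared in a document's internal DTD is expanded by the parser before the application sees the text. Here is a minimal sketch with Python's standard-library `xml.etree.ElementTree`. The `index.html` suffix is an assumption added to show the substitution happening mid-text.

```python
import xml.etree.ElementTree as ET

document = """<?xml version="1.0"?>
<!DOCTYPE url [
  <!ENTITY baseurl "http://www.example.com/">
]>
<url>&baseurl;index.html</url>"""

# The parser replaces &baseurl; with the declared value while reading.
root = ET.fromstring(document)
print(root.text)
```

Changing the single `<!ENTITY ...>` declaration changes every place where `&baseurl;` is used. Because entity expansion can be abused, untrusted XML should be parsed with appropriate safeguards.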

Verifying XML and DTDs
The following URL contains a set of validators for determining whether your XML is correct. It will also allow you to validate a DTD.
http://www.w3schools.com/xml/xml_validator.asp
More DTD examples: http://www.w3schools.com/dtd/dtd_examples.asp

XML Schemas (XSD)
A language for defining XML elements and structures:
"the semantic and structural definition of metadata elements and the relationships between the elements" [Zeng and Qin, 131]
The structure and elements of an XML document can be defined by a DTD (Document Type Definition) or an XSD (XML Schema).
XSD is itself encoded in XML.

Simple DTD: Book
    <!ELEMENT books (book+)>
    <!ELEMENT book (author,title)>
    <!ELEMENT author (#PCDATA)>
    <!ELEMENT title (#PCDATA)>
Specifies a books object which can contain multiple book objects (+); each book has an author and a title.

Simple XSD: Book
    <?xml version="1.0"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:element name="books">
        <xs:complexType><xs:sequence>
          <xs:element ref="book" maxOccurs="unbounded"/>
        </xs:sequence></xs:complexType>
      </xs:element>
      <xs:element name="book">
        <xs:complexType><xs:sequence>
          <xs:element name="author" type="xs:string"/>
          <xs:element name="title" type="xs:string"/>
        </xs:sequence></xs:complexType>
      </xs:element>
    </xs:schema>

XML Schema Elements
The <xs:element> tag defines an XML element:
    <xs:element name="[element name]" type="[type]"/>
  element name: the name of your element
  type: chosen from a list; use xs:string unless you must have a specific data type (e.g. number, datestamp, boolean (true/false))
e.g. <xs:element name="author" type="xs:string"/>

Elements: Mandatory, Optional, Repeatable
The attributes minOccurs and maxOccurs specify how often an element may occur. minOccurs="0" makes an element optional, maxOccurs="unbounded" makes it repeatable, and a number sets the minimum or maximum count. Both default to 1.
e.g.:
    repeatable: <xs:element name="[element]" type="[type]" maxOccurs="unbounded"/>
    optional:   <xs:element name="[element]" type="[type]" minOccurs="0"/>
    only once:  <xs:element name="[element]" type="[type]"/>
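Because an XSD is itself XML, these occurrence rules can be read with an ordinary XML parser. The Python sketch below lists the rule for each element declared inside an `xs:sequence`, applying the default of 1 when an attribute is absent. The schema content (including the `subtitle` element) and the helper name `occurrence_rules` are assumptions made for illustration. It inspects a schema and does not validate documents against it.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

schema = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="book">
    <xs:complexType><xs:sequence>
      <xs:element name="title" type="xs:string"/>
      <xs:element name="author" type="xs:string" maxOccurs="unbounded"/>
      <xs:element name="subtitle" type="xs:string" minOccurs="0"/>
    </xs:sequence></xs:complexType>
  </xs:element>
</xs:schema>"""


def occurrence_rules(xsd_text):
    """Map each element declared in a sequence to (minOccurs, maxOccurs)."""
    root = ET.fromstring(xsd_text)
    rules = {}
    for sequence in root.iter(f"{{{XS}}}sequence"):
        for element in sequence.findall(f"{{{XS}}}element"):
            name = element.get("name") or element.get("ref")
            # Both attributes default to "1" when not written out.
            rules[name] = (element.get("minOccurs", "1"),
                           element.get("maxOccurs", "1"))
    return rules


for name, (low, high) in occurrence_rules(schema).items():
    print(f"{name}: min {low}, max {high}")
```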

Attributes
Elements can have attributes:
    <xs:attribute name="[attribute]" type="[type]"/>
An attribute is optional by default; it can also take a use attribute to specify that it is required or optional:
    <xs:attribute name="[attribute]" type="[type]" use="required"/>

An Element with Attributes
    <xs:complexType name="date">
      <xs:simpleContent>
        <xs:extension base="xs:string">
          <xs:attribute name="format" type="xs:string"/>
          <xs:attribute name="lang" type="xs:string"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
We are defining a new type here, an element with attributes, based on xs:string.
e.g. <date format="yyyy-mm-dd" lang="en">2009-09-30</date>

Sequences of Elements
    <xs:sequence>
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element ref="title"/>
        <xs:element ref="creator"/>
      </xs:choice>
    </xs:sequence>
(from the DC schema)
xs:sequence specifies a list of elements to include.
xs:choice specifies that you can choose which to use.

XML Schemas
Simple DC XML Schema: http://dublincore.org/schemas/xmls/simpledc20021212.xsd
  no required elements and no required schemes
  http://www.ukoln.ac.uk/metadata/dcmi/dc-xml-guidelines/
  http://dublincore.org/schemas/xmls/
EAD XML Schema: http://www.loc.gov/ead/ead.xsd
Markup Languages: http://en.wikipedia.org/wiki/list_of_xml_markup_languages

XML Schema Examples and Tutorials
http://www.codalogic.com/lmx/xsd-overview.html
  a short introduction to XML Schema with examples
http://www.w3schools.com/schema/default.asp
  tutorials for XML and XML Schema

Toybrary: Final Element List
Form a small group. Select one of the toys to test catalogue.
Based on the test cataloguing, we will discuss the following:
  elements to keep
  required/optional/repeatable etc. elements
  descriptions of elements
  suggested value standards/controlled vocabularies for elements

In Class Exercise: Designing an XML Schema Create a simple DTD or XML Schema to describe a schema for toys using our current application profile/data dictionary.