... an ambitious research and development initiative... a world-leading text markup technology, anticipating a standards-driven future for the Internet... answers the call to create a Semantic Web... creates a wide range of metadata and rendering alternatives... ensures standards-compliant data is generated by content management software... comes with archive-quality text creation templates... and web services to perform format conversion

It has been widely predicted that the future of electronic content development and distribution is the Semantic Web, founded on ontologies or schemas which define particular areas of knowledge and human activity. Common Ground Markup Language (CGML) is an ontology of creativity and publishing, a comprehensive description of the content development and distribution process. CGML is an XML schema which aims to turn the call to create the Semantic Web into practical reality. It works through a series of inter-related software tools which use as their foundation Common Ground's patent-pending CGML interlanguage mechanism to create interoperability across an unprecedented range of data and metadata standards.

This suite of related technologies includes:
- Common Ground Markup Language: the core XML schema.
- CGCreator: a Microsoft Word template which generates an underlying XML file which can simultaneously be rendered to different formats (print typesetting, web, electronic reading devices and audio) and supplied with different metadata wrappers (e-commerce records, learning objects, library cataloguing records).
- CGExchange: web services for format conversion and metadata exchange.
- CGDictionary: 1200 terms, 65,000 words of definitions, and growing; a comprehensive lexicon of writing and publishing for the digital era.
- CGPublisher: a workflow tool which collects standards-compliant metadata (see www.CGPublisher.com).

These are some of the things CGML is designed to achieve:
- Metadata interoperability for digital and physical works: create a library cataloguing record (e.g. the Library of Congress's MARC system), an e-commerce record (e.g. the bookselling and publishing industry's ONIX standard), a learning object (e.g. the IMS/SCORM standards for Learning Management Systems), a web syndication feed (e.g. the RSS web standard), an electronic resource discovery record (e.g. the Dublin Core metadata framework), or a job ticket for print production (e.g. the JDF digital print standard).
- Data interoperability for digitised works: create a work which will render as typeset text (e.g. the DocBook and TEI standards), or to the web (e.g. the W3C's XHTML standard), or to an electronic reading device (e.g. the Open e-book standard), or to audio (e.g. the Digital Talking Book standard, for disability access etc.).
- Markup-enhanced searchability, simultaneously across files and metadata: moving beyond traditional search algorithms and building on centuries of library (e.g. MARC), bookstore (e.g. ONIX) and text markup practices (e.g. DocBook and TEI).
- Archive-quality text: word processing, desktop publishing and web ways of authoring text may come and go, but the foundational concepts of CGML's commonsense publishing interlanguage, including author, title, work, paragraph and image, will not.
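
The metadata interoperability goal above can be pictured with a minimal sketch: one neutral description of a work is wrapped in two different metadata formats. This is not CGML or any Common Ground tool. The record keys, the sample values and the function names are assumptions made up for illustration; only the Dublin Core namespace and the RSS 2.0 element names come from those standards.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace

# Hypothetical neutral record; the keys are illustrative, not actual CGML tags.
WORK = {
    "title": "A Sample Work",
    "author": "Jane Example",
    "publisher": "Example Press",
    "year": "2005",
}

# Assumed correspondence between the neutral keys and Dublin Core elements.
DC_FIELDS = (
    ("title", "title"),
    ("author", "creator"),
    ("publisher", "publisher"),
    ("year", "date"),
)


def to_dublin_core(work):
    """Wrap the record as Dublin Core elements under a plain container."""
    root = ET.Element("metadata")
    for key, dc_name in DC_FIELDS:
        ET.SubElement(root, f"{{{DC_NS}}}{dc_name}").text = work[key]
    return root


def to_rss_item(work):
    """Wrap the same record as an RSS 2.0 <item>."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = work["title"]
    ET.SubElement(item, "description").text = (
        f"{work['title']} by {work['author']} "
        f"({work['publisher']}, {work['year']})"
    )
    return item


if __name__ == "__main__":
    ET.register_namespace("dc", DC_NS)  # serialise with the usual dc: prefix
    print(ET.tostring(to_dublin_core(WORK), encoding="unicode"))
    print(ET.tostring(to_rss_item(WORK), encoding="unicode"))
```

The point of the sketch is that the work is described once and each output format is a separate, small wrapper around the same record.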

A SELECTION OF PUBLISHING SCHEMAS AND STANDARDS
- AGLS: Australian Government Locator Service - http://www.naa.gov.au/recordkeeping/gov_online/agls/summary.html
- Biblink: European Commission National Libraries Electronic Publications Initiative - http://hosted.ukoln.ac.uk/biblink/
- DTB: Digital Talking Book - http://www.daisy.org/about_us/dtbooks.asp
- DocBook: Schema for Writing Structured Documents - http://www.oasis-open.org/docbook/
- Dublin Core: The Dublin Core Metadata Initiative - http://dublincore.org/
- EAD: Encoded Archival Description Language - http://lcweb.loc.gov/ead/tglib/
- EdNA: Education Network Australia Metadata Standard - http://www.edna.edu.au/metadata/
- EML: Educational Modelling Language/IMS Learning Design Specification - http://www.imsproject.org/learningdesign/index.cfm
- iCal: Internet Calendaring and Scheduling Core Object Specification - http://www.w3.org/2002/12/cal/rfc2445
- IMS/SCORM: Instructional Management Systems/Shareable Content Object Reference Model - http://www.imsproject.org
- Indecs: Interoperability of Data in E-Commerce Systems - http://www.indecs.org/
- JDF: Job Definition Format - http://www.cip4.org/
- MARC: Machine Readable Catalog - http://lcweb.loc.gov/marc
- ODRL: Open Digital Rights Language - http://odrl.net/
- ONIX: Online Information Exchange - http://www.editeur.org/onix.html
- OEB: Open e-book - http://www.openebook.org
- RSS: Really Simple Syndication - http://blogs.law.harvard.edu/tech/rss
- TEI: Text Encoding Initiative - http://www.tei-c.org/
- UKNC: UK National Curriculum Metadata Standard - http://www.nc.uk.net/metadata/
- XHTML: Extensible Hypertext Markup Language - http://www.w3.org/tr/xhtml1/
- XrML: Extensible Rights Markup Language - http://www.xrml.org

CGCreator is a Microsoft Word template in which:
- a creator writes in Word; then
- saves to CGML/XML; thus allowing
- print, web, audio rendering alternatives; and
- metadata alternatives for electronic libraries and repositories, learning management systems, electronic bookstores etc.

THE POWER TO SHARE

CGExchange allows producers and consumers of knowledge to share information in new and exciting ways. Knowledge-producing communities, such as academic, government and corporate institutions, have seized upon the web as the ideal forum for the exchange of knowledge. While informal dissemination of content is now easily achieved through website creation tools and blogging, users often find they are meeting many of the same obstacles that beset the traditional publishing industry. Indeed, the advent of new technology even accentuates the traditional challenges: content creation, copyright protection, format conversion, resource discovery and metadata exchange. The digital revolution has heralded a plethora of relatively incompatible document formats and cataloguing standards: Microsoft Word, Adobe Acrobat, HTML and DocBook, to name just a few.

CGExchange is a web services platform that uses CGML technology to marry various standard and proprietary formats. Technically, CGML is an interlanguage: a language used to describe and translate other languages. It is oriented towards publishing, and utilises the many XML-based languages that have emerged in publishing. These include both document and cataloguing formats, as well as a number of other languages describing copyright, printing and commercial transactions. Building a database to house the many terms and relationships of the publishing industry is difficult; building a definitive set of mappings between terms from different publishing standards is exceedingly complex. This is the foundational technology which makes CGML and CGExchange web services unique.

CGExchange solves these problems by providing:
- web services to perform format conversion, resource discovery and metadata exchange.
- CGCreator word processing templates to facilitate content creation and copyright protection.
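
One way to picture the interlanguage idea described above is a hub-and-spoke mapping table: each standard's term is mapped once to a shared hub term, and translation between any two standards goes through the hub. The sketch below is an assumption about the general technique, not the patent-pending CGML mechanism. The hub term names and the `translate` function are hypothetical; the Dublin Core element names and the MARC field/subfield codes (100$a, 245$a, 260$b) are taken from those standards.

```python
# Hypothetical hub terms (left) mapped to the tag each standard uses.
HUB = {
    "author": {"dublin_core": "creator", "marc": "100$a"},
    "title": {"dublin_core": "title", "marc": "245$a"},
    "publisher": {"dublin_core": "publisher", "marc": "260$b"},
}


def translate(tag, source, target):
    """Translate a tag from one standard to another via the hub term.

    Returns None when the tag is unknown or has no equivalent in the target.
    """
    for tags in HUB.values():
        if tags.get(source) == tag:
            return tags.get(target)
    return None


def pairwise_mappings(standards):
    """Number of direct standard-to-standard mappings a hub avoids."""
    return standards * (standards - 1) // 2


if __name__ == "__main__":
    print(translate("creator", "dublin_core", "marc"))
    print(translate("245$a", "marc", "dublin_core"))
    print(pairwise_mappings(21))
```

The design choice this illustrates is one of scale: with a hub, each additional standard needs a single mapping to the hub terms, whereas direct translation between every pair of the 21 standards listed earlier would need 210 separate mappings.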

There are many tools and technologies that perform one or two of these functions: tools for converting Word to PDF, for enforcing copyright and so on. CGCreator and CGExchange harness the power of our CGML technology to do it all. CGExchange is not a workflow application; it integrates tightly with CGPublisher, our publishing automation tool, as well as many other standards-based content management systems. It is a lightweight content repository, designed to search more efficiently and effectively than current search algorithms.

USES OF CGEXCHANGE

CGExchange can be used by both producers and consumers of published works. The following scenarios are just some of the many ways in which CGExchange can cut the cost and time of digital publishing.

INDIVIDUAL AUTHOR/READER SCENARIO
1. An author develops a book, using our ready-made or customised CGCreator templates to build the structure of the book: parts, chapters, sections, paragraphs and so on. As part of the development, the author enters key metadata terms, such as the author name, book title and copyright information.
2. The author posts the book's contents to the CGExchange web service, which captures the data and the metadata alike.
3. The author can then choose to submit the book to various online bookstores (such as Amazon, via a publisher who has an Amazon account, or by establishing a publisher account with Amazon) or library catalogues (such as any Online Public Access Catalogue system). The correct format for the submission is automatically generated by CGExchange.
4. A potential reader of the book can search for the book using structured search facilities provided by CGExchange, or via any of the repositories the book has been submitted to (such as Amazon's internal search engine).
5. The reader can also specify the format of the book, such as PDF, HTML, Open e-book, MS Reader and so on.
6. The digital copyright, as specified by the author, is supplied along with the book.
If the book's format permits, the book's copyright will subsequently be enforced by the reader's rendering device (allowing only a certain portion of the book to be read, for instance).

ORGANISATION SCENARIO
1. A university department with a large repository of research papers residing in a content management system registers the documents with CGExchange.
2. The department decides to generate MARC records for each of the papers, to record in the university's library database. A set of MARC records is generated by CGExchange, based on the metadata resident in the content management system. The data is checked by the librarian for consistency, and then imported.
3. The department decides to make the research papers available on its intranet. As the papers are available in source format as DocBook, TEI, Microsoft Word and OpenOffice, CGExchange is able to generate outputs for a variety of reading devices, as PDF, HTML, MS Reader or Open e-book files.
4. Some of the papers are published as books by the university's press. The department decides to list the books with Amazon for sale. CGExchange generates a set of ONIX records, which are then sent to online and physical bookstores and book data registries for inclusion in their databases.

SCHOOL SCENARIO
1. A school wants to create a knowledge bank of lesson plans and learning resources that is accessible to teachers across the school, learners and the school community.
2. Teacher-authors use CGCreator templates and save to CGExchange.
3. The school then decides to make this curriculum material available as learning objects in its Learning Management System (e.g. through the IMS/SCORM standard), or in its electronic library (e.g. through the MARC standard), or even to sell the curriculum resources through conventional or online bookstores (e.g. through the ONIX standard).

FEATURES OF CGEXCHANGE
- Imports and exports data in more than a dozen publishing standards.
- Uses Microsoft Word templates and add-ins to simplify the creation of structured content.
- Provides for efficient structured retrieval of documents (e.g. by author, title, publisher, copyright date) whether stored in CGExchange or not.
- Direct support for Microsoft Word 2003 XML format allows for conversion from Word to other common formats, such as HTML, PDF, DocBook and TEI.
- Offers a detailed dictionary of terms and standards supported.
- Allows for either lax or rigorous interpretation of standards (e.g. can support both HTML 4.0 and XHTML 1.1).
- Non-intrusive: can co-exist with content management systems or casual document repositories (file systems).
- Cross-platform and standards-based: uses XML 1.1, XSLT 2.0, XPath 2.0 and XQuery 1.0, as well as open publishing standards.
- Highly scalable: can process documents individually or in batches.
- Fully customisable rendering formats: can use style-sheets to specify formatting and layout requirements.

BENEFITS OF CGEXCHANGE
- Saves significant costs in either development time or tool procurement.
- Saves costs, as charges apply on a usage model at a sliding scale.
- Ensures documents are open and accessible, not locked into proprietary formats or repositories.
- Reduces time spent on hand-coded document conversion.
- Increases sales through simplified access to online bookstores, and through support for multiple rendering formats.
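
The structured retrieval listed under FEATURES OF CGEXCHANGE (searching by author, title, publisher or copyright date) can be sketched as a field-scoped filter over metadata records. This is an illustrative assumption, not the CGExchange search interface: the `Record` type, the sample catalogue and the `search` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical metadata record with the four fields named in the text."""
    author: str
    title: str
    publisher: str
    copyright_year: int


# Made-up sample data, for illustration only.
CATALOGUE = [
    Record("Jane Example", "Markup in Practice", "Sample Books", 2005),
    Record("John Sample", "An Example of Publishing", "Example Press", 2004),
    Record("Jane Example", "Archives and the Web", "Example Press", 2003),
]


def search(records, **criteria):
    """Return the records whose named fields satisfy every criterion.

    String fields match on a case-insensitive substring; other fields
    must be equal to the requested value.
    """
    def matches(record):
        for field_name, wanted in criteria.items():
            value = getattr(record, field_name)
            if isinstance(value, str):
                if wanted.lower() not in value.lower():
                    return False
            elif value != wanted:
                return False
        return True

    return [record for record in records if matches(record)]


if __name__ == "__main__":
    for hit in search(CATALOGUE, author="example"):
        print(hit.title)
```

Searching with `author="example"` returns only the two works by Jane Example, whereas an unscoped text search for the same word would also return the third record, where "Example" appears in the title and publisher name.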

In the digital age, you need more than an ordinary dictionary, and such an extraordinary dictionary has been created to accompany CGML. Already running to some 1200 terms (each of the CGML XML tags) and 65,000 words of definitions, the Common Ground Dictionary of Creativity and Publishing (CGDictionary) is designed to cover all key concepts in both the traditional and digital worlds of publishing. Ordinary dictionaries capture the range of meanings that fit with a particular word, and all the ambiguities and multiple meanings that characterise everyday, natural language.

CGDictionary:
- Reduces ambiguity as much as possible, making clear definitional distinctions where natural language leaves room for uncertainty, such as between the various meanings of the word "editor": the different roles of an editor when they are somebody who puts together an anthology, when they are a commissioning editor who publishes works, or when they are a copy editor who corrects text.
- Improves data entry, both at the metadata and file levels, by providing a clear guide as to what is meant by each field, and what will produce valid data across all the standards with which CGML interoperates.
- Uses clear, commonsense language even though the equivalent tags in the various standards may be obscure, thus making the standards and file formats more accessible.
- Is extensively hyperlinked, with tightly inter-related and cross-defined semantics, so that every term is defined in relation to higher-order, more abstract or more general concepts, with examples provided in the form of lower-order instances of that term at work.
- Is not built on isolated words, but on semantic units expressed as words or phrases which represent a particular meaning-making function. In this sense, the dictionary also serves as a functional grammar, and one that is equally applicable to the worlds of traditional textual and digital meaning.
- Consists of concepts that are designed to stand the test of time. They are derived from what are emerging as the dominant electronic standards, and these standards, in turn, are built on centuries of text-working traditions.

CGDictionary is developed and stored in the ontology-building tool, CGLexicographer. This tool is:
- Dynamic: new schemas and standards can easily be absorbed, as and when they emerge. CGML is designed to survive technology and standards transitions.
- Growing: new concepts or tags are progressively being added, particularly as CGML broadens its scope across a variety of electronic media.
- Adaptable: highly flexible, with its application-specific paraphrase space, so that interfaces can speak in the languages of specialist domains, such as book publishing, learning environments, conference management systems, and any number of other communities whose interest is the development and distribution of content. With its Unicode foundation, it can easily be translated into languages other than English and scripts other than Roman.
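
The cross-defined structure described above, in which each term points to a more general concept and is illustrated by more specific instances, can be sketched as a small data structure. This is not the CGDictionary schema or CGLexicographer: the `Term` type, the `contributor` parent term and its definition are hypothetical, and only the three editor roles are taken from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Term:
    """Hypothetical dictionary entry linked to one broader concept."""
    name: str
    definition: str
    broader: Optional[str] = None


# Illustrative entries; "contributor" is an assumed parent concept.
TERMS = {
    term.name: term
    for term in (
        Term("contributor", "A person who takes part in producing a work."),
        Term("anthology editor", "An editor who puts together an anthology.", "contributor"),
        Term("commissioning editor", "An editor who publishes works.", "contributor"),
        Term("copy editor", "An editor who corrects text.", "contributor"),
    )
}


def narrower(name):
    """List the more specific terms defined under the given concept."""
    return sorted(t.name for t in TERMS.values() if t.broader == name)


def senses(word):
    """List the distinct terms that disambiguate an everyday word."""
    return sorted(t.name for t in TERMS.values() if word in t.name.split())


if __name__ == "__main__":
    print(senses("editor"))
    print(narrower("contributor"))
```

Here the single everyday word "editor" resolves to three separately defined terms, each of which can carry its own mapping to the standards.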

CGDictionary can be embedded in any content management software as the basis for standards-compliant metadata and data formation. One such application is CGPublisher, Common Ground's own online publishing software (see www.cgpublisher.com). Within CGPublisher:
- Meaning links display dictionary definitions.
- All person, publisher, work, workflow, rights and distribution metadata is progressively collected and collated in a form which is interoperable across the broad range of publishing standards and uses represented by CGML.

COMMON GROUND PUBLISHING PTY LTD
www.commongroundgroup.com

MELBOURNE
PO Box 463, Altona 3018, Victoria, Australia
Ph: +61 (0)3 9398 8000 Fax: +61 (0)3 9398 8088
Cnr Millers Rd & Esplanade, Seaholme

SYDNEY
PO Box K481 Haymarket, Sydney, NSW 2000
Ph: +61 (0)2 9519 0303 Fax: +61 (0)2 9519 2203

Email: info@commongroundgroup.com