THE GENERAL CONFERENCE
CONGRESS OF SOUTHEAST ASIAN LIBRARIANS (CONSAL) XVI
BANGKOK, THAILAND, 11-13 JUNE 2015
http://www.consalxvi.org

ASEAN Digital Library
A Regional Project on Digital Resource Sharing

Lee Mei Chen
Manager, National Library Office, National Library Board, 100 Victoria Street, #07-02, Singapore 188064, Lee_Mei_Chen@nlb.gov.sg

Sim Ju Wei
Librarian, Sim_Ju_Wei@nlb.gov.sg

Fauziah Ibrahim
Senior Librarian, Fauziah_Ibrahim@nlb.gov.sg

Resource Discovery, National Library Board, Library Supply Centre, 3 Changi South Street 2, #02-00, Xilin Districentre Building B, Singapore 486548

ABSTRACT

National Libraries from the various countries in ASEAN hold a great wealth of library resources that span the spectrum of formats: books, papers and manuscripts, maps, photographs, paintings and drawings, audio and video recordings, ephemera, and newspapers. All of these can be digitised for easy access, and much of this digitisation work has already been done. The ASEAN Digital Library (ADL) is a regional effort to draw together these national databases so that the repositories can be seamlessly searched and accessed. As a single search facility across the digitised content of the various ASEAN National Libraries, ADL will be a boon to users and researchers. ADL will also provide global exposure and visibility for content from ASEAN, which is currently under-represented. ADL aims to foster greater awareness of the diverse cultures and heritage in this region. In addition, ADL can effectively engage citizens of ASEAN nations in sharing and learning about their cultures, heritage and shared history, thus nurturing an appreciation of the ASEAN Identity. This paper shares the current developments of the ADL project, highlighting the metadata record requirements, the partners' digital collections and the technology used.

Keywords: ASEAN National Libraries, metadata, harvesting, digital resource sharing.
1. Introduction

National Libraries from across the Southeast Asian region hold a great wealth of library resources that span the spectrum of formats: books, papers and manuscripts, maps, photographs, paintings and drawings, audio and video recordings, ephemera, and newspapers. All of these can be digitised for easy access, and much of this digitisation work has already been done. What remains, however, is the effort to draw together these national digital resources so that the repositories can be seamlessly searched and accessed from a single point. As a start, the ASEAN Digital Library (ADL) initiative has begun aggregating ASEAN digitised content on the prototype ADL platform, offering users a single search facility for the rich cultural resources of all ASEAN National Libraries.

2. Background

The idea of an ASEAN Digital Library first came about when the National Library of New Zealand (NLNZ) approached the National Library Board (NLB) of Singapore with the vision of a digital resource sharing platform for the whole of Asia Pacific. NLB will act as the content aggregator for the ASEAN region and build the ASEAN Digital Library (ADL), which could contribute to the larger digital platform for Asia Pacific, to be led by NLNZ.

3. Objectives

The main objectives of providing discovery of and access to information via the ADL platform are as follows:

a. To provide global exposure and visibility of content from the ASEAN region. Knowledge in the ASEAN region is diverse with respect to culture and language, and is not adequately exposed and represented for discovery on other global and regional platforms. ADL aims to address this lack of representation and foster greater awareness of the diverse cultures and heritage in this region.

b. To nurture appreciation of ASEAN Identity. ADL can effectively engage citizens of ASEAN nations and their dialogue partners in sharing and learning about their cultures, heritage and shared history.

c.
Professional expertise among member institutions in the development of digital content for discovery and use will be strengthened. Standards and best practices meeting the unique requirements of diverse content (e.g., ontologies specific to Asian knowledge concepts and linguistic translations) will also be widely shared.
4. Full ASEAN Participation

At the CONSAL meeting held in May 2012, NLB proposed the launch of a regional initiative to connect collections across the Asia Pacific region. Following that meeting, Ms Ngian Lek Choh, former Director of the National Library of Singapore, invited all National Libraries in the ASEAN region to join the pilot. Since then, all the ASEAN National Libraries have agreed to take part. With US$40,000 in funding from the ASEAN Committee for Culture and Information (COCI) for Year One of the project, the first Project Meeting & Training for ASEAN National Library Country Representatives was held in Singapore in February 2014.

5. First Regional Training & Meeting for National Libraries in ASEAN

ADL's first training and meeting was held from 10-14 February 2014 in Singapore. During the first two days, NLB's technical and metadata staff were trained by NLNZ's DigitalNZ Team. From 12-14 February, the National Library of Singapore welcomed our ASEAN National Library colleagues, who took part in training, discussions and workshops facilitated by the NLNZ and NLB teams. The ASEAN National Library representatives shared their digital library collections and future development plans at the meeting. A key outcome was the commitment from the ASEAN National Libraries to collaborate at the regional level by sharing and contributing the metadata records of their rich digital resources to the ADL platform.

Presentation by Mr Wiratna Tritawirasta, Head of Automation Department, National Library of Indonesia (NLI), on NLI's online digitised collections.

Presentation by Mr Edguardo Quiros, Chief Information Officer of the National Library of the Philippines, on the ideas developed during a break-out group discussion.
A group photo of the ASEAN NL Representatives with the trainers from NLNZ (Mr Andy Neale and Dr Chris McDowell) and the Director of the National Library (Gene Tan).

The table below summarises some of the possible digitised collections from ASEAN National Libraries for ADL:

ASEAN National Library: Digitised Collections for ADL
NL Cambodia: Manuscripts; Rare Material; Photos
Dewan Bahasa dan Pustaka: Manuscripts; Newspapers; Books on Brunei; Posters & maps
NL Indonesia: Indonesia Presidential Library Materials; Temple literature website; Indonesian Cinematheque; Collection of drawings of Johannes Rach
NL Laos: Lao Manuscripts Texts & Images
NL Malaysia: NLM Internal Publications; Malay Manuscripts; Rare Books
NL Philippines: Otley Beyer Ethnographic Collection; Official Gazette; Historical Data; Rizaliana Collection; Picture Collections
NL Singapore: MusicSG; PictureSG; NORA; BookSG
NL Thailand: Damrong Rajanubhab Collection; Rare books; Manuscripts; Thai traditional books; Palm leaves; Sound and audiovisual recordings
NL Vietnam: Indochina Books; Sino-Nom books; English books about Vietnam; Newspapers, Magazines, Journals Collections

6. Roles and responsibilities

a. Content Partners' Responsibilities
- Agreeing to the terms of use for a non-exclusive and royalty-free license
- Ensuring that Content Partners have all the relevant rights or permissions to contribute the metadata
- Ensuring that the metadata contributed does not contain or install viruses, malware etc., is not defamatory, does not violate the privacy or rights of third parties and is not unlawful in any way

b. NLB Singapore's Responsibilities
- Providing clear attribution of the items (displaying the source and the URL of each item)
- Promptly removing a Content Partner's metadata from the ADL website upon request
- Giving 30 days' notice of any changes to the Metadata Contribution Terms

7. ADL Workflow

Upon agreeing to contribute their content to ADL, Content Partners are provided with a Step-by-Step guide to assist them in their metadata preparation. The Step-by-Step guide outlines the mandatory metadata fields for each submission. The following fields are the minimum required from participants when submitting their metadata for ADL:

Title: A name given to the resource. For non-English metadata, it is recommended to label the field clearly as Title in English. The value can be in the original script.
Description: Descriptive information about the resource, e.g. summary, abstract, etc.
Thumbnail URL: A URL resolving to a thumbnail of the content for display in the search results list.
Landing URL: An HTTP URL resolving to a landing page (webpage) for the content.

The following additional fields are also desired if they can be added to the metadata for ADL:

Creator: The name of the original creator, photographer, etc.
Date: The date of creation, e.g. when the photograph was taken, in whatever date format is available.
Collection Name: Name of the collection(s) that the resource belongs to.
Subject: The subject of the resource, if available.
Rights: Rights statement or any other information about rights held for the resource.
Unique Identifier: A unique number or reference for each record, which might come from the partner's collection management system.

As the ADL platform works as a gateway, the mandatory fields are necessary to link the search back to the source held at the Content Partners' portals. Although full metadata is preferred, NLB understands the circumstances and limitations that are prevalent in the ASEAN region, and will still accept content with minimal metadata as long as there is at least a title and a content URL.

NLB works very closely with content partners:
a) To gather and evaluate metadata for the identified collection contents.
b) To process content partners' metadata with the NLB harvester and parser.
c) To verify that harvested data are properly parsed and ingested into the database.
d) To ensure content partners' metadata are displayed correctly on the ADL website.

Figure 1: Workflow for metadata submission & harvesting of the ADL project
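The field requirements above can be illustrated with a minimal sketch of a record check. This is not the NLB harvester's actual logic: the dictionary keys, the function names and the three status labels are assumptions made for illustration, and the "content URL" of the minimum rule is assumed here to be the landing URL.

```python
from urllib.parse import urlparse

# Hypothetical keys for the fields listed in the Step-by-Step guide;
# the real ADL schema may name them differently.
MANDATORY_FIELDS = ("title", "description", "thumbnail_url", "landing_url")
OPTIONAL_FIELDS = ("creator", "date", "collection_name", "subject",
                   "rights", "identifier")


def is_http_url(value):
    """Return True if value looks like an absolute HTTP(S) URL."""
    parsed = urlparse(value or "")
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def check_record(record):
    """Classify a record as 'complete', 'minimal' or 'rejected'.

    'minimal' reflects the stated fallback: a title plus a content URL
    (assumed to be the landing URL) is enough for acceptance.
    Returns the status and the list of fields that are absent or empty.
    """
    def present(field):
        return bool(str(record.get(field) or "").strip())

    missing = [f for f in MANDATORY_FIELDS + OPTIONAL_FIELDS if not present(f)]
    has_minimum = present("title") and is_http_url(record.get("landing_url"))
    if not has_minimum:
        status = "rejected"
    elif all(present(f) for f in MANDATORY_FIELDS):
        status = "complete"
    else:
        status = "minimal"
    return status, missing
```

A record carrying only a title and a landing URL would be classified as 'minimal', and the returned list of missing fields could be sent back to the Content Partner as a prompt for later enrichment.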
Cambodia, Laos, Malaysia, Philippines, Thailand and Vietnam are the first six Content Partners that have identified their collections for contribution to the ADL. Content Partners are to submit an XML file containing all the records from their identified collections. Apart from XML file submission, NLB also explores alternatives such as web harvesting. The following are the alternatives adopted by NLB in its effort to harvest data:

XML Sitemap: A file that lists URLs for a site along with metadata about each URL (when it was last updated, how often it usually changes, and how important it is relative to other URLs in the site). NLB was able to find a sitemap on the NL Laos Digital Manuscript Collection website.

OAI-PMH: The Open Archives Initiative Protocol for Metadata Harvesting is the most efficient harvest method because it can be scheduled to run regularly and automatically pick up only new or modified records. OAI-PMH is aimed at institutional repositories containing digital content. NL Vietnam provided NLB with its OAI-PMH feed address, which makes it much easier for NLB to retrieve the metadata from their identified collection.

Web Scraping: Content Partners identify collections and provide NLB with the URL links of the identified contents. NLB uses this information to extract the required metadata from the participants' web pages. However, this option is adopted only if Content Partners do not have any of the above options available to them. If data is harvested through this method, NLB relies on the Content Partners to inform NLB consistently whenever their site changes and the harvest needs updating. This method is currently used to harvest data from NL Thailand.

Prior to harvesting the data, NLB evaluates each collection's content data and checks:
a) Whether the mandatory and optional fields are present.
b) Whether other additional fields are present.
c) The quality of the metadata, for example the use of LCSH vocabulary.
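As a rough illustration of the first two harvesting options, the sketch below parses a sitemap and an OAI-PMH ListRecords response using only the Python standard library. It is not NLB's harvester: the function names are hypothetical, fetching over the network is left out, and resumption tokens and sitemap index files are not handled. The `ListRecords` verb, the `metadataPrefix` and `from` arguments and the XML namespaces come from the sitemap and OAI-PMH specifications.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
OAI_NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def parse_sitemap(xml_text):
    """Return (url, lastmod) pairs from a sitemap <urlset> document."""
    root = ET.fromstring(xml_text)
    pages = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=SITEMAP_NS)
        pages.append((loc, lastmod))
    return pages


def build_list_records_url(base_url, since=None, metadata_prefix="oai_dc"):
    """Build a ListRecords request; 'since' limits it to new or changed records."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if since:
        params["from"] = since  # selective harvesting by datestamp
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Extract identifier, datestamp and titles from a ListRecords response.

    Records flagged as deleted in their header are skipped.
    """
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", OAI_NS):
        header = rec.find("oai:header", OAI_NS)
        if header is None or header.get("status") == "deleted":
            continue
        records.append({
            "identifier": header.findtext("oai:identifier", default="",
                                          namespaces=OAI_NS),
            "datestamp": header.findtext("oai:datestamp", default="",
                                         namespaces=OAI_NS),
            "title": [e.text for e in rec.iterfind(".//dc:title", OAI_NS)],
        })
    return records
```

Passing the date of the previous harvest as `since` is what allows a scheduled OAI-PMH run to pick up only new or modified records, whereas a sitemap offers at most a `lastmod` hint per page and web scraping offers no change signal at all.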
While evaluating the metadata, NLB prepares a custom crawling setup, a parsing instruction and a metadata mapping table for each collection's content. This table maps the metadata from each partner's collection to the NLB Application Profile elements (based on Dublin Core). It is then used during the harvesting and parsing of the metadata, before the data is finally ready for ingestion into the database.
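The role of the mapping table can be sketched as a simple lookup from a partner's field names to target elements. The source field names and target element names below are invented for illustration; they are not taken from any partner's data or from the NLB Application Profile.

```python
# Hypothetical mapping table for one collection:
# partner field name -> target element (Dublin Core style names assumed).
SAMPLE_MAPPING = {
    "item_name": "title",
    "summary": "description",
    "thumb": "thumbnail_url",
    "link": "landing_url",
    "author": "creator",
}


def apply_mapping(source_record, mapping):
    """Rename partner fields to target elements.

    Values are collected in lists because several source fields may map
    to the same target element. Fields without a mapping are returned
    separately so that they can be reviewed rather than silently dropped.
    """
    mapped, unmapped = {}, {}
    for field, value in source_record.items():
        target = mapping.get(field)
        if target is None:
            unmapped[field] = value
        else:
            mapped.setdefault(target, []).append(value)
    return mapped, unmapped
```

Keeping one such table per collection means the harvesting and parsing code can stay generic while the partner-specific knowledge lives in data that metadata staff can review.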
Table 2: Sample of mapping table for a collection from a content partner

8. Challenges and Lessons Learned

NLB is still in the process of gathering requirements for the ADL interface features, while preparing to harvest the data from the content partners. Among the challenges with the metadata received so far are missing metadata elements and the varying encodings used across Content Partners' data. Although 10 metadata fields are requested for submission, many Content Partners provide only the basic fields, without subject or description. While this is understandable and acceptable, the resulting records appear bare and brief when displayed on the portal.

Another challenge is the language variance across ASEAN. Some of the metadata is recorded in local languages rather than English, which greatly limits the service that ADL can provide. Nevertheless, the language barrier aside, ADL provides a great opportunity to many users and researchers, as the contents contributed by the Partners are of high research value to parties with an interest in South East Asia. NLB is confident that Content Partners will, when the opportunity arises, enhance the metadata of these valuable resources and identify more of their valuable collections to be added to the ADL. This is especially the case for Content Partners that have interesting contents without metadata, or that have their contents available only within their intranet platform.

With the support of the ASEAN Content Partners and the initial collaboration with the NLNZ team, NLB has been able to kick-start this initiative. However, it remains a challenge to secure the Content Partners' commitment to share their metadata on a regular and consistent basis. This is probably due to the different priorities and limitations some Content Partners may have in sustaining the submission of their metadata.
NLB recognises the great collaborative effort of the ASEAN Content Partners and looks forward to receiving more interesting contents and richer metadata from the other ASEAN countries in the second phase. ADL has great potential to showcase and publicise the rich treasures in this region, and to bring additional vibrancy to the digital library communities.

9. Conclusion

In the coming months, NLB will continue developing the ADL website, and will speed up the gathering, uploading and testing of the metadata with our ASEAN National Library colleagues. The target is for the pilot ADL platform to be ready by end 2015. The pilot launch will showcase the digitised collections from six ASEAN National Libraries, with the remaining four national libraries coming on board in phase two. NLB will also be organising the second Regional Meeting for ASEAN National Libraries in conjunction with the pilot website demonstration in 2015. NLB looks forward to the continued support of ASEAN National Libraries in creating a unified search of ASEAN digital collections, which showcases ASEAN content and contributes towards greater knowledge exchange and cultural understanding amongst ASEAN countries and with users beyond the ASEAN region.

REFERENCES

1. Khor Su Min, Lee Mei Chen and Andy Neale (July 2014). National Libraries Asia-Pacific [Newsletter article]. Retrieved from http://www.ndl.go.jp/en/cdnlao/newsletter/080/807.html
2. Gregory, Lisa and Williams, Stephanie (2014). Retrieved from http://www.dlib.org/dlib/july14/gregory/07gregory.html
3. Andy Neale (2013). Asia-Pacifica Service Concept.