Reinventing Archival Methods

Presentation for Roundtable event in honour of Hans Hofman, National Archives of the Netherlands, The Hague, January

Cassie Findlay

This paper is based on one of the same name prepared and delivered at the Australian Society of Archivists conference in 2013 with Kate Cumming, a fellow founder of the Recordkeeping Roundtable.

In 1986 David Bearman first argued that the core methods of the archival profession (appraisal, description, preservation and access) were fundamentally unable to cope with the volumes of information that they were required to process. He called on the profession to completely reinvent its core methods. [1] While much has been done in the intervening 25 years, our archival methods are still ill-equipped to deal with the volume, fragility and complexity of contemporary archival records.

Inspired by Bearman, in November 2012 the Sydney-based discussion group, the Recordkeeping Roundtable, hosted a workshop called 'Reinventing Archival Methods'. [2] At the workshop participants shared concerns that archival professional methods are not coping with the scale and complexity of contemporary recordkeeping challenges and that they are failing at a time of critical risk. Participants explored how, as a profession, we can fundamentally reassess our methods and create a stable archival record of the 21st century. Many of the ideas discussed at the workshop have been distilled into two issues papers developed by the Recordkeeping Roundtable ('Appraisal', by Kate Cumming and Anne Picot, and 'Access', by Barbara Reed) that examine the archival methods of access and appraisal. [3]

Following on from that work and the discussions flowing from it, today I would like to talk about some of the things that I think we as a profession should stop doing, and what I believe we should be doing more of, in order to explore some strategies for responding to the extensive challenges posed by contemporary digital information and for ensuring the creation of a robust and useful archival record.

But first, setting the scene. What is the contemporary business landscape and how is information being managed in it? How are records being made, kept and used, and are these methods compatible with the real world? A world characterised by:

- diverse business frameworks comprising multiple transactional systems and distributed business processes, which are breaking down and fragmenting the formerly consolidated organisational perspectives on projects, programs, clients or transactions
- commodification and commercialisation of information within third-party frameworks, usually in the cloud
- a user base that expects instant access online, that does not understand why more contemporary evidence is not available, and that has a growing sense that if it isn't online, it may as well not exist
- data growth that is expanding exponentially and, like any growth bubble (for example, property or technology), is unsustainable. In response to this trend, David Rosenthal of Stanford University has analysed the costs of storing all of today's data in the cloud. Based on industry figures he estimates that keeping 2011's data would consume 14% of the gross world product. He extrapolates from current rates of data growth to estimate that by 2020, the cost of maintaining all the data created in 2020 would be 100% of the gross world product. [4] We cannot keep all the information currently being created, yet there are few plans for its strategic management
- and information consumers who are not going to be satisfied with passive viewing but who wish to engage with and add their own meanings and stories to records, to access contemporary records and otherwise to participate in the recordkeeping systems of the organisations they interact with.

And our methods are simply not up to the task.

Let's consider some of the evidence of how well our methods are operating in this environment. Recent recordkeeping surveys of the public sectors in the Australian States of Queensland and New South Wales, and a series of workshops run by Archives New Zealand, indicate that virtually no disposal of digital records is occurring in these sectors. [5] This means that records of continued business value are not proactively identified for ongoing management, time-expired data is not routinely destroyed and large data volumes are being carried forward, which is an unsustainable business and financial practice. Not all digital data can be retained indefinitely through digital system transitions, and so the risk of no systematic disposal is that swathes of digital data, irrespective of business value, will be subject to wholesale destruction or purging at system migration or replacement.

We have placed too much emphasis on EDRMS-based recordkeeping. UK-based records consultant James Lappin, who runs the Thinking Records blog, recently assessed a very large multi-national company with a very strong records management program. He found that the corporate EDRMS occupied only 4% of the organisation's total storage environment. 33% of corporate storage was taken up by business applications and a staggering 63% by corporate email. [6]

[1] David Bearman, Archival Methods, Archives and Museum Informatics Technical Report no. 9, Pittsburgh, Archives and Museum Informatics, 1989, accessible via
[2] Recordkeeping Roundtable, Reinventing Archival Methods: Report and what's next, December 2012, accessible via
[3] Recordkeeping Roundtable, Reinventing Archival Methods: Issues papers - Access and Appraisal, September 2013, accessible via
[4] David Rosenthal, Let's Just Keep Everything Forever in the Cloud, May 2012, accessible via
[5] See Queensland State Archives, 2013 Report on the Recordkeeping Survey of Queensland Public Authorities, accessible via eport2013.pdf and State Records NSW, Digital recordkeeping survey report, accessible via https:// and Archives New Zealand, Summary of Archives New Zealand's Digital Disposal Workshops, October 2013 (unpublished report).
[6] James Lappin, Why NARA has no option but to preserve significant email accounts, August 2013, accessible via
In their current application, EDRMS do not provide the corporate access and management frameworks we expect them to. Due to their low levels of use, and the generally administrative and documentary nature of their content, they also do not broadly mitigate information risk, yet they are where the majority of us devote our attention.

In many organisations, key business system migrations are frequently left to contract project teams working to a deadline and a strict budget, with no reference to carefully developed disposal authorities. Records, including metadata, are left behind and lost, or large swathes are unnecessarily carried forward. Appraisal requirements will never be met in these fragile business environments.

The real work is taking place using different tools. First and foremost, it takes place in core business systems: the custom-built or off-the-shelf systems that register businesses, track environmental change indicators, and manage people or large infrastructure projects. Document creation and sharing happens in wikis or whatever tool a work team decides best fits its needs.

In his recent article, 'Developing a legal risk model for big volumes of unstructured data', Joe Montana, a US-based consultant, gives examples of the completely unregulated data growth that is occurring in corporations today. He says: 'In our work, we have seen a 2,000-employee company with 20,000 active SharePoint sites, and a 10,000-employee company with more than 50,000 sites. These sites are usually loosely regulated, if at all, and companies not only can't quantify the number, they have no insight into their contents.' [7]

And yet, despite having the potential to be a continuously updated key intelligence source on business risk, appraisal is not used as a management or risk abatement strategy to ensure core information continues to be created and maintained in commercially based and widely distributed business frameworks, or to ensure that information that is no longer required can routinely be let go.
Complex, organisation-wide systems are deployed without an awareness of ongoing information needs, requirements and risks. These critical risks, which threaten information accessibility and integrity now and through time, are not broadly understood within business environments and therefore little is done to mitigate them.

We have also lost sight of the primacy of access to our mission. Access provision, and the ability of the widest possible audience to use records, is curtailed by the mess that is the intersection of a range of access laws which, in most jurisdictions, are not resulting in sensible proactive release of records that are not sensitive or personal in nature. Systems for keeping records are closed: they are not open, either in interoperability terms or in policy, to the participation of people and groups who are part of the business but perhaps not the official agency responsible.

People and groups are going about the business of building recordkeeping systems for themselves because they are not getting the tools and advice that would be helpful to them from the recordkeeping community. Examples range from independent journalists creating a database of primary evidence, to the building of recordkeeping systems for core government business, to Jason Scott of Archive Team recovering at-risk and disappearing web content like GeoCities.

[7] Joe Montana, Developing a Legal Risk Model for Big Volumes of Unstructured Data, August 2013, accessible via
And still, as archivists and recordkeeping professionals, our approach to all these information risks is to continue to try to apply techniques of the small-data, centralised, controlled paper world to the digital universe.

In his 1994 article, 'Electronic records, paper minds', Terry Cook argued that we must shift our emphasis: from the physical records to the conceptual management, from providing a warehouse service to integrating all the business processes of their sponsor with redesigned recordkeeping systems. And archivists must shift from looking after physical objects to focusing on the functional context in which records-creating activities take place. If we are able to do so, Cook continues: then once again, like Thoth, the Egyptian god of records and archives, we may sit beside the pharaohs rather than in dismal records offices or quiet archival stacks. [8]

So, what should we stop doing? We should stop:

Putting things in systems

The notion that records people had to ensure that records were pulled from their business context and placed in a dedicated system where we could do stuff with them has done us a great disservice. As Andrew Waugh noted at the Reinventing Archival Methods workshop: We need to go where the work is, not where an official conscious record is managed. We should turn our gaze to the natural, unconscious recordkeeping that is going on all around us. We should be about seeing across the business and the systems in our view, understanding the requirements and risks associated with these, and taking steps to ensure that, where the requirements of the business or its stakeholders warrant lots of additional context, auditability or monitoring, this occurs. We should do this using services, APIs and apps, not by yanking records into an artificially constructed world that strips them of their full potential for meaning, use and reuse.

The language of 'we need to put that into the system' also brings with it assumptions about place, custody and control.
Surely it's time we got serious about post-custodialism. We've been talking about it long enough! And not just in the digital business environment. Government archives are not going to be funded to build expensive purpose-built facilities into the future. Yet too often our language and practices imply (if not actually require) that stuff comes to us or that we are given things for safe keeping. Too many of our approaches are descendants of paper practices, to do with keeping things and moving them around (often with long periods of time elapsing in between), that were necessitated mainly by space and labour costs.

The notion of putting things into things also implies that there is an object to be put. This concept, again born of paper thinking, simply does not translate when you are dealing with digital records. These are dynamic creatures, formed of chunks of data linked to other data that can be reconstituted, reused and represented. Some of the chunks making up a record may exist in a completely different system, perhaps even one owned by a different organisation. However, together they form contextualised evidence of business.

And we should stop relying on the monopoly on access that comes with having custody of unique materials as our ace card and bargaining chip. Anthony Funnell, presenter of a weekly ABC Radio show called Future Tense, illustrated this point when he spoke at our workshop. In the past, when producing radio programs, he said, they would request information from the ABC Archives as part of their research. Now they can find the same content in a fraction of the time online. YouTube and Flickr are the public archives in the contemporary world; libraries and archives are no longer the first stop. They were not concerned about any questions of authenticity; their business requirement was for fast access. Nor was Anthony overly concerned about the sustainability of these resources. The vulnerability of these materials is less real than you might think, he argued, given the copies and distribution: they will survive because enough people want it to survive, based on the scale of copying and use.

So delivery of content from our keeping places on its own is not going to get us to the Pharaoh's side. And we are never going to do it better than Google. Sure, we have a monopoly on cool old stuff that no-one has ever seen before, but this is the stuff retrieved from shelves and discovered by an intrepid reader in our reading rooms. Once that has been digitised (maybe by the reader) and is available online, it's everyone's. Us delivering it is not important anymore. The mass digitisation of our collections by the likes of Ancestry has the same effect. It might seem a long way off, but this pattern will continue until it is simply not necessary for people to ask us for stuff. It will all (if open access) be out there.

[8] Terry Cook, Electronic Records, Paper Minds, first published in Archives and Manuscripts, November 1994, pp Republished in Archives & Social Studies: A Journal of Interdisciplinary Research, March 2007 and accessible online at
We should stop:

Banging on about disposal (and get back to appraisal)

Disposal has, in many practitioners' minds, come to mean destruction. The tables need to be turned: instead of making the conversation about 'we have to dispose of x, y and z after whatever period', we should be banging on about contextuality, access and risk management. Deliver the standards for evidence, the accessibility and the linkages that are needed for critical business information and let the rest take care of itself. It will anyway!

Plus, disposal activity as practised in many jurisdictions tends to make us unpopular through long, laborious processes of disposal authority or schedule development, which end up with a set of requirements that no-one seems to be able to apply effectively anyway. Or if they attempt to, it is necessarily a reactive process of applying these often very subjective decisions to records on a case-by-case basis. This was time-consuming in the paper world; it is impossible in the digital. Many archival authorities are expending huge amounts of time and resources on preparing disposal authorities that agencies are not implementing. The classes cannot easily be related to systems, and there are too many choices.

Years ago my colleague Tony Leviston suggested that at State Records we stop issuing disposal authorities that included retention requirements and leave destruction decisions to agencies, who know their business and its changing risk profile. He suggested that we focus our attention instead on coming to grips with the records required for 4th dimension purposes, their identification and management. We already see signs of this; the National Archives of Australia's rolled-up disposal authority, AFDA Express, will surely continue to evolve into such an approach.
But I believe this trend is not radical enough. We are expert in registering and tracking agencies, business functions and activities, but usually we know next to nothing about the recordkeeping systems that support these. An appraisal strategy that uses a risk-based approach to capturing information about these systems and their recordkeeping requirements, at the macro level, could do much to enable the protection of records that we need to keep while routinely allowing for the letting go of those that we do not.

So what should we be doing more of?

- Pushing back the boundaries of the archives
- Offering solutions, not problems

Pushing back the boundaries of the archives

This means pushing back against:

- jurisdictional, geographic and physical barriers
- disciplinary boundaries, and
- barriers of our own making

In Australia we increasingly see public entities set up that sit above State boundaries to deliver better services when geographical lines just create inefficiency, or we see clusters of organisations in government, non-government and the private sector, all performing part of the same business. Many areas of Australian life present themselves as in need of an archival strategy that does not respect traditional recordkeeping boundaries; for example, climate change, water sharing, the mining boom and asylum seekers. However, current appraisal and access frameworks are failing to connect with big cross-jurisdictional issues.

In this environment, traditional approaches to access and appraisal that are based on age, custodial thinking, cultural heritage or collection management are damaging and unsustainable. Today we need to identify the opportunities for connecting with information wherever it resides, not just when it is within a single recordkeeping system or the institutional walls of the archive.

In terms of crossing disciplinary boundaries, our ongoing existence is under threat unless we can demonstrate value in business environments.
Professionally, we need a clear understanding of what it is we can provide that is valuable. What is our value proposition? And how can we clearly and powerfully articulate it, and then follow through with the practical tools that will deliver on our promises?

We cannot resolve current information challenges alone. We will need to collaborate with many other professions and groups to ensure that necessary information is identified, protected and managed for as long as it is required, and that means breaking down professional barriers.

Recordkeeping people need to work with programmers and systems administrators. More of us should learn to code ourselves. With just a little bit of understanding of what is possible in building systems and playing with data, we can see the true range of possibilities that are open to us.

We need to work more with FOI and privacy people. Many jurisdictions are signing up to the Open Government Partnership and are promoting open government reforms. Key to these are effective FOI regimes. If we do not work to ensure archival access systems are connected with the provision of records via FOI, we will not be able to provide access to the level that society now expects, and we will see records unnecessarily closed for 30 years that should be freely available.

We need to work with our users, especially digital humanists. These are people who are hungry for the information in the records we manage and interested in creative new ways to search, present and reuse them. People like Tim Sherratt have long understood the value of partnering with this community and call on recordkeepers to listen to what they need: make our catalogue data open, and make it useable. Share our contextual knowledge and frameworks, and enable the joining up of collections so that others can make new connections. Such steps are not only going to benefit people interested in history; in Australia a Royal Commission is currently examining child sexual abuse in institutions like churches and orphanages. Here is an opportunity for archival data to have a very immediate impact on the lives of the people seeking answers contained in our records.

The rich archival metadata we capture is so powerful that we need to actively explore opportunities to share this data with business, with other archives, with users and with communities. It can help to provide a better structure for 4th dimension recordkeeping, access and use.

Offer solutions not problems

As a profession I think that we tend to put up blockages and barriers to new initiatives, systems and other types of change. Or we wait to act until the specifications are, in our eyes, perfect. While attention to quality is a good thing, this attitude can stymie innovation. Some of us are running out of chances to demonstrate to the resource allocators exactly why we are funded. We are at risk of making ourselves so unpopular that we are de-legislated, de-funded and generally ignored.

So how can we inject some positive and constructive contributions into the environments in which we work?
We need to apply large-scale digital big data concepts to help to combat the massive information risks facing us today. We need to forget about hassling mid-grade officers to capture emails, or spending months of time listing backlog transfers. We need to apply risk management to ensure proportionate use of our time and expertise. We need to focus on helping business to define, top-down, the recordkeeping requirements that it must heed, for the immediate and the long term.

Interesting initiatives to watch in this area include:

- risk-based approaches to recordkeeping, such as NARA's Capstone, focusing on senior or mission-critical personnel and letting the rest take care of itself
- the National Archives of the UK's work in capturing the UK web domain, with the view that a large proportion of policy and strategy documents and evidence of public programmes will by default be saved

(As an aside, it is interesting to follow the progress of the W3C Provenance metadata set [9], which signals a greater desire in the web community to see where data online has come from and the transformations it has undergone. Designed as a means of engendering trust and to operate in the semantic web environment, it is something to watch if we are to assist in achieving recordkeeping outcomes in online environments.)

By promoting the concept of information risk, by highlighting the problems we are seeing, by alerting organisations to the imminent and inherent risks their information is facing and by developing scaled and risk-appropriate approaches to their management, we can help organisations to resource and proactively manage their key business information, routinely destroy that which they no longer require and focus on sustaining for the long term the information that business and the community require. And by adopting a risk-based approach to prioritising our own efforts we can have a much greater impact.

[9] W3C, What is Provenance?
I think that we should be more open to unorthodox means of achieving results that are measurable and demonstrable now. By way of an example, two years ago at State Records New South Wales we inherited a website that had gathered up a small number of annual reports. As is probably the case in many other jurisdictions, government agencies' annual reports in NSW are now required to be published electronically only, to save costs. In any case both we and the library had noticed that routine transfer of hard copy reports to us had dried up some time back. Yet it was still our policy to retain these as State archives.

We rebuilt the site and promoted it as the way that agencies could offload current and legacy reports from their own sites while linking to us using a custom widget we provide. In doing so we solved the problem of cluttered government websites and met a desire in government for there to be a central place to access these records. We also took the step of using the agency data in our archival control system in each report's metadata profile, and of advising agencies that, with a quick and easy upload, they were no longer required to prepare the old kinds of transfers. This is a positive and helpful message that goes over well with managers, and we get the added bonus of being able to logically link physical archives, published materials and, soon, digital archives. We also extended the scope of the site to documents released under FOI. This kind of project gets profile and is seen as an active contribution to government policy, while we are stealthily achieving our own ends behind the scenes. You can visit OpenGov here:

In terms of appraisal, the functional assessments of business that we devise are so incredibly valuable.
I think we need to be prepared, however, to throw some of our aspirations away, such as desires for 100% disposal authority coverage in business areas or jurisdictions, and replace them with appraisal strategies based on comprehensive coverage of core or high-risk business. Combined with a bird's eye view of critical recordkeeping systems, about which we can provide assistance with regard to migration, deletion and metadata management, this would make us far more likely to be actively ensuring that essential records are kept and available.

We also need to ask ourselves hard and fundamental questions. For example, can we continue to define appraisal strategies ourselves? Do we need to consider broader documentation strategies, or at least broader and more specific engagement with impacted people and communities, and with business and system owners, and draw these groups more directly into our appraisal and access frameworks?

As recordkeepers, we have critical insights to share. In our access and appraisal frameworks we have always facilitated multi-party access to information, we have frameworks for managing distributed information, we have contextualised information to make it meaningful, and we have identified and maintained core information for as long as it is required. In today's business environments, these skills are more valuable than ever, and business is struggling to implement every one of them, so there is a real need for our capacities.

We therefore need to share what we do well, but we also need to acknowledge that we can't and don't have to do it all. We need to acknowledge that we don't have all the answers and we need to work with those who can contribute other parts of the framework. And others want us to help: they are more than willing to form multi-disciplinary teams that can build new generation recordkeeping systems.
Currently I am an associate for a new media / transparency initiative that will feature an 'artefacts cave': source materials on which all published stories are based. The vision is for well-documented, future-proofed materials such as transcripts, audio and video recordings and more to be contextualised with reference to the journalist, the story and so on, but also to enable patrons of the site to start making their own connections, reuse materials and create new contexts that will grow over time. The lead on the project, a journalist friend based in Ecuador, places the archive at the heart of the endeavour, and I'm having a lot of fun talking about how it might work with the coders and journalists who make up the rest of the Board.

So, really what it all boils down to is this: access and appraisal are core to what we offer, but they require reinvention for contemporary business frameworks. If we work with the clever people, if we are not afraid to try things out, and if we recognise that there could well be a world without our profession if we do not offer value, then I believe our profession can move through this transition and most certainly get that seat with the Pharaoh.