Archiving digital documents and E-Mails in PDF/A




Archiving digital documents and E-Mails in PDF/A
Webinar, Wednesday, May 27, 2009
PDF Tools AG
28.05.2009 Copyright 2008 PDF/A

Introductory remarks
- The presentation will last around 45 minutes.
- Afterwards there will be an additional 15 minutes to answer your questions; please use the chat/question function to ask them.
- We are not native English speakers; thanks in advance for your understanding ;-)

What is PDF/A?
- ISO 19005 is a standard of the International Organization for Standardization (ISO) and was published on October 1, 2005, as ISO 19005-1: Document Management - Electronic document file format for long-term preservation - Part 1: Use of PDF 1.4 (PDF/A-1).
- The ISO standard defines the format PDF/A-1 for the long-term archiving of electronic documents. It is based on PDF version 1.4 from Adobe Systems.
- PDF/A (A stands for Archiving) is a variant of PDF.
- It contains only elements that are suitable for long-term archiving (no dynamic elements etc.).
- Elements that are necessary for a flawless reproduction, such as fonts and color profiles, are embedded in the document.

PDF/A Competence Center, founded in 2006
The aim of the PDF/A Competence Center is to promote the exchange of information and experience in the area of long-term archiving in accordance with ISO 19005: PDF/A. This is achieved through these activities:
- Promotion of the PDF/A standard: classical and online marketing
- Education about PDF/A: conferences, seminars, presentations. Currently: 3rd International PDF/A Conference, June 16/17-18, 2009 in Berlin
- Work on the ISO standard: national representatives in the ISO committees of the USA, Japan, Germany, Austria and Switzerland
- Technical Working Group: publications (TechNotes), coordination of the requests to the ISO committee, test suites for the certification of products

PDF/A Competence Center: ca. 100 members
- Partner Members
- Full Members

PDF Tools AG
- Founded as an independent spin-off company in 2002; active in the PDF market since 1993
- Server-based developer tools for creating, processing, converting, rendering and enhancing PDF and PDF/A documents
- International: customers in over 60 countries, branch in Canada
- Swiss delegate with voting rights in ISO Working Group 171 (PDF/A, PDF 1.7)
- Largest range of PDF/A-compliant products worldwide

Your hosts
Dr. Hans Bärfuss, Chief Executive Officer, PDF Tools AG
- Has worked on PDF technology since 1993
- Active member of the ISO committee for PDF/A
- Founder/vice president of the PDF/A Competence Center
Dr. Hans-Rudolf Aschmann, Chief Technology Officer, PDF Tools AG
- Has also worked in the PDF world for more than 15 years
- Specialist for PDF/A from digital sources
- Software architect of the Document Converter Service
Carlo Nessi, Head of Marketing, PDF Tools AG
- IT marketing since 1989 (3M, Canon, Swisscom)

Overview
You will learn:
- How digital documents become archive material
- Which properties analog and digital sources have
- Why it is worthwhile to convert digital sources to PDF/A for archiving
- How digital sources are converted to PDF/A (processes, challenges, special sources, font handling, digital signatures etc.)

PDF/A within the AIIM model for ECM
The AIIM model for ECM is shown with the areas Capture, Manage, Store, Deliver and Preserve. PDF/A functions are assigned to four of them:
- Capture: PDF/A creation, conversion and digital signing
- Manage: PDF/A processing and commenting
- Deliver: PDF/A viewing and printing
- Preserve: PDF/A validation and optimization

Sources of digital documents
Inbox
- Scans with or without OCR (optical character recognition)
- E-mails with or without attachments
- Office, graphics and construction: MS Word, Excel, PowerPoint, Visio, etc.; Illustrator, InDesign, Photoshop, etc.; CAD: AutoCAD, 3D Studio Max, etc.
- Electronic data interchange: SWIFT, EDIFACT, etc.
Outbox
- Print data streams: PostScript, PCL, AFP, etc.
Archive migrations
- Masses of TIFF and other files, including source data (metadata, object relationships, etc.)

Attributes of analog and digital sources
Attribute | Analog | Digital
Sources | Scanner, raster images | Standard and proprietary formats from applications and data streams, in file storage, mailboxes and attachments
Quality of the source | Good | Large differences
Complexity of the source | Low | Can be very high
Product differentiation | Compression rate, performance | Quality
Biggest challenge | OCR recognition rate | Loss of information during the conversion

Testing of print paths
- The following samples are extracts from PDF/A-compliant files.
- The results show that conversion with low-quality tools can be problematic.
[Figures: three comparisons of an original with an incorrect conversion, one of them concerning fonts]

Why convert to PDF/A?
- Users do not have to maintain the original native applications, or the platforms on which they run, in order to view the documents.
- Users depend less on software manufacturers, because all relevant information is saved in one ISO-standardized, manufacturer-independent format (PDF/A).
- Processing is simplified because the archived data is standardized into one format.
- A full-text search can be performed across all of the stored data.
- These advantages bring an economic benefit that must not be underestimated.
- Disadvantage: loss of interactivity and of the built-in functionality of the native format.
- Solution: archive as PDF/A and in the native format.

Conversion to PDF/A
- Proprietary formats (via host applications): PDF/A Producer (printer driver) or PDF/A Export (save to PDF)
- Standard formats: direct conversion to PDF/A (incl. OCR)

Challenges of the conversion of digital documents to PDF/A
- Colors: if the color profiles from the sources are missing, assumptions have to be made about the color space.
- Fonts: if fonts (or glyphs) are missing, replacement fonts must be selected. To do this, the text must be Unicode text.
- Transparency: the flattening of transparency is complex and may lead to the loss of information (fonts, vectors, etc.).
- Layers, interactive and multimedia elements: only the print preview is retained.
- Actions: functionality (JavaScript etc.) is lost.
- Digital signatures: must be checked, documented and signed again.

Conversion of E-Mails to PDF/A
- E-Mails are born-digital documents.
- The attachments of E-Mails can contain many different formats: standard formats, proprietary formats, and containers, which can also be nested.
- E-Mails can be stored in different places: in the mailboxes of E-Mail servers or in the file system.
- E-Mails contain different types of information: the display form (text, HTML or RTF) and header information.
- Conversion of E-Mails to PDF/A: body and attachments are converted separately and then merged into one single document; digital signatures must be handled.
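The separation of header information, body and attachments described above can be sketched with Python's standard email package. This is only an illustration and not the method of any particular product: the function name split_email and the layout of the returned dictionary are assumptions made for this example, nested containers are only flagged rather than unpacked, and the actual rendering of each part to PDF/A is left out.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def split_email(raw: bytes) -> dict:
    """Separate an e-mail into header information, body and attachments."""
    msg = message_from_bytes(raw, policy=policy.default)

    # Header information that would be archived together with the document.
    headers = {
        name: str(msg[name])
        for name in ("From", "To", "Subject", "Date")
        if msg[name] is not None
    }

    # Prefer the HTML display form and fall back to plain text.
    body_part = msg.get_body(preferencelist=("html", "plain"))
    body = body_part.get_content() if body_part is not None else ""

    attachments = []
    for part in msg.iter_attachments():
        attachments.append({
            "filename": part.get_filename(),
            "content_type": part.get_content_type(),
            # Nested e-mails are containers and would need a recursive pass.
            "is_container": part.get_content_maintype() == "message",
            # Multipart/container parts have no single payload, hence b"".
            "data": part.get_payload(decode=True) or b"",
        })
    return {"headers": headers, "body": body, "attachments": attachments}


# Small demonstration with a hypothetical message.
sample = EmailMessage()
sample["From"] = "sender@example.com"
sample["To"] = "archive@example.com"
sample["Subject"] = "Invoice"
sample.set_content("Please find the invoice attached.")
sample.add_attachment(b"%PDF-1.4", maintype="application",
                      subtype="pdf", filename="invoice.pdf")

result = split_email(sample.as_bytes())
```

Each entry in result["attachments"] could then be handed to a converter suited to its format, and the converted parts merged behind the converted body.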

Conversion of Websites to PDF/A
Objective of archiving websites: to retain the contents of one's own website in a legally trustworthy way, so that it can be used as evidence in legal proceedings.
It is not useful simply to print the website to PDF/A, because the print function of a website often changes the layout; it is important to keep the layout as it appears on the screen.
Solution:
- Decide on one browser and browser version
- Define rules for archive-friendly webpage design
- Decide which representation should be used (screen view or print view)
- Capture the website as an image
- Store other information such as texts, images, fonts, background, colors, Flash previews etc.
- Merge the contents together with the link information to reproduce the website structure within the PDF document

Conversion software: on client or server?
Attribute | Client | Server
Scaling (workstations) | Small number | Large number
Distribution | Complex | Simple
Robustness for the users | Depends on the creator applications | Independent
Performance for the users | Restricted by the client | Scalable
Supported source formats | Restricted by the installation | Scalable
Application support | Local | Central

Font handling in mass archiving
- To archive: split resources
- PDF/A archive
- From archive: merge resources
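The slide names only the two steps, so the following is one possible reading rather than a description of an actual product: embedded fonts that many documents share are split off and stored once when a document enters the archive, and merged back when it is retrieved. The class ResourceStore and both function names are hypothetical, and fonts are represented simply as byte strings keyed by font name.

```python
import hashlib


class ResourceStore:
    """Hypothetical archive-side store that keeps each resource only once."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # Identical font data yields the same key, so it is stored once.
        key = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]

    def __len__(self) -> int:
        return len(self._blobs)


def split_resources(fonts: dict, store: ResourceStore) -> dict:
    """To archive: replace embedded font data by references into the store."""
    return {name: store.put(data) for name, data in fonts.items()}


def merge_resources(refs: dict, store: ResourceStore) -> dict:
    """From archive: put the font data back so the document is self-contained."""
    return {name: store.get(key) for name, key in refs.items()}


store = ResourceStore()
doc_a = {"Helvetica": b"font-data-1", "Times": b"font-data-2"}
doc_b = {"Helvetica": b"font-data-1"}

refs_a = split_resources(doc_a, store)
refs_b = split_resources(doc_b, store)
restored_a = merge_resources(refs_a, store)
```

In this sketch the two documents share one stored copy of the Helvetica data, while a document delivered from the archive again carries all of its fonts, as PDF/A requires of a standalone file.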

Legal security with digital signatures
- A PDF/A-compliant digital signature can be added to a PDF/A file.
- The objective is the best possible legal security.
What a digital signature can really provide:
- When (time) the digital signature was applied
- Whether the document has been manipulated since then and, if so, what has been changed
- Who, or which process within a company, performed the conversion
What a signature alone cannot guarantee:
- Correctness of the content (correspondence to the source)
- Proof of 100% visual similarity with the original
Possible solution: certification of the processes
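The distinction above can be illustrated with a deliberately simplified sketch. It is not a real digital signature: signatures in PDF files rely on asymmetric keys and certificates, whereas this example uses a shared secret (HMAC) from Python's standard library purely to show which facts get bound together. The function names seal and verify and the record layout are assumptions for this illustration.

```python
import hashlib
import hmac
from datetime import datetime, timezone


def seal(document: bytes, process_id: str, key: bytes) -> dict:
    """Bind time, converting process and document content together."""
    stamp = datetime.now(timezone.utc).isoformat()
    digest = hashlib.sha256(document).hexdigest()
    payload = f"{process_id}|{stamp}|{digest}".encode()
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"process": process_id, "time": stamp, "tag": tag}


def verify(document: bytes, record: dict, key: bytes) -> bool:
    """Return True only if document, time and process are all unchanged."""
    digest = hashlib.sha256(document).hexdigest()
    payload = f"{record['process']}|{record['time']}|{digest}".encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["tag"])


key = b"demo-secret"  # hypothetical key, for illustration only
record = seal(b"converted document", "converter-01", key)

unchanged = verify(b"converted document", record, key)
tampered = verify(b"converted document!", record, key)
```

Note what the check does not cover: it only shows that the bytes are the ones that were sealed, not that the conversion reproduced the source correctly, which is why the slide points to certification of the processes.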

PDF/A products of PDF Tools AG

3-Heights Document Converter Service
- Converts images, Office documents, E-Mails incl. attachments, websites and existing PDF documents automatically to PDF/A
- Extensible service, for example for additional conversion functionality (with plugins)
- Output formats are TIFF, PDF and PDF/A, incl. application of a digital signature
- Optional OCR add-on
- Decentralized use via many different interfaces: Windows service with watched folders, command line, API, Explorer plugin or directly in the mailbox (IMAP)
- Thanks to its scalability, this product is suitable for any volume and company size
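To make the watched-folder idea concrete, here is a generic sketch of a single polling pass. It does not use or describe the 3-Heights API: the ROUTES table, the converter names in it and the function scan_once are all hypothetical, and a real service would hand each job to a converter instead of just returning it.

```python
import tempfile
from pathlib import Path

# Hypothetical mapping from file extension to a converter category.
ROUTES = {
    ".tif": "image", ".tiff": "image",
    ".doc": "office", ".xls": "office",
    ".eml": "email",
    ".pdf": "pdf",
}


def scan_once(folder: Path, seen: set) -> list:
    """Return (file name, converter category) for files not seen before."""
    jobs = []
    for path in sorted(folder.iterdir()):
        if not path.is_file() or path.name in seen:
            continue
        seen.add(path.name)
        kind = ROUTES.get(path.suffix.lower(), "unsupported")
        jobs.append((path.name, kind))
    return jobs


# Demonstration with a temporary folder standing in for the watched folder.
with tempfile.TemporaryDirectory() as tmp:
    inbox = Path(tmp)
    (inbox / "mail.eml").write_bytes(b"")
    (inbox / "scan.TIF").write_bytes(b"")
    seen = set()
    first_pass = scan_once(inbox, seen)
    second_pass = scan_once(inbox, seen)
```

A service would call scan_once repeatedly; files already picked up are skipped, so the second pass in the demonstration yields no new jobs.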

Thanks for attending this webinar!
Questions?
- ... can now be asked using the chat/question function
- ... or send us an e-mail: pdfsales@pdf-tools.com
- ... or call us: Tel. +41 43 411 44 50
PDF Tools AG, www.pdf-tools.com

Backup slides
- PDF/A - Features
- PDF/A - Advantages

PDF/A - Features
PDF/A: An ISO Standard
- ISO 19005 is an ISO (International Organization for Standardization) standard that was published on October 1, 2005: ISO 19005-1: Document Management - Electronic document file format for long-term preservation - Part 1: Use of PDF 1.4 (PDF/A-1).
- It defines a format (PDF/A) for the long-term archiving of electronic documents and is based on the PDF Reference Version 1.4 from Adobe Systems Inc. (implemented in Adobe Acrobat 5).
Two Levels of Compliance
There are two levels of compliance for PDF/A:
- PDF/A-1a: Level A compliance in Part 1
- PDF/A-1b: Level B compliance in Part 1
PDF/A-1a represents full compliance with all requirements of the ISO standard and guarantees both accessibility (e.g. full-text search and support for assistive devices for people with disabilities) and reproducibility.
PDF/A-1b is a slightly reduced set of requirements, and the guarantee is limited to reproducibility.

PDF/A - Advantages
Improved accessibility alone may justify the implementation of an electronic archive. Some advantages of a PDF/A archive over a TIFF or paper archive are:
- Full-text search: PDF/A stores text as objects, allowing for an efficient full-text search across an entire archive. TIFF files must first be run through text recognition (OCR).
- File size: PDF/A files require only a fraction of the storage space of original or TIFF files, without loss of quality.
- Optimization: the PDF/A format can be optimized. The optimization can focus on images (e.g. scanned checks) or on extracting structured data (e.g. voucher information).
- Metadata: metadata such as title, author, creation date, modification date, subject, keywords, etc. can be stored in a PDF/A file.
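As an illustration of the metadata point: PDF/A-1 carries such metadata as an XMP (RDF/XML) packet, in which the pdfaid:part and pdfaid:conformance properties identify the file as PDF/A-1a or PDF/A-1b. The sketch below builds only the RDF core of such a packet with Python's standard library; the function name build_xmp is an assumption for this example, and the surrounding xpacket wrapper and the embedding into the PDF file are omitted.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
PDFAID = "http://www.aiim.org/pdfa/ns/id/"
XML = "http://www.w3.org/XML/1998/namespace"

for prefix, uri in (("rdf", RDF), ("dc", DC), ("pdfaid", PDFAID)):
    ET.register_namespace(prefix, uri)


def build_xmp(title: str, author: str, conformance: str = "B") -> str:
    """Build the RDF part of an XMP packet for a PDF/A-1 file (sketch)."""
    if conformance not in ("A", "B"):
        raise ValueError("PDF/A-1 defines conformance levels A and B only")

    rdf = ET.Element(f"{{{RDF}}}RDF")
    desc = ET.SubElement(rdf, f"{{{RDF}}}Description", {f"{{{RDF}}}about": ""})

    # PDF/A identification: part 1 of ISO 19005, level A or B.
    ET.SubElement(desc, f"{{{PDFAID}}}part").text = "1"
    ET.SubElement(desc, f"{{{PDFAID}}}conformance").text = conformance

    # Title as a language alternative, author as an ordered list.
    title_el = ET.SubElement(desc, f"{{{DC}}}title")
    alt = ET.SubElement(title_el, f"{{{RDF}}}Alt")
    li = ET.SubElement(alt, f"{{{RDF}}}li", {f"{{{XML}}}lang": "x-default"})
    li.text = title

    creator = ET.SubElement(desc, f"{{{DC}}}creator")
    seq = ET.SubElement(creator, f"{{{RDF}}}Seq")
    ET.SubElement(seq, f"{{{RDF}}}li").text = author

    return ET.tostring(rdf, encoding="unicode")


xmp = build_xmp("Annual report", "Jane Doe")
```

Because the metadata is stored in a standardized, machine-readable form, an archive can index title and author without opening the page content.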