City of Mercer Island
Request for Proposals for Document Scanning and Indexing

PROJECT DESCRIPTION

In 2008, the City of Mercer Island purchased and implemented an electronic document management system (EDMS). The City continues to maintain a sizable collection of physical, non-digitized records in various media, such as paper. The objective of this project is to image, index, and import into the City's EDMS a subset of the non-digitized records held by the City's Development Services Group.

PROJECT IMPLEMENTATION OUTLINE

Once the proposal has been formally posted, there will be a period of time for questions, site visits, etc. Questions submitted by any vendor, along with the answers, will be forwarded to all vendors who have submitted their intent to bid. The goal is to select a vendor by May 1, 2009.

PROJECT DETAILS

I. Imaging Requirements
   A. Packaging and Transportation
      1. Package and transport records to vendor location for scanning and indexing.
      2. Return all records to City Hall in their original storage configuration, with the exception of binding materials such as paperclips, staples or other binding mechanisms.
   B. Imaging
      1. Scan resolution shall be 300 dpi.
      2. Scanned images should be in color when the original document is in color and in 256-level grayscale for all other documents.
         a. A color document is defined as one printed with non-black ink.
         b. A document printed in black ink on colored paper, such as legal paper or receipt slips, is not considered a color document.
      3. Digital format of imaged documents shall be multipage TIFF (Tagged Image File Format) or PDF (Portable Document Format).
      4. A new TIFF or PDF file shall be generated for every set of documents. A set is either an individual record or a group of records held together by a binding material such as a staple, paperclip or other binding mechanism.
   C. Indexing
      1. Indexes shall be submitted in XML format that complies specifically with Exhibit A.
      2. Each folder in this project has three indexes clearly labeled.
         a. GIS ID
         b. Address
         c. Parcel Number
      3. These three indexes shall be assigned to each TIFF or PDF created.
      4. Each vendor must include the detailed cost of each additional index that the vendor completes manually for each TIFF or PDF.
   D. Access to Documents
      1. The City requires digital access to any file or folder requested within one business day of the request being submitted.
   E. Deliverables
      1. All indexes and images shall be delivered to the City on DVD.
   F. Estimates
      1. Each vendor is required to use the following table in determining the number of records.

         Records Estimate
         Estimated number of records based on a sampling done by the City

         Item                          Estimate   Notes
         Number of folders                4,300   Each folder is pre-indexed with three values: Address, GIS ID and Parcel Number
         Number of pages tabloid        420,000
         Number of pages > tabloid       19,000   Individual pages may need manual indexing
         Number of misc page sizes       18,000
         Number of color pages           11,000   Color pages are primarily photos
         Number of bindings              75,000   Examples of bindings are staples, paperclips, plastic binding combs, etc.

II. Bid Requirements
   A. Vendors submitting bids must meet the following requirements to be considered:
      1. Must provide evidence of meeting prevailing wage requirements.
      2. Must provide a minimum of three verifiable references. Government references are preferred.
      3. Must submit intent to bid in writing or via email to the City Clerk by 5:00 pm, April 3, 2009. See contact information included.
   B. Site visits/surveys and questions must be made by 5:00 pm, April 10, 2009. All questions must be directed to Mike Kaser, Information Services Manager, and will be forwarded, together with the answers, to all vendors whose intent to bid was received by 5:00 pm, April 3, 2009.
   C. Bids must be received by the City Clerk no later than 5:00 pm on Friday, April 17, 2009 and must include copies of the proposal in two formats: printed and digital (Microsoft Word or PDF).
      The total bid (including references and examples) must not exceed 20 pages when printed on 8.5 by 11 inch paper in 10-point font.
   D. Bids must detail the following:
      1. Complete project timeline with clearly itemized actions, the responsible party (City or Contractor) assigned to each, and a completion date prior to August 1, 2009.
      2. Detailed description of how each goal will be met and what standards will be implemented, including itemized project cost with time and material estimates.
      3. Listing of all image cleanup and enhancements, such as skew, noise, background, brightness and contrast, speckle, etc.
      4. Detailed quality control and quality assurance procedures.
      5. Detailed pricing on each additional index manually entered by the vendor for each TIFF or PDF.
      6. Pricing for two packaging and transportation packages:
         a. One-time pickup of all documents and a delivery of digital files as soon as each filing cabinet is imaged and indexed.
         b. Three separate pickups and deliveries, broken down into twelve filing cabinets each.
   E. Contact Information
      1. City Hall: 9611 SE 36th Street, Mercer Island, WA 98040
      2. City Clerk: Ali Spietz, Ali.Spietz@mercergov.org, (206) 275-7793
      3. Information Services Manager: Mike Kaser, Mike.Kaser@mercergov.org, (206) 275-7772

REJECTION OF PROPOSALS

The City of Mercer Island reserves the right to reject any or all bids or proposals which are deemed to be non-responsive, late in submission or unsatisfactory in any way. The City of Mercer Island shall have no obligation to award a contract for product, work, goods and/or services as a result of this RFP.

VENDOR SELECTION

The City of Mercer Island reserves the right to make an award based solely on the information provided, to conduct discussions or to request proposal revisions if deemed necessary. The City of Mercer Island has no obligation to reveal how vendor proposals were assessed. Therefore, proposals should contain best terms as related to the submission requirements of this RFP.

Exhibit A

The purpose of this document is to describe the XML structural requirements for importing metadata and associated files such as TIFFs, PDFs, DOCs, etc. It also describes how the batches should be assembled for delivery to the customer and how they are imported into the System. Examples 1 and 2 show the XML formats we recommend. Referring to these examples throughout the document may help in understanding some of the concepts that are presented. The File Options section describes the available options for the files to be imported.

A description of XML

Simply put, an XML file is a text file that contains information for communicating between computers and humans. The information inside an XML file is arranged using a parent/child relationship, like a book has pages and a page has words. When you view an XML file in Internet Explorer or Notepad, you can see these relationships based on the names that are used in between the brackets <>. For example, the following shows how a book is related to a page and how a page is related to a word that is on that page.

   <book>
      <page>
         <word />
      </page>
   </book>

The above XML is not useful because it carries so little information; however, it does show what an XML relationship looks like. As stated, relationships are grouped by using the brackets <> with words in between them. These are called nodes. Each node can contain information or it can act as a parent for child nodes that contain the actual information. What makes the XML language so flexible is that the nodes can be named anything*. What makes XML so useful is that anyone can look at it and get an idea of what information it contains, how that information is split up and how its parts relate to each other.

*The names as well as the casing of the nodes should be consistent within the XML.

Brief introduction to Mercer Island's EDMS

Our EDMS System hierarchy is made up of cabinets containing folders that contain files. Each cabinet contains like documents that are organized within folders.
The folders are labeled with metadata, called data fields or index values, which describe the folder and its contents: the files, such as images and/or text documents. Thus, each file has a parent folder and a parent cabinet.

As part of our system, the XML Import utility is used to automate the import of metadata and its associated files to cabinets and individual folders. It makes it possible to recognize XML nodes as folders, specific indexes or files so they can be processed automatically. It is highly configurable and enables us to set up rules to maintain standards on how documents are brought in. It implements the hierarchical relationship that our System is built upon (cabinet-folder-file). Based on a set of rules called a configuration, the utility will know the
following: where to look for new batches to process, and how the folders, data fields and files are recognized and associated with the indexes.

Best practices on how deliverables are to be assembled

Although many of the details of delivering the batches are flexible, to ensure efficiency, the XML Import utility accepts the following scenarios.

1. The XML file is in the same folder as its associated files.
   Example: On a CD, it would look like this:
      E:\1234\abc.tif
      E:\1234\def.pdf
      E:\1234\1234.xml
      E:\5678\jkl.tif
      E:\5678\mno.doc
      E:\5678\5678.xml
2. The associated files (TIFF, PDF, etc.) are contained in one single folder. The path to these files is shared by all XML files that are processed with a specific configuration. The XML would only contain the filenames while the configuration would define the root path to the images directory. (Recommended)
   Example: On a CD, it would look like this:
      E:\Images\faaf.tif
      E:\Images\geeg.pdf
      E:\XML\1234.xml
      E:\XML\5678.xml
3. Each XML contains a fully qualified path for each associated file independently.

Note: There are no restrictions on the following: the file types of the associated files, their actual filenames, the names of the batches and the number of associated files. However, the batch size and file types should be discussed and agreed upon prior to delivery.

XML Constraints

For the most part, the requirements are based on how the markup or nodes are nested.

1. The XML must comply with the XML specification, version 1.0 (http://www.w3.org/tr/2004/rec-xml-20040204/). Although there is no specific schema that the XML must follow, it must be well-formed and properly formatted.
2. All folder nodes must be children of one single parent node and each folder node must be on the same child level or depth as other folder nodes. A folder node cannot be a child of another folder node. It also cannot be a child of a different parent node.
3. Each folder node must be independent and contain all the metadata and file information that pertains to it. The values can be contained either in the folder node itself as attributes or inner text, or in child nodes as attributes or inner text. The data fields do not need to be in any specific order within the folder node.
4. Each file node that pertains to the same folder node must be on the same child level or depth as other file nodes associated with the folder.
5. Each file node must be independent of other nodes and contain all the metadata that pertains to that file, including file path, file description, OCR text file path,
etc. The values can be contained either in the file node itself as attributes or inner text, or in child nodes as attributes or inner text.
6. Nodes representing the data fields must be uniquely named (see Example 1) or have a unique value for an attribute (see Example 2). This makes it possible to design an XPath (an address) to point to a specific node within the folder node. If a data field node is named the same as the other data field nodes, one of its attributes must have a unique value.
7. Due to a limitation of .NET 2.0, the XML should not contain namespaces.

Important Note: Restrictions on the actual values, such as length, data type, etc., are defined by the City of Mercer Island.

The following two examples represent the same data (two folders) in different but acceptable ways:

Example 1 (Recommended):

   <documents>
      <document description="conditional Use Permit" id="6506" scan_date="10/16/2004" keywords="conditional use, permit,restaurant">
         <file path="0000214e4.tif" description="coversheet" ocrpath="0000214e4.txt" />
         <file path="0000232g5.pdf" description="approval form" ocrpath="0000232g5.txt" />
      </document>
      <document description="conditional Use Permit" id="6507" scan_date="1/4/2007" keywords="conditional use, permit,office building">
         <file path="0000563h9.tif" description="coversheet" ocrpath="0000563h9.txt" />
         <file path="0000727y0.pdf" description="additional Info" ocrpath="0000727y0.txt" />
         <file path="0000727y0.pdf" description="approval form" ocrpath="0000727y0.txt" />
      </document>
   </documents>

In Example 1, there are two nodes named document that represent two separate folders. Note that each folder node is on the same level as the other and each node contains all the metadata and files that are associated with its own folder. In this example, the data fields are represented as separate attributes on the folder node document.
Also, in this example the files are represented as child nodes, called file, which detail the filename and an optional description for the file.

Note: If a special data field like Multi-value is needed, the value can be delimited with any single character, such as a comma, pipe or semicolon. In this example, the attribute keywords is comma delimited. Multi-values can also be separated into their own nodes, as in the next example.

Example 2:

   <documents>
      <document>
         <fields>
            <field name="description">conditional Use Permit</field>
            <field name="id">6506</field>
            <field name="scan_date">10/16/2004</field>
            <field name="keywords">
               <value>conditional use</value>
               <value>permit</value>
               <value>restaurant</value>
            </field>
         </fields>
         <files>
            <file>
               <info name="path">0000214e4.tif</info>
               <info name="description">coversheet</info>
               <info name="ocrpath">0000214e4.txt</info>
            </file>
            <file>
               <info name="path">0000232g5.pdf</info>
               <info name="description">approval form</info>
               <info name="ocrpath">0000232g5.txt</info>
            </file>
         </files>
      </document>
      <document>
         <fields>
            <field name="description">conditional Use Permit</field>
            <field name="id">6507</field>
            <field name="scan_date">1/4/2007</field>
            <field name="keywords">
               <value>conditional use</value>
               <value>permit</value>
               <value>office building</value>
            </field>
         </fields>
         <files>
            <file>
               <info name="path">0000563h9.tif</info>
               <info name="description">coversheet</info>
               <info name="ocrpath">0000563h9.txt</info>
            </file>
            <file>
               <info name="path">0000727y0.pdf</info>
               <info name="description">additional Info</info>
               <info name="ocrpath">0000727y0.txt</info>
            </file>
            <file>
               <info name="path">0000727y0.pdf</info>
               <info name="description">approval form</info>
               <info name="ocrpath">0000727y0.txt</info>
            </file>
         </files>
      </document>
   </documents>

Like Example 1, Example 2 shows how the folder node document contains all the data fields and files that are associated with the folder. Here, each data field is separated out into its own node, and the file information, such as the path, is likewise shown in its own child node. Although the data field nodes are named the same, the attribute called name makes each data field node unique inside the folder node. In this example, keywords is a special data field called a Multi-value and it is separated out into child nodes.

File Options

The following dynamic options are available for the files associated with the folders. Each value must be contained in the file node that it is associated with. These are optional and do not need to be included if the default is acceptable.

File Description (text type)
   By default, files that are imported into the system are named generically as File1.tif, File2.tif. The file description makes it possible to give a file a descriptive name for easier identification.

File Extension (text type)
   This makes it possible to identify the file type or MIME association of the file. The value should not contain the dot (Example: tif, pdf).

OCR Path (text type)
   Path to the OCR text file associated with an image file to be imported.

Annotation
   A text annotation can be applied to image files. The following values are required for annotating and should be separated into nodes or attributes inside the file node: text (text type), X coordinate (integer type), Y coordinate (integer type), Width (integer type), Height (integer type), and Layer name (text type) as specified by the City of Mercer Island.

Rotate Angle (integer type)
   Images can be rotated by 90, 180 or 270 degrees.
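
Illustrative sketch (not part of the City's requirements): a vendor may want to check a batch against the XML Constraints before delivery. The Python sketch below builds an Example 1 style batch and runs a few of those checks (well-formedness, no namespaces, folder nodes only at the first level, a path on every file node). It is not the City's XML Import utility. The function names build_batch and check_batch, the attribute names gis_id, address and parcel_number, and the sample values are all assumptions made for illustration; the City's configuration defines the actual field names and value restrictions.

```python
import xml.etree.ElementTree as ET

# Hypothetical attribute names for the three folder indexes; the City's
# import configuration, not this sketch, defines the real field names.
INDEX_FIELDS = ("gis_id", "address", "parcel_number")


def build_batch(folders):
    """Build an Example 1 style batch: one <document> node per folder."""
    root = ET.Element("documents")
    for folder in folders:
        # Data fields become attributes on the folder node.
        attrs = {name: folder[name] for name in INDEX_FIELDS}
        doc = ET.SubElement(root, "document", attrs)
        for path, description in folder["files"]:
            # Each file is a child node that carries its own metadata.
            ET.SubElement(doc, "file", {"path": path, "description": description})
    return ET.tostring(root, encoding="unicode")


def check_batch(xml_text, folder_tag="document", file_tag="file"):
    """Return a list of problems found; an empty list means none were found."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]

    problems = []
    # ElementTree expands namespaced names to "{uri}name".
    for node in root.iter():
        names = [node.tag, *node.attrib]
        if any(name.startswith("{") for name in names):
            problems.append(f"namespace used on <{node.tag}>")

    folders = list(root)
    if any(child.tag != folder_tag for child in folders):
        problems.append("root has children that are not folder nodes")
    # A folder node must not appear below another first-level node.
    nested = [n for f in folders for n in f.iter(folder_tag) if n is not f]
    if nested:
        problems.append("folder node nested below the first level")

    for folder in folders:
        for node in folder.iter(file_tag):
            if not node.get("path"):
                problems.append("file node without a path")
    return problems


# Placeholder values only; they do not describe real City records.
sample = build_batch([
    {"gis_id": "GIS-0001", "address": "1 Example Street",
     "parcel_number": "0000000001",
     "files": [("0001.tif", "coversheet")]},
])
print(check_batch(sample))  # []
```

A check of this kind only covers the structural rules. Restrictions on the values themselves (length, data type, etc.) are defined by the City and would still need to be confirmed separately.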