Digitisation Disposal Policy Toolkit Glossary of Digitisation Terms August 2014
Department of Science, Information Technology, Innovation and the Arts

Document details

Security classification: PUBLIC
Date of review of security classification: August 2014
Authority:
Author:
Document status: Final Version
Version: Version 1.1

Contact for enquiries

All enquiries regarding this document should be directed in the first instance to:
Manager, Agency Services
07 3131 7777
info@archives.qld.gov.au

Copyright

Digitisation Disposal Policy Toolkit Glossary of Digitisation Terms
Copyright The State of Queensland (Department of Public Works) 2010

Licence

Digitisation Disposal Policy Toolkit Glossary of Digitisation Terms is licensed under a Creative Commons Attribution 2.5 Australia Licence. To view a copy of this licence, please visit http://creativecommons.org/licenses/by/2.5/au/.

Information security

This document has been security classified using the Queensland Government Information Security Classification Framework (QGISCF) as PUBLIC and will be managed according to the requirements of the QGISCF.
Introduction

This document forms part of the Digitisation Disposal Policy Toolkit. The purpose of this glossary is to explain a range of terms related to the digitisation of paper records.

Anti-Aliasing
Improves the appearance of greyscale images by adding grey pixels at the border of black and white areas, smoothing the transition from black to white. Also used in colour images to smooth transitions between colours.

Bit Depth
The number of bits used to describe the colour of each pixel. Greater bit depth allows more colours to be used in the colour palette for the image.

Bi-tonal
Images containing only black and white pixels. Bi-tonal images are often used to represent modern, non-illustrated text documents.

Colour Depth
The colour or bit depth of an image refers to the number of bits used to describe the colour of each pixel. Greater bit depth allows more colours to be displayed in an image. Colour depths can range from 1 bit per pixel for bi-tonal images to 24 bits per pixel or greater in high quality colour images.

Continuous Colour
An image, such as an original photographic transparency or print, in which the tones or colours blend smoothly from one to another. Continuous colour images have a virtually unlimited range of colours or shades of grey.

De-skewing
Correction of distortion caused by image capture from a viewpoint other than on the perpendicular. [JISC Digital Media Glossary]

De-speckling
Process of removing speckles (extra pixels or collections of extra pixels) that can occur in scanned images because of imperfections in the scanner hardware, or dirt or dust on the camera, scanning surface or document being scanned.

Discrete Colour
Instances when the colours in an image are separate and distinct. Discrete colour images do not blend smoothly from one colour to the next and lack the many shades of colour seen in photographs.

Dithering
The computer graphics equivalent to printed halftones, this technique creates the illusion of colour depth in images with a limited colour palette. This is done by interspersing pixels of different colours over the required area to give the appearance of a third colour. For example, white and black pixels allocated over an area will provide a grey appearance to that area.

Dots per Inch
A measure of the resolution of a printer. It refers to the number of dots the printer is able to place in a linear one-inch space. The more dots per inch, the higher the resolution and the higher the printing quality.
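As an illustration of the Bit Depth and Dithering entries above, the following Python sketch computes how many colours a given bit depth can describe and fills an area with a black-and-white pattern that approximates a grey level. The function names and the 2x2 threshold matrix are assumptions chosen for this example; ordered dithering is only one of several dithering methods and is not prescribed by the toolkit.

```python
# Hypothetical helper names; a minimal sketch, not a production image pipeline.

# 2x2 ordered-dither (Bayer) matrix: each cell ranks the order in which
# pixels switch from black to white as the grey level rises.
BAYER_2X2 = [[0, 2], [3, 1]]


def colours_for_bit_depth(bits: int) -> int:
    """Number of distinct colours that `bits` bits per pixel can describe."""
    return 2 ** bits


def ordered_dither(grey: float, width: int, height: int) -> list[list[int]]:
    """Approximate a uniform grey level (0.0 = black, 1.0 = white)
    using only black (0) and white (1) pixels."""
    rows = []
    for y in range(height):
        row = []
        for x in range(width):
            # Threshold for this position, spread evenly between 0 and 1.
            threshold = (BAYER_2X2[y % 2][x % 2] + 0.5) / 4
            row.append(1 if grey > threshold else 0)
        rows.append(row)
    return rows


if __name__ == "__main__":
    for bits in (1, 8, 24):
        print(bits, "bits per pixel:", colours_for_bit_depth(bits), "colours")
    for row in ordered_dither(0.5, 4, 4):
        print("".join("#" if p == 0 else "." for p in row))
```

With a grey level of 0.5, half of the pixels come out white and half black in a checkerboard arrangement, which reads as mid-grey when viewed from a distance.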
File format
The specific way that data is arranged in a file. Some file formats can be used by a range of applications (such as text files or some image files) while others may only be used by a specific application (usually the same application used to create the file). Most applications can save documents in one or more standard formats as well as in their native format (e.g. a document produced in Microsoft Word can be saved as a Word document, or in rich text format, or in WordPerfect format). File formats may be proprietary or non-proprietary.

Greyscale
Greyscale images use only black, white and a range of shades of grey. The number of grey shades available depends on the colour depth of the image.

Half-tone
A printed image in which the density and pattern of black and white dots are varied, giving the appearance of a continuous tone image when viewed from an appropriate distance. Half-tone images are used extensively in magazines and newspapers.

Lossless compression
The compression of data that guarantees the original data can be restored exactly. A file that is compressed using a lossless method and then retrieved is exactly the same as the original, uncompressed file.

Lossy compression
The compression of data that may result in some data being changed or lost. A file that is compressed using a lossy method and then retrieved may be different from the original file, but is "close enough" to be useful in some way.

LZW
Lempel-Ziv-Welch, a lossless data compression algorithm developed by Abraham Lempel, Jacob Ziv and Terry Welch and used in GIF files. The LZW algorithm was patented by Unisys Corporation; the patents have since expired (in 2003 in the United States and in 2004 elsewhere).

Naming conventions
A standardised approach to naming computer files.

Near-line storage
Storage of files, normally on magnetic or optical media, so that files can be accessed if needed. Accessing files in near-line storage should not require human intervention, as it does with off-line storage, but will usually be slower than accessing on-line storage. Robotically controlled tape libraries and CD/DVD jukeboxes are applications of near-line storage.

Non-proprietary
Refers to a technological design or architecture whose configuration is available for use by the public. Use of non-proprietary technology is not restricted by licences or patents. Software is considered non-proprietary once it is released with a licence that would permit others to modify the software and release their own versions without restrictions. Non-proprietary technology allows individuals or organisations to copy, modify and study the technology.
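The difference between the Lossless compression and Lossy compression entries above can be sketched in a few lines of Python. This is an illustration only: it uses the standard library's zlib module (which implements DEFLATE, not LZW) for the lossless case, and a simple bit-dropping step as a stand-in for a lossy method. The function names and the keep_bits parameter are assumptions made for this example.

```python
import zlib


def lossless_round_trip(data: bytes) -> bytes:
    """Compress and then restore data with zlib (DEFLATE).
    The result is byte-for-byte identical to the input."""
    return zlib.decompress(zlib.compress(data))


def lossy_round_trip(samples: bytes, keep_bits: int = 4) -> bytes:
    """Stand-in for a lossy method: keep only the top `keep_bits` bits of
    each 8-bit sample. The discarded low bits cannot be recovered."""
    shift = 8 - keep_bits
    return bytes((s >> shift) << shift for s in samples)


if __name__ == "__main__":
    original = bytes(range(256))  # e.g. one row of 8-bit greyscale values
    print("lossless identical:", lossless_round_trip(original) == original)
    degraded = lossy_round_trip(original)
    print("lossy identical:", degraded == original)
    print("largest error:", max(a - b for a, b in zip(original, degraded)))
```

The lossless result matches the original exactly, while the lossy result differs from it by a small, bounded amount per sample, which is the "close enough" behaviour the glossary describes.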
Off-line storage
Storage of files, normally on magnetic or optical media, in a manner where the files are separate from and not directly accessible by the computer system. Human intervention, such as loading a tape into a tape drive, is required for the file to be accessed by the computer system.

On-line storage
Storage of files, normally on networks or hard disks, so that files are immediately available to the computer system.

Palette
A palette is the set of available colours that may be used to display an image. Each pixel in the image is assigned a value that relates to a specific colour in the palette. The number of entries in the palette is the total number of colours which can appear simultaneously on screen.

Palettised
A type of image that is composed of a distinct set of colours from a palette. Standard palettised images are made up of 16 or 256 colour palettes.

Pixel
The smallest element of a digital image; short for picture element. Pixels are the many tiny squares that make up the representation of a digital picture. Usually the squares are so small and so numerous that, when displayed on a computer monitor or printed, they appear to merge into a smooth image. Pixels per inch (PPI) is a commonly used measure for digital images. Each pixel can represent a number of different shades or colours, depending on how much storage space is allocated for it.

PPI
A measure of the resolution of an image. The more pixels per inch, the finer the resolution. PPI is used to describe the resolution of an image in a virtual state, or on a monitor. PPI is often confused with DPI, which is used to describe the resolution of a printing device.

Proprietary
A technological design or architecture whose configuration is unavailable to the public and may not be duplicated without permission from the designer or architect. Proprietary technology is created for a given company's purposes. For example, Microsoft Word stores documents in a proprietary format, namely Microsoft Word format. Proprietary technology may be legally used only by a person or entity purchasing an explicit licence. Proprietary means "privately owned and controlled", and hence software can remain proprietary even when source code is made publicly available, if control over use, distribution, or modification is retained.

Raster
A category of digital still images. Raster images are the most common images created and used within digitisation projects. Raster images take the form of a grid or matrix of pixels. Each pixel has a defined value that precisely identifies its specific colour, size and place within the image. Examples of raster image file formats are TIFF, GIF and JPEG. The other category of digital still images is vector images.

Resolution
Resolution is the amount of picture data in a specific area of an image. Resolution is usually measured in pixels per inch (PPI). The higher the resolution, the sharper and clearer an image will be.
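To show how the Resolution, PPI and Bit Depth entries relate in practice, the sketch below estimates the pixel dimensions and uncompressed size of a scanned image. The function names and the 8 x 10 inch example are assumptions for illustration, and the size figure ignores file headers and any compression applied by the file format.

```python
def pixel_dimensions(width_in: float, height_in: float, ppi: int) -> tuple[int, int]:
    """Pixel width and height of an original scanned at `ppi` pixels per inch."""
    return round(width_in * ppi), round(height_in * ppi)


def uncompressed_bytes(width_px: int, height_px: int, bit_depth: int) -> int:
    """Raw image data size in bytes: pixels x bits per pixel, rounded up to
    whole bytes. Headers and compression are not accounted for."""
    total_bits = width_px * height_px * bit_depth
    return (total_bits + 7) // 8


if __name__ == "__main__":
    w, h = pixel_dimensions(8, 10, 300)  # hypothetical 8 x 10 inch print at 300 PPI
    print("pixels:", w, "x", h)
    print("24-bit colour:", uncompressed_bytes(w, h, 24), "bytes")
    print("1-bit bi-tonal:", uncompressed_bytes(w, h, 1), "bytes")
```

Under these assumptions an 8 x 10 inch original scanned at 300 PPI yields 2400 x 3000 pixels, which is 21,600,000 bytes of raw data at 24 bits per pixel but only 900,000 bytes as a 1-bit bi-tonal image.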
Vector
A category of digital still images. Vector images are defined by mathematical equations and are used for drawings and diagrams that can be constructed from points, lines and area shapes. Vector images are resolution independent, meaning they can be scaled up to large sizes with no loss of quality. Examples of vector file formats are CAD drawings, Corel Draw files, and SVG files. The other category of digital still images is raster images.

More Information

For more detailed guidance on the management of public records visit the Queensland State Archives website at www.archives.qld.gov.au or contact us on:
Telephone: (07) 3131 7777
Email: info@archives.qld.gov.au