MASTER'S THESIS. A Concept of Using 2D Bar Codes in Retail Environments. Per Jonsson. Luleå University of Technology




MASTER'S THESIS 2008:062 CIV A Concept of Using 2D Bar Codes in Retail Environments Per Jonsson Luleå University of Technology MSc Programmes in Engineering Computer Science and Engineering Department of Computer Science and Electrical Engineering Division of Computer Science 2008:062 CIV - ISSN: 1402-1617 - ISRN: LTU-EX--08/062--SE

A Concept Study of Using 2D Bar Codes in Retail Environments Per Jonsson March 10, 2008

Abstract

The recent popularity of camera-equipped mobile phones has sparked a new field of use for machine-readable 2D bar codes. A code can be printed on everyday items and interacted with using a camera phone. The interaction is initiated by scanning the 2D bar code with the phone's camera, and the reaction can be anything from simply retrieving information to triggering some action. Because of the low cost of printing 2D bar codes, there are many possible applications that could be based on this. One such application is examined in this thesis; it aims to improve the usefulness of normal cash register receipts. The idea is to print a 2D bar code on receipts that contains the purchase information in digital form. This is assumed to help customers manage their receipts or keep more detailed books. This assumption is evaluated by developing a prototype implementation of a receipt reader. The prototype is based on two Java libraries for decoding DataMatrix and Quick Response codes. The main conclusion from this evaluation is that, with the libraries used, an estimated 42 purchase items could be stored in a 2D bar code. There are trade-offs that can be made to increase the capacity when needed, which helps to make the assumption valid.

Acknowledgments

I would like to thank Ericsson Research in Luleå for the opportunity to write this master's thesis, and especially my supervisor at Ericsson, John Sandberg, and also Jakob Saros for their help and guidance. Tommy Arngren, Angelina Frediksson, Xiaolei Hu and Simon Persson deserve a big acknowledgment for their input and for their participation in the brainstorming session. Furthermore, I would like to thank the interview respondents and finally my supervisor at the University, Jingsen Chen.

Per Jonsson
March 2008
Luleå University of Technology

Contents

1 Introduction
    1.1 Objectives and Scope
    1.2 Related Work
    1.3 Outline
2 Background
    2.1 Bar codes
        2.1.1 1D bar codes
        2.1.2 2D bar codes
    2.2 Examples of 2D Symbologies
        2.2.1 ColorCode
        2.2.2 ShotCode
        2.2.3 Quick response code
        2.2.4 DataMatrix
3 Methods
    3.1 Implementation
        3.1.1 Generator
        3.1.2 Reader
    3.2 Evaluation
        3.2.1 Performance tests
        3.2.2 Interviews
4 Selection of Feasible Symbologies
    4.1 Concept
        4.1.1 Problem
        4.1.2 Proposed solution
    4.2 Analysis of Requirements
        4.2.1 Variables
        4.2.2 Concept requirements
        4.2.3 Delimitations
        4.2.4 Final requirements
    4.3 Selection
        4.3.1 Data capacity and density
        4.3.2 Error correction
        4.3.3 Color print
        4.3.4 Available software
        4.3.5 Result
5 Implementation
    5.1 2D Bar Code Generator
        5.1.1 Serialization
        5.1.2 Compression
        5.1.3 Visualization
    5.2 Receipt Reader Prototype
        5.2.1 Recognition
        5.2.2 Decompression
        5.2.3 Unserialization
        5.2.4 User interaction
6 Evaluation of Concept
    6.1 Performance Tests
        6.1.1 Resolution test
        6.1.2 Capacity test
    6.2 Interviews
        6.2.1 Interview 1
        6.2.2 Interview 2
        6.2.3 Interview 3
        6.2.4 Interview summary
7 Discussion
    7.1 Feasibility
    7.2 Interview Input
8 Conclusions
    8.1 Future Work
References
A Format Specification
B Tables

Chapter 1 Introduction

Bar codes are often associated with the scanning of items at the cash register in most stores. These bar codes encode data using several vertical lines of varying widths. Using only one dimension, however, leaves a large part of the symbol unused. This thesis examines a variation called 2D bar codes. Unlike 1D bar codes, these codes are able to make use of the height of the symbol, as illustrated in Figure 1.1.

Figure 1.1: (a) A 1D bar code only stores data in one direction while (b) 2D bar codes can utilize both dimensions.

This is possible because of a change in the scanning process. 1D bar code scanners capture only a horizontal scan line, whereas 2D bar codes are designed to be processed from complete 2D camera images. The increasing availability of camera phones allows users to interact with 2D bar codes in a way that was not possible before. This introduces several opportunities for new applications. A concept of such an application is explored further in this study. The idea of this concept is to allow customers to read receipts with their mobile phone. This is done by printing a 2D bar code on the receipt containing the information in digital form. As an example, having the receipt information in digital form could prove useful for personal book-keeping.

1.1 Objectives and Scope

This thesis studies a concept with the goal of gaining greater knowledge of applications based upon the use of 2D bar codes. The main objective is to determine if the studied concept is feasible. This is done by examining the concept from a technical viewpoint and also by receiving input from potential providers of the service. The feasibility is estimated on the basis of a prototype implementation illustrating the concept. To limit this thesis to the relevant parts of the previous paragraph, the scope has to be slightly narrowed to not include:

- A comparison between implementations of decoding libraries. The selection of 2D bar codes depends only on requirements posed on the capabilities in specifications, and no further selection is made on the basis of performance. The results only show that libraries exist with which at least those results can be reached.
- Tests, surveys or interviews with users. This is a subject of later evaluation and is not included in this introductory assessment.

Furthermore, the implementation is dependent on the available hardware, in this case a Java-enabled mobile phone. This restricts the software used to the Java ME platform (Micro Edition, a subset of Java for resource-constrained devices such as mobile phones).

1.2 Related Work

Kato and Tan did a study[1] on 2D bar codes in 2005. Their study tried to determine which bar code was the most suitable for mobile phone applications in general. Based on a set of requirements, they selected the VS Code specification as the most flexible. This study is more specific toward a certain concept and also has to take the construction of a prototype into consideration, which leads to a different end result than Kato and Tan's. Even though many of the requirements coincide, no Java ME decoding library was found for the VS Code, and such a library is seen as required in this study.

1.3 Outline

This introduction is followed by a background chapter introducing bar codes. Chapter 3 describes the methods used. Chapter 4 provides the concept and also the analysis and selection of two feasible bar codes. With the help of these results, the implementation process is described in Chapter 5, where a prototype bar code reader is described. The evaluation of the prototype is presented in Chapter 6 before the discussion and some concluding points in Chapters 7 and 8. Two appendices follow the report and list additional information about implementation details and tables.

Chapter 2 Background

This chapter begins by introducing bar codes in general and then focuses on 2D bar codes. It also presents some examples of what different 2D bar codes are capable of.

2.1 Bar codes

Bar codes are basically a way of cheaply printing machine-readable information on objects. There are many different kinds of bar codes, with capacities ranging from a single small number to several thousand bytes. The terms symbology and symbol are used extensively in this report. Even though the terms can in some cases be used interchangeably, there is a subtle difference. The symbology is the bar code specification and the symbol is the bar code itself, which is often printed. The symbology specifies a symbol's appearance, how it stores its data and other attributes. This section briefly introduces 1D bar code symbologies and then moves on to explain the properties of 2D bar code symbologies.

2.1.1 1D bar codes

1D, or linear, bar codes are the code type most people are familiar with. These symbologies are based on multiple side-by-side vertical bars of different widths. As an example, many retail products are labeled with a number according to the European Article Numbering[2]. This number is represented by a 13-digit 1D bar code as illustrated in Figure 2.1.

Figure 2.1: EAN-13 bar code encoding 123456789012 with check digit 8.

1D bar codes are not very space efficient since the bars do not carry any data

along the height of the symbol[3]. This is due to the scanning process, which is based on horizontal scan lines. The redundancy in the height of the symbol compensates for slightly misaligned scans.

2.1.2 2D bar codes

Even though 2D bar codes are not necessarily made up of bars, the term bar code is still widely used for the 2D version. The scanning process for 2D bar codes is different from the one for 1D bar codes. Instead of a horizontal scan line, 2D bar codes use complete 2D images for decoding[4]. This study classifies the 2D bar code types into three distinct categories: matrix, circular and arbitrarily shaped.

Matrix shaped

The majority of 2D bar code symbologies are built up of several modules arranged in a matrix. One module can represent $\log_2(c)$ bits, where $c$ is the number of colors supported by the symbology. Most commonly a module can have one of two colors, which makes each module equal to one bit. This kind of code is shown in Figure 2.2.

Figure 2.2: A schematic illustration of a typical matrix shaped 2D bar code. The black and white squares inside the data area are modules.

Different symbologies define different amounts of quiet zone around the data area, typically a margin of about two to four times the module width. This area should be empty to allow the reader to find and read the code more quickly.

Circular shaped

Circular symbols have the advantage of being more easily recalled by humans since they are not just perceived as a matrix of random blocks. Because of the lower density, these symbologies do not typically aim at handling large amounts of data. They are rather focused on storing a small identifier that is quick to read. The modules of a circular 2D bar code are parts of the circle segments composing the symbol. Figure 2.3 depicts this module arrangement.
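The module capacity rule from the matrix-shaped paragraph above can be checked with a few lines of Python. This is only a rough sketch: the function names are made up for illustration, and the second function gives an upper bound that ignores finder patterns, quiet zone and error correction, so real symbologies hold less.

```python
import math


def bits_per_module(colors: int) -> float:
    """Bits carried by one module that can take `colors` distinct colours."""
    if colors < 2:
        raise ValueError("a module needs at least two colours to carry data")
    return math.log2(colors)


def raw_capacity_bits(rows: int, cols: int, colors: int = 2) -> int:
    """Upper bound on the raw bits in a rows x cols data area.

    Structural overhead (finder patterns, error correction) is ignored.
    """
    return math.floor(rows * cols * bits_per_module(colors))


if __name__ == "__main__":
    print(bits_per_module(2))          # two colours per module
    print(raw_capacity_bits(4, 4, 4))  # hypothetical 4 x 4 field, four colours
```

Doubling the number of colors adds only one bit per module, which is why most matrix symbologies stay with two colors and grow the matrix instead.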

Figure 2.3: A schematic illustration of a circular shaped 2D bar code.

Arbitrarily shaped

Symbologies that have some other structure than those above are classified as arbitrarily shaped.

2.2 Examples of 2D Symbologies

There are two different approaches taken by the 2D symbologies, here called online and offline.

1. Online symbologies store a small identifier representing the content. The identifier is used to retrieve the content from a remote database when read. This is the same strategy employed by the earlier mentioned EAN-13 symbols. For applications involving mobile phones, this means in practice that an Internet connection has to be established to retrieve content.
2. Offline symbologies are designed to store the actual content in the code itself. It should be noted that nothing prevents an application from using offline symbologies in an online fashion, but the intention of offline code types is to support a greater variety of content. This is done mainly by allowing larger storage capacity but also other content types, such as numbers, text and raw binary data.

The symbologies studied in this thesis are summarized in Table 2.1. A smaller subset of these is presented in more detail below. The first two, ColorCode and ShotCode, are online symbologies and the other two, Quick Response code and DataMatrix, are offline.

2.2.1 ColorCode

The standard ColorCode[5] symbol consists of 5 × 5 modules. This symbol has a 4 × 4 field of data, and the last row and column are parity values for error detection (ED). With each module being one of four colors (red, blue, green or black), the number of combinations is $4^{4 \times 4} = 2^{32}$, which equals a width of 32 bits. ColorCode content retrieval is based on a central index where the decoded identifier is looked up. The information received is a pointer to the content server. This should enable a company to host its own content server. The

parity modules can detect errors, but there is no known mechanism for correcting errors. ColorCode's properties are summarized in Table 2.2.

Table 2.1: List of available 2D symbologies that were studied.

3-DI, ArrayTag, Atom tag, Aztec Code, Small Aztec Code, bcode, BeeTagg, Bullseye, Code 1, Color code, CP Code, DataGlyphs, DataMatrix, Datastrip Code, Dot Code A, EZcode, HCCB, HueCode, INTACTA.CODE, InterCode, MaxiCode, mcode, MiniCode, PDMark, PaperDisk, Optar, QR Code, Quickmark, SmartCode, Snowflake Code, ShotCode, SuperCode, Trillcode, UltraCode, VeriCode, VSCode, WaterCode

Table 2.2: A summary of ColorCode's properties and an example symbol. The symbol is available at http://www.colorcode.com.sg/technology.html.

Name               ColorCode
Company            ColorZip Media
Data capacity      32 bits excl. ED
Error detection    Parity
Error correction   None
Standard           No
Domain             Proprietary

2.2.2 ShotCode

The known variation of ShotCode[6] consists of one outer ring with the code's name. The two rings inside the outer ring contain data, and the rings in the center are used as a finder pattern. Inspecting a ShotCode suggests that each data ring can contain 24 modules, which gives a total data capacity of 48 bits for both rings. Other variations may also exist. Because ShotCode is closed, there is no information on how many of the data bits are devoted to error detection and correction or used as control bits. ShotCode's properties are summarized in Table 2.3.

2.2.3 Quick response code

A Quick Response code (also known as QR code) is constructed as a matrix of modules and is characterized by its three large squares, each one located in a separate corner. These squares are used as a finder pattern. The smallest size defined in the standard[7] is 21 × 21 modules while the bigger codes can be as large as 177 × 177 modules. There are four levels of error

detection and correction (EDAC) capability using Reed-Solomon code, as listed in Table 2.4. According to the standard, the recommended error correction level is M, which gives a good balance between capacity and reliability. QR code's properties are summarized in Table 2.5. QR code symbols are not defined for non-square versions.

Table 2.3: A summary of ShotCode's properties and an example symbol. The symbol is available at http://www.shotcode.com/download/.

Name               ShotCode
Company            OP3
Data capacity      48 bits incl. ED
Error detection    Unknown
Standard           No
Domain             Proprietary

Table 2.4: Error correction levels for Quick Response code.

Error correction level   Max data capacity (bytes)
L (7 %)                  2 953
M (15 %)                 2 331
Q (25 %)                 1 663
H (30 %)                 1 273

Table 2.5: A summary of QR code's properties and an example symbol.

Name               Quick Response code
Company            Denso Wave
Data capacity      2 331 bytes (Level M)
Error correction   Reed-Solomon
Standards          AIM ITS/97/001, ISO/IEC 18004
Domain             Public
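The QR code figures above can be tied together in a short sketch. The capacities are copied from Table 2.4. The version numbering (1 to 40) and the side-length rule of 17 + 4 × version modules are not stated in the text; they come from the QR code standard and are consistent with the 21 × 21 and 177 × 177 sizes quoted above. The function names are illustrative only.

```python
# Maximum byte capacity per error correction level, copied from Table 2.4.
QR_MAX_BYTES = {"L": 2953, "M": 2331, "Q": 1663, "H": 1273}


def qr_side_modules(version: int) -> int:
    """Side length in modules for QR versions 1..40 (21 up to 177)."""
    if not 1 <= version <= 40:
        raise ValueError("QR versions run from 1 to 40")
    return 17 + 4 * version


def capacity_cost(level: str, baseline: str = "L") -> float:
    """Fraction of the baseline capacity given up by choosing `level`."""
    return 1 - QR_MAX_BYTES[level] / QR_MAX_BYTES[baseline]


if __name__ == "__main__":
    print(qr_side_modules(1), qr_side_modules(40))
    for level in QR_MAX_BYTES:
        print(level, f"{capacity_cost(level):.1%}")
```

Read this way, moving from level L to the recommended level M gives up roughly a fifth of the maximum capacity in exchange for about twice the correction capability.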

2.2.4 DataMatrix

A DataMatrix symbol is recognized by the solid bars at the left and bottom edges and the dotted bars at the top and right edges. According to the standard for DataMatrix[8], these bars are used to detect the size and any distortions of the symbol. The minimum size is 10 × 10 modules and the maximum is 144 × 144. The data capacity for the largest code size is 1 555 bytes including error correction data. The error correction code is Reed-Solomon. The non-square versions of DataMatrix are defined for sizes up to a maximum of 16 × 48, which equals a data capacity of 47 bytes. DataMatrix's properties are summarized in Table 2.6.

Table 2.6: A summary of DataMatrix's properties and an example symbol.

Name               DataMatrix
Company            Siemens
Data capacity      1 555 bytes incl. EDAC
Error correction   Reed-Solomon
Standards          ANSI/AIM BC11, ISO/IEC 16022
Domain             Public

Chapter 3 Methods

This chapter presents the methods used during the thesis work.

3.1 Implementation

The implementation done in this study consists of two main parts, a 2D bar code generator and a prototype for a 2D bar code reader. This section presents the materials used for both of these parts.

3.1.1 Generator

The generator is developed using the Java SE 1.6.0 release 2[9] Development Kit running in a Microsoft Windows environment. The 2D bar codes are generated using these two publicly available encoding libraries:

- com.idautomation.datamatrix[10] V2006.9 DEMO.
- com.java4less.qrcode[11] Evaluation Version.

The generator has no requirements other than being able to produce the desired 2D bar code according to the symbology standards. As far as the implementation done in this study is concerned, these two encoding libraries manage that.

3.1.2 Reader

The reader is developed for a Sony Ericsson K800i camera phone[12]. The K800i has a 3.2-megapixel camera and is equipped with auto-focus. The Sony Ericsson SDK 2.5.0 for the Java ME Platform[13] is used during development and emulation. The libraries used for the decoding of 2D bar codes are:

- jp.sourceforge.qrcode[14] 0.8.
- org.semacode[15] 1.6.

Additionally, the recently released com.google.zxing[16] 0.1.2 library is used in the reader as a complement. These three libraries are picked on the basis that they are easily available and meet the platform and symbology standard

requirements. They can thus not be guaranteed to be the most capable libraries for decoding 2D bar codes.

3.2 Evaluation

The concept is evaluated with the help of the prototype. The evaluation consists of two main parts, performance tests and interviews.

3.2.1 Performance tests

All tests are performed in a typical office environment containing a single ceiling-mounted fluorescent light fixture. The receipts are printed using an HP LaserJet 1018 at 600 DPI on standard white printer paper. A receipt is assumed to be 8.0 cm wide, which is a common format in Sweden and supported by HP Receipt Printers[17] among others.

The performance tests try to determine the smallest module width possible using resolution tests, and the maximum number of items using capacity tests. The resolution tests are conducted by scanning increasingly smaller module widths until a scan fails. A failure is defined as not being able to scan the symbol successfully at least once in five tries. The capacity tests are based on the results from the resolution tests and use the same scan criterion of at least one successful scan in five tries. This criterion is an estimate of how many times an actual user can be expected to re-scan the symbol after unsuccessful scans.

As a reference, the Tasman[18] demo application version 3.50 is used in conjunction with the prototype reader during the evaluation. This is a commercial demo application which is not supported on Java ME CLDC/MIDP (Connected Limited Device Configuration, Mobile Information Device Profile). This means that the Tasman library cannot be used on the Sony Ericsson K800i. Instead, the prototype is able to store images taken with the camera for later processing on a supported platform. In this case the Tasman application is executed in a Microsoft Windows environment.

3.2.2 Interviews

A series of three interviews is used to generate input on the concept developed. All of the interviews are conducted in a semi-structured manner, as proposed by Seaman[19], with representatives from local retail stores. The interviews are performed at the location of the store. The interviewee is presented with a short introduction of the concept and the prototype. Once the interviewee understands the concept, he or she is encouraged to answer a number of questions. These questions are pre-written, but several of them are open-ended to allow room for discussion. All interviewees are assured that their names and the stores' names will remain anonymous so that this does not influence the answers. Each respondent is briefly introduced below.

Interview 1

The first respondent is a manager at a discount department store that is part of a chain of stores in Sweden. This store has customers of mixed age with a slight bias toward adults and elders.

Interview 2

The respondent in interview 2 is a store manager for a small grocery store in Sweden. Their customers are of mixed age.

Interview 3

Interviewee 3 is a store manager at a clothing company that is well established with a chain of stores located in the north of Europe. Their stores have youths as their typical customers.
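The scan criterion and the resolution test procedure from Section 3.2.1 can be written down as a small sketch. It is not the test harness used in the study: `try_scan` is a hypothetical stand-in for one scan attempt with the phone, and the module widths in the usage example are made up.

```python
from typing import Callable, Iterable, Optional


def passes(try_scan: Callable[[float], bool], width_mm: float, tries: int = 5) -> bool:
    """True if at least one of `tries` scan attempts succeeds."""
    return any(try_scan(width_mm) for _ in range(tries))


def smallest_readable_width(
    try_scan: Callable[[float], bool],
    widths_mm: Iterable[float],
    tries: int = 5,
) -> Optional[float]:
    """Walk module widths from largest to smallest and stop at the first failure.

    Returns the last width that passed, or None if even the largest failed.
    """
    best = None
    for width in sorted(widths_mm, reverse=True):
        if not passes(try_scan, width, tries):
            break
        best = width
    return best


if __name__ == "__main__":
    # Hypothetical scanner that can only read modules of 0.3 mm or wider.
    fake_scanner = lambda width: width >= 0.3
    print(smallest_readable_width(fake_scanner, [0.5, 0.4, 0.3, 0.2, 0.1]))
```

Because `any` stops at the first successful attempt, a symbol that scans on the first try costs one scan, while a failing width always costs the full five.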

Chapter 4 Selection of Feasible Symbologies

The main goal of this chapter is to describe the selection of two symbologies that fulfill a certain set of requirements. These requirements are found with the help of the concept examined in this thesis. This concept is presented next.

4.1 Concept

A concept is developed by looking at current problems that might be solved by 2D bar codes and at feedback from the brainstorming session. The concept is motivated by a problem description.

4.1.1 Problem

Paper receipts are used to record many retail business transactions today. For a customer, a receipt is often only used as a proof of purchase, and it is hard to gain any additional advantage from the detailed information available. It is difficult to replace physical receipts with some other method, mainly because there is currently no feasible alternative for every user to store that information at the time of purchase. The alternative is to store the information from the purchase in a central database and later let the customer access the information via the Internet. This entirely excludes customers who are not able to use the Internet, but it might be a viable option for the future. Having established that paper receipts will remain at least until every customer feels comfortable enough with accessing the information using the Internet, are there improvements to be made to an actual paper receipt?

4.1.2 Proposed solution

Printing a 2D bar code containing detailed information about the purchase on each receipt enables customers to easily access the information with their camera phone. The receipt can still be used as usual, but extra functionality is provided to those who prefer to use it. The 2D bar code should contain the

same purchase information as written on the receipt, but it is possible to store additional information as well. As soon as the information is in digital form, it can automatically be categorized or used in different statistics. One example is to transfer the information to a desktop computer, where detailed book-keeping could be done. This example is illustrated in Figure 4.1.

Figure 4.1: Example usage. (a) Purchase the items and receive a receipt. (b) Scan the 2D bar code on the receipt with the camera phone. (c) The information can later be transferred to, for instance, computer book-keeping.

4.2 Analysis of Requirements

For a symbology to be able to support the concept introduced in the previous section, it has to meet the requirements imposed by it. This section aims to find these requirements.

4.2.1 Variables

The analysis is based on the set of variables listed in Table 4.1. These variables are factors that are identified as varying between different symbologies.

Table 4.1: The variables that form the basis for the analysis.

Variable              Description
Data capacity         The maximum amount of storage capacity.
Data density          The ratio between data capacity and print size.
Error correction      The ability to restore damaged symbols.
Available software    Available decoding libraries.
Color print           The need for color to represent a symbol.
Character encodings   The ability to support multiple languages.
Security              Whether the symbology supports any security aspects.
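Data density, defined in Table 4.1 as the ratio between data capacity and print size, can be estimated roughly as follows. This is a sketch under stated assumptions: the 0.4 mm module width is a made-up value, not a result from this study; the quiet zone of four modules per side follows the upper end of the range given in Section 2.1.2; the capacity and receipt width are the figures quoted in Table 2.5 and Section 3.2.1.

```python
def printed_side_mm(modules: int, module_mm: float, quiet_modules: int = 4) -> float:
    """Printed side length of a square symbol, quiet zone on both sides included."""
    return (modules + 2 * quiet_modules) * module_mm


def data_density(capacity_bytes: int, side_mm: float) -> float:
    """Bytes per square centimetre of printed symbol."""
    area_cm2 = (side_mm / 10) ** 2
    return capacity_bytes / area_cm2


if __name__ == "__main__":
    RECEIPT_WIDTH_MM = 80.0            # 8.0 cm receipt, Section 3.2.1
    side = printed_side_mm(177, 0.4)   # largest QR code; 0.4 mm is hypothetical
    print("fits on receipt:", side <= RECEIPT_WIDTH_MM)
    print("bytes per cm^2:", round(data_density(2331, side), 1))
```

Since the area grows with the square of the module width, halving the module width quadruples the density, which is why the smallest readable module width matters so much for a receipt of fixed width.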

4.2.2 Concept requirements

These are the requirements imposed by the concept presented in Section 4.1.

Data capacity and density

Two ways of storing the receipt information in a 2D bar code are considered, as illustrated by Figure 4.2.

Figure 4.2: Information retrieval process for (a) an offline solution, and (b) an online solution.

One way is to do it offline, that is, to store all of the information in the symbol itself. Another approach is to store only an identifier, which is later looked up in a remote database to retrieve the actual content. Both solutions have advantages and disadvantages:

Symbol size. Offline symbols are highly restricted by the symbol size. The amount of information that can be stored in the symbol is limited, which causes problems with receipts containing many items. In an online solution, the symbol size does not depend on the total number of items; it is always the same.

Privacy. The online solution raises privacy issues that users might feel uncomfortable with, mainly because all of their purchase information is held in a remote database. Because the information can be retrieved using only the identifier, it has to be protected by a password. This adds another step to the scanning process, and safety can still not be guaranteed. The offline solution does not have this issue, since it raises no worse privacy concerns than a normal receipt.

Life-time. How long should a user's information be stored? The online solution stops working as soon as the information in the database is purged, while the offline solution works as long as the receipt can be scanned.

Scalability. With the online solution, the database infrastructure has to be extended as the system gains more customers. The offline solution is entirely independent of the number of simultaneous customers.

Availability. An online solution requires access to a database, which in practice means access to the Internet, to function. This might introduce costs for the customer and thus reduce the desire to use it. The offline solution has all the necessary information inside the symbol itself.

When weighing the advantages against the disadvantages, it is clear that the offline solution has many advantages. The only real advantage of the online solution is the constant symbol size, which, as will be described, can be handled in the offline solution without ever being worse than the online one. As mentioned earlier, an offline symbology can always act in an online fashion, but not necessarily the reverse. For the analysis, choosing the offline solution means that the symbol has to hold as much data in as small an area as possible.

Error correction

Error correction is the ability to reconstruct data from a damaged symbol. If a symbology does not implement error correction, one single damaged bit in a symbol would make it unusable. The advantage would be a larger total data capacity, since no data redundancy is needed, but this extra capacity is useless if the data gets damaged. Considering that receipts are exposed to folds and creases, error correction is seen as required for this concept.

Color print

Receipt printers are generally not able to print multiple colors, because this would make the printer more expensive for a functionality that is not really needed. Because of this, symbologies that rely on colors other than black and white for encoding will not be used.

Character encodings and security

Character encodings and security are not considered to be required. An implementation of the concept will have to be proprietary, since there is no standard for encoding receipt information into 2D bar codes.
This means that the data can be encoded (using, for instance, UTF-8[20] for text) or encrypted in any way desired; it only depends on how the 2D bar code reader chooses to interpret the data.

Summary

Table 4.2 summarizes the requirements the concept poses on the reader.

4.2.3 Delimitations

The Java ME delimitation introduced in Section 1.1 will affect the selection of symbologies. To be able to implement a prototype for a Java enabled mobile phone, there has to exist a publicly available Java ME decoding library for the selected symbology. According to a press release[21] from Sun Microsystems, the Java platform was supported by 250 million mobile phones as early as 2004.

Table 4.2: Requirements introduced by the concept.

| Variable | Requirement |
|---|---|
| Data capacity | High. |
| Data density | High. |
| Error correction | Requires error correction. |
| Color print | Requires non-color print. |

Given this installed base, the Java ME requirement does not automatically imply a limited customer base.

Summary

The delimitations pose one real requirement on the reader, listed in Table 4.3.

Table 4.3: Requirements introduced by the delimitations.

| Variable | Requirement |
|---|---|
| Available software | Public Java ME decoding library. |

4.2.4 Final requirements

Inspecting the results from the previous sections shows that there are no conflicting requirements. Therefore, the list of final requirements in Table 4.4 can be compiled directly from Tables 4.2 and 4.3.

Table 4.4: Tables 4.2 and 4.3 compiled into the final requirements list.

| Variable | Requirement |
|---|---|
| Data capacity | High. |
| Data density | High. |
| Error correction | Requires error correction. |
| Available software | Public Java ME decoding library. |
| Color print | Requires non-color print. |
| Character encodings | Advantageous but not required. |
| Security | Advantageous but not required. |

This list will assist in the selection process of feasible symbologies presented in the following section.

4.3 Selection

The task in this section is to combine the requirements from Table 4.4 with the available symbologies studied in Section 2.2 in order to exclude all unfeasible symbologies. The result will be the symbologies that fulfill all the criteria.

4.3.1 Data capacity and density

The first variables are the data capacity and the data density of a symbology. According to the final requirements list, both should be high. This effectively excludes all online symbologies because of their limited storage capabilities. Determining exactly where the limit for high data capacity lies is difficult at this stage; therefore, all symbologies that can be classified as offline are included. The exclusion of all online symbologies leaves the symbologies listed in Table 4.5.

Table 4.5: Symbologies that passed the data capacity requirement.

Aztec Code, DataMatrix, Datastrip Code, EZcode, HCCB, INTACTA.CODE, MaxiCode, mcode, Optar, Quickmark, Quick Response Code, SmartCode, Snowflake Code, SuperCode, Trillcode, UltraCode, VeriCode, VS Code

4.3.2 Error correction

Error correction is important for the user experience, as discussed in the analysis. Some symbologies could not be classified with certainty because detailed specifications for the non-standardized symbologies were hard to find. Only the symbologies with confirmed error correction capabilities are accepted; they are listed in Table 4.6.

Table 4.6: Remaining symbologies that passed the error correction requirement.

Aztec Code, DataMatrix, MaxiCode, Quickmark, Quick Response Code, Snowflake Code, SuperCode, UltraCode, VS Code

4.3.3 Color print

Symbologies relying on color or different shades of gray are excluded because of receipt printers' inability to produce color prints. The symbologies that require color are UltraCode and HCCB; of these, only UltraCode was still remaining after the error correction step. Table 4.7 lists the symbologies that passed this requirement.

Table 4.7: Remaining symbologies that passed the non-color requirement.

Aztec Code, DataMatrix, MaxiCode, Quickmark, Quick Response Code, Snowflake Code, SuperCode, VS Code

4.3.4 Available software

The symbologies that have passed all requirements thus far, including having easily available Java ME compatible libraries for download, are listed in Table 4.8.

Table 4.8: Remaining symbologies that passed the Java ME library requirement.

DataMatrix, Quick Response Code

4.3.5 Result

According to the selection process based on the analysis of the requirements, two symbologies passed all criteria. Their main properties are summarized in Table 4.9.

Table 4.9: Symbologies that pass all the requirements.

| Variable | Quick Response | DataMatrix |
|---|---|---|
| Data capacity | 2 331 bytes (level M) | 1 555 bytes (incl. EDAC) |
| Data density | High | High |
| Error correction | Reed-Solomon | Reed-Solomon |
| Color print | No | No |
| Available software | Java ME library | Java ME library |

The Quick Response and DataMatrix symbologies are very similar. Both are standardized and widely supported, and they visually resemble each other. The largest differences are that Quick Response code has higher capacity and DataMatrix is slightly denser. Visually they can be distinguished by the

three square finder patterns, which are present only on the Quick Response code. Both symbologies will be used when implementing the prototype.
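The elimination steps of Section 4.3 can be read as successive filters over the set of candidate symbologies. The following is a minimal sketch of that process, not the procedure actually used in the thesis. The symbology names are transcribed from Tables 4.5 to 4.8 and from Section 4.3.3, while the function name `select` and the set names are illustrative assumptions.

```python
# Symbology names transcribed from Tables 4.5-4.8 and Section 4.3.3.
# The set and function names are hypothetical, chosen for illustration.
OFFLINE_CAPABLE = {
    "Aztec Code", "DataMatrix", "Datastrip Code", "EZcode", "HCCB",
    "INTACTA.CODE", "MaxiCode", "mcode", "Optar", "Quickmark",
    "Quick Response Code", "SmartCode", "Snowflake Code", "SuperCode",
    "Trillcode", "UltraCode", "VeriCode", "VS Code",
}
CONFIRMED_ERROR_CORRECTION = {
    "Aztec Code", "DataMatrix", "MaxiCode", "Quickmark",
    "Quick Response Code", "Snowflake Code", "SuperCode", "UltraCode",
    "VS Code",
}
REQUIRES_COLOR = {"UltraCode", "HCCB"}
HAS_JAVA_ME_LIBRARY = {"DataMatrix", "Quick Response Code"}


def select(candidates, *filters):
    """Apply each requirement in order, keeping only symbologies that pass."""
    remaining = set(candidates)
    for keep in filters:
        remaining = {name for name in remaining if keep(name)}
    return remaining


feasible = select(
    OFFLINE_CAPABLE,
    lambda name: name in CONFIRMED_ERROR_CORRECTION,  # Section 4.3.2
    lambda name: name not in REQUIRES_COLOR,          # Section 4.3.3
    lambda name: name in HAS_JAVA_ME_LIBRARY,         # Section 4.3.4
)
print(sorted(feasible))
```

Because the requirements do not conflict, the order of the filters does not change the final result; it only changes the intermediate lists.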

Chapter 5 Implementation

The implementation is done in two separate parts: a 2D bar code generator and a 2D bar code reader. As illustrated by Figure 5.1, the generator encodes receipt information into a symbol, while the reader decodes the symbol back into the original information. The generator and reader implementations will later serve as tools for the evaluation.

Figure 5.1: High level schematic of the system. The 2D bar code symbol is used as a medium to digitally store the receipt information.

A common format specification of how the information is structured in a 2D bar code symbol is necessary for the reader to be able to interpret symbols created by the generator. The specification is located in Appendix A.

5.1 2D Bar Code Generator

To facilitate the evaluation, the generator is written so that it can create symbols containing custom receipt information with any module width and margin size. The basic steps of the generation process are shown in Figure 5.2.

Figure 5.2: The process of generating a 2D bar code symbol from receipt information.

5.1.1 Serialization

The purpose of serialization is to efficiently flatten an object structure into a stream of bytes. The serialization follows the format specification, which allows the reader to interpret the data correctly. Serialization is necessary because the next steps operate on bytes and not on the object structure.

5.1.2 Compression

The serialized data is then compressed with the simple prefix code scheme introduced by Huffman[22]. Compressing small amounts of data does not typically save any space, because of the overhead of storing the frequency table. When larger amounts of data are compressed, enough space is saved to fit the frequency table and still reduce the total size. Compression is only used if the total compressed size is smaller than the original data; otherwise compression is skipped.

5.1.3 Visualization

As a last step, the 2D bar code is generated using the encoding library for the specified symbology. The encoding libraries are hidden behind a common interface, which simplifies adding and changing encoding routines.

5.2 Receipt Reader Prototype

The reader, which is designed to run on a camera phone, reverses the work done by the generator. This way, the camera phone is able to reconstruct the receipt information from a 2D bar code symbol.

5.2.1 Recognition

This step recognizes and extracts the byte data from the scanned 2D bar code. The first byte in the data sequence is tested against a pre-defined signature byte. If it is a match, the data is assumed to contain receipt information. Otherwise

the reader skips the next two steps in the decoding process and shows the data as plain text. That allows the reader to read normal plain text content as well. Like the generator, the reader hides the decoding libraries behind an interface, which makes it possible to add or change decoders without any rewriting.

Figure 5.3: Reversing the generation process to transform a 2D bar code back to receipt information.

5.2.2 Decompression

If the data was compressed by the generator, the frequency table is read and the payload data is decompressed before further processing. Otherwise, this step is skipped.

5.2.3 Unserialization

At this stage, all the data is ready for unserialization. By following the format specification, the reader is now capable of completely recreating all of the necessary receipt information.

5.2.4 User interaction

An example usage is visualized in Figure 5.4. A customer has purchased three items at the local food store. After the receipt has been scanned, it is stored and organized in the phone.

Figure 5.4: Prototype running in a mobile phone emulator. (1) Find and scan a bar code through the camera's viewfinder, (2) view the purchase information, (3) pick a category and (4) see the categorized receipts.
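The generator and reader steps described in this chapter (serialization, optional compression, recognition by signature byte, decompression and unserialization) can be illustrated with a short round-trip sketch. It is not the prototype's code, which targets Java ME, and it rests on several assumptions: the signature value `0xA5`, the flag bit, the 6-byte date layout, the 2-byte item count and the integer price in minor currency units are all hypothetical, chosen only so that the field sizes agree with the sizes used in Chapter 6 (one signature byte, one flags byte, 8 bytes for date and item count together, 4 bytes per price, strings stored as a 2-byte length plus content). The actual layout is defined in Appendix A. `zlib` is used as a stand-in for the Huffman coder.

```python
import struct
import zlib

SIGNATURE = 0xA5        # hypothetical signature byte
FLAG_COMPRESSED = 0x01  # hypothetical flag bit marking a compressed payload


def pack_string(text):
    """Store a string as a 2-byte length followed by its UTF-8 bytes."""
    raw = text.encode("utf-8")
    return struct.pack(">H", len(raw)) + raw


def unpack_string(data, pos):
    """Read a length-prefixed string; return it with the next read position."""
    (length,) = struct.unpack_from(">H", data, pos)
    start = pos + 2
    return data[start:start + length].decode("utf-8"), start + length


def serialize(store, date, items):
    """Flatten a receipt into bytes (assumed layout, see lead-in).

    date is (year, month, day, hour, minute); items is [(name, price), ...].
    """
    payload = pack_string(store)
    payload += struct.pack(">HBBBB", *date)   # 6 bytes
    payload += struct.pack(">H", len(items))  # 2 bytes
    for name, price in items:
        payload += pack_string(name) + struct.pack(">I", price)
    return payload


def encode(store, date, items):
    """Generator side: serialize, then compress only if that saves space."""
    payload = serialize(store, date, items)
    packed = zlib.compress(payload, 9)  # stand-in for Huffman coding
    if len(packed) < len(payload):
        return bytes([SIGNATURE, FLAG_COMPRESSED]) + packed
    return bytes([SIGNATURE, 0]) + payload


def decode(data):
    """Reader side: recognize, decompress if flagged, then unserialize."""
    if len(data) < 2 or data[0] != SIGNATURE:
        # Not a receipt symbol: show the content as plain text instead.
        return ("text", data.decode("utf-8", errors="replace"))
    payload = data[2:]
    if data[1] & FLAG_COMPRESSED:
        payload = zlib.decompress(payload)
    store, pos = unpack_string(payload, 0)
    date = struct.unpack_from(">HBBBB", payload, pos)
    pos += 6
    (count,) = struct.unpack_from(">H", payload, pos)
    pos += 2
    items = []
    for _ in range(count):
        name, pos = unpack_string(payload, pos)
        (price,) = struct.unpack_from(">I", payload, pos)
        pos += 4
        items.append((name, price))
    return ("receipt", store, date, items)
```

For a receipt with three items, `decode(encode(store, date, items))` returns the original fields, while a byte sequence that does not start with the signature byte is returned as plain text. A single signature byte is a weak test, so a real reader would probably also want to validate the decoded structure.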

Chapter 6 Evaluation of Concept

This chapter provides an evaluation of the concept, using the prototype described in Chapter 5 as the main tool. Performance tests are used to evaluate the technical capabilities, and interviews are conducted to get input from a few actual retailers.

6.1 Performance Tests

The performance tests are used to evaluate the technical aspect of the concept. The purpose is to study what such an application could handle in terms of capacity, on a prototype level. This will later help to make a better estimate of what can be expected from a concrete implementation. To help the reasoning, three variables are defined in Equation 6.1 to represent the different measurements of a symbol.

\[
\begin{aligned}
x &= \text{module width},\\
w &= \text{symbol width (excluding margin)},\\
n &= \text{number of modules on the x-axis}.
\end{aligned}
\tag{6.1}
\]

Only the width is discussed, but the argument applies to both the width and the height, since all symbols are considered to be square. Non-square versions are, according to Sections 2.2.3 and 2.2.4, only defined for small data sizes or not at all, and are thus not studied.

6.1.1 Resolution test

The goal of this first test is to determine how small a module width the decoders can handle and to estimate how important auto-focus is. The symbols were generated with 100 bytes of data and increasingly smaller module widths. The results of scanning these symbols are presented in Table 6.1 for DataMatrix and in Table 6.2 for Quick Response code. The results suggest that the decoding libraries chosen can perform down to 700 µm for DataMatrix and 600 µm for Quick Response code using 100 bytes of data. Repeating the resolution test with the phone's auto-focus option turned off yields the results listed in Table 6.3 and Table 6.4. Turning auto-focus off severely increases the module width needed. In all cases, the module width is at least doubled. This means that for 100 bytes

of data, turning off auto-focus increases the surface area needed for a symbol by at least a factor of four.

Table 6.1: Resolution test results for DataMatrix with 100 bytes of data. Using auto-focus. Decoders tested: semacode and Tasman.

| Module width (µm) | Decoders that passed |
|---|---|
| 1 000 | 2 of 2 |
| 900 | 2 of 2 |
| 800 | 2 of 2 |
| 700 | 2 of 2 |
| 600 | 1 of 2 |
| 500 | 1 of 2 |
| 400 | 1 of 2 |
| 300 | 0 of 2 |
| 200 | 0 of 2 |

Table 6.2: Resolution test results for Quick Response code with 100 bytes of data. Using auto-focus. Decoders tested: qrcode, zxing and Tasman.

| Module width (µm) | Decoders that passed |
|---|---|
| 1 000 | 3 of 3 |
| 900 | 3 of 3 |
| 800 | 2 of 3 |
| 700 | 2 of 3 |
| 600 | 2 of 3 |
| 500 | 1 of 3 |
| 400 | 1 of 3 |
| 300 | 0 of 3 |
| 200 | 0 of 3 |

Table B.1 and Table B.2 in Appendix B illustrate how the width of the symbol relates to the amount of data it contains. These tables are constructed with the module widths found in Table 6.1 and Table 6.2 and by using the relationship in Equation 6.2.

\[
w = x \cdot n \tag{6.2}
\]

Assuming that a receipt, leaving room for the margins, is 7.5 centimeters wide, Tables B.1 and B.2 show how much data each symbology can handle at the specified module width. According to the tables, a DataMatrix symbol can handle 813 bytes, while a Quick Response symbol is able to store up to 1 125 bytes.

6.1.2 Capacity test

In the worst case, the 2D bar code generator is not able to compress the payload data stored in a receipt symbol. Following the format specification in Table A.1, the total storage size (in bytes) is calculated with the expression in Equation

6.3.

\[
\begin{aligned}
\mathrm{size}(\mathrm{Code}) &= \mathrm{size}(\mathrm{Signature}) + \mathrm{size}(\mathrm{Flags}) + \mathrm{size}(\mathrm{Payload}) \\
&= 2 + \mathrm{size}(\mathrm{Payload}) \\
&= 2 + \mathrm{size}(\mathrm{StoreName}) + \mathrm{size}(\mathrm{Date}) + \mathrm{size}(\mathrm{NumItems}) + \mathrm{size}(\mathrm{ItemEntries}) \\
&= 2 + \mathrm{size}(\mathrm{StoreName}) + 8 + \mathrm{NumItems} \cdot \bigl(\mathrm{avgsize}(\mathrm{ItemName}) + \mathrm{size}(\mathrm{ItemPrice})\bigr) \\
&= 10 + \mathrm{size}(\mathrm{StoreName}) + \mathrm{NumItems} \cdot \bigl(4 + \mathrm{avgsize}(\mathrm{ItemName})\bigr)
\end{aligned}
\tag{6.3}
\]

Here size is a function that maps a field to the size in bytes of that field, and avgsize(ItemName) is the average size of all ItemNames. An expression for the number of items can be derived by rearranging Equation 6.3 into Equation 6.4.

\[
\mathrm{NumItems} = \frac{\mathrm{size}(\mathrm{Code}) - 10 - \mathrm{size}(\mathrm{StoreName})}{4 + \mathrm{avgsize}(\mathrm{ItemName})}
\tag{6.4}
\]

Table 6.3: Resolution test results for DataMatrix with 100 bytes of data. Not using auto-focus. Decoders tested: semacode and Tasman.

| Module width (µm) | Decoders that passed |
|---|---|
| 1 800 | 2 of 2 |
| 1 700 | 2 of 2 |
| 1 600 | 2 of 2 |
| 1 500 | 2 of 2 |
| 1 400 | 1 of 2 |
| 1 300 | 1 of 2 |
| 1 200 | 0 of 2 |
| 1 100 | 0 of 2 |
| 1 000 | 0 of 2 |

Table 6.4: Resolution test results for Quick Response code with 100 bytes of data. Not using auto-focus. Decoders tested: qrcode, zxing and Tasman.

| Module width (µm) | Decoders that passed |
|---|---|
| 2 000 | 3 of 3 |
| 1 900 | 3 of 3 |
| 1 800 | 2 of 3 |
| 1 700 | 2 of 3 |
| 1 600 | 2 of 3 |
| 1 500 | 2 of 3 |
| 1 400 | 1 of 3 |
| 1 300 | 1 of 3 |
| 1 200 | 0 of 3 |

The number of items that can be stored in a specified amount of space thus depends on the length of StoreName and the ItemNames. An item name on a paper

receipt is seldom longer than what fits on one line, including the price. Considering this, assuming 20 characters as an average length for item names is not unreasonable. A store name can also typically be stored using 20 characters. The format specification defines the size of a string to be size(string) = 2 + Length, which directly allows the number of items to be calculated in Equation 6.5 for size(Code) = 1 125 bytes, as indicated by the resolution test.

\[
\mathrm{NumItems} = \frac{1125 - 10 - 22}{4 + 22} \approx 42
\tag{6.5}
\]

Rephrased, this means that when using a Quick Response code on a 7.5 cm wide area with a module width of 600 µm, it should be possible to read a receipt containing 42 average sized items with the jp.sourceforge.qrcode library. To confirm these calculations, a symbol with the specifications mentioned above is constructed. The symbol can be read correctly using the prototype. Figure 6.1 shows such a successful scan.

Figure 6.1: Image representing an actual picture from a successful scan of a QR code containing 1 125 bytes.

According to Table B.2, the next code size level is 129 × 129 modules, which on 7.5 cm gives an effective module width of approximately 581 µm. Repeating the test with this size failed, which indicates that 42 average sized items is the theoretical upper limit under these conditions.

Trying to scan an 813-byte code with the DataMatrix decoding library failed. Even sizes that this decoder should handle seem to fail, which indicates that this decoding library malfunctions for some unknown reason at larger data sizes. Thus only the capacity results for Quick Response code are valid.

6.2 Interviews

The other part of the evaluation is to present the concept and the prototype to local store managers. Their input complements the performance tests and provides suggestions on the service as a whole. The important parts of each interview are briefly described below.

6.2.1 Interview 1

Interview 1 was conducted with a store manager at a discount department store. This respondent did not see much use for their customers in being able to read the receipt with their phones. Many of the customers did not seem to save their receipts, and the store manager did not expect that to change drastically with this system, mainly because reading every receipt seemed a bit cumbersome. Other types of stores were mentioned as possibly benefiting more from such a system, such as durables stores and clothing stores, where purchases are less frequent. In general, the respondent was positive toward reading bar codes with a camera phone on objects such as business cards.

6.2.2 Interview 2

This respondent is the store manager of a small grocery store. The respondent believed that reading bar codes with a phone was good, but was not certain whether this application would suit the store. The respondent suggested that the bar code could be printed on the back of the receipt, which would not cause any loss of space or require a larger paper area. Another point mentioned was that informing the customers about the benefits could be hard, which might lead to slower user adoption. If the customers do not see why or how to use it, such a system is useless.

6.2.3 Interview 3

The respondent in interview 3 is a store manager at a clothing store. The respondent was overall positive toward reading bar codes with camera phones, including on receipts. Concerning the concept, the respondent felt that the book-keeping example is not very interesting for most people. One service that was mentioned as more relevant is to simply store the receipt in the phone. This would free the customer from keeping track of all paper receipts when returning an item. Customers are not allowed to make a return without a receipt, which can cause dissatisfaction if the receipt has been lost. Storing the receipt in their mobile phone could help to relieve this issue.
The respondent also stated that some additional services were needed to increase the value for the store and also for the customer. These additional services might include advertisement, special offers, product returns and coupons.

6.2.4 Interview summary

The most important inputs from the interviews are:

1. Stores with infrequent purchases might benefit more from such a system.
2. Additional services are necessary to generate any surplus value for the store, such as:
   - Advertisement and special offers: Include some advertisement or special offers in the symbol.
   - Warranty and returns: Store the receipt information in a safe way to enable the customer to use it for warranty claims and product returns.

   - Coupons: Coupon offers to help attract customers to the store.
3. Informing customers about the scanning and getting them to adopt it might be difficult.
4. The respondents seem to be generally positive toward reading bar codes with camera phones.

Chapter 7 Discussion

This study has looked at using camera phones and 2D bar codes as an enabler for new kinds of services. The main focus is on the concept presented in Section 4.1. This concept augments normal paper receipts with a 2D bar code to provide additional services for such common objects as receipts. The implementation part produced a prototype version of a reader for symbols containing receipt information. This prototype is used during the evaluation to help estimate how such a service would perform.

7.1 Feasibility

When looking at the prototype introduced in Chapter 5, it is clear that the prototype can scan and interpret receipt information stored in a 2D bar code. The prototype thus works as a proof of concept. Whether it is feasible from a practical viewpoint is, however, not clear without examining the results from the evaluation.

The evaluation establishes that the Quick Response decoding library used for the prototype can read 42 average sized items from an 8.0 cm receipt. If the prototype were good enough to read the symbols containing the largest possible amount of data, it would be enough to store 82 average sized items. Having this hard upper limit is a problem, since there could in theory be a huge number of items on a receipt. There are at least five solutions to apply when all of the items cannot fit in one symbol:

- Combine several items of the same product into one single item. No real loss; should always be done by the receipt generator.
- Store only product categories instead of every single product. Loses the specific product information.
- Extend the receipt with one or more additional symbols. Might increase the paper receipt size and the time spent scanning.
- Store only the total sum. Removes the upper limit but loses all product information.
- Delay the lookup of the product names until after the scan of the receipt. Saves space but needs online access.
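The capacity figure discussed above follows from the size expression in Section 6.1.2, and the first of the listed solutions is straightforward to apply before encoding. The following minimal sketch shows both. The function names are illustrative assumptions; the constants (10 fixed bytes, 4 bytes per price, strings stored as 2 bytes plus their length) are taken from Equations 6.3 to 6.5. Representing a merged item as a (name, quantity, total price) triple is also an assumption, since the format specification in Appendix A is not reproduced here.

```python
def string_size(length):
    """Size in bytes of a stored string: 2-byte length prefix plus content."""
    return 2 + length


def max_items(code_size, store_name_len, avg_item_name_len):
    """Whole number of items that fit in code_size bytes (Equation 6.4)."""
    fixed = 10 + string_size(store_name_len)
    per_item = 4 + string_size(avg_item_name_len)
    return (code_size - fixed) // per_item


def module_width_um(symbol_width_mm, modules):
    """Effective module width in micrometers, from w = x * n (Equation 6.2)."""
    return symbol_width_mm * 1000 / modules


def merge_duplicates(items):
    """Combine repeated products into one (name, quantity, total) entry."""
    merged = {}
    for name, price in items:
        quantity, total = merged.get(name, (0, 0))
        merged[name] = (quantity + 1, total + price)
    return [(name, q, t) for name, (q, t) in merged.items()]


# 1 125 bytes with 20-character store and item names, as in Section 6.1.2.
print(max_items(1125, 20, 20))
# A 129 x 129 module symbol printed on 7.5 cm.
print(round(module_width_um(75, 129)))
```

With the values used in Section 6.1.2 the sketch reproduces the 42 items of Equation 6.5 and the effective module width of about 581 µm for the next code size level.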