Paninimania: sticker rarity and cost-effective strategy



Similar documents
A Study to Predict No Show Probability for a Scheduled Appointment at Free Health Clinic

Managerial Economics Prof. Trupti Mishra S.J.M. School of Management Indian Institute of Technology, Bombay. Lecture - 13 Consumer Behaviour (Contd )

Graphical Integration Exercises Part Four: Reverse Graphical Integration

CHI-SQUARE: TESTING FOR GOODNESS OF FIT

The fundamental question in economics is 2. Consumer Preferences

Projects Involving Statistics (& SPSS)

Chapter 3 RANDOM VARIATE GENERATION

How To Check For Differences In The One Way Anova

Process Capability Analysis Using MINITAB (I)

BA 275 Review Problems - Week 6 (10/30/06-11/3/06) CD Lessons: 53, 54, 55, 56 Textbook: pp , ,

Variables Control Charts

Permutation Tests for Comparing Two Populations

Outline. Definitions Descriptive vs. Inferential Statistics The t-test - One-sample t-test

Statistical Impact of Slip Simulator Training at Los Alamos National Laboratory

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

SIMULATION STUDIES IN STATISTICS WHAT IS A SIMULATION STUDY, AND WHY DO ONE? What is a (Monte Carlo) simulation study, and why do one?

1 Another method of estimation: least squares

Introduction to Hypothesis Testing OPRE 6301

Matching Investment Strategies in General Insurance Is it Worth It? Aim of Presentation. Background 34TH ANNUAL GIRO CONVENTION

Difference tests (2): nonparametric

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 5 9/17/2008 RANDOM VARIABLES

Standard Deviation Estimator

CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE

LOGISTIC REGRESSION ANALYSIS

LAB : THE CHI-SQUARE TEST. Probability, Random Chance, and Genetics

START Selected Topics in Assurance

Inference for two Population Means

Two-sample inference: Continuous data

Testing Research and Statistical Hypotheses

12.5: CHI-SQUARE GOODNESS OF FIT TESTS

3.4 Statistical inference for 2 populations based on two samples

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Examples on Monopoly and Third Degree Price Discrimination

ECON 459 Game Theory. Lecture Notes Auctions. Luca Anderlini Spring 2015

3. Mathematical Induction

Chapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing

Sample Size and Power in Clinical Trials

A simple analysis of the TV game WHO WANTS TO BE A MILLIONAIRE? R

Test Positive True Positive False Positive. Test Negative False Negative True Negative. Figure 5-1: 2 x 2 Contingency Table

Sample Questions with Explanations for LSAT India

Recall this chart that showed how most of our course would be organized:

Colored Hats and Logic Puzzles

LOGNORMAL MODEL FOR STOCK PRICES

Comparing Means in Two Populations

arxiv: v1 [cs.ir] 12 Jun 2015

Simple linear regression

WISE Power Tutorial All Exercises

10. Comparing Means Using Repeated Measures ANOVA

Lab 11. Simulations. The Concept

The Graphical Method: An Example

Methodology Overview

EXPONENTIAL FUNCTIONS

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Betting with the Kelly Criterion

The Bayesian Approach to Forecasting. An Oracle White Paper Updated September 2006

For additional information, see the Math Notes boxes in Lesson B.1.3 and B.2.3.

Systems of Equations Involving Circles and Lines

Fatigue Analysis of an Inline Skate Axel

p-values and significance levels (false positive or false alarm rates)

IS YOUR DATA WAREHOUSE SUCCESSFUL? Developing a Data Warehouse Process that responds to the needs of the Enterprise.

Transmission Line and Back Loaded Horn Physics

Solutions to Homework 10 Statistics 302 Professor Larget

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

Analysis of Variance (ANOVA) Using Minitab

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Tests for Two Proportions

Duration and Bond Price Volatility: Some Further Results

Simulation: An Effective Marketing Tool

Descriptive Analysis

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing

Fixed-Effect Versus Random-Effects Models

A Comparative Analysis of Speech Recognition Platforms

Odds ratio, Odds ratio test for independence, chi-squared statistic.

Exact Nonparametric Tests for Comparing Means - A Personal Summary

Power Management in Cloud Computing using Green Algorithm. -Kushal Mehta COP 6087 University of Central Florida

Special Situations in the Simplex Algorithm

Non-Inferiority Tests for One Mean

Fairfield Public Schools

Hypothesis Testing for Beginners

Chapter 19 The Chi-Square Test

Linear Programming. Solving LP Models Using MS Excel, 18

Elasticity. I. What is Elasticity?

RANDOM-NUMBER GUESSING AND THE FIRST DIGIT PHENOMENON1

General Method: Difference of Means. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n 1, n 2 ) 1.

Statistics 2014 Scoring Guidelines

Direct Evidence Delay with A Task Decreases Working Memory Content in Free Recall

CHAPTER 7 SUGGESTED ANSWERS TO CHAPTER 7 QUESTIONS

ECONOMIC SUPPLY & DEMAND

COMPUTING DURATION, SLACK TIME, AND CRITICALITY UNCERTAINTIES IN PATH-INDEPENDENT PROJECT NETWORKS

Chapter 27: Taxation. 27.1: Introduction. 27.2: The Two Prices with a Tax. 27.2: The Pre-Tax Position

AP Statistics Solutions to Packet 2

Abstract. In this paper, we attempt to establish a relationship between oil prices and the supply of

Games, gadgets, and other goods discount coupon: An ethics case

Performance Analysis of the IEEE Wireless LAN Standard 1

Information differences between closed-ended and open-ended survey questions for high-technology products

Tutorial Paper on Quantitative Risk Assessment. Mohammad Reza Sohizadeh Abyaneh Seyed Mehdi Mohammed Hassanzadeh Håvard Raddum

Transcription:

Paninimania: sticker rarity and cost-effective strategy Sylvain Sardy and Yvan Velenik Section de mathématiques Université de Genève Abstract We consider some issues related to the famous Panini stickers devoted to the football world cup. In particular, we address the following questions: is there a planned shortage of some stickers? What is a good cost-effective strategy to fill in an album? 1 Introduction The collectors frenzy over Panini s stickers is now almost a tradition with each new football world cup [1]. In this note we discuss the alleged rarity of certain stickers (famous players, etc. and propose a cost-effective strategy. For the purpose of the discussion below, we recall that in Switzerland the stickers can be purchased in three different ways: buying one packet of different stickers for CHF 1; buying one box of 00 different stickers 1 for CHF 100 (actually prices as low as CHF 70 can be found; buying specific individual stickers directly from Panini at a cost of CHF 0.30 apiece with, however, a limitation to at most 0 stickers. The album comprises different stickers. 2 Rare stickers: a myth? 2.1 Testing for uniformity To test whether all stickers appear with the same frequency, 12 boxes (three in four different Swiss cantons of 100 packets containing each five different stickers have been collected, which amounts to a total of 6000 stickers. Because of the dependence induced by the fact that stickers bought in a single box are all different, we cannot perform a standard chi-square test. Instead, we perform a Monte-Carlo simulation to estimate the null distribution under the hypothesis that stickers are uniformly distributed overall with the constraints that there are no duplicates neither within a packet nor within a box. Hence we generate 12 independent boxes with such constraints, count the total number of times N i, i = 1,...,, each sticker occurs, and calculate the statistics S = i=1 (N i e i 2 e i with e i = 6000. 1 Although this is true for the analyzed boxes, it seems that some boxes may occasionally contain a few duplicates. We ignore this issue here. 1

Repeating this experiment M = 10 4 times provides an estimate of the null distribution that we plot on the left graph of Figure 1. The statistics s = 138 calculated on the collected stickers is also plotted and amounts to a p-value of 0.9974, so that we do not reject the null hypothesis that stickers are uniformly distributed. Interestingly the p-value is large, which can be explained by the fact that in two of the four cantons, it was possible to complete an entire album by buying three boxes. If boxes were independent of each other the probability of such an event would be very small: 160 k=0 ( k 160 ( 00 k ( 340+k 00 00+k ( 0 2 3.7 10. (1 Looking at the serial number of the boxes reveals that successive boxes seem to be dependent, which violates the assumption of the previous test. For a more accurate test, we start by dropping the data from the two cantons where we were able to complete an entire album, and repeat the test. The right graph of Figure 1 now shows the estimated null distribution along with the statistics s = 168.44 based on the 3000 measurements. The p-value is now 0.123, so that we still do not reject the null hypothesis. Data from 4 cantons Data from 2 cantons Density 0.00 0.01 0.02 0.03 0.04 O Density 0.00 0.01 0.02 0.03 0.04 O 130 140 10 160 170 180 190 120 140 160 180 s s Figure 1: Density estimate of the distribution of the statistics S under the null hypothesis that stickers are uniformly distributed along the value of the statistics calculated from the data: (left based on the 6000 stickers of 4 cantons, in which case the independence assumption seems violated by boxes with successive serial numbers; (right based on 3000 stickers of 2 cantons. 2.2 How to explain the myth Most people who try to fill in the Panini album believe some stickers are indeed rare [1]. To explain the myth, we considered two scenarios and made some calculations. 2

Number of packets bought 233 +233 +233 +233 Number of different stickers obtained 0 +90 +17 +3 Table 1: Single person scenario. Average number of different stickers obtained as a function of the number of packets bought. Consider first the situation where a single person buys independent packets of five stickers. The average number of packets he needs to complete the album is 931 as one can read on Figure 2. And on average, with one fourth (i.e., 233 packets, he can obtain 0 different stickers; with another 233 packets, he can obtain 640 different stickers; with another 233 packets, 67 different stickers; finally another 233 packets are needed to complete the album with different stickers. These average calculations are summarized in Table 1. Hence the last three stickers missing seem rare since it takes so long to get them. Let us turn now to a more realistic scenario which takes into account sticker swapping. To model swapping we consider optimal swapping between k individuals who buy their cards together until they fill k albums. Other swapping procedures could also be considered (e.g., each individual swapping his duplicates for missing cards, but this would lead to less cost effective strategies. Consider ten friends who buy 100 packets each, one by one (as opposed to buying boxes and perform optimal swapping. Since a total of 000 stickers are bought, one has the feeling that it is unlikely that at least one sticker is missing to all the ten friends. But calculations show that this event actually occurs slightly more than 2% of the time. Hence, a group of friends that has at least one sticker missing will incorrectly believe that these stickers are rare. 3 Strategies 3.1 Without swapping Consider first the situation of a person trying to fill in his album without swapping stickers. This version is very reminiscent of the classical coupon collector problem [3, Section IX.3]. In the latter, one of n different coupons is obtained when buying one instance of some product; how many instances are necessary in order to complete the collection? This problem can easily be solved explicitly, yielding in particular an expected number of required purchases equal to n(1 + 1 2 + + 1 n n(log n + 0.77. In particular for n =, this leads to approximately 4666.3 stickers, or CHF 933.27. In our case, the situation is slightly more complicated, since different stickers are obtained each time a packet is purchased. Of course, one should not expect the latter constraint to affect significantly the conclusion, since objects sampled (with replacement from a pool of objects are all different with probability ( / 0.98. This situation has also been treated analytically [4, 2] yielding for example an expectation of the number of required packets 3

of five stickers equal to ( ( ( 1 j+1 j j=1 ( ( j 930.84. 3.2 Effect of swapping Let us now turn to the effect of swapping on the overall expected cost of filling an album. As above we consider optimal swapping between k individuals. Figure 2 shows the evolution of the expected cost per collector as a function of k. 1000 900 Expected number of required packets per person 800 700 600 00 400 300 200 1 2 3 4 6 7 8 9 10 11 12 13 14 1 16 17 18 19 20 Number of persons Figure 2: Average number of packets each of k persons have to buy in order for all of them to complete their album, under an assumption of optimal swapping. Notice that in the limiting situation of an infinite number of friends, this minimal number is actually equal to 132 = /; this value corresponds to the horizontal axis in the picture. 3.3 A good strategy In order to devise a reasonable strategy, we use the facts that it is possible to buy boxes containing 00 different stickers for a price of CHF 70 100, depending on the vendor, and that it is possible to buy from Panini up to 0 specified stickers for a unit price of CHF.30 each. 4

Numerical computations suggest the following strategy, when swapping with 9 other persons: 1 buy a box, 2 buy 40 additional packets and swap the duplicates until at most 0 stickers are missing from your collection, 3 order the missing stickers from Panini. The cost of this strategy is between CHF 12 and CHF 1, depending on the price of the box. If one ignores swapping, it is possible to derive analytically the optimal strategy [4]. References [1] À bon entendeur, Télévision Suisse Romande, 18 mai 2010, http://www.tsr.ch/emissions/abe/1968926-foot-abe-met-le-maillot-et-sortson-album-panini.html#1968926 [2] Adler, Ilan and Ross, Sheldon M.: The coupon subset collection problem, J. Appl. Probab. 38 (2001, no. 3, 737 746. [3] Feller, William: An introduction to probability theory and its applications. Vol. I., Third edition John Wiley & Sons, Inc., New York-London-Sydney 1968. [4] Stadje, Wolfgang: The collector s problem with group drawings., Adv. in Appl. Probab. 22 (1990, no. 4, 866 882.