Some Microfoundations of Collective Wisdom

Lu Hong and Scott E Page

May 12, 2008

Abstract

Collective wisdom refers to the ability of a population or group of individuals to make an accurate prediction of a future outcome or an accurate characterization of a current outcome. In some circumstances, the collective can be more accurate than any of its members. Yet, collective wisdom need not emerge in all situations. Crowds can be unwise as well as prescient. In this paper, we unpack what underpins and what undermines collective wisdom using a model of agents with predictive models. Our model extends traditional statistical approaches to characterizing collective wisdom. Within our model we demonstrate how individual sophistication/expertise and collective diversity combine to produce accuracy. Breakdowns in collective wisdom necessarily entail a lack of these two characteristics.

In describing the benefits of democracy, Aristotle observed that when individuals see distinct parts of the whole, the collective appraisal can surpass that of individuals. Centuries later, von Hayek, in describing the role of information in decentralized markets, made a related argument that suggested the market can accurately determine prices even if the average person in the market cannot (von Hayek 1945). To be sure, institutional structures such as democracies and markets rest substantially on the emergence of collective wisdom. Without a general tendency for groups of people, writ large and small, to make reasonable appraisals and decisions, democracy would be doomed. The success of democracies, and for that matter markets, provides broad stroke support that collective wisdom often does exist. Abundant anecdotal and small to large scale empirical examples also suggest at least the potential for a wisdom of crowds (Surowiecki 2004).

Collective wisdom, as we shall define it here, exists when the crowd outperforms the people in it. The logical foundations for collective wisdom are well established. First, a straightforward mathematical calculation demonstrates that the average prediction of a crowd is always at least as accurate as the crowd's average member (Page 2007). Second, this same calculation implies that, with some regularity, a crowd should outperform any member or all but a few of its members. We describe how that can be the case in detail.

Mathematics and lofty prose notwithstanding, the claim that the whole of a society or group somehow exceeds the sum of its parts strikes many as overly idealized. Any mathematician or philosopher who took a moment to venture out of his or her office would find no end of committee decisions, jury verdicts, democratic choices, and market valuations that have proven far wide of the mark. Collective wisdom, therefore, should be seen only as a potential outcome, as something that can occur when the right conditions hold.

The gap between theory and reality can be explained by the starkness of the theory. The core assumptions that drive the mathematical necessity of collective wisdom may be too convenient. In particular, the idea that people receive signals that correlate with the truth has come to be accepted without thought. And, as we shall argue, it is this assumption that creates the near inevitability of collective wisdom.

In this paper, we describe a richer theoretical model that can explain the existence of collective wisdom as well as the lack thereof. In this model, individuals possess predictive models. Hong and Page (2007) refer to these as interpreted signals to capture the fact that these predictions can be thought of as statistical signals but that their values depend on how people interpret the world. The prediction of a crowd of people can be thought of as some type of average of the models contained within those people's heads. Thus, collective wisdom depends on characteristics of the models people carry around in their heads. We show that those models must be sophisticated, and they must be diverse.

These two features refer to different units of analysis. Diversity refers to the collection seen as a whole. The people within it, or should we say their models, must differ. Sophistication/expertise refers to the capabilities of individuals within the collection. The individuals must be smart.

A lack of either characteristic requires a large increase in the other. Homogeneous crowds can only be accurate if they contain extremely sophisticated individuals, and groups of naive individuals can only be collectively accurate if they possess great diversity. [1]

The intuition for why collective wisdom requires sophisticated individuals should be straightforward. We cannot expect an intelligent whole to emerge from incompetent parts. The intuition for why diversity matters, and matters as much as it does, proves more subtle, so much so that several accounts misinterpret the mechanism through which diversity operates and others resort to hand waving. The logic for why diversity matters requires two steps. First, diverse models tend to produce negatively correlated predictions. [2] Second, negatively correlated predictions produce better aggregate outcomes. If two predictions are negatively correlated, then when one tends to be high, the other tends to be low, making the average just right.

[1] Our approach borrows ideas from ensemble learning theory. In ensemble learning, collections of models are trained to make a prediction or a classification. The predictions of the individual models are then aggregated to produce a collective prediction.

[2] In the case of a yes or no choice, Hong and Page (2008) show that when people use maximally diverse models (we formalize this in the paper), their predictions are necessarily negatively correlated.

The model we describe differs from the standard approach in political science and economics, or what we call the statistical model of aggregation. As mentioned above, in the statistical model, individuals receive signals that correlate with the value or outcome of interest. Each individual's signal may not be that accurate but in aggregate, owing to a law of large numbers logic, those errors tend to cancel. In the canonical statistical model, errors are assumed to be independent. More elaborate versions of the model include both negative and positive correlation, a modification we take up at some length as negative correlation proves to be crucial for collective wisdom.

In what follows, we first describe the statistical model of collective wisdom. This approach dominates the social science literature on voting and markets as well as the early computational literature on ensemble learning. That said, the computational scientists do a much more complete job of characterizing the contributions of diversity. Social science models tend to sweep diversity under the rug, calling it noise. In fact, we might even go so far as to say that social scientists consider diversity to be more of an inconvenience than a benefit.

We then formally define interpreted signals (Hong and Page 2008). These form the basis for what we will call the cognitive model of collective wisdom. This approach dominates the current computational science models. This cognitive model does not in any way contradict the statistical model. In fact, we rely on the statistical model as a lens through which to interpret the cognitive model.

In characterizing both types of models, we consider a general environment that includes both binary choice environments, i.e. simple yes or no choices, and cardinal estimation, such as when a collection of people must predict the value of a stock or the rate of inflation. When necessary for clarity, we refer to the former as classification problems and to the latter as estimation problems. The analysis differs only slightly across the two domains, and the core intuitions prove to be the same. We conclude our analysis with a lengthy discussion of what the theoretical results imply for the existence or lack thereof of collective wisdom in markets and democracies, and we discuss what we call the paradox of weighting. That discussion is by no means exhaustive, but is meant to highlight the value of constructing deeper microfoundations.

Before beginning, we must address two issues. First, a growing literature in political science and in economics considers the implications of and incentives for strategic voting. For the most part, we steer clear of strategic considerations. When they do come into play, we point out what their effect might be. We want to make clear from the outset that regardless of what motivates the votes cast, the possibility of collective wisdom ultimately hinges on a combination of collective diversity and individual sophistication/expertise.

Second, we would be remiss if we did not note the irony of our model's main result: that collective wisdom requires diverse and sophisticated models. Yet, in this paper, we have constructed just two models: a statistical model and a model based on individuals who themselves have models. If our theory is correct, these two cannot be enough. Far better that we have what Page (2007) calls a crowd of models. Complementing these models with historical, empirical, sociological, psychological, experimental, and computational models should provide a deeper, more accurate picture of what conditions must hold for collective wisdom to emerge. Clearly, cultural, social, and psychological distortions can also bias aggregation.

The Statistical Model of Collective Wisdom

The statistical model of collective wisdom considers the predictions or votes of individuals to be random variables drawn from a distribution. That distribution can be thought of as generating random variables conditional on some true outcome. In the most basic of models, the accuracy of the signals is captured by an error term. In more elaborate models, signals can also include a bias. In the canonical model, the signals are assumed to be independent. This independence assumption can be thought of as capturing diversity, but how that diversity translates into the signals is left implicit. More nuanced theoretical results will allow for degrees of correlation, as we shall show.

In all of the models that follow, we assume a collection of individuals of fixed size.

The set of individuals: $N = \{1, \ldots, n\}$.

The voters attempt to predict the outcome. As mentioned above, that outcome can either be a simple yes/no or it could be a numerical value.

The outcome: $\theta \in \Theta$. In classification problems $\Theta = \{0, 1\}$, and in estimation problems $\Theta = [0, \infty)$.

As mentioned, individuals receive signals. A signal can be thought of as the prediction or opinion of the individual. To distinguish these signals from those produced by cognitive models, we refer to this first type as generated signals and to the latter as interpreted signals. This nomenclature serves as a reminder that in the statistical model the signals are generated by some process that produces signals according to some distribution, whereas in the cognitive model the signal an individual obtains depends on how she interprets the world.

Individual $i$'s generated signal $s_i \in \Theta$ is drawn from the distribution $f_i(\cdot \mid \theta)$.

The notation allows for each individual's signals to be drawn from a different distribution function. The collection of all signals can be characterized by a collective distribution function.

The collective distribution function $g(s_1, s_2, \ldots, s_n \mid \theta)$ describes the joint distribution of all of the signals conditional on $\theta$, with $g_i(s_1, s_2, \ldots, s_n \mid \theta) = f_i(s_i \mid \theta)$.

The squared error of an individual's signal equals the square of the difference between the signal and the true outcome.

The squared error of the $i$th individual's signal: $\mathrm{SqE}(s_i) = (s_i - \theta)^2$.

The average squared error: $\mathrm{SqE}(\vec{s}) = \frac{1}{n} \sum_{i=1}^{n} (s_i - \theta)^2$.

In what follows, we assume that the collective prediction equals the average of the individuals' signals. In the last section of the paper, we take up differential weighting of signals.

The collective prediction: $c = \frac{1}{n} \sum_{i=1}^{n} s_i$.

We denote the squared error of the collective prediction by $\mathrm{SqE}(c) = (c - \theta)^2$. The squared error gives a measure of the accuracy of the collective. We can measure the predictive diversity of the collective by taking the variance of the predictions.

The predictive diversity of a vector of signals $\vec{s} = (s_1, s_2, \ldots, s_n)$ equals their variance: $\mathrm{PDiv}(\vec{s}) = \frac{1}{n} \sum_{i=1}^{n} (s_i - c)^2$.

Statistical Model Results

With this notation in hand, we can now state what Page (2007) calls the Diversity Prediction Theorem and the Crowd Beats Averages Law. These widely known results provide the basic logic for the wisdom of crowds. The first theorem states that the squared error of the collective prediction equals the average squared error minus the predictive diversity. Here, we see the first evidence that collective accuracy depends both on expertise (low average error) and diversity.

Theorem 1. (Diversity Prediction Theorem) The squared error of the collective prediction equals the average squared error minus the predictive diversity.

$$\mathrm{SqE}(c) = \mathrm{SqE}(\vec{s}) - \mathrm{PDiv}(\vec{s})$$

pf. Expanding each term in the expression, it suffices to show that

$$
\begin{aligned}
(c - \theta)^2 &= \frac{1}{n}\left[\sum_{i=1}^{n} (s_i - \theta)^2\right] - \frac{1}{n}\left[\sum_{i=1}^{n} (s_i - c)^2\right] \\
&= \frac{1}{n}\left[\sum_{i=1}^{n} (s_i^2 - 2 s_i \theta + \theta^2)\right] - \frac{1}{n}\left[\sum_{i=1}^{n} (s_i^2 - 2 s_i c + c^2)\right] \\
&= \left[\frac{1}{n}\sum_{i=1}^{n} s_i^2\right] - 2c\theta + \theta^2 - \left[\frac{1}{n}\sum_{i=1}^{n} s_i^2\right] + 2c^2 - c^2 \\
&= c^2 - 2c\theta + \theta^2
\end{aligned}
$$

A corollary of this theorem states that the collective squared error must always be less than or equal to the average of the individuals' squared errors.

Corollary 1. (Crowd Beats Averages Law) The squared error of the collective's prediction is less than or equal to the average squared error of the individuals that comprise the crowd.

Because predictive diversity cannot be negative, the corollary follows immediately from the theorem. Nevertheless, the corollary proves important. It demonstrates a logic for the idea of collective wisdom. In aggregating signals, the whole cannot be less accurate than the average of its parts.

We now describe more general results from the statistical model of collective wisdom. The previous two results describe a particular instance. Here, we derive results in expectation over all possible realizations of the generated signals given the outcome. Before, we considered a single instance, so we could think of each signal as having an error. Now, we are averaging over a distribution and errors can take two forms. A person's signal could be systematically off the mark, or it could just be off in a particular realization. To differentiate between the systematic error in an individual's generated signal and the idiosyncratic noise, statisticians refer to these as the bias and the variance of the signal.

Let $\mu_i(\theta)$ denote the mean of individual $i$'s signal conditional on $\theta$.

Individual $i$'s bias: $b_i = \mu_i - \theta$.

The variance of individual $i$'s signal: $v_i = E[(s_i - \mu_i)^2]$.

We can also define the average bias and the average variance across the individuals.

The average bias: $\bar{b} = \frac{1}{n} \sum_{i=1}^{n} (\mu_i - \theta)$.

The average variance: $\bar{V} = \frac{1}{n} \sum_{i=1}^{n} E[(s_i - \mu_i)^2]$.

To state the next result, we need to introduce the idea of covariance. The covariance of two random variables characterizes whether they tend to move in the same direction or in opposite directions. If covariance is positive, when one signal is above average, the other is likely to be above average as well. Negative covariance implies the opposite. Thus, negatively correlated signals tend to cancel out one another's errors.

The average covariance: $\bar{C} = \frac{1}{n(n-1)} \sum_{i=1}^{n} \sum_{j \neq i} E[(s_i - \mu_i)(s_j - \mu_j)]$.

Note the implicit mode of thinking that underpins this analysis. Each person has an associated distribution function that generates signals. Any one prediction can be thought of as a random draw from that distribution. Given our construction, we can now write a theorem that expresses the expected squared error of the collective as a function of the bias, the variance, and the covariance. We use the notation $E[\mathrm{SqE}(c)]$ to denote that we are taking an expectation. This result is commonly known as the bias-variance-covariance decomposition.

Theorem 2. (Bias-Variance-Covariance (BVC) Decomposition) Given $n$ generated signals with average bias $\bar{b}$, average variance $\bar{V}$, and average covariance $\bar{C}$, the following identity holds:

$$E[\mathrm{SqE}(c)] = \bar{b}^2 + \frac{1}{n}\bar{V} + \frac{n-1}{n}\bar{C}$$

pf. Choose $\epsilon_i$ such that $b_i + \epsilon_i = s_i - \theta$. Note that by assumption $E[\epsilon_i] = 0$ and $E[\epsilon_i^2] = v_i$. We can therefore write

$$
\begin{aligned}
E[\mathrm{SqE}(c)] &= E\left[\left(\frac{1}{n}\sum_{i=1}^{n} s_i - \theta\right)^2\right] \\
&= E\left[\left(\frac{1}{n}\sum_{i=1}^{n} (s_i - \theta)\right)^2\right] \\
&= E\left[\left(\frac{1}{n}\sum_{i=1}^{n} (b_i + \epsilon_i)\right)^2\right] \\
&= \left[\frac{1}{n}\sum_{i=1}^{n} b_i\right]^2 + \frac{2}{n^2}\left[\sum_{i=1}^{n} b_i\right]\left[\sum_{i=1}^{n} E[\epsilon_i]\right] + \frac{1}{n^2}\left[\sum_{i=1}^{n} E[\epsilon_i^2]\right] + \frac{1}{n^2}\left[\sum_{i=1}^{n}\sum_{j=1, j \neq i}^{n} E[\epsilon_i \epsilon_j]\right] \\
&= \left[\frac{1}{n}\sum_{i=1}^{n} b_i\right]^2 + \frac{1}{n^2}\left[\sum_{i=1}^{n} E[\epsilon_i^2]\right] + \frac{1}{n^2}\left[\sum_{i=1}^{n}\sum_{j=1, j \neq i}^{n} E[\epsilon_i \epsilon_j]\right] \\
&= \bar{b}^2 + \frac{1}{n}\bar{V} + \frac{n-1}{n}\bar{C}
\end{aligned}
$$

According to the BVC Decomposition, expected squared error increases in average bias. That's intuitive. Increasing the bias of signals should make the collective less accurate on average. Furthermore, increasing the variance of signals increases expected error. This too makes sense. If signals increase in variance, i.e. become more likely to take high or low values (though maintaining the same mean), their average should be less accurate.

At first glance, this last intuition seems to contradict the Diversity Prediction Theorem. In that theorem, variation in predictions increased accuracy. The Diversity Prediction Theorem considers realized predictions. Greater variance in realized signals (holding their accuracy constant) improves collective accuracy. This follows because high variance requires that some predictions must be too high and some must be too low. The BVC Decomposition considers the signals as random variables. More variance in the random variables being averaged results in less accurate collective predictions. A way to engineer the distributions so that the realized predictions have high variance would be to make the signals negatively correlated. In that way, when one signal is too high, another signal tends to be too low, resulting in both high realized variance and a more accurate collective prediction. And, in fact, that is what the BVC decomposition shows: collective accuracy increases when signals have negative covariance. If we consider generated signals that are negatively correlated to be diverse, then the theorems provide two alternative ways of seeing the benefits of accuracy and diversity for collective prediction.

We conclude our analysis of the statistical model with a corollary that states that as the collective grows large, if generated signals have no bias, bounded variance, and weakly negative covariance, then the expected squared error goes to zero.

Corollary 2. (Large Population Accuracy) Assume that average bias $\bar{b} = 0$, average variance $\bar{V}$ is bounded from above, and average covariance $\bar{C}$ is weakly less than zero. As the number of individuals goes to infinity, the expected collective squared error goes to zero.

pf. From the BVC Decomposition, we have that

$$E[\mathrm{SqE}(c)] = \bar{b}^2 + \frac{1}{n}\bar{V} + \frac{n-1}{n}\bar{C}$$

By assumption $\bar{b} = 0$ and there exists a $T$ such that $\bar{V} < T$. Furthermore, $\bar{C} \leq 0$. Therefore $E[\mathrm{SqE}(c)] < \frac{T}{n}$, which goes to zero as $n$ approaches infinity.

Note that independent unbiased generated signals are a special case of this corollary. If each individual's generated signals equal the truth plus an idiosyncratic error term, then as the collective grows large, it necessarily becomes wise.

The Cognitive Model of Collective Wisdom

We now describe a cognitive model of collective wisdom. This cognitive model allows us to generate deeper insights than the statistical model. In this section we show that for collective wisdom to emerge, the individuals must have relatively sophisticated models of the world; otherwise, we cannot expect them collectively to come to the correct answer. Furthermore, the models that people have in their heads must differ. If they don't, if everyone in the collective thinks the same way, the collective cannot be any better than the people in it. Thus, collective wisdom must depend on moderately sophisticated and diverse models. Finally, sophistication and diversity must be measured relative to the context. What is it that these individuals are trying to predict?

When we think of collective wisdom from a cognitive viewpoint, we begin to see shortcomings with the statistical model. The statistical model uses accuracy as a proxy for sophistication or expertise as well as for problem difficulty, and uses independence or covariance as a proxy for diversity. The cognitive model that we describe considers expertise, diversity, and sophistication explicitly. To do so, the model relies on a different type of signal called interpreted signals. These signals come from predictive models.

Interpreted Signals

Interpreted signals can be thought of as model based predictions that individuals make about the outcome. Those models, in turn, can be thought of as approximations of an underlying outcome function. Therefore, before we can define an interpreted signal, we must first define the outcome function that the models approximate. To do this, we first denote the set of all possible states of the world.

The set of states of the world: $X$.

The outcome function $F$ maps each possible state into an outcome.

The outcome function: $F : X \to \Theta$.

Each individual has an interpretation (Page 2007), which is a partition of the set of states. An interpretation partitions the states of the world into distinct categories. These categories form the basis for the individual's predictive model. For example, one individual might partition politicians into two categories: liberals and conservatives. Another voter might partition politicians into categories based on identity characteristics such as age, race, and gender. [3]

Individual $i$'s interpretation $\Phi_i = \{\phi_{i1}, \phi_{i2}, \ldots, \phi_{im}\}$ equals a set of categories that partition $X$.

We let $\Phi_i(x)$ denote the category in the interpretation to which the state of the world $x$ belongs. Individuals with finer interpretations can be thought of as more sophisticated. Formally, we say that one individual is more sophisticated than another if every category in its interpretation is contained in a category of the other's.

Individual $i$'s interpretation is more sophisticated than individual $j$'s interpretation if for any $x$, $\Phi_i(x) \subseteq \Phi_j(x)$, with strict inclusion for at least one $x$.

A collection of individuals becomes more sophisticated if every individual's interpretation becomes more sophisticated.

[3] An interpretation is similar to an information partition (Aumann 1976). What Aumann calls an information set, we call a category. The difference between our approach and Aumann's is that he assumes that once a state of the world is identified, individuals know the value of the outcome function.

Individuals have what we call predictive models, which map their categories into outcomes. Predictive models are coarser than the outcome function. Whereas the outcome function maps states of the world into outcomes, predictive models map sets of states of the world, namely categories, into outcomes. Thus, if an individual places two states of the world in the same category, the individual's predictive model must assign the same outcome to those two states.

Individual $i$'s predictive model: $M_i : X \to \Theta$ such that if $\Phi_i(x) = \Phi_i(y)$ then $M_i(x) = M_i(y)$.

An individual's prediction equals the output of his or her predictive model. The output of an individual's predictive model can be thought of as a signal. However, unlike a generated signal, this signal depends on the outcome function and on the individual's interpretation and predictive model. We refer to this prediction as an interpreted signal. The collective prediction of a population of individuals we take to be the average of the predictions of the individuals.

The collective prediction: $M(x) = \frac{1}{n} \sum_{i=1}^{n} M_i(x)$.

The ability of a collection of individuals to make an accurate prediction depends upon their predictive models. Intuitively, if those models are individually sophisticated, i.e. partition the set of states of the world into many categories, and collectively diverse, i.e. they create different partitions, then we should expect the collective prediction to be accurate. The next example shows how this can occur.

Example. $X = \{(x_1, x_2, x_3) : x_i \in \{0, 1\}\}$. Each state is equally likely. The outcome function is $F(x) = x_1 + x_2 + x_3$, and $n = 3$. $\Phi_i = \{\{x : x_i = 0\}, \{x : x_i = 1\}\}$, with $M_i(x) = 1$ if $x_i = 0$ and $M_i(x) = 2$ if $x_i = 1$. The table below gives the interpreted signals (the predictions) for each realization of $x$ as well as the mean of the predictions and the value of the outcome function.

State     M_1(x)   M_2(x)   M_3(x)   M(x)   F(x)   Sq-Error
000       1        1        1        1      0      1
001       1        1        2        4/3    1      1/9
010       1        2        1        4/3    1      1/9
100       2        1        1        4/3    1      1/9
011       1        2        2        5/3    2      1/9
101       2        1        2        5/3    2      1/9
110       2        2        1        5/3    2      1/9
111       2        2        2        2      3      1
Average   3/2      3/2      3/2      3/2    3/2    1/3

We can view these interpreted signals using the statistical framework. Though each prediction results from the application of a cognitive model, we can think of them as random variables. By symmetry, it suffices to consider a single interpreted signal to compute bias and squared error.
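The example can be checked numerically. The following is a minimal sketch, not the authors' code: the function names `outcome`, `predict`, and `row` are illustrative assumptions, and exact fractions are used so the values can be compared with the table. It recomputes the interpreted signals, the collective prediction, and the squared errors for every state, confirms the Diversity Prediction Theorem state by state, and then averages over the eight equally likely states.

```python
from fractions import Fraction
from itertools import product


def outcome(x):
    """Outcome function of the example: F(x) = x1 + x2 + x3."""
    return sum(x)


def predict(i, x):
    """Individual i looks only at attribute i: predicts 1 if it is 0, 2 if it is 1."""
    return 1 + x[i]


def row(x):
    """Signals, collective prediction, outcome, and error terms for one state."""
    n = len(x)
    signals = [Fraction(predict(i, x)) for i in range(n)]
    theta = outcome(x)
    c = sum(signals) / n                                      # collective prediction
    avg_sq_err = sum((s - theta) ** 2 for s in signals) / n   # average squared error
    diversity = sum((s - c) ** 2 for s in signals) / n        # predictive diversity
    collective_sq_err = (c - theta) ** 2
    return signals, c, theta, collective_sq_err, avg_sq_err, diversity


states = list(product((0, 1), repeat=3))   # eight equally likely states
rows = {x: row(x) for x in states}

for x, (signals, c, theta, sq, avg, div) in rows.items():
    # Diversity Prediction Theorem: SqE(c) = average SqE - PDiv, in every state.
    assert sq == avg - div
    print("".join(map(str, x)), *signals, c, theta, sq)

# Averages over states (the "Average" row of the table).
expected_collective = sum(r[3] for r in rows.values()) / len(states)
expected_individual = sum(r[4] for r in rows.values()) / len(states)
# Bias of the first individual's signal: mean of M_1(x) - F(x) over states.
bias_1 = sum(predict(0, x) - outcome(x) for x in states) / Fraction(len(states))
print(expected_collective, expected_individual, bias_1)
```

The states are printed in the enumeration order of `itertools.product`, which differs from the row order of the table; the per-state values are the same.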

State     M_1(x)   F(x)   Error   SqE
000       1        0      1       1
001       1        1      0       0
010       1        1      0       0
100       2        1      1       1
011       1        2      -1      1
101       2        2      0       0
110       2        2      0       0
111       2        3      -1      1
Average   3/2      3/2    0       1/2

As can be seen from the table, the bias of the interpreted signal equals zero and the variance of its error equals 1/2. Notice that each individual has an average squared error equal to 1/2 but the collection has an expected squared error equal to just 1/3. So in this case, the collective is more accurate, on average, than any of the individuals.

In the example, each individual considered a distinct attribute. Hong and Page (2007) refer to these as independent interpreted signals.

The interpreted signals of individuals 1 and 2 are based on independent interpretations if and only if for all $i$ and $j$ in $\{1, 2, \ldots, m\}$

$$\mathrm{Prob}(\phi_{1j} \cap \phi_{2i}) = \mathrm{Prob}(\phi_{1j}) \cdot \mathrm{Prob}(\phi_{2i})$$
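The independence condition can be tested directly on the three-attribute example. The sketch below is illustrative only: it assumes equally likely states and represents an interpretation as a labeling function that sends each state to a category label. The helper names (`prob`, `categories`, `independent`) and the third labeling `by_sum12` are hypothetical and do not come from the paper.

```python
from fractions import Fraction
from itertools import product

states = list(product((0, 1), repeat=3))   # equally likely states (x1, x2, x3)


def prob(event):
    """Probability of a set of states under the uniform distribution."""
    return Fraction(len(event), len(states))


def categories(label):
    """Partition the states into categories according to a labeling function."""
    cells = {}
    for x in states:
        cells.setdefault(label(x), set()).add(x)
    return list(cells.values())


def independent(label_a, label_b):
    """Check Prob(a and b) = Prob(a) * Prob(b) for every pair of categories."""
    return all(
        prob(a & b) == prob(a) * prob(b)
        for a in categories(label_a)
        for b in categories(label_b)
    )


by_x1 = lambda x: x[0]                # individual 1 of the example
by_x2 = lambda x: x[1]                # individual 2 of the example
by_sum12 = lambda x: x[0] + x[1]      # hypothetical interpretation that overlaps with x1

print(independent(by_x1, by_x2))      # True: distinct attributes
print(independent(by_x1, by_sum12))   # False: both depend on x1
```

Interpretations built on distinct attributes pass the product test, while an interpretation that reuses an attribute already seen by the other individual fails it.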

If two individuals use independent interpretations, then they look at different dimensions given the same representation. Hong and Page (2007) show that for classification problems, i.e. problems with binary outcomes, independent interpreted signals must be negatively correlated. The theorem requires mild constraints on the individuals' predictive models, namely that they predict both outcomes with equal probability and that they are correct more than half the time (see footnote 4).

Theorem 3. If F : X → {0, 1}, if each outcome is predicted equally often, and if each individual's prediction is correct with probability p > 1/2, then independent interpreted signals are negatively correlated.

Proof. See Hong and Page (2007).

Footnote 4. Extending the theorem to apply to arbitrary outcome spaces would require stronger conditions on the predictive models and on the outcome function.

This theorem provides a linkage between the models that individuals use and statistical properties of their predictions. For classification problems, model diversity implies negatively correlated predictions, which we know from the statistical models implies more accurate collective predictions.

The statistical approach focuses on the size of the expected error as a function of bias, error, and correlation of generated signals. The cognitive model approach does not assume any randomness in the prediction, though uncertainty about the state of the world does exist. Therefore, a natural question to ask within the cognitive model approach is whether a collection of individuals can, through voting, produce the correct outcome. In other words, we can ask: what has to be true of the individuals and of the outcome function for collective wisdom to emerge?

The answer to that question is surprisingly straightforward. Individuals think at the level of category. They do not distinguish among states of the world that belong to the same category. Therefore, we can think of each individual's interpretation and predictive model as producing a function that assigns the same value to any two states of the world in the same category. If a collection of people vote or express an opinion about the likely value of an outcome, then what they are doing is aggregating these functions.

If the outcome function can be defined over the categories of individuals, then it would seem possible that the individuals can combine their models and approximate the outcome function. However, suppose that the outcome function assigns an extremely high value to states of the world in the set S but that no individual can identify S, i.e. for each individual i, S is strictly contained within a category in Φ_i. Then we should not expect the individuals to be able to approximate the outcome function. Thus, a necessary condition for collective wisdom to arise is that, collectively, the interpretations of the individuals must be fine enough to approximate the outcome function. In addition, the outcome function must be a linear combination of the predictive models of the individuals (see Hong and Page 2008 for a full characterization).
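Two of the claims above can be illustrated on a small example: the negative correlation in Theorem 3 and the requirement that interpretations be collectively fine enough. The sketch below is not Hong and Page's construction. It assumes three equally likely binary attributes, a hypothetical outcome function equal to the majority of the attributes, and individuals who each predict the value of one attribute. It also reads "negatively correlated" as a statement about whether the predictions are correct, which is an assumption; the precise definition is in Hong and Page (2007). The fineness check covers only the necessary condition that the outcome be constant on every cell of the common refinement of the individuals' categories, not the full linear-combination characterization.

```python
from fractions import Fraction
from itertools import product

# Eight states of the world, assumed equally likely.
STATES = list(product((0, 1), repeat=3))

def majority(x):
    # Hypothetical binary outcome function F: X -> {0, 1}.
    return int(sum(x) >= 2)

def correct(i, x):
    # Individual i looks only at attribute i and predicts F(x) = x_i.
    return int(x[i] == majority(x))

def prob(event):
    # Probability of an event under the uniform distribution on STATES.
    return Fraction(sum(1 for x in STATES if event(x)), len(STATES))

def correctness_covariance(i, j):
    # Covariance between the events "i is correct" and "j is correct".
    both = prob(lambda x: correct(i, x) and correct(j, x))
    return both - prob(lambda x: correct(i, x)) * prob(lambda x: correct(j, x))

def fine_enough(f, attributes):
    # True if f is constant on every cell of the common refinement of the
    # partitions induced by the listed attributes.
    cells = {}
    for x in STATES:
        key = tuple(x[i] for i in attributes)
        cells.setdefault(key, set()).add(f(x))
    return all(len(values) == 1 for values in cells.values())

print(prob(lambda x: correct(0, x)))      # accuracy of one individual
print(correctness_covariance(0, 1))       # negative in this example
print(fine_enough(majority, [0, 1, 2]))   # all three attributes observed
print(fine_enough(majority, [0, 1]))      # third attribute observed by nobody
```

In this example each individual predicts both outcomes equally often and is correct with probability 3/4, and the covariance between two individuals' correctness is -1/16. When no one observes the third attribute, the outcome is not constant on the cells that the group can distinguish, so no aggregation of their models can reproduce it in every state.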

0.1 Sophistication and Diversity in Cognitive Models

In the statistical model, bias and error are meant to be proxies for sophistication, and correlation is thought to capture diversity. In the cognitive model framework, sophistication refers to the numbers and sizes of the categories. Interpretations that create more categories produce more accurate predictions. And, as just described, the ability of a collection of people to make accurate appraisals in all states of the world depends on their ability to identify all sets that are relevant to the outcome function. Therefore, as the individuals become more sophisticated, the collective becomes more intelligent.

As for diversity, we have seen in the case of classification problems that independent interpretations produce negatively correlated interpreted signals. That mathematical finding extends to a more general insight: more diverse interpretations tend to produce more negatively correlated predictions. Consider first the extreme case. If two individuals use identical interpretations and make the best possible prediction for each category, then their predictive models will be identical. They will have no diversity. Their two heads will be no better than one. If, on the other hand, two people categorize states of the world differently, they likely make different predictions at a given state. Thus, diversity in predictions comes from diversity in predictive models.

Even with a large number of individuals, we might expect some limits on the amount of diversity present. In the statistical model, as the number of individuals tends to infinity, then in the absence of bias, the collective becomes perfectly accurate. That will not happen in the cognitive model unless we assume that each new individual brings a distinct predictive model.

Discussion

In this paper, we have provided possible micro foundations for collective wisdom. We have contrasted this approach with the standard statistical model of collective wisdom that dominates the literature. While both approaches demonstrate the importance of sophistication and diversity, they do so in different ways. The statistical model makes assumptions that might be expected to correlate with sophistication and diversity, while the cognitive model approach includes sophistication and diversity directly.

The cognitive micro foundations that we have presented also help to explain the potential for the madness of crowds. A collection of people becomes more likely to make a bad choice if they rely on similar models. This idea aligns with the argument made by Caplan (2007) that people make systematic mistakes. Note though that in other venues where collections of individuals do not make mistakes, they are not necessarily less accurate individually; they may just be more diverse collectively.

Finally, we have yet to discuss the potential for persuasion within a group. In the statistical model, persuasion places more weight on some individuals than on others. Ideally, the weight assigned to each generated signal would be proportional to its accuracy. In any particular group setting we have no guarantee that such weighting will emerge. And, in fact, improper weightings may lead to even worse choices. In our cognitive model, persuasion can have a similar effect. However, instead of changing weights, people may abandon models because they find another person's model more convincing. Often such behavior makes the collective worse off. It is better for the collective to contain a different and less accurate model than to add one more copy of any existing model, even if that existing model is more accurate.

References

[1] Al-Najjar, N., R. Casadesus-Masanell and E. Ozdenoren (2003) Probabilistic Representation of Complexity, Journal of Economic Theory 111 (1), 49-87.
[2] Aragones, E., I. Gilboa, A. Postlewaite, and D. Schmeidler (2005) Fact-Free Learning, The American Economic Review 95 (5), 1355-1368.
[3] Barwise and Seligman (1997) Information Flow: The Logic of Distributed Systems, Cambridge Tracts in Theoretical Computer Science, Cambridge University Press, New York.
[4] Billingsley, P. (1995) Probability and Measure (3rd Edition), Wiley-Interscience.
[5] Caplan, Bryan (2007) The Myth of the Rational Voter: Why Democracies Choose Bad Policies, Princeton University Press.

[6] Fryer, R. and M. Jackson (2008) A Categorical Model of Cognition and Biased Decision-Making, Contributions in Theoretical Economics, B.E. Press.
[7] Holland, J. and J. Miller (1991) Artificial Agents in Economic Theory, The American Economic Review Papers and Proceedings 81, 365-370.
[8] Gilboa, I., and D. Schmeidler (1995) Case-Based Decision Theory, The Quarterly Journal of Economics 110, 605-639.
[9] Holland, J.H., K. Holyoak, R.E. Nisbett and P. Thagard (1989) Induction: Processes of Inference, Learning, and Discovery, MIT Press.
[10] Hong, L. and S. Page (2001) Problem Solving by Heterogeneous Agents, Journal of Economic Theory 97, 123-163.
[11] Hong, L. and S. Page (2007) Interpreted and Generated Signals, working paper.
[12] Hong, L. and S. Page (2008) On the Possibility of Collective Wisdom, working paper.
[13] Judd, K. (1997) Computational Economics and Economic Theory: Complements or Substitutes? Journal of Economic Dynamics and Control.
[14] Judd, K. and S. Page (2004) Computational Public Economics, Journal of Public Economic Theory, forthcoming.
[15] Klemperer, P. (2004) Auctions: Theory and Practice, Princeton University Press.

[16] Ladha, K. (1992) The Condorcet Jury Theorem, Free Speech, and Correlated Votes, American Journal of Political Science 36 (3), 617-634.
[17] Milgrom, P. and R. Weber (1982) A Theory of Auctions and Competitive Bidding, Econometrica 50 (5), 1089-1122.
[18] Nisbett, R. (2003) The Geography of Thought: How Asians and Westerners Think Differently...and Why, Free Press, New York.
[19] Page, S. (2007) The Difference: How the Power of Diversity Creates Better Firms, Schools, Groups, and Societies, Princeton University Press.
[20] Pearl, Judea (2000) Causality, New York: Oxford University Press.
[21] Stinchcombe, A. (1990) Information and Organizations, California Series on Social Choice and Political Economy I, University of California Press.
[22] Tesfatsion, L. (1997) How Economists Can Get A-Life, in The Economy as a Complex Evolving System II, W. Brian Arthur, Steven Durlauf, and David Lane eds., pp 533-565. Addison Wesley, Reading, MA.
[23] Valiant, L.G. (1984) A Theory of the Learnable, Communications of the ACM 17(11), 1134-1142.
[24] Von Hayek, F. (1945) The Use of Knowledge in Society, American Economic Review 4, pp 519-530.

[25] Wellman, M.P., A. Greenwald, P. Stone, and P.R. Wurman (2003) The 2001 Trading Agent Competition, Electronic Markets 13(1).