Session 5x: Bonus material




The Social Statistics Discipline Area, School of Social Sciences
Mitchell Centre for Network Analysis
Session 5x: Bonus material
Johan Koskinen
http://www.ccsr.ac.uk/staff/jk.htm
johan.koskinen@manchester.ac.uk
Workshop: Mon-Fri, 7-11 July 2014, Advanced Meths SNA, Manchester

Session 5x: Bonus material
- Bayesian analysis
- Missing data
- Snowball sampled data
- Fitting ERGM to LARGE data sets
- Spatial Embedded networks
- Multilevel ERGM
- Longitudinal ERGM

Bayesian analysis in MPNet

Bayesian inference (in MPNet): Fishermen
Bayesian estimation
- Go back into Select parameters, start afresh by clearing all, and then select edge, ASA, ATA.
- BE PATIENT. Bayesian estimation can be slower (we are working on automation).

Quick MCMC settings for Bayesian
- We need a slightly larger multiplication factor than for non-Bayesian estimation.
- Maximum lag should be chosen to be roughly the lag where SACF is (in order for ESS to be correct) roughly 2-4.
- If the model is good we can use Pre-tuning only to get good initial values. The objective is to get a high acceptance rate, around .85.
- Run a number of small MCMC sample sizes and press Update.
- When pre-tuning is not too bad, check Nonconditional simulation and press Update (the latter to start in a better place and get the proposal covariance). The objective is to get an acceptance rate around .25, SACF around lag 2 small, and ESS large.
- If the acceptance rate is too small (say smaller than .5), reduce Proposal scaling (e.g. divide by 2); if too large (say greater than .45), increase Proposal scaling (e.g. multiply by 2).
- Once SACF at large lags (say or 2) is low (say, around .) you can improve the ESS by making the MCMC sample size bigger.
- If you have a good run and want the perfect run, read in the Covariance file.
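The scaling rule in the list above can be sketched in a few lines of Python. This is only an illustration of the halve-or-double heuristic, not MPNet code: the function name is hypothetical, and the thresholds `low` and `high` are left as arguments because the cut-offs on the slide are only partly legible (the example values 0.15 and 0.45 are assumptions).

```python
def adjust_proposal_scaling(scale, acceptance_rate, low, high, factor=2.0):
    """Return a new proposal scaling given the acceptance rate of the last run.

    Too few accepted proposals -> steps are too long -> shrink the scaling.
    Too many accepted proposals -> steps are too short -> grow the scaling.
    """
    if acceptance_rate < low:
        return scale / factor
    if acceptance_rate > high:
        return scale * factor
    return scale


# Assumed thresholds, for illustration only.
LOW, HIGH = 0.15, 0.45

print(adjust_proposal_scaling(1.0, 0.93, LOW, HIGH))  # acceptance too high
print(adjust_proposal_scaling(1.0, 0.05, LOW, HIGH))  # acceptance too low
print(adjust_proposal_scaling(1.0, 0.25, LOW, HIGH))  # inside the target band
```

In practice the rule would be applied between short runs, each time feeding back the acceptance rate that the software reports.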

Bayesian analysis in MPNet
Settings: Set multiplication factor to 8; Scale .5; MCMC sample size; Max lag 2; Scaled identity
After run:
[Trace plots of ts(output[, k + ]) against Time for EdgeA, ASA and ATA]
This run just got us close to where we want to be
Note that Inverse D matrix is diagonal
Inverse D matrix: .........
Acceptance rate: .42
Estimation results
Effects  Lambda  PostMean  Stddev
EdgeA    2.      -3.2257.237 *
ASA      2.      -.862.94
ATA      2.      .7372.39 *
SACF
Effect   3  9  ESS(2)
EdgeA    .973.94.265  4
ASA      .95.78.42  4
ATA      .82.456 -.53  2

Bayesian analysis in MPNet
Settings: Set multiplication factor to 8; Scale .5; MCMC sample size; Max lag 2; Scaled identity
After run: note that Inverse D matrix is diagonal. This run just got us close to where we want to be.
Press Update
New settings: MCMC sample size 55; Parameter burnin 5; Proposal scaling .; Nonconditional simulation
We want to draw values roughly here BUT more efficiently (by setting a better Proposal variance than the diagonal)

Bayesian analysis in MPNet
[Trace plots of ts(output[, k + ]) against Time, and SACF plots, for EdgeA, ASA and ATA]
-> It is moving around really well BUT it takes too small steps (acceptance ratio: .93)
Inverse D matrix: .5 -.5. -.5.2 -.. -..
SACF panels report ESS: 4, ESS: 2, ESS: 4
Re-run with longer moves: set Proposal scaling .5 and rerun
Acceptance rate: .93
Estimation results
Effects  Lambda  PostMean  Stddev
EdgeA    2.      -3.6399.385 *
ASA      2.      -.596.38
ATA      2.      .7484.8 *
SACF
Effect   3  9  ESS(2)
EdgeA    .952.87.47  2
ASA      .959.887.499  2
ATA      .955.87.37  22

Bayesian analysis in MPNet
[Trace plots of ts(output[, k + ]) against Time, and SACF plots, for EdgeA, ASA and ATA]
-> It is moving around really well AND it takes nice LONG strides
Acceptance rate: .34
Estimation results
Effects  Lambda  PostMean  Stddev
EdgeA    2.      -3.5364.534 *
ASA      2.      -.223.85
ATA      2.      .867.22 *
SACF
Effect   9  ESS(2)
EdgeA    .442 -.7  35
ASA      .467 -.43  328
ATA      .5.44  25
SACF panels report ESS: 229, ESS: 22, ESS: 83
Effective sample size: we want these to be larger than 5, else the Stddev:s are misleading.
Here, as the acceptance rate is GOOD, rerun the estimation with a larger MCMC sample size.
Here SACF is almost zero already at lag 3!!!
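To make the link between SACF and ESS concrete, here is a minimal Python sketch. It assumes the textbook definition ESS = n / (1 + 2 * sum of autocorrelations up to a maximum lag); whether MPNet computes ESS in exactly this way is not stated on the slides, so treat the formula, the function names and the simulated AR(1) chains as illustrative assumptions.

```python
import random


def sacf(chain, max_lag):
    """Sample autocorrelation of a chain at lags 0..max_lag."""
    n = len(chain)
    mean = sum(chain) / n
    centred = [v - mean for v in chain]
    denom = sum(c * c for c in centred)
    return [
        sum(centred[t] * centred[t + k] for t in range(n - k)) / denom
        for k in range(max_lag + 1)
    ]


def effective_sample_size(chain, max_lag):
    """ESS under the assumed formula n / (1 + 2 * sum_{k=1..max_lag} rho_k)."""
    rho = sacf(chain, max_lag)
    return len(chain) / (1.0 + 2.0 * sum(rho[1:]))


def ar1_chain(n, phi, seed=0):
    """Simulated stand-in for MCMC output: AR(1) with autocorrelation phi."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


sticky = ar1_chain(2000, 0.9)  # small steps, slowly decaying SACF
free = ar1_chain(2000, 0.0)    # independent draws, SACF near zero

print("ESS, sticky chain:", round(effective_sample_size(sticky, 20)))
print("ESS, free chain:  ", round(effective_sample_size(free, 20)))
```

A chain whose SACF dies out within a few lags keeps most of its nominal sample size, which is why a larger MCMC sample size only pays off once the proposal is well tuned.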

Bayesian analysis in MPNet
The output file, [session name]_posterior_bayesian.txt, contains the Bayesian posteriors.
[Histograms (Frequency) of the posteriors for EdgeA, ASA and ATA, with pairwise plots of EdgeA, ASA and ATA]

Missing data in MPNet

Session 4: More complex models
- The datafile miss2.txt is an 85X85 randomly simulated matrix with a density of .2. This will be a matrix equivalent to 2% missing data at random.
- The datafile fish_miss2.txt is the fishermen data except that all the cells marked 1 in miss2.txt are set to zero. In other words, fish_miss2.txt can be regarded as the fishermen's network with 2% missing data (both 0s and 1s).
- Note that to use the missing data estimation in MPNET, you need an indicator matrix with 1 entered into every missing cell, and all missing cells in the original data have to be entered as 0s.
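The two files described above can be mimicked with a short Python sketch. It assumes an undirected network stored as a symmetric 0/1 matrix and writes nothing to disk; the function names, the toy 4-node network and the density value are hypothetical and only illustrate the convention (1 in the indicator for every missing cell, 0 in the data for every missing cell).

```python
import random


def make_missing_indicator(n, density, seed=0):
    """Symmetric n x n 0/1 matrix; a 1 marks a dyad treated as missing."""
    rng = random.Random(seed)
    miss = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < density:
                miss[i][j] = miss[j][i] = 1
    return miss


def apply_missing(adjacency, miss):
    """Copy of the data in which every missing cell is entered as 0."""
    return [
        [0 if m else a for a, m in zip(a_row, m_row)]
        for a_row, m_row in zip(adjacency, miss)
    ]


def to_text(matrix):
    """Space-separated rows, one matrix row per line."""
    return "\n".join(" ".join(str(v) for v in row) for row in matrix)


# Toy undirected network (assumed data, not the fishermen network).
network = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
indicator = make_missing_indicator(len(network), density=0.5, seed=1)
masked = apply_missing(network, indicator)

print(to_text(indicator))
print()
print(to_text(masked))
```

Because missing cells are stored as 0 in the data file, the indicator matrix is the only place where a true zero can be told apart from an unobserved tie variable.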

Session 4: More complex models
In MPNET, under the Bayesian estimation tab:
- Enter the fish_miss2 file to be estimated
- Enter the miss2 as the missing indicators file
- Select parameters and clear any previous parameter values (i.e. start from 0)
- Conduct Bayesian estimation for an edge, AS and AT model. Use 3 as the MCMC sample size.
Our results
Effects  Lambda  PostMean  Stddev
EdgeA    2.      -4.4253.552 *
ASA      2.      .45.96
ATA      2.      .7759.77 *

Session 4: More complex models
- Make sure no ME
- Are all zeros really zeros?
- In principle valid for sampled data (admissible)
- MNAR impossible to check (but robustness can be assessed)
- Are missing data different than observed?
- If attributes are missing we can use a similar technique of data-augmentation (not in Pnet yet)

Unobserved data: snowball sampling

Unobserved data: snowball sampling
[Figure labels: missing data, observed data]

Sampling in/on networks
[Figure: sampled and non-sampled cells of the network]

Ignoring non-sampled

What about alter-alter ties across egos?

Unobserved data: snowball sampling
Making some (brave) assumptions (Handcock & Gile 2010) we can fit an ERGM (Wang et al. 2013) to snowball sampled networks:
- Importance sampling MCMCMLE (Handcock & Gile 2010)
- Stochastic approximation and the missing data principle (Orchard & Woodbury, 1972) (Koskinen & Snijders, 2013)
- Bayesian data augmentation (Koskinen, Robins & Pattison, 2010, 2013) (MPNet)
- Conditional MLE (Pattison, Robins, Snijders & Wang, 2013) (SnowPNet)

Unobserved data: snowball sampling
Bayesian data augmentation (Koskinen, Robins & Pattison, 2010, 2013) (MPNet)
- Need to know N
- Need to simulate un-observed ties
- Time-consuming
Conditional MLE (Pattison, Robins, Snijders & Wang, 2013) (SnowPNet)
- No need to know N
- No need to simulate un-observed data
- Properties of conditional MLE unclear
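As a reminder of which tie variables a snowball design actually observes, here is a small Python sketch. It assumes the usual design in which everyone up to the second-to-last wave is interviewed and reports all their ties, while nodes in the last wave are only nominated; the function names and the toy path graph are hypothetical.

```python
def snowball_sample(neighbours, seeds, waves):
    """Return {node: wave} for a snowball sample started from the seed nodes."""
    wave_of = {s: 0 for s in seeds}
    frontier = list(seeds)
    for w in range(1, waves + 1):
        next_frontier = []
        for node in frontier:
            for alter in neighbours.get(node, ()):
                if alter not in wave_of:
                    wave_of[alter] = w
                    next_frontier.append(alter)
        frontier = next_frontier
    return wave_of


def dyad_is_observed(i, j, wave_of, waves):
    """A tie variable is observed if at least one endpoint was interviewed.

    Assumption: nodes in waves 0..waves-1 are interviewed; nodes in the last
    wave are nominated only, and all other nodes are not sampled at all.
    """
    def interviewed(node):
        return node in wave_of and wave_of[node] < waves

    return interviewed(i) or interviewed(j)


# Toy network: a path 0 - 1 - 2 - 3 - 4.
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
wave_of = snowball_sample(neighbours, seeds=[0], waves=2)

print(wave_of)
print(dyad_is_observed(1, 4, wave_of, 2))  # node 1 was interviewed
print(dyad_is_observed(2, 3, wave_of, 2))  # neither endpoint was interviewed
```

The unobserved dyads are what data augmentation has to simulate and what the conditional approach avoids.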

Estimating ERGM for LARGE networks

Stivala et al. (2014)
- Take many small snowball samples from your LARGE N network
- Estimate Conditional MLE for each (Pattison, Robins, Snijders & Wang, 2013)
- Pool estimates using meta-analysis techniques
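The pooling step can be illustrated with a short Python sketch. The slide only says "meta-analysis techniques", so the fixed-effect inverse-variance weighting used here is an assumption chosen because it is one common choice, not necessarily the estimator used by Stivala et al.; the function name and the numbers are hypothetical.

```python
import math


def pool_inverse_variance(estimates, std_errors):
    """Pool per-sample estimates of one parameter with inverse-variance weights.

    Returns the pooled estimate and its standard error under a fixed-effect
    assumption (every sample estimates the same underlying parameter).
    """
    weights = [1.0 / (se * se) for se in std_errors]
    total = sum(weights)
    pooled = sum(w * est for w, est in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total)


# Hypothetical conditional estimates of one parameter from four snowball samples.
estimates = [-3.9, -4.2, -4.0, -4.4]
std_errors = [0.30, 0.45, 0.25, 0.60]

pooled, pooled_se = pool_inverse_variance(estimates, std_errors)
print(round(pooled, 3), round(pooled_se, 3))
```

Precise samples dominate the pooled value, so a few poorly converged samples with large standard errors have little influence.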


Spatial embedding

Spatial embedding (Book Ch. 8)
- 36 actors in Victoria, Australia
- Spatially embedded: all living within 4 kilometres of each other
- Bernoulli conditional on distance: empirical probability

Spatial embedding (Book Ch. 8)
Spatial interaction function: tie probability as a function of distance
E.g. Attenuated Power-Law:
$$\Pr(X_{ij} = 1 \mid d_{ij}) = \frac{p}{1 + \alpha d_{ij}^{\gamma}}$$

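Spatial embedding (Book Ch. 8)
Spatial interaction function: tie probability as a function of distance
The Attenuated Power-Law:
$$\Pr(X_{ij} = 1 \mid d_{ij}) = \frac{p}{1 + \alpha d_{ij}^{\gamma}}$$
Is equivalent to:
$$\Pr(X = x \mid D = (d_{ij})) = \frac{\exp\{\theta_1 \sum_{i<j} x_{ij} + \theta_2 \sum_{i<j} x_{ij} \log(d_{ij})\}}{\sum_{u \in \mathcal{X}} \exp\{\theta_1 \sum_{i<j} u_{ij} + \theta_2 \sum_{i<j} u_{ij} \log(d_{ij})\}}$$
with: $p = 1$, $\alpha = e^{-\theta_1}$, $\gamma = -\theta_2$
and statistics: $\sum_{i<j} x_{ij}$ and $\sum_{i<j} x_{ij} \log(d_{ij})$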
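The equivalence can be checked numerically. The sketch below compares the attenuated power-law tie probability with the dyad-level probability implied by a Bernoulli model with an edge term and a log-distance term, using p = 1, alpha = exp(-theta_1) and gamma = -theta_2. The parameter values are hypothetical and chosen only for illustration.

```python
import math


def attenuated_power_law(d, p, alpha, gamma):
    """Tie probability p / (1 + alpha * d**gamma) at distance d."""
    return p / (1.0 + alpha * d ** gamma)


def logistic_log_distance(d, theta1, theta2):
    """Tie probability of a dyad-independent model with edge and log-distance terms."""
    eta = theta1 + theta2 * math.log(d)
    return 1.0 / (1.0 + math.exp(-eta))


theta1, theta2 = -1.0, -0.8  # hypothetical parameter values
p, alpha, gamma = 1.0, math.exp(-theta1), -theta2

for d in (0.5, 1.0, 2.0, 10.0):
    a = attenuated_power_law(d, p, alpha, gamma)
    b = logistic_log_distance(d, theta1, theta2)
    print(d, round(a, 6), round(b, 6))
```

With a negative theta_2 the exponent gamma is positive, so the tie probability falls as distance grows.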

Spatial embedding (Book Ch. 8)
ERGM: distance and endogenous dependence explain different things

                    (1)            (2)            (3)            (4)
Edges               -4.87* (.3)    .56* (.65)     -4.79* (.66)   -.2 (.87)
Alt. star                                         -.86* (.8)     -.86* (.2)
Alt. triangle                                     2.74* (.5)     2.69* (.4)
Log distance                       -.78* (.8)                    -.56* (.7)
Age heterophily     -.7* (.)       -.7* (.)       . (.7)         .2 (.6)
Gender homophily    -.3* (.6)      -.3 (.69)      .9 (.83)       .7 (.47)

Bipartite and Multilevel ERGM

Multilevel
- Level B: B_rs, the B-network
- X_ir, the X-network
- Level A: A_ij, the A-network

Multilevel
- Network statistics can be derived based on the same dependence assumptions
- Different interpretation as we assume dependencies between tie-variables of different types
Three network variables A, B and X:
$$\Pr(A = a, X = x, B = b) = \frac{1}{\kappa(\theta)} \exp\Big\{ \sum_{Q} \big[ \theta_Q z_Q(a) + \theta_Q z_Q(b) + \theta_Q z_Q(x) + \theta_Q z_Q(a, x) + \theta_Q z_Q(b, x) + \theta_Q z_Q(a, x, b) \big] \Big\}$$
- $z_Q(a)$, $z_Q(b)$: within level effects
- $z_Q(x)$: between level effects
- $z_Q(a, x)$, $z_Q(b, x)$: interaction between within level and between level networks
- $z_Q(a, x, b)$: cross level effects
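To show what such statistics look like in practice, the Python sketch below counts three within and between level statistics (edges in A, B and X) and one interaction statistic (an A-tie whose two endpoints are affiliated with the same B node), then forms the weighted sum that sits inside the exponent. The function names, the toy data and the parameter values are hypothetical; this is not how MPNet labels or computes its effects.

```python
def multilevel_counts(a_edges, b_edges, x_edges):
    """Count a few statistics of a two-level network.

    a_edges: undirected ties among A nodes, as pairs (i, j) with i != j
    b_edges: undirected ties among B nodes, as pairs (r, s) with r != s
    x_edges: affiliation ties, as pairs (A node, B node)
    """
    a = {frozenset(e) for e in a_edges}
    b = {frozenset(e) for e in b_edges}
    x = set(x_edges)

    affiliations = {}
    for i, r in x:
        affiliations.setdefault(i, set()).add(r)

    # One count for every (A-tie, shared B node) combination.
    shared = 0
    for edge in a:
        i, j = tuple(edge)
        shared += len(affiliations.get(i, set()) & affiliations.get(j, set()))

    return {
        "edges_a": len(a),
        "edges_b": len(b),
        "edges_x": len(x),
        "a_tie_with_shared_affiliation": shared,
    }


def log_weight(theta, stats):
    """Unnormalised log-probability: the sum of theta_Q * z_Q over the statistics."""
    return sum(theta[name] * value for name, value in stats.items())


stats = multilevel_counts(
    a_edges=[(1, 2), (2, 3)],
    b_edges=[("r", "s")],
    x_edges=[(1, "r"), (2, "r"), (3, "s")],
)
theta = {  # hypothetical parameter values
    "edges_a": -2.0,
    "edges_b": -1.0,
    "edges_x": -1.5,
    "a_tie_with_shared_affiliation": 0.5,
}
print(stats)
print(log_weight(theta, stats))
```

The normalising constant kappa(theta) sums the exponentiated weight over all possible (a, x, b), which is why estimation relies on simulation rather than direct evaluation.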

Multilevel
- Bernoulli
- Markov
- Affiliation based activity
- Affiliation based closure or homophily
- Social circuit and three-path
- Affiliation assortativity
- Cross-level assortativity/entrainment

Multilevel: example, global fisheries governance (Hollway & Koskinen, 2014)
[Figure: the A-network, the X-network and the B-network; nodes are labelled with country codes and numeric IDs]

Multilevel: example, global fisheries governance (Hollway & Koskinen, 2014)
Effects  Parameter  Stderr  t-ratio  SACF
EdgeA - 2.222 9.526 -.2 -.5
ASA .348.36 -.5 -. *
ATA .3388.3.7. *
GDP_SumA .68.865 -.2 -.9
GDP_ProductA -.6.78 -.22 -.2
species_suma .84.2 -. -.2 *
distance_edgea -.64.95 -.9 -. *
XEdge - 9.8.896 -.6.6 *
IsolatesA - 6.824.748.8 -.4 *
XASA 4.3784.32 -.63.55 *
XASB -.467.396 -.6.62 *
XACA -.4665.3 -.67 -.4 *
XACB .2 -.45.33
loggdpstatetreat_xedge .467.4 -.7.63 *
Star2AX .458. -.8 -.24 *
Star2BX -.5834.84 -.46.45 *
TriangleXBX 2.8928.97 -.3.4 *
L3XBX 3.433.267.57.7 *
ATXBX -.35. -.3 -.24 *
L3AXB -.36. -. -.4

Multilevel: example, global fisheries governance (Hollway & Koskinen, 2014)
EdgeA 59 59.53 4.54 -.35
Star2A 45 25.96 38.5.38
Star3A 7525 6986.738 62.3.867
Star4A 48877 44694.3 2383.26.755
Star5A 26433 245549.822 7529.89 2.494
TriangleA 37 38.89.6 -.8
ASA 326.9238 328.5935 42.39 -.39
ASA2 326.9238 328.5935 42.39 -.39
ATA 85.963 86.578 9.376 -.34
A2PA 62.2734 45.7699.27.5
AETA 83.82 97.295 62.753 -.225
coast_suma 8227.8 7848.968 698.574.223
coast_differencea 574.2 628.7824 655.48 -.34
coast_producta 552268.66 5742.9956 5757.678.786
GDP_SumA 358.3 3592.88 325.357 -.36
GDP_DifferenceA 248.9 248.4494 9.5.24
GDP_ProductA 28.929 274.487 83.76 -.36
species_suma 426 468.939 38.74 -.3
species_differencea 5458 627.877 76.398 -.985
species_producta 229363 2267.25 43495.224.2
distance_edgea 245.633 249.2669 2.7 -.32
XEdge 744 744.34 25.393 -.2
XStar2A 2297 666.35 29.44 2.875
XStar2B 4955 447.62 476.587.7
XStar3A 8932 74966.58 9.676 5.464
XStar3B 36652 354528.85 5253.258 2.28
X3Path 94997 937576.735 2398.243.9
X4Cycle 7539 69696.854 899.436 2.82
XECA 2265537 2239.446 72782.425.995
XECB 38684 4999. 223227.68.28

Multilevel: example, global fisheries governance (Hollway & Koskinen, 2014)
IsolatesA 6 6.8.966 -.4
IsolatesB .93.44 -.438
XASA 2785.5996 2786.52 48.835 -.
XASB 347.4525 348.564 5.356 -.2
XACA 4344.3993 4345.9997 6.884 -.26
XACB 22625.6352 22624.8278 4.667.8
XAECA 293489.2989 27696.6764 7695.3 2.26
XAECB 29868.776 278382.2566 762.24 2.67
loggdpstatetreat_xedge 8782.9 8786.8598 267.542 -.8
Star2AX 526 5272.267 529.59 -.2
StarAAX 6539.459 6834.3932 922.6 -.32
StarAXA 9277.9278 933.466 956.33 -.27
StarAXAA 34.2835 349.8379 78.967.6
TriangleXAX 7 888.225 7.855.695
L3XAX 27.8552 253.83 25.28.78
ATXAX 47884 45349.46 556.9.459
EXTA 2273 2638.83 853.83 -.428
Star2BX 23 23.26 26.2 -.
StarABX 85.375 824.7945 2.68 -.774
StarAXB 3833.3 3833.2944 52.83 -.5
StarAXAB 36.9437 36.825 5.42 -.7
TriangleXBX 396 396.673 6.435 -.4
L3XBX 54.3754 54.38.832 -.6
ATXBX 37668 3783.366 88.2 -.75
EXTB 552 566.58 5.745 -2.525
L3AXB 5383 5398.32 538.28 -.28
C4AXB 73 875.858 8.474.87
ASAXASB 8354.834 8659.877 923.27 -.33

Longitudinal ERGM

LERGM FDI electricity market (Koskinen and Lomi, 2013)
