Big Data in Astronomy: The Large Synoptic Survey Telescope
Prof. Sarah Bridle, University of Manchester
1. The Large Synoptic Survey Telescope (LSST)
2. Big Data challenges in LSST
   Image simulations
   Rapid image processing
   High precision image processing
   Catalogue search
3. Ways to get involved
Big Data from 3.2 Gpixel camera
2000 exposures per night -> 20 TB per night
10 year survey
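The figures on this slide can be cross-checked with a little arithmetic. The sketch below uses only the numbers quoted above, plus one assumption that is not on the slide: roughly 300 observing nights per year (the constant `NIGHTS_PER_YEAR` is hypothetical and easy to change).

```python
# Back-of-the-envelope check of the data volume quoted on the slide.
PIXELS_PER_EXPOSURE = 3.2e9   # 3.2 Gpixel camera (from the slide)
EXPOSURES_PER_NIGHT = 2000    # from the slide
TB_PER_NIGHT = 20.0           # from the slide
SURVEY_YEARS = 10             # from the slide
NIGHTS_PER_YEAR = 300         # ASSUMPTION: not stated on the slide


def bytes_per_pixel():
    """Average bytes stored per pixel implied by 20 TB over 2000 exposures."""
    bytes_per_night = TB_PER_NIGHT * 1e12
    return bytes_per_night / (EXPOSURES_PER_NIGHT * PIXELS_PER_EXPOSURE)


def survey_volume_pb(nights_per_year=NIGHTS_PER_YEAR):
    """Total raw volume over the survey in petabytes (1 PB = 1000 TB)."""
    return TB_PER_NIGHT * nights_per_year * SURVEY_YEARS / 1000.0


if __name__ == "__main__":
    print(f"bytes per pixel: {bytes_per_pixel():.3f}")
    print(f"survey volume:   {survey_volume_pb():.0f} PB")
```

Under these assumptions the nightly figure corresponds to about 3 bytes per pixel, and the ten-year total lands in the tens of petabytes; the exact total scales linearly with the assumed number of usable nights.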
Big Camera
Big Telescope
Big Data from LSST
Within its first month of operation, LSST will survey more of the Universe than all previous telescopes built by mankind
Big Data from 800 images (movie) of the southern hemisphere in 6 colours
~100 000 alerts/night worldwide, within 60 seconds
http://www.lsst.org/news/enews/first-blast-201104.html
LSST Basics
8.4m mirror
9.6 sq deg FOV
20,000 sq deg of sky
1000 visits per field
filters: ugrizy, 320-1035 nm
r ~24.7 in single visit, ~27.7 stacked depth
3.2 Gpix camera
~0.01 mag precision photometry
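The gap between single-visit and stacked depth can be related to the number of visits with a standard scaling. The sketch below assumes idealised, background-limited exposures of equal quality, so that signal-to-noise grows as the square root of the number of stacked visits; that assumption, and the function names, are illustrative and not taken from the slide.

```python
import math


def coadd_depth(single_visit_depth, n_visits):
    """Limiting magnitude of a stack of n equal exposures.

    Assumes S/N grows as sqrt(n), i.e. a gain of 2.5*log10(sqrt(n)) mag.
    """
    return single_visit_depth + 1.25 * math.log10(n_visits)


def visits_for_depth(single_visit_depth, stacked_depth):
    """Number of equal visits needed to reach a given stacked depth."""
    return 10 ** ((stacked_depth - single_visit_depth) / 1.25)


if __name__ == "__main__":
    # Slide values: r ~24.7 per visit, ~27.7 stacked.
    print(f"visits implied: {visits_for_depth(24.7, 27.7):.0f}")
    print(f"depth after 100 visits: {coadd_depth(24.7, 100):.1f}")
```

Under this idealised scaling, a 3 magnitude gain corresponds to a few hundred visits in one band, which is plausible given that the roughly 1000 visits per field are shared between six filters.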
Big Collaboration: a subset in Tucson, Arizona
LSST Data Products
LSST science and engineering tools
All software is version controlled and provenance information is output with the data.
Systems are validated through project-initiated reviews with external members.
Slide by Andy Connolly, LSST Simulation Scientist
The simulation framework
C/C++/Python
CatSim
PhoSim
Slide by Andy Connolly, LSST Simulation Scientist
PhoSim (Peterson et al 2013)
[Figure panels: Optical model; +Tracking; +Diffraction; +Det Perturbations; +Lens Perturbations; +Mirror Perturbations; +Detector; +Dome Seeing; +Low Altitude Atmosphere; +Mid Altitude Atmosphere; +High Altitude Atmosphere; +Pixelization]
3 Gigapixels
10 sq. degrees
20 million sources
10^10 photons
11 Gbytes
1000 CPU hours
Supernova Classification
SNPhotCC (Kessler et al 2011)
[Figure: light curves (flux against T_obs, one panel per band u, g, r, i, z) for SN SDSS 2007og (z=0.2), SN SDSS 14475 (z=0.14) and SN SDSS 2006kn (z=0.12)]
Leads in LSST: Alex Kim, Michael Wood-Vasey
Leads in LSST:UK: Mark Sullivan (Southampton), Hiranya Peiris (UCL)
Expected Rate of Transients
Class                     Mag           t (days)   Universal Rate                     LSST Rate
Luminous SNe              -19...-23     50-400     10^-7 Mpc^-3 yr^-1                 20000
Orphan Afterglows SHB     -14...-18     5-15       3 x 10^(-7...-9) Mpc^-3 yr^-1      ~10-100
Orphan Afterglows LSB     -22...-26     2-15       3 x 10^(-10...-11) Mpc^-3 yr^-1    1000
On-axis GRB afterglows    ...-37        1-15       10^-11 Mpc^-3 yr^-1                ~50
Tidal Disruption Flares   -15...-19     30-350     10^-6 Mpc^-3 yr^-1                 6000
Luminous Red Novae        -9...-13      20-60      10^-13 yr^-1 Lsun^-1               80-3400
Fallback SNe              -4...-21      0.5-2      <5 x 10^-6 Mpc^-3 yr^-1            <800
SNe Ia                    -17...-19.5   30-70      3 x 10^-5 Mpc^-3 yr^-1             200000
SNe II                    -15...-20     20-300     (3..8) x 10^-5 Mpc^-3 yr^-1        100000
Table adapted from Rau et al. 2009
Slide by Lucianne Walkowicz, Co-Chair of Transients and Variable Stars
TABLE 5. List of Participants in the SNPhotCC.
Participants                    Abbreviation (a)    +Z (b)/noz (c)  z_ph (d)  CPU (e)  Description (strategy class (f))
P. Belov and S. Glazov          Belov & Glazov      yes/no          no        90       light curve χ² test against Nugent templates (2)
S. Gonzalez                     Gonzalez            yes/yes         no        120      cuts on SiFTO fit χ² and fit parameters (1)
J. Richards, Homrighausen,      InCA (g)            no/yes          no        1        Spline fit & nonlinear dimensionality reduction (4)
  C. Schafer, P. Freeman
J. Newling, M. Varuguese,       JEDI-KDE            yes/yes         no        10       Kernel Density Evaluation with 21 params (4)
  B. Bassett, R. Hlozek,        JEDI Boost          yes/yes         no        10       Boosted decision trees (4)
  D. Parkinson, M. Smith,       JEDI-Hubble         yes/no          no        10       Hubble diagram KDE (3)
  H. Campbell, M. Hilton,       JEDI Combo          yes/no          no        10       Boosted decision trees + Hubble KDE (3+4)
  H. Lampeitl, M. Kunz,
  P. Patel (JEDI group (h))
S. Philip, V. Bhatnagar,        MGU+DU-1 (i)        no/yes          no        <1       light curve slopes & Neural Network (2)
  A. Singhal, A. Rai,           MGU+DU-2            no/yes          no        <1       light curve slopes & Random Forests (2)
  A. Mahabal, K. Indulekha
H. Campbell, B. Nichol,         Portsmouth χ²       yes/no          no        1        SALT2 χ²_r & False Discovery Rate Statistic (1)
  H. Lampeitl, M. Smith         Portsmouth-Hubble   yes/no          no        1        Deviation from parametrized Hubble diagram (3)
D. Poznanski                    Poz2007 RAW         yes/no          yes       2        SN Automated Bayesian Classifier (SN ABC) (2)
                                Poz2007 OPT         yes/no          yes       2        SN ABC with cuts to optimize C_FoM-Ia (2)
S. Rodney                       Rodney              yes/yes         yes       230      SN Ontology with Fuzzy Templates (2)
M. Sako                         Sako                yes/yes         yes       120      χ² test against grid of Ia/II/Ibc templates (2)
S. Kuhlmann, R. Kessler         SNANA cuts          yes/yes         yes       2        Cut on MLCS fit probability, S/N & sampling (1)
(a) Groups are listed alphabetically by abbreviation.
(b) Classifications included for SNPhotCC/HOSTZ.
(c) Classifications included for SNPhotCC/nohostz.
(d) photo-z estimates included.
(e) Average processing time per SN (seconds) using similar 2-3 GHz cores.
(f) From §3, strategy classes are 1) selection cuts, 2) Bayesian probabilities, 3) Hubble-diagram parametrization and 4) statistical inference.
(g) International Computational Astrophysics Group: http://www.incagroup.org
(h) Joint Exchange and Development Initiative: http://jedi.saao.ac.za
(i) MGU = Mahatma Gandhi University, DU = Delhi University.
SNPhotCC (Kessler et al 2011)
[Paper text visible around the table: "... best method in this first SNPhotCC, here we carefully examine the C_FoM-Ia for the unconfirmed sample in the SNPhotCC/HOSTZ (Fig. 4). The entry with the highest ..." and "... subset was generally treated as a random subset, which it clearly is not (§2.5). The magnitude-limited selection of spectroscopic targets resulted in the selection of brighter ..."]
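Several entries in the table classify by a χ² test of the light curve against templates. The sketch below illustrates that general idea only: the two templates are made-up toy shapes (not the Nugent templates or any entry's actual code), the light curve is simulated, and the only free parameter is an overall amplitude that is fitted analytically.

```python
import numpy as np


def fit_amplitude(flux, err, template):
    """Best-fit scale A and chi2 for the model flux = A * template."""
    w = 1.0 / err**2
    amp = np.sum(w * flux * template) / np.sum(w * template**2)
    chi2 = np.sum(w * (flux - amp * template) ** 2)
    return amp, chi2


def classify(flux, err, templates):
    """Return the template name with the lowest chi2, and all chi2 values."""
    chi2 = {name: fit_amplitude(flux, err, t)[1] for name, t in templates.items()}
    return min(chi2, key=chi2.get), chi2


def toy_template(t, rise, fall):
    """Hypothetical light-curve shape: exponential rise, exponential decline."""
    return np.where(t < 0, np.exp(t / rise), np.exp(-t / fall))


# Observation epochs in days relative to peak (toy values).
t = np.linspace(-20.0, 80.0, 26)
templates = {
    "Ia-like": toy_template(t, rise=5.0, fall=20.0),  # fast decline (toy)
    "II-like": toy_template(t, rise=5.0, fall=60.0),  # slow decline (toy)
}

# Simulate a noisy "Ia-like" light curve and classify it.
rng = np.random.default_rng(0)
err = np.full(t.size, 20.0)
flux = 500.0 * templates["Ia-like"] + rng.normal(0.0, err)
label, chi2 = classify(flux, err, templates)

if __name__ == "__main__":
    print(label, {k: round(float(v), 1) for k, v in chi2.items()})
```

A realistic classifier would also fit for time of peak, redshift and multi-band colours, and would have to cope with the non-representative spectroscopic training sample noted in the paper text above.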
It's a Big Deal: Discovery of Accelerating Universe Wins 2011 Nobel Prize
Why is the Universe Accelerating?
Einstein's cosmological constant
A new fluid called Dark Energy
  Equation of state w = p/ρ
General Relativity is wrong
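For reference, the standard link between the equation of state and acceleration can be written as follows. This is textbook general relativity for a single fluid with positive density in units where c = 1, added here as background rather than taken from the slide.

```latex
\frac{\ddot a}{a} = -\frac{4\pi G}{3}\,(\rho + 3p),
\qquad p = w\rho
\;\Longrightarrow\;
\ddot a > 0 \iff w < -\tfrac{1}{3}
```

A cosmological constant corresponds to w = -1 exactly, so measuring w away from -1 would point to a dynamical dark energy fluid instead.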
Using the bending of light to see the invisible
Cosmic Shear Galaxies seen through dark matter distribution analogous to Streetlamps seen through your bathroom window
Cosmic Shear: g_i ~ 0.2
Real data: g_i ~ 0.03
Atmosphere and Telescope Convolution with kernel Real data: Kernel size ~ Galaxy size
Pixelisation Sum light in each square Real data: Pixel size ~ Kernel size /2
Noise Mostly Poisson. Some Gaussian and bad pixels. Uncertainty on total light ~ 5 per cent
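The four preceding slides (shear, convolution, pixelisation, noise) describe a forward process that can be sketched in a few lines. Everything below is an illustrative toy: the galaxy and kernel are Gaussians, "size" is taken to mean the Gaussian sigma, and the grid size, counts and read noise are assumed values chosen so that the kernel is as large as the galaxy, each pixel is half the kernel size, and the Poisson uncertainty on the total light is about 5 per cent.

```python
import numpy as np


def sheared_gaussian(n, sigma, g1=0.0, g2=0.0):
    """Unit-flux Gaussian on an n x n grid, distorted by a shear (g1, g2)."""
    y, x = np.indices((n, n)) - n // 2
    # Map image-plane coordinates back to the unsheared source plane.
    xs = (1.0 - g1) * x - g2 * y
    ys = -g2 * x + (1.0 + g1) * y
    img = np.exp(-0.5 * (xs**2 + ys**2) / sigma**2)
    return img / img.sum()


def convolve(img, kernel):
    """Circular convolution via FFT; the kernel is centred at index n // 2."""
    k = np.fft.ifftshift(kernel)  # move the kernel centre to index (0, 0)
    return np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(k)))


def pixelise(img, block):
    """Sum the light in each block x block square of the fine grid."""
    n = img.shape[0] // block
    return img.reshape(n, block, n, block).sum(axis=(1, 3))


def add_noise(img, total_counts, read_sigma, rng):
    """Poisson photon noise plus a small Gaussian read-noise term."""
    expected = np.clip(img, 0.0, None) * total_counts
    counts = rng.poisson(expected).astype(float)
    return counts + rng.normal(0.0, read_sigma, img.shape)


rng = np.random.default_rng(1)
galaxy = sheared_gaussian(128, sigma=8.0, g1=0.03)  # g ~ 0.03 as in real data
kernel = sheared_gaussian(128, sigma=8.0)           # kernel size ~ galaxy size
blurred = convolve(galaxy, kernel)
pixels = pixelise(blurred, block=4)                 # pixel ~ kernel size / 2
noisy = add_noise(pixels, total_counts=400, read_sigma=0.05, rng=rng)

if __name__ == "__main__":
    print(pixels.shape, round(float(pixels.sum()), 6), round(float(noisy.sum()), 1))
```

Shape measurement is the inverse problem: recover g1 and g2 from `noisy`, given an estimate of `kernel` from stars.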
Bridle et al 2010
A typical galaxy image for cosmic shear
Intrinsic galaxy shape: b/a ~ 0.5
Uncertainty due to noise: σ_b/a ~ 0.5
Modification due to lensing: Δb/a ~ 0.05
Effect of changing w: δb/a ~ 0.0005
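The numbers on this slide imply how many galaxies must be averaged before the effect of dark energy becomes visible. The sketch below assumes independent galaxies, so that the error on the mean shape falls as one over the square root of the number of galaxies; that assumption and the function name are illustrative.

```python
def galaxies_needed(scatter_per_galaxy, signal, n_sigma=1.0):
    """Galaxies required for the error on the mean to reach signal / n_sigma.

    Assumes independent galaxies: error on mean = scatter / sqrt(N).
    """
    return (n_sigma * scatter_per_galaxy / signal) ** 2


if __name__ == "__main__":
    # Slide values: per-galaxy uncertainty ~0.5 in b/a.
    print(f"to see the lensing change (0.05):  {galaxies_needed(0.5, 0.05):.0f}")
    print(f"to see the effect of w (0.0005):   {galaxies_needed(0.5, 0.0005):.0f}")
```

Reaching even a one-sigma detection of the 0.0005 effect takes about a million galaxies under these assumptions, which is why any systematic error in the shape measurement has to be controlled to a comparable level.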
Annals of Applied Statistics March 2009
Slide by David Hogg Following NIPS Cosmology Workshop discussion with Iain Mu
Typical
Running Joseph's code
DES data, multiple exposures
Image, Model, Weight, Residuals
DESDM data, PSFs; im3shape fit (Zuntz, Hirsch, Kacprzak, Rowe, Ma
Successful fits
How to deal with overlaps?
interloper; target; model; mask removes interloper; lovely residuals
DESDM data, PSFs; im3shape fit (Zuntz, Hirsch, Kacprzak, Rowe, Ma
Slide by David Kirkby Today DES-r 800s 13.7 electrons
Slide by David Kirkby In 10 years LSST-r 6900s 13.7 electrons
Catalogue search in phase space
Find remnants of galaxies colliding with the Milky Way
From positions and velocities of 10 billion stars
The Sagittarius dwarf galaxy in o
Leads in LSST: John Bochanski, Nitya Jacob Kallivayalil, Beth W
Leads in LSST:UK: Vasily Belokurov (Cambridge), Nic Walton (Cambridge)
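A toy version of a phase-space search is to bin stars in position and velocity and look for cells that are far denser than their surroundings. The sketch below is a deliberately simplified illustration with one position and one velocity coordinate and entirely simulated stars; the halo and clump parameters are invented, and real searches work in more dimensions and with far larger catalogues.

```python
import numpy as np


def densest_cell(x, v, bins, x_range, v_range):
    """Centre and occupancy of the most populated (position, velocity) cell."""
    counts, x_edges, v_edges = np.histogram2d(
        x, v, bins=bins, range=[x_range, v_range]
    )
    i, j = np.unravel_index(np.argmax(counts), counts.shape)
    x_mid = 0.5 * (x_edges[i] + x_edges[i + 1])
    v_mid = 0.5 * (v_edges[j] + v_edges[j + 1])
    return x_mid, v_mid, int(counts[i, j])


rng = np.random.default_rng(42)

# Smooth background: broad in position (kpc) and velocity (km/s). Toy values.
halo_x = rng.uniform(-50.0, 50.0, 100_000)
halo_v = rng.normal(0.0, 100.0, 100_000)

# Cold remnant: few stars, but tightly clustered in both coordinates.
clump_x = rng.normal(21.0, 1.0, 1_000)
clump_v = rng.normal(152.0, 4.0, 1_000)

x = np.concatenate([halo_x, clump_x])
v = np.concatenate([halo_v, clump_v])
x_peak, v_peak, n_peak = densest_cell(
    x, v, bins=50, x_range=(-50.0, 50.0), v_range=(-400.0, 400.0)
)

if __name__ == "__main__":
    print(x_peak, v_peak, n_peak)
```

The clump makes up about 1 per cent of the stars and would be hard to pick out in position or velocity alone, yet it dominates a single cell once the two are combined. Binning is a single pass over the catalogue, which is one reason histogram-style methods are attractive at the scale of billions of stars.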
LSST Scientific Possibilities
LSST Science Book: http://www.lsst.org/lsst/science/scibook
598 pages, 245 authors
Preface
1. Introduction
2. LSST System Design
3. System Performance
4. Education and Public Outreach
5. The Solar System
6. Stellar Populations
7. Milky Way and Local Volume
8. The Transient and Variable Universe
9. Galaxies
10. Active Galactic Nuclei
11. Supernovae
12. Strong Lenses
13. Large-Scale Structure
14. Weak Lensing
15. Cosmological Physics
Science Collaborations
Solar System
Milky Way and Local Volume Structure
Transients & Variable Stars
Galaxies
Active Galactic Nuclei
Supernovae
Stellar Populations
Strong Lensing
Weak Lensing
Large Scale Structure & Baryon Oscillations
Informatics & Statistics
LSST:UK I&S Leads: Hiranya Peiris (UCL), Jason McEwen (UCL)
Slide by Kirk Bourne, Dept of Computational & Data Sciences Ge University
Sign up to get involved
https://docs.google.com/spreadsheet/ccc?key=0aqx4pj9ojyrudfvjqU85SS02eEZxeEhTaUJKYmZjVmc&usp=sharing
Current status: Submitted 40 page proposal to STFC. PPRP panel presentation on 27th October 2014
Open Problems Related to LSST
Shear measurement (GREAT08, GREAT3)
Cosmological parameter estimation (e.g. CosmoSIS)
LSST simulations (CatSim, PhoSim, ImSim)
Real-time transient classification
Supernova Classification Challenge
Catalogue search
Dark Worlds Kaggle Challenge
Strong Lens Time Delay Challenge
Communication in large collaborations
Noise Bias Many identical images with different noise
Bias disappears at high S/N Above requirements at low S/N
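The experiment described on these two slides (many identical images with different noise) can be mimicked with a toy estimator. The sketch below does not use im3shape or any real shape-measurement code: it assumes a galaxy described by two second moments with independent Gaussian noise, and estimates the ellipticity as their normalised difference, which is the maximum likelihood estimate under that noise model. Because the estimate is a ratio of noisy quantities, its mean is offset from the truth.

```python
import numpy as np


def mean_ellipticity(e_true, snr, n_images, rng):
    """Mean estimated ellipticity over many noise realisations of one galaxy."""
    total = 2.0                           # toy total size qxx + qyy
    qxx = 0.5 * total * (1.0 + e_true)
    qyy = 0.5 * total * (1.0 - e_true)
    sigma = total / snr                   # noise per moment (toy definition)
    qxx_hat = qxx + rng.normal(0.0, sigma, n_images)
    qyy_hat = qyy + rng.normal(0.0, sigma, n_images)
    e_hat = (qxx_hat - qyy_hat) / (qxx_hat + qyy_hat)
    return e_hat.mean()


rng = np.random.default_rng(7)
E_TRUE = 0.2
N_IMAGES = 200_000

bias_low_snr = mean_ellipticity(E_TRUE, snr=10.0, n_images=N_IMAGES, rng=rng) - E_TRUE
bias_high_snr = mean_ellipticity(E_TRUE, snr=100.0, n_images=N_IMAGES, rng=rng) - E_TRUE

if __name__ == "__main__":
    print(f"S/N = 10:  multiplicative bias m = {bias_low_snr / E_TRUE:+.4f}")
    print(f"S/N = 100: multiplicative bias m = {bias_high_snr / E_TRUE:+.4f}")
```

In this toy the fractional bias scales roughly as 2/(S/N)² to leading order, so it is at the per cent level at S/N = 10 and negligible at S/N = 100, which is the qualitative behaviour described on the slide. The numbers themselves depend on the toy noise definition and should not be read as survey predictions.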
What causes the bias? For model fitting methods:
Noise bias (Refregier, SB et al; Kacprzak, SB et al 2012)
  Maximum likelihood methods are biased
  Calibration works well enough
Model bias (Voigt & Bridle 2009)
  e.g. use wrong profile in fit
  e.g. use elliptical isophote model in fit
Galaxy Models
But galaxies aren't simple
Model galaxy; Actual galaxy
Model Bias
The effect of realistic galaxy shapes
Measure with sims from HST data
Bias for red and blue galaxies shown
DES 5-year requires mean m < 0.005
Plots from Tomasz Kacprzak
Impact on dark energy constraints Simulate for different redshifts Kacprzak, SB, et al 2013
Taken from Bridle et al GREAT08 Handbook
Slide by Kirk Bourne, Dept of Computational & Data Sciences Ge University
Typical galaxy used for cosmic shear analysis
Typical star used for finding convolution kernel
Big Data in Astronomy: The Large Synoptic Survey Telescope
1. The Large Synoptic Survey Telescope (LSST)
2. Big Data challenges in LSST
3. Weak Lensing in LSST
   3.1 Big Data: Galaxy shape measurement
   3.2 Big Models: Covariance matrix estimation
   3.3 It's a Big Deal: Proving Einstein wrong
4. Ways to get involved