Data analysis of L2-L3 products
Transcription
1 Data analysis of L2-L3 products
Emmanuel Gangler, UBP Clermont-Ferrand (France)
Emmanuel Gangler BIDS 14 1/13
2 Data management is a pillar of the project
Diagram labels: Telescope, Camera, Data Management, Outreach; L1 & L2; L3
«The data volumes [...] of LSST are so large that the limitation on our ability to do science isn't the ability to collect the data, it's the ability to understand [...] the data» Andrew Connolly (U. Washington)
«How do you turn petabytes of data into scientific knowledge?» Kirk Borne (George Mason U.)
3 Data products
Diagram labels: L1 (nightly), L2 (annual); Image, Catalog
4 From Image to Catalog
- Raw image (in 1 band)
- Calibration images: «Flat», ...
- Clean image (+ weight image + flag image)
- Standard container: .fits format
SNLS images, from P. Astier
5 From Image to Catalog
- Clean images + astrometry
- Stacked image (here: 600 images)
- N sources, 1 object
SNLS images, from P. Astier
6 From Image to Catalog
For each object/source, extract data:
- Metadata: time of observation, band, exposure, ...
- Sky coordinates (almost an index): RA/Dec, pixel, ...
- Flux measurements: aperture, PSF, extended source, ...
- Shape measurements: 2nd-order moments, ...
- Quality flags
- And the associated covariance
Scale: ~100 attributes to describe a source, ~1000 sources per object, ~40 B objects
Remarks:
- LSST paradigm: characterize first (L2), analyze later (L3)
- Image processing: I/O driven, highly parallel
- Scalability: e.g. using map/reduce for coaddition
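The remark about map/reduce for coaddition can be illustrated with a minimal sketch. It assumes each clean image has already been resampled onto a common pixel grid and comes with a weight image in which flagged pixels have weight 0. The names `map_exposure`, `combine`, `coadd` and the patch identifiers are hypothetical, and the pixel values are made up; this is not the LSST pipeline code.

```python
from collections import defaultdict
from functools import reduce


def map_exposure(exposure):
    """Map step: turn one clean image into (patch_id, partial sums)."""
    patch_id, pixels, weights = exposure
    weighted = [p * w for p, w in zip(pixels, weights)]
    return patch_id, (weighted, list(weights))


def combine(a, b):
    """Reduce step: add two partial sums pixel by pixel (associative)."""
    (sum_a, w_a), (sum_b, w_b) = a, b
    return ([x + y for x, y in zip(sum_a, sum_b)],
            [x + y for x, y in zip(w_a, w_b)])


def coadd(exposures):
    """Group partial sums by sky patch, reduce them, then normalise."""
    groups = defaultdict(list)
    for key, value in map(map_exposure, exposures):
        groups[key].append(value)
    stacked = {}
    for key, values in groups.items():
        total, weight = reduce(combine, values)
        # Weighted mean per pixel; pixels never observed stay at 0.0.
        stacked[key] = [t / w if w > 0 else 0.0 for t, w in zip(total, weight)]
    return stacked


# Toy input: (patch id, pixel values, per-pixel weights).
exposures = [
    ("patch_a", [10.0, 20.0], [1.0, 1.0]),
    ("patch_a", [14.0, 20.0], [1.0, 0.0]),  # second pixel flagged as bad
    ("patch_b", [5.0, 5.0], [2.0, 2.0]),
]
stacked = coadd(exposures)
```

Because `combine` is associative, the partial sums for a patch can be reduced in any order and on any node, which is what makes the stacking step fit the map/reduce pattern.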
7 Data mining
Astroinformatics point of view (Borne 2009)
VO domain
8 Data mining
Astroinformatics point of view (Borne):
- Which knowledge to extract?
- How to reuse knowledge?
- How to integrate information and learning algorithms?
- Which new algorithms to develop?
- How to test the new ideas?
VO domain
9 Distributing LSST data
The baseline:
- Orchestration tool
- SQL parser
- Metadata DB
- User-defined functions (geometry)
- Communication with xrootd
- MySQL backend
- Returns aggregate results
Partitioning:
- Geometry (cone searches)
- Sources and Objects on the same node
Limitations:
- SQL-based
- Some queries can't be handled
- Ad hoc optimization
10 Distributing LSST data
The baseline:
- Orchestration tool
- SQL parser
- Metadata DB
- User-defined functions (geometry)
- Communication with xrootd
- MySQL backend
- Returns aggregate results
Partitioning:
- Geometry (cone searches)
- Sources and Objects on the same node
Limitations:
- SQL-based
- Some queries can't be handled
- Ad hoc optimization
WG1
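To make the idea of geometric partitioning concrete, here is a small sketch of a cone search over a sky cut into fixed RA/Dec chunks, with each object stored in the chunk that also holds its sources. The chunk size `CHUNK_DEG`, the record layout and all function names are assumptions chosen for illustration; they are not taken from the baseline system described on the slides.

```python
import math
from collections import defaultdict

CHUNK_DEG = 10.0  # hypothetical chunk size in degrees


def chunk_id(ra, dec):
    """Map a sky position to its (RA cell, Dec stripe) chunk."""
    return (int(ra % 360.0 // CHUNK_DEG), int((dec + 90.0) // CHUNK_DEG))


def separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def chunks_for_cone(ra, dec, radius):
    """List the chunks that can overlap a cone; only these need scanning."""
    n_ra = int(360.0 // CHUNK_DEG)
    dec_lo, dec_hi = max(-90.0, dec - radius), min(90.0, dec + radius)
    stripes = range(int((dec_lo + 90.0) // CHUNK_DEG),
                    int((dec_hi + 90.0) // CHUNK_DEG) + 1)
    if abs(dec) + radius >= 90.0:
        ra_cells = set(range(n_ra))  # cone touches a pole: every RA cell
    else:
        # Largest RA offset reached by a small circle centred at this Dec.
        ratio = math.sin(math.radians(radius)) / math.cos(math.radians(dec))
        half_width = math.degrees(math.asin(min(1.0, ratio)))
        ra0 = ra % 360.0
        lo = math.floor((ra0 - half_width) / CHUNK_DEG)
        hi = math.floor((ra0 + half_width) / CHUNK_DEG)
        ra_cells = {i % n_ra for i in range(lo, hi + 1)}  # wraps at RA = 0
    return sorted((i, j) for i in ra_cells for j in stripes)


def build_chunks(objects):
    """Store each object, together with its sources, in one chunk."""
    chunks = defaultdict(list)
    for obj in objects:
        chunks[chunk_id(obj["ra"], obj["dec"])].append(obj)
    return chunks


def cone_search(chunks, ra, dec, radius):
    """Return the ids of objects within `radius` degrees of (ra, dec)."""
    hits = []
    for cid in chunks_for_cone(ra, dec, radius):
        for obj in chunks.get(cid, []):
            if separation_deg(ra, dec, obj["ra"], obj["dec"]) <= radius:
                hits.append(obj["id"])
    return sorted(hits)


# Toy catalogue (made-up positions); sources travel with their object.
objects = [
    {"id": 1, "ra": 359.5, "dec": 0.2, "sources": ["s1a", "s1b"]},
    {"id": 2, "ra": 0.4, "dec": -0.3, "sources": ["s2a"]},
    {"id": 3, "ra": 180.0, "dec": 45.0, "sources": []},
]
chunks = build_chunks(objects)
nearby = cone_search(chunks, 0.0, 0.0, 1.0)
```

In this sketch a 1-degree cone around RA = 0, Dec = 0 touches only 4 chunks, so a query of this kind would be sent to a handful of nodes rather than to the whole catalogue.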
11 Which knowledge to extract?
- Classical problems in astronomy
- Object classification
- Highly dimensional problems (> 1000 dimensions, > 10^10 entries)
- 2-point (or N-point) correlations
- Rarity metric, efficient algorithms
- Discoveries? Anomalies (detector, software)
- Dimensional reduction
- Rarity detection
- Cluster significance? (statistical/scientific)
- Confusion problems
- Efficient algorithms for compact data representation
- Measurement errors, statistical approach: impact usually underestimated in machine learning
S. G. Djorgovski,
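As an illustration of the 2-point correlation problem, the sketch below counts pairs by brute force in a flat 2D toy geometry and applies the natural estimator, xi = (normalised DD) / (normalised RR) - 1. The function names and the toy points are assumptions for illustration. The cost grows as N^2, which is exactly why the efficient algorithms mentioned above (tree-based pair counting, for instance) matter at 10^10 entries.

```python
import math
from bisect import bisect_right


def pair_counts(points, edges):
    """Histogram of pair separations; bins are [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.hypot(points[i][0] - points[j][0],
                           points[i][1] - points[j][1])
            k = bisect_right(edges, d) - 1
            if 0 <= k < len(counts):
                counts[k] += 1
    return counts


def natural_estimator(data, randoms, edges):
    """xi = (DD / n_dd) / (RR / n_rr) - 1 per bin; None where RR is empty."""
    dd = pair_counts(data, edges)
    rr = pair_counts(randoms, edges)
    n_dd = len(data) * (len(data) - 1) / 2
    n_rr = len(randoms) * (len(randoms) - 1) / 2
    return [(d / n_dd) / (r / n_rr) - 1 if r > 0 else None
            for d, r in zip(dd, rr)]


# Toy data: four points on a line, separations 1, 2 and 3.
line = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
edges = [0.5, 1.5, 2.5, 3.5]
dd = pair_counts(line, edges)
```

Comparing a catalogue with itself gives xi = 0 in every populated bin, which is a convenient sanity check before plugging in a real random catalogue.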
12 Some astrophysical challenges for machine learning
Galaxy classification:
- Humans are better than computers at this task
- Citizen science (e.g. Galaxy Zoo)
- However: 20 B galaxies in LSST
Transient classification:
- See Darko's talk
Photometric redshifts:
- How to invert the (galaxy type + "distance") to u g r i z y (& morphology) relation to retrieve distance and galaxy properties?
- Spectroscopic training sample smaller by ~10^3
- Recovering hidden parameters...
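One simple way to picture the photometric-redshift inversion is a nearest-neighbour lookup in magnitude space against a spectroscopic training sample. The sketch below is only that: a toy k-nearest-neighbour regressor with made-up magnitudes and redshifts, using a hypothetical function name `knn_photoz`. It is not a method taken from the talk, and it ignores measurement errors and galaxy type.

```python
import math


def knn_photoz(train, query_mags, k=3):
    """Average the redshifts of the k training galaxies closest in (u,g,r,i,z,y)."""
    ranked = sorted(train, key=lambda item: math.dist(item[0], query_mags))
    nearest = ranked[:k]
    return sum(z for _, z in nearest) / len(nearest)


# Synthetic training set: ((u, g, r, i, z, y) magnitudes, spectroscopic redshift).
# The numbers are invented for illustration, not real photometry.
train = [
    ((24.0, 23.0, 22.5, 22.2, 22.0, 21.9), 0.3),
    ((25.0, 24.2, 23.4, 22.8, 22.5, 22.3), 0.6),
    ((26.0, 25.5, 24.6, 23.7, 23.1, 22.8), 0.9),
    ((26.5, 26.0, 25.3, 24.4, 23.6, 23.2), 1.2),
]
query = (25.1, 24.3, 23.5, 22.9, 22.6, 22.4)
z_est = knn_photoz(train, query, k=2)
```

With a training sample smaller than the target catalogue by a factor of ~10^3, as noted above, the quality of such an estimate depends heavily on how well the training galaxies cover the magnitude space of the full survey.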
13 Thoughts about bridging expertise
Big Data research needs data!
- Informatics research needs reference (and documented) data sets to experiment with
- LSST has handy precursor data (SDSS, CFHTLS, DES, HSC, ...)
- Simulation is mandatory to assess performance / detect biases
Solving specific issues:
- Machine-learning-aware Geo- and Astro- researchers (WG3)
- Need to select/apply existing methods to Astro- and Geo- data
- Need to find the questions where learning will provide answers
Disentangle machine learning and Big Data mining:
- Not all problems are impacted the same way by scalability
- «Classical» learning can still lead to good results
- Bottleneck in integrating learning methods and data
- Some algorithmic approaches are specific to Big Data (1-pass algorithms, sublinear methods, ...)
Matching algorithms, data and issues is the key!
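A classic example of the 1-pass algorithms mentioned above is Welford's running mean and variance, which reads each value once and keeps only three numbers in memory. The sketch below also includes the standard pairwise merge of two partial results, so that separate data chunks can be summarised independently and combined afterwards. The class name `RunningStats` and the sample values are illustrative assumptions.

```python
class RunningStats:
    """Single-pass mean and variance (Welford's algorithm)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new value into the running statistics."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def merge(self, other):
        """Combine two partial results computed on disjoint chunks."""
        merged = RunningStats()
        merged.n = self.n + other.n
        if merged.n == 0:
            return merged
        delta = other.mean - self.mean
        merged.mean = self.mean + delta * other.n / merged.n
        merged.m2 = (self.m2 + other.m2
                     + delta * delta * self.n * other.n / merged.n)
        return merged

    @property
    def variance(self):
        """Sample variance; undefined (NaN) for fewer than two values."""
        return self.m2 / (self.n - 1) if self.n > 1 else float("nan")


values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# One pass over the whole stream.
full = RunningStats()
for v in values:
    full.update(v)

# Same data split into two chunks, summarised separately, then merged.
left, right = RunningStats(), RunningStats()
for v in values[:3]:
    left.update(v)
for v in values[3:]:
    right.update(v)
combined = left.merge(right)
```

The merged result agrees with the single-pass result up to floating-point rounding, so the same statistic can be computed per node and combined without revisiting the data.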