Commentary on Techniques for Massive-Data Machine Learning in Astronomy
Transcription
1 Commentary on Techniques for Massive-Data Machine Learning in Astronomy
Nick Ball, Herzberg Institute of Astrophysics, Victoria, Canada
2 The Problem
- Astronomy faces enormous datasets
- Their size, dimensionality, and complexity require intelligent, automated investigation
- Exponential increase in data size: algorithms cannot scale worse than O(N log N)
- Most data mining algorithms naïvely scale as N^2 or worse
3 The Solution
- Make data mining algorithms that scale as N log N (or better)!
- May have to compromise accuracy slightly
- Deploy them so that astronomers are willing and able to use them
- They must work on real astronomical data
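To make the scaling argument concrete, here is a minimal sketch comparing the number of operations for a naive all-pairs method (N^2) with a tree-based method (N log N). The function names and the choice of log base 2 are assumptions made for illustration; real costs depend on constants, dimensionality, and implementation.

```python
import math


def pairwise_ops(n: int) -> int:
    """Distance evaluations for a naive all-pairs comparison: N^2."""
    return n * n


def tree_ops(n: int) -> float:
    """Rough operation count for a tree-based method: N * log2(N)."""
    return n * math.log2(n)


# Ratio of the two costs for catalogue sizes from ten thousand to 100 million.
for n in (10**4, 10**6, 10**8):
    ratio = pairwise_ops(n) / tree_ops(n)
    print(f"N = {n:.0e}: N^2 / (N log2 N) = {ratio:.1e}")
```

The ratio grows almost linearly with N, which is why an algorithm that is tolerable on a small sample can become unusable on a full survey catalogue.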
4 Collaboration is Vital
- Successful use of astrostatistics and data mining requires expertise in computer science, statistics, and astronomy
- Collaboration enables novelty that would not arise from a single group
- So, computer scientists supplying algorithms in this way is excellent
5 But
- This requires expertise in computer science, statistics, and astronomy
- Successful collaborations have involved astronomers who are experts in computing/statistics, or who are working closely and over time with these experts
6 And
Astronomy data are messy:
- Large, complex, increasingly high-dimensional, time-domain
- Missing data: non-observation or non-detection
- Heteroscedastic, non-Gaussian, underestimated errors
- Outliers, artifacts, false detections, systematic effects
- Correlated inputs
- Etc.
7 An Example
How do you apply astrostatistics and fast algorithms to this?
8
9 The Next Generation Virgo Cluster Survey
- 10σ point source limiting magnitude g = 25.7 (faint!)
- Photometric (few spectra), ~100 deg^2, 5 bands (ugriz, like Sloan)
- galaxies, 2.6 terabytes of data
- 40 people at 23 institutions in Canada, France, etc. (PI Laura HIA)
10 Virgo is an actual cluster of galaxies, the nearest large one to us
11 NGVS Statistical Challenges
- Object detection and classification
- Photometric redshifts (photo-z)
- Virgo cluster membership / background
- Missing data
- Field-to-field variation
- Multi-wavelength data
- Completeness (mag, SB, etc.)
12 Object detection: low surface brightness galaxies
13 Cluster membership: photometric redshift using k nearest neighbours
14 Missing data: NGVS fields (not final) don't all contain all 5 bands ugriz
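One simple way to work with incomplete band coverage is to group objects by the set of bands they actually have, so that each group can be analysed with a consistent feature set. The sketch below assumes a toy catalogue of dictionaries with `None` (or an absent key) marking a missing magnitude; the magnitudes, the `id` field, and the function names are invented for illustration. It does not distinguish non-observation from non-detection, which would need a separate flag in real data.

```python
BANDS = ("u", "g", "r", "i", "z")


def available_bands(obj):
    """Return the tuple of bands with a measured magnitude for one object."""
    return tuple(b for b in BANDS if obj.get(b) is not None)


def group_by_coverage(catalogue):
    """Group objects by which bands they have, keyed by the band tuple."""
    groups = {}
    for obj in catalogue:
        groups.setdefault(available_bands(obj), []).append(obj)
    return groups


# Toy catalogue (made-up values): None or an absent key means "no magnitude".
catalogue = [
    {"id": 1, "u": 23.1, "g": 22.4, "r": 21.9, "i": 21.6, "z": 21.5},
    {"id": 2, "u": None, "g": 24.0, "r": 23.2, "i": 22.9, "z": None},
    {"id": 3, "g": 25.1, "r": 24.4, "i": 24.0, "z": 23.8},
]

groups = group_by_coverage(catalogue)
for bands, objs in groups.items():
    print("".join(bands), [o["id"] for o in objs])
```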
15 Multi-wavelength data
16 Canadian Astronomy Data Centre
- CADC is one of the world's largest astronomy data centres
- ~500 terabytes of data (will grow to petabytes)
- Uses Virtual Observatory standards
- Staffed by astronomers and computer specialists, but not statisticians
17 CANFAR
- Canadian Advanced Network for Astronomical Research, at CADC
- Combines cluster job scheduling with cloud computing resources
- Users manage their own virtual machines
18 So
Put fast data mining tools on the CANFAR infrastructure... but early days, not much to say yet
19 Guide to Data Mining in Astronomy
- Virtual Observatory KDD-IG guide: IvoaKDDguide
- Emphasizes data mining, which is part of astroinformatics
- But this overlaps with astrostatistics -> potential outreach channel to wider community
20 kNN Quasar Photometric Redshifts
- Use kd-tree for fast kNN assignment of photo-zs to Sloan Digital Sky Survey quasars
- Single neighbour, perturb input features to make a PDF in redshift
- Removing multi-peaked PDFs removes almost all catastrophic outliers
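The procedure on this slide can be sketched roughly as follows. This is not the code used for the SDSS quasars: it is a minimal pure-Python illustration, assuming Gaussian perturbations of the input features, a hand-rolled kd-tree, and a simple histogram rule for counting peaks. All function names, the bin width, and the `min_frac` threshold are hypothetical choices.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_kdtree(points, depth=0):
    """Build a kd-tree from (features, z_spec) pairs; returns nested tuples."""
    if not points:
        return None
    axis = depth % len(points[0][0])
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return (points[mid], axis,
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))


def nearest(node, query, best=None):
    """Return (squared distance, training point) of the single nearest neighbour."""
    if node is None:
        return best
    point, axis, left, right = node
    d = sq_dist(point[0], query)
    if best is None or d < best[0]:
        best = (d, point)
    diff = query[axis] - point[0][axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, query, best)
    if diff * diff < best[0]:  # the far side could still hold a closer point
        best = nearest(far, query, best)
    return best


def photoz_pdf(tree, features, errors, n_draws=100, rng=None):
    """Perturb the features by their errors; collect the neighbour's redshift each time."""
    rng = rng or random.Random(0)
    samples = []
    for _ in range(n_draws):
        perturbed = [rng.gauss(f, e) for f, e in zip(features, errors)]
        samples.append(nearest(tree, perturbed)[1][1])
    return samples


def count_peaks(samples, bin_width=0.2, min_frac=0.05):
    """Count runs of adjacent histogram bins holding at least min_frac of the samples."""
    counts = {}
    for z in samples:
        b = int(z // bin_width)
        counts[b] = counts.get(b, 0) + 1
    keep = sorted(b for b, c in counts.items() if c >= min_frac * len(samples))
    return sum(1 for i, b in enumerate(keep) if i == 0 or b != keep[i - 1] + 1)


# Synthetic training set: three random "colours" and a random redshift per object.
rng = random.Random(1)
train = [((rng.random(), rng.random(), rng.random()), 5 * rng.random())
         for _ in range(500)]
tree = build_kdtree(train)

pdf = photoz_pdf(tree, (0.5, 0.5, 0.5), (0.05, 0.05, 0.05))
print("mean photo-z:", sum(pdf) / len(pdf), "peaks:", count_peaks(pdf))
```

In this sketch an object would be kept only when `count_peaks` returns 1. The training redshifts here are random, so the output says nothing about real quasars; it only shows the mechanics of the tree lookup, the perturbation loop, and the single-peak cut.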
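21 kNN Quasar Photometric Redshifts
[Figure: z mean compared with z spec; labels read "z mean = z spec" and "20"]
22 kNN Quasar Photometric Redshifts
[Figure: z one peak compared with z spec; labels read "z one peak = z spec" and "20"]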
23 Questions
- Can we overcome the problems of real data?
- Will there be data of high intrinsic dimension?
- Will astronomers be able to deploy the algorithms?
- Where do GPUs fit? (GPU + brute force may be just as fast?)
24 Conclusions
Provided the data can be suitably prepared, and the science-driven usage of the algorithm intelligently motivated, the fast algorithms presented here have excellent potential for advancing astronomical research