Data Science. Brian Sletten
Transcription
Data Science, Brian Sletten, 09/29/2014
Speaker Qualifications: Specialize in next-generation technologies; Author of "Resource-Oriented Architecture Patterns for Webs of Data"; Speaks internationally about REST, Semantic Web, Security, Visualization, Architecture; Worked in Defense, Finance, Retail, Hospitality, Video Game, Health Care and Publishing Industries; One of Top 100 Semantic Web People.
Agenda: Introduction; Data Science Techniques; Programming; Visualization; Machine Learning; Data Mining; Big Data; Linked Data.
Introduction
"We're witnessing the beginning of a massive, culturally saturated feedback loop where our behavior changes the product and the product changes our behavior. Technology makes this possible: infrastructure for large-scale data processing, increased memory, and bandwidth, as well as a cultural acceptance of technology in the fabric of our lives. This wasn't true a decade ago." (Cathy O'Neil and Rachel Schutt)
"Correlation is not causation. Empirically observed covariation is a necessary but not sufficient condition for causality." (Edward Tufte)
"Correlation is not causation but it sure is a hint." (Edward Tufte)
So What? Unrelated (Pirates and Climate Change); Reverse Causation (Windmills Cause Wind); Bi-Directional Causation (Temperature/Pressure); Common Causal Variable (Sleeping with your shoes on causes headaches).
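The "Common Causal Variable" case can be illustrated with a small simulation. This is only a sketch on synthetic data: the hidden variable `z`, the noise levels, and the variable names are assumptions made for illustration, not anything taken from the talk. Two quantities that never influence each other still correlate strongly because both depend on `z`, and the correlation largely disappears once `z` is accounted for.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical hidden common cause (never observed directly in practice).
z = rng.normal(size=n)

# Two observed variables: each depends on z, neither depends on the other.
shoes_on = z + 0.5 * rng.normal(size=n)
headache = z + 0.5 * rng.normal(size=n)

# Plain correlation between the two observed variables.
raw_corr = np.corrcoef(shoes_on, headache)[0, 1]


def residual(y, x):
    """Return y minus the straight-line least-squares fit of y on x."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)


# Correlation after removing what z explains from both variables.
partial_corr = np.corrcoef(residual(shoes_on, z), residual(headache, z))[0, 1]

print(f"correlation ignoring z:        {raw_corr:.2f}")
print(f"correlation controlling for z: {partial_corr:.2f}")
```

In real data the common cause is usually unmeasured, which is why a strong correlation remains, in Tufte's words, only a hint.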
"Long before worrying about how to convince others, you first have to understand what's happening yourself." (Andrew Gelman)
"Naïve realism, also known as direct realism or common sense realism, is a philosophy of mind rooted in a theory of perception that claims that the senses provide us with direct awareness of the external world." (Wikipedia)
Princeton/Dartmouth Game: Storied rivalry; Princeton's star player had his nose broken; Princeton player snapped a Dartmouth player's leg; Princeton won 13-0; Editorials from both schools blamed the other; Two versions of Truth.
They Saw a Game: Albert Hastorf (Dartmouth) and Hadley Cantril (Princeton) showed the game again to students from both schools; Asked them to notice infractions, penalties, fill out a questionnaire; Princeton students 'saw' twice as many infractions by Dartmouth players as Dartmouth students did; Dartmouth students saw a 'rough but fair' game.
"In brief, the data here indicate that there is no such 'thing' as a 'game' existing 'out there' in its own right which people merely 'observe.' The game 'exists' for a person and is experienced by him only insofar as certain happenings have significances in terms of his purpose." (Hastorf and Cantril, They Saw a Game)
"Everything that has ever happened to you has happened inside your skull." (David McRaney, You Are Now Less Dumb)
Comparing the Students: All male; Ethnically and socioeconomically similar; Same part of the country; Same age; Same basic culture and religious beliefs; Different schools.
"It's a real problem, though, when politicians, CEOs, and other people with the power to change the way the world works start bungling their arguments for or against things based on self-delusion generated by imperfect minds and senses." (David McRaney, You Are Now Less Dumb)
Real World Issues: Foods and Cancer; Vaccination and Autism; Global Warming; GMOs.
"Data scientist: n. person who is better at statistics than any software engineer and better at software engineering than any statistician." (Josh Wills)
"There's a distinct lack of respect for researchers in academia and industry labs who have been working on this kind of stuff for years, and whose work is based on decades (in some cases, centuries) of work by statisticians, computer scientists, mathematicians, engineers and scientists of all types." (Cathy O'Neil and Rachel Schutt)
"Data science is the civil engineering of data. Its acolytes possess a practical knowledge of tools and materials, coupled with a theoretical understanding of what is possible." (Cathy O'Neil and Rachel Schutt)
Data Science Strategy: Engineering and Infrastructure for collection and logging; Privacy; Access policies; Role in Decision Making Process.
"Narratives are meaning transmitters. They are history-preservation devices. They create and maintain cultures, and they forge identities that emerge out of the malleable, imperfect memories of life events." (David McRaney, You Are Now Less Dumb)
"Your narrative bias makes it nearly impossible for you to really absorb the information from the outside world without arranging it into causes and effects." (David McRaney, You Are Now Less Dumb)
"Your ancestors invented the scientific method because the common belief fallacy renders your default strategies for making sense of the world generally awful and prone to error." (David McRaney, You Are Now Less Dumb)
"Your natural tendency is to start from a conclusion and work backward to confirm your assumptions, but the scientific method drives down the wrong side of the road and tries to disconfirm your assumptions." (David McRaney, You Are Now Less Dumb)
Data Science Techniques
Doing Data Science (O'Neil and Schutt)
Math: Statistics; Linear Algebra; Numerical Analysis; Calculus.
Statistics: Observations and Samples; Bias; Modeling; Distributions; Fitting a model; Overfitting.
Techniques: Linear Regression; k-Nearest Neighbor; k-Means clustering; Decision Trees.
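Two of the items above, fitting a model with linear regression and overfitting, can be sketched in a few lines of NumPy. The synthetic data (a line with slope 2 and intercept 1 plus noise), the train/test split, and the choice of polynomial degrees are all assumptions made for illustration. The flexible degree-7 model always matches the training points at least as closely as the straight line, which is exactly why training error alone is a poor guide; the held-out error is what shows whether the extra flexibility helped.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic observations: a straight line plus noise (illustrative only).
x = np.linspace(0, 1, 20)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=x.size)

# Alternate points go to the training and test samples.
x_train, y_train = x[::2], y[::2]
x_test, y_test = x[1::2], y[1::2]


def fit_and_score(degree):
    """Least-squares polynomial fit; returns coefficients and train/test MSE."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return coeffs, train_mse, test_mse


# Degree 1 is ordinary linear regression; degree 7 is a deliberately
# over-flexible model for only 10 training points.
coeffs_1, train_1, test_1 = fit_and_score(1)
coeffs_7, train_7, test_7 = fit_and_score(7)

print(f"linear fit:   slope={coeffs_1[0]:.2f}, intercept={coeffs_1[1]:.2f}")
print(f"degree 1 MSE: train={train_1:.4f}, test={test_1:.4f}")
print(f"degree 7 MSE: train={train_7:.4f}, test={test_7:.4f}")
```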
Programming
Programming Languages: C/C++; Fortran; Python; Julia; R.
Python: General Purpose, High-Level Programming Language; Emphasis on readability; Expressive syntax; Supports OO, FP, Procedural Programming; Dynamic; CPython/Jython.
SciPy: Ecosystem of open-source packages for science, math and engineering; NumPy; SciPy; Matplotlib; IPython; SymPy; pandas.
SciPy/NumPy: Numerical analysis; Optimization problems; N-Dimensional Arrays; Linear Algebra; Fourier transformations.
Julia: An attempt to create a high-performance, general purpose numerical language; Think: MATLAB meets Fortran, Python and Lisp; LLVM-based JIT Compiler; Growing base of packages; Impressive performance benchmarks; MIT License.
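A short sketch of three of the SciPy/NumPy capabilities listed above (N-dimensional arrays, linear algebra, Fourier transforms), using NumPy only. The specific system of equations and the 5 Hz test signal are made-up examples, not taken from the slides.

```python
import numpy as np

# N-dimensional arrays: a 2x2 coefficient matrix and a right-hand side.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

# Linear algebra: solve 3x + y = 9 and x + 2y = 8.
solution = np.linalg.solve(A, b)

# Fourier transform: one second of a 5 Hz sine wave sampled at 100 Hz.
sample_rate = 100
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 5 * t)

# Magnitude spectrum of the real-valued signal and its frequency axis.
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(signal.size, d=1 / sample_rate)
peak_hz = freqs[np.argmax(spectrum)]

print("solution:", solution)
print("dominant frequency (Hz):", peak_hz)
```

The solve step recovers x = 2, y = 3, and the spectrum peaks at the 5 Hz bin because the signal contains a whole number of cycles in the sampled window.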
Nerdy Language Features: Multiple dispatch; Dynamic type system; Built-in package manager; Lisp-like macros and other metaprogramming fu; Ability to call C/Python functions; Supports parallel and distributed processing; Coroutines.
R: Created by Ross Ihaka and Robert Gentleman; Maintained by the R Development Core Team; Part of the GNU Project; Commercial support; Implemented in C, Fortran and R.
Supports: Basic Math; Statistical Analysis; Optimization Problems; Signal Processing; Graphics and Visualization; Data Mining; Machine Learning.
Visualization
Techniques: Graphical Analysis; Presentation Graphics.
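Python (run inside an IPython session started with the command below, which preloads the NumPy and matplotlib names the code uses):

    ipython --pylab

    x = linspace(0, 10, 100)        # 100 evenly spaced points on [0, 10]
    plot(x, sin(x))                 # first curve
    plot(x, 0.5*cos(2*x))           # second curve on the same axes
    title("A matplotlib plot")
    text(1, -0.8, "A text label")   # label placed at data coordinates (1, -0.8)
    ylim(-1.1, 1.1)
    savefig('matplotlib.png')       # write the figure to disk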
d3.js: Data-Driven Documents; JavaScript library that uses HTML, SVG and CSS; Bind data to the DOM; Data-driven transformations.
Machine Learning
Data Mining
Big Data
"First, it is a bundle of technologies. Second, it is a potential revolution in measurement. And third, it is a point of view, or philosophy, about how decisions will be and perhaps should be made in the future." (Steve Lohr, New York Times)
Everything You Know About Something
ID | Col1 | Col2 | Col3 | Col4 | Col5 | Col6 | ... | ColN
Thing1 | Value1 | Value2 | Value3 | Value4 | Value5 | Value6 | ... | ValueN
Everything You Know About Everything
ID | Col1 | Col2 | Col3 | Col4 | Col5 | Col6 | ... | ColN
Thing1 | Value1 | Value2 | Value3 | - | Value5 | - | ... | ValueN
Thing2 | Value1 | - | Value3 | Value4 | Value5 | Value6 | ... | ValueN
Thing3 | - | Value2 | Value3 | - | Value5 | Value6 | ... | ValueN
Thing4 | Value1 | Value2 | Value3 | Value4 | Value5 | Value6 | ... | ValueN
Distribute Rows in their Entirety
ID | Col1 | Col2 | Col3 | Col4 | Col5 | Col6 | ... | ColN
Thing1 | Value1 | Value2 | Value3 | - | Value5 | - | ... | ValueN
Thing3 | - | Value2 | Value3 | - | Value5 | Value6 | ... | ValueN
Distribute Columns in their Entirety
ID | Col2 | Col3 | Col5 | ColN
Thing1 | Value2 | Value3 | - | ValueN
Thing3 | Value2 | Value3 | Value5 | ValueN
Thing4 | Value2 | Value3 | Value5 | ValueN
Linked Data
Distribute Arbitrary Cells
ID | Col2 | Col3 | Col5 | ColN
Thing1 | - | Value3 | - | ValueN
Thing3 | - | - | Value5 | ValueN
Thing4 | Value2 | Value3 | - | ValueN
Linking Open Data Project: Started in 2007 by W3C Semantic Web Education and Outreach (SWEO) Interest Group; Make data freely available; Doubled in size every 10 months.
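The three distribution strategies from the preceding slides (whole rows, whole columns, arbitrary cells) can be sketched over a small sparse table. The table contents and the helper names `select_rows`, `select_columns`, and `to_cells` are hypothetical, chosen only for illustration. Reading each cell as an (ID, column, value) tuple is an interpretation offered here because it resembles the subject/predicate/object shape of the RDF triples counted later in the talk; the slides themselves do not spell this out.

```python
# A sparse table: each thing only stores the columns it actually has.
table = {
    "Thing1": {"Col1": "Value1", "Col2": "Value2", "Col3": "Value3"},
    "Thing2": {"Col1": "Value1", "Col3": "Value3"},
    "Thing3": {"Col2": "Value2", "Col3": "Value3"},
}


def select_rows(table, ids):
    """Distribute rows in their entirety: keep every column of the chosen IDs."""
    return {i: dict(table[i]) for i in ids if i in table}


def select_columns(table, columns):
    """Distribute columns in their entirety: keep chosen columns for every ID."""
    return {
        i: {c: v for c, v in row.items() if c in columns}
        for i, row in table.items()
    }


def to_cells(table):
    """Distribute arbitrary cells: flatten to (ID, column, value) tuples."""
    return [(i, c, v) for i, row in table.items() for c, v in row.items()]


row_slice = select_rows(table, ["Thing1", "Thing3"])
column_slice = select_columns(table, {"Col2", "Col3"})
cells = to_cells(table)

print(row_slice)
print(column_slice)
print(cells)
```

Once the data is a flat list of cells, any subset of it can be shipped independently, which is what makes the cell-level view the most flexible of the three.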
Domain | # Datasets | # Triples | # Links
Media | 25 | 1,800,000,000 | 50,000,00
Geographic | 31 | 6,000,000,000 | 35,000,000
Government | 49 | 13,000,000,000 | 19,000,000
Publications | 87 | 2,900,000, | ,000,000
Cross-Domain | 41 | 4,100,000,000 | 63,000,000
Life Sciences | 41 | 3,000,000, | ,000,000
User-Generated Content | | ,000,000 | 3,400,000
Total | | ,000,000, | ,000,
DBPedia: Linked Dataset derived from Wikipedia; Creative Commons Attribution-ShareAlike 3.0 License; GNU Free Documentation License; Multi-domain; Consensus-based; Kept current by Wikipedia activity; Multi-lingual.
DBPedia Numbers (English Version): Describes 4 million things; 3.22 million are classified by an ontology; 832,000 people; 639,000 places; 372,000 creative works; 209,000 organizations; 226,000 species; 5,600 diseases.
DBPedia Numbers (Non-English Version): Localized Language Versions; Describe 24.9 million things (w/ repetition); 16.8 million are connected to English DBPedia.
DBPedia Summary: Overall 12.6 million unique things; 24.6 million links to images; 27.6 million links to pages; 45 million links to other RDF datasets; 67 million links to Wikipedia categories; 41.2 million links to YAGO categories; 2.46 billion RDF triples: 470 million (English), 1.98 billion (Non-English).
Use Cases: Improve Wikipedia Search; Include DBPedia data in your documents; Support for Geographic Data; Documentation Classification, Annotation; Multi-Domain Ontology.
DBPedia
Most Important Query Ever Run
Books
Data Science Books:
Data Analysis w/ Open Source Tools, Philipp K. Janert (ORA)
Data Smart: Using Data Science to Transform Information into Insight, John W. Foreman (Wiley)
Doing Data Science: Straight Talk from the Frontline, Cathy O'Neil, Rachel Schutt (ORA)
The R Book, Michael J. Crawley (Wiley)
R Tutorial w/ Bayesian Statistics Using OpenBUGS, Chi Yau
Applied Predictive Modeling, Max Kuhn and Kjell Johnson (Springer)
Introductory Statistics w/ R, Peter Dalgaard (Wiley)
Think Stats, Allen B. Downey (ORA)
R Cookbook, Paul Teetor (ORA)
R Graphics Cookbook, Winston Chang (ORA)
Questions? bsletten
