How do we train Data Scientists and Data Engineers?




Eric Rozier
Assistant Professor of EECS, University of Cincinnati
Faculty Mentor, DSSG, University of Chicago

Training the Next Generation of Data Scientists
Focus on two main programs:
- Summer: the 3-month intensive DSSG program
- Normal academic year: curriculum development to support in-class, hands-on experiences

Data Science for Social Good

The Eric & Wendy Schmidt Data Science for Social Good Summer Fellowship 2014


Eric & Wendy Schmidt Data Science for Social Good Summer Fellowship http://dssg.uchicago.edu @datascifellows

Intro to DSSG

What is DSSG? (Data Science for Social Good Fellowship)
- 40-50 Fellows in teams of 3-4
- Experienced mentors
- 12 weeks in Chicago
- Impactful problems with nonprofit and government partners

Goals of the Fellowship
- Train data scientists who care about and understand how to solve social problems.
- Expose and train governments and nonprofits to use data to make better decisions.
- Seed a community of people and organizations working together to make social impact.
- Create open source data science tools that are targeted at the needs of high-impact social problems.

By the Numbers
          2013   2014
Fellows   36     48
Mentors   6      8
Weeks     12     12
Projects  12     14

Ideal Fellows: Making an Impact with Data
- Problem formulation
- Computer science and programming
- Statistics and machine learning
- Communication
- Econometrics and social science methods
- Experimental design
- Databases

2013-2014 Fellows
- ~1000 applicants
- 40 countries
- ~250 universities
- 84 fellows
Backgrounds: computer scientists, statisticians, economists, public policy (and other computational and quantitative fields)
Universities: CMU, U. of Chicago, Northwestern, Harvard, MIT, Stanford, ITAM, Cornell, Yale, Villanova, Ohio State, USC, U Penn, Notre Dame, U of Minnesota, U of Michigan, Cambridge, McGill, UC Berkeley, U of Colorado, Swarthmore, Oberlin, UIUC, Emory, Duke, Fordham, Johns Hopkins, IIT, SAIC, NYU, Penn State, Simon Fraser, UC Santa Barbara

Breadth of Projects
- Partners: nonprofits, government agencies, corporations with a social mission
- Geographies: local, state, national, and international
- Types of problems: impact evaluation, targeting, risk modeling
- Types of data: structured data, geospatial data, time series, text data, network data

Variety of Project Areas
Area                  Data                   Problem
Health                Home inspection data   Predicting lead poisoning
Energy                Smartmeter data        Reducing energy use via disaggregation
Education             Education records      Predicting high school dropout
Economic Development  Administrative data    Targeting and assessing urban revitalization
Corruption            Contract data          Detecting collusion
Federal Budgeting     Congressional bills    Identifying earmarks

2014 Project Partners

Sample Projects
- Improving high school graduation rates by identifying at-risk students early
- Increasing government transparency by identifying earmarks
- Developing new strategies to reduce maternal mortality
- Preventing lead poisoning through proactive home inspections and health check-ups

Prediction Saves Time & Money
                No Prediction   Current Model   Model Forecast
Buildings       197,157         42,695          378
Time            76 years        16.4 years      2 months
Money           $98 million     $21.3 million   $189,000
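As a rough check on these figures, the sketch below derives the implied cost per building and the share of buildings each scenario covers. Only the building counts and dollar amounts come from the slide; the dictionary layout and function names are illustrative assumptions, not part of the original presentation.

```python
# Figures copied from the slide; the structure and names are illustrative.
SCENARIOS = {
    "No Prediction": {"buildings": 197_157, "cost_usd": 98_000_000},
    "Current Model": {"buildings": 42_695, "cost_usd": 21_300_000},
    "Model Forecast": {"buildings": 378, "cost_usd": 189_000},
}


def cost_per_building(scenario: dict) -> float:
    """Implied cost per building for one scenario."""
    return scenario["cost_usd"] / scenario["buildings"]


def share_of_baseline(scenario: dict, baseline: dict) -> float:
    """Fraction of the baseline's buildings that this scenario covers."""
    return scenario["buildings"] / baseline["buildings"]


if __name__ == "__main__":
    baseline = SCENARIOS["No Prediction"]
    for name, s in SCENARIOS.items():
        print(f"{name}: ${cost_per_building(s):,.0f} per building, "
              f"{share_of_baseline(s, baseline):.2%} of baseline buildings")
```

With the slide's numbers, the implied cost works out to roughly $500 per building in all three scenarios, which suggests the savings come from covering far fewer buildings rather than from a lower cost per building.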

At-Risk Children
Even without detailed child-level features, there are strong, sanity-checked, prediction-capable patterns.
[Figure: Lead Levels During Childhood]

Target: Prediction From Birth


The Tool

Who we're looking for
- Fellows
- Mentors
- Partners

Fellows
- Expertise in one or more of the following areas: computer science, statistics, public policy, social science, or other quantitative or analytical areas
- Some coding experience
- Passion for making a social impact
- Problem-solving (critical thinking) experience
- Enjoy working on a team

Mentors
- Deep expertise in computer science, machine learning, statistics, or social sciences
- Experience working on real problems in industry
- Experience leading teams and managing projects

Mentors named on the slide:
- Ben Yuhas: Principal, Yuhas Consulting Group
- Eric Rozier: Assistant Professor of Electrical and Computer Engineering
- Kate Cagney: Sociology & Health Studies; Director, Population Research Center, U Chicago
- Joe Walsh: Lead Forecaster for GE Healthcare & Policy Consultant

Project Partners
Organizations that:
1. Have an interesting social-impact problem to solve
2. Have data that can help solve it
3. Have a desire to put our work into action
Especially interested in longer-term collaborations beyond the 12 weeks of the fellowship.
Partner types: governments and government organizations, foundations, nonprofits, research institutions

Get Involved!
Application deadlines:
- Fellows: Feb 1, 2015
- Mentors: Feb 1, 2015
- Partners: Jan 10, 2015
Applications & more info: http://dssg.uchicago.edu
Or email: datasciencefellowship@ci.uchicago.edu

Data Science in the Curriculum with the Digital Observatory

The Data Deluge
Big Data education suffers from similar challenges. How do we help students drink from the fire hose?

Big Data and the Curriculum
- Big Data is putting pressure on the curriculum.
- Not just CS/ECE: business, finance, social science, economics, biology, medicine, public policy.
- NIH has held several meetings on Big Data education and wants to integrate Big Data/Data Science into the regular curriculum.

NIH Conclusions
- Teach from case studies.
- Proper training should include hands-on experience with real data.
- Use and study of cutting-edge tools and techniques.


NIH Conclusions
- Train Data Scientists to work as team members.
- The team is one of the most important parts of real data science applications.
- Emphasize multidisciplinary teams.

New Ways of Thinking
Get students used to the pace of change and to thinking exponentially.

New Ways of Learning

Active Learning
After 2 weeks we tend to remember:
Passive learning
- 10% of what we read
- 20% of what we hear
- 30% of what we see
- 50% of what we hear and see
Active learning
- 70% of what we say
- 90% of what we say and do

Bloom's Taxonomy (from highest to lowest level)
- Evaluation
- Synthesis
- Analysis
- Application
- Comprehension
- Knowledge

Three-Pronged Approach
- Reading, presenting, and discussing the current state of the art.
- Hands-on study with real data.
- Original research in the field.

Involving Partners


Creating a Classroom Around a Digital Observatory

Telescopes for Big Data

Transitioning a DSSG-like Environment to the Academic Year
- Identify a smaller number of partners to work with larger groups on a longer time scale.
- Understand that our expectations need to be tempered:
  - Summer: an exclusive, competitive program with international recruitment.
  - Academic year: students are drawn from the (admittedly excellent) student body at large, so motivation may be lower.

Frontiers of Data Science Class
- Several published papers resulting from the class.
- Mixed undergraduate and graduate, interdisciplinary environment.
- Awarded Frontiers of Engineering Education by the NAE.

Growth of the Course
- First year: 8 students (electrical engineers, computer engineers, computer scientists, environmental scientists, economists)
- Second year: 14 students, with more industry involvement

Developing Scalable Infrastructures
- Understand the financial limitations of the classroom.
- Develop resources that can be leveraged for both research and curriculum; a practical curriculum based on real experience will have similar needs anyway.

The Need for Practice in the Academy
- We need to train ourselves in Data Science in order to teach it.
- Many faculty haven't had real industrial experience with Data Science.
- The field and its practice are changing fast.
- Encourage the development of Data Science workshops, boot camps, and summer programs for faculty as well as students.

More information
http://dssg.io
http://dataengineering.org