The National Consortium for Data Science (NCDS)
1 The National Consortium for Data Science (NCDS)
A Public-Private Partnership to Advance Data Science
Ashok Krishnamurthy, PhD
Deputy Director, RENCI
University of North Carolina, Chapel Hill
2 What is NCDS?
NCDS is a public-private partnership to advance data science.
Mission: Provide leadership in data science research and education, and help industry use the power of data to drive economic growth.
Vision: A focused, multi-sector, multidisciplinary data science community that solves big data challenges and drives the field forward.
Goals:
- Engage broad communities of data experts
- Coordinate data science research priorities that span disciplines and industries
- Facilitate development of education and training programs
- Support development of technical, ethical and policy standards
- Apply NCDS expertise to data challenges in science, business and government
3 NCDS Members The Big Data Frontier
4 Why a Consortium?
- Time: A consortium can plant a stake in the ground quickly; significant funding and full-time staff are not essential to launch.
- Participation: Shared vision, the ability to have your voice heard and to define the issues to be tackled.
- Flexibility: Able to try different models, key projects and core foci to see what works best, and to respond to changing and varied needs and interests.
- Community: A consortium is a way of building a community that can eventually become the foundation for a center (a physical place).
5 NCDS Components
- Data Lab & Observatory: Shared, distributed infrastructure housing large organized data; serves as a platform for data R&D and data science education (graduate certificates, MS).
- Data Fellows Program: Seed grants for faculty to work on consortium-approved projects, with an NCDS review panel evaluating proposals; industry internships for graduate students; visiting industry data scientists at member universities.
- Working Groups: Year-long deep dive into topics of interest to members; produces position papers, workshops, software, events, etc.
- Data Science Events: Leadership Summits (Spring); Data Matters Short Courses (Summer); Student Career events (Fall/Spring); invited lectures and outreach (ongoing).
6 Accomplishments
- Organizational: Bylaws passed; steering committee in place; kickoff featuring Dr. Eric Green (Director, NHGRI) and US Rep. David Price; 10 paid memberships so far.
- Programmatic: NCDS Leadership Summit (April 2013); five Faculty Fellows appointed (October 2013); Student-Industry-Faculty career awareness event (April 2014); Data Innovation Showcase (May 2014); Data Matters short course series (June 2014); Observatory active with data sets (June 2014).
- Upcoming: Tech Talks with UNC Computer Science and UNC Career Services (October 2014); new Data Fellows CFP (October 2014); Working Groups (Fall 2014).
7 NCDS Data Cyberinfrastructure
- Secure Research Workspace / Secure Medical Research Workspace: Secured virtual environments
- ExoGENI/ADAMANT: Federated Infrastructure as a Service
- iRODS: Policy-driven data management
- DataBridge: Social-media-like discovery of useful data sets
- Genomic Medical Workflow Engine: Informatics and HPC in high-throughput sequencing
Key: Infrastructure that adapts to problems
8 What is iRODS?
iRODS is open source data grid middleware.
- Open source: free to use, free to modify, free to contribute
- Middleware: sits between the files and the user
iRODS provides:
- Data Discovery: metadata
- Workflow Automation: policies (any condition; any action)
- Secure Collaboration: sharing without losing control
- Data Virtualization: file system flexibility
9 iRODS 4.0: Ready for Enterprise
Product of nearly 20 years of research and development, funded by DARPA, DOE, NASA, NSF, NARA, and NOAA.
Sustainability: Formation of the iRODS Consortium
- 6 members, presently: developers, users, storage vendors
- Provides interaction between the user and developer communities
- Professional integration services, technical support, training and certification
Enterprise Quality: Starting with iRODS 4.0, the entire codebase has been reviewed and restructured.
- Plug-in architecture
- Each change is verified with a test case in a continuous integration suite
- Pre-compiled binary packages are available for several Linux distributions and multiple database management systems
10 Who Uses iRODS?
The Wellcome Trust Sanger Institute manages 2 PB of data with iRODS.
- Data discovery and workflow automation: Data is tagged with processing history and checksums
- Secure collaboration: Workgroups can share with each other while independently maintaining archiving and access policies
- Data virtualization: Data is replicated for redundancy and high availability
11 Who Uses iRODS?
The iPlant Collaborative uses iRODS to manage over 112M files (>750 TB) with over 20,000 users.
- Data discovery: Templates guide application of metadata according to international data curation standards
- Workflow automation: Fine-grained user permissions conditioned on domain, group, file size, metadata
- Data virtualization: Data is easily moved between storage and compute resources, always maintaining a specified level of redundancy
12 Data Science Education
Modular courses for an 11-month program:
- Graduate Certificate in Data Science (half time)
- MS in Data Science (full time)
13 Conclusion
Developing data science will:
- Develop the next generation of data science experts and leaders
- Create strategies, practices and scientific methods for understanding data
- Enable more collaborations among data and domain scientists, business, academia and government
- Assist those who are struggling to collect, analyze, manage and use data
- Establish methodologies for measuring the value and impact of data
14 THANK YOU!
