Big Data in Evaluating Transformative Scientific Research: Concepts and a Case Study




Big Data in Evaluating Transformative Scientific Research: Concepts and a Case Study
Bhavya Lal and Vanessa Peña, IDA Science and Technology Policy Institute (STPI)
Understanding Federal R&D Impact
Big Data: Measuring the Impact of the Government's Research and Development Investments
National Press Club, March 19, 2013

About Us: Science and Technology Policy Institute
Federally funded research and development center chartered by Congress
Provide rigorous and objective analysis of science and technology policy issues for the White House Office of Science and Technology Policy and other offices within the executive branch of the U.S. government and federal agencies
Conduct science and technology analysis to inform policy decision-makers

Today's Presentation
Challenges in evaluating non-traditional research: conceptual and data-related
Selected case study

Background
Belief that the research enterprise is becoming more conservative
Programs have been created to support transformative research

A Smorgasbord of Strategies in Federal Programs
Non-traditional programs combine "people" and "process" elements along five dimensions that distinguish them from traditional grant programs:
Design: stakeholder input or visionary leaders, combined with large (e.g., >$1M) funding, seed funding, long funding duration (e.g., >3 years), frontier research topics, or non-traditional selection criteria
Review: pioneering reviewers or risk-taking program managers (versus a traditional review mechanism), combined with mechanisms such as a different role for program managers or leadership, interdisciplinary review panels, simplified proposal submission, or unique criteria or ranking schemes
Research: pioneering researcher(s) or an environment that nurtures creativity (versus incremental approaches), combined with approaches that are team-based/partnerships, high-risk, interdisciplinary, or topic/outcome focused
Management: enlightened managers (versus annual reports/PI meetings), combined with management approaches ranging from hands-off (research is allowed to change topics or direction) to hands-on (stated goals must be accomplished or funding is withdrawn)
Outcome: will achieve transformative outcomes and/or incremental outcomes with spillovers (create community, institutional change, train students)

NIH Director's Pioneer Award Program
Mapped onto the framework, the program combines:
Design: large (e.g., >$1M) funding, long funding duration (e.g., >3 years), simplified proposal submission, non-traditional selection criteria
Review: pioneering reviewers
Research: pioneering researcher(s); high-risk approaches
Management: hands-off management; research is allowed to change topics or direction
Outcome: will achieve transformative outcomes and/or incremental outcomes with spillovers (create community, institutional change, train students)

NSF Emerging Frontiers in Research & Innovation
Mapped onto the framework, the program combines:
Design: stakeholder input, visionary leaders, large (e.g., >$1M) funding, seed funding, long funding duration (e.g., >3 years), frontier research topics, non-traditional selection criteria, unique criteria or ranking schemes
Review: interdisciplinary review panels
Research: team-based/partnership, high-risk, interdisciplinary, topic/outcome-focused approaches
Management: hands-off management; research is allowed to change topics or direction
Outcome: will achieve transformative outcomes and/or incremental outcomes with spillovers (create community, institutional change, train students)

Policy Question
Set-aside transformative research programs are viewed as taking resources from traditional programs. Do they offer more transformative outcomes in return?

Conceptual Challenge
How does one operationalize and measure transformative research?

Defining Transformative Research
Tolstoyan thesis: although all conventional practitioners in the life sciences may be said to be conventional in the same way, all rebels seem to rebel in their own particular fashion.
The dominant paradigm with respect to the definition is "I know it when I see it."

Working Definition
Transformative means research that involves ideas, discoveries, or tools that (a) radically change our understanding of an important existing scientific or engineering concept or educational practice, or (b) lead to the creation of a new paradigm or field of science, engineering, or education.
It is characterized not only by exceptional innovation, but also by the conscious taking of risks in the choice of its research subjects and methods.

Operationalizing Transformative Research is Challenging (1)
There are no known metrics for transformative research, except in retrospect:
Darwin's theory of natural selection (replaced Lamarckism as the mechanism for evolution)
Development of quantum mechanics (supplanted classical mechanics)
Plate tectonics (combined the hypotheses of continental drift and seafloor spreading, replacing the static geosynclinal theory)
If an evaluation is sought too soon, the transformative nature of the results may not be evident.

Operationalizing Transformative Research is Challenging (2)
Results may be controversial, especially if they contradict prevailing wisdom; it may not even be possible to publish them:
George Akerlof's path-breaking economic theory of asymmetric information and adverse selection, which ultimately garnered him a Nobel Prize, was initially rejected by three major economics journals
Barry Marshall and Robin Warren's discovery that peptic ulcers are caused by the bacterium H. pylori, not stress, brought them ridicule in the biomedical community
Barbara McClintock's work on cytogenetics was ignored for decades
Imagine Einstein submitting an application in 1905: "I propose to explore the possibility that time slows down as things speed up."

Operationalizing Transformative Research
In a traditional research portfolio, we can expect that some fraction of outcomes are breakthroughs and some are incremental.
If a set-aside program is focused specifically on producing breakthrough research, it should lead to:
A larger number of research breakthroughs (the entire population distribution shifts toward better performance on the metric of interest), and/or
At least a few bigger breakthroughs (a subset, the tail of the population, does better).
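These two patterns call for different checks: a whole-distribution comparison versus a comparison of upper-tail quantiles. The sketch below illustrates the distinction on synthetic "impact" scores; the metric, data, and tests are illustrative assumptions, not the study's method.

```python
# Illustrative contrast between the two patterns above, on synthetic data:
# (1) the whole outcome distribution is better, vs. (2) only the tail is.
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)
traditional = rng.lognormal(mean=1.0, sigma=0.5, size=1000)  # placeholder impact scores
set_aside = rng.lognormal(mean=1.0, sigma=0.5, size=1000)
set_aside[:20] *= 5  # a few much bigger "breakthroughs"; the rest unchanged

# Whole-distribution comparison: is the set-aside group generally better?
print("Mann-Whitney p:", mannwhitneyu(set_aside, traditional, alternative="greater").pvalue)

# Tail comparison: does the set-aside group produce bigger breakthroughs?
print("99th percentiles:", np.quantile(set_aside, 0.99), np.quantile(traditional, 0.99))
```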

Evaluation Approach
Identify comparison group(s)
Identify indicators for comparison: whether impact is (more/differently) transformative; whether research approaches are (more) unusual
Compare outcomes

Data Challenge
Comparison research has to be similar enough to the transformative research
Finding comparable comparison groups is an immense challenge

Limitations of Bibliometrics
Traditional bibliometrics cannot tell the difference between traditional and transformative outcomes

Big Data Helped Us, Partly
Big data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze
Potential for emerging tools to address data challenges
NIH Maps database: http://nihmaps.org/index.php

CASE STUDY: USING BIG DATA IN EVALUATING A TRANSFORMATIVE RESEARCH PROGRAM

Transformative Research Program Outcome Evaluation
Does the program's research produce higher impact compared to research funded by other programs?
Are the approaches used by the program's researchers more innovative compared to those used by researchers funded by other programs?

Uses of Text Mining
Comparison group selection: selecting researchers working in similar sub-fields
Data analysis: selecting subject-specific experts to assess research; comparing similarity of research topics from test and comparison groups

Evaluation Methods: Test and Comparison Groups
Test Group (Test Researchers): 35 researchers, from the transformative research program
Group 1 (Similar Researchers): 35 researchers, from one program in the same funding agency
Group 2 (Similar Program): 39 researchers, from a high-prestige program at a different funding organization
Group 3 (Program Finalists): 35 researchers, who applied to the same funding agency
Group 4 (Random Researchers): portfolios of roughly 80 researchers on average, from many programs in the same funding agency

Selection of Comparison Group 1
Matched researcher and research dimensions: research area, year of award, years since degree, institutional prestige, prior research program funding, terminal degree(s) received, and receipt of early career awards.
Of the comparison groups, Group 1 (Similar Researchers) was matched on these dimensions; Group 2 (Similar Program) is similar by design; the remaining groups were not matched.
Matching helped us select an equal or balanced distribution of variables.
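As a rough illustration of this kind of covariate matching, the sketch below pairs each test researcher with the nearest candidate in a pool on standardized versions of the dimensions above. The column names, distance metric, and one-to-one matching are assumptions for illustration, not the study's actual procedure.

```python
# Hedged sketch of nearest-neighbour matching on researcher dimensions.
# Column names and the synthetic data are placeholders, not study data.
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
cols = ["award_year", "years_since_degree", "institution_rank", "prior_awards"]

test = pd.DataFrame(rng.normal(size=(35, len(cols))), columns=cols)    # test-program researchers
pool = pd.DataFrame(rng.normal(size=(2000, len(cols))), columns=cols)  # candidate comparison researchers

# Put all matching variables on a common scale, then take the nearest
# candidate in the pool for each test researcher (with replacement here;
# a real matching procedure would also handle duplicate matches).
scaler = StandardScaler().fit(pool[cols])
nn = NearestNeighbors(n_neighbors=1).fit(scaler.transform(pool[cols]))
_, idx = nn.kneighbors(scaler.transform(test[cols]))
matched = pool.iloc[idx.ravel()]
print(matched.shape)  # 35 matched comparison researchers
```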

Identifying Similar Research Areas
All other matching indicators are straightforward and quantitative; research area is a qualitative indicator, gleaned from award proposal text.
About 40,000 award proposals were used in the analysis; a typical award proposal title and abstract is ~500 words.

Selecting Similar Research Areas: Topic Modeling and Similarity Index
Based on topic modeling using Latent Dirichlet Allocation to assess similarity (Blei 2003; Talley 2011)
Topic co-occurrence in unstructured text from NIH award titles and abstracts, 2007 to 2010
Values range from 0 to 1, representing the proportion of co-occurrence of topics (1 being identical)
Each award was matched with 7-200 similar awards and their respective similarity index values; these values were used in matching
Selected 35 researchers from more than 12,000 NIH awardees
Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet Allocation. Journal of Machine Learning Research, 3, 993-1022.
Talley, E. M., et al. (2011). Database of NIH grants using machine-learned categories and graphical clustering. Nature Methods, 8, 443-444.
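A minimal sketch of the general idea (not the Talley et al. NIH pipeline used in the study) is shown below, assuming scikit-learn: fit LDA on award titles and abstracts, infer a per-award topic distribution, and score pairwise similarity of those distributions. The toy corpus, topic count, and use of cosine similarity as the 0-1 index are illustrative assumptions.

```python
# Minimal, hedged sketch of topic-based similarity between award abstracts.
# The study used the Talley et al. (2011) NIH topic model; this toy version
# uses scikit-learn with made-up text and a small topic count.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.metrics.pairwise import cosine_similarity

abstracts = [
    "neural circuits underlying decision making in primate prefrontal cortex",
    "optogenetic control of cortical neurons during learned behavior",
    "statistical inference for gene regulatory networks in cancer genomics",
]  # placeholders for ~40,000 award titles and abstracts (~500 words each)

# Bag-of-words representation of the award text
X = CountVectorizer(stop_words="english").fit_transform(abstracts)

# Fit LDA and infer a topic distribution for each award
# (a real run over NIH awards would use far more topics and documents)
lda = LatentDirichletAllocation(n_components=5, random_state=0)
theta = lda.fit_transform(X)  # each row sums to 1: per-award topic proportions

# Pairwise similarity of topic mixtures, in [0, 1]; 1 means identical mixtures
similarity = cosine_similarity(theta)
print(similarity.round(2))
```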

Data Analyses: Qualitative and Quantitative
Examination of researchers' publications:
1. Expert assessments (quantitative/qualitative)
2. Text-mining analysis (quantitative/qualitative)
3. Bibliometric analysis (quantitative)
Counts by group and analysis (Experts | Text-Mining | Bibliometrics):
Test Group: 162 | 354 | 3,287
Group 1 (Similar Researchers): 154 | 243 | 3,274
Group 2 (Similar Program): 194 | 117 | 3,313
Group 3 (Program Finalists): -- | -- | 2,298
Group 4 (Random Researchers): -- | 11,431 | 14,352

Expert Assessments
94 experts conducted over 1,500 ratings of over 500 publications.
For each test and comparison group researcher, the five most impactful publications were identified (representing the researcher's body of work).
Each body of work and individual paper was reviewed at least three times.
Experts assessed both impact and innovativeness, each at the level of the body of work and of individual papers.

Identifying Subject-Specific Experts
Sought experts from past NIH awardees in fields similar to the test program, using field similarity from topics in researchers' awards (~730)
Excluded awards used in comparison groups
Selected those with recent early career awards and those with previous long-term awards (e.g., NIH MERIT)
Supplemented as needed via contact with NIH
Combined lists and verified suitability

Analyzing Research Topic Divergence
Investigated how similar each group's research was to the broader scientific community
Used award-attributed publications (PubMed, via NIH SPIRES) from 2007-2010 awards for the Test Group (NDPA), Group 1 (matched R01), Group 2 (HHMI), and Group 4 (random R01)
Helps support study findings and could be used to identify emerging research
[Figure: distributions of publication proportion versus divergence score for each group; no statistical difference between test and comparison groups (Kolmogorov-Smirnov p = 0.44, 0.64, 0.15)]
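The comparison reported on the slide is a two-sample Kolmogorov-Smirnov test between the test group's divergence distribution and each comparison group's. A hedged sketch with synthetic divergence scores (the real scores come from the topic-divergence analysis above) might look like this:

```python
# Two-sample Kolmogorov-Smirnov comparison of divergence-score distributions.
# The scores below are synthetic placeholders, not the study's data.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
test_scores = rng.normal(loc=4.4, scale=0.6, size=354)        # e.g., NDPA publications
comparison_scores = rng.normal(loc=4.4, scale=0.6, size=243)  # e.g., matched-R01 publications

stat, p_value = ks_2samp(test_scores, comparison_scores)
print(f"KS statistic = {stat:.3f}, p = {p_value:.2f}")
# A large p-value (the study reports p = 0.44, 0.64, 0.15 across comparisons)
# means the two distributions are not statistically distinguishable.
```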

Opportunities for Use of Big Data and Text Mining in Evaluations
Selection of comparison groups: topic modeling and similarity metrics help match closely related topics
Selecting experts for assessing research: subject-specific experts can be identified using topic modeling approaches
Comparing research similarity: validates results and may identify innovative research areas not similar to research in the broader scientific community, BUT usefulness is unclear

Thank you!
For our full evaluation study: http://commonfund.nih.gov/pdf/ida_paper_p_4899.pdf
Acknowledgements: Seth Jonas, Elizabeth Lee, Amy Richards, and Alyson Wilson (STPI), and Ned Talley (NIH)
Bhavya Lal, blal@ida.org
Vanessa Peña, vpena@ida.org
1899 Pennsylvania Avenue NW, Washington, D.C. 20006