Making Sense of Massive Data by Hypothesis Testing




Dr. John W. Bodnar, SAIC
Supported by: ARDA NIMD Program, SAIC IR&D Program

A Think Loop Model for analysis is presented that breaks the analytical process down into a nested series of think loops. These loops show how analysts combine bottom-up, data-driven steps with top-down, hypothesis-driven steps to forage for new data and then synthesize that data into evidence-based schemas and theories. I suggest that this model can not only account for current problems encountered throughout the Intelligence Community in making sense of the massive data available in many disparate databases, but can also suggest strategies for rethinking our current analytical methods and tools to overcome those problems.

Analytical Workflow for the BCW Analyst Support Team

We need to understand what happens in the Analyst's Brain before we can build proper IT tools to support that process.

[Figure: analytical workflow diagram centered on the All-Sources Analyst. Labels: Open Source, Internet, Tools, Methods, Input Databases, Analysis, Training, Collaboration, Security, Feedback; Nations, Non-State Actors, Proliferation Analysis, Chemical Weapons, Biological Weapons; Products, Customers; Collectors, Collection Analysts (NSA, GSA, FBIS); DATA (What is it?); KNOWLEDGE (What does it mean?); Foraging; Sense-Making.]

Analyzing the Analytical Workflow: The Glass Hypercube

[Figure: Glass Hypercube diagram leading to the Think Loop Model for Analysis. Labels follow.]

Analysis
-Guinea Pig Analysts: John Bodnar (DIA, SAIC); DIA Colleagues; Glass Box Analysts
-Guinea Pig IT Experts: Stu Card, Peter Pirolli (PARC); NIMD Researchers; Julie Rosen (INET, SAIC)

Analyzing how We Analyze Analysis / Analyzing Analysis
-John Bodnar (DIA, SAIC); Peter Pirolli, Stu Card (PARC); with help from David Moore, Frank Hughes (NIMD)
-John Bodnar (DIA, SAIC), with help from Peter Pirolli, Stu Card (PARC)

Analyzing the Analyst

Research projects have much in common: each moves from the Formulation of a Problem to the Publication of a Theory supported by Evidence.

Researcher   Publication of a Theory   Evidence
Scientist    Technical Paper           Experimental Results
Historian    Book                      Documentation
Lawyer       Case                      Evidence
Analyst      Assessment                Reporting

All use some variation of the Scientific Method: Hypothesis Testing.

Analyzing the Analyst: Scholarship Counts

Research projects have much in common: each moves from the Formulation of a Problem to the Publication of a Theory supported by Evidence. The Scientist (Technical Paper, Experimental Results), the Historian (Book, Documentation), the Lawyer (Case, Evidence), and the Analyst (Assessment, Reporting) all use some variation of the Scientific Method: Hypothesis Testing.

One finds many footnotes and references in scientific, historical, or legal documents, BUT references are removed in published IC assessments!

TAKE-HOME MESSAGE 1: Hypothesis testing is built on a foundation of scholarship. The IC needs to build scholarship to improve analysis.

Hypothesis Testing: The Steps

hypothesis. n. 2. A proposition or principle put forth or stated (without any reference to its correspondence with fact) merely as a basis for reasoning or argument, or as a premise from which to draw a conclusion; a supposition. (The Oxford English Dictionary)

evidence. n. 5. Grounds for belief; testimony or facts tending to prove or disprove any conclusion. (The Oxford English Dictionary)

theory. n. 4.a. A scheme or system of ideas or statements held as an explanation or account of a group of facts or phenomena; a hypothesis that has been confirmed or established by observation or experiment, and is propounded or accepted as accounting for the known facts; a statement of what are held to be general laws, principles, or causes of something known or observed. (The Oxford English Dictionary)

Whether it is published as a Publication, Book, Case, or Assessment, a THEORY is a HYPOTHESIS with supporting EVIDENCE.

DATA becomes EVIDENCE only when it is used to support or discredit a HYPOTHESIS in building a THEORY.
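The distinction between data, evidence, and theory can be sketched as a small data model. This is only an illustration under stated assumptions: the class names `Datum`, `Evidence`, and `Theory`, and the `add` and `tally` methods, are hypothetical and do not come from the briefing or from any tool it describes.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Datum:
    """A raw report or observation; on its own it is not evidence."""
    text: str
    source: str


@dataclass(frozen=True)
class Evidence:
    """A datum that has been tied to a hypothesis."""
    datum: Datum
    supports: bool  # True = supports the hypothesis, False = discredits it


@dataclass
class Theory:
    """A hypothesis together with the evidence gathered for or against it."""
    hypothesis: str
    evidence: List[Evidence] = field(default_factory=list)

    def add(self, datum: Datum, supports: bool) -> None:
        # The datum becomes evidence only here, when it is used to
        # support or discredit the hypothesis.
        self.evidence.append(Evidence(datum, supports))

    def tally(self) -> Tuple[int, int]:
        """Return (number supporting, number discrediting)."""
        pro = sum(1 for e in self.evidence if e.supports)
        return pro, len(self.evidence) - pro
```

In this sketch a `Datum` sitting in a database carries no judgment at all; the analyst's decision to attach it to a hypothesis is what creates an `Evidence` record.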

Hypothesis Testing: The Process (as it was)

Hypothesis Testing, Industrial Age

HYPOTHESIS TESTING is a Think Loop in which hypotheses are compared with experimental data to build a theory with evidence to support it.

Hypothesis Testing is a Think Loop

A THEORY is a HYPOTHESIS with supporting EVIDENCE. One gets from a Hypothesis to a Theory by finding EVIDENCE to support it.

TOP-DOWN: Goal-Driven Analysis starts with a Question, builds a Hypothesis, then looks for support to build a Theory. The top-down mode asks Why?

BOTTOM-UP: Data-Driven Analysis starts with a Dataset or Database and builds a Theory. The bottom-up mode asks How?

Conceptual Complexity changes as one moves up or down a Think Loop.

[Figure: Think Loop diagram between COLLECTOR and DECISION-MAKER. Stores: External Data Sources, Shoebox, Evidence File, Schema, Theory, Presentation. Steps: Search & Filter, Read & Extract, Schematize, Build Case, Tell Story. Questions: Who & what? How are they related? How do we know? What does it have to do with the problem at hand? Other labels: SENSEMAKING, HYPOTHESIS, Relations, Support.]

Hypothesis Testing: The Process (as it is)

Hypothesis Testing, Industrial Age vs. Hypothesis Testing, Information Age

In the Information Age, a researcher can extract EVIDENCE from DATABASES to test HYPOTHESES without personally having to do experiments.

Collection Cycle (Forage for New Data)
-Protocol
-Formulate Collection Requirements
-Collect intelligence on a given target
-Archive data in database
-Check database for completeness
-Move to new target

Information Cycle (Make Sense from the Data)
-Hypothesis
-Formulate Evidence Needed to Test Hypothesis
-Recall Reports from Database
-Compare Reports and Hypothesis
-Refine Hypothesis
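The Information Cycle can be sketched as a loop that recalls reports, compares them with the hypothesis, and refines the hypothesis until the database yields nothing new. This is a toy sketch, not the author's method: the hypothesis is reduced to a set of keywords, matching is a plain substring test, and the names `information_cycle` and `toy_refine` are hypothetical.

```python
from typing import Callable, List, Set


def toy_refine(terms: Set[str], reports: List[str]) -> Set[str]:
    """Hypothetical refinement rule: adopt every long word seen in the reports."""
    new_terms = set(terms)
    for report in reports:
        for word in report.split():
            word = word.strip(".,").lower()
            if len(word) >= 6:
                new_terms.add(word)
    return new_terms


def information_cycle(
    terms: Set[str],
    database: List[str],
    refine: Callable[[Set[str], List[str]], Set[str]] = toy_refine,
    max_rounds: int = 5,
) -> List[str]:
    """Run recall -> compare -> refine until no new report matches."""
    terms = {t.lower() for t in terms}
    evidence: List[str] = []
    for _ in range(max_rounds):
        # Recall: reports not yet used that mention any current term.
        recalled = [
            r for r in database
            if r not in evidence and any(t in r.lower() for t in terms)
        ]
        if not recalled:
            break  # nothing new in the database; time to forage for new data
        evidence.extend(recalled)        # Compare: keep the matches as evidence.
        terms = refine(terms, recalled)  # Refine the hypothesis.
    return evidence
```

The early `break` marks the hand-off between the two cycles: once the archive has nothing more to say about the current hypothesis, the analyst goes back to the Collection Cycle.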

Think Loop Model: Foraging & Sense-Making

Sense-Making: Assembling the evidence. Conceptual Complexity changes as one moves up or down a Think Loop.

Foraging: Finding the evidence. Information Content changes as one moves right or left on a Think Loop.

TOP-DOWN Goal-Driven Steps ask Why? BOTTOM-UP Data-Driven Steps ask How?

[Figure: Think Loop diagram between COLLECTOR and DECISION-MAKER. Labels: External Data Sources, Evidence File, Hypothesis, Theory, Presentation; Support, Re-evaluate, Tell Story; FORAGING, SENSEMAKING, HYPOTHESIS. Questions: Who & what? How do we know? Are we sure?]

COMPUTER Tools: The Dataset Format Problem

TAKE-HOME MESSAGE 2: The analyst uses multiple types of datasets that must be integrated. Most IT tools don't account for data integration.

Analytical Datasets have Different Syntactical Structures

[Figure: datasets along the Think Loop. Labels: Shoebox, Nuggets, Evidence File, Schema, Theory; steps Extract Nugget, Build Theory. Dataset formats shown: Text Files; Sentences from Meta-Text; Evidence in Relational Database; Visualization; Hypertext on Website.]

Example nuggets of evidence:
-Dr. Smith shipped a package to Dr. Jones.
-Anthrax is grown in a fermenter.
-Dr. Smith presented a paper on anthrax vaccines at the Microbiology Meeting.
-Smallville is 25 miles from Beantown.
-The Smallville Vaccine Plant makes anthrax vaccine.
-Dr. Smith works at the Smallville Vaccine Plant.
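To make the format problem concrete, here is a minimal sketch of how some of the example nuggets might look once moved from sentences into a relational table. The table layout, the column names (`subj`, `rel`, `obj`), the relation names, and the helper functions are all illustrative assumptions; the briefing does not specify a schema.

```python
import sqlite3

# Hand-coded (subject, relation, object) rows for some of the example nuggets.
# The relation names are illustrative assumptions, not part of the source.
NUGGETS = [
    ("Dr. Smith", "shipped_package_to", "Dr. Jones"),
    ("Dr. Smith", "works_at", "Smallville Vaccine Plant"),
    ("Smallville Vaccine Plant", "makes", "anthrax vaccine"),
    ("Dr. Smith", "presented_paper_on", "anthrax vaccines"),
]


def build_evidence_db(rows):
    """Load the nuggets into an in-memory relational table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE evidence (subj TEXT, rel TEXT, obj TEXT)")
    conn.executemany("INSERT INTO evidence VALUES (?, ?, ?)", rows)
    return conn


def two_step_links(conn, start):
    """Return (middle, end) pairs reachable from `start` in two hops."""
    query = """
        SELECT a.obj, b.obj
        FROM evidence AS a
        JOIN evidence AS b ON a.obj = b.subj
        WHERE a.subj = ?
    """
    return conn.execute(query, (start,)).fetchall()
```

Calling `two_step_links(build_evidence_db(NUGGETS), "Dr. Smith")` connects Dr. Smith to anthrax vaccine through the Smallville Vaccine Plant. The same link is invisible while the nuggets remain separate sentences in text files, which is the integration gap the slide points to.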

Think Loop Model: Foraging, Bottom-Up

IT Tools act Bottom-Up.

Foraging: Finding the evidence. Information Content changes as one moves right or left on a Think Loop.

[Figure: Think Loop diagram with the foraging steps highlighted. Labels: Tasker, Question, DECISION-MAKER, COLLECTOR; HYPOTHESIS, Available Assessments; EVIDENCE, Archived Documents; DATA, External Data Sources; Shoebox, Evidence File, Theory, Presentation; Search & Filter, Read & Extract, Support, Re-evaluate, Tell Story; Relations, Information; FORAGING, SENSEMAKING; TOP-DOWN Goal-Driven Steps, BOTTOM-UP Data-Driven Steps. Questions: Who & what? Are we sure?]

Bottom-up pipeline: COLLECTOR -> Publish -> DATA Repository -> Search & Filter -> Document Archive -> Read & Extract -> EVIDENCE File -> Schematize -> Hypothesis Archive -> Build Case -> THEORY Set -> Tell Story -> PRESENTATION

Think Loop Model: Sense-Making, Bottom-Up

IT Tools act Bottom-Up.

Sense-Making: Assembling the evidence. Conceptual Complexity changes as one moves up or down a Think Loop.

[Figure: Think Loop diagram with the sense-making steps highlighted. Labels: Tasker, Question, DECISION-MAKER, COLLECTOR; HYPOTHESIS, Available Assessments; EVIDENCE, Archived Documents; DATA, External Data Sources; Evidence File, Schema, Theory, Presentation; Schematize, Build Case, Support, Re-evaluate, Tell Story; Relations, Information; FORAGING, SENSEMAKING; TOP-DOWN Goal-Driven Steps, BOTTOM-UP Data-Driven Steps. Questions: Who & what? Are we sure?]

Bottom-up pipeline: COLLECTOR -> Publish -> DATA Repository -> Search & Filter -> Document Archive -> Read & Extract -> EVIDENCE File -> Schematize -> Hypothesis Archive -> Build Case -> THEORY -> Tell Story -> PRESENTATION

Think Loop Model: Sense-Making, Top-Down

Analysts think Top-Down.

Foraging and Sense-Making: Information Content changes as one moves right or left on a Think Loop. Finding the evidence becomes searching for evidence.

[Figure: Think Loop diagram with the top-down sense-making steps highlighted. Labels: Question, Tasker, DECISION-MAKER, COLLECTOR; HYPOTHESIS, Available Assessments; EVIDENCE, Archived Documents; DATA, External Data Sources; Evidence File, Schema, Hypothesis, Theory, Presentation; Support, Re-evaluate, Tell Story; Evidence, Relations, Information; FORAGING, SENSEMAKING; TOP-DOWN Goal-Driven Steps, BOTTOM-UP Data-Driven Steps. Questions: Who & what? How do we know? Are we sure?]

Bottom-up pipeline: COLLECTOR -> Publish -> DATA Repository -> Search & Filter -> Document Archive -> Read & Extract -> EVIDENCE File -> Schematize -> Hypothesis Archive -> Build Case -> THEORY Set -> Tell Story -> PRESENTATION

Sense-Making Top-Down: Find Somebody's Schema

If a credible source has already built a schema, use it directly.

When the BOTTOM-UP analysis provides sufficient knowledge of the relevant NEIGHBORHOODS at the current level to be able to predict with some degree of confidence that those NEIGHBORHOODS themselves can be redefined as ENTITIES one level up, it's time to move on.

HYPOTHESIS: Soviet BW Program 1990 (Ken Alibek, Biohazard)

[Figure: organization chart. Entries, in reading order:]
-GENERAL SECRETARY
-CENTRAL COMMITTEE OF THE COMMUNIST PARTY
-POLITBURO
-COUNCIL of MINISTERS
-Committee of State Security (KGB)
-USSR Academy of Sciences
-Military Industrial Commission (VPK)
-Biological Warfare Directorate
-GOSPLAN
-Biological Warfare Dept
-Biotech Research
-Interagency Scientific & Technical Council (MNTS)
-Ministry of Health
-Ministry of Medical & Microbiological Industries (GLAVMIKROBIOPROM)
-Ministry of Agriculture
-Ministry of Chemical Industry
-Ministry of External Affairs
-Ministry of Internal Affairs
-Ministry of Defense (MOD)
-Biological & Medical Research
-BIOPREPARAT
-Medical & Vaccine Production
-Main Directorate for Industrial Production and Scientific Enterprise
-Veterinary and Agricultural Research
-Veterinary Vaccines and Production
-Chemical Weapons Directorate
-Main Directorate for Internal Military Forces (security)
-Main Correction Directorate (prisons and concentration camps)
-Fifteenth Directorate
-BCW Weaponization and Employment
-BCW Defense

Sense-Making Top-Down: & Build on that Schema

Then search for new evidence to build on the schema.

[Figure: the Soviet BW Program 1990 organization chart (Ken Alibek, Biohazard), shown as the HYPOTHESIS and now annotated with new evidence:]
-KALININ = Head (placed next to BIOPREPARAT)
-2000: KALININ replaced Prof. Vladimir ZAVYALOV, the respected civilian director of a BIOPREPARAT-affiliated research institute in the Moscow region, with a military scientist.
-1982: GEN Vorobyov = 1st Deputy Director; tularemia test.
-GEN Klyucherov = Head of Scientific Directorate
-1982: GEN Lededinsky; tularemia test.
-LtGEN Evstigneev = Senior Official

Think Loop Model: Sense-Making, Top-Down

Analysts think Top-Down. The top-down mode asks Why?

Foraging: Information Content changes as one moves right or left on a Think Loop. Finding the evidence becomes searching for data.

[Figure: Think Loop diagram with the top-down foraging steps highlighted. Labels: Question, Tasker, DECISION-MAKER, COLLECTOR; HYPOTHESIS, Available Assessments; EVIDENCE, Archived Documents; DATA, External Data Sources; Shoebox, Evidence File, Hypothesis, Theory, Presentation; Re-evaluate, Tell Story; Evidence, Relations, Information; SENSEMAKING; TOP-DOWN Goal-Driven Steps, BOTTOM-UP Data-Driven Steps. Questions: Who & what? How do we know? Are we sure?]

Bottom-up pipeline: COLLECTOR -> Publish -> DATA Repository -> Search & Filter -> Document Archive -> Read & Extract -> EVIDENCE File -> Schematize -> Hypothesis Archive -> Build Case -> THEORY Set -> Tell Story -> PRESENTATION

Foraging Top-Down: Who? What? When? Where?

Search the SHOEBOX for new evidence to build a new schema. When you've searched everything at hand, and you've extended the schema as far as you can, THEN search for new data.

Who:
Vladimir P. Zav'yalov (Zavyalov, Zaliyalov)
Former Director, Institute of Immunological Engineering
142380 Lyubuchany, Moscow Region
Publication List (Zavyalov-Pubs)

ORG CHARTS
-Inst. of Immuno. Engineering (Director: VLADIMIR ZAV'YALOV)
-Biokad Co.
-Shemyakin Institute
-DEPARTMENT OF MOLECULAR AND IMMUNOLOGICAL ENGINEERING (REF) (DIRECTOR: Vyacheslav M. ABRAMOV)
-Inst. of Highly Pure Biopreparations
-People: VP Zav'yalov, VG Korobko, MP Kirpichnikov, DA Dolgikh, VM Lipkin, VM Abramov, OA Kaurov
-Topics: Interleukin-1, Artificial Proteins, Peptide Bioregulators, Plague, Interleukin-2, Interferon-alpha

TIMELINES (axis 1960 to 2000; two-digit years as shown on the slide)
Topic              Author         Years
PLAGUE             AA Vorobyov    69 72 73 76 79 94 99 02
PLAGUE             VM Abramov     90 92 95 96 97 99 01 02
PLAGUE             VP Zav'yalov   90 92 95 96 97 99 01
IMMUNOMORPHIN      VP Zav'yalov   96 01 02 03
INTERFERON-ALPHA   VP Zav'yalov   91 92 97 02 03

Sense-Making: Top-Down vs Bottom-Up Sources

TAKE-HOME Message 3: Different kinds of sources provide different levels of detail.
-Top-Down sources help understand WHY.
-Bottom-Up sources help understand HOW.

BOTTOM-UP SOURCES
-Very precise, much detail
-Built from direct links (direct documented interactions)
-Closely linked to Project Timeline
-Precise but not accurate on large scale

TOP-DOWN SOURCES
-Broad overview, little detail
-Usually report indirect links (through contacts)
-Cover the entire scope of the Source's interests
-Accurate but not precise

Dr. Zav'yalov's Neighborhood is very tiny indeed when seen through Dr. Alibek's eyes.

[Figure: two org charts side by side. Bottom-up sources: The FSU Biotech Program (in VP Zav'yalov's Neighborhood), with Shemyakin Institute, Institute of Immunological Engineering, Biokad Co, Inst of Highly Pure Biopreps; projects Bioengineered Plague, Peptide Bioregulators, Protein Bioregulators, Artificial Proteins; Named Key Players, Named Project Timelines; Biotech Leadership, Biotech Research, Biotech Development, Biotech Production. Top-down source: The FSU BW Program (Alibek's View), with FSU Politburo; Health, Med & Micro, Agri, Chem, Ext, Int, Defense; Biopreparat; 15th Directorate; Biopreparat-Affiliated Research Institutes; Vladimir Zav'yalov; Alibek; Research, Development, Production; Named Key Players.]

Senior Analysts vs IT Experts

Any time you tell a Senior Analyst:
-I've got new software for you.
-I've got an IT project I'd like you to help with.
-We've got a new database.
that Analyst usually will disappear as fast as possible.

There appears to be a mismatch between the way Senior Analysts work and the IT tools they are provided. WHY?

TAKE-HOME Message 4:
-Senior Analysts think Top-Down to understand WHY.
-IT Experts think Bottom-Up to understand HOW.
Therefore, we aren't building tools to help the Senior Analyst.

Senior Analysts vs Novice Analysts

Senior Analyst
-When asked a question, reflects on current theory and builds a new or modified hypothesis to test.
-Builds a large-scale theory supported by multiple schemas and much evidence, then tells a story from the collected knowledge in response to a tasker.
-Employs Top-Down Strategies:
  -Remember current theory and schemas.
  -[...] evidence in SHOEBOX.
  -[...] new data.

Novice Analyst
-When asked a question, builds a case based on the keywords in the question, then assembles data about them.
-Reacts to each tasker individually and builds a theory from scratch in response to every tasker.
-Employs Bottom-Up Strategies:
  -Find the data in the DATABASE.
  -Extract the evidence.
  -Build a schema.

TAKE-HOME Message 5:
-The Senior Analyst works Top-Down.
-The Novice Analyst works Bottom-Up.

We need to:
-Train new analysts how to analyze using hypothesis testing.
-Build corporate Shoeboxes, Evidence Files, & Schemas, and not re-invent them when Senior Analysts move on or retire.

NIMD: Analyzing the Analyst

Senior Analyst
-When asked a question, reflects on current theory and builds a new or modified hypothesis to test.
-Builds a large-scale theory supported by multiple schemas and much evidence, then tells a story from the collected knowledge in response to a tasker.
-Employs Top-Down Strategies:
  -Remember current schema.
  -[...] evidence in SHOEBOX.
  -[...] new data.

Novice Analyst
-When asked a question, builds a case based on the keywords in the question, then assembles data about them.
-Reacts to each tasker individually and builds a theory from scratch in response to every tasker.
-Employs Bottom-Up Strategies:
  -Find the data in the DATABASE.
  -Extract the evidence.
  -Build a schema.

TAKE-HOME Message 6: Most of today's IT tools are being designed for the novice analyst. We need to build both:
-Expert IT tools for Novice Analysts
-Novice IT tools for Expert Analysts

NIMD Glass Hypercube: Understand how the senior analyst works on an ongoing problem. Build a novice IT system for an expert analyst.

NIMD Glass Box: Understand how the novice analyst works on a new problem. Build an expert IT system for a novice analyst.

Think Loop Model: Analyzing Analysis

By breaking down the analytical process to its component steps we can both improve how we teach analysis and how we build IT tools to support the process.

[Figure: full Think Loop diagram. Vertical axis: STRUCTURE (= S = DEGREE OF ASSEMBLY), asking Why? Horizontal axis: EFFORT (= H = POWER), asking How? Labels: Tasker, Question, DECISION-MAKER, COLLECTOR; HYPOTHESIS, Available Assessments; EVIDENCE, Archived Documents; DATA, External Data Sources; Shoebox, Evidence File, Schema, Theory, Presentation; Search & Filter, Read & Extract, Schematize, Build Case, Tell Story, Support, Re-evaluate; Evidence, Relations, Information; FORAGING, SENSEMAKING; TOP-DOWN Goal-Driven Steps, BOTTOM-UP Data-Driven Steps. Questions: Who & what? How are they related? How do we know? What does it have to do with the problem at hand? Are we sure?]

Bottom-up pipeline: COLLECTOR -> Publish -> DATA Repository -> Search & Filter -> Document Archive -> Read & Extract -> EVIDENCE File -> Schematize -> Hypothesis Archive -> Build Case -> THEORY Set -> Tell Story -> PRESENTATION

Think Loop Model: Understanding the Analyst's Brain

WHY? = Lessons Learned

The IC needs to build scholarship to improve analysis.
-Hypothesis testing is built on a foundation of scholarship.

Most IT tools don't account for data integration.
-The analyst uses multiple types of datasets that must be integrated.

Different kinds of sources provide different levels of detail.
-Top-Down sources tell WHY, but Bottom-Up sources tell HOW.

We aren't building IT tools to help the Senior Analyst.
-Senior Analysts think Top-Down to answer WHY, but IT Experts think Bottom-Up to answer HOW.
-Senior Analysts work Top-Down, but Novice Analysts work Bottom-Up.

HOW do we get there? = Take Home Messages
-Train new analysts how to analyze using hypothesis testing methods.
-Build corporate Shoeboxes, Evidence Files, & Schemas, and not reinvent them when Senior Analysts move on or retire.
-Build both:
  -Expert IT tools for Novice Analysts
  -Novice IT tools for Expert Analysts