Enabling the Big Data Commons through indexing of data and their interactions


1 biomedical and healthcare Data Discovery Index Ecosystem
Enabling the Big Data Commons through indexing of data and their interactions
2nd BD2K all-hands meeting, Bethesda, 11/12/15

2 Aims
1. Help users find accessible data
2. Assist producers on how to publish for maximal discoverability
3. Build a prototype/platform to dock related products
PubMed of Data = DataMed

3 Outline
- The ecosystem
  - Data, derived data, metadata
  - Stakeholders
- Nuts and Bolts
  - Components: metadata, search tool
  - Plan and timelines
- How to participate
  - Working groups
  - Pilots
  - Collaborations

4 What does it take to use big data?
- Find the data (across various resources)
- Find the [...] that operate on the data
- Find the appropriate computational environment
- Access the data
- Access the [...] (/systems)
- Access the computational environment

5 [Figure: ecosystem diagram. Labels: pre-curated, preprocessed, curated, metadata, standards, prep for]

6 Characterizing data implies describing the [...] that helped produce them
[Figure: ecosystem diagram. Labels: preprocessed, pre-curated, curated, metadata, standards, prep for]

7 One woman's preprocessed data are another woman's raw data
[Figure: ecosystem diagram. Labels: preprocessed, pre-curated, curated, metadata, prep for]

8 Understanding which computational environment is best for the combination of data and relevant [...] is important (e.g., HPC, GPU)
[Figure: ecosystem diagram with hosting in a cloud, cluster, or server. Labels: preprocessed, pre-curated, curated, metadata, standards, prep for]

9 Selecting the right computational environment for the right type of data is important. Understanding the conditions of accessibility is also important.
[Figure: ecosystem diagram extended with Protected Health Information (PHI), hosted in a HIPAA cloud, cluster, or server. Labels: preprocessed, pre-curated, curated, metadata, standards, PHI pre-curated, PHI preprocessed, PHI curated]

10 Big data analytics depend on
1. merging data from several different sources (e.g., reference databases, molecular repositories, clinical repositories),
2. proper [...], and
3. the proper computational environment
[Figure: ecosystem diagram. Labels: Repositories, pre-curated, preprocessed, curated, metadata, standards, PHI pre-curated, PHI preprocessed, PHI curated]

11 Big data projects use several types of digital objects, and they are inter-related
[Figure: ecosystem diagram joining Repositories with Centers and Big Data projects. Labels: pre-curated, preprocessed, curated, metadata, standards, PHI pre-curated, PHI preprocessed, PHI curated, join, prep for]

12 Results of analyses are data too
[Figure: ecosystem diagram adding journal-published results. Labels: Repositories; Centers, Big data projects; pre-curated, preprocessed, curated, metadata, standards, PHI pre-curated, PHI preprocessed, PHI curated, join, selection, post-processed (results), prep for]

13 New types of publications have emerged
[Figure: ecosystem diagram. Labels: Repositories; Centers, Big data projects; journal, published results; pre-curated, preprocessed, curated, metadata, standards, PHI pre-curated, PHI preprocessed, PHI curated, join, selection, post-processed (results), prep for]

14 Stakeholders
Stakeholders have different responsibilities and interests.
[Figure: stakeholders around the ecosystem: funder, owner, producer, big data user, curator, manager, host]

15 Stakeholders
Stakeholders have different abilities in indexing different types of data. Searching across different resources is time consuming because no one is an expert in all resources.
[Figure: stakeholders around the ecosystem: funder, owner, producer, big data user, curator, manager, host]

16 Data Discovery Index: searching across indices and repositories
Existing indices can interoperate with the cross-aggregator index. The best indexers for [...] are those who use it all the time, but they may not know as much about other resources.
[Figure: Data Discovery Index connected to A, B, C and to aggregators]

17 Find data on Kawasaki disease
Data Discovery Index: platform and portal
A, B, C: mapping of metadata, standards, links to aggregators, passing of queries
Aggregators: various indices whose metadata are or can be mapped into Commons metadata
[Figure: Data Discovery Index connected to A, B, C, aggregators, and digital objects]

18 Metadata Model
A set of metadata specifications, future-proofed for progressive extensions, to support the intended capability of the Data Discovery Index prototype
Created using

19 BioSharing: Content Standards and Databases Supported by the NIH grant 1U24 AI to the University of California, San Diego

20 Data Identifiers
Define a set of best practices and operating procedures for identifiers that support the intended capability of the NIH BD2K Data Discovery Index (DDI) prototype, which is being designed by the bioCADDIE Core Development Team.
Check the document at biocaddie.org
Attend the breakout session on Identifiers

21 bioCADDIE Prototype
[Figure: prototype architecture. Labels: Ingestion, Indexing, Repositories, Metadata Ingestion, ElasticSearch, Data Sources, Online datasets, User Interface, Funding Agencies, Publishers, Data producers, Terminology server]

22 Data Indexing Pipeline
Data Source
1. Configuration file developed by curator
2. Extraction of metadata/data from resource or dataset via ingestion module
   - Cache information for further processing
3. Process metadata/data via a set of modules
   - e.g. ID conversion, keyword extraction, normalization
4. Mapping of metadata/data to metadata model(s)
5. Export to target endpoint(s)
6. Search via ElasticSearch APIs
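The six pipeline steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the configuration keys, field names, sample records, and helper functions are all hypothetical, the "endpoint" is an in-memory list, and the final search is a plain substring match standing in for a query against the ElasticSearch APIs.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Module = Callable[[Record, dict], Record]

# Step 1: configuration written by a curator (hypothetical keys and values).
CONFIG = {
    "source": "example-repository",
    "id_prefix": "EX:",
    "field_map": {"name": "title", "desc": "description", "acc": "identifier"},
}

# Stand-in for a remote resource; a real ingestion module would fetch this.
RAW_SOURCE: List[Record] = [
    {"acc": "001", "name": "  Kawasaki Disease Cohort ", "desc": "Clinical and genomic data"},
    {"acc": "002", "name": "Mouse Liver Expression", "desc": "RNA-seq of liver tissue"},
]


def extract(source: List[Record], cache: Dict[str, Record]) -> List[Record]:
    """Step 2: pull raw records and cache them for further processing."""
    for raw in source:
        cache[raw["acc"]] = dict(raw)
    return list(cache.values())


def normalize(rec: Record, config: dict) -> Record:
    """Step 3 module: collapse stray whitespace in every field."""
    return {key: " ".join(value.split()) for key, value in rec.items()}


def convert_id(rec: Record, config: dict) -> Record:
    """Step 3 module: turn the local accession into a prefixed identifier."""
    out = dict(rec)
    out["acc"] = config["id_prefix"] + rec["acc"]
    return out


def extract_keywords(rec: Record, config: dict) -> Record:
    """Step 3 module: naive keyword extraction (lowercased words longer than 3 characters)."""
    words = {w.lower() for w in (rec["name"] + " " + rec["desc"]).split() if len(w) > 3}
    out = dict(rec)
    out["keywords"] = " ".join(sorted(words))
    return out


MODULES: List[Module] = [normalize, convert_id, extract_keywords]


def map_to_model(rec: Record, config: dict) -> Record:
    """Step 4: rename source fields to the (hypothetical) metadata model fields."""
    mapped = {target: rec[src] for src, target in config["field_map"].items()}
    mapped["keywords"] = rec.get("keywords", "")
    mapped["source"] = config["source"]
    return mapped


def export(docs: List[Record], endpoint: List[Record]) -> None:
    """Step 5: write mapped documents to the target endpoint (here, a list)."""
    endpoint.extend(docs)


def search(endpoint: List[Record], term: str) -> List[str]:
    """Step 6: substring search standing in for a real search API."""
    needle = term.lower()
    return [
        doc["identifier"]
        for doc in endpoint
        if needle in (doc["title"] + " " + doc["description"] + " " + doc["keywords"]).lower()
    ]


def run_pipeline(config: dict, source: List[Record]) -> List[Record]:
    """Run steps 2 to 5 and return the populated endpoint."""
    cache: Dict[str, Record] = {}
    endpoint: List[Record] = []
    docs = []
    for rec in extract(source, cache):
        for module in MODULES:
            rec = module(rec, config)
        docs.append(map_to_model(rec, config))
    export(docs, endpoint)
    return endpoint
```

Calling `run_pipeline(CONFIG, RAW_SOURCE)` and then `search(index, "kawasaki")` returns the identifier of the first sample record. Keeping each processing module as a function with the same signature means new modules can be added to the list without touching the rest of the pipeline.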

23 User Interface Workflow
[Figure: workflow diagram. Labels: Query Entry, Entity Identification, Expansion, Query Execution, bioCADDIE backend, Advanced filters, Terminology server, ElasticSearch, Presentation, Facets, Organize results, Visualization]
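One way to read the workflow is as three stages: identify and expand the entities in a query, execute the expanded query with optional filters, and organize the hits into facets. The sketch below illustrates that reading only. The terminology table, the sample documents, and every function name are hypothetical, and the in-memory matching stands in for the terminology server and ElasticSearch named on the slide.

```python
from collections import Counter
from typing import Dict, List, Optional, Set

# Hypothetical terminology: preferred term -> synonyms.
TERMINOLOGY: Dict[str, Set[str]] = {
    "kawasaki disease": {"mucocutaneous lymph node syndrome"},
}

# Hypothetical indexed documents.
DOCS: List[Dict[str, str]] = [
    {"id": "d1", "title": "Kawasaki disease cohort", "repository": "RepoA"},
    {"id": "d2", "title": "Mucocutaneous lymph node syndrome genomics", "repository": "RepoB"},
    {"id": "d3", "title": "Mouse liver expression", "repository": "RepoA"},
]


def identify_entities(query: str) -> List[str]:
    """Entity identification: find known preferred terms in the query text."""
    text = query.lower()
    return [term for term in TERMINOLOGY if term in text]


def expand(query: str) -> List[str]:
    """Expansion: add synonyms for each identified entity."""
    entities = identify_entities(query)
    if not entities:
        return [query.lower().strip()]
    terms: List[str] = []
    for entity in entities:
        terms.append(entity)
        terms.extend(sorted(TERMINOLOGY[entity]))
    return terms


def execute(terms: List[str], docs: List[Dict[str, str]],
            repository: Optional[str] = None) -> List[Dict[str, str]]:
    """Query execution: match any term in the title, then apply an optional filter."""
    hits = [doc for doc in docs if any(term in doc["title"].lower() for term in terms)]
    if repository is not None:
        hits = [doc for doc in hits if doc["repository"] == repository]
    return hits


def facets(hits: List[Dict[str, str]]) -> Dict[str, int]:
    """Presentation: count hits per repository for a facet panel."""
    return dict(Counter(doc["repository"] for doc in hits))
```

With these sample documents, `execute(expand("Kawasaki disease"), DOCS)` returns both the document that uses the preferred term and the one that uses only the synonym, which is the point of the expansion step.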


25 Core Development Roadmap
- DDI architecture: set up website for searching for datasets; set up infrastructure for web portal
- Data identifier: implement Data identifier into the DDI
- Data indexing: set up indexing using metadata from WG 3.0
- Data ingestion: determine datasets; decide on scalable data/metadata input routes; metadata mapping
- Search function: implement the function for 3 repositories
- Feedback collection: Github
- RFA for pilot on Harvester for DDI schema: RFA announced; review, selection and award
- Wrap up of Y1 pilot projects:
  - Literature/dataset link: Advanced search
  - Recommender System: Ranking results
  - isee/delve: Innovative visualization
  - PDB citation pipelines
Milestone: September 2015, Version 0.1

26 Core Development Roadmap
- Dataset result display: sort datasets; group metadata
- Terminology server: import ontology; integrate to SciGraph API; integrate autocomplete feature to prototype
- Interface design: new interface for prototype v 0.2; global statistics
- Usability study: UI analysis
- Ranking algorithm: results from pilot project
- Search function: expand the function to 7 repositories; find similar datasets; search history
- Architecture: code refactoring
Milestone: November 2015, Version 0.2 (we are here)

27 Core Development Roadmap
- Personalized search: share/save search results; user account
- Link dataset to external resources: PubMed; Grants
- Search algorithm: Boolean/advanced search; Data repository search function
- Usability study: user study; track user's action
- Ranking algorithm: refine search results based on user's selection; report from WG 8 on Ranking
- Data duplication problem
- Metadata management
Milestones: February 2016, Version 0.5; June 2016, Version 1.0

28
- Working groups
  - Participate or follow
- Prototype
  - Participation: using and providing feedback on the prototype search engine
- Interoperate with the prototype
  - Link your favorite index
  - Use or map to metadata
  - Collaborate on APIs
- Recommend data/repositories for inclusion
  - New working group

29 Pilots: Newly awarded
Metadata Discover
- Distributed discovery using gym: github, yaml and markdown. Chris Mungall, Lawrence Berkeley National Laboratory
- Feasibility study of indexing clinical research data using HL7 FHIR. Guoqian Jiang, Mayo Clinic College of Medicine
- Metadata discovery and integration to support repurposing of heterogeneous data using the Openfurther platform. Ram Gouripeddi and Julio Facelli, University of Utah

30 Working Groups

31 An Index of Data (Indices)
Data Discovery Index: organizing framework and portal for data
Dashed lines: mapping of metadata, standards, links to aggregators
Aggregators: various indices whose metadata are or can be mapped into Commons metadata
[Figure: Data Discovery Index connected to A, B, C, aggregators, Data, and Digital objects]

32 Acknowledgements
- 93 working group members
- 12 steering committee members
- 8 pilot application reviewers
- staff and trainees
- collaborators
Supported by the NIH grant 1U24 AI to the University of California, San Diego

33 A mouse model for science
[Figure: ecosystem diagram. Labels: pre-curated, preprocessed, curated, metadata, PHI pre-curated, PHI preprocessed, PHI curated, join, published, selection, result, prep for]


More information

SAP Agile Data Preparation

SAP Agile Data Preparation SAP Agile Data Preparation Speaker s Name/Department (delete if not needed) Month 00, 2015 Internal Legal disclaimer The information in this presentation is confidential and proprietary to SAP and may

More information

Rich Media & HD Video Streaming Integration with Brightcove

Rich Media & HD Video Streaming Integration with Brightcove Rich Media & HD Video Streaming Integration with Brightcove IBM Digital Experience Version 8.5 Web Content Management IBM Ecosystem Development 2014 IBM Corporation Please Note IBM s statements regarding

More information

Self-Service Business Intelligence

Self-Service Business Intelligence Self-Service Business Intelligence BRIDGE THE GAP VISUALIZE DATA, DISCOVER TRENDS, SHARE FINDINGS Solgenia Analysis provides users throughout your organization with flexible tools to create and share meaningful

More information

Reference Architecture, Requirements, Gaps, Roles

Reference Architecture, Requirements, Gaps, Roles Reference Architecture, Requirements, Gaps, Roles The contents of this document are an excerpt from the brainstorming document M0014. The purpose is to show how a detailed Big Data Reference Architecture

More information

Big answers from big data: Thomson Reuters research analytics

Big answers from big data: Thomson Reuters research analytics Big answers from big data: Thomson Reuters research analytics REUTERS/Stoyan Nenov Nordic Workshop on Bibliometrics and Research Policy Ann Beynon September 2014 Thomson Reuters: Solutions Portfolio to

More information

What s New Guide. Help Desk Authority 9.1

What s New Guide. Help Desk Authority 9.1 What s New Guide Help Desk Authority 9.1 2011ScriptLogic Corporation ALL RIGHTS RESERVED. ScriptLogic, the ScriptLogic logo and Point,Click,Done! are trademarks and registered trademarks of ScriptLogic

More information

THE EUROPEAN DATA PORTAL

THE EUROPEAN DATA PORTAL European Public Sector Information Platform Topic Report No. 2016/03 UNDERSTANDING THE EUROPEAN DATA PORTAL Published: February 2016 1 Table of Contents Keywords... 3 Abstract/ Executive Summary... 3 Introduction...

More information

Design and Implementation of a Semantic Web Solution for Real-time Reservoir Management

Design and Implementation of a Semantic Web Solution for Real-time Reservoir Management Design and Implementation of a Semantic Web Solution for Real-time Reservoir Management Ram Soma 2, Amol Bakshi 1, Kanwal Gupta 3, Will Da Sie 2, Viktor Prasanna 1 1 University of Southern California,

More information

An Architecture to Deliver a Healthcare Dial-tone

An Architecture to Deliver a Healthcare Dial-tone An Architecture to Deliver a Healthcare Dial-tone Using SOA for Healthcare Data Interoperability Joe Natoli Platform Architect Intel SOA Products Division April 2008 Legal Notices This presentation is

More information

HETEROGENEOUS DATA INTEGRATION FOR CLINICAL DECISION SUPPORT SYSTEM. Aniket Bochare - aniketb1@umbc.edu. CMSC 601 - Presentation

HETEROGENEOUS DATA INTEGRATION FOR CLINICAL DECISION SUPPORT SYSTEM. Aniket Bochare - aniketb1@umbc.edu. CMSC 601 - Presentation HETEROGENEOUS DATA INTEGRATION FOR CLINICAL DECISION SUPPORT SYSTEM Aniket Bochare - aniketb1@umbc.edu CMSC 601 - Presentation Date-04/25/2011 AGENDA Introduction and Background Framework Heterogeneous

More information

Apigee Insights Increase marketing effectiveness and customer satisfaction with API-driven adaptive apps

Apigee Insights Increase marketing effectiveness and customer satisfaction with API-driven adaptive apps White provides GRASP-powered big data predictive analytics that increases marketing effectiveness and customer satisfaction with API-driven adaptive apps that anticipate, learn, and adapt to deliver contextual,

More information

CRITEO INTERNSHIP PROGRAM 2015/2016

CRITEO INTERNSHIP PROGRAM 2015/2016 CRITEO INTERNSHIP PROGRAM 2015/2016 A. List of topics PLATFORM Topic 1: Build an API and a web interface on top of it to manage the back-end of our third party demand component. Challenge(s): Working with

More information

D4.1 - Functional Specifications & Portal Architecture

D4.1 - Functional Specifications & Portal Architecture D4.1 - Functional Specifications and Portal Architecture ECP 2008 DILI 518002 EUscreen Exploring Europe s Television Heritage in Changing Contexts D4.1 - Functional Specifications & Portal Architecture

More information

#jenkinsconf. Jenkins as a Scientific Data and Image Processing Platform. Jenkins User Conference Boston #jenkinsconf

#jenkinsconf. Jenkins as a Scientific Data and Image Processing Platform. Jenkins User Conference Boston #jenkinsconf Jenkins as a Scientific Data and Image Processing Platform Ioannis K. Moutsatsos, Ph.D., M.SE. Novartis Institutes for Biomedical Research www.novartis.com June 18, 2014 #jenkinsconf Life Sciences are

More information

Six Challenges for the Privacy and Security of Health Information. Carl A. Gunter University of Illinois

Six Challenges for the Privacy and Security of Health Information. Carl A. Gunter University of Illinois Six Challenges for the Privacy and Security of Health Information Carl A. Gunter University of Illinois The Six Challenges 1. Access controls and audit 2. Encryption and trusted base 3. Automated policy

More information

PONTE Presentation CETIC. EU Open Day, Cambridge, 31/01/2012. Philippe Massonet

PONTE Presentation CETIC. EU Open Day, Cambridge, 31/01/2012. Philippe Massonet PONTE Presentation CETIC Philippe Massonet EU Open Day, Cambridge, 31/01/2012 PONTE Description Efficient Patient Recruitment for Innovative Clinical Trials of Existing Drugs to other Indications Start

More information

NIST Big Data Phase I Public Working Group

NIST Big Data Phase I Public Working Group NIST Big Data Phase I Public Working Group Reference Architecture Subgroup May 13 th, 2014 Presented by: Orit Levin Co-chair of the RA Subgroup Agenda Introduction: Why and How NIST Big Data Reference

More information

Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences

Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences WP11 Data Storage and Analysis Task 11.1 Coordination Deliverable 11.2 Community Needs of

More information

APPENDIX to http://dx.doi.org/10.4338/aci-2014-09-ra-0083 CAHIIM 2012 Curriculum Requirements Health Informatics Master s Degree

APPENDIX to http://dx.doi.org/10.4338/aci-2014-09-ra-0083 CAHIIM 2012 Curriculum Requirements Health Informatics Master s Degree APPENDIX to http://dx.doi.org/10.4338/aci-2014-09-ra-0083 CAHIIM 2012 Curriculum Requirements Health Informatics Master s Degree Column 1 - Health Informatics Facet I. Information Systems concerned with

More information

LINKED DATA EXPERIENCE AT MACMILLAN Building discovery services for scientific and scholarly content on top of a semantic data model

LINKED DATA EXPERIENCE AT MACMILLAN Building discovery services for scientific and scholarly content on top of a semantic data model LINKED DATA EXPERIENCE AT MACMILLAN Building discovery services for scientific and scholarly content on top of a semantic data model 22 October 2014 Tony Hammond Michele Pasin Background About Macmillan

More information

Build a Streamlined Data Refinery. An enterprise solution for blended data that is governed, analytics-ready, and on-demand

Build a Streamlined Data Refinery. An enterprise solution for blended data that is governed, analytics-ready, and on-demand Build a Streamlined Data Refinery An enterprise solution for blended data that is governed, analytics-ready, and on-demand Introduction As the volume and variety of data has exploded in recent years, putting

More information

Large Scale Text Analysis Using the Map/Reduce

Large Scale Text Analysis Using the Map/Reduce Large Scale Text Analysis Using the Map/Reduce Hierarchy David Buttler This work is performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract

More information

Clinical Genomics at Scale: Synthesizing and Analyzing Big Data From Thousands of Patients

Clinical Genomics at Scale: Synthesizing and Analyzing Big Data From Thousands of Patients Clinical Genomics at Scale: Synthesizing and Analyzing Big Data From Thousands of Patients Brandy Bernard PhD Senior Research Scientist Institute for Systems Biology Seattle, WA Dr. Bernard s research

More information

Data Governance in the Hadoop Data Lake. Kiran Kamreddy May 2015

Data Governance in the Hadoop Data Lake. Kiran Kamreddy May 2015 Data Governance in the Hadoop Data Lake Kiran Kamreddy May 2015 One Data Lake: Many Definitions A centralized repository of raw data into which many data-producing streams flow and from which downstream

More information

D5.3.2b Automatic Rigorous Testing Components

D5.3.2b Automatic Rigorous Testing Components ICT Seventh Framework Programme (ICT FP7) Grant Agreement No: 318497 Data Intensive Techniques to Boost the Real Time Performance of Global Agricultural Data Infrastructures D5.3.2b Automatic Rigorous

More information

Making big data simple with Databricks

Making big data simple with Databricks Making big data simple with Databricks We are Databricks, the company behind Spark Founded by the creators of Apache Spark in 2013 Data 75% Share of Spark code contributed by Databricks in 2014 Value Created

More information

TECHNICAL HIGHLIGHTS. September 16 th,2015 Oglethorpe D. oneusg

TECHNICAL HIGHLIGHTS. September 16 th,2015 Oglethorpe D. oneusg TECHNICAL HIGHLIGHTS September 16 th,2015 Oglethorpe D oneusg Constitution one set of uniform business procedures, policies and practices one technical platform / software solution one support team and

More information