DataBridge
Arcot Rajasekar, The University of North Carolina at Chapel Hill
Transcription
1 DataBridge
Arcot Rajasekar, The University of North Carolina at Chapel Hill
2 Data Bridge: A Social Network for Long Tail Science Data
Outline of the Talk
- Motivation
- Design
- Implementation Status
- Examples
- Future
3 Big Data
Well-known problem. Characteristics:
- Volume: exponential increase in size and count
- Velocity: speed of generation and consumption
- Variety: disparate types of data
- Veracity: integrity and fidelity
- Value: worth
4 Three Kinds of Big Data (1): Archetypal Big Data
Sources:
- Science projects: LHC, LSST, SCEC, OOI
- Business/industry: genomics, finance, pharma
- Government: NASA, NOAA, NCDC
Characteristics:
- Volume: high (large datasets)
- Velocity: high but predictable
- Variety: low (standardized, metadata, curated)
- Veracity: high (fidelity and credible)
- Value: high (funded)
- Findability: high (known sites and discovery mechanisms)
- Availability: high (published API)
Light & Visible Data
5 Three Kinds of Big Data (2): Crowd-Sourced Big Data
Sources:
- Social media: Facebook, Twitter
- Recommenders: Yelp, Angie's List, Groupon
- Web commerce: Amazon, Ebay, Orbitz, enews
Characteristics:
- Volume: high (small data)
- Velocity: high and non-predictable
- Variety: high (but well managed)
- Veracity: mixed (low to high)
- Value: ephemeral (can be none to high)
- Findability: high (known and advertised)
- Availability: immediate interest (web pages and apps)
Nova-like Data
6 Three Kinds of Big Data (3): Long-tail Big Data
Sources:
- Science projects: small teams and organizations
- Personal: hobbies, amateur/citizen science/arts
- Government: internal and unpublished
Characteristics:
- Volume: high (small data sets)
- Velocity: low
- Variety: high (too many, idiosyncratic)
- Veracity: non-credible until proven
- Value: unknown
- Findability: none (hidden and not advertised)
- Availability: none (in local disks and tapes)
Dark Data
7 Our Interest: Long Tail of Science Data
- Large number of data generators
- Highly distributed
- NSF in 2011: 11,150 awards, median size $126K, primarily single PI
- Data individually not petascale but large in aggregate
- Of possible value
- Dark Data: unpublished, used once and forgotten
- Sunset Data: even NSF retention expects, at the least, only 3 years after the project
- Expired Data: curated but disposed
- Social media data
8 Problems: Long Tail of Science Data
First Mile Problem: how to make it available?
- Where do I upload?
- Who is in charge?
- How do I get credit?
- Can I control access?
- How do I pool with other like-minded researchers? Community services?
- How much is long-term?
- Who pays for it?
Last Mile Problem: how to make it findable?
- What is needed to make it more visible? Metadata?
- Are there other methods to make my data findable?
- My data has specific ways and characteristics. How do I expose them as finding aids?
- How can I find similar data?
Solving the long-tail problem will also help the other two Big Data problems.
9 Dark Data from the Long Tail of Science
- Long tail data amounts to small data sets produced by numerous investigators.
- Dark data exemplars: from Brahe to Mendel, discovery has come from relatively small data sets.
- Much long tail data is dark data: data not easily found by potential users (Bryan Heidorn).
- Long tail data sets lack the structural advantages of classic Big Data, such as professional curation, homogeneous and well-documented data formats, and populated, well-formed metadata schemas.
- Improving availability and findability will help solve the problems with this type of big data and make it more mainstream.
Expose the hidden nuggets.
11 Data Bridge: A Solution
- We tackle mainly the last mile problem, but show an avenue for solving the first mile problem.
- Main aim: improve findability.
- Solution: empower data! Empower data to find its own community.
- Community of likeness: similarity in multiple dimensions; look at data from diverse angles.
- Find relationships (strong links) and friendships (weaker links).
- Assist scientists in discovering interesting data sets by automatically forming communities of data.
12 Data Bridge Strategy: Automatic Community Detection
Example: Doc Watson albums at Amazon.com, clustered by Yasiv.com.
- Clusters represent related items.
- Clusters are connected by some internal metrics.
13 Data Bridge: Design
Construct multi-dimensional social networks for data. Three challenges:
- Evaluate multiple types of metrics on data: domain-specific, genre-specific, project-specific. Use socio-metric network algorithms. Similar to but for data.
- Find relevance: slices of similarity. Explore relationships between data, users, resources, methods, and workflows. Use relevance algorithms.
- Create communities: use clustering algorithms.
Provide an extensible big data framework. Democratize the process.
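The three design steps (metric, relevance, community) can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the talk does not name a metric or a clustering algorithm, so Jaccard similarity over keyword sets stands in for the metric, a similarity threshold stands in for relevance, and connected components stand in for clustering. The dataset names are hypothetical.

```python
from itertools import combinations

def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two keyword sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similarity_edges(datasets: dict, threshold: float) -> list:
    """Return (id1, id2, weight) for each pair whose similarity meets the threshold."""
    edges = []
    for (id1, kw1), (id2, kw2) in combinations(sorted(datasets.items()), 2):
        weight = jaccard(kw1, kw2)
        if weight >= threshold:
            edges.append((id1, id2, weight))
    return edges

def communities(nodes, edges) -> list:
    """Group nodes into connected components with a small union-find."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b, _ in edges:
        parent[find(a)] = find(b)
    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return sorted(groups.values(), key=lambda g: sorted(g))

# Hypothetical datasets described only by metadata keywords.
datasets = {
    "survey_a": {"election", "turnout", "county"},
    "survey_b": {"election", "turnout", "state"},
    "soil_1": {"soil", "nitrogen", "county"},
    "soil_2": {"soil", "nitrogen", "ph"},
}

edges = similarity_edges(datasets, threshold=0.4)
groups = communities(datasets, edges)
```

Lowering the threshold admits weaker links (the "friendships" of slide 11), which can merge otherwise separate communities.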
14 Data Bridge Infrastructure
- Accommodate a multiple, extensible number of SNA, RA and CA algorithms.
- Provision an easy way to add new algorithms: crowd sourcing; MyVector (add your own way of defining metrics).
- Provision an easy way to connect data to algorithms: multiple ways of finding similarities; multiple ways of providing search criteria; MyBridge (add your own way of finding relevance).
- Provision an easy way to form communities: multiple ways of categorizing data; MyCommunity (add your own domain-specific clustering).
- Make it a distributed system: grow and shrink as needed.
- Make it easy for third-party setups: federation of Data Bridge.
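One way to picture the MyVector, MyBridge and MyCommunity extension points is a registry that user-supplied algorithms add themselves to. This is a sketch under assumptions, not the DataBridge API: the `register` decorator, the `run` helper and both example algorithms are hypothetical.

```python
from typing import Callable, Dict

# Hypothetical registries, one per extension point named on the slide.
REGISTRIES: Dict[str, Dict[str, Callable]] = {
    "MyVector": {},     # user-defined metrics on a dataset
    "MyBridge": {},     # user-defined relevance between datasets
    "MyCommunity": {},  # user-defined clustering
}

def register(kind: str, name: str):
    """Decorator that files an algorithm under one extension point."""
    if kind not in REGISTRIES:
        raise KeyError(f"unknown extension point: {kind}")

    def decorator(func):
        REGISTRIES[kind][name] = func
        return func

    return decorator

def run(kind: str, name: str, *args):
    """Look up a registered algorithm by extension point and name, then call it."""
    return REGISTRIES[kind][name](*args)

@register("MyVector", "word_count")
def word_count(text: str) -> dict:
    """Illustrative metric: count each word in a description."""
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts

@register("MyBridge", "shared_words")
def shared_words(vec_a: dict, vec_b: dict) -> float:
    """Illustrative relevance: fraction of distinct words the two vectors share."""
    union = set(vec_a) | set(vec_b)
    return len(set(vec_a) & set(vec_b)) / len(union) if union else 0.0

v1 = run("MyVector", "word_count", "gene sequence data")
v2 = run("MyVector", "word_count", "gene expression data")
score = run("MyBridge", "shared_words", v1, v2)
```

Keeping the registries separate mirrors the slide's split between defining metrics, finding relevance and forming communities, so a contributor can replace one stage without touching the others.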
15 Data Bridge Architecture
16 Message-oriented Architecture
Loosely linked processes: messages make the connections.
Scenario:
- User A has a novel signature detection algorithm for gene sequences.
- User A wraps the algorithm with the API provided by DataBridge and subscribes to a message for gene sequence data.
- User B publishes a new gene sequence into DataBridge.
- A new message is created announcing the new addition.
- User A's algorithm detects it, computes relevance to the signature, and publishes a new relevance message for this signature.
- User C has a relevance algorithm that catches A's message and uses it to add B's sequence to a new data community.
- Other signature detectors may also look at B's gene sequence and publish their own relevance, if applicable.
- User A can also look at older gene sequences and find relevance.
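The scenario above can be walked through with a tiny in-process publish/subscribe sketch. It is only an illustration: the `Broker` class stands in for a real message broker, and the topic names, the signature string and the sequence IDs are all hypothetical.

```python
from collections import defaultdict

class Broker:
    """Tiny in-process stand-in for a message broker (not a real RabbitMQ client)."""

    def __init__(self):
        self.subscribers = defaultdict(list)
        self.log = []  # every (topic, payload) published, in order

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, payload):
        self.log.append((topic, payload))
        for handler in list(self.subscribers[topic]):
            handler(payload)

broker = Broker()
data_communities = defaultdict(set)
SIGNATURE = "GATTACA"  # hypothetical signature that user A looks for

def signature_detector(msg):
    """User A: on a new gene sequence, publish relevance if the signature occurs."""
    if SIGNATURE in msg["sequence"]:
        broker.publish("relevance.signature",
                       {"id": msg["id"], "signature": SIGNATURE})

def community_builder(msg):
    """User C: on a relevance message, add the sequence to that signature's community."""
    data_communities[msg["signature"]].add(msg["id"])

broker.subscribe("data.gene_sequence", signature_detector)
broker.subscribe("relevance.signature", community_builder)

# User B publishes two sequences; only the first contains the signature.
broker.publish("data.gene_sequence", {"id": "seq-B1", "sequence": "CCGATTACATT"})
broker.publish("data.gene_sequence", {"id": "seq-B2", "sequence": "CCCCCC"})
```

Users A, B and C never call each other directly; each one knows only the topics it listens to or publishes on, which is the loose linkage the slide describes.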
18 Modules: Current Status
- Messaging system: RabbitMQ
- Ingest Engine
- Relevance Engine
- Network Engine
- Ingestion GUI
- DataVerseNetwork access
- Meta database: MongoDB
- Network database: Neo4J
- Viz display
19 Messages
Message | Listener | Originator
Ingest Metadata | Ingest Engine | Any Data Provider (DVN)
Metadata Available | Relevance Engine | Metadatabase
Create Similarity Matrix | Relevance Engine | Any Ingest Engine
Similarity Matrix Available | Network Engine | Any Relevance Engine
Insert Similarity Matrix | Ingest Engine | User/App
Run SNA Algorithm | Network Engine | Network Engine or user/app
SNA Data Available | Network Engine | Network Database
Create Visualization Data | Network Database | Network Visualization
Show Visualization | Visualizer | User (WebApp)
20 Example Message Schema
Name: Insert.Metadata.Java.URI.MetadataDB
System headers (header: example value)
- type: databridge
- subtype: ingestmetadata
User-provided headers (header: example value)
- classname: org.renci.databridgecontrib.ingest.mockingest
- namespace: system_test
- inputuri: /projects/databridge/metadata.xml
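A listener could select messages like this one by comparing header values. The sketch below mimics the all/any matching rule of an AMQP headers exchange in plain Python; whether DataBridge routes through a headers exchange is an assumption, and the second binding's subtype value is hypothetical. The header names and example values are taken from the slide.

```python
# Header names and example values copied from the slide.
message_headers = {
    "type": "databridge",
    "subtype": "ingestmetadata",
    "classname": "org.renci.databridgecontrib.ingest.mockingest",
    "namespace": "system_test",
    "inputuri": "/projects/databridge/metadata.xml",
}

def matches(binding: dict, headers: dict, x_match: str = "all") -> bool:
    """Mimic headers-exchange matching: 'all' needs every binding pair, 'any' needs one."""
    hits = [headers.get(key) == value for key, value in binding.items()]
    return all(hits) if x_match == "all" else any(hits)

# A listener interested in ingest-metadata messages.
ingest_binding = {"type": "databridge", "subtype": "ingestmetadata"}

# A listener for a different message; this subtype value is hypothetical.
other_binding = {"type": "databridge", "subtype": "createsimilaritymatrix"}

ingest_selected = matches(ingest_binding, message_headers)
other_selected = matches(other_binding, message_headers)
other_selected_any = matches(other_binding, message_headers, x_match="any")
```

With "all" matching only the ingest listener receives the message; with "any" the shared `type` header alone would be enough, which is why the system headers carry both a type and a subtype.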
22 Screenshot: Finding similarities (select network data; filter connectivity by similarity value)
23 Screenshot: Weight of similarity (similarity measure)
24 Screenshot: Highlights of similarities (link to the data)
25 Screenshot: Data access
26 Screenshot: Simple ingest GUI
28 Next Steps
- Basic framework implemented
- Applied to a few thousand datasets
- Work to do (advanced features): documentation; scaling tests; more types of data and metadata to be tested
- Ready for new algorithms, more data, and larger usage
- Investigate multiple similarity measures
- Usage and methods as relevance
29 Players
- Howard Lander
- Justin Zhan
- Merce Crosas
- Gary King
- Jon Crabtree
- Tom Carsey
- Sharlini Shankaran
- Arcot Rajasekar
30 Conclusion
DataBridge: Motivation, Design, Implementation Status, Examples, Future
Arcot Rajasekar, The University of North Carolina at Chapel Hill
More informationGlobus Research Data Management: Introduction and Service Overview
Globus Research Data Management: Introduction and Service Overview Kyle Chard chard@uchicago.edu Ben Blaiszik blaiszik@uchicago.edu Thank you to our sponsors! U. S. D E P A R T M E N T OF ENERGY 2 Agenda
More informationCustomer Analytics. Turn Big Data into Big Value
Turn Big Data into Big Value All Your Data Integrated in Just One Place BIRT Analytics lets you capture the value of Big Data that speeds right by most enterprises. It analyzes massive volumes of data
More informationAre You Big Data Ready?
ACS 2015 Annual Canberra Conference Are You Big Data Ready? Vladimir Videnovic Business Solutions Director Oracle Big Data and Analytics Introduction Introduction What is Big Data? If you can't explain
More informationOnline Marketing Module COMP. Certified Online Marketing Professional. v2.0
= Online Marketing Module COMP Certified Online Marketing Professional v2.0 Part 1 - Introduction to Online Marketing - Basic Description of SEO, SMM, PPC & Email Marketing - Search Engine Basics o Major
More informationBig data for the Masses The Unique Challenge of Big Data Integration
Big data for the Masses The Unique Challenge of Big Data Integration White Paper Table of contents Executive Summary... 4 1. Big Data: a Big Term... 4 1.1. The Big Data... 4 1.2. The Big Technology...
More informationWhy NoSQL? Your database options in the new non- relational world. 2015 IBM Cloudant 1
Why NoSQL? Your database options in the new non- relational world 2015 IBM Cloudant 1 Table of Contents New types of apps are generating new types of data... 3 A brief history on NoSQL... 3 NoSQL s roots
More informationThe Way to SOA Concept, Architectural Components and Organization
The Way to SOA Concept, Architectural Components and Organization Eric Scholz Director Product Management Software AG Seite 1 Goals of business and IT Business Goals Increase business agility Support new
More informationHow To Write A Blog Post On Globus
Globus Software as a Service data publication and discovery Kyle Chard, University of Chicago Computation Institute, chard@uchicago.edu Jim Pruyne, University of Chicago Computation Institute, pruyne@uchicago.edu
More informationBig Data Analytics. Prof. Dr. Lars Schmidt-Thieme
Big Data Analytics Prof. Dr. Lars Schmidt-Thieme Information Systems and Machine Learning Lab (ISMLL) Institute of Computer Science University of Hildesheim, Germany 33. Sitzung des Arbeitskreises Informationstechnologie,
More informationData Warehousing and Data Mining in Business Applications
133 Data Warehousing and Data Mining in Business Applications Eesha Goel CSE Deptt. GZS-PTU Campus, Bathinda. Abstract Information technology is now required in all aspect of our lives that helps in business
More informationCloud computing based big data ecosystem and requirements
Cloud computing based big data ecosystem and requirements Yongshun Cai ( 蔡 永 顺 ) Associate Rapporteur of ITU T SG13 Q17 China Telecom Dong Wang ( 王 东 ) Rapporteur of ITU T SG13 Q18 ZTE Corporation Agenda
More informationTableau Server 7.0 scalability
Tableau Server 7.0 scalability February 2012 p2 Executive summary In January 2012, we performed scalability tests on Tableau Server to help our customers plan for large deployments. We tested three different
More information