ENABLING DATA TRANSFER MANAGEMENT AND SHARING IN THE ERA OF GENOMIC MEDICINE. October 2013


Introduction

As sequencing technologies continue to evolve and genomic data makes its way into clinical use and medical practice, a momentous challenge arises: how to cope with the rapidly increasing volume of complex data. Issues such as data storage, access, transfer, sharing, security, and analysis must be resolved to enable the new era of genomic medicine.

Annai Systems provides several tools to enable and enhance genomic data use: the Annai-GNOS data management platform, GeneTorrent and GTFuse for accelerated file transfer and file mining, the reQuest portal for collaboration and discovery, and the BioCompute Farm for analytical power. These tools can be deployed in concert or independently.

Annai Platform Components

Annai-GNOS provides a fast, scalable, and robust network solution for storing, moving, finding, and securing genomic sequence data and associated metadata. GNOS-enabled repositories can handle multiple petabytes of next-generation sequencing data for fast and flexible storage, search, and retrieval.

GeneTorrent is a data transfer protocol that allows high-speed transfer of data files into and out of a GNOS-enabled repository. The repository and file transfer capabilities are highly secure and meet government standards as defined by the Federal Information Security Management Act of 2002 (FISMA).

BioCompute Farm is a virtualized computation environment that provides on-demand compute power optimized for the analysis of genomic data. Users get high-throughput computing without having to build local high-performance compute platforms or transfer massive data files over the Internet.

reQuest is a web portal with a query and networking infrastructure that lets researchers search, find, and manage downloads from multiple GNOS-enabled data repositories. Its intuitive user interface streamlines the process of exploring and searching genomic data.

GTFuse builds on GeneTorrent's fast transfer speeds by allowing users to download selected portions of large genomic data files, such as those at CGHub. Researchers can find and access sequence data files as swiftly as if they were on the local network. Because GTFuse can select and retrieve a designated subset or region of a BAM file, it dramatically reduces data transfer times and costs.

ANNAI SYSTEMS ALL RIGHTS RESERVED

A growing number of public and private repositories are emerging as integral parts of the drug discovery and therapeutic treatment process. These data repositories vary greatly in data use, efficiency of data upload, download, and access, regulatory compliance, and security configuration. Furthermore, genomic data comes in a wide variety of formats and from various sequencing platforms. As the integration of genomic data with clinical data becomes increasingly required, there is an urgent need for genomic data tools that provide flexible, scalable solutions for a wide diversity of uses.

The Cancer Genomics Hub (CGHub) is a vast repository of cancer genome data accessed freely by hundreds of researchers and clinicians in both academic and commercial environments. CGHub uses Annai-GNOS to provide highly scalable access to The Cancer Genome Atlas (TCGA) and other cancer genome data sets. CGHub was launched in 2012 at UC Santa Cruz and now holds over 55,000 cancer genome files totaling 675 terabytes. Hundreds of researchers from dozens of institutions rely on CGHub for access to cancer genome data from ten world-class sequencing centers, including the Broad Institute, Washington University, and Baylor College of Medicine. The repository is expected to grow to 5 petabytes in the next few years.

Annai supports both research and clinical settings with a powerful and flexible environment in which users at all levels of IT skill can easily handle and analyze genomic data.

FIGURE 1. The Annai-GNOS environment and related peripheral data management tools. The various components of the Annai platform can be deployed together as an integrated whole or independently. (Components shown: Annai BCF; Annai reQuest research portal; Annai GNOS, the Genome Network Operating System; GNOS Web Services; Annai GTFuse; federated authentication; GNOS repository; GeneTorrent data transfer; public and private genomic data.)

When deployed in full, the Annai-GNOS system boosts productivity, reduces time-to-insight, and ensures data security while facilitating collaboration. Researchers or clinicians can quickly search and extract specific segments from thousands of genomes, work independently or collaborate with a team to analyze the data, and prepare their findings for publication or for use in the clinic to guide therapy.

The Annai-GNOS platform is designed to accelerate genomic research. The sections that follow examine each component and show how they work together as a system.

Annai-GNOS: A Platform for High Performance Genomic Analysis and Data Management

Annai-GNOS integrates the data repository infrastructure and high-speed networking capabilities needed to accommodate large genomic data sets. These data sets are characterized by diverse file formats, extensive metadata, and large file sizes, with individual sequence datasets ranging from 10 gigabytes to more than 1 terabyte (depending on the depth of coverage).

Annai-GNOS allows the entire user community to see the state of data throughout the submission lifecycle, including data that has not yet been approved or submitted for download. Researchers can query the state of data as soon as it is submitted and quickly identify submissions that need attention because of formatting or other problems, before they become available to users of the repository. Flexible metadata searching greatly simplifies finding the right sequence file, and a highly fault-tolerant design keeps services available.

The GNOS network functionality integrates secure, high-speed network protocols to mobilize petabyte-scale genomic data analysis. Annai-GNOS can also be integrated with federated authentication systems such as InCommon and the National Cancer Institute's authorization systems.

Technical Specifications

GNOS features the following capabilities:
- User-programmable metadata format validation engine
- Support for multiple metadata formats, including customer-defined formats and the Sequence Read Archive (SRA) schemas used by NCBI, EBI, and DDBJ
- Support for multiple sequence data file types
- Ability to store other file types, such as compressed sequences
- Accelerated file transfer using GeneTorrent and GTFuse
- InCommon (Shibboleth)-based federated user authentication
- Project-based data authorization to control individual researcher access
- Support for commonly used file format standards and analysis tools, including the NCBI SRA metadata format; TCGA v2 BAM and VCF file formats; and GATK, Bowtie, TopHat, Cufflinks, and additional tools

The GNOS platform streamlines all aspects of genomic data management and access for researchers and clinicians. Setting up a GNOS repository consists of two steps: 1) data ingestion (duration depends on the state of the data) and 2) data deployment as indexed metadata tags in the GNOS database. Sequence data are entered into the repository using Annai's proprietary GeneTorrent tool and metadata submission API.

Researchers can use the reQuest web portal to quickly and easily explore GNOS-enabled data. For example, a simple search for ovarian cancer in CGHub using reQuest can instantly report the number of ovarian cancer genome files in the database and how many are RNA-Seq, exome, or whole genome. The interface also lets the user drill down quickly to the specific files of interest.

The ability to quickly visualize the contents of a GNOS repository is based on searching metadata attributes that are extracted from sequencing files, catalogued, and indexed. Query parameters are unlimited, but typically include file type, disease, sample collection date, sequencing platform, date of sequencing, and mapping and alignment tools.

GNOS is suitable for public and private genomic databases of translational and basic research centers, pharmaceutical R&D labs, diagnostic companies, and similar organizations generating significant volumes of sequence data. GNOS provides tools to catalogue, index, upload, and download files, and to make the data available for collaboration. GNOS can also be integrated with any data management and transfer method or protocol.

Use Case 1: CGHub Cancer Genome Repository

The University of California Santa Cruz (UCSC) provides CGHub, the world's largest repository of cancer genome data. CGHub is built on GNOS and, after rigorous testing with active TCGA users, was established as the new secure repository for The Cancer Genome Atlas (TCGA) on April 30.

Use Case 2: Drug Development

Pharmaceutical companies have strict requirements for data protection and security. Corporate policies may mandate keeping data behind a firewall. In this case, an in-house GNOS repository is an optimal solution. After installation by Annai, this type of repository is managed by the company's local experts within its existing high-performance computing infrastructure.
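The ovarian cancer search described above amounts to filtering indexed metadata records and tallying the hits by an attribute. A minimal Python sketch of that idea follows. The record fields (`analysis_id`, `disease`, `library_strategy`), the sample values, and the helper names `search` and `count_by` are all hypothetical illustrations; they are not the GNOS schema or API.

```python
from collections import Counter

# Hypothetical metadata records; field names and values are illustrative only.
RECORDS = [
    {"analysis_id": "a1", "disease": "ovarian cancer", "library_strategy": "WGS"},
    {"analysis_id": "a2", "disease": "ovarian cancer", "library_strategy": "WXS"},
    {"analysis_id": "a3", "disease": "ovarian cancer", "library_strategy": "RNA-Seq"},
    {"analysis_id": "a4", "disease": "ovarian cancer", "library_strategy": "WXS"},
    {"analysis_id": "a5", "disease": "breast cancer", "library_strategy": "WGS"},
]


def search(records, **criteria):
    """Return the records whose attributes equal every criterion (case-insensitive)."""
    def matches(record):
        return all(
            str(record.get(key, "")).lower() == str(value).lower()
            for key, value in criteria.items()
        )
    return [record for record in records if matches(record)]


def count_by(records, attribute):
    """Tally records by the value of one metadata attribute."""
    return Counter(record.get(attribute, "unknown") for record in records)


if __name__ == "__main__":
    hits = search(RECORDS, disease="ovarian cancer")
    print(len(hits), "ovarian cancer files")
    for strategy, count in sorted(count_by(hits, "library_strategy").items()):
        print(f"  {strategy}: {count}")
```

A production repository would answer such queries from a database index rather than a linear scan; the sketch only shows the shape of the query and the summary it produces.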

GeneTorrent: Accelerated Secure File Transport

Whole genome sequence data files range from several hundred gigabytes to over one terabyte in size. GeneTorrent enables accelerated transfers of terabyte-scale data. It employs a proprietary variant of the popular BitTorrent algorithm to securely transfer files at speeds limited only by the base network bandwidth.

Technical Specifications

GeneTorrent's key functionality is as follows:
- High-fidelity parallel file transfer at up to multiple Gbit/s (speeds as high as 200 Mbps are routinely achieved)
- High resilience to in-network and computing failures, with automatic recovery
- Highly secure 256-bit encrypted file transfer

Use Case: Translational and Clinical Research

Translational researchers and clinicians use GeneTorrent to push sequence data, either locally or from an external sequencing lab, into a GNOS repository that is either installed in their facility or hosted by Annai in the BioCompute Farm. GNOS-enabled repositories can also be hosted on Amazon Web Services (AWS) or in similar cloud environments.

reQuest: One-stop Portal for Data Access, Collaboration and Management

One of the most difficult aspects of genomics research is finding specific data across multiple growing, separate, and disparate data repositories. Individual files can also be very large, and the metadata extensive and difficult to interpret. The reQuest portal addresses these challenges by providing a single point of access to the contents of all accessible GNOS-enabled repositories. Researchers can use reQuest's data exploration capabilities to analyze data trends across available repositories. The portal's Access and Download capabilities allow researchers to drill down to find and download specific data sets.

The Explore, Access, Download, and Collaborate capabilities of reQuest are available to the community through standard web browsers, so users can query, retrieve, and monitor download progress without having to install or master complex proprietary tools or query syntax.

Technical Specifications

The following describes reQuest's key functionality:
- Explore: a graphical interface to interrogate and analyze the contents of any Annai-GNOS-enabled data repository using data statistics and metadata. This function enables searches based on organization, study, disease, and other key terms to explore the genomic data set.
- Access: a powerful yet user-friendly metadata query builder that lets the researcher find and select a set of individual sequence files for download. Downloads can be initiated from the Access area once the desired files are designated. Annai reQuest offers conditional access, as some data repositories, such as the TCGA data hosted on CGHub, require access authorization credentials in order to download sequence files. The status of current and past download requests can be reviewed from a single dashboard.
- Download: users can view the status of each file within their download requests, and a complete history of downloads is maintained to support experiment reproducibility.
- Collaborate: public and private collaboration sites where colleagues can share knowledge around common projects and frequently accessed datasets, broadening the community of academic and clinical researchers.
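To put the GeneTorrent throughput figures quoted earlier in perspective, the short Python sketch below estimates how long a terabyte-scale file takes to move at a given sustained rate. It assumes decimal units (1 TB = 10^12 bytes, 1 Mbps = 10^6 bits per second) and ignores protocol overhead, encryption cost, and disk limits; the function name `transfer_hours` and the example rates other than 200 Mbps are illustrative choices, not measurements.

```python
TB = 10**12  # bytes in one decimal terabyte


def transfer_hours(size_bytes, rate_bits_per_sec):
    """Idealized transfer time in hours at a constant sustained rate."""
    seconds = size_bytes * 8 / rate_bits_per_sec
    return seconds / 3600


if __name__ == "__main__":
    # 200 Mbps is the routinely achieved speed cited above; the other
    # two rates are illustrative points in the multi-gigabit range.
    for label, rate in [("200 Mbps", 200e6), ("1 Gbps", 1e9), ("10 Gbps", 10e9)]:
        print(f"1 TB at {label}: {transfer_hours(TB, rate):.2f} hours")
```

At 200 Mbps a 1 TB genome takes roughly eleven hours under these assumptions, which is why both higher link speeds and the partial-file access offered by GTFuse matter for routine work.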

The collaborative capabilities of reQuest facilitate cross-team communication and allow for better distribution of tasks. For example, a team member responsible for defining the experimental parameters could select the appropriate data and pass it to a bioinformatician who performs the analysis.

FIGURE 2. The reQuest portal helps expedite research through a well-managed, user-friendly portal environment. (Components shown: System Management, Data Explorer, Data Access, Portal Management, Database, Data Download, Metadata Ingest, Operating System, Communications Broker.)

GTFuse: Accelerated Data Queries

GTFuse enables researchers to access remote sequence data files directly, as if they were on the local file system. Researchers can mount the desired data and immediately run existing tools such as SAMtools to inspect the header and begin accessing specific regions of the sequence data (for example, data from a particular chromosome, gene, or region).

Technical Specifications

The following describes GTFuse's key functionality:
- Mounts a remote file on the local file system
- Provides asynchronous access to files via the GeneTorrent protocol
- Transfers no data until the file is accessed by the user on the local file system

FIGURE 3. GTFuse provides the option to search and download the specific genes or regions required instead of the entire file. It requires no tools integration and allows any analysis tool to access data files as if they were local. (Components shown: GNOS genomic file, relevant data within file, GTFuse client, HPC analysis clusters, local analysis tools.)

Researchers often want to quickly examine specific regions of genomic data in remote repositories without retrieving the entire BAM file or analysis object. Alternatively, researchers may need to read entire files but lack the storage capacity to keep local copies of large numbers of BAM files. Other tasks are difficult simply because sequence files are so large. For example, a researcher may spend hours downloading BAM files just to inspect their headers and determine whether coverage depth is sufficient for the analysis. In all of these scenarios, GTFuse offers a speedy and economical solution: it substantially shortens the time researchers spend preparing for the analysis that interests them and helps conserve IT resources.

Use Case 1: Asynchronous BAM file access

A researcher wants to use SAMtools to view specific genome data coordinates. The researcher uses GTFuse to open a BAM file and its corresponding BAI file, then performs seek operations to read small portions of the BAM file asynchronously.

Use Case 2: Process remote file locally

A researcher avoids using large amounts of local disk storage by mounting a remote BAM file using GTFuse before building a BAI index file locally.
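The behavior described above, in which nothing is transferred until a tool actually reads part of the mounted file, can be illustrated with a toy Python model. This is a sketch of the general lazy, block-wise access pattern only; it is not GTFuse's implementation. The class name `LazyRemoteFile`, the in-memory `remote_bytes` stand-in for a remote repository, and the 64 KiB block size are assumptions made for the example.

```python
class LazyRemoteFile:
    """Toy model of a remotely mounted file: blocks are fetched on first read."""

    def __init__(self, remote_bytes, block_size=64 * 1024):
        self._remote = remote_bytes      # stands in for data held by the repository
        self._block_size = block_size
        self._cache = {}                 # block index -> bytes already fetched
        self.bytes_transferred = 0       # running total of simulated transfer

    def _fetch_block(self, index):
        # Only blocks that have never been read trigger a (simulated) transfer.
        if index not in self._cache:
            start = index * self._block_size
            block = self._remote[start:start + self._block_size]
            self._cache[index] = block
            self.bytes_transferred += len(block)
        return self._cache[index]

    def read_range(self, offset, length):
        """Return `length` bytes starting at `offset`, fetching only the blocks needed."""
        end = min(offset + length, len(self._remote))
        if offset < 0 or offset >= end:
            return b""
        first = offset // self._block_size
        last = (end - 1) // self._block_size
        data = b"".join(self._fetch_block(i) for i in range(first, last + 1))
        skip = offset - first * self._block_size
        return data[skip:skip + (end - offset)]


if __name__ == "__main__":
    remote = bytes(range(256)) * 4096            # a 1 MiB stand-in for a large file
    mounted = LazyRemoteFile(remote)
    header = mounted.read_range(0, 100)          # e.g. inspecting a file header
    region = mounted.read_range(500_000, 2_000)  # e.g. one indexed region
    print(f"transferred {mounted.bytes_transferred} of {len(remote)} bytes")
```

With a real BAM file the byte ranges come from the BAI index, which maps genomic coordinates to offsets in the compressed file; a tool such as SAMtools performs those seeks itself, which is why it can run unchanged against a mounted file.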

BioCompute Farm: Enabling Simple, Streamlined Data Analysis

The BioCompute Farm is a private cloud designed specifically for genomic data analysis. It allows collaborators to use an elastic pool of compute servers and run cross-organizational experiments without up-front capital expense, IT development effort, ongoing maintenance, or significant lead time. Local GNOS-enabled compute databases, a pre-installed set of analysis tools, a stored set of reference genomes, and specialized data access greatly simplify genomic data gathering and analysis.

The BioCompute Farm reduces the resources and time needed to accomplish complex genomic data analysis. Researchers can instantly activate virtual machines in the highly secure BioCompute Farm and collaborate with colleagues across the globe. Data input and output are free on the BioCompute Farm. Its high-speed network transfer capability removes the need to ship hard disks containing potentially sensitive data between organizations, with the attendant risks and delays. Its flexible storage allows researchers to import large volumes of data for analysis and to discard the data afterwards. Researchers thereby avoid the difficulties and delays of expanding existing local IT infrastructure to cope with moving and processing large volumes of sequencing data.

FIGURE 4. The BioCompute Farm offers high performance computing, storage, and networking resources in a virtualized computing environment. (Components shown: customer site with researchers, access control, Internet, CGHub, compute console, reQuest portal, transfer control, sequence data, data center fabric, genome analysis tools and GTFuse, the Annai BioCompute Farm, San Diego Supercomputer Center.)

Technical Specifications

The BioCompute Farm has the following key functionalities:
- High-performance compute power, including 10G networking, 100 GB memory, and highly scalable storage capacity, to deliver performance optimized for bioinformatics applications.
- Users have complete control over their virtual instances. Additional instances, memory, and storage capacity can be added as needed.
- Custom user tracking and reporting can be enabled.
- Instances include bioinformatics and data extraction tools for large-scale and complex genomic analysis. Users can add tools and save them for future reuse.
- Workflows can be set up to launch automatically.

There are two primary uses of the BioCompute Farm. One is serving clients who need to do analytical research with repositories such as CGHub and do not need to store data at the compute center. Typically, they want to analyze primary sequence data in the BioCompute Farm and pull result datasets back to their local environments. By using GTFuse, researchers can extract the genes or regions of interest instead of bulk copying whole sequence files. This is one of the most significant advantages of using GTFuse in conjunction with the BioCompute Farm. In particular cases where a handful of genes are studied across many genomes, TCGA researchers use up to one hundred times less compute and storage capacity by working only with the actively used TCGA data.

Use Case 1: CGHub BioCompute Farm

The CGHub BioCompute Farm is co-located with CGHub, home of genomic data from The Cancer Genome Atlas, within the San Diego Supercomputer Center. The BioCompute Farm has a 10 Gb/s connection to CGHub and the Internet. Annai's reQuest web portal enables users to rapidly browse the genomic data sets via customized and automated searches, and to bring the desired data into the user applications running in the BioCompute Farm.
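The "up to one hundred times less" figure quoted above follows directly from the fraction of each file that an analysis actually touches. The Python sketch below makes that arithmetic explicit. The cohort size, per-genome file size, and 1% fraction are hypothetical inputs chosen to reproduce a hundredfold reduction; they are not measurements from TCGA or CGHub, and `storage_needed_tb` is an illustrative helper name.

```python
def storage_needed_tb(n_genomes, genome_size_gb, fraction_used=1.0):
    """Storage in decimal terabytes when only `fraction_used` of each file is kept."""
    if not 0.0 <= fraction_used <= 1.0:
        raise ValueError("fraction_used must be between 0 and 1")
    return n_genomes * genome_size_gb * fraction_used / 1000


if __name__ == "__main__":
    # Hypothetical study: 1,000 genomes of 100 GB each, with the genes of
    # interest covering about 1% of every file.
    full = storage_needed_tb(1000, 100)
    regions = storage_needed_tb(1000, 100, fraction_used=0.01)
    print(f"whole files: {full:.1f} TB, regions only: {regions:.1f} TB")
    print(f"reduction factor: {full / regions:.0f}x")
```

The same ratio applies to transfer volume, so the saving compounds with the network time needed to move whole files.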

Use Case 2: Private BioCompute Farm

A private BioCompute Farm can be co-located with an in-house GNOS-enabled data repository tailored to the particular requirements of a research organization. Annai provides installation, configuration, and GeneTorrent training to researchers. Optionally, mapping, alignment, and variant calling tools can also be pre-installed in the BioCompute Farm. Having data analysis capacity co-located with in-house data can substantially reduce costs and speed up genomic data analysis.

Conclusion

Advancing translational research and genomic medicine requires distilling valuable, actionable information from hundreds or thousands of genomic sequence files, which raises a unique set of big data challenges. In response, Annai Systems has developed the Annai-GNOS platform to drive robust repository operations that meet the real-world needs of users. The platform provides metadata-based indexing, search queries, and access to multiple distributed data sets; high-speed file transfer; rapid extraction of designated elements from multiple files; and a user-friendly alternative to the command-line interface.

Annai Systems Inc. Tel Alberto Way, Suite 120 Los Gatos, California,


More information

Making a Case for Including WAN Optimization in your Global SharePoint Deployment

Making a Case for Including WAN Optimization in your Global SharePoint Deployment Making a Case for Including WAN Optimization in your Global SharePoint Deployment Written by: Mauro Cardarelli Mauro Cardarelli is co-author of "Essential SharePoint 2007 -Delivering High Impact Collaboration"

More information

GenomeSpace Architecture

GenomeSpace Architecture GenomeSpace Architecture The primary services, or components, are shown in Figure 1, the high level GenomeSpace architecture. These include (1) an Authorization and Authentication service, (2) an analysis

More information

Axceleon s CloudFuzion Turbocharges 3D Rendering On Amazon s EC2

Axceleon s CloudFuzion Turbocharges 3D Rendering On Amazon s EC2 Axceleon s CloudFuzion Turbocharges 3D Rendering On Amazon s EC2 In the movie making, visual effects and 3D animation industrues meeting project and timing deadlines is critical to success. Poor quality

More information

EMC CLOUDARRAY PRODUCT DESCRIPTION GUIDE

EMC CLOUDARRAY PRODUCT DESCRIPTION GUIDE EMC CLOUDARRAY PRODUCT DESCRIPTION GUIDE INTRODUCTION IT organizations today grapple with two critical data storage challenges: the exponential growth of data and an increasing need to keep more data for

More information

Cisco UCS and Quantum StorNext: Harnessing the Full Potential of Content

Cisco UCS and Quantum StorNext: Harnessing the Full Potential of Content Solution Brief Cisco UCS and Quantum StorNext: Harnessing the Full Potential of Content What You Will Learn StorNext data management with Cisco Unified Computing System (Cisco UCS ) helps enable media

More information

Hadoop & Spark Using Amazon EMR

Hadoop & Spark Using Amazon EMR Hadoop & Spark Using Amazon EMR Michael Hanisch, AWS Solutions Architecture 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Agenda Why did we build Amazon EMR? What is Amazon EMR?

More information

QLIKVIEW INTEGRATION TION WITH AMAZON REDSHIFT John Park Partner Engineering

QLIKVIEW INTEGRATION TION WITH AMAZON REDSHIFT John Park Partner Engineering QLIKVIEW INTEGRATION TION WITH AMAZON REDSHIFT John Park Partner Engineering June 2014 Page 1 Contents Introduction... 3 About Amazon Web Services (AWS)... 3 About Amazon Redshift... 3 QlikView on AWS...

More information

Testimony of. Paul Misener Vice President for Global Public Policy, Amazon.com. Before the

Testimony of. Paul Misener Vice President for Global Public Policy, Amazon.com. Before the Testimony of Paul Misener Vice President for Global Public Policy, Before the United States House of Representatives Committee on Energy and Commerce Subcommittee on Communications and Technology Subcommittee

More information

European Genome-phenome Archive database of human data consented for use in biomedical research at the European Bioinformatics Institute

European Genome-phenome Archive database of human data consented for use in biomedical research at the European Bioinformatics Institute European Genome-phenome Archive database of human data consented for use in biomedical research at the European Bioinformatics Institute Justin Paschall Team Leader Genetic Variation / EGA ! European Genome-phenome

More information

CGHub Web-based Metadata GUI Statement of Work

CGHub Web-based Metadata GUI Statement of Work CGHub Web-based Metadata GUI Statement of Work Mark Diekhans Version 1 April 23, 2012 1 Goals CGHub stores metadata and data associated from NCI cancer projects. The goal of this project

More information

LifeScope Genomic Analysis Software 2.5

LifeScope Genomic Analysis Software 2.5 USER GUIDE LifeScope Genomic Analysis Software 2.5 Graphical User Interface DATA ANALYSIS METHODS AND INTERPRETATION Publication Part Number 4471877 Rev. A Revision Date November 2011 For Research Use

More information

CloudCenter Full Lifecycle Management. An application-defined approach to deploying and managing applications in any datacenter or cloud environment

CloudCenter Full Lifecycle Management. An application-defined approach to deploying and managing applications in any datacenter or cloud environment CloudCenter Full Lifecycle Management An application-defined approach to deploying and managing applications in any datacenter or cloud environment CloudCenter Full Lifecycle Management Page 2 Table of

More information

ORACLE HEALTH SCIENCES INFORM ADVANCED MOLECULAR ANALYTICS

ORACLE HEALTH SCIENCES INFORM ADVANCED MOLECULAR ANALYTICS ORACLE HEALTH SCIENCES INFORM ADVANCED MOLECULAR ANALYTICS INCORPORATE GENOMIC DATA INTO CLINICAL R&D KEY BENEFITS Enable more targeted, biomarker-driven clinical trials Improves efficiencies, compressing

More information

Cisco Virtualized Multiservice Data Center Reference Architecture: Building the Unified Data Center

Cisco Virtualized Multiservice Data Center Reference Architecture: Building the Unified Data Center Solution Overview Cisco Virtualized Multiservice Data Center Reference Architecture: Building the Unified Data Center What You Will Learn The data center infrastructure is critical to the evolution of

More information

Powerful analytics. and enterprise security. in a single platform. microstrategy.com 1

Powerful analytics. and enterprise security. in a single platform. microstrategy.com 1 Powerful analytics and enterprise security in a single platform microstrategy.com 1 Make faster, better business decisions with easy, powerful, and secure tools to explore data and share insights. Enterprise-grade

More information

UCLA Team Sequences Cell Line, Puts Open Source Software Framework into Production

UCLA Team Sequences Cell Line, Puts Open Source Software Framework into Production Page 1 of 6 UCLA Team Sequences Cell Line, Puts Open Source Software Framework into Production February 05, 2010 Newsletter: BioInform BioInform - February 5, 2010 By Vivien Marx Scientists at the department

More information

How to Ingest Data into Google BigQuery using Talend for Big Data. A Technical Solution Paper from Saama Technologies, Inc.

How to Ingest Data into Google BigQuery using Talend for Big Data. A Technical Solution Paper from Saama Technologies, Inc. How to Ingest Data into Google BigQuery using Talend for Big Data A Technical Solution Paper from Saama Technologies, Inc. July 30, 2013 Table of Contents Intended Audience What you will Learn Background

More information

Product Brief SysTrack VMP

Product Brief SysTrack VMP for VMware View Product Brief SysTrack VMP Benefits Optimize VMware View desktop and server virtualization and terminal server projects Anticipate and handle problems in the planning stage instead of postimplementation

More information

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract W H I T E P A P E R Deriving Intelligence from Large Data Using Hadoop and Applying Analytics Abstract This white paper is focused on discussing the challenges facing large scale data processing and the

More information

Relocating Windows Server 2003 Workloads

Relocating Windows Server 2003 Workloads Relocating Windows Server 2003 Workloads An Opportunity to Optimize From Complex Change to an Opportunity to Optimize There is much you need to know before you upgrade to a new server platform, and time

More information

Desktop Virtualization for the Banking Industry. Resilient Desktop Virtualization for Bank Branches. A Briefing Paper

Desktop Virtualization for the Banking Industry. Resilient Desktop Virtualization for Bank Branches. A Briefing Paper Desktop Virtualization for the Banking Industry Resilient Desktop Virtualization for Bank Branches A Briefing Paper September 2012 1 Contents Introduction VERDE Cloud Branch for Branch Office Management

More information

Business-centric Storage FUJITSU Hyperscale Storage System ETERNUS CD10000

Business-centric Storage FUJITSU Hyperscale Storage System ETERNUS CD10000 Business-centric Storage FUJITSU Hyperscale Storage System ETERNUS CD10000 Clear the way for new business opportunities. Unlock the power of data. Overcoming storage limitations Unpredictable data growth

More information

Object Storage: A Growing Opportunity for Service Providers. White Paper. Prepared for: 2012 Neovise, LLC. All Rights Reserved.

Object Storage: A Growing Opportunity for Service Providers. White Paper. Prepared for: 2012 Neovise, LLC. All Rights Reserved. Object Storage: A Growing Opportunity for Service Providers Prepared for: White Paper 2012 Neovise, LLC. All Rights Reserved. Introduction For service providers, the rise of cloud computing is both a threat

More information

GeneProf and the new GeneProf Web Services

GeneProf and the new GeneProf Web Services GeneProf and the new GeneProf Web Services Florian Halbritter florian.halbritter@ed.ac.uk Stem Cell Bioinformatics Group (Simon R. Tomlinson) simon.tomlinson@ed.ac.uk December 10, 2012 Florian Halbritter

More information

IBM Global Technology Services September 2007. NAS systems scale out to meet growing storage demand.

IBM Global Technology Services September 2007. NAS systems scale out to meet growing storage demand. IBM Global Technology Services September 2007 NAS systems scale out to meet Page 2 Contents 2 Introduction 2 Understanding the traditional NAS role 3 Gaining NAS benefits 4 NAS shortcomings in enterprise

More information

DELL s Oracle Database Advisor

DELL s Oracle Database Advisor DELL s Oracle Database Advisor Underlying Methodology A Dell Technical White Paper Database Solutions Engineering By Roger Lopez Phani MV Dell Product Group January 2010 THIS WHITE PAPER IS FOR INFORMATIONAL

More information

End-to-End E-Clinical Coverage with Oracle Health Sciences InForm GTM

End-to-End E-Clinical Coverage with Oracle Health Sciences InForm GTM End-to-End E-Clinical Coverage with InForm GTM A Complete Solution for Global Clinical Trials The broad market acceptance of electronic data capture (EDC) technology, coupled with an industry moving toward

More information

TABLE OF CONTENTS THE SHAREPOINT MVP GUIDE TO ACHIEVING HIGH AVAILABILITY FOR SHAREPOINT DATA. Introduction. Examining Third-Party Replication Models

TABLE OF CONTENTS THE SHAREPOINT MVP GUIDE TO ACHIEVING HIGH AVAILABILITY FOR SHAREPOINT DATA. Introduction. Examining Third-Party Replication Models 1 THE SHAREPOINT MVP GUIDE TO ACHIEVING HIGH AVAILABILITY TABLE OF CONTENTS 3 Introduction 14 Examining Third-Party Replication Models 4 Understanding Sharepoint High Availability Challenges With Sharepoint

More information

How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time

How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time SCALEOUT SOFTWARE How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time by Dr. William Bain and Dr. Mikhail Sobolev, ScaleOut Software, Inc. 2012 ScaleOut Software, Inc. 12/27/2012 T wenty-first

More information

Digital Asset Management

Digital Asset Management A collaborative digital asset management system for marketing organizations that improves performance, saves time and reduces costs. MarketingPilot provides powerful digital asset management software for

More information

Big Data at Cloud Scale

Big Data at Cloud Scale Big Data at Cloud Scale Pushing the limits of flexible & powerful analytics Copyright 2015 Pentaho Corporation. Redistribution permitted. All trademarks are the property of their respective owners. For

More information

Media Exchange really puts the power in the hands of our creative users, enabling them to collaborate globally regardless of location and file size.

Media Exchange really puts the power in the hands of our creative users, enabling them to collaborate globally regardless of location and file size. Media Exchange really puts the power in the hands of our creative users, enabling them to collaborate globally regardless of location and file size. Content Sharing Made Easy Media Exchange (MX) is a browser-based

More information

Scalable Cloud Computing Solutions for Next Generation Sequencing Data

Scalable Cloud Computing Solutions for Next Generation Sequencing Data Scalable Cloud Computing Solutions for Next Generation Sequencing Data Matti Niemenmaa 1, Aleksi Kallio 2, André Schumacher 1, Petri Klemelä 2, Eija Korpelainen 2, and Keijo Heljanko 1 1 Department of

More information

Databricks. A Primer

Databricks. A Primer Databricks A Primer Who is Databricks? Databricks vision is to empower anyone to easily build and deploy advanced analytics solutions. The company was founded by the team who created Apache Spark, a powerful

More information

The Recipe for Sarbanes-Oxley Compliance using Microsoft s SharePoint 2010 platform

The Recipe for Sarbanes-Oxley Compliance using Microsoft s SharePoint 2010 platform The Recipe for Sarbanes-Oxley Compliance using Microsoft s SharePoint 2010 platform Technical Discussion David Churchill CEO DraftPoint Inc. The information contained in this document represents the current

More information

Utilizing the SDSC Cloud Storage Service

Utilizing the SDSC Cloud Storage Service Utilizing the SDSC Cloud Storage Service PASIG Conference January 13, 2012 Richard L. Moore rlm@sdsc.edu San Diego Supercomputer Center University of California San Diego Traditional supercomputer center

More information

CrossPoint for Managed Collaboration and Data Quality Analytics

CrossPoint for Managed Collaboration and Data Quality Analytics CrossPoint for Managed Collaboration and Data Quality Analytics Share and collaborate on healthcare files. Improve transparency with data quality and archival analytics. Ajilitee 2012 Smarter collaboration

More information

WE RUN SEVERAL ON AWS BECAUSE WE CRITICAL APPLICATIONS CAN SCALE AND USE THE INFRASTRUCTURE EFFICIENTLY.

WE RUN SEVERAL ON AWS BECAUSE WE CRITICAL APPLICATIONS CAN SCALE AND USE THE INFRASTRUCTURE EFFICIENTLY. WE RUN SEVERAL CRITICAL APPLICATIONS ON AWS BECAUSE WE CAN SCALE AND USE THE INFRASTRUCTURE EFFICIENTLY. - Murari Gopalan Director, Technology Expedia Expedia, a leading online travel company for leisure

More information

Databricks. A Primer

Databricks. A Primer Databricks A Primer Who is Databricks? Databricks was founded by the team behind Apache Spark, the most active open source project in the big data ecosystem today. Our mission at Databricks is to dramatically

More information

Cisco Unified Data Center

Cisco Unified Data Center Solution Overview Cisco Unified Data Center Simplified, Efficient, and Agile Infrastructure for the Data Center What You Will Learn The data center is critical to the way that IT generates and delivers

More information

How To Create A Large Enterprise Cloud Storage System From A Large Server (Cisco Mds 9000) Family 2 (Cio) 2 (Mds) 2) (Cisa) 2-Year-Old (Cica) 2.5

How To Create A Large Enterprise Cloud Storage System From A Large Server (Cisco Mds 9000) Family 2 (Cio) 2 (Mds) 2) (Cisa) 2-Year-Old (Cica) 2.5 Cisco MDS 9000 Family Solution for Cloud Storage All enterprises are experiencing data growth. IDC reports that enterprise data stores will grow an average of 40 to 60 percent annually over the next 5

More information

ebook Utilizing MapReduce to address Big Data Enterprise Needs Leveraging Big Data to shorten drug development cycles in Pharmaceutical industry.

ebook Utilizing MapReduce to address Big Data Enterprise Needs Leveraging Big Data to shorten drug development cycles in Pharmaceutical industry. Utilizing MapReduce to address Big Data Enterprise Needs Leveraging Big Data to shorten drug development cycles in Pharmaceutical industry. www.persistent.com 3 4 5 5 7 9 10 11 12 13 From the Vantage Point

More information

How To Build A Cloud Computer

How To Build A Cloud Computer Introducing the Singlechip Cloud Computer Exploring the Future of Many-core Processors White Paper Intel Labs Jim Held Intel Fellow, Intel Labs Director, Tera-scale Computing Research Sean Koehl Technology

More information

Clodoaldo Barrera Chief Technical Strategist IBM System Storage. Making a successful transition to Software Defined Storage

Clodoaldo Barrera Chief Technical Strategist IBM System Storage. Making a successful transition to Software Defined Storage Clodoaldo Barrera Chief Technical Strategist IBM System Storage Making a successful transition to Software Defined Storage Open Server Summit Santa Clara Nov 2014 Data at the core of everything Data is

More information

Globus Research Data Management: Introduction and Service Overview

Globus Research Data Management: Introduction and Service Overview Globus Research Data Management: Introduction and Service Overview Kyle Chard chard@uchicago.edu Ben Blaiszik blaiszik@uchicago.edu Thank you to our sponsors! U. S. D E P A R T M E N T OF ENERGY 2 Agenda

More information

Cluster, Grid, Cloud Concepts

Cluster, Grid, Cloud Concepts Cluster, Grid, Cloud Concepts Kalaiselvan.K Contents Section 1: Cluster Section 2: Grid Section 3: Cloud Cluster An Overview Need for a Cluster Cluster categorizations A computer cluster is a group of

More information

Data processing goes big

Data processing goes big Test report: Integration Big Data Edition Data processing goes big Dr. Götz Güttich Integration is a powerful set of tools to access, transform, move and synchronize data. With more than 450 connectors,

More information

How To Build A Clustered Storage Area Network (Csan) From Power All Networks

How To Build A Clustered Storage Area Network (Csan) From Power All Networks Power-All Networks Clustered Storage Area Network: A scalable, fault-tolerant, high-performance storage system. Power-All Networks Ltd Abstract: Today's network-oriented computing environments require

More information

Understanding the Benefits of IBM SPSS Statistics Server

Understanding the Benefits of IBM SPSS Statistics Server IBM SPSS Statistics Server Understanding the Benefits of IBM SPSS Statistics Server Contents: 1 Introduction 2 Performance 101: Understanding the drivers of better performance 3 Why performance is faster

More information

StorReduce Technical White Paper Cloud-based Data Deduplication

StorReduce Technical White Paper Cloud-based Data Deduplication StorReduce Technical White Paper Cloud-based Data Deduplication See also at storreduce.com/docs StorReduce Quick Start Guide StorReduce FAQ StorReduce Solution Brief, and StorReduce Blog at storreduce.com/blog

More information

Pentaho High-Performance Big Data Reference Configurations using Cisco Unified Computing System

Pentaho High-Performance Big Data Reference Configurations using Cisco Unified Computing System Pentaho High-Performance Big Data Reference Configurations using Cisco Unified Computing System By Jake Cornelius Senior Vice President of Products Pentaho June 1, 2012 Pentaho Delivers High-Performance

More information

Globus Research Data Management: Introduction and Service Overview. Steve Tuecke Vas Vasiliadis

Globus Research Data Management: Introduction and Service Overview. Steve Tuecke Vas Vasiliadis Globus Research Data Management: Introduction and Service Overview Steve Tuecke Vas Vasiliadis Presentations and other useful information available at globus.org/events/xsede15/tutorial 2 Thank you to

More information

Data management challenges in todays Healthcare and Life Sciences ecosystems

Data management challenges in todays Healthcare and Life Sciences ecosystems Data management challenges in todays Healthcare and Life Sciences ecosystems Jose L. Alvarez Principal Engineer, WW Director Life Sciences jose.alvarez@seagate.com Evolution of Data Sets in Healthcare

More information

cloud functionality: advantages and Disadvantages

cloud functionality: advantages and Disadvantages Whitepaper RED HAT JOINS THE OPENSTACK COMMUNITY IN DEVELOPING AN OPEN SOURCE, PRIVATE CLOUD PLATFORM Introduction: CLOUD COMPUTING AND The Private Cloud cloud functionality: advantages and Disadvantages

More information

WOS Cloud. ddn.com. Personal Storage for the Enterprise. DDN Solution Brief

WOS Cloud. ddn.com. Personal Storage for the Enterprise. DDN Solution Brief DDN Solution Brief Personal Storage for the Enterprise WOS Cloud Secure, Shared Drop-in File Access for Enterprise Users, Anytime and Anywhere 2011 DataDirect Networks. All Rights Reserved DDN WOS Cloud

More information

UNINETT Sigma2 AS: architecture and functionality of the future national data infrastructure

UNINETT Sigma2 AS: architecture and functionality of the future national data infrastructure UNINETT Sigma2 AS: architecture and functionality of the future national data infrastructure Authors: A O Jaunsen, G S Dahiya, H A Eide, E Midttun Date: Dec 15, 2015 Summary Uninett Sigma2 provides High

More information

Scalable Services for Digital Preservation

Scalable Services for Digital Preservation Scalable Services for Digital Preservation A Perspective on Cloud Computing Rainer Schmidt, Christian Sadilek, and Ross King Digital Preservation (DP) Providing long-term access to growing collections

More information

Analyzing HTTP/HTTPS Traffic Logs

Analyzing HTTP/HTTPS Traffic Logs Advanced Threat Protection Automatic Traffic Log Analysis APTs, advanced malware and zero-day attacks are designed to evade conventional perimeter security defenses. Today, there is wide agreement that

More information

Tableau Online. Understanding Data Updates

Tableau Online. Understanding Data Updates Tableau Online Understanding Data Updates Author: Francois Ajenstat July 2013 p2 Whether your data is in an on-premise database, a database, a data warehouse, a cloud application or an Excel file, you

More information

Increased Security, Greater Agility, Lower Costs for AWS DELPHIX FOR AMAZON WEB SERVICES WHITE PAPER

Increased Security, Greater Agility, Lower Costs for AWS DELPHIX FOR AMAZON WEB SERVICES WHITE PAPER Increased Security, Greater Agility, Lower Costs for AWS DELPHIX FOR AMAZON WEB SERVICES TABLE OF CONTENTS Introduction... 3 Overview: Delphix Virtual Data Platform... 4 Delphix for AWS... 5 Decrease the

More information

Part V Applications. What is cloud computing? SaaS has been around for awhile. Cloud Computing: General concepts

Part V Applications. What is cloud computing? SaaS has been around for awhile. Cloud Computing: General concepts Part V Applications Cloud Computing: General concepts Copyright K.Goseva 2010 CS 736 Software Performance Engineering Slide 1 What is cloud computing? SaaS: Software as a Service Cloud: Datacenters hardware

More information

Luncheon Webinar Series May 13, 2013

Luncheon Webinar Series May 13, 2013 Luncheon Webinar Series May 13, 2013 InfoSphere DataStage is Big Data Integration Sponsored By: Presented by : Tony Curcio, InfoSphere Product Management 0 InfoSphere DataStage is Big Data Integration

More information

CAREER TRACKS PHASE 1 UCSD Information Technology Family Function and Job Function Summary

CAREER TRACKS PHASE 1 UCSD Information Technology Family Function and Job Function Summary UCSD Applications Programming Involved in the development of server / OS / desktop / mobile applications and services including researching, designing, developing specifications for designing, writing,

More information

Lambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL. May 2015

Lambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL. May 2015 Lambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL May 2015 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved. Notices This document

More information