Data Analytics. Sequence Exosome RNAs




Sequence Exosome RNAs

The Exo-NGS service provides exosome researchers with a comprehensive, expert service to isolate and identify exosome-associated RNA biomarkers, leveraging the throughput and scalability of Illumina's MiSeq and HiSeq next-generation sequencing platforms. Most exosomal RNAs are less than 300 nucleotides in length, so small RNA libraries are well suited to this application. This turnkey service is tailored for researchers who want to identify novel exosome RNA biomarkers or to measure the abundance of such biomarkers in the exosomes of their model cellular systems or patient biofluids.

Input Sample Requirements

Biofluid         Volume
Serum            500 µl - 1 ml
Plasma           500 µl - 1 ml
Cell Media       5 ml - 10 ml
Urine            5 ml - 10 ml
Spinal Fluid     5 ml - 10 ml
Ascites Fluid    500 µl - 1 ml
Other            Inquire

Building Exosome RNA Library

1. Exosomes isolated from samples
2. Exo-RNAs purified with chromatography
3. Illumina bar-codes added and amplified
4. Library size-selection and PAGE purification
5. High sensitivity library QC with Bioanalyzer
6. Multiplexed NGS runs performed

Sequence Read Quality Assessment

Two rounds of quality checks are performed on the sequence data. The first round analyzes the raw sequencing data generated by the Illumina platform, after which library adaptors and Ns are trimmed. The second check is performed on the trimmed sequence data to ensure read quality before genome mapping.

[Figure: Sequence read quality across all bases. Read quality score (10 to 40, with Good, Fair and Poor bands) plotted against position in read (bp).]

Compare exo-RNA sequence profiles across patient samples.

SBI and Maverix Biomics have teamed up to provide a complete analytics solution for deep sequencing data of exosome-associated RNAs. The analysis service includes library sequence quality control metrics, data analysis for relative RNA abundance and identity, differential expression analysis, and visualization of the data in a cloud-based, private UCSC Genome Browser. Simplify and accelerate your exosome RNA biomarker discovery with the advanced bioinformatics analysis included in SBI's Exo-NGS service.

Genome Browser Data Mining

- Sequencing data uploaded
- Access to public databases
- Analyze with accepted analytics
- Visualize your data automatically
- Compare to known ENCODE data

Put your data into context. Publication-ready figures are generated automatically with the Exo-NGS service.

Data Analytics

Primary Data Deliverables
- Raw sequencing reads in FASTA format
- Sequencing read quality values

Analyzed Data (Small RNA Workflow)
- Relative abundance of each RNA type
- Table of counts of mature microRNAs
- RNA type charts
- Expression heatmaps

[Figure: RNA type chart and expression heatmap. RNA types in the legend: antisense RNA, CDBox, HAcaBox, lincRNA, LINE, LTR, microRNA, other ncRNA, piRNA, RefSeq exons, RefSeq introns, rRNA, scaRNA, SINE, tandem repeat, tRNA.]

White Paper: Exosome RNA-seq Analysis

The Maverix Analytic Platform facilitates discovery of small non-coding RNAs and biomarkers.

Maverix Biomics, Inc., 1670 S. Amphlett Blvd, Suite 214, San Mateo CA 94402, 650-388-9277, support@maverixbio.com

Table of Contents

Overview
Introduction
Approach
Analysis
Case Study
Summary
References

Overview

Thirty years ago, the extracellular vesicles known as exosomes were identified during studies of transferrin/receptor recycling in reticulocytes [1], but only in the last 5-6 years have studies ignited significant interest by elucidating their roles in pathogenesis, in cell-cell communication, in drug, vaccine and gene-vector delivery, and as reservoirs of biomarkers [2]. With research expanding into next-generation sequencing technologies, advances have accelerated in the areas of intercellular communication and disease-related biomarker discovery. RNA-seq facilitates this research, and the Maverix Analytic Platform has an analysis kit available to accelerate your discovery.

Introduction

Exosomes are small extracellular vesicles (30-100 nm in diameter) of endocytic origin. These nanovesicles are formed by inward budding of late endosomes to produce multivesicular endosomes (MVEs), sometimes referred to as multivesicular bodies, and are then released into the environment by fusion of the MVEs with the plasma membrane [3].

Figure 1. Release of microvesicles and exosomes. Microvesicles (MVs) bud directly from the plasma membrane, whereas exosomes are represented by small vesicles of different sizes that are formed as the intraluminal vesicles by budding into early endosomes and multivesicular endosomes (MVEs) and are released by fusion of MVEs with the plasma membrane. Other MVEs fuse with lysosomes. Red spots symbolize clathrin associated with vesicles at the plasma membrane (clathrin-coated vesicles [CCV]) or bilayered clathrin coats at endosomes. Membrane-associated and transmembrane proteins on vesicles are represented as triangles and rectangles, respectively. Arrows represent proposed directions of protein and lipid transport between organelles and between MVEs and the plasma membrane for exosome secretion. Illustration from Raposo and Stoorvogel, 2013 [3].

Exosomes have been isolated from various sources, including amniotic fluid, bile, blood, breast milk, cerebrospinal fluid, malignant ascites fluid, saliva, semen and urine [3]. Their ability to be easily sampled from a patient's body fluids by relatively noninvasive methods makes them a valuable reagent. Exosomes are released by most cell types, with studies demonstrating various roles in antigen presentation, cell-cell communication, immune response, and dissemination of infectious agents.

Initially, the molecular components of exosomes were thought to be merely cell debris, but subsequent studies revealed otherwise. Proteins enriched in exosomes are derived from the endosomes, plasma membrane and the cytosol, and not from the nucleus, mitochondria, or endoplasmic reticulum. Some common protein families identified in exosomes include chaperones, cytoskeletal proteins, ESCRT proteins, tetraspanin proteins, trimeric G protein subunits, and other proteins involved in transport and fusion. Much of the molecular content of exosomes is cell-specific in nature. The exosomal proteins, mRNAs, microRNAs, and lipids identified over the past several years have been curated in the extracellular vesicle database ExoCarta (http://www.exocarta.org/) [4].

Exosome small RNA    Description
repeat RNA           LINEs, LTRs, and simple repeat sequences
tRF                  tRNA fragments
vRNA                 vault RNA, part of the vault ribonucleoprotein complex
SRP-RNA              signal recognition particle RNA
Y-RNA                ncRNA component of the Ro ribonucleoprotein particle
rRNA                 ribosomal RNA (<200 nt)
miRNA                microRNA
piRNA                piwi-interacting RNA
snRNA                small nuclear RNA
snoRNA               small nucleolar RNA

Table 1. Exosomal small RNAs. Various small non-coding RNAs have been identified in exosomes, many of which play roles in the conveyance of genetically encoded messages between cells [5].

Next-generation sequencing (NGS) has allowed researchers to focus on small non-coding RNAs (ncRNAs) in exosomes.
ExoCarta catalogues the small ncRNAs known as microRNAs (miRNAs), which are typically about 22 nucleotides in length. miRNAs are derived from longer stem-loop precursors and are known to function as transcriptional and post-transcriptional regulators of gene expression. In addition to miRNAs, many small RNAs have been isolated from exosomes (see Table 1) and are presumed to play roles in communication between cells, potentially modifying the function of target cells [5]. The resurgence of interest in exosomes is accelerating the understanding of how these small RNAs facilitate communication between cells and organs, act through gene regulatory functions, and play important roles in a wide range of physiological and pathological processes.

Approach

RNA-seq is an NGS method that uses high-throughput sequencing technology to sequence the RNA content of an organism, tissue or cell. RNA-seq allows a researcher to define and analyze the transcriptome, which represents the full RNA complement and includes mRNA, rRNA, tRNA and other ncRNAs. To focus on small ncRNAs, size fractionation is performed prior to the ligation of adaptors and conversion to cDNA in the subsequent library preparation steps.

System Biosciences (SBI) and Maverix Biomics have teamed up to provide a combined solution for the isolation, purification, analysis and visualization of exosome RNA NGS data. SBI has engineered tools and NGS services to accelerate the study of exosomes and exosome RNA biomarkers, including a simple one-step process for isolating exosomes from biofluids, followed by RNA purification and RNA-seq. The sequence reads can then be analyzed using the Exosome Small RNA-seq Analysis Kit on the cloud-based Maverix Analytic Platform, which provides small RNA analysis and data visualization. The small RNA-seq analysis kit facilitates identification of novel small ncRNAs, biomarker discovery, transcription start site detection, and the study of small RNA regulatory pathways.

Figure 2. The Exosome Small RNA-seq Analysis Kit on the Maverix Analytic Platform.

The Maverix Analytic Platform allows researchers to upload their sequence data quickly and launch an analysis kit built on peer-reviewed open-source algorithms that have been highly cited in life sciences journal articles and are now widely accepted as the gold standard for NGS analysis. Table 2 lists the open-source software utilized in the Exosome Small RNA-seq Analysis Kit. Other software used in the analysis includes the licensed UCSC Genome Browser and utilities [6], as well as applications developed in-house by Maverix Biomics.

Analysis

The Exosome Small RNA-seq Analysis Kit begins with a data quality check of the input sequence using FastQC, an open-source quality control (QC) tool for high-throughput sequence data [7]. FastQC analyzes the uploaded raw sequence reads to reveal the quality of the data and to inform the subsequent preprocessing steps in the analysis.

Following QC, the analysis moves to preprocessing of the RNA-seq reads to improve the quality of the data input for read mapping. The open-source tools used are fastq-mcf, part of the ea-utils package [8], and PRINSEQ [9]. Data preprocessing detects and removes Ns at the ends of reads, trims sequencing adapters, and filters reads for quality and length. FastQC is then re-run to analyze the trimmed reads, allowing a before-and-after comparison. The summary report generated provides a quality assurance check to validate the processed set of input data used in the subsequent read mapping step.

Software    Analysis Kit Step                                        Citation
FastQC      Data quality control                                     7
ea-utils    Data preprocessing                                       8
PRINSEQ     Data preprocessing                                       9
Bowtie      Sequence mapping                                         10
SAMtools    Alignment processing, genome browser track generation    11
Picard      Alignment processing, genome browser track generation    12
R           Statistics and visualization generation                  13

Table 2. Open-source software used in the Exosome Small RNA-seq Analysis Kit. List of the open-source software and tools used in the analysis, including the associated analysis step and the software citation.

The improved set of sequence reads is mapped to the reference genome using Bowtie, an ultrafast, memory-efficient short read aligner [10], followed by the generation of a mapping summary report for review. Using the open-source software SAMtools [11] and Picard [12], expression analyses are carried out, including computation of read coverage, determination of ncRNA abundance and, when applicable, differential expression analysis across samples. Expression statistics are calculated and visualized using R, a software environment for statistical computing and graphics [13].

The Exosome Small RNA-seq analysis produces a summary of results, including expression statistics and chromosome distribution, as well as genome browser tracks of read alignment and read coverage for analysis in a genomic context. The Maverix Analytic Platform allows researchers to visualize their data and analytic results using an integrated UCSC Genome Browser, automatically configured for their specific organism of interest. With access to data in the UCSC Genome Browser directly from the platform, researchers can add custom tracks to the browser, securely surf their data, and easily share or publish their results.
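The preprocessing step described above (remove Ns at read ends, trim adapters, filter on quality and length) can be illustrated with a minimal sketch. This is not how fastq-mcf or PRINSEQ are implemented, and it does not reproduce the kit's settings. The adapter sequence, the Phred+33 quality encoding, the exact-match adapter search and every threshold below are assumptions chosen for illustration only.

```python
from statistics import mean
from typing import List, Tuple

# Assumption: placeholder 3' adapter; substitute the sequence used by your library kit.
ADAPTER = "TGGAATTCTCGGGTGCCAAGG"


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode a FASTQ quality string (assumed Phred+33) into integer scores."""
    return [ord(ch) - offset for ch in quality]


def trim_read(seq: str, qual: str, adapter: str = ADAPTER,
              min_overlap: int = 8) -> Tuple[str, str]:
    """Clip the 3' adapter, then strip Ns from both ends of the read."""
    # Exact match on the first `min_overlap` adapter bases (a simplification:
    # real trimmers tolerate mismatches and partial adapters at the read end).
    idx = seq.find(adapter[:min_overlap])
    if idx != -1:
        seq, qual = seq[:idx], qual[:idx]
    # Keep the quality string aligned with the sequence while removing Ns.
    left = len(seq) - len(seq.lstrip("N"))
    right = len(seq.rstrip("N"))
    return seq[left:right], qual[left:right]


def passes_filter(seq: str, qual: str, min_length: int = 15,
                  min_mean_quality: float = 20.0) -> bool:
    """Keep reads that are long enough and have an acceptable mean quality."""
    if len(seq) < min_length:
        return False
    return mean(phred_scores(qual)) >= min_mean_quality


# Made-up read: N + 20-nt insert + N + the first 12 adapter bases.
raw_seq = "N" + "ACGT" * 5 + "N" + ADAPTER[:12]
raw_qual = "I" * len(raw_seq)
clean_seq, clean_qual = trim_read(raw_seq, raw_qual)
print(clean_seq, passes_filter(clean_seq, clean_qual))
```

Reads that fail `passes_filter` correspond to reads discarded during preprocessing; the surviving reads would be the input to the mapping step.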

Case Study

Exosomes have been isolated from many body fluids including breast milk, which is a complex liquid containing immunological components that can impact the development of the offspring's immune system. In their 2012 publication [14], Zhou, et al. investigated the transcriptome of exosomes in human breast milk, with a particular focus on miRNAs whose levels are frequently elevated in diseased states. In their study, exosomes were isolated with SBI's ExoQuick exosome precipitation reagent from the breast milk of four women when their infants were 60 days old. Four exosomal small RNA libraries were constructed and each was individually sequenced, generating a total of ~86.37 million 36-nt raw reads.

Figure 3. Sequencing read mapping rate. The charts show the percentage and number of reads, respectively, for trimmed, mapped and unmapped reads for each of the four samples.
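The per-sample quantities plotted in Figure 3 can be derived from three counts. The sketch below assumes that every raw read ends up in exactly one category (trimmed, mapped or unmapped), so the trimmed count is whatever remains after the mapped and unmapped reads are subtracted from the raw total. The class name and the example numbers are hypothetical and are not taken from the breast milk dataset.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ReadCounts:
    """Read totals for one sample (hypothetical container, for illustration)."""
    raw: int       # reads entering the analysis
    mapped: int    # reads aligned to the reference genome
    unmapped: int  # reads that passed preprocessing but did not align

    @property
    def trimmed(self) -> int:
        # Reads removed during preprocessing, assuming the three
        # categories partition the raw reads.
        return self.raw - self.mapped - self.unmapped

    def percentages(self) -> Dict[str, float]:
        """Each category as a percentage of the raw read count."""
        counts = {"trimmed": self.trimmed,
                  "mapped": self.mapped,
                  "unmapped": self.unmapped}
        return {name: 100 * n / self.raw for name, n in counts.items()}


# Made-up numbers, chosen only to make the arithmetic easy to follow.
sample = ReadCounts(raw=1000, mapped=700, unmapped=200)
print(sample.trimmed, sample.percentages())
```

Reporting both the counts and the percentages, as Figure 3 does, makes it easier to tell a low mapping rate caused by aggressive trimming apart from one caused by reads that fail to align.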

Taking the small RNA sequence data from this study (deposited by the authors in NCBI's Gene Expression Omnibus under accession GSE32253), we launched the Exosome Small RNA-seq Analysis Kit on the Maverix Analytic Platform using the raw sequence reads as input. Figure 3 shows the sequencing read mapping rate, with trimmed, mapped and unmapped reads displayed as a percentage of reads and as total read counts. The trimmed reads are the set of reads filtered out during the data preprocessing step of the analysis. The remaining reads were used as input for the mapping step and are displayed in the chart as either mapped or unmapped reads.

Following read mapping, data analytics were undertaken, with expression analysis that included determination of small ncRNA and repeat element abundance levels. Moving beyond the miRNA analysis undertaken in the original publication, the Exosome Small RNA-seq Analysis Kit identifies and maps miRNAs, tRNAs, small rRNAs, repeat elements, antisense transcripts and a variety of small ncRNAs. Abundance levels are calculated and an expression summary chart is generated for ease of visualization.

Figure 4. Exosome small RNA expression summary. Small RNAs, including tRNA, rRNA, miRNA, snoRNA and other ncRNAs, as well as antisense transcripts and repeat sequences, are displayed in a pie chart for each of the four breast milk exosome samples.
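An expression summary of the kind shown in Figure 4 reduces to a per-sample tally of reads by RNA type, expressed as shares of the total. The following sketch assumes each mapped read has already been assigned to a single annotated feature with a known RNA type. The function name, the input layout and the counts are hypothetical, and the kit's actual abundance calculation may differ.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def abundance_by_type(assignments: Iterable[Tuple[str, int]]) -> Dict[str, float]:
    """Sum read counts per RNA type and return each type's share in percent.

    `assignments` yields (rna_type, read_count) pairs, one per annotated
    feature. Types are returned from most to least abundant.
    """
    totals: Counter = Counter()
    for rna_type, count in assignments:
        totals[rna_type] += count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {rna_type: 100.0 * count / grand_total
            for rna_type, count in totals.most_common()}


# Made-up feature counts for a single sample.
features = [("miRNA", 300), ("tRNA", 500), ("miRNA", 100), ("rRNA", 100)]
shares = abundance_by_type(features)
print(shares)
```

The resulting dictionary maps directly onto pie chart slices, one chart per sample.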

Another category of results output from the analysis kit is heat maps, which display differential expression data. The heat maps can expedite insights into which transcripts, repeat elements and small RNAs are enriched in exosomes. These graphical representations of data, in which the individual values are represented as colors, provide a mechanism for visual analysis of samples side-by-side. The color variations indicate the enrichment levels, facilitating the identification of differential expression between samples or conditions and accelerating the discovery of biomarkers.

The heat map navigation makes it easy to explore your results. Hovering over a region of interest reveals a tooltip with the RNA type, name, and chromosomal location with coordinates (Fig. 5). Clicking on a region of the heat map will take you to the associated locus in the UCSC Genome Browser, making it easy to identify regions of interest on the heat map and then jump directly to the genomic context for further analysis.

Figure 5. Heat map visualization of small RNA expression. The heat map on the left displays the four samples side-by-side (columns 1-4) for a visual comparison of expression levels by color. The fifth column in the heat map distinguishes components by chromosome location, where chromosomes are classified by unique colors. Hovering over a region of the heat map brings up the tooltip as shown, with information about the mapped exosomal component, including name, chromosomal position, and differential expression values for each sample. Clicking on a region of interest on the heat map, or alternatively clicking on the component name in the tooltip, will bring up the associated region in the integrated UCSC Genome Browser for visualization of exosome components in a genomic context.

The final output from the analysis includes genome browser tracks that display the mapped reads and read coverage data. Browser tracks can be visualized in the UCSC Genome Browser directly within the Maverix Analytic Platform (Fig. 6). Viewing data in the genome browser allows for analysis in a genomic context and makes sharing and collaborating easy and secure. Once the analysis kit has provided its output, the RNA-seq reads from each of the four samples can be viewed together in the browser to facilitate visual comparisons of read lengths, coverage, and differential expression.
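A read coverage track records, for each base in a region, how many mapped reads overlap it. The sketch below shows one common way to compute this from alignment intervals, using a difference array followed by a running sum. It assumes 0-based, half-open coordinates on a single chromosome and ignores strand, gaps and spliced alignments. It is a toy illustration and not the SAMtools or Picard implementation used by the kit.

```python
from typing import Iterable, List, Tuple


def coverage(alignments: Iterable[Tuple[int, int]],
             region_start: int, region_end: int) -> List[int]:
    """Per-base read depth over the half-open region [region_start, region_end).

    Each alignment is a half-open (start, end) interval on the same chromosome.
    """
    length = region_end - region_start
    diff = [0] * (length + 1)
    for start, end in alignments:
        # Clip the alignment to the region and shift to region-relative offsets.
        s = max(start, region_start) - region_start
        e = min(end, region_end) - region_start
        if s < e:
            diff[s] += 1  # depth rises where the read begins
            diff[e] -= 1  # and falls just past its last base
    depth, running = [], 0
    for delta in diff[:length]:
        running += delta
        depth.append(running)
    return depth


# Three made-up alignments overlapping the region 100-108.
reads = [(100, 105), (102, 108), (90, 101)]
track = coverage(reads, 100, 108)
print(track)
```

The difference array keeps the cost proportional to the number of reads plus the region length, rather than to the sum of all read lengths.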

Figure 6. RNA expression in the genome browser. Exosomal RNA expression viewed as mapped reads and read coverage via browser tracks visualized in the UCSC Genome Browser. Examples of small RNAs identified in breast milk exosomes include miRNA (A), tRNA (B), as well as rRNA and snoRNA (C).

Summary

The study by Zhou, et al. proposed that exosomal miRNAs are transferable genetic material from mother to infant, and are essential for the development of the immune system in infants [14]. We used the breast milk dataset to benchmark the Exosome Small RNA-seq Analysis Kit in the case study reported in this white paper. Recent exosome studies focus on identification of proteins, lipids, mRNAs, and microRNAs [4], the last of which were the focus of Zhou, et al. in the human breast milk samples. Our analysis, carried out via the Exosome Small RNA-seq Analysis Kit, expands the number and type of small RNAs identified from the dataset. The analytic results show that there is a broad range of small ncRNAs, transcripts and repeat elements within the exosomes, many of which may represent novel immune-related components in the breast milk.

With the combined solution offered by SBI and Maverix Biomics, researchers have access to a comprehensive workflow, from isolation and purification of exosomes through cloud-based analysis and visualization of exosome-associated RNA biomarkers. With the analysis results available for visualization within the integrated UCSC Genome Browser, the Maverix Analytic Platform makes it easy to download graphs and charts as images and to create publication-ready figures from the genome browser views.

References

1. Endocytosis and intracellular processing of transferrin and colloidal gold-transferrin in rat reticulocytes: demonstration of a pathway for receptor shedding. Harding C, Heuser J, Stahl P. Eur J Cell Biol. 1984 Nov;35(2):256-63.
2. Vesiclepedia: a compendium for extracellular vesicles with continuous community annotation. Kalra H, et al. PLoS Biol. 2012;10(12):e1001450. doi: 10.1371/journal.pbio.1001450
3. Extracellular vesicles: exosomes, microvesicles, and friends. Raposo G, Stoorvogel W. J Cell Biol. 2013 Feb 18;200(4):373-83. doi: 10.1083/jcb.201211138
4. ExoCarta 2012: database of exosomal proteins, RNA and lipids. Mathivanan S, Fahner CJ, Reid GE, Simpson RJ. Nucleic Acids Res. 2012 Jan;40(Database issue):D1241-4. doi: 10.1093/nar/gkr828
5. Deep sequencing of RNA from immune cell-derived vesicles uncovers the selective incorporation of small non-coding RNA biotypes with potential regulatory functions. Nolte-'t Hoen EN, Buermans HP, Waasdorp M, Stoorvogel W, Wauben MH, 't Hoen PA. Nucleic Acids Res. 2012 Oct;40(18):9272-85. doi: 10.1093/nar/gks658
6. The UCSC genome browser and associated tools. Kuhn RM, Haussler D, Kent WJ. Brief Bioinform. 2013 Mar;14(2):144-61. doi: 10.1093/bib/bbs038
7. FastQC: A quality control tool for high throughput sequence data. Andrews S. http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
8. ea-utils: "Command-line tools for processing biological sequencing data". Aronesty E. 2011. http://code.google.com/p/ea-utils
9. Quality control and preprocessing of metagenomic datasets. Schmieder R, Edwards R. Bioinformatics. 2011 Mar 15;27(6):863-4. doi: 10.1093/bioinformatics/btr026
10. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Langmead B, Trapnell C, Pop M, Salzberg SL. Genome Biol. 2009;10(3):R25. doi: 10.1186/gb-2009-10-3-r25
11. The Sequence Alignment/Map format and SAMtools. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R; 1000 Genome Project Data Processing Subgroup. Bioinformatics. 2009 Aug 15;25(16):2078-9. doi: 10.1093/bioinformatics/btp352
12. Picard: A set of tools (in Java) for working with next generation sequencing data in the BAM format. http://picard.sourceforge.net
13. R: A language and environment for statistical computing. R Development Core Team (2008). R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0. http://www.r-project.org
14. Immune-related microRNAs are abundant in breast milk exosomes. Zhou Q, Li M, Wang X, Li Q, Wang T, Zhu Q, Zhou X, Wang X, Gao X, Li X. Int J Biol Sci. 2012;8(1):118-23. doi: 10.7150/ijbs.8.118

Discover the Unexpected

www.maverixbio.com

© 2013 Maverix Biomics, Inc. All Rights Reserved.