Bioinformatics Unit Department of Biological Services Get to know us
Domains of Activity IT & programming Microarray analysis Sequence analysis Bioinformatics Team Biostatistical support NGS data analysis Image analysis
Mode of work Consulting (first meeting free) Experimental design (kickoff meeting) Analyzing Complete project analysis done by our team Guided analysis In your lab analysis (group meetings) Training, personally and in groups Teaching FGS courses and workshops Help desk
Mode of work Consulting Experimental design (kickoff meeting) Analyzing Complete project analysis done by our team Guided analysis (for NGS) In your lab analysis Training, personally and in groups Teaching FGS courses and workshops Help desk
Guided analysis: Simplifying bioinformatics analysis Command line operations Graphical interface and built-in pipelines (Chipster)
Types of Analysis
Aim: Find and understand the biological meaning of the data
But how do we get to the aim? We can help you:
- choose the proper tools and build proper workflows for analysis
- perform analysis using existing tools
- customize existing tools to specific data and aims
- develop new methods and algorithms
Previous achievements Fruitful collaborations resulting in ~40 publications since 2013 FGS courses given in 2015 (also offered in 2016): Statistical Principles in the Analysis of Research Data Practical Image Analysis for Biology Introduction to Deep Sequencing Analysis course Workshops given in 2015 (2016 to be announced): CRISPR Design ChIP-Seq: Using Deep Sequencing to Discover Protein-DNA Interactions Ilastik: machine learning tool for image analysis Population structure
Infrastructure Workstations for bioinformatics applications and image analysis (@Levine bldg) Unix servers for intensive data analysis (local/wexac cluster nodes) Software licenses for: Bioinformatics Biostatistics Image Analysis BioImg Storage Server: Automatically copies data from Imaging equipment & FACS at core facilities and individual labs Lets the users access, share, annotate the data everywhere on campus Managing Biological Services LIMS
BioImaging Informatics How to store / display / enhance / quantify Image Data Light microscopy, EM, Histology, MRI, CT Quantitative Image Analysis Image data in Biology becomes larger and more important Our aims: Replace subjective visual inspection and manual measurement by objective quantitative computerized image processing & analysis Design high throughput pipelines for image data Record and analyze tens of parameters per cell Meet reviewer requirements for quantitative measurements
Common Image Quantification Questions
- What is the mean intensity of the green channel in wild-type and knockout?
- What is the length or area of a structure in each cell?
- How does the intensity of the GFP change over a time course?
- How is the protein distributed, perhaps relative to a particular structure?
- How many cells or spots are in the image?
- How fast and far does the object move?
- How much does the green colocalize with the red?
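Two of the questions above (mean green-channel intensity, and green/red colocalization) can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the unit's pipeline: the function names `mean_intensity` and `pearson_colocalization`, the tiny 2x2 images, and the choice of the Pearson coefficient as the colocalization measure are all assumptions made for the example. Real analyses would normally run on full images in a dedicated image-analysis package.

```python
from math import sqrt


def mean_intensity(channel, mask=None):
    """Mean pixel value of a 2-D channel, optionally restricted to mask pixels."""
    values = [
        px
        for r, row in enumerate(channel)
        for c, px in enumerate(row)
        if mask is None or mask[r][c]
    ]
    return sum(values) / len(values)


def pearson_colocalization(green, red):
    """Pearson correlation between two channels of identical shape."""
    g = [px for row in green for px in row]
    r = [px for row in red for px in row]
    mean_g, mean_r = sum(g) / len(g), sum(r) / len(r)
    cov = sum((a - mean_g) * (b - mean_r) for a, b in zip(g, r))
    var_g = sum((a - mean_g) ** 2 for a in g)
    var_r = sum((b - mean_r) ** 2 for b in r)
    return cov / sqrt(var_g * var_r)


# Hypothetical toy images: the red channel is exactly half the green channel.
green = [[10, 20], [30, 40]]
red = [[5, 10], [15, 20]]
cell_mask = [[0, 0], [1, 1]]  # hypothetical segmentation: bottom row is "the cell"

print(mean_intensity(green))             # 25.0
print(mean_intensity(green, cell_mask))  # 35.0
print(pearson_colocalization(green, red))  # 1.0
```

Comparing wild-type and knockout would then amount to computing `mean_intensity` per cell in each condition and handing the two sets of values to a statistical test.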
Biostatistics Biostatistical support We provide biostatistical support at all stages of the experiment - from planning to final analysis and presentation. Experimental design, sample size, data organization, choosing the proper tests, interpretation and presentation. CyTOF analysis CyTOF (Cytometry by Time of Flight) can characterize single cells with dozens of probes (antibodies) per cell. Analysis of the results requires recently developed tools for multi-parametric data. We provide support for these tools, either through training or conducting the complete analysis.
Biostatistics Possible questions: How many samples should I use in each of my treatments? How do I analyze the effect of several experimental factors and/or time points? Is it possible to use data from an animal that survived only half of the experiment? How do I handle multiparametric data, e.g. from microarrays or CyTOF?
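As a rough illustration of the first question, the textbook normal-approximation formula for comparing two group means is n = 2 * ((z_(1-alpha/2) + z_power) * sigma / delta)^2 samples per group. The sketch below implements only that formula; the function name `samples_per_group` and the default alpha and power are assumptions for the example, and real designs (unequal variances, several factors, repeated measures, small samples) need a proper consultation.

```python
from math import ceil
from statistics import NormalDist


def samples_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided, two-sample comparison of means.

    Normal approximation only: delta is the smallest difference worth
    detecting, sigma the assumed common standard deviation.
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * sigma / delta) ** 2
    return ceil(n)


# Detecting a difference of one standard deviation vs. half a standard deviation.
print(samples_per_group(delta=1.0, sigma=1.0))  # 16
print(samples_per_group(delta=0.5, sigma=1.0))  # 63
```

Halving the detectable difference roughly quadruples the required sample size, which is why the effect size should be discussed before the experiment starts.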
Microarray Analysis
- Differential gene expression analysis
- Clustering of genes/conditions
- Mining and reanalysis of microarray data from published studies (GEO)
- Comparative analysis of multiple microarray experiments at the gene and pathway levels
What can NGS do for you?
- Mutation discovery
- mRNA expression and discovery
- microRNA expression and discovery
- Alternative splicing and allele-specific expression
- Sequencing novel genomes
- Protein-DNA interactions
[Figure: example reads AACTGGTAC, AACTCGTAC, AACTGGTAC, AACTGGTAC]
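The idea behind mutation discovery can be sketched with the four short reads shown on this slide. The sketch assumes the reads are already aligned to the reference at offset 0 and that the reference is AACTGGTAC (the majority sequence, an assumption made for illustration). The function name `call_mismatches` is hypothetical. Real variant callers such as those listed under Genomic Variation also model base quality, mapping quality and sequencing error.

```python
from collections import Counter


def call_mismatches(reference, reads):
    """List (position, ref_base, alt_base, alt_count, depth) for non-reference bases.

    Positions are 0-based; reads are assumed to be aligned at offset 0.
    """
    calls = []
    for pos, ref_base in enumerate(reference):
        # Pile up the bases that all reads show at this position.
        column = Counter(read[pos] for read in reads if pos < len(read))
        depth = sum(column.values())
        for base, count in sorted(column.items()):
            if base != ref_base:
                calls.append((pos, ref_base, base, count, depth))
    return calls


reference = "AACTGGTAC"  # assumed reference for this toy example
reads = ["AACTGGTAC", "AACTCGTAC", "AACTGGTAC", "AACTGGTAC"]

print(call_mismatches(reference, reads))  # [(4, 'G', 'C', 1, 4)]
```

One read out of four carries a C instead of a G at position 4. Whether that is a true variant or a sequencing error is exactly what depth and quality information are needed to decide.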
Next Generation Sequencing Tools by Applications
- NGS raw reads QC: FastQC, FastQ
- Mapping: Bowtie, BWA, STAR, TopHat
- Genomic Variation: IGV, VCFtools, FreeBayes, SAMtools, GATK, Plink
- De novo Assembly: Velvet, Newbler
- Metagenomics: Metalook, Metamine, Metastats, MEGAN
- RNA-Seq: RSEM, DESeq, Cufflinks, Trinity
- ChIP-Seq: MACS, HOMER, MEME-ChIP, CEAS, GREAT
Werner T Brief Bioinform 2010;11:499-511
CLASSICAL BIOINFORMATICS DNA RNA Basic Tools for all: Database Searches Alignments PROTEIN
DNA Analyses Genome Browsers Comparative Genomics Genomic Primer Design Gene Prediction Phylogenetics Gene Structure SNP Interpretation CRISPR Sanger Sequence Troubleshooting
RNA Analyses EST/gene building Splice Variants Primer Design 3' UTR signals siRNA Secondary Structure Cloning/Plasmid Design Restriction Mapping Expression miRNA Long non-coding RNA Sanger Sequence Troubleshooting
Protein Analyses Protein-protein interactions Subcellular Localization Antigen Design Phylogenetics Motif Finding/Definition Secondary Structure Post-translational modifications
Transcriptional Control (DNA-Protein-RNA interactions) Transcription Start Site Definition Promoter Analyses Conservation SNP Interpretation Transcription Factor Binding Site Definition Transcription Factor Binding Site Prediction
Site-licenses for software packages
- Sequence analysis: Lasergene, MacVector, Sequencher, SnapGene
- Promoter analysis: Genomatix Genome Analyzer (GGA)
- Pathways and system analysis: Ingenuity IPA Data Analysis
- Microarray and NGS statistical analysis: PARTEK Genomics Suite
- Image Analysis: Imaris, AutoQuant, Volocity, Analyze, Avizo (by CRS), Arivis
- Tool for studying publicly available expression data (GEO): Genevestigator
Bioinformatics Unit - Contacts
- Programming projects and programming help: Dr. Jaime Prilusky
- Genomics, Promoter analysis, Sequence analysis and Phylogenetics: Dr. Shifra Ben-Dor
- High-throughput genomics and Next-generation sequencing: Dr. Ester Feldmesser, Dr. Dena Leshkowitz and Dr. Naama Kopelman
- Microarray analysis, Sequence analysis, software installation, general issues: Irit Orr
- Biostatistical support: Dr. Ron Rotkopf
- BioImaging analysis: Ofra Golani
- Infrastructure: Ofra Golani (image related), Irit Orr (bioinformatics related), Kiril Kogan, Jaime Prilusky
- Head of unit: Dr. Dena Leshkowitz (x6330)
- Project coordinator: Irit Orr (x2470)
The End and hopefully the beginning of collaboration on your projects!
Links to our websites: General Information: http://www.weizmann.ac.il/biological_services/bioinformatics-about Sequence Analysis (tools): http://bip.weizmann.ac.il/ Image Analysis: http://www.weizmann.ac.il/vet/ic/informatics/about-service
Site-licenses for software packages
GCK (Gene Construction Kit) eliminates tedious examination of DNA sequence data by automatically identifying open reading frames (ORFs), keeping track of sticky ends during cutting and pasting of restriction enzyme digestion fragments, assisting with PCR primer design, and enabling comprehensive annotation of DNA sequence features. This DNA analysis software allows multiple files to be opened and displayed simultaneously, so DNA sequences can easily be copied and pasted between plasmids and vectors to represent real-world DNA cloning protocols. The Gene Construction Kit software is available for both Windows and Macintosh users.
MacVector Major Features: Graphical Sequence Editing, Gateway and Topo Cloning, Auto Annotation, Primer Design, DNA Analysis, Protein Analysis, Database Searching, Multiple Sequence Alignment, Sequence Assembly
Sequencher features: Sequence editing, Sequence trimming, Sequence assembly, Assemble to reference, Multiple alignment, Restriction mapping, Confidence value, SNP detection, Automated analysis
SnapGene for Cloning features: Gibson Assembly, Restriction cloning, PCR & Mutagenesis, Agarose Gel Electrophoresis, Enzyme Sites, Primers, ORFs, Chromosome size sequencing
Ingenuity IPA Data Analysis IPA Core Analysis has multiple ways of relating the molecules in your dataset to the body of information in the Ingenuity Knowledge Base. Biological functions and diseases that are over-represented in your data, and the predicted directional effects on these functions and diseases. Signaling and metabolic canonical pathways enriched in your data. Predicted upstream regulators that might explain the changes observed in your data. Molecular networks (algorithmically generated pathways describing potential molecular interactions in your experimental system)
PARTEK Genomics Suite Partek Genomics Suite is a comprehensive suite of advanced statistics and interactive data visualization specifically designed to reliably extract biological signals from noise and includes all of the functionality of the Discovery Suite. Designed for high-dimensional genomic studies containing thousands of samples, Partek GS is fast, memory efficient and will analyze large data sets on a personal computer. It supports a complete workflow including convenient data access tools, identification and annotation of important biomarkers, and construction and validation of predictive diagnostic classification systems.
Acknowledgments Source of Images (slides 2,5,9) Confocal fluorescence microscopy: Source: Nature of Nature Histology: Source: central microscopy Iowa Light microscopy: Phase, DIC. Source: Nikon Microscopy FIB-SEM: Ilana Sabanay, Elior Peles, Alon Weiner MRI: Michal Neeman lab