Table of Contents. Chapter 1 Read Me First! 1. Chapter 2 Tutorial: Estimate a Tree 11


Table of Contents
Hall, Barry G. Phylogenetic trees made easy. 2011. Digitized by: IDS Basel Bern

Chapter 1 Read Me First! 1
  New and Improved Software 2
  Just What Is a Phylogenetic Tree? 3
  Estimating Phylogenetic Trees: The Basics 4
  Beyond the Basics 5
  Learn More about the Principles 6
  About Appendix III: F.A.Q. 7
  Computer Programs and Where to Obtain Them 7
  MEGA5 8
  MrBayes 8
  FigTree 8
  Codeml 8
  SplitsTree and Dendroscope 8
  Utility Programs 8
  Text Editors 9
  Acknowledging Computer Programs 9
  The Phylogenetic Trees Made Easy Website 9

Chapter 2 Tutorial: Estimate a Tree 11
  Why Create Phylogenetic Trees? 11
  About this Tutorial 12
  Macintosh and Linux users 12
  A word about screen shots 12
  Search for Sequences Related to Your Sequence 13
  Decide Which Related Sequences to Include on Your Tree 16
  Establishing homology 17
  To include or not to include, that is the question 18

  Download the Sequences 20
  Align the Sequences 23
  Make a Neighbor Joining Tree 24
  Summary 28

Chapter 3 Acquiring the Sequences 29
  Hunting Homologs: What Sequences Can Be Included on a Single Tree? 29
  Becoming More Familiar with BLAST 30
  BLAST help 32
  Using the Nucleotide BLAST Page 32
  Using BLAST to Search for Related Protein Sequences 34
  Finalizing Selected Sequences for a Tree 38
  Other Ways to Find Sequences of Interest (Beware! The Risks Are High) 43

Chapter 4 Aligning the Sequences 47
  Aligning Sequences with MUSCLE 47
  Examine and Possibly Manually Adjust the Alignment 51
  Trim excess sequence 51
  Eliminate duplicate sequences 54
  Check Average Identity to Estimate Reliability of the Alignment 56
  Codons: Pairwise amino acid identity 56
  Non-coding DNA sequences 57
  Increasing Alignment Speed by Adjusting MUSCLE's Parameter Settings 58
  How MUSCLE works 58
  Adjusting parameters to increase alignment speed 59
  Aligning Sequences with ClustalW 60

Chapter 5 Major Methods for Estimating Phylogenetic Trees 61
  LEARN MORE ABOUT TREE-SEARCHING METHODS 62
  Distance versus Character-Based Methods 64
  LEARN MORE ABOUT DISTANCE METHODS 64
  Which Method Should You Use? 66
  Accuracy 66
  Ease of interpretation 67
  Time and convenience 67

Chapter 6 Neighbor Joining Trees 69
  Using MEGA 5 to Estimate a Neighbor Joining Tree 69
  LEARN MORE ABOUT PHYLOGENETIC TREES 70
  Determine the suitability of the data for a Neighbor Joining tree 73
  Estimate the tree 74
  LEARN MORE ABOUT EVOLUTIONARY MODELS 75
  Unrooted and Rooted trees 80
  Estimating the Reliability of a Tree 82
  LEARN MORE ABOUT ESTIMATING THE RELIABILITY OF PHYLOGENETIC TREES 83
  What about Protein Sequences? 89

Chapter 7 Drawing Phylogenetic Trees 91
  Changing the Appearance of a Tree 92
  The Options dialog 94
  Branch styles 96
  Fine-tuning the appearance of a tree 99
  Subtrees 102
  Rooting a Tree 106
  Finding an outgroup 108
  Saving Trees 108
  Saving a tree description 108
  Saving a tree image 108
  Captions 109

Chapter 8 Parsimony 111
  LEARN MORE ABOUT PARSIMONY
  MP Search Methods 113
  Multiple Equally Parsimonious Trees 116
  Calculating branch lengths 117
  Consensus and bootstrap trees 118
  In the Final Analysis 122

Chapter 9 Maximum Likelihood 123
  LEARN MORE ABOUT MAXIMUM LIKELIHOOD 123
  ML Analysis Using MEGA 125
  Test alternative models 126
  Rooting the ML tree 129

  The special case of zero length branches 132
  Estimating the Reliability of an ML Tree by Bootstrapping 134
  What about Protein Sequences? 137

Chapter 10 Bayesian Inference of Trees Using MrBayes 139
  MrBayes: An Overview 139
  LEARN MORE ABOUT BAYESIAN INFERENCE 141
  Saving time (and perhaps your sanity) 142
  Choose a model 143
  A General Strategy for Estimating Trees Using MrBayes 143
  Creating the Execution File 144
  What the statements in the example mrbayes block do 145
  How the stoprule option of the mcmc command is implemented 148
  How Do You Run a MrBayes Analysis? 148
  More Complex (and More Useful) MrBayes Blocks 149
  Including a user tree 149
  The nperts option of the mcmc command 150
  Coding sequences and the charset statement 150
  The Screen Output while MrBayes Is Running 151
  What If You Don't Get Convergence? 152
  What about Protein Sequences? 156
  Visualizing the MrBayes Tree 156
  Using FigTree 158
  The side panel 158
  The icons above the tree 160

Chapter 11 Working with Various Computer Platforms 161
  Command Line Programs 161
  MEGA on the Macintosh Platform 162
  Navigating among folders on the Mac 162
  Printing trees and text from MEGA 165
  The Line Endings Issue 165
  Installing Command Line Programs 165
  Macintosh and Linux: Use the bin folder 166
  Windows: Create a bin folder and a path to it 166
  Command Line Programs: The Running Environment 168

  Windows: A brief visit to the Command Prompt program 168
  Macintosh and Linux: A brief visit to Terminal and Unix 170
  Acquiring and Installing MrBayes 172
  Windows users 172
  Macintosh and Linux users 173
  Compile MrBayes for your Mac 173
  Running the Utility Programs 174
  Utility programs for Windows 175
  Utility programs for Macintosh and Linux 175

Chapter 12 Advanced Alignment Using GUIDANCE 177
  Issues of Alignment Reliability 177
  Unreliable sequences 177
  Unreliable regions 178
  How GUIDANCE Works 178
  An Example Illustrated by the SmallData Data Set 179
  Make a file of the unaligned sequences in FASTA format 180
  Starting the run 180
  Viewing the results 182
  Eliminate unreliable sequences 186
  Applications of GUIDANCE 190

Chapter 13 Reconstructing Ancestral Sequences 191
  Using MEGA to Estimate Ancestral Sequences by Maximum Likelihood 192
  Create the alignment 192
  Construct the phylogeny 193
  Examine the ancestral states at each site in the alignment 194
  Estimate the ancestral sequence 196
  Calculating the ancestral protein sequence and amino acid probabilities 201
  How Accurate are the Estimated Ancestral Sequences? 201

Chapter 14 Detecting Adaptive Evolution 203
  Effect of Alignment Accuracy on Detecting Adaptive Evolution 205
  Using MEGA to Detect Adaptive Evolution 205
  Detecting overall selection 205
  Detecting selection between pairs 206
  Finding the region of the gene that has been subject to positive selection 208
  Using Codeml to Detect Adaptive Evolution 211
  Installation 211

  The files you need to run codeml 211
  Questions that underlie the models 213
  Run codeml 214
  Identify the branches along which selection may have occurred 214
  Test the statistical significance of the dN/dS ratios 216
  Summary 218

Chapter 15 Phylogenetic Networks 219
  Why Trees Are Not Always Sufficient 219
  Unrooted and Rooted Phylogenetic Networks 221
  Using SplitsTree to Estimate Unrooted Phylogenetic Networks 221
  Estimating networks from alignments 221
  LEARN MORE ABOUT PHYLOGENETIC NETWORKS 223
  Rooting an unrooted network 234
  Estimating networks from trees 235
  Consensus networks 236
  Supernetworks 241
  Using Dendroscope to Estimate Rooted Networks from Rooted Trees 243

Chapter 16 Some Final Advice: Learn to Program 249

Appendix I File Formats and Their Interconversion 251
  Format Descriptions 251
  The MEGA format 251
  The FASTA format 252
  The Nexus format 253
  The PHYLIP format 256
  Interconverting Formats 257
  FastaConvert and MEGA 257
  Other format conversion programs 257

Appendix II Additional Programs 259

Appendix III Frequently Asked Questions 263

Literature Cited 267

Index to Major Program Discussions 269

Subject Index 275