Database Searching Tutorial/Exercises Jimmy Eng



Similar documents
Mascot Search Results FAQ

Sub menu of functions to give the user overall information about the data in the file

CPAS Overview. Josh Eckels LabKey Software

Charter Business Phone. Online Control Panel Getting Started Guide. Document Version 1.0

ORACLE USER PRODUCTIVITY KIT USAGE TRACKING ADMINISTRATION & REPORTING RELEASE 3.6 PART NO. E

SPSS: Getting Started. For Windows

HDAccess Administrators User Manual. Help Desk Authority 9.0

Health Indicators Advancing Healthy Aging in Your Community. Database Instructions for Managers

Using Barracuda Spam Firewall

Center for Faculty Development and Support. Gmail Overview

Fairfield University Using Xythos for File Sharing

Novell ZENworks Asset Management 7.5

TRIM: Web Tool. Web Address The TRIM web tool can be accessed at:

EXCEL Using Excel for Data Query & Management. Information Technology. MS Office Excel 2007 Users Guide. IT Training & Development

Learning Objectives:

Tutorial for Proteomics Data Submission. Katalin F. Medzihradszky Robert J. Chalkley UCSF

Outlook Web Access (OWA) User Guide

Tutorial for proteome data analysis using the Perseus software platform

Lab: Data Backup and Recovery in Windows XP

Getting Started with Tableau Server 6.1

i>clicker v7 Gradebook Integration: Blackboard Learn Instructor Guide

econtrol 3.5 for Active Directory & Exchange Self-Service Guide

!"#$ Stonington Public Schools Parents Guide for InfoSnap Online Enrollment. for Returning. Students. August. Online Enrollment.

Working with Office Applications and ProjectWise

How To Use Syntheticys User Management On A Pc Or Mac Or Macbook Powerbook (For Mac) On A Computer Or Mac (For Pc Or Pc) On Your Computer Or Ipa (For Ipa) On An Pc Or Ipad

Business Intelligence Office of Planning Planning and Statistics Portal Overview

Posting Job Orders. mindscope Staffing and Recruiting Software

Teacher References archived classes and resources

PCRecruiter Resume Inhaler

泛 用 蛋 白 質 體 學 之 質 譜 儀 資 料 分 析 平 台 的 建 立 與 應 用 Universal Mass Spectrometry Data Analysis Platform for Quantitative and Qualitative Proteomics

INTRODUCTION TO THE LS360 LMS

Online Web Learning University of Massachusetts at Amherst

MassMatrix Web Server User Manual

Security Analytics Engine 1.0. Help Desk User Guide

Test Generator. Creating Tests

Cognos 10 Getting Started with Internet Explorer and Windows 7

Basics Series-4004 Database Manager and Import Version 9.0

What is a Mail Merge?

Lab - Data Backup and Recovery in Windows XP

MultiAlign Software. Windows GUI. Console Application. MultiAlign Software Website. Test Data

Processing Data with rsmap3d Software Services Group Advanced Photon Source Argonne National Laboratory

ShoreTel Communicator for Web

How To Create A Team Site In Windows.Com (Windows)

Finance Reporting. Millennium FAST. User Guide Version 4.0. Memorial University of Newfoundland. September 2013

Business Objects InfoView Quick-start Guide

Marcum LLP MFT Guide

Real Estate Reports Overview Quick Reference Guide

Frequently Asked Questions Mindful Schools Online Courses. Logging In Navigation s & Forums Tracking My Work Files...

6/21/12 Procedure to use Track-It (TI) Self Service version 10 and later


Rational Quality Manager. Quick Start Tutorial

COURSE NAVIGATOR DEMO QUICK GUIDE

Remote Desktop Services - Multimedia. 1. On a PC, open Internet Explorer and type in this URL:

Virtual Office Online and Virtual Office Desktop

BSDI Advanced Fitness & Wellness Software

Erie 1 BOCES/WNYRIC. Secure File Transfer. Upload/Download Wizard

PORTAL ADMINISTRATION

EditAble CRM Grid. For Microsoft Dynamics CRM. How To Guide. Trial Configuration: Opportunity View EditAble CRM Grid Scenario

Why should I back up my certificate? How do I create a backup copy of my certificate?

DataDirector Getting Started

Once you ve signed up, all you ll have to do is sign in. To sign in key in your address and password.

Introduction to RefWorks

EQUELLA. Blackboard Learn Configuration Guide. Version 6.2

MASCOT Search Results Interpretation

Fax User Guide 07/31/2014 USER GUIDE

Go. Stockton Portal Tips

How to install and use the File Sharing Outlook Plugin

Onboarding for Administrators

DOCUMENT MANAGEMENT SYSTEM

Strategic Asset Tracking System User Guide

Creating and Using Forms in SharePoint

SIMPLY REPORTS DEVELOPED BY THE SHARE STAFF SERVICES TEAM

COLLABORATION NAVIGATING CMiC

RIFIS Ad Hoc Reports

User Manual - Sales Lead Tracking Software

UF Health SharePoint 2010 Introduction to Content Administration

Lab 2: MS ACCESS Tables

Express222 Quick Reference

USING STUFFIT DELUXE THE STUFFIT START PAGE CREATING ARCHIVES (COMPRESSED FILES)

USER GUIDE CALIFORNIA DIRECT. AXESSON 100 ENTERPRISE WAY, SUITE C-110 SCOTTS VALLEY, CA (831)

Crystal Print Control Installation Instructions for PCs running Microsoft Windows XP and using the Internet Explorer browser

DIT Online Self Service for Clients

Rational Software. Getting Started with Rational Customer Service Online Case Management. Release 1.0

Dreamweaver Tutorials Creating a Web Contact Form

Industrial Security Facilities Database (ISFD) Troubleshooting Tips

ITS Spam Filtering Service Quick Guide 2: Using the Quarantine

How to Use the Staff Directory Publisher to Update School Staff Directory Lists

ILLUMINATE ASSESSMENT REPORTS REFERENCE GUIDE

Also on the Performance tab, you will find a button labeled Resource Monitor. You can invoke Resource Monitor for additional analysis of the system.

How to integrate Verax NMS & APM with Verax Service Desk

ParishSOFT Remote Installation

FaxFinder Fax Servers

Database Forms and Reports Tutorial

CITS. Windows & Macintosh Zimbra Calendar 5.0. Computing and Information Technology Services. Revised 8/21/2008

SonicWALL GMS Custom Reports

System requirements 2. Overview 3. My profile 5. System settings 6. Student access 10. Setting up 11. Creating classes 11

Smart Web. User Guide. Amcom Software, Inc.

Using the BWSD Help Desk Website

Rochester Institute of Technology. Finance and Administration. Drupal 7 Training Documentation

Transcription:

Database Searching Tutorial/Exercises Jimmy Eng Use the PETUNIA interface to run a search and generate a pepxml file that is analyzed through the PepXML Viewer. This tutorial will walk you through the steps needed to process data, starting from data in the mzxml format through an X!Tandem search and the TPP visualization tools. search data with X!Tandem the search results will then be converted to pepxml files lastly the multiple pepxml files will be combined into a single view and opened in the PepXML Viewer for analysis To start, open up a web browser and click on the home page icon in your browser and access the PENTUNIA link get to the tools user interface at URL http://localhost/tppbin/tpp_gui.pl. Log in to the TPP graphical user interface using the username guest and password guest. On the Home tab, select Tandem as the analysis pipeline to use. The X!Tandem search involves three steps: (1) choosing input mzxml files, (2) choosing the Tandem search parameters, and (3) selecting the sequence database to search against. You will learn how to convert RAW data to mzxml or mzml later; I m providing you with mzxml files for this exercise because they contain a subset of an entire acquisition so that we can search them quickly for this exercise. First, click on the Add Files button to choose the input mzxml files. Select the two mzxml files that are present in the data\class\search\ directory.

Next, choose the Tandem parameters file. This file is named tandem-params.xml and exists in the same directory. The image on the right shows the contents of the file if you view it. Tandem has a rich set of search parameters and we ll discuss what some of these parameters mean. Lastly, select the sequence database to search. Choose the dbase\ yeast_orfs_all_rev.20060126.fix.fasta file. Your browser window should look similar to the image on the right. Confirm everything is set correctly and click on the Run Tandem Search button to launch X!Tandem. On these laptops with these search parameters, the two mzxml files should complete searching in a few minutes. Once the searches are completed, proceed to the next tab, pepxml. This next step in the analysis pipeline is to convert the native search results, in this case Tandem XML output, to our pepxml format. Simply choose the two OR2008*.tandem files in the search directory (class\search\) and click on the Convert to PepXML button. This will launch the data conversion which takes the Tandem output and creates corresponding pepxml files that have.pep.xml extensions.

We re just about through with our pipeline processing steps. Next select the Analysis Pipeline link again and go to the Analyze Peptides tab. Under the Select File(s) to Analyze pane, choose the two OR2008*.tandem.pep.xml files that we just created. In the PeptideProphet Options pane, unselect the checkbox that says RUN PeptideProphet as that s a feature you ll learn about tomorrow. Keep all other defaults, scroll down to the bottom of the page, and hit the Run XInteract button. This effectively combines together the identifications from the two Tandem searches and will present them in the PepXML Viewer interface for analysis. When the command has completed, select the Click here to view log file and output files link to get to the output results. Then, select the View link within the Output Files pane. The data being analyzed are two Orbitrap runs of a SILAC labeled yeast dataset. However, we did not specify the SILAC-heavy modification in the search parameters so we are only going to identify the normal, unlabeled peptides. The data were searched against a database composed of the yeast + decoy sequences. The yeast protein sequences are denoted by their ORF identifiers (proteins beginning with YAL, YOR, etc.) while the decoy entries have protein identifiers that begin with REV0 or REV1. A point to keep in mind is that we are searching X!Tandem with the k-score score plug-in which is different from native X!Tandem analysis. 1. The first step before we get started is to remove some unnecessary columns from the default view and add a peptide mass column. Click the Pick Columns pane. You can add, remove and reorder columns to view here. Remove the bscore and yscore columns and then click the Update Page button. Your browser should now look similar to the image to the right.

2. Sort the results in ascending order based on the EXPECT score column. You can do this in either the Summary pane or the Display Options pane. Better identifications have lower EXPECT scores. Notice that the IONS, PEPTIDE and PROTEIN columns contain hypertext links. These open up an ms/ms spectrum viewer, NCBI blast link, and a sequence viewer respectively. Go ahead and click on the spectrum link for the first entry in the results view. You should see the screen on the right which will open in a new tab in the Firefox browser. What is notable about the spectrum shown? Does it look like a good identification? What features do you notice to support this? Go ahead and click on the PEPTIDE and PROTEIN links as well to see what they lead to. 3. Now let s look for the best scoring decoy matches. Decoy hits are the known wrong ones! You could scroll through every page looking for a protein that begins with REV. However, let s use the viewer s filtering options to select just these entries. To do this, go to the Filtering Options pane and enter ^REV in the required protein text entry box. Look at the spectra of a few of these identifications. How plausible do they appear? 4. What is the false discovery rate (FDR) in the dataset if you apply an EXPECT cutoff of 0.1? What about 0.01? What about 1.0? Remember, we calculated FDR as num_decoy_hits num_target_hits. To find these values, clear out any existing applied filters and enter a value of 0.1 or 0.01 in the max

expect search results filter box. The Summary pane will tell you how many entries remain in the list after applying your filters. For a 0.1 cutoff, you should see displaying 1438 of total 4158 total spectra indicating 1438 peptide-to-spectrum matches pass this cutoff. So how many of these are target entries versus decoy entries? The easy way to determine this is to add an additional protein filter of ^REV to determine the number of decoy entries. When you do so, you will see that there are 4 decoy hits which consequently mean that there are 1434 target hits. So at a 0.1 expectation score cutoff, the estimated FDR of the target hits would be 4 1434 = 0.0028. What are the estimated FDRs at 0.01 and 0.1 cutoffs? More importantly, how would you choose an appropriate cutoff to apply? 5. Remove all filtering options, add the PPM column, remove HYPER and NEXT columns, and sort by ascending order via the "expect" score. Not the PPM mass error of the good scoring IDs in the first few pages. And then browse to the last few pages and note their corresponding calculated PPM mass error. What is the highest scoring peptide with a bad PPM mass error (where bad is say >50 PPM). Is there a plausible explanation for the large mass error or is this an incorrect match? 6. Lastly, go ahead and explore the rest of the interface. Take a look at the available columns and how you can re-order them. Explore your ability to sort/filter based on the various properties. In order to start all over from scratch, including restoring data and views to default, choose Restore Original from the Other Actions pane. You will be using this viewer in the upcoming days so you should definitely become comfortable navigating the interface today.