Associating a Spreadsheet with an Annotation File in Partek Genomics Suite

This tutorial describes how to associate a spreadsheet with an annotation file and how to create a custom annotation file.

Associating a Spreadsheet with an Annotation File

Figure 1: Checking annotation file

- Open the annotation file with a text editor
- Make sure the values in the first column of the annotation file match the column headers of your data spreadsheet
- Make sure the first column header is named Probe Set ID
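The two checks above can also be scripted. The following is a minimal sketch and not part of Partek Genomics Suite: the function name check_annotation, the SAMPLE text, and the label SNP_A-MISSING are hypothetical and used only for illustration. It assumes a comma-delimited file and skips any comment lines (starting with #) that precede the header.

```python
import csv
from itertools import dropwhile


def check_annotation(annotation_text, spreadsheet_labels, delimiter=","):
    """Return (header_ok, missing_labels) for the text of an annotation file.

    header_ok      -- True if the first column header is "Probe Set ID"
    missing_labels -- spreadsheet labels not found in the first column
    """
    lines = annotation_text.splitlines()
    # Comment lines (starting with '#') and blank lines may precede the header.
    content = list(dropwhile(lambda ln: ln.startswith("#") or not ln.strip(), lines))
    rows = [row for row in csv.reader(content, delimiter=delimiter) if row]
    if not rows:
        return False, list(spreadsheet_labels)
    header, body = rows[0], rows[1:]
    header_ok = header[0].strip() == "Probe Set ID"
    known_ids = {row[0].strip() for row in body}
    missing = [label for label in spreadsheet_labels if label not in known_ids]
    return header_ok, missing


# Hypothetical sample text in the Affymetrix SNP layout.
SAMPLE = (
    "#using the format of Affymetrix SNPs\n"
    '"Probe Set ID","Chromosome","Physical Position","Strand","Cytoband"\n'
    '"SNP_A-1512540","9","22205296","-","p21.3"\n'
)

if __name__ == "__main__":
    # "SNP_A-MISSING" is a made-up label used to show a mismatch.
    print(check_annotation(SAMPLE, ["SNP_A-1512540", "SNP_A-MISSING"]))
```

For a tab-delimited file, pass delimiter="\t". Any label reported as missing would have no annotation available once the file is associated with the spreadsheet.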
Figure 2: Adding a property to the spreadsheet

- Open the data spreadsheet in Partek Genomics Suite
- Select File > Properties while the data spreadsheet is open and selected
- If you have not yet configured the properties of the spreadsheet, you will need to choose the category of the data
- Select Genomic Microarray

Figure 3: Choosing the type of genomic data
- From this dialog, choose the category that best describes your data

Figure 4: Configuring the genomic properties

- Type the chip name in the box
- Select Browse to choose the appropriate annotation file for this data. The annotation file should contain the genomic location of the elements in the spreadsheet as well as any additional information that you would like to use for annotation
- Select OK. It is recommended that you save the spreadsheet so that the association is available the next time that you open the file
- Select the Edit Genome button in the Species panel to configure the properties of the species or to add a species that doesn't appear in the list (Figure 5)
Figure 5: Editing the Genome dialog

If you specify the UCSC Species Name, your HTML reports will include links to the UCSC genome browser. If you specify the IGB Species, the HTML reports will include links to Affymetrix IGB. If you specify a Cytoband File, the genome viewer will use this file for the location and shading of cytobands.

Downloading Cytoband Files

Cytoband files are available from the UCSC site. To download the files, navigate to the downloads page, located at http://hgdownload.cse.ucsc.edu/downloads.html. On the downloads page, select the species that you are interested in (Figure 6).
Figure 6: Viewing the UCSC Downloads page

- Select the Annotation database link for the genome build that you want to use (Figure 7)

Figure 7: Selecting the build

- Use the find function of your browser to get to the cytoband files
- Download and unzip the cytoband.txt.gz file (Figure 8)
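If no unzip utility is at hand, the downloaded file can be decompressed with a short script. This is a minimal sketch using only the Python standard library. The helper name gunzip is hypothetical, and the file is assumed to have already been downloaded to the working directory under the name used above.

```python
import gzip
import shutil
from pathlib import Path


def gunzip(src, dest=None):
    """Decompress a .gz file and return the path of the decompressed file.

    If dest is not given, the output is written next to the input with the
    trailing ".gz" removed (cytoband.txt.gz -> cytoband.txt).
    """
    src = Path(src)
    dest = Path(dest) if dest is not None else src.with_suffix("")
    with gzip.open(src, "rb") as compressed, open(dest, "wb") as plain:
        # Stream the data so large files are not loaded into memory at once.
        shutil.copyfileobj(compressed, plain)
    return dest


if __name__ == "__main__":
    # Assumes cytoband.txt.gz is present in the current directory.
    print(gunzip("cytoband.txt.gz"))
```

The resulting text file is the one to select as the Cytoband File in the Edit Genome dialog.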
Figure 8: Downloading the file

Using the Annotation File

Once the annotation file has been associated with the spreadsheet, several new options will appear.

- Select View > Chromosome View to invoke a genome view
- On a result spreadsheet, <right-click> on a column header and choose Insert Annotation to add fields from the annotation file to the spreadsheet
- Select Transform > Genomic Smoothing to perform smoothing based on genomic location. This is described in Chapter 10 of the Partek Online Documentation

Using the File Manager

To view or modify the files associated with each chip, select Tools > File Manager; the Specify File Locations dialog will appear (Figure 9).
Figure 9: Configuring the File Manager

Creating a Custom Annotation File

If you are using an Affymetrix or an Agilent chip, you can obtain the appropriate annotation file from the chip manufacturer. If you have a custom chip or want to use a customized annotation file, you can add annotation to your spreadsheet and invoke a Probe Set HTML report, provided that the annotation file meets these criteria:

- The annotation file must have a header (a line that contains a column label for each field)
- Comments are allowed before the header. They must start with #
- The fields of the annotation file must be tab or comma delimited
- The values in the first column must correspond to the column labels in the spreadsheet (with genes on columns) or to the values in a specific column (with genes on rows)

In order to invoke a genome view, your annotation file must also have one or more columns that contain the genomic location in a format that Partek can recognize. The annotation file must contain a column that has the chromosome information. It should also include the base pair location (start and stop, or physical position). Additionally, it may include the cytoband and/or strand. The table below provides possible column labels, a description of the format for each field, and an example.

Note: In this table the examples are for a gene on the top strand of chromosome 3, on the p arm in cytoband 14.2, starting at 69,871,322 base pairs and ending at 70,100,176.
Column label            Description of format                                       Example
chromosome OR seqname   a chromosome label                                          3 OR chr3
start                   an integer, the start position (in base pairs) of the feature   69871322
stop                    an integer, the stop position (in base pairs) of the feature    70100176
GenomicCoordinates      chromosome:start-stop                                       3:69871322-70100176
strand                  + for top, - for bottom                                     +
Physical Position       an integer, the position (in base pairs) of the feature     70100176

Table 1: Sample table for genes on chromosome 3

Here are a few example annotation files:

    #using Agilent's format
    ProbeID GeneName GenomicCoordinates Cytoband
    A_44_P1025812 TC521361 chr12:2546883-2546824 rn 12p12

    #using the format of Affymetrix SNPs
    "Probe Set ID","Chromosome","Physical Position","Strand","Cytoband"
    "SNP_A-1512540","9","22205296","-","p21.3"

    # using the format of Affymetrix exons
    "probeset_id","seqname","strand","start","stop"
    "2315588","chr1","+","1155398","1155624"

End of Tutorial

This is the end of the associating a spreadsheet tutorial. If you need additional assistance with performing this task, you can call our technical support staff at +1-314-878-2329 or email support@partek.com.

Copyright 2010 by Partek Incorporated. All Rights Reserved. Reproduction of this material without express written consent from Partek Incorporated is strictly prohibited.
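When preparing a custom annotation file, it can help to confirm that each GenomicCoordinates value follows the chromosome:start-stop layout shown in Table 1. The sketch below is an illustration only and does not describe how Partek Genomics Suite parses the field. The function name parse_genomic_coordinates is hypothetical, and dropping an optional "chr" prefix from the chromosome label is an assumption made here for convenience.

```python
import re

# chromosome:start-stop, with an optional "chr" prefix on the chromosome label.
_COORDINATES = re.compile(r"^(?:chr)?([^:]+):(\d+)-(\d+)$")


def parse_genomic_coordinates(value):
    """Split 'chromosome:start-stop' into (chromosome, start, stop).

    The two positions are returned in the order they appear in the file;
    they are not swapped if the first is larger than the second.
    """
    match = _COORDINATES.match(value.strip())
    if match is None:
        raise ValueError(f"not in chromosome:start-stop format: {value!r}")
    chromosome, start, stop = match.groups()
    return chromosome, int(start), int(stop)


if __name__ == "__main__":
    # Values taken from Table 1 and from the Agilent example above.
    print(parse_genomic_coordinates("3:69871322-70100176"))
    print(parse_genomic_coordinates("chr12:2546883-2546824"))
```

A value that raises ValueError here does not match the chromosome:start-stop layout and is worth checking before the file is associated with a spreadsheet.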