Step-by-Step Guide to Basic Expression Analysis and Normalization


Introduction

This document shows you how to perform a basic analysis and normalization of your data. A full review of this document will provide you with a basic understanding of a broad range of applications. In this example, we examine the GEO data set GSE11352 and track gene expression changes in response to estrogen treatment of MCF7 cells during a time-course experiment using Affymetrix arrays. You can download the data files we will use from the same page where you downloaded this document. The data set we will use is a highly modified subset of GSE11352, originally downloaded from the Gene Expression Omnibus (GEO) web site.

Objectives

o Learn how to set up the Basic Expression Workflow and interpret results
o Learn about capabilities for downstream analysis of workflow results
o Learn about the different options for normalizing array data

Basic Expression Workflow

JMP Genomics incorporates several commonly used quality control and ANOVA modeling processes into the Basic Expression Workflow. This is a good tool for beginning users, because the dialog is simplified, with fewer options than in the underlying process dialogs. Basic Workflow results are organized by process and accessed from a single JMP Journal. This organization makes it simple to open tables and graphical results from one or more workflow processes.

The Basic Expression Workflow combines the setup of three or more different processes into one dialog. Both the design file and the data file created during the import process are required to run this workflow. Optional files include an annotation data set and track files.

Data Quality Assessment

As an initial step, we will assess the quality of the data set.

1. Select Workflows > Basic > Basic Expression Workflow from the Genomics Starter to open the Basic Expression Workflow.
2. Examine the General tab.
3. Choose the edf_data.sas7bdat input data set.
4. Highlight Probe_Set_ID in the Available Variables window, and click the right arrow buttons to add this variable to the By Variables, Label Variable, and Variables to Keep in Output fields.

5. Choose an output folder. All results (data sets, scripts, and graphics) generated by the Basic Workflow will be stored here.
6. Type MCF7 in the Workflow Output Name field.

The completed General tab is shown below:

7. Select the Experimental Design tab.
8. Choose the edf.sas7bdat file from the supplied data as the Experimental Design Data Set.
9. Highlight and add variables from the Available Variables field to the following variable fields: Color Variables, Label Variable, and Variables Defining Plotting Groups.
10. Type "Time Characteristics" (without the quotation marks) in the Variance Component Effects box. To also specify the interaction between Time and Characteristics, type "Time*Characteristics" (without the quotation marks) in the same box. This will calculate the variances due to each individual term as well as the interaction.

The completed Experimental Design tab is shown below:

11. Click the Open button (next to the Experimental Design Data Set field) to open the design file.
12. Select Analyze > Distribution from the JMP menu, and add the Time and Characteristics variables to the Y, Columns field. This will allow us to explore the design of the experiment.
13. Click OK.

Note that you can quickly see the number of samples in each group. Clicking on one histogram bar in the distribution will highlight the same samples in the other distributions and in the data set. This data set has 18 samples evenly distributed across all groups; it is a balanced design.

The resulting QC plots will be colored, and the scatter plots grouped, by the variables chosen in step 9. We will examine the Variance Component Effects across all samples and probes that are due to Time and Characteristics and their interaction. We do not have any Adjustment Effects, such as array lot number, to add.

14. Close the distribution and design table windows.
15. Select the QC and Normalization tab. Examine the tab.
16. Make sure the three boxes at the top (Distribution Analysis, Correlation and Grouped Scatterplots, and Correlation and Principal Variance Components Analysis) are checked.
17. Expand the PVCA (Principal Variance Components Analysis) section. The Cumulative Proportion of Variation to Explain with Principal Components is set to 0.9 by default. This is usually more than sufficient to get a high-level understanding of the sources of variation. Selected normalization methods are optional in the workflow, but we will consider normalization approaches later in this document. Note that quality control analyses can be run before or after normalization, or both.
18. Keep all other default values the same.

The completed QC and Normalization tab is shown below:

19. Click Run.

Quality Control Results

When all quality control processes are complete, a JMP journal is created.

1. Examine the MCF7 journal. The Results of each process can be launched by clicking on the corresponding link under each heading. The Reopen Dialog links will open individual process dialogs (rather than the workflow dialog), pre-loaded with the settings specified in the workflow. The settings can be changed and the process re-run if desired. The Close All Other Windows button closes everything but the journal. Clicking Open Workflow Builder Dialog will re-open the workflow that was run in the Workflow Builder interface.
2. Click Reopen Dialog under Process 2 Data Correlation (MCF7). Examine the different tabs to see how the available options compare to the streamlined options in the Basic Expression Workflow.
3. When finished, click Close All Other Windows from the journal.
4. Click Results under the Process 1 Data Distribution heading. The first tab on the dashboard is Kernel Density Estimates.
   o The Tabs section in the upper left-hand corner of the report displays a list of available results tabs. In some cases, only a subset of tabs will be displayed initially. Tabs that are not currently in view may be opened by clicking on the pull-down button for that tab and selecting View Tab.
   o The Tabs section will often be followed by other sections. For example, in this case, a Launch Follow-Up Process section contains action buttons to launch additional analysis processes. If data tables other than those used to create figures displayed in tabs are available, they will be listed next. To close a tabbed dashboard, use the Close All button.

The overlaid curves display the distribution profile of all the arrays in the data set, with intensity on the x-axis and relative frequency on the y-axis. Note that one array is slightly different from the others. If you hover over the corresponding line with the mouse, you will see that it is an untreated control array from the 12 hour time point.

If we wished to remove this array from further analysis, we would not have to modify the original input data table. Because the rows present in the design file dictate the columns from the input data set that will be used in the analysis, we can simply delete the corresponding row from the design file.

5. Highlight the outlier array by clicking on the red line at the top of the plot.
6. Click the Create Subset Experimental button found above the graph window (circled in the figure above). A new file, named with an added suffix and containing all of the data minus the outliers, is created in the output directory.
7. Click on the Box Plots tab to bring it into view. The box plot graph shows side-by-side comparisons of all arrays in the data set.
8. Click on the red triangle hotspot to the left of the graph's title to access drill-down options for viewing data set quantiles and other information.

Direct your attention to the Launch Follow-Up Process section of the Results window. Launch Follow-Up buttons are surfaced when there are reasonable next steps to take after a process has run. While the suggested processes can also be accessed from the JMP Genomics menu, clicking the button to launch the process will automatically load known information, such as input data sets picked in a prior step and output folders, into the process.

9. Click the Filter Intensities button. Note that the General tab of the Filter Intensities dialog (shown below) is automatically loaded with the original input files we specified in the workflow.

Let's work through an example of filtering data. We will change expression values less than -1.6 to missing, and then we will delete probes where a majority of samples within either the estrogen or control group have such values.

10. Select the Replace Low/High tab. Check the box to Replace Low Values to activate the additional filtering options below. Type -1.6 into the Replace Intensities Falling Below this Value text box.

Note: There are a number of other options (e.g., Standard Deviations) to use as criteria for filtering rows from a data set. An identical set of options can be used for filtering high values.

The completed Replace Low/High tab is shown below:

11. Select the Delete Rows tab. Under the Delete Rows option, click the radio button labeled if any of the expressions are satisfied. Type NMISS>=5 in the Delete Rows with Number of Missing Values Satisfying This Expression text box.
   o When used alone, this filtering expression will remove rows where there are 5 or more missing values.
   o However, we will be performing the filtering step in a group-wise manner. Rather than deleting a row if 5 or more of any values in the row are missing, we will specify a grouping variable on the next tab so that if 5 or more of the 9 values are missing in either the estrogen or control group (or both), we will delete that row.
   o Many other criteria besides number of missing values can be used, and these can be combined with either OR- or AND-type Boolean statements.

The completed Delete Rows tab is shown below:

12. Select the Groups tab. Highlight Characteristics and click the arrow button to add it to the Variables Defining Groups field.
   o Recall that the Characteristics variable classifies samples into one of two groups, E2 (estrogen-treated) and Cont (control).
   Type 49 in the Group Percentage for Deletion text field.
   o Specifying a percentage less than 50% here will cause a row to be filtered from the data set if it contains 5 or more missing values (NMISS >= 5) in one or both treatment groups.

The completed Groups tab is shown below:

13. Select the Options tab. Here, you may specify a custom output file name. If you do not, a default name will be created for you.
14. Click Run.
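The group-wise filter configured in steps 10 through 14 can be sketched in plain Python. This is only an illustration of the logic described above, not the Filter Intensities implementation: the function names, the toy probe values, and the reading of Group Percentage for Deletion as "delete when at least this percentage of groups satisfies the expression" are all assumptions.

```python
import math

LOW_CUTOFF = -1.6   # intensities below this value become missing
MIN_MISSING = 5     # mirrors the NMISS>=5 expression
GROUP_PCT = 49.0    # mirrors the Group Percentage for Deletion entry

def replace_low(values, cutoff=LOW_CUTOFF):
    """Return a copy in which values below the cutoff are replaced by NaN."""
    return [math.nan if v < cutoff else v for v in values]

def n_missing(values):
    """Count the missing (NaN) entries."""
    return sum(1 for v in values if math.isnan(v))

def delete_row(values, groups, min_missing=MIN_MISSING, group_pct=GROUP_PCT):
    """Assumed rule: delete when enough groups have min_missing or more NaNs."""
    by_group = {}
    for value, group in zip(values, groups):
        by_group.setdefault(group, []).append(value)
    flagged = sum(1 for vs in by_group.values() if n_missing(vs) >= min_missing)
    return 100.0 * flagged / len(by_group) >= group_pct

# Hypothetical data: 9 estrogen-treated and 9 control samples per probe set.
groups = ["E2"] * 9 + ["Cont"] * 9
raw = {
    "probe_a": [2.0] * 9 + [-2.0] * 5 + [1.0] * 4,  # 5 low values in Cont
    "probe_b": [2.0] * 9 + [-2.0] * 4 + [1.0] * 5,  # 4 low values in Cont
}
filtered = {name: replace_low(vals) for name, vals in raw.items()}
kept = [name for name, vals in filtered.items() if not delete_row(vals, groups)]
print(kept)
```

With two groups, one flagged group is 50% of the groups, which is why an entry below 50 (here 49) is enough to delete a row that fails in only one treatment group.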

Note: The Results window that appears shows the path of the data set and reveals that the number of rows in the filtered data set is less than the number of rows in the original data set. Also, a new Launch Follow-Up Process window has been created, in which we could click the Basic Expression Workflow button to run a new workflow with the filtered data set.

15. Go back to the workflow results journal and click the Close All Other Windows button.
16. Click the Results button under Process 2-Data Correlation (MCF7).
17. Click the Correlation Heat Map tab. The heat map and associated dendrogram are used to visualize the magnitude of the correlations between data from all possible pairs of samples.
   o The clustering algorithm applied here is unsupervised. Groups of samples whose data are positively correlated cluster together in regions of light to dark red colors. Groups of samples that display negative correlations are displayed in light to dark blue areas of the heat map. Pairs of samples that display low to no correlation will cluster in gray areas of the graph.
   o Examine the dendrogram found to the right of the heat map. It is immediately apparent that the estrogen vs. control treatment effect explains the split between the two main clusters. Replicate samples from the same time points within each treatment cluster together. This indicates that the estrogen treatment is likely a primary effect and time a secondary effect.
18. Click on one of the main branches of the dendrogram to the right of the heat map.
19. Select Analyze > Distribution from the JMP menu and add Characteristics and Time as the Y, Columns. What do you see?
20. Click on the 3-D PCA Plot tab. This plot displays a three-dimensional view of a principal components analysis performed on the paired correlations between samples.
21. Click the red triangle by the plot title and select Normal Contour Ellipsoids > Group by Column. Select Time as the grouping variable. Ellipsoids that group samples by time points are drawn. The time effect is mainly captured by the 2nd principal component, which explains 14.1% of the variance. We could repeat the process to draw ellipsoids to group samples by treatment group. If we did, we would see that the estrogen vs. control treatment effect is captured mainly by the 1st principal component, which captures 30.5% of the total variance.
22. Click on the 2D PCA Plots tab. These two-dimensional plots are based on the same data set as the 3D plots. Use these plots to examine the distribution of samples across each component. This can be helpful, particularly when working with large data sets.
23. Scroll to the bottom of the 2D PCA Plots window.

The plot of Mahalanobis distances compares the distance of each array from a centroid or center of mass of the data points, taking into account the covariance. This may be useful in identifying potential outliers among arrays in the data set.

24. Select the Variance Component Charts tab. The top chart represents the weighted average proportion of variances across all samples and probes, accounting for 90% of the total variance. (For the purposes of calculation, the estimated variances are the result of treating all experimental variables as random effects in a mixed model.) Consistent with earlier observations, we can see that the estrogen vs. control treatment is the largest effect, time is next, and the interaction of the two explains the least of the overall variability. The residual or error is about 38% of the total. If more were known about the data (array lot numbers, etc.), we might be able to attribute variance to technical effects and decrease the unexplained residual variance.
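As a side note on the Mahalanobis distance plot mentioned at the top of this page, the following minimal sketch shows how such a distance can be computed for two-dimensional points (for example, two principal component scores per array). The coordinates are made up, and the function is an illustration of the general formula rather than the calculation JMP Genomics performs.

```python
import math

def mahalanobis_2d(points):
    """Distance of each (x, y) point from the centroid, scaled by the covariance."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Sample variances and covariance (denominator n - 1).
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    distances = []
    for x, y in points:
        dx, dy = x - mean_x, y - mean_y
        # Quadratic form d' S^-1 d using the closed-form 2x2 inverse.
        d2 = (dx * dx * syy - 2.0 * dx * dy * sxy + dy * dy * sxx) / det
        distances.append(math.sqrt(d2))
    return distances

# Hypothetical scores for five arrays; the last one sits far from the others.
scores = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (5.0, 5.0)]
distances = mahalanobis_2d(scores)
outlier_index = distances.index(max(distances))
print(outlier_index, [round(d, 3) for d in distances])
```

Because the covariance is estimated from all points, including the suspect one, an outlier inflates the very scale it is measured against; distances are therefore best read relative to each other.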

The lower graph, Variance Proportion by Principal Component, shows the breakdown of variance explained by each principal component. For example, we see that the Characteristics variable (estrogen or control) explains nearly 100% of the variance captured by the first principal component, Time dominates the 2nd and 4th components, and the interaction explains most of the 3rd, 5th, and 7th principal components.
   o Note that you may click the red triangle to the left of the Variance Proportion by Principal Component title to view more options. To view each component in a separate graph, uncheck the overlay option by clicking on Overlay Plots > Overlay Y's.

The underlying tables for this and all other tabs are hidden by default, but may be opened either from the Window List or the tabbed report. Under the Tabs section on the upper left of the dashboard, click on the pull-down menu for the tab of interest and click View Data.

We now have a very good estimate of the relative amounts of variance explained by the experimental variables in our data set.

25. Go back to the Workflow Journal and click Close All Other Windows.
26. Click on Results under Process 3 ArrayGroupCorrelation (MCF7). A window opens that shows two different scatter plot grids. We see two grids because we elected to analyze samples within control and estrogen groups separately in the Basic Expression Workflow dialog. Each grid displays a set of scatter plots in which data from pairs of samples are plotted against one another. This graphic provides a summary view of the correlation between replicate arrays in each treatment group.

On the diagonal we see distribution histograms for each individual sample. Off-diagonal scatterplots show probe intensities for pairs of samples.

27. Select Window > Close All from the top menu bar.

Analysis of Variance

The next step is to run the analysis of variance (ANOVA) in the Basic Expression Workflow. Note that the quality control and ANOVA functions can be run together, but normally you will want to check the data quality prior to running the ANOVAs. We will not be normalizing this data set in the workflow. We will cover normalization options in a separate section later in this document.

1. Select File > Load Life Sciences Setting and navigate to the output directory used for the quality control analysis.
2. Select the BasicExpressionWorkflow MCF7 setting. The previous settings automatically load into the window. Now we just need to change some settings to set up the ANOVAs.
3. On the General tab, change the Workflow Output Name to ANOVAs1.
4. Select the QC and Normalization tab and uncheck the three QC processes previously run.
5. Select the ANOVA tab.
6. Select and add Time and Characteristics to the Class Variables field.

7. Type Time Characteristics Time*Characteristics in the Model These Fixed Effects field. In this case we have a fairly simple model, a 2x3 factorial ANOVA. For more information about specifying complex models, click the button next to the Model These Fixed Effects field, or refer to the SAS documentation for the MODEL statement in PROC MIXED. We have no random effects in this model, but if we did, they would be added as Class Variables and entered in the Adjust Variability for these Random Effects field.
8. Select and copy the text in the Model These Fixed Effects field.
9. Select the LSMeans tab.
10. Paste the copied text into the Estimate LSMean Differences for These Fixed Effects field. The LSMeans Difference Set for Volcano Plots radio buttons are used to select standard sets of differences automatically, without the Difference Chooser.
   o For factorial designs, the Simple Differences option shows only comparisons between groups differing at levels of a single factor. For example, if age and sex were crossed factors, this option would produce a comparison between young males and young females, but not between young males and old females.
   o Differences with a Control compares all other groups with the control group identified in the LSMeans Control Levels box.
   o When None is selected, multiple testing corrections are performed on the F-tests of each fixed effect, with no differences calculated. This can be particularly helpful in understanding which genes are changing in response to an interaction term.
11. Make sure that Simple Differences is selected as the LSMeans Difference Set for the Volcano Plots.
12. Click the Difference Chooser button. The Difference Chooser dialog opens in a new window. This tool allows you to easily set up a subset of differences to calculate.
13. Set up the Difference Chooser dialog as shown below:
14. Click Save. The file with the selected differences will be created, and its path automatically loaded in the difference data set field.

The completed LSMeans tab is shown below:

15. Select the Multiple Testing tab. In the pull-down menu for Multiple Testing Method, select FDR, which is the Benjamini-Hochberg correction for false discovery rate. Note all of the available methods. Keep all other defaults on the Multiple Testing tab.
16. Select the Annotation tab. Choose the supplied annotation file hg_u133_plus_2_na28_annot.sas7bdat. Select the appropriate Annotation Merge Variable and Annotation Label Variable.

The completed Annotation tab is shown below:

17. Leave the Tracks tab blank.
18. Click Run.

ANOVA Results

1. Click Results from the journal ANOVAs1. The dashboard shows a volcano plot for each comparison and a set of action buttons. There are many ways to drill down into the data at this point. We will cover some of the main features.
2. Examine one of the Volcano Plots.

A volcano plot presents a summary view of the results for all probe sets for a single comparison. Each point represents a single probe set. The x-axis value for that point is the difference between the two group means being compared. The y-axis value for the point is the -log10(p-value) associated with its difference. The red dotted line represents the adjusted -log10(p-value) significance cutoff for the multiple testing correction method specified earlier. Points on the right have positive differences, and those that fall above the dotted line are considered significantly increased in the 12hr E2 group relative to the 12hr Cont group in the figure above. Points on the left with negative differences that fall above the dotted line are considered significantly decreased in the 12hr E2 group relative to the 12hr Cont group in the figure above.

3. Select View Data from the Results pull-down menu under the Tabs section. This table contains the results from the ANOVA analyses run for all probe sets. Note that annotation information has been merged with the statistical results from the ANOVA calculations, which include LSMeans, differences, and p-value information.
4. Scroll to the right to find the significance index columns. For every difference p-value, a corresponding column is created that contains 0-1 values that indicate whether that p-value met the significance criterion. If a p-value was significant, a value of 1 is placed in the corresponding cell in its column. If a result was not significant, then a value of 0 is recorded. We can use these variables in a number of ways. For example, they can be used to generate Venn Diagrams showing the relationship among significance results, as in the following section.
5. Click the Venn Diagram action button in the ANOVA results dashboard.
6. Select all four columns by clicking the first and shift-clicking the last. Click OK.

Each section in the resulting Venn diagram captures the overlap of significant results for one or more comparisons. Venn Diagrams can be generated from any table with one or more binary variables coded as 0 and 1, using General Utilities > Venn Diagram Single Table. The middle section of the graphic contains the number 2614, which is the number of probe sets that exceeded our significance criteria for all four difference comparisons.

7. Click the section labeled 2614, view the volcano plots, and then return to the data table. What do you see?
8. Click Tables > Subset. Clicking sections of the Venn diagram highlights rows in the underlying _amr table. We can use JMP tools to create new data tables that contain interesting subsets of genes, and use these subset tables to perform additional analyses. In the Subset dialog, make sure the Selected Rows and All Columns options are selected. Click OK. The resulting table can now be saved as a JMP table, SAS table, txt file, or other file type.

Next we will manipulate this data set to create a wide data set that can be used to perform hierarchical clustering of these 2614 genes.
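To make the significance index columns and the Venn diagram counts described above more concrete, here is a small Python sketch. It applies a Benjamini-Hochberg adjustment to made-up p-values, derives 0/1 indicator columns, and counts the probe sets in each Venn region. The comparison names, the p-values, and the 0.05 level are assumptions for illustration; this is not the code JMP Genomics runs.

```python
from collections import Counter

def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def significance_index(pvalues, alpha=0.05):
    """1 where the adjusted p-value meets the (assumed) FDR level, else 0."""
    return [1 if q <= alpha else 0 for q in bh_adjust(pvalues)]

def venn_counts(indicator_columns):
    """Count rows for each exact combination of comparisons flagged as 1."""
    names = list(indicator_columns)
    counts = Counter()
    for row in zip(*(indicator_columns[name] for name in names)):
        members = tuple(name for name, flag in zip(names, row) if flag == 1)
        if members:
            counts[members] += 1
    return counts

# Hypothetical raw p-values for five probe sets in two comparisons.
indicators = {
    "cmp_a": significance_index([0.001, 0.01, 0.04, 0.5, 0.9]),
    "cmp_b": significance_index([0.002, 0.8, 0.003, 0.7, 0.6]),
}
regions = venn_counts(indicators)
print(indicators)
print(dict(regions))
```

The count stored under the tuple naming every comparison plays the role of the middle section of the diagram (the 2614 probe sets in this guide's four-way example).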

9. Close the subset data table.
10. From the Action Buttons section of the dashboard, click the Open Subset in Wide Format button.
11. In the Wide Subset dialog that opens, highlight Probe_Set_ID.
12. Type pr_ in the Optionally Enter a Common Prefix for Wide Column Names box.
13. Type venn_wide as the output data set name.

The completed dialog is shown below:

14. Click OK. A new file, venn_wide, is created. It contains the original intensity values for the 2614 probes merged with the design file information. We'll come back to this table when we perform clustering.
15. Close the venn_wide table. The SAS data set is saved in the output folder.
16. Close the Venn Diagram graphic.
17. Examine the remaining Action Buttons.

If we wanted to create a tall data table with rows corresponding to only those 2614 probes, we would have clicked the Open Subset in Tall Format button instead. If you have an Ingenuity Pathway Analysis license, you could also click the IPA Upload button to send the list of selected probe sets directly to Ingenuity for further analysis.

Next, we will drill down to examine detailed results for just a few probes.

18. Go to the 12hr E2 12hr Control volcano plot and select a few points on the far right (no more than 5) by drawing a box with your mouse.
19. In the Action Buttons menu, click Construct Oneway Plots.
20. Highlight the appropriate variables in the upper and lower boxes. Click OK. Separate graphics are created for each probe set, plotting the original intensities of that probe set for samples grouped by their values of Time and Characteristics.

21. Click the red triangle next to the plot title to examine additional options.
22. From the Action Buttons menu, click the Fit Model and Plot LSMeans button. In the resulting dialog, highlight the effects to plot and click OK. A graphical profile (shown below) is created for each probe set, showing its LSMeans values over different time points, treatment groups, and levels of the interaction term.
23. From the Action Buttons menu, click the Plot Intensities button.
24. In the Plot Intensities dialog, add the data columns under Intensity Columns to Plot.
25. Change the Label Variable.

26. Click Run. A parallel plot is now shown in which each line represents the intensity of an individual probe set. Observe how that probe set's value changes across different samples.

Using the Data Filter to Select Genes

Genes are often selected according to certain criteria in the results data set. In this section, we will review using the Data Filter to select significant genes with 2-fold or greater change (i.e., log2 mean differences greater than 1 or less than -1).

1. In the ANOVA results, select Rows > Data Filter.
2. Highlight the 12hr E2 12hr Control Significance Index and Diff columns in the Data Filter variable list.

3. Click Add. A new dialog opens.
4. Click the OR button (circled above), then click Add without changing the selections.
5. Fill out the filter as shown below:

Note: It is simpler to click on the numbers at either end of the range for the variable and type the 1 and -1 instead of using the slider.

Check the Show box in the data filter and examine the volcano plots. What happened to the volcano plots?

6. From the Results journal, click on Reopen Dialog to open the ANOVA application process.
7. Select the Test tab. Expand the Mean Difference Filter outline. On the ANOVA application process interface, there are options in addition to those offered in the workflow. You can set value cutoff filters for differences, as in this example where probe sets will be given a significance index value of 1 only if they pass the p-value cutoff and have a mean difference of greater than 1 or less than -1 (equivalent to a 2-fold change in this example, due to log2 transformation of the input data). The Compute Multiple Testing option (indicated with an arrow above) will result in an individual multiple testing adjustment for each comparison, as opposed to the default global adjustment. This may be useful in instances where the p-value distributions are very different between comparisons in the data.
8. Return to the ANOVA results dashboard and examine the action buttons.
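The Mean Difference Filter rule from step 7 can be illustrated with a few lines of Python. The sketch assumes that each probe set has a log2 mean difference and a -log10(p-value), and that the significance cutoff is already expressed on the -log10 scale; the numbers and function names are hypothetical.

```python
def fold_change(log2_difference):
    """Convert a log2 mean difference into an unsigned fold change."""
    return 2.0 ** abs(log2_difference)

def filtered_significance_index(diffs, neg_log10_p, cutoff, min_abs_diff=1.0):
    """1 only when the p-value cutoff is met AND the difference is beyond +/-1."""
    return [
        1 if (nlp >= cutoff and abs(d) > min_abs_diff) else 0
        for d, nlp in zip(diffs, neg_log10_p)
    ]

# Hypothetical results for four probe sets.
diffs = [1.5, -2.0, 0.4, 1.2]           # log2 mean differences
neg_log10_p = [4.0, 6.0, 8.0, 1.0]      # -log10(p-value) per difference
index = filtered_significance_index(diffs, neg_log10_p, cutoff=3.0)
print(index)
print(fold_change(1.0))  # a log2 difference of 1 is a 2-fold change
```

The third probe set is highly significant but changes by less than 2-fold, and the fourth changes by more than 2-fold but misses the p-value cutoff, so both receive an index of 0.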

9. Click the button labeled Create Subset with Mean Difference and P-value Criteria. This dialog functions similarly to the one in the Test tab of the ANOVA dialog, except that instead of changing the way significance index variables are constructed, this dialog creates a subset data table from the ANOVA results. Select the first two Difference columns, and type MySubset in the optional name field.
10. Click OK. A new table is created, containing the 163 probe sets that satisfied the specified criteria for at least one of the two chosen comparisons.

Pattern Analysis

We will run three different pattern discovery analyses using the venn_wide data table we created earlier: hierarchical clustering, principal components, and cross-correlation.

Hierarchical Clustering Analysis

1. Select Pattern Discovery > Hierarchical Clustering from the Genomics Starter window.
2. Examine the General tab.
3. Choose the venn_wide data table from the output directory as the input data set.
4. Add the appropriate variable to the Label Variable field.
5. Add the appropriate variables to the Compare Variables field.

6. Type pr_: in the List-Style Specification box. This specifies that any variable in the data table that starts with the prefix pr_ is to be used for clustering.
7. Specify an Output Folder.
8. Select the Options tab.
9. Select Fast Ward as the Hierarchical Clustering Method. The Two-Way Clustering and Center Rows boxes should be checked by default.
10. Check the Standardize Variables (Columns) before Clustering check box. The Color Theme for Heat Map can be selected prior to clustering. Leave the default for now. If you wish to change the color after clustering, that can be done with JMP tools.
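As an illustration of what the pr_: list-style specification and the Standardize Variables (Columns) option amount to, the sketch below selects every column whose name starts with the prefix and rescales each one to mean 0 and standard deviation 1. The table layout, the probe set column names, and the helper functions are hypothetical; the clustering itself is not shown.

```python
import statistics

def select_by_prefix(table, prefix):
    """Return the column names that start with the given prefix, in table order."""
    return [name for name in table if name.startswith(prefix)]

def standardize_column(values):
    """Rescale one column to mean 0 and sample standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

# Hypothetical wide table: one row per sample, one pr_ column per probe set.
wide = {
    "Time": [12, 12, 24, 24],
    "pr_1001_at": [5.0, 7.0, 9.0, 11.0],
    "pr_1002_at": [2.0, 2.5, 3.5, 4.0],
}
cluster_vars = select_by_prefix(wide, "pr_")
standardized = {name: standardize_column(wide[name]) for name in cluster_vars}
print(cluster_vars)
```

Standardizing puts probe sets with very different intensity ranges on a common scale, so that no single high-variance column dominates the distance calculation.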

11. Click Run. What do you see? Does it make sense?
12. Click Apply at the bottom of the Hierarchical Clustering dialog. This will keep these settings as defaults as we move to the next process.
13. Click Close All from the dashboard.

Principal Component Analysis

1. Select Pattern Discovery > Principal Components Analysis from the Genomics Starter menu. Note that because we clicked Apply in the previous process, the General tab is filled in automatically with the input table and output path.
2. Specify a variable in the Color Variables field.
3. Type pr_: in the List-Style Specification of Continuous Variables field.
4. Select the Options tab.
5. Complete the options as shown below:
6. Click Run.

What do you see?

7. Click Genomics > General Utilities > Clear Parameter Defaults.
8. Click Close All from the dashboard.

Cross Correlation Analysis

We are going to perform a cross-correlation analysis on all possible pairs of genes in this data set to see which genes are negatively correlated over time.

1. Select Pattern Discovery > Cross Correlation from the Genomics Starter.
2. Examine the General tab.
3. Choose the venn_wide data table from the output directory as the input data set.
4. Add the appropriate variable to the Label Variable field.
5. Add the appropriate variables to the Compare Variables field.
6. Type pr_: in the List-Style Specification box. This specifies that any variable in the data table that starts with the prefix pr_ is to be used in the analysis.
7. Specify an Output Folder.
8. Select the Anno1 tab.

9. Choose the cross_corr_anno data set provided. Note: You can improve processing time by not specifying an annotation data set.
10. Complete the Anno1 tab as shown below:
11. Select the Secondary Data tab.
12. Choose the venn_wide data set previously created.
13. Type pr_: in the List-Style Specification box.
14. Select the Anno2 tab.
15. Choose the cross_corr_anno data set provided.
16. Complete the Anno2 tab identically to the Anno1 tab.
17. Select the Analysis tab.
18. Complete the Analysis tab as shown below:

Note: The Cluster Heat Map checkbox should be unchecked when the data table is wide with many rows, as clustering can be memory intensive.

19. Select the Options tab. Complete the options as shown below:

The Process Group Size fields can help in managing memory usage. We are setting the minimum -log10(p-value) for correlations to appear in the output data set to 5. Please note that if you do not use a cutoff here, your output data set will include all possible correlations, many of which may be low and therefore not of great interest, and the output data set may be extremely large.

Alternatively, Multiple Testing Methods can be selected to filter the output data set. For large data sets, the p-value cutoff should be stringent. Keep in mind that when this option is chosen, the full data set with all pairwise correlations is still generated as a temporary file prior to being filtered using the specified multiple testing adjustment criteria. The hard drive on which your temporary files are located must have enough free space to accommodate this large intermediate data set.

20. Click Run. The sorted output file contains all correlations that exceeded the -log10(p-value) cutoff of 5. Large negative correlations are found at the bottom of the table.
21. Select Rows > Data Filter.
22. In the Data Filter dialog, select Characteristics and Pearson_Correlation, click Add, then fill out the options as shown below:

23. Open the output graphics and examine the distribution plot.
24. Select the Plot Input Data for Selected Correlations drill down button. Click Plot Input Data. The resulting one-way plots are separated between the control and estrogen treated samples.
This concludes the section on expression analysis. There are many more tools available in addition to those covered here. The dialogs for other processes are all very similar to the ones we reviewed earlier, so it should be straightforward to explore them. Note that you can always click the Load button at the bottom of each dialog to load sample settings for a process, then click Run to view sample output. You can also click the Process Description button at the top of the dialog to open the section of the JMP Genomics User Guide that contains detailed information about that process.
Normalization
JMP Genomics provides a number of different normalization procedures for expression and other array data. The commonly used normalization methods reviewed in this section include mean, median, percentile, Loess, and quantile. Additionally, we will cover ratio analysis for 2-color arrays. JMP Genomics also has a set of normalization options intended for count data from RNA-seq studies, which are reviewed in the step-by-step guide Import and Analysis of Next-Generation Data.

Data Standardization
1. Select Expression > Normalization > Data Standardize from the Genomics Starter window. Mean, median, and percentile normalization are all options available within this process.
2. Load the AffymetrixLatinSquare setting. The Standardization Method pull-down menu lists numerous choices. When a method such as Percentile is used, the value of the percentile (e.g., 75) should be typed into the Numerical Parameter for Advanced Standardization Methods field.
3. Select the Options tab.
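To make the mean, median, and percentile methods of this process more concrete, here is a minimal Python sketch. It is not JMP Genomics code. It assumes log-scale intensities stored per array (column) and assumes that each method subtracts the chosen statistic from every value in that array. The function names `percentile` and `standardize_columns` and the example values are hypothetical.

```python
import statistics


def percentile(values, pct):
    """Linear-interpolation percentile (pct in 0..100) of a list of numbers."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    pos = (len(ordered) - 1) * pct / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def standardize_columns(arrays, method="median", parameter=None):
    """Shift each array (column) so that the chosen statistic becomes 0.

    arrays: dict mapping an array name to a list of log-scale intensities.
    parameter: the percentile value (e.g., 75); used only by "percentile".
    Assumption: standardization is a subtraction of the statistic.
    """
    result = {}
    for name, values in arrays.items():
        if method == "mean":
            center = statistics.fmean(values)
        elif method == "median":
            center = statistics.median(values)
        elif method == "percentile":
            center = percentile(values, parameter)
        else:
            raise ValueError(f"unknown method: {method}")
        result[name] = [v - center for v in values]
    return result


# Hypothetical log2 intensities for two arrays measured on five probes.
arrays = {
    "array_1": [6.0, 7.0, 8.0, 9.0, 10.0],
    "array_2": [7.5, 8.5, 9.5, 10.5, 11.5],
}
centered = standardize_columns(arrays, method="median")
p75 = standardize_columns(arrays, method="percentile", parameter=75)
```

After median standardization both hypothetical arrays share the same center, so the constant offset between them is removed while the spread within each array is unchanged.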

Rows (probes or probe sets) can be standardized, but in general, standardization is performed on the columns (arrays). When using mean or median normalization, you can also standardize to a subset of genes, specified by the Subset Data Set to Use for Normalization. Note that mean and median center the values for each array at 0.
Loess Normalization
1. Select Expression > Normalization > Loess Normalization from the Genomics Starter.
2. Load the DrosophilaAgingExample sample setting.

The smoothing parameter sets the percentage of data to use in each segment. The lower the value, the stronger the normalization. Repeated iterations of fitting steps can be performed, if desired.
3. Select the Analysis tab. When no baseline reference array is specified, the average of all arrays is used for the baseline. Alternatively, you can select a single array to use as a baseline, but this is not a commonly used option. As with mean and median normalization, data can be Loess normalized to a subset of the data.
4. Select the Kernel tab. For data sets with a small number of values to be standardized (e.g., miRNA data), kernel density may be used as a weight for Loess modeling. The number of points in the X and Y grid may be selected along with the bandwidth multiplier. Increasing the number of grid points results in a smoother curve, but extends run times. Increasing the Bandwidth Multiplier increases the smoothness of the estimate. Increasing the Exponential Multiplier creates greater curvature in the estimate.
5. Select the Reference Set tab. If you would like to use a different data set for creating the reference baseline, you may Choose that data set here. This option may be used when you wish to standardize across different data sets in preparation for predictive modeling. The names of the output data sets can be specified on the Options tab.
Quantile Normalization
1. Select Expression > Normalization > Quantile Normalization from the Genomics Starter.
2. Load the DrosophilaAgingExample sample setting.
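As background for this dialog, the following is a minimal Python sketch of the standard quantile normalization technique, which forces every array to share the same distribution of values. It is not JMP Genomics code, and the exact procedure used by the software may differ. The sketch assumes complete data (no missing values), uses the rank-wise average of all arrays as the reference distribution, and breaks ties by position. The function name `quantile_normalize` and the example values are hypothetical.

```python
def quantile_normalize(arrays):
    """Quantile-normalize a dict mapping array name -> list of values.

    Every array must contain the same number of values (one per probe).
    """
    names = list(arrays)
    n = len(arrays[names[0]])
    if any(len(arrays[name]) != n for name in names):
        raise ValueError("all arrays must have the same number of values")

    # Sort each array independently.
    sorted_cols = [sorted(arrays[name]) for name in names]

    # Reference distribution: the mean across arrays at each rank.
    reference = [
        sum(col[rank] for col in sorted_cols) / len(names) for rank in range(n)
    ]

    normalized = {}
    for name in names:
        values = arrays[name]
        # Probe indices ordered from the smallest to the largest value.
        order = sorted(range(n), key=lambda i: values[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            # The probe at this rank receives the reference value for the rank.
            out[idx] = reference[rank]
        normalized[name] = out
    return normalized


# Hypothetical intensities for three arrays measured on four probes.
arrays = {
    "a": [5.0, 2.0, 3.0, 4.0],
    "b": [4.0, 1.0, 4.5, 2.0],
    "c": [3.0, 4.0, 6.0, 8.0],
}
normalized = quantile_normalize(arrays)
```

After normalization each array contains exactly the same set of values; only the assignment of those values to probes differs, following each probe's original rank within its array.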

Quantile normalization has a unique option that normalizes data from the autosomes separately from data from the X and Y chromosomes. This is useful when working with copy number data, for example. The Kernel, Reference Set, and Options tabs are similar in function to those in the Loess Normalization dialog.
Two Color Ratio Analysis
JMP Genomics provides different options for two-color array analysis depending on the experimental design. For loop designs, a mixed-model ANOVA is appropriate, using the dye variable as a fixed effect and the array as a random effect, in addition to any other effects in the model. With a simple reference design, a ratio must first be calculated between the values of the experimental sample and the reference sample.
1. Select Expression > Quality Control > Ratio Analysis from the Genomics Starter.
2. Load the DrosophilaAgingExample sample setting.

The feature variable is the probe identifier. Note that you can also perform within-array Loess normalization using a By Variable such as print tip. The experimental design file supplies the name of the ratio variable. Note that this does not have to be the dye variable, because such experiments often alternate the Cy3 and Cy5 dyes between the reference and experimental samples. The denominator value should be listed in the appropriate field. Otherwise, the software will automatically use alphanumerical order to select this value.
3. Select the Loess Normalization tab. The Perform Within Array Loess option should normally be checked so that intensity values are normalized prior to taking the ratio. The other functions are the same as in the Loess Normalization dialog.
4. Select the Options tab. On this tab, we highly recommend that you enter new names in the Output Experimental Design Data Set and the Ratio Data Set fields. In the new design table, the number of data columns is halved as a result of the ratio analysis.
This concludes our discussion of basic expression analysis and normalization.
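The ratio calculation for a reference design with alternating dyes can be sketched in Python as follows. This is a minimal illustration, not JMP Genomics code. It assumes raw (not log-transformed) intensities, reports log2 ratios, and omits the within-array Loess step. The function name `log2_ratios`, the design keys `column`, `array`, and `sample_type`, and all values are hypothetical.

```python
import math


def log2_ratios(channels, design, ratio_variable, denominator):
    """Compute one log2 ratio column per array from two channel columns.

    channels: dict mapping a column name to a list of raw intensities.
    design: list of dicts, one per channel column, with the keys
        "column", "array", and the ratio variable.
    denominator: level of the ratio variable used as the denominator.
    """
    # Group the channel columns of each array by their ratio-variable level.
    by_array = {}
    for row in design:
        by_array.setdefault(row["array"], {})[row[ratio_variable]] = row["column"]

    ratios = {}
    for array, roles in by_array.items():
        numerator_levels = [level for level in roles if level != denominator]
        if denominator not in roles or len(numerator_levels) != 1:
            raise ValueError(f"array {array} needs one numerator and one denominator")
        numer = channels[roles[numerator_levels[0]]]
        denom = channels[roles[denominator]]
        ratios[array] = [math.log2(n / d) for n, d in zip(numer, denom)]
    return ratios


# Hypothetical dye-swap design: the reference sample is on Cy3 for array A1
# and on Cy5 for array A2, so the dye variable cannot serve as the ratio variable.
design = [
    {"column": "A1_cy3", "array": "A1", "sample_type": "reference"},
    {"column": "A1_cy5", "array": "A1", "sample_type": "treated"},
    {"column": "A2_cy3", "array": "A2", "sample_type": "treated"},
    {"column": "A2_cy5", "array": "A2", "sample_type": "reference"},
]
channels = {
    "A1_cy3": [100.0, 200.0],
    "A1_cy5": [400.0, 200.0],
    "A2_cy3": [800.0, 50.0],
    "A2_cy5": [200.0, 100.0],
}
ratios = log2_ratios(channels, design, "sample_type", denominator="reference")
```

Four channel columns go in and two ratio columns come out, which mirrors the halving of data columns described above. Naming the denominator explicitly avoids depending on alphanumerical ordering of the ratio variable's levels.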


More information

Using Illumina BaseSpace Apps to Analyze RNA Sequencing Data

Using Illumina BaseSpace Apps to Analyze RNA Sequencing Data Using Illumina BaseSpace Apps to Analyze RNA Sequencing Data The Illumina TopHat Alignment and Cufflinks Assembly and Differential Expression apps make RNA data analysis accessible to any user, regardless

More information

QuantStudio 3D AnalysisSuite Software: Relative Quantification

QuantStudio 3D AnalysisSuite Software: Relative Quantification HELP QuantStudio 3D AnalysisSuite Software: Relative Quantification for use with QuantStudio 3D Digital PCR System and QuantStudio 3D AnalysisSuite Server System Publication Number MAN0009636 Revision

More information

SAS / INSIGHT. ShortCourse Handout

SAS / INSIGHT. ShortCourse Handout SAS / INSIGHT ShortCourse Handout February 2005 Copyright 2005 Heide Mansouri, Technology Support, Texas Tech University. ALL RIGHTS RESERVED. Members of Texas Tech University or Texas Tech Health Sciences

More information

Chapter 3 Introduction to Predictive Modeling: Predictive Modeling Fundamentals and Decision Trees

Chapter 3 Introduction to Predictive Modeling: Predictive Modeling Fundamentals and Decision Trees Chapter 3 Introduction to Predictive Modeling: Predictive Modeling Fundamentals and Decision Trees 3.1 Creating Training and Validation Data... 3-2 3.2 Constructing a Decision Tree Predictive Model...

More information

Copyright 2013 by Laura Schultz. All rights reserved. Page 1 of 7

Copyright 2013 by Laura Schultz. All rights reserved. Page 1 of 7 Using Your TI-NSpire Calculator: Descriptive Statistics Dr. Laura Schultz Statistics I This handout is intended to get you started using your TI-Nspire graphing calculator for statistical applications.

More information

SonicWALL GMS Custom Reports

SonicWALL GMS Custom Reports SonicWALL GMS Custom Reports Document Scope This document describes how to configure and use the SonicWALL GMS 6.0 Custom Reports feature. This document contains the following sections: Feature Overview

More information

MetroBoston DataCommon Training

MetroBoston DataCommon Training MetroBoston DataCommon Training Whether you are a data novice or an expert researcher, the MetroBoston DataCommon can help you get the information you need to learn more about your community, understand

More information

Working with Plate-Based Data in Partek Screener s Solution 1

Working with Plate-Based Data in Partek Screener s Solution 1 Working with Plate-Based Data in Partek Screener s Solution Introduction Partek s HTS Navigator is the name of the tool within the Screener s Solution for navigating high throughput screening data. It

More information

Excel 2016 Tables & PivotTables

Excel 2016 Tables & PivotTables Excel 2016 Tables & A PivotTable is a summary of data from a data source and is very useful when you have a lot of data to analyze. Excel enable one to gather and present data in a custom/dynamic display.

More information

CCH Accounts Production (PROcap) V7.0. Release notes. April Information. Fee Protection. Software. Magazines. Professional Development

CCH Accounts Production (PROcap) V7.0. Release notes. April Information. Fee Protection. Software. Magazines. Professional Development CCH Accounts Production (PROcap) V7.0 April 2009 Information Fee Protection Software Magazines Professional Development Contents 1 Executive summary...2 2 Installation...3 2.1 Obtaining and installing

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics Descriptive statistics consist of methods for organizing and summarizing data. It includes the construction of graphs, charts and tables, as well various descriptive measures such

More information

WebSphere Business Monitor V6.2 Business space dashboards

WebSphere Business Monitor V6.2 Business space dashboards Copyright IBM Corporation 2009 All rights reserved IBM WEBSPHERE BUSINESS MONITOR 6.2 LAB EXERCISE WebSphere Business Monitor V6.2 What this exercise is about... 2 Lab requirements... 2 What you should

More information

Doing Multiple Regression with SPSS. In this case, we are interested in the Analyze options so we choose that menu. If gives us a number of choices:

Doing Multiple Regression with SPSS. In this case, we are interested in the Analyze options so we choose that menu. If gives us a number of choices: Doing Multiple Regression with SPSS Multiple Regression for Data Already in Data Editor Next we want to specify a multiple regression analysis for these data. The menu bar for SPSS offers several options:

More information

Multivariate analysis of variance

Multivariate analysis of variance 21 Multivariate analysis of variance In previous chapters, we explored the use of analysis of variance to compare groups on a single dependent variable. In many research situations, however, we are interested

More information

Word 2010: Mail Merge to Email with Attachments

Word 2010: Mail Merge to Email with Attachments Word 2010: Mail Merge to Email with Attachments Table of Contents TO SEE THE SECTION FOR MACROS, YOU MUST TURN ON THE DEVELOPER TAB:... 2 SET REFERENCE IN VISUAL BASIC:... 2 CREATE THE MACRO TO USE WITHIN

More information

Adobe Illustrator CS2 Tutorial University of Texas at Austin School of Information IT Lab Jin Wu Fall, 2006

Adobe Illustrator CS2 Tutorial University of Texas at Austin School of Information IT Lab Jin Wu Fall, 2006 Introduction: Adobe Illustrator CS2 Tutorial University of Texas at Austin School of Information IT Lab Jin Wu Fall, 2006 Illustrator is a vector-based imaging program. Unlike PhotoShop, which deals in

More information

Microsoft Excel 2010 Charts and Graphs

Microsoft Excel 2010 Charts and Graphs Microsoft Excel 2010 Charts and Graphs Email: training@health.ufl.edu Web Page: http://training.health.ufl.edu Microsoft Excel 2010: Charts and Graphs 2.0 hours Topics include data groupings; creating

More information