Systat. Introduction. What is Systat. Interface of Systat

Systat Introduction

This document introduces some basic information about SYSTAT for Windows. Systat has a very good user manual: it not only covers the whole package but also includes a complete document on statistical concepts, along with some very useful examples. Most of the examples and code here come from the Systat manual. If you have a question, check the manual and you will likely find a satisfying answer.

What is Systat

SYSTAT is a general-purpose statistical software package. Compared with some other statistical packages, it is very easy to use and highly integrated, yet powerful and comprehensive. It covers the most commonly used basic statistics (e.g., descriptive statistics, frequencies, and correlations) as well as advanced statistics (e.g., regression, ANOVA, MANOVA, factor analysis, cluster analysis, and time series).

Interface of Systat

SYSTAT has an excellent graphical interface that works well for both data analysis and data visualization. The best part of the interface is that you can do most of the work through menus and dialog boxes. The buttons give single-click shortcuts to common statistical analyses. You can use command code as well, but that takes more work than using the pull-down menus.

Graph in Systat

Systat's graphics are quick and easy to use for exploratory work and have extensive options for presentation graphics. The plotting tools are user friendly, and the plots are ready to use in papers or presentations. Below are only some examples from Systat's graph gallery.

Basic Functions in Systat

Linear regression

Systat offers several regression types: ridge regression, least squares regression, and Bayesian regression. The free student version includes only least squares, but the other regression types have similar dialog boxes. To open the Least Squares Regression dialog box, from the menus choose: Regression > Linear > Least Squares

We use the Systat data file Longley as the linear model example. The code is as follows:

    REGRESS
    USE LONGLEY
    PLENGTH LONG
    MODEL TOTAL = CONSTANT + DEFLATOR + GNP + UNEMPLOY +,
                  ARMFORCE + POPULATN + TIME
    ESTIMATE

Nested variable modeling

Nested variable modeling is not an independent command in Systat. It is an option that is combined with other commands, most commonly in model estimation in GLM. To specify a general linear model using GLM, from the menus choose: General Linear Model (GLM) > Estimate Model

When the dialog box pops up, select the grouping variables from the box. If you want nested variables in the model, build these components using the Cross and Nest buttons.

We use the data file PESTRESIDUE with a nested variable. The code is as follows:

    USE PESTRESIDUE
    VC
    CATEGORY METHOD BATCH
    MODEL Y = INTERCEPT + METHOD
    RANDOM BATCH(METHOD)
    ESTIMATE

Continuous/categorical variable analyses

Systat has a command that changes how a variable is treated. By default, numeric variables are treated as continuous. To treat a numeric variable as categorical, use the CATEGORY command:

    CATEGORY X Y

This command treats X and Y as categorical numeric variables. Numeric variables can also represent discrete or continuous variables. As the manual indicates, SYSTAT does not differentiate between discrete and continuous variables.

Declaring variable types from a dialog box is simple as well. Take probit regression as an example. To open the Probit Regression dialog box, from the menus choose: Regression > Probit...

Then, under Independent(s), select one or more variables. Categorical variables must be designated using the Category tab. Independent variables in a PROBIT model can be either categorical or continuous. If a categorical variable has k categories, k - 1 dummy variables are created.

Histograms

A histogram is a very simple statistical plot. It can be produced from the menus (Graph > Histogram > X-variable) or simply with the command hist. Quick access also has a histogram command.
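A Systat histogram is labeled with two scales, a count and a proportion per bar. As a rough illustration of how those two scales relate, the Python sketch below bins a small made-up sample into equal-width bars. It is not Systat's own binning algorithm; the function name `histogram_bars` and the data values are hypothetical.

```python
def histogram_bars(values, n_bars, low, high):
    """Count values in n_bars equal-width bars spanning [low, high]."""
    width = (high - low) / n_bars
    counts = [0] * n_bars
    for v in values:
        if v < low or v > high:
            continue  # ignore values outside the plotted range
        # A value exactly on the top edge is placed in the last bar.
        index = min(int((v - low) / width), n_bars - 1)
        counts[index] += 1
    total = sum(counts)
    proportions = [c / total for c in counts] if total else [0.0] * n_bars
    return counts, proportions


# Made-up sample, for illustration only.
sample = [12, 35, 47, 80, 150, 155, 210, 390, 610, 800]
counts, proportions = histogram_bars(sample, n_bars=8, low=0, high=800)
for i, (c, p) in enumerate(zip(counts, proportions)):
    print(f"{i * 100:>3}-{(i + 1) * 100:<3}  count={c}  proportion={p:.2f}")
```

The proportion per bar is simply the count divided by the number of plotted cases, so the two axes describe the same bars on different scales.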

[Figure: histogram of the variable MIL (0 to 800), with Count on the left axis and Proportion per Bar on the right axis]

QQ-plot

There is no QQ-plot function in Systat.

Random effects model

Random effects are part of the mixed model.

Category

To specify categorical variables, click the Category tab. To activate this tab, first select at least one fixed or random effect other than the intercept in the Model tab. To specify covariance structures for random effects and errors, click the Random tab.

F and R are the matrices of linear weights contrasting the coefficient estimates for fixed and random effects, respectively. You can write your hypothesis in terms of the F and R matrices. We use the data file Heart and the method is the mixed model. The code is as follows:

    USE HEART
    LET Y=HR-RESTHR
    MIXED
    CATEGORY HEIGHT FREQUENCY BLOCK
    MODEL Y = INTERCEPT + HEIGHT*FREQUENCY
    RANDOM BLOCK / STRUCTURE = DIAG
    ESTIMATE

Chi-squared testing

The chi-squared test result can be found in the output. The likelihood-ratio statistic, reported with its degrees of freedom, is the chi-square value.

Sample size calculations

Sample size calculation is not in the student version, but if you purchase the product or download the 30-day trial, it can be found under power analysis: Utilities > Power Analysis
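Because power analysis is not available in the student version, it may help to see the kind of calculation a sample-size tool performs. The Python sketch below uses the textbook normal-approximation formula for comparing two group means with a known common standard deviation, n = 2((z_(1-alpha/2) + z_(power)) * sigma / delta)^2 per group. This is a general approximation, not Systat's procedure; the function name and the example numbers are made up for illustration.

```python
import math
from statistics import NormalDist


def sample_size_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample z-test.

    delta: smallest difference in means worth detecting
    sigma: common standard deviation, assumed known
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * sigma / delta) ** 2
    return math.ceil(n)  # round up to a whole number of cases


# Hypothetical planning values: detect a 5-unit difference when sigma is 10.
print(sample_size_per_group(delta=5.0, sigma=10.0))
```

Halving the detectable difference roughly quadruples the required sample size, which is why the choice of delta matters more than small changes in alpha or power.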

Nonlinear Regression: Estimate Model

To open the Nonlinear Regression Estimate Model dialog box, from the menus choose: Regression > Nonlinear > Estimate Model...

We use the data file pattison and the method is nonlinear regression. The code is as follows:

    USE PATTISON
    NONLIN
    PLENGTH LONG
    MODEL GRASS = p1 + p2*exp(-p3*time)
    ESTIMATE
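To give a feel for what fitting the model GRASS = p1 + p2*exp(-p3*time) involves, here is a minimal Python sketch. It relies on the observation that, for a fixed p3, the model is linear in p1 and p2, so it scans a grid of candidate p3 values and solves an ordinary least squares problem at each one. This is an illustrative shortcut, not the algorithm Systat's NONLIN uses, and the data are synthetic values generated from known parameters rather than the PATTISON file.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, sse)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return a, b, sse


def fit_exponential(time, grass, p3_grid):
    """Fit grass = p1 + p2*exp(-p3*time) by scanning candidate p3 values."""
    best = None
    for p3 in p3_grid:
        x = [math.exp(-p3 * t) for t in time]  # linearizing transform
        p1, p2, sse = fit_line(x, grass)
        if best is None or sse < best[3]:
            best = (p1, p2, p3, sse)
    return best


# Synthetic, noise-free data from p1=1, p2=3, p3=0.5 (illustration only).
time = [1, 2, 3, 4, 6, 8, 10, 12]
grass = [1.0 + 3.0 * math.exp(-0.5 * t) for t in time]
p3_grid = [k / 100 for k in range(1, 201)]  # 0.01, 0.02, ..., 2.00
p1, p2, p3, sse = fit_exponential(time, grass, p3_grid)
print(f"p1={p1:.3f}  p2={p2:.3f}  p3={p3:.2f}  SSE={sse:.2e}")
```

The residual sum of squares (SSE) is the quantity being minimized; with real, noisy data the best grid value would only approximate the least squares estimate.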

Factor Analysis

For factor analysis, from the menus choose: Factor Analysis

We use the data file YOUTH and the method is factor analysis. The code is as follows:

    FACTOR
    USE YOUTH
    MODEL HEIGHT..WIDTH
    ESTIMATE / METHOD=PCA N=305 SORT ROTATE=VARIMAX

Reading in Multiple Data Formats

SYSTAT is very powerful and can open data files saved in the following formats:

    SYSTAT, FASTAT, and MYSTAT (*.SYS, *.SYD, *.SYZ)
    SIGMAPLOT (*.JNB)
    Excel (*.XLS)
    SPSS (*.SAV)
    SAS (*.SD2, *.SAS7BDAT, *.XPT, *.TPT)
    MINITAB (*.MTW)
    STATISTICA (*.STA)
    STATA (*.DTA)
    JMP (*.JMP)
    dBASE (*.DBF)
    ASCII text (*.TXT, *.DAT, *.CSV)
    ArcView (*.SHP)
    LOTUS (*.WK1, *.WK2, *.WKS)
    DIF files (*.DIF)
    STATVIEW (*.SVD)

Note that to import S-PLUS files you must use Systat's command language. As the list shows, Systat can exchange data with almost all the other popular statistical software.

Data manipulation

Data manipulation can be done conveniently with Systat. Here we introduce only some basic manipulations; for more detailed information, see the data document.

Missing Data

In the Data Editor, missing numeric values are indicated by a period, and missing string values are represented by an empty cell. If you add, subtract, multiply, or divide when data are missing, the result is missing. If you sort your cases using a variable with missing values, the cases with values missing on the sort variable are listed first. If you specify conditions and a value is missing, SYSTAT sets the result to missing. To perform an analysis on only those cases with no values missing, use SELECT COMPLETE prior to the analysis. During data entry, missing data appear in the software's default missing-value format.

Transpose

Transpose changes rows to columns and vice versa, transposing cases and variables. To open the Transpose dialog box, from the menus choose: Data > Reshape > Transpose
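As a small illustration of what transposing cases and variables means, the Python sketch below turns a table with one row per case into a table with one row per variable. It is only a conceptual sketch, not how Systat stores or transposes data; the variable names and values are made up, `None` stands in for a missing value, and keeping each variable name as the first entry of its row is a choice made here for readability.

```python
def transpose(variables, cases):
    """Swap cases (rows) and variables (columns) of a rectangular table."""
    columns = list(zip(*cases))  # one tuple of values per original variable
    return [[name, *values] for name, values in zip(variables, columns)]


# Made-up data: two cases measured on three variables.
variables = ["AGE", "HEIGHT", "WEIGHT"]
cases = [
    [23, 170.0, 65.5],
    [31, None, 80.2],  # HEIGHT is missing for the second case
]
for row in transpose(variables, cases):
    print(row)
```

After the transpose, each former variable becomes a row and each former case becomes a column; the missing value simply moves with its cell.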

Use IF...THEN DELETE statements to select a subset of cases and store them in a Systat data file, or use PUT or PRINT to save them in a text file. (You can also save selected cases using the SELECT command with EXTRACT.)

Deleting row or column

For performing row/column operations:

    MDELETE COLUMNS = columnlist    (by row or column number)
    MAT name = mat_name (ref to rows; ref to columns)
    MDELETE ROWS = rowlist
    MSELECT condition
    MSAVE name1 / MAT = name2

Smoothing

To fit a smoother, from the menus choose: Regression > Smooth & Plot (not included in the free student version)

The smoothing function is the method for computing a smoothed estimate over the subset of points lying within the smoothing window. In practice, the most popular functions are the same ones we use for statistical estimators: the mean, the median, a linear or polynomial regression estimate, and so on.
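The idea of a smoothing function applied within a window can be sketched in a few lines of Python. The example below computes a running mean and a running median, using every point whose x value lies within a fixed half-width of the target point. It is a simplified illustration and not Systat's smoother; the function name `window_smooth`, the window definition, and the data are assumptions made for this example.

```python
from statistics import mean, median


def window_smooth(x, y, half_width, estimator=mean):
    """Smooth y at each x using the points with |x_j - x_i| <= half_width."""
    smoothed = []
    for xi in x:
        in_window = [yj for xj, yj in zip(x, y) if abs(xj - xi) <= half_width]
        smoothed.append(estimator(in_window))
    return smoothed


# Made-up data with one outlier at x = 4.
x = [1, 2, 3, 4, 5, 6, 7]
y = [2.0, 2.5, 3.0, 9.0, 4.0, 4.5, 5.0]
print(window_smooth(x, y, half_width=1, estimator=mean))
print(window_smooth(x, y, half_width=1, estimator=median))
```

Comparing the two results shows why the choice of smoothing function matters: the running median is much less affected by the outlier than the running mean.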

Conclusion

The advantage of Systat is the convenience of doing basic statistical analysis, though it can handle more sophisticated tasks as well, for example time series, survival analysis, quality analysis, random sampling, and a lot more. Systat is a very good package for everyday use, including for users without much knowledge of statistics. All the information about statistical concepts can be found in the statistics document that comes with Systat, and examples also come with the manual.