Phys 210 Lecture 5
Gnuplot: Functions, Data Plots, and Data Fits

Today:
Course & Computer Issues
Gnuplot Gotchas and Tips
Defining Functions in Gnuplot
Plotting Data Files
Fitting Data Files
Examining Fit Results: Overlay, Residuals, Pulls

Course Issues?
I put much more stuff into lectures than you could reasonably write as notes for doing the lab. That's because I upload the lecture notes, and expect you to be looking at them in the lab.
Note: you will sometimes need to refer to previous lectures...
I am purposely going fast on Linux & Gnuplot.
I think the important thing is to expose you to these things, not drill you to perfection. Having done it once, you will remember enough to Google for the forgotten details later.
I want to get to Python before Phys 219 needs it (and will slow down once we're there).

Lessons from Lab Exercise 3
Sometimes you need a function that can't be calculated from elementary functions, but you can find the Taylor series for it.
Taylor series have a range of convergence (outside it, they are wrong even with infinite terms), and a practical range (where they are accurate enough for a given application, using a given number of terms).
Taylor series can be very accurate near the expansion point, especially when using many terms. But the accuracy gets worse as you move away from the expansion point, sometimes dramatically worse.
Alternative calculation methods that minimize the worst-case error are frequently used instead.
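A minimal Python sketch of the "practical range" idea, using sin(x) as the test function (the same one used in the T1/T3/T5 example in this lecture). The helper names taylor_sin and taylor_error are made up for this illustration. The sin series converges for every x, so this only shows how the truncated series loses accuracy away from the expansion point, not a finite range of convergence.

```python
import math

def taylor_sin(x, n_terms):
    """Sum the first n_terms nonzero terms of the Taylor series of sin(x) about 0."""
    total = 0.0
    for k in range(n_terms):
        power = 2 * k + 1
        total += (-1) ** k * x ** power / math.factorial(power)
    return total

def taylor_error(x, n_terms):
    """Absolute error of the truncated series, compared with math.sin."""
    return abs(taylor_sin(x, n_terms) - math.sin(x))

# Errors for 1, 2 and 3 terms (T1, T3, T5) at increasing distance from 0.
for x in (0.1, 1.0, 3.0, 6.0):
    errors = ", ".join(f"{taylor_error(x, n):.2e}" for n in (1, 2, 3))
    print(f"x = {x:3.1f}   errors for T1, T3, T5: {errors}")
```

Near x = 0.1 each extra term buys several digits; by x = 6 the three-term sum is off by much more than the size of sin itself.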

Computer Issues
Did you encounter any computer malfunctions?
We think we have fixed things so your files and folders will default to world no-access (so others can't read or alter what you turn in).

Gnome Resets for Freeze, No-Cursor, etc.
Alt-F2 to get the single-command-line box
    r to restart Gnome-shell (fastest, may not be enough)
    gnome-shell -r to replace Gnome-shell (more thorough)
These should leave your working windows alone.
This one will destroy all your working windows:
Ctrl-Alt-F2 to get the text-only Console login screen
Log in with your username and password
killall gnome-session logs out the frozen session
Log in to the GUI screen that appears
Ctrl-Alt-F2 to log out of the Console that's still there!!
Ctrl-Alt-F1 to go back to the GUI

Gnuplot Lecture Gotchas
Cut and pasted commands failed, because the lecture notes used slanted-quotes and Unix wants upright quotes!
The EPS and PNG instructions didn't end with set term x11, which set you up for scary-looking EPS or PNG codes going to the Terminal (educational but frustrating).
I left out the dash in "Times-Roman", which gives a warning in current Gnuplot (but not in the old versions I'm used to...)
The backslash line-wrap was confusing.
I went back and fixed these things.

Gnuplot Keyboard Gotchas
If your graph is the top window, some keys will change it:
Q will close the graph window! (replot will bring it back)
A will autoscale your plot
L will change to log-scale, or undo log-scale
G will add grid lines, or remove grid lines
R will add ruler-lines at mouse (also removes them)
M will show mouse-coordinates at the bottom of the window
(ruler-lines and mouse-coords don't show in PNG, EPS files)
H will show these and other hot-keys in the Terminal window
I don't know a way to shut off this behavior. It wasn't present in the old versions I'm used to, or I would have warned you...

Gnuplot Mouse Gotchas
Mouse-wheel scrolls the graph vertically, and changes yrange
Right-drag draws a zoom-box, left-click zooms in to the box; this can be iterated
P key will undo both of these; replot won't
Middle-click draws a cross with mouse-coords on the graph (replot removes them; they don't appear in PNG, EPS files)
unset mouse turns off these behaviors.
I would have warned you, if these existed in older Gnuplot!

Command Line Editing (Gnuplot and Bash)
Up-cursor retrieves earlier commands one by one
You can just hit return, or edit then return
Left and right cursor move without deleting
Delete deletes one character left of cursor
Ctrl-W deletes the word left of the cursor
Ctrl-D deletes right of cursor (too far can kill your shell!)
Ctrl-A goes to the beginning of line (useful for long lines)
Ctrl-U erases the line no matter where you are in it
Ctrl-R <string> reverse-searches for <string> (useful for finding commands many lines back)

Fixing Complicated Gnuplots
If it was all in a single plot command, up-cursor and edit it.
If the line wraps, press Ctrl-minus in the Terminal to shrink the text, then stretch the window wide.
If you made it by an initial plot followed by many replots,
1. Use Ctrl-R plot to find the first plot before all the replots
2. Edit if necessary, then return to draw the first curve
3. Up-cursor through the replots, edit, and add them in the right order
Or
1. Do show plot, which gives the combination of all the plot and replot commands that went into the plot, like
    last plot command was: plot sin(x) title..., cos(x) title..
2. Drag the mouse through the plot command to copy it, then paste it into a new line
3. Cursor around and fix the (long!) plot command

Fixing Complicated Gnuplots 2
It may be easier to work in a text-editor instead of the Terminal.
1. Make a command-file with save filename.save
2. Open filename.save in an editor like gedit
3. Edit the plot command (at the end) to remove extraneous curves, re-order the curves, change titles, widths, etc.
4. Save the edited file (keep the editor open if you want)
5. Do load filename.save in Gnuplot, and see if it's right yet
6. Repeat the editing, saving, and loading of filename.save until it's right.

Gnuplot and Directories
Gnuplot runs in a working directory.
When you make a .save or .eps or .png file, it will be created in that directory.
When you load a .save file or .load file, Gnuplot expects to find it in the working directory.
You can put a directory-path in front of filenames that are outside the working directory, for both reading and writing.
The Applications menu starts Gnuplot in your home directory. But you can start a Terminal, cd inside it to whatever directory you want, then type gnuplot or gnuplot5 to start it there.
You can also cd inside Gnuplot (more on that later today).

Gnuplot Variables
Gnuplot understands giving a symbolic name to a value.
    myvar = 9.12345678
stores the numeric value in the name myvar. You can then use myvar in expressions to be plotted, instead of typing 9.12345678.
Gnuplot variables can also be complex numbers {1.1, 2.2} or quoted text-strings.
The command show variables or sho var will print a list of all the variables you have defined.

Gnuplot Arithmetic Gotchas
If the value has no decimal point, it is stored as a 32-bit integer.
If the value has a decimal point (like 1. ) it is stored as a 64-bit floating point number.
print <expression> evaluates <expression> and shows the result in the Terminal.
Dividing an integer by an integer gives an integer, with the remainder discarded!
    a = 1; b = 2; print a/b      gives 0!!
But
    aa = 1. ; b = 2 ; print aa/b    gives 0.5
and
    a = 1 ; bb = 2. ; print a/bb    gives 0.5
(integers get "promoted" to floats in mixed expressions)
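The division rule above can be imitated in a few lines of Python. This is only a sketch of the rule as stated on the slide, not Gnuplot's own code: gnuplot_divide is a hypothetical helper, and it assumes negative quotients are truncated toward zero, as in C.

```python
def gnuplot_divide(a, b):
    """Imitate the slide's rule: int/int drops the remainder, anything involving a float is float."""
    if isinstance(a, int) and isinstance(b, int):
        quotient = abs(a) // abs(b)            # integer part only, remainder discarded
        same_sign = (a >= 0) == (b >= 0)
        return quotient if same_sign else -quotient   # assumed: truncate toward zero
    return float(a) / float(b)                 # the integer operand is promoted to float

print(gnuplot_divide(1, 2))      # both integers
print(gnuplot_divide(1.0, 2))    # float / integer
print(gnuplot_divide(1, 2.0))    # integer / float
```

Note that Python's own / operator always returns a float, which is why the integer case has to be spelled out here.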

Gnuplot Arithmetic Gotchas 2
    print 10000 * 10000      (4 zeros) gives 100000000 (8 zeros)
    print 100000 * 100000    (5 zeros) gives 1410065408 (wrong!)
One bit is for the sign, and $2^{31} = 2 \times (2^{10})^3 \approx 2 \times (1000)^3 = 2 \times 10^9$, so $10^{10}$ doesn't fit into 32 bits!
    print 100000 * 100000.   (decimal point added) gives 10000000000.0 (floating-point, and correct)
Integer vs floating-point arithmetic issues come up in every computer language.
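Python integers never overflow, so the wrong answer above has to be reproduced by hand. The sketch below keeps only the low 32 bits of a product and reinterprets them as a signed integer; wrap_int32 is a made-up helper name, and two's-complement wrap-around is assumed.

```python
def wrap_int32(value):
    """Keep the low 32 bits and reinterpret them as a signed two's-complement integer."""
    low_bits = value & 0xFFFFFFFF
    return low_bits - 2**32 if low_bits >= 2**31 else low_bits

print(wrap_int32(10000 * 10000))      # 10**8 is below 2**31, so it survives
print(wrap_int32(100000 * 100000))    # 10**10 is above 2**31, so it wraps
print(100000 * 100000.0)              # floating point keeps the magnitude
```

The wrapped value is 10**10 minus 2 * 2**32, which is where the odd-looking 1410065408 comes from.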

Gnuplot Functions
Gnuplot understands defining a function of a variable.
    myfunc(z) = 4.5*z+45.0
stores the expression 4.5*z+45.0
When myfunc(value) appears, value replaces z.
Then myfunc(0) would evaluate as 45.0 and myfunc(10) would evaluate as 90.0
Function definitions can include built-in functions like sin and sqrt (square root), and even other user-functions.
Function definitions can have more than one argument.
    quad1(a, b, c) = (-b+sqrt(b*b-4.0*a*c))/(2.0*a)

Gnuplot Functions 2
The command show functions or sho fun will print a list of all the functions you have defined.
If you use x or y as argument names, they are just names, and don't necessarily correspond to x or y on a plot. (I tend to avoid using those names, to avoid the confusion.)
Function definitions can have user-variables on the right side.
If an argument name matches a user-variable name, the argument-value is used on the right, not the user-variable value. Use different names to avoid any confusion.
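Python functions follow the same two rules, so a short analogy may help. This is a sketch in Python, not Gnuplot syntax; the names Slope, z, line and myfunc are chosen for the illustration.

```python
Slope = 2.0      # plays the role of a Gnuplot user-variable
z = 100.0        # a user-variable that happens to share its name with an argument

def line(xx):
    """Uses the user-variable Slope; its value is looked up when the function is called."""
    return Slope * xx

def myfunc(z):
    """Inside the body, z is the argument, not the user-variable defined above."""
    return 4.5 * z + 45.0

print(myfunc(0), myfunc(10))   # the argument wins over the variable z = 100.0
print(line(3.0))               # uses Slope = 2.0
Slope = 5.0
print(line(3.0))               # same function, new Slope value
```

The second half is the behavior that fitting relies on: a function written in terms of named parameters gives new values as soon as those parameters change.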

Gnuplot Functions 3
You could have done parts of Lab 3 using functions like this:
    T1(xx) = xx
    T3(xx) = T1(xx) - xx**3/3!
    T5(xx) = T3(xx) + xx**5/5!
    plot sin(x), T1(x), T3(x), T5(x)
    plot T1(x)-sin(x), T3(x)-sin(x), T5(x)-sin(x)
Gnuplot thinks 3! = 6.0, not integer 6, so T3(1) = 0.833333
But if you wrote T3(x) = T1(x) - x**3/6, then T3(1) = 1!! but T3(1.0) = 0.83333 (the floating-point argument 1.0 "promotes" the 6 in the denominator to floating-point).

Plotting Data Files with Gnuplot
Gnuplot can plot text-files containing rows of data, with the same number of columns in each row, like from a spreadsheet.
Any column can be y-values, any column can be x-values.
Any column can be y-errors, any column can be x-errors.
You can apply functions and expressions to columns for these.
The columns don't have to line up perfectly, as long as there is some white space or other separator.
Unused number columns don't matter.
Text columns are sometimes OK, sometimes not.
Blank lines are sometimes ignored, sometimes have meaning.
You can "comment out" lines using #.

Example File test.dat
    # x  y    err1  err2  text      err3
    1    1    1     .1    junk      .2
    2    2.2  1     .1    stuff     .2
    3    3    1     .1    and       .2
    4    4.4  1     .1    nonsense  .2
    5    5    1     .1    Yo        .2
    6    6    1     .1    Lo        .2
The first line starting with # is ignored as a "comment" line.
The "text" column 5 won't be a problem, even for accessing column 6.
The last line has the same number of columns as the others, with the same meaning, so that's OK too.
We'll see what the blank line does.
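A rough Python sketch of how such a file can be read: comment lines are skipped, columns are split on white space, and a blank line ends a block of rows (which is why a line plot breaks there). The sample text is an assumption: the numbers follow my reading of test.dat, and the position of the blank line is made up for the illustration. read_blocks and column are hypothetical helper names.

```python
# Assumed file contents; the blank line after the third row is an arbitrary choice.
SAMPLE = """\
# x y err1 err2 text err3
1 1   1 .1 junk     .2
2 2.2 1 .1 stuff    .2
3 3   1 .1 and      .2

4 4.4 1 .1 nonsense .2
5 5   1 .1 Yo       .2
6 6   1 .1 Lo       .2
"""

def read_blocks(text):
    """Split rows on white space; skip '#' comment lines; a blank line ends the current block."""
    blocks, current = [], []
    for raw_line in text.splitlines():
        stripped = raw_line.strip()
        if stripped.startswith("#"):
            continue
        if not stripped:
            if current:
                blocks.append(current)
                current = []
            continue
        current.append(stripped.split())
    if current:
        blocks.append(current)
    return blocks

def column(block, number):
    """Return one column as floats, counting columns from 1 as 'using' does."""
    return [float(row[number - 1]) for row in block]

blocks = read_blocks(SAMPLE)
for block in blocks:
    print(column(block, 1), column(block, 2), column(block, 6))
```

The text column is never converted to a number here, so it causes no trouble as long as it is not one of the columns asked for.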

Data Plot Commands
The simplest form is
    plot "filename" using xcol:ycol
xcol and ycol are column numbers, starting from 1.
    plot "test.dat" using 1:2
gives this (autoscaled) plot.
Because the values are all "nice," the autoscaling set the x and y limits to exactly the values of the first and last points!
And one "point" is in the key!

Data Plot Commands 2
    plot "test.dat" u 1:2 with lines
connects the dots.
using can be abbreviated to u
Autoscaling of "nice" values is OK for lines.
The line breaks at the blank line in the file.

Data Plot Commands 3
    set xrange [0:7]; set yrange [0:7]
    replot "" u 1:2 pointsize 4
This adds the data points to the line, and at a larger size, but they are in a different color.
Gnuplot remembers a datafile name, and fills in empty quotes (in the key too...)

Data Plot Commands 4
    plot "" u 1:2 ps 4 title "Data"
    repl "" u 1:2 w lines linecolor 1 title ""
ps is the abbreviation for pointsize, w for with.
Plot first with large point symbols, and use a nicer key title.
Add the line, in the same color as the points, and suppress its key title.

Data Plot Commands 5
    plot "" u 1:2 w linespoints ps 4 ti "Data"
linespoints does lines and points at the same time. It could be shortened to lp.
ti is the abbreviation for title.

Data Plot Commands 6
    plot "" using 1:2:3 with yerrorbars
Plot error bars in y from column 3
yerrorbars can be abbreviated to yerr or err

Data Plot Commands 7
    plot "" u 1:2:4:3 w xyerrorbars
x error bars are from column 4
y error bars are from column 3
xyerrorbars can be abbreviated to xyerr

Data Plot Commands 8
    replot "" u 1:2 smooth csplines title "splines"
This adds curves to connect the points (different color this time).
Cubic splines are smooth curves that go exactly through all the data.

Fitting Physics Data with Gnuplot
1. Data file with x values, y values, and y errors in columns.
2. User function for y(x) involving parameters to be fitted
3. Initial guesses for the parameter values
4. Plot data with errors, superpose function with parameters
5. Adjust guesses and replot until function is close to data
6. Do the fit command
7. Replot with final fitted parameters as first sanity check
8. Plot residuals (data minus fit) with errors as better check
9. Plot pulls (data - fit)/error as even better check

    Line(xx) = Slope * xx + Intercept
    Slope = 0.5; Intercept = 2.0
    plot "test.dat" using 1:2:6 with yerr, Line(x)

Gnuplot 4 Fit Command Syntax
    fit func(x) "file" using xcol:ycol:errcol via par1, par2,...
The func name can be anything, and it can be defined with any name for the argument, but the argument here must be (x).
The "file" name must be in quotes.
xcol:ycol:errcol are usually column numbers from the file.
Unlike the plot command, you don't add a with clause for the error bars.

    fit Line(x) "test.dat" u 1:2:6 via Slope, Intercept
    replot
Function goes thru the errorbar for 5 out of 6 points. We expect ~2/3, so it's a little too good...

Fit Results in Terminal Window
    After 4 iterations the fit converged.
    final sum of squares of residuals : 3.48571
    rel. change during last iteration : -2.4019e-09

    degrees of freedom    (FIT_NDF)                        : 4
    rms of residuals      (FIT_STDFIT) = sqrt(wssr/ndf)    : 0.933503
    variance of residuals (reduced chisquare) = WSSR/ndf   : 0.871429

    Final set of parameters            Asymptotic Standard Error
    =======================            ==========================
    Slope           = 0.994286         +/- 0.04463      (4.489%)
    Intercept       = 0.12             +/- 0.1738       (144.8%)

    correlation matrix of the fit parameters:
                    Slope  Interc
    Slope           1.000
    Intercept      -0.899  1.000
Reduced chisquare line: Chisquare / ndf ~ 1, so a sensible fit
Final set of parameters: Parameters and Errors
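For a straight line, the weighted least-squares answer has a closed form, so the numbers in this output can be cross-checked without Gnuplot. The sketch below is that closed form, not Gnuplot's iterative fitting algorithm. The data values are an assumption (my reading of columns 1, 2 and 6 of test.dat), and weighted_line_fit and weighted_ssr are made-up helper names.

```python
import math

# Assumed contents of test.dat: columns 1, 2 and 6 (x, y, error on y).
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
Y = [1.0, 2.2, 3.0, 4.4, 5.0, 6.0]
ERR = [0.2, 0.2, 0.2, 0.2, 0.2, 0.2]

def weighted_line_fit(x, y, err):
    """Closed-form weighted least squares for y = slope*x + intercept, weights 1/err**2."""
    w = [1.0 / e ** 2 for e in err]
    s = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    delta = s * sxx - sx ** 2
    slope = (s * sxy - sx * sy) / delta
    intercept = (sxx * sy - sx * sxy) / delta
    return slope, intercept

def weighted_ssr(x, y, err, slope, intercept):
    """Weighted sum of squared residuals (the chisquare)."""
    return sum(((yi - (slope * xi + intercept)) / ei) ** 2
               for xi, yi, ei in zip(x, y, err))

slope, intercept = weighted_line_fit(X, Y, ERR)
chisq = weighted_ssr(X, Y, ERR, slope, intercept)
ndf = len(X) - 2                       # 6 points minus 2 fitted parameters
print(f"Slope = {slope:.6f}  Intercept = {intercept:.6f}")
print(f"chisquare = {chisq:.5f}  ndf = {ndf}")
print(f"reduced chisquare = {chisq / ndf:.6f}  rms = {math.sqrt(chisq / ndf):.6f}")
```

With these inputs the slope, intercept, sum of squares, reduced chisquare and rms all agree with the Terminal output above to the digits shown.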

Data Column Arithmetic & Residuals Plot
Parentheses mean do a calculation instead of using raw file data.
$n means interpret n as a column number, not the number n.
To calculate the residual (data minus fit), replace ycol with ($ycol - func($xcol)) in the plot command.
Residuals should be near zero, so autoscale in y.
The x scale should be the same as for the data and fit plot.
Don't superpose the fit curve anymore. Just turn on the grid, or plot the function "0".

Residuals Plot
    set autoscale y; set grid
    plot "" u 1:($2-Line($1)):6 with yerr, 0
If the function is steep or highly curved, it's hard to see if the fit is close to all points.
Subtracting the fit from the data flattens things so it's easier to see.

Pulls Plot
Divide the residual by the error; plot (1) as the new "error"
    plot "" u 1:(($2-Line($1))/$6):(1) with yerr, 0
If errors are very different for different points, dividing by the errors makes it easier to see if the fit is really consistent with all the data.
Here it just re-scales the plot.
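The same column arithmetic, written out in Python as a sketch. The data values are assumed (my reading of test.dat), and SLOPE and INTERCEPT are the rounded fitted values quoted in the fit results; none of the names come from Gnuplot.

```python
# Assumed contents of test.dat: columns 1, 2 and 6 (x, y, error on y).
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
Y = [1.0, 2.2, 3.0, 4.4, 5.0, 6.0]
ERR = [0.2, 0.2, 0.2, 0.2, 0.2, 0.2]
SLOPE, INTERCEPT = 0.994286, 0.12     # rounded values from the fit output

def line(xx):
    """The fitted straight line."""
    return SLOPE * xx + INTERCEPT

residuals = [y - line(x) for x, y in zip(X, Y)]        # like ($2-Line($1))
pulls = [r / e for r, e in zip(residuals, ERR)]        # like (($2-Line($1))/$6)
inside = sum(1 for p in pulls if abs(p) <= 1.0)        # points within one error bar

for x, r, p in zip(X, residuals, pulls):
    print(f"x = {x:.0f}  residual = {r:+.4f}  pull = {p:+.3f}")
print(f"{inside} of {len(pulls)} points lie within one error bar of the fit")
```

A pull is just the residual measured in units of its own error bar, so counting pulls between -1 and +1 is the numerical version of counting error bars that the curve passes through.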

Gnuplot Errors Without Data Errors
Usually we are taught that parameter errors are undefined for "least-squares" fits where we don't assign errors to the data.
But it's easy to calculate the RMS deviation of the data from the fit, and you can use that for the data errors. That lets you calculate parameter errors without being given data errors. And it's a sensible thing to do, if you believe that the errors are about the same for all the data points.
Gnuplot 4 happily does that if you do a fit without giving it an error column.

Gnuplot 4 Errors are Non-Standard
In Gnuplot 4, if the value in the error column is the same for all data points, the parameter values and errors are independent of the (common) value in that column! They are exactly the same as a fit using no error column at all!
That's because Gnuplot 4 doesn't report what everyone else in the world calls the parameter error! Instead, it reports the parameter error times the square root of the chisquare per degree of freedom!
Fitting a line to two points with errors should give a slope with an error. But the chisquare is zero, so Gnuplot 4 says the slope has zero error "by definition." That definition is wrong!
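A short Python check of this claim, as a sketch under assumptions: the data values are my reading of test.dat, every point gets the same error sigma, and fit_line is a made-up helper using the closed-form straight-line solution rather than Gnuplot's own algorithm. "True" errors come from ordinary error propagation; "scaled" errors are those multiplied by sqrt(chisquare/ndf).

```python
import math

# Assumed contents of test.dat, columns 1 and 2.
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
Y = [1.0, 2.2, 3.0, 4.4, 5.0, 6.0]

def fit_line(x, y, sigma):
    """Straight-line fit with the same error sigma on every point."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    delta = n * sxx - sx ** 2
    slope = (n * sxy - sx * sy) / delta
    intercept = (sxx * sy - sx * sxy) / delta
    chisq = sum(((yi - slope * xi - intercept) / sigma) ** 2 for xi, yi in zip(x, y))
    scale = math.sqrt(chisq / (n - 2))              # sqrt of the reduced chisquare
    true_slope_err = sigma * math.sqrt(n / delta)
    true_intercept_err = sigma * math.sqrt(sxx / delta)
    return {
        "slope": slope,
        "intercept": intercept,
        "true_slope_err": true_slope_err,
        "true_intercept_err": true_intercept_err,
        "scaled_slope_err": true_slope_err * scale,
        "scaled_intercept_err": true_intercept_err * scale,
    }

for sigma in (0.2, 0.5):
    r = fit_line(X, Y, sigma)
    print(f"sigma = {sigma}: true slope error = {r['true_slope_err']:.5f}, "
          f"scaled slope error = {r['scaled_slope_err']:.5f}")
```

Changing sigma changes the true errors in proportion, but the scaled errors stay put, because the sigma in the error and the 1/sigma in sqrt(chisquare) cancel.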

Gnuplot 5 Fit Differences
New command
    set fit noerrorscaling
(do it before the fit) makes Gnuplot report the true errors!
Preferred fit syntax adds yerr to the fit command:
    fit func(x) "file" using xcol:ycol:errcol yerr via par1, par2,...
The Gnuplot 4 fit syntax is still accepted, but with a warning.
The new syntax is more similar to a plot command:
    plot func(x), "file" using xcol:ycol:errcol with yerr
Also compact display of iteration progress.
Also calculation of "chisquare probability".

Gnuplot 5 Fit Results
    gnuplot> fit Line(x) "test.dat" u 1:2:6 yerr via Slope, Intercept
    iter      chisq       delta/lim  lambda    Slope         Intercept
       0 1.1375000000e+02   0.00e+00  9.87e+00  5.000000e-01  2.000000e+00
       1 4.6087163970e+01  -1.47e+05  9.87e-01  6.893990e-01  1.300630e+00
       2 3.4968958695e+00  -1.22e+06  9.87e-02  9.893522e-01  1.391525e-01
       3 3.4857142860e+00  -3.21e+02  9.87e-03  9.942849e-01  1.200032e-01
       4 3.4857142857e+00  -8.70e-06  9.87e-04  9.942857e-01  1.200000e-01
    iter      chisq       delta/lim  lambda    Slope         Intercept

    After 4 iterations the fit converged.
    final sum of squares of residuals : 3.48571
    rel. change during last iteration : -8.69636e-11

    degrees of freedom    (FIT_NDF)                        : 4
    rms of residuals      (FIT_STDFIT) = sqrt(wssr/ndf)    : 0.933503
    variance of residuals (reduced chisquare) = WSSR/ndf   : 0.871429
    p-value of the Chisq distribution (FIT_P)              : 0.480054

    Final set of parameters            Standard Deviation
    =======================            ==========================
    Slope           = 0.994286         +/- 0.04781      (4.808%)
    Intercept       = 0.12             +/- 0.1862       (155.2%)

    correlation matrix of the fit parameters:
                    Slope  Interc
    Slope           1.000
    Intercept      -0.899  1.000
Iteration table: Iteration Progress
p-value line: Probability of worse chisquare due to chance
Final set of parameters: Same parameter values, slightly different errors
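The "probability of worse chisquare" can be checked by hand when the number of degrees of freedom is even, because the tail probability then reduces to a short finite sum. The sketch below uses that closed form; chisq_p_value is a made-up name, and it deliberately refuses odd ndf, which would need the incomplete gamma function.

```python
import math

def chisq_p_value(chisq, ndf):
    """Probability of a chisquare at least this large, for positive even ndf only."""
    if ndf <= 0 or ndf % 2 != 0:
        raise ValueError("this closed form only covers positive even ndf")
    half = chisq / 2.0
    # exp(-chisq/2) times the first ndf/2 terms of the series for exp(+chisq/2)
    series = sum(half ** k / math.factorial(k) for k in range(ndf // 2))
    return math.exp(-half) * series

p = chisq_p_value(3.4857142857, 4)     # chisquare and ndf from the output above
print(f"p-value = {p:.6f}")
```

A value near 0.5 says that about half of all repeat experiments with correct errors would give a larger chisquare than this one, which is what a sensible fit looks like.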

For Next Time
See you in the lab
Extra Office Hours: Pit Pub, 3-5 PM Friday
I buy the first pitcher...
Next week, we will start Python