An Introduction to Git Version Control for SAS Programmers




Stephen Philp, Pelican Programming, Redondo Beach, CA

ABSTRACT

Traditionally, version control has been in the domain of the enterprise: either your company provides it as part of your development environment or they don't. Additionally, version control systems were often viewed as cumbersome, complex and difficult to work with. With Git, version control is simplified and incorporated into your development process. This lets both individual SAS developers and entire teams work faster and smarter. This paper will take some of the best practices of open source developers and apply them to individual SAS developers. It will then show how those best practices can scale to teams and remote developers.

INTRODUCTION

Git is a free and open source distributed version control system that is unobtrusive and fits in nicely with the average SAS programmer's workflow. This paper will be useful for SAS programmers who are either interested in getting started with version control, or are currently using version control but would like to see how Git works. It walks you through the steps needed to get started using Git version control.

VERSION CONTROL FOR PROGRAMMERS

Why Version Control?

Developers use version control to manage change. Version control allows you to see all the changes that have been made to your source code as well as who made those changes and when. Additionally, it allows you to rewind to any point in history and restore the code to the state it was in at that point in time. This lets you reproduce outputs generated by your code from any time in the past. It also lets you compare code to its previous implementation to help you understand why the code was changed.

Why Git?

Out of all the goodness that source control brings to your development environment, three main functions stand out:

- Reduce the risks that code refactoring can introduce.
- Allow you to view and work with code as it was at any time in its history.
- Track who made what changes and when.

Git makes these operations trivially easy. Additionally, it uses hashing algorithms to ensure the history of a project is not changed or corrupted.

History of Git

Git was created in the spring of 2005 by none other than Linus Torvalds. Torvalds had several design goals in mind when he started creating it, namely:

- Support a distributed workflow.
- Be very fast and scalable.
- Have strong safeguards that make it very difficult to corrupt the history of a project.

To support a distributed workflow, Git allows each developer to always have the entire history of a project within their own local repository. Files are not checked in or out of a central repository. Instead, each developer always has a full-fledged working copy.

Who uses Git?

According to the Eclipse Foundation's annual community survey, over 27% of professional software developers report using Git as their primary source control system [1]. A quick search on Dice.com shows 974 jobs when searching with the keyword Git. To put that within a frame of reference, I also searched some other technology keywords:

Technology Keyword    Number of Jobs Listed on Dice.com
SAS                    1621
Ruby                   2226
Perl                   5130
Bash                    944
Java                  17696
Svn                    1092

In its early years Git suffered from criticism that it was difficult to use. In fact there is still a lot of documentation online that would make the strongest stomach churn with fear (check out the man pages, my eyes! they bleed!). Luckily, in the last few years a lot of progress has been made to make the project accessible to us mere mortals.

Git Architecture

To understand how Git works you have to understand how data flows between three data structures: your repository, the index and your working tree.

The working tree is the set of files and directories on your computer that you normally work with.

The index, often called the staging area, is an intermediary place where you stage changes that you would like committed to the repository. Creating a commit records the staged changes as a new commit, leaving nothing pending in the index.

The repository is the database that contains the full history of your project. Once things are committed to the repository they become part of the project's history and should not be changed. Each new commit depends on the hash signature of all the previous commits to maintain historical integrity.

Typical Workflow

In a typical workflow you edit files in your working tree. You then stage those changes to the index. After you have staged the changes to the index, you can commit them to the repository.

It may all sound like additional work at first, especially to programmers who generally write programs on their own or in small teams. But version control, especially source code control, is one of those things that's not really needed until it's REALLY needed. And I have found it actually improves my workflow and coding habits. Some examples of how it has improved my coding habits:

- Code ownership. Changes are tagged with my name. Forever.
- Encourages smaller source code files. Logic within DATA steps can't be abstracted away into functions as in other languages, but often steps can be logically separated into files.
- Teaches a change-one-thing-at-a-time mentality.
- Helps me think of my code as having a life span and not as a one-off.
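The Git Architecture claim that each commit depends on the commits before it can be observed directly once Git is installed. The following is a minimal sketch, assuming a POSIX shell (Git Bash on Windows) and a throwaway directory; the file name notes.txt and the identity values are placeholders, not part of the paper's example project. It relies on `git cat-file -p`, which prints a raw commit object, including a `parent` line that names the previous commit's hash.

```shell
#!/bin/sh
# Sketch: show that a commit records the hash of its parent commit.
set -e

repo=$(mktemp -d)                      # throwaway directory (placeholder location)
cd "$repo"
git init -q
git config user.name "Example User"    # placeholder identity, local to this repository
git config user.email "user@example.com"

echo "first" > notes.txt
git add notes.txt
git commit -q -m "First commit"
first=$(git rev-parse HEAD)            # hash of the first commit

echo "second" >> notes.txt
git commit -q -a -m "Second commit"

# The raw commit object lists its tree, its parent, the author and the message.
git cat-file -p HEAD

# Extract the parent hash recorded inside the second commit.
parent=$(git cat-file -p HEAD | sed -n 's/^parent //p')
echo "parent recorded in second commit: $parent"
echo "hash of first commit:             $first"
```

Because the parent hash is part of the data that gets hashed, altering an earlier commit would change the hash of every commit after it, which is how tampering with history becomes visible.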

DIVING IN

Now that we have talked about Git and some of its benefits, we are going to jump straight into downloading, installing and creating our first project with it. Our example project will be to write SAS code that audits a SAS data set by taking some measurements of it (number of observations, number of variables). It will be named auditor.sas. It is just being used as an example and does not have anything to do with Git itself.

Set Up Git

Many Linux systems have Git installed. If your operating system does not, you can download it: http://git-scm.com/

Git was born on and is native to Linux, but works on Windows and Mac OS X. There are a few small differences between the Windows version and the *nix version, but for the most part everything works the same for both. Since a lot of SAS development is still done on Windows, I will work through examples from that operating system.

After you have downloaded the Git executable, run it.

Display 1. Installing Git on Windows

After it is installed you should see Git under your Start/Program Files. There will be two programs: Git Bash and Git GUI. Running Git Bash should bring up a window showing it is installed:

Display 2. Git Bash Is Running

Once Git is installed, you should configure a few global variables.

    prompt> git config --global user.name "Stephen Philp"
    prompt> git config --global user.email stephen@stephenphilp.com

They tell Git who you are and how you can be contacted. Another nice global variable to set is color.ui, which tells Git to use color-coded output.

    prompt> git config --global color.ui auto

Creating Your First Repository

It is very simple to create a repository in Git. First create the directory that your project will live under, then change to that directory and run git init:

    prompt> mkdir auditor
    prompt> cd auditor
    prompt> git init

Display 3. Creating A Git Repository

And that's it. You have created a Git repository. The git init command creates a directory named .git within the auditor directory (your working tree). This .git directory stores all the repository metadata and contains your entire repository.

Adding To Your Repository

Now that we have an empty repository, it's time to add some code to it. The requirements for our make-believe example are to track whether the data set exists, when it was modified, how many observations it has and how many variables it has. To get started, I fire up my SAS editor and create a file named auditor.sas that looks like this:

    /*
    auditor.sas
    This code will audit a SAS data set.
    The measurements we will be taking are:
        number of observations
        number of variables
        modification date
    This code is example code and comes with no warranties :)
    */

That's all we need to get started. Next we need to tell Git to track it. First we add it to the staging area with the git add command.

    prompt> git add auditor.sas

Running this command does not produce any output, so it can be nice to have Git tell us what its status is:

    prompt> git status

Git tells us that it has staged the file and the index has changes ready to be committed to the repository.

Display 4. Staged Changes Ready To Be Committed

Now we can commit the change to the repository with git commit.

    prompt> git commit -m "Add audit.sas"

Display 5. Our First Commit

The -m flag tells git commit to add the message "Add audit.sas" to the commit. Commits are the chunks of history added to the repository. The message lets you know why that chunk was added and what it does.
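The add, status and commit steps above can be replayed end to end. The sketch below assumes a POSIX shell (Git Bash on Windows) and a throwaway directory; the identity values are placeholders set only for that repository. It uses `git status --short`, a compact form of the status output in which `??` marks an untracked file and `A` marks a newly staged one.

```shell
#!/bin/sh
# Sketch: follow one file from working tree to index to repository.
set -e

repo=$(mktemp -d)                      # throwaway directory (placeholder location)
cd "$repo"
git init -q
git config user.name "Example User"    # placeholder identity
git config user.email "user@example.com"

echo "/* auditor.sas: audits a SAS data set */" > auditor.sas

before=$(git status --short)           # untracked: exists only in the working tree
git add auditor.sas
staged=$(git status --short)           # staged: recorded in the index
git commit -q -m "Add auditor.sas"
after=$(git status --short)            # empty: nothing left to stage or commit

echo "before add:   $before"
echo "after add:    $staged"
echo "after commit: [$after]"
git log --oneline                      # one line per commit: short hash and message
```

Quoting the commit message matters: without the quotes the shell would pass each word to git commit as a separate argument.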

It is important to write informative commit messages because they help you understand your project's history as it grows. Tips on writing good commit messages can be found here: http://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html

Now that we have added some history to the repository we can see our commit with git log:

Display 6. Our Project's History

The first thing you may notice is that commits are named using a hash. Each new commit hash is generated from all of the history in the repository. This lets Git track changes in a distributed environment, and it ensures that future commits reflect any changes that have been made to a project's history. You can also see that Git tagged me as the author, date stamped the commit and displays the commit message.

When we run git status we see that the index/staging area has been cleared by the last commit.

Display 7. Index Cleared After Commit

Working With Changes

Now that we have a repository with a file that we are tracking, it's time to make some changes. The first thing I am going to do is create a test data set that we can use to write our code against. It will live in a directory named test.

    prompt> mkdir test

I'm also going to create a readme.txt file in the test directory to explain the contents of the directory.

    prompt> echo This directory contains data to test against > test/readme.txt

I will also add some code to auditor.sas to create the test data set.

    /*
    auditor.sas
    This code will audit a SAS data set.
    The measurements we will be taking are:
        number of observations
        number of variables
        modification date
    This code is example code and comes with no warranties :)
    */

    %macro create_test_data;
      libname test "c:\data\wuss2012\git\projects\auditor\test\";
      data test.test_data;
        do i = 1 to 10;
          output;
        end;
      run;
    %mend create_test_data;
    %create_test_data;

After I submit that code I have a data set named test_data in the test directory. We have made three changes to our working tree:

- changed our code in auditor.sas
- created a new file named readme.txt
- created a new file named test_data.sas7bdat

Running git status shows that Git sees the changes in our working tree.

Display 8. Working Tree Has Changes

Git sees that we have changes in our working tree, but they have not been added to its index/staging area to be committed. Before we add our changes to the staging area, we need to pause a moment and think about the types of files we just created, specifically the SAS data set. SAS data sets are binary files. In general, it does not make sense to track changes to binary files, and Git in particular is not a very good tool for tracking binary files. So we will tell Git to ignore SAS data sets by creating a .gitignore file.

At the root directory of your project create a file called .gitignore. Inside that file you can specify exactly which files to ignore or use globbing to specify which types to ignore. We will add *.sas7bdat and *.log since we don't want to track log files either.

Listing of .gitignore file:

    *.sas7bdat
    *.log

Then when we run git status we see that Git notices auditor.sas has been modified and additionally there is a new file called .gitignore.

Display 9. Git Status Shows Our Changes

Now we have new files and a modified file. Our currently tracked file auditor.sas has been modified, and there are new files that haven't been tracked yet. You add both of these changes to the staging area with git add. This was a point of confusion for me when I was first working with Git. Why would I add the file auditor.sas when it's already being tracked in the repository? I did not realize that I was not adding files to the repository. I was adding the changes to the files into the index/staging area.

This may seem like an unnecessary extra step, but it is actually quite useful. It allows you to craft exactly what you want in your commit. It is bad practice to make commits that encompass more than one type of change. Consider our current set of changes: we added a .gitignore file and we also created test code and data. Instead of wrapping those two things into one commit like "Added gitignore file and created test data", it is better to create two separate commits.

First I will tell Git to add the .gitignore file to its index.

    prompt> git add .gitignore

git status shows there are changes in the index ready to be committed to the repository, and there are still changes not staged (the untracked files in test/).

Display 10. Crafting A Commit With Specific Changes

Now I can commit the staged changes that include only the .gitignore file:

    prompt> git commit -m "Add gitignore"

Running git status now shows that we have two changes in our working tree ready to be staged: the modified auditor.sas file and the new test directory that holds the readme.txt file. These are both changes that have to do with creating test data, so I can put them into one commit.

Display 10. Crafting A Commit With Specific Changes

    prompt> git add auditor.sas
    prompt> git add test

After both files are added to the index/staging area we can commit our changes to the repository:

    prompt> git commit -m "Add test data"
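To confirm that the ignore patterns behave as intended before committing, `git check-ignore -v` reports which .gitignore line matched a given path. The sketch below is one way to try this in a throwaway directory, assuming a POSIX shell; the identity values and the stand-in file contents are placeholders, and a small text file plays the role of the binary data set. It then makes the two separate commits described above.

```shell
#!/bin/sh
# Sketch: verify .gitignore patterns, then stage changes as two separate commits.
set -e

repo=$(mktemp -d)                      # throwaway directory (placeholder location)
cd "$repo"
git init -q
git config user.name "Example User"    # placeholder identity
git config user.email "user@example.com"

mkdir test
echo "placeholder" > test/test_data.sas7bdat   # stands in for a real binary data set
echo "NOTE: example log" > auditor.log
echo "This directory contains data to test against" > test/readme.txt
printf '*.sas7bdat\n*.log\n' > .gitignore

# Report the .gitignore file, line number and pattern that matched each path.
git check-ignore -v test/test_data.sas7bdat auditor.log

# Commit 1: only the ignore rules.
git add .gitignore
git commit -q -m "Add gitignore"

# Commit 2: only the test directory; the ignored data set inside it is skipped.
git add test
git commit -q -m "Add test data"

tracked=$(git ls-files)                # paths now stored in the repository
echo "$tracked"
```

Only .gitignore and test/readme.txt end up tracked; the data set and the log stay on disk but never enter the history.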

Now we can see a list of all our history with git log:

Display 11. Git Log Shows the History of Our Project

After creating our test data, the next step is to write the code that gathers the measurements. So I add this DATA step to the code in auditor.sas:

    * gather the measurements for our data set;
    libname target "c:\data\wuss2012\git\projects\auditor\test\";
    data measurements;
      dsid = open('target.test_data');
      nobs = attrn(dsid,'nobs');
      nvars = attrn(dsid,'nvars');
      modte = datepart( attrn(dsid,'modte') ); * no need for the time;
      rc = close(dsid);
      audit_date = today();
      drop rc dsid;
    run;

After I save the file I can type git status and I see that Git knows there are changes in my working tree. But let's pretend that something else took priority for a few days and we can't remember what changes have been made to the code. When I finally come back to the code, I'd like to see the differences between what's in my working tree and what's stored in the repository:

    prompt> git diff HEAD

HEAD is a quick way to refer to the most recent commit to the branch you are in (more about branches in a second, but for the time being the branch we are in is called master). As you can see, the changes to auditor.sas are presented to us. Lines with a minus (-) are removed or replaced with the lines with a plus (+).

Display 12. Diff Between Working Tree And Repository

Those changes look good to me so I will commit them. Instead of typing git add auditor.sas I can use a shortcut and specify the -a flag (short for --all, which stages every modified file that Git already tracks) as well as the message flag on my commit command.

    prompt> git commit -am "Create measurements data set"

At this point we have reached a milestone in that our code is finished and works the way we want. A good way to mark a milestone is to create a tag in the repository. A tag sets a marker in your project's history so you can find it and refer to it easily. This is the first release of our code so we can tag it as 1.0.

    prompt> git tag 1.0

To see which tags are in your project's history use the command git tag by itself.

Summary

That covers the basic workflow of creating a repository, adding files to it, changing the working tree and committing the changes to the repository. Commands used:

- git init
- git add
- git status
- git commit
- git log
- git tag

This has only scratched the surface though. One of the things that make Git such a productivity enhancer is branching.
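The difference between git diff, git diff --cached and git diff HEAD is easiest to see when one change is staged and another is not. The following sketch sets up that situation in a throwaway directory, assuming a POSIX shell; the file contents and identity values are placeholders. It uses the `--numstat` option only to keep the output compact (added lines, removed lines, file name), then tags the result as in the text.

```shell
#!/bin/sh
# Sketch: compare the three diff views, then tag a milestone.
set -e

repo=$(mktemp -d)                      # throwaway directory (placeholder location)
cd "$repo"
git init -q
git config user.name "Example User"    # placeholder identity
git config user.email "user@example.com"

echo "line one" > auditor.sas
git add auditor.sas
git commit -q -m "Add auditor.sas"

echo "line two" >> auditor.sas         # change A
git add auditor.sas                    # change A is now staged
echo "line three" >> auditor.sas       # change B stays unstaged

# Count added lines in each view (first column of --numstat).
unstaged=$(git diff --numstat | cut -f1)          # working tree vs. index: B only
staged=$(git diff --cached --numstat | cut -f1)   # index vs. last commit: A only
both=$(git diff --numstat HEAD | cut -f1)         # working tree vs. last commit: A and B
echo "unstaged=$unstaged staged=$staged both=$both"

git commit -q -a -m "Extend auditor.sas"
git tag 1.0                            # lightweight tag on the current commit
tags=$(git tag)                        # list the tags
echo "tags: $tags"
```

A tag created this way is simply a name for one commit, so `git diff 1.0` can later show everything that has changed since the release.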

BRANCHING

Looking back at the work we have done so far, you can see that Git is using a branch called master. One of the most useful parts of Git is that it lets you create branches which you can treat as code scratchpads without worrying about messing up working code that is currently being used.

Let's set up a typical scenario to show how Git branches can make life easier. The code we wrote works. It creates a data set called measurements. It's currently being run in a production environment every morning. It has made everyone happy. That is, until one night you wake up in a cold sweat with the realization that your code is missing one important piece: it doesn't check if the data set exists before gathering measurements! Luckily it hasn't caused a problem, but it needs to be fixed right away. And the fix should be pretty quick to add to the code.

Then you look at your code and realize that checking for the existence of the data set isn't as trivial as you originally thought it would be.

    data measurements;
      dsid = open('target.test_data');
      nobs = attrn(dsid,'nobs');
      nvars = attrn(dsid,'nvars');
      modte = datepart( attrn(dsid,'modte') ); * no need for the time;
      rc = close(dsid);
      audit_date = today();
      drop rc dsid;
    run;

At first you think you can just add one line of logic to the DATA step:

    if not dsid then stop;

But that still creates the measurements data set with no observations. Ideally we don't want to create a measurements data set if the data set we are measuring does not exist. Like every good SAS programmer, we puzzle on it just long enough to realize the SAS macro language is the answer. However, that will involve a pretty significant rewrite. Normally big structural code changes like that can be risky. But we are not afraid. We have Git. And Git has branches.

Everything in Git is treated as a branch. The code we have been working on is in a branch called master. You can see all the branches in your repository with the git branch command:

    prompt> git branch
    * master

As you can see, there is one default branch called master. The asterisk tells us it is the currently selected branch.

Since everything in Git is treated as a branch, branches are very quick and easy for us to work with. A branch is just a pointer to a commit, and each commit links to its parent commit. This keeps a branch from consuming much disk space and allows Git to reconstruct the entire history of any branch by walking the links back through the parent commits.

There are no hard and fast rules on when to create a new branch. But here are some examples of when creating a branch is useful:

- When you want to do a pretty serious rewrite (like our current example).
- When testing out a new way to solve a problem.
- For A/B style code comparisons.

To create a branch you use the git branch command and the name of your new branch. We will call the new branch ds_exist.

    prompt> git branch ds_exist

This only creates a new branch; it doesn't check it out into your working tree.

    prompt> git branch
      ds_exist
    * master

Now we have two branches, but the asterisk next to master tells us it is the one currently checked out. To start working with our new branch, we check it out.

    prompt> git checkout ds_exist
    Switched to branch 'ds_exist'

Now we can work on the code in our ds_exist branch with confidence, knowing we can always go back to the master branch if things go badly. After some hacking and editing we have a new version of the code:

    %macro create_measurements;
      * gather the measurements for our data set;
      libname target "c:\data\wuss2012\git\projects\auditor\test\";
      %let exist = %sysfunc( exist(target.test_data) );
      %if not &exist %then %goto term;
      data measurements;
        dsid = open('target.test_data');
        nobs = attrn(dsid,'nobs');
        nvars = attrn(dsid,'nvars');
        modte = datepart( attrn(dsid,'modte') ); * no need for the time;
        rc = close(dsid);
        audit_date = today();
        drop rc dsid;
      run;
      %return;
      %term:
      %put the data set does not exist;
    %mend create_measurements;
    %create_measurements;

Now we can stage and commit these changes to our branch:

    prompt> git commit -am "Skip logic if data set does not exist"

Display 13. A Commit to the New Branch

At this point, just to do a sanity check, you can check out master and then reload auditor.sas to see that it has not changed within branch master. Then check out ds_exist and reload auditor.sas to see that those changes are still in the ds_exist branch. Notice that branches allow you to have two (or more) versions of code within your one directory, and you can flip back and forth with one command: git checkout <branch name>.
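The sanity check described above can be scripted. This sketch assumes a POSIX shell and a throwaway directory; the one-line file contents are placeholders standing in for the two versions of auditor.sas, and the identity values are likewise hypothetical. Because newer Git installations may name the initial branch main rather than master, the script reads the name instead of assuming it.

```shell
#!/bin/sh
# Sketch: two versions of one file living in two branches of one directory.
set -e

repo=$(mktemp -d)                      # throwaway directory (placeholder location)
cd "$repo"
git init -q
git config user.name "Example User"    # placeholder identity
git config user.email "user@example.com"

echo "version without existence check" > auditor.sas
git add auditor.sas
git commit -q -m "Add auditor.sas"

# "master" in the paper; some installations use "main" instead.
base=$(git rev-parse --abbrev-ref HEAD)

git branch ds_exist                    # create the branch
git checkout -q ds_exist               # switch the working tree to it
echo "version with existence check" > auditor.sas
git commit -q -a -m "Skip logic if data set does not exist"
on_branch=$(cat auditor.sas)

git checkout -q "$base"                # flip back: the file on disk reverts
on_base=$(cat auditor.sas)

echo "$base: $on_base"
echo "ds_exist: $on_branch"
git branch                             # the asterisk marks the checked-out branch
```

The two steps `git branch ds_exist` and `git checkout ds_exist` can also be combined as `git checkout -b ds_exist`, which the cheat sheet lists.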

Merging Branches

Now that we are satisfied with the changes to the code in the ds_exist branch we can think about merging those changes back to the master branch. There are a couple of different ways to merge branches, but we will go with the straightforwardly named straight merge. A straight merge takes all the commits and history of one branch and attempts to merge them into another branch. The straight merge is nice because it preserves all of the commit history, as opposed to a squashed commits merge, where all the commit history of one branch is squashed into one commit that is then applied to the other branch.

The first thing you need to do is make sure that your working tree reflects the branch you want to merge into.

    prompt> git checkout master
    Switched to branch 'master'

The git merge command takes the name of the branch you want to merge into your current branch:

    prompt> git merge ds_exist

Display 14. Merging Branches

You can use git log to see that the commit from ds_exist has indeed been committed to the master branch.

Display 15. Git Log Shows History

Now that we have successfully merged our changes from branch ds_exist we can delete it.

    prompt> git branch -d ds_exist

    Deleted branch ds_exist (was cf38737).

REMOTE REPOSITORIES

So far all of our work has been done locally. We have used Git as our own personal version control on our own personal project. This works fine, but Git was specifically written to allow teams of developers to work on the same project while being located anywhere.

The standard architecture for using Git in a distributed environment is to have one main repository on a Git server. By default, this centralized repository is called origin. Any number of developers then pull from origin to create their local repository to work on. Their local repository will have all of the history of the project. When developers are satisfied with their changes, they push their branch back to origin.

Setting up a Git server to act as the main repository for your company is relatively simple but involves some administrative tasks such as installing SSH, creating a user, etc., which fall outside the scope of this paper. For a great walkthrough see: http://git-scm.com/book/en/git-on-the-server-setting-up-the-server

Luckily, there is a great little spot on the web called github.com. As its name implies, GitHub acts as an online hub for Git repositories. This works perfectly as an example to illustrate using Git as a distributed version control system.

GitHub

I have created an example repository named sas-empty-project on GitHub. You can view it online at: https://github.com/swelltrain/sas-empty-project

Display 16. Sas-empty-project On GitHub

sas-empty-project is simply a set of directories forming a relatively sane layout for a SAS project with a splash of SAS code to get things started. It's a nice way to bootstrap a new SAS project.

Let's say you found the project on GitHub. You want to use it when you create your own SAS projects, and additionally contribute to the repository on GitHub. To start working with sas-empty-project you need to create your own local repository of it. You do this with the git clone command. As the screen print above shows, the repository on GitHub can be accessed through a URL, which we pass to git clone: https://github.com/swelltrain/sas-empty-project.git

    prompt> git clone https://github.com/swelltrain/sas-empty-project.git

Display 17. Voila! My Own Local Repository

If you run git log you will see that the repository and all its history have been cloned to your machine.

    prompt> cd sas-empty-project
    prompt> git log

Now you can make some changes to your local repository. Looking at the .gitignore file you notice that only *.log and *.sas7bdat files have been specified. Since it's a SAS project, we know there are other SAS binary files that should be excluded. So you edit .gitignore to include other SAS file extensions [2]:

    # logs
    *.log
    # sas data sets
    *.sas7bdat
    # sas views
    *.sas7bvew
    # sas catalog
    *.sas7bcat
    # sas indexes
    *.sas7bndx
    # sas audit files
    *.sas7baud
    # sas stored programs
    *.sas7bpgm
    # sas MDDB
    *.sas7bmdb
    # sas item store
    *.sas7bitm

Notice this time you also added comments to the .gitignore file. Now you can stage and commit the change to your local repository:

    prompt> git commit -am "Add more SAS binary file types to .gitignore"

Remember, the commit message is there to tell everyone what change you made.

All that's left is to push your changes to the remote server. But first we have to take a step back and talk about how Git knows which remote server to push to. When you ran git clone, Git automatically set up a remote pointing to the repository you cloned from. You can verify this by running the git remote command with the -v (for verbose) flag.

    prompt> git remote -v
    origin https://github.com/swelltrain/sas-empty-project.git (fetch)
    origin https://github.com/swelltrain/sas-empty-project.git (push)

You can see that the remote origin has been set up for both fetching (which is the same as a pull, but it does not automatically merge) and pushing.
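The relationship between a clone and its origin can be tried without a network connection or a GitHub account. The sketch below is an offline stand-in, assuming a POSIX shell: a local bare repository plays the role of the server, and the names hub.git, alice and bob are hypothetical, as are the identity values. It is not how the paper's GitHub example was set up, only a way to watch clone, remote and push interact.

```shell
#!/bin/sh
# Sketch: a local bare repository standing in for a server-hosted "origin".
set -e

work=$(mktemp -d)                      # throwaway directory (placeholder location)
cd "$work"

git init -q --bare hub.git             # plays the role of the shared repository
git clone -q hub.git alice 2>/dev/null # cloning an empty repository prints a warning
cd alice
git config user.name "Example User"    # placeholder identity
git config user.email "user@example.com"

git remote -v                          # origin is set up for both fetch and push

printf '*.log\n*.sas7bdat\n' > .gitignore
git add .gitignore
git commit -q -m "Add gitignore"
branch=$(git rev-parse --abbrev-ref HEAD)   # "master" or "main", depending on setup
git push -q origin "$branch"           # publish the commit to origin

# A second clone receives the full history, including the pushed commit.
cd "$work"
git clone -q hub.git bob
subject=$(git -C bob log -1 --format=%s)
echo "bob sees: $subject"
```

Replacing the local path hub.git with an https or ssh URL gives the same workflow against a real server.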

Now you can push your changes using the git push command.

    prompt> git push origin master

Since origin and master are defaults you could instead just do:

    prompt> git push

Display 18. Pushing To A Remote Repository

And that's it. You have successfully contributed your changes to the public repository on GitHub. Additionally, you can now create all the directories for a new SAS project with a single command:

    prompt> git clone https://github.com/swelltrain/sas-empty-project.git

ADDITIONAL USEFUL COMMANDS

A lot of times after I make a commit, I realize that I left a tiny little mistake in the code, like a comment that I meant to erase or a put statement that was useful just for debugging. I need to change the code, but I don't think the change really warrants its own commit. You can amend your last commit with:

    prompt> git commit -C HEAD -a --amend

The -C parameter tells Git to use the message from the commit specified (remember HEAD refers to the last commit).

Want to know who committed a line of code, and when? Use git blame <file>.

    prompt> git blame -L 26,+6 auditor.sas

This tells Git to show us who committed 6 lines in auditor.sas starting with line 26.

Did you accidentally create a commit that absolutely shouldn't be there?

    prompt> git reset --hard HEAD^

The parameter HEAD^ tells Git to reset the repository to the commit before the last commit. It is like the last commit never happened. Any uncommitted changes in your working tree are discarded as well, so use this command with care.
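The amend and reset commands above rewrite history, so it is worth trying them somewhere harmless first. The following sketch does so in a throwaway directory, assuming a POSIX shell; the SAS one-liners and identity values are placeholders. It counts commits with `git rev-list --count` to show that amending replaces the last commit rather than adding one, and that the reset removes the mistaken commit.

```shell
#!/bin/sh
# Sketch: amend the last commit, then discard a mistaken commit.
set -e

repo=$(mktemp -d)                      # throwaway directory (placeholder location)
cd "$repo"
git init -q
git config user.name "Example User"    # placeholder identity
git config user.email "user@example.com"

echo "data x; run;" > auditor.sas
git add auditor.sas
git commit -q -m "Add auditor.sas"

# A leftover debugging line slips into the next commit.
printf 'data y; run;\n%%put debug;\n' >> auditor.sas
git commit -q -a -m "Add second step"

# Remove the debugging line and fold the fix into the same commit,
# reusing its message (-C HEAD) as in the amend command above.
grep -v '^%put debug;$' auditor.sas > auditor.tmp
mv auditor.tmp auditor.sas
git commit -q -a -C HEAD --amend
after_amend=$(git rev-list --count HEAD)    # amend replaces, so the count is unchanged

# A commit that should not exist at all.
echo "oops" >> auditor.sas
git commit -q -a -m "Mistake"
git reset -q --hard HEAD^                   # drop it and restore the files
after_reset=$(git rev-list --count HEAD)

echo "commits after amend: $after_amend"
echo "commits after reset: $after_reset"
```

In the Windows cmd.exe shell the caret is an escape character, so `HEAD~1` is a safer way to write the same thing there. Both commands are best reserved for commits that have not yet been pushed, since other clones would still hold the old history.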

GIT CHEAT SHEET

git init
    Initialize a Git repository.
git status
    Show the changes in your working tree and staging area.
git add <file>
    Add a file to the index/staging area.
git commit
    Create a commit.
    -a            Add changes to tracked files to the index.
    -m <message>  Specify a message.
git log
    Show the history of your repository.
git diff
    Compare the changes in your working tree against the staging area.
git diff HEAD
    Compare your working tree, including staged changes, to the repository.
git diff --cached
    Compare the changes in the index/staging area to the repository.
.gitignore
    Text file that specifies files for Git to ignore.
git rm --cached <file>
    Stop tracking a file previously added to the repository; the file stays in your working tree.
git tag
    List the tags.
    <tag name>    Create a tag in the repository.
git checkout tags/<tag name>
    Check out the tag into your working tree.
git branch
    List the branches.
    <branch name>     Create the branch.
    -d <branch name>  Delete the branch.
git checkout <branch>
    Copy the code in that branch to your working tree to work on.
    -b            Create a new branch from the current branch and check it out.
git merge <other branch>
    Merge <other branch> into the CURRENT branch.
git clone <repository URL>
    Create a local copy of a remote repository.

REFERENCES

[1] Eclipse Foundation. Eclipse Open Source Developer Report, June 2012. Available at http://www.slideshare.net/ianskerrett/eclipse-survey-2012-report-final

[2] Cates, Randall. 2002. What's In A Name; Describing SAS File Types. Proceedings of the Twenty-Seventh Annual SAS Users Group International Conference. Cary, NC: SAS Institute Inc. Available at http://www2.sas.com/proceedings/sugi27/p069-27.pdf

RECOMMENDED READING

Swicegood, Travis (2008). Pragmatic Version Control Using Git. Raleigh, NC: The Pragmatic Bookshelf.
Git Home Page: www.git-scm.com

CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the author at:

Name: Stephen Philp
Pelican Programming / www.sascoders.com / www.sas-resources.com
www.stephenphilp.com
Redondo Beach, CA
stephen@stephenphilp.com

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.