Version Control Systems: SVN and GIT How do VCS support SW development teams? CS 435/535 The College of William and Mary
Agile manifesto. We are uncovering better ways of developing software by doing it and helping others do it. Through this work we have come to value: individuals and interactions over processes and tools; working software over comprehensive documentation; customer collaboration over contract negotiation; responding to change over following a plan. That is, while there is value in the items on the right, we value the items on the left more. What is needed? Chapter 3 Agile software development
Plan-driven and agile specification. Plan-based development: Requirements engineering -> Requirements specification -> Design and implementation, with requirements change requests feeding back. Agile development: Requirements engineering <-> Design and implementation.
The extreme programming release cycle. Select user stories for this release -> Break down stories to tasks -> Plan release -> Develop/integrate/test software -> Release software -> Evaluate system.
The Scrum process. Outline planning and architectural design -> Sprint cycle (Assess, Select, Develop, Review) -> Project closure.
Scrum benefits. The product is broken down into a set of manageable and understandable chunks. Unstable requirements do not hold up progress. The whole team has visibility of everything, and consequently team communication is improved. Customers see on-time delivery of increments and gain feedback on how the product works. Trust between customers and developers is established, and a positive culture is created in which everyone expects the project to succeed.
Software Engineering is Team Work. Enabling technology for productivity must support parallelization; must support communication (documentation as preserved communication); and must support management of tasks & people: What needs to be done? When? By whom? What has been done? By whom? Remember SVN from CS 301? What does it support?
Version Control Systems. Centralized: CVS (1990), SVN (2000). Distributed: BitKeeper (1997), Git (2005), Bazaar (2005), Mercurial (2005). *More VCS at http://en.wikipedia.org/wiki/comparison_of_revision_control_software
Version Control Systems. A version control system supports concurrent software development on a shared code base: it keeps track of changes, integrates versions and recognizes conflicts, and allows for recovery and documentation of changes. Common setup: IDE as front end, VCS as back end (shared, persistent storage).
Subclipse: Eclipse plugin for SVN. Update site: http://subclipse.tigris.org/update_1.8.x
EGit: Eclipse plugin for Git. Update site: http://download.eclipse.org/egit/updates
EGit: Eclipse plugin for Git. Tutorial and user guide: http://eclipsesource.com/blogs/tutorials/egit-tutorial/ http://wiki.eclipse.org/egit/user_guide#overview
Centralized vs Distributed Version Control Systems Centralized Architecture: Image from http://git-scm.com/book/en/getting-started-about-version-control
Distributed Version Control Systems Distributed Architecture: Image from http://git-scm.com/book/en/getting-started-about-version-control
Centralized vs Distributed VCS. What are the pros & cons? Software engineering is largely about scalability. Project size in number of developers: about 10; up to 100; more than 100.
Workflows: Centralized. Small teams. Typical workflow for SVN and CVS. The repository is a single point of failure. Image from http://git-scm.com/book/en/distributed-git-distributed-workflows
Workflows: Integration Manager. Supported by CVS and SVN using branches. More easily supported by distributed version control systems. Image from http://git-scm.com/book/en/distributed-git-distributed-workflows
Workflows: Dictator and Lieutenants. Supported by CVS and SVN using branches. More easily supported by distributed version control systems. Generally used by huge projects (e.g., the Linux kernel). Image from http://git-scm.com/book/en/distributed-git-distributed-workflows
Versions, Revisions, and Snapshots. CVS: each commit generates a new version for each file modified. SVN: each commit generates a new state of the file system tree, called a revision. Git: like SVN, each commit records the state of the whole tree; but instead of saving deltas, Git keeps a snapshot, saving the changed files and references to the unchanged ones.
Git follows the idea of a file system with snapshots. Every time you commit, or save the state of your project in Git, it basically takes a picture of what all your files look like at that moment and stores a reference to that snapshot. To be efficient, if files have not changed, Git doesn't store the file again, just a link to the previous identical file it has already stored. Git thinks about its data more like Figure 1.5. Figure 1.5: Git stores data as snapshots of the project over time. This is an important distinction between Git and nearly all other VCSs. It makes Git reconsider almost every aspect of version control that most other systems copied from the previous generation. This makes Git more like a mini filesystem with some incredibly powerful tools built on top of it, rather than simply a VCS.
SVN et al. think of the information they keep as a set of files and the changes made to each file over time, as illustrated in Figure 1.4. Figure 1.4: Other systems tend to store data as changes to a base version of each file. Git doesn't think of or store its data this way. Instead, Git thinks of its data more like a set of snapshots of a mini filesystem.
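The snapshot idea can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not Git's actual object format: the class `SnapshotStore` and its methods are hypothetical names, and real Git hashes a header together with the content and also stores tree and commit objects.

```python
import hashlib


class SnapshotStore:
    """Toy content-addressed store: each commit is a full snapshot of the tree."""

    def __init__(self):
        self.blobs = {}    # content hash -> file content, stored once
        self.commits = []  # each commit maps path -> content hash

    def commit(self, files):
        """Record a snapshot of `files` (path -> text); return the commit index."""
        snapshot = {}
        for path, text in files.items():
            digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
            # An unchanged file hashes to a blob that is already stored,
            # so the new snapshot only adds a reference to it.
            self.blobs.setdefault(digest, text)
            snapshot[path] = digest
        self.commits.append(snapshot)
        return len(self.commits) - 1

    def checkout(self, index):
        """Rebuild the complete tree of a commit from the stored blobs."""
        return {path: self.blobs[d] for path, d in self.commits[index].items()}


store = SnapshotStore()
first = store.commit({"a.txt": "hello\n", "b.txt": "world\n"})
second = store.commit({"a.txt": "hello\n", "b.txt": "world!\n"})
print(len(store.blobs), "blobs stored for", len(store.commits), "snapshots")
```

Both commits describe the whole tree, yet the unchanged `a.txt` is stored only once; no deltas are computed to restore either state.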
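Versions, Revisions, and Snapshots. SVN uses global revision numbers: each commit increments one number for the whole tree. Git commits also identify the state of the whole tree, but by a SHA-1 hash rather than a sequential number. Image from http://svnbook.red-bean.com/en/1.7/svn.basic.in-action.html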
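The difference between per-file versions and tree-wide revisions can be sketched in Python. This is only a toy model, not how CVS or SVN store data: the class names are hypothetical, and CVS's real version strings (such as 1.2) are simplified to plain integers.

```python
class PerFileVersions:
    """CVS-style numbering: every file carries its own version counter."""

    def __init__(self):
        self.versions = {}  # path -> version of that file

    def commit(self, changed_paths):
        for path in changed_paths:
            self.versions[path] = self.versions.get(path, 0) + 1


class GlobalRevisions:
    """SVN-style numbering: one counter for the state of the whole tree."""

    def __init__(self):
        self.revision = 0
        self.last_changed = {}  # path -> revision in which it last changed

    def commit(self, changed_paths):
        self.revision += 1
        for path in changed_paths:
            self.last_changed[path] = self.revision
        return self.revision


cvs, svn = PerFileVersions(), GlobalRevisions()
for changed in (["a.c", "b.c"], ["a.c"], ["a.c"]):
    cvs.commit(changed)
    svn.commit(changed)

print(cvs.versions)   # per-file counters diverge over time
print(svn.revision)   # a single number names the state of the whole tree
```

With a global revision, "the project at revision 2" is a well-defined state; with per-file versions, a consistent project state has to be named some other way, for example with a tag.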
Operations and states (CVS and SVN). States: Workspace, Repository. Operations: Checkout (repository to workspace), Commit (workspace to repository).
Operations and states (Git). States: Workspace, Staging area (Index), Repository. Operations: Checkout (repository to workspace), Stage (workspace to index), Commit (index to repository).
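The three Git states and the operations between them can be sketched as a toy Python model. The class `ToyRepo` is hypothetical and heavily simplified: for instance, its `checkout` just overwrites tracked files, whereas real Git protects uncommitted local changes.

```python
class ToyRepo:
    """Toy model of Git's three states: workspace, index, repository."""

    def __init__(self):
        self.workspace = {}  # files as currently edited
        self.index = {}      # staging area: content chosen for the next commit
        self.history = []    # repository: list of committed snapshots

    def stage(self, path):
        """Workspace -> index: record the current content of one file."""
        self.index[path] = self.workspace[path]

    def commit(self):
        """Index -> repository: store the staged content as a new snapshot."""
        self.history.append(dict(self.index))
        return len(self.history) - 1

    def checkout(self, commit_id):
        """Repository -> workspace: restore the tracked files of a snapshot."""
        self.index = dict(self.history[commit_id])
        self.workspace.update(self.history[commit_id])


repo = ToyRepo()
repo.workspace["README"] = "v1"
repo.workspace["notes.txt"] = "draft"
repo.stage("README")                 # only README is staged
first = repo.commit()                # so only README is committed
repo.workspace["README"] = "v2 (unstaged edit)"
repo.checkout(first)                 # README is back at the committed content
print(repo.history[first], repo.workspace)
```

The point of the extra state is visible in the example: `notes.txt` was edited in the workspace but never staged, so it is not part of the commit.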
Operations and commands - Git http://osteele.com/posts/2008/05/commit-policies
Operations and commands

Operation         CVS        SVN        Git
Init              init       create     init
Import            import     import     commit
Checkout          checkout   checkout   clone
Checkout branch   checkout   checkout   checkout
Commit/Checkin    commit     commit     commit, push
Update            update     update     fetch, pull
Operations and commands - SVN+Eclipse
Workflows and issues. Workflow: 1) get code base, 2) make changes, 3) deliver changes. Issue: Read/write access to remote repository. Protected: user authentication and registration; account/password necessary in communication; the IDE stores and uses the account/password for convenience. Issue: Conflicts. Changes do not fit together; conflicts are recognized automatically at some level of granularity (same file, same method, same line of code) and fixed manually. Issue: Documentation / Communication. What changed, how trustworthy are the changes, what needs to be changed as an effect? Finding the right historical version to undo some changes.
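Conflict recognition at line granularity can be sketched as a three-way comparison against the common base version. The Python function below is a hypothetical, simplified illustration, not the merge algorithm of SVN or Git: it assumes both sides only edit lines in place, so all three versions have the same number of lines.

```python
def three_way_merge(base, ours, theirs):
    """Merge equal-length line lists; return (merged_lines, conflicting_line_numbers)."""
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged, conflicts = [], []
    for number, (b, o, t) in enumerate(zip(base, ours, theirs), start=1):
        if o == t:        # both sides agree (or neither changed the line)
            merged.append(o)
        elif o == b:      # only their side changed the line
            merged.append(t)
        elif t == b:      # only our side changed the line
            merged.append(o)
        else:             # both changed the same line differently
            merged.append(b)
            conflicts.append(number)
    return merged, conflicts


base = ["int x = 1;", "int y = 2;", "return x + y;"]
ours = ["int x = 10;", "int y = 2;", "return x + y;"]
theirs = ["int x = 1;", "int y = 20;", "return x + y;"]
clean_result, clean_conflicts = three_way_merge(base, ours, theirs)

clashing = ["int x = 5;", "int y = 2;", "return x + y;"]
clash_result, clash_conflicts = three_way_merge(base, ours, clashing)
print(clean_conflicts, clash_conflicts)
```

Edits to different lines integrate automatically; two different edits to the same line are only reported, and a developer has to decide how to resolve them.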
Tagging. Useful for marking specific points in history, in particular releases. Two types: lightweight vs annotated. Annotated tags are full objects in the Git DB: they are checksummed; contain tagger name, email, date, and tagging message; and can be signed & verified.

    $ git tag -a v1.4 -m 'my version 1.4'
    $ git show v1.4
    tag v1.4
    Tagger: Scott Chacon <schacon@gee-mail.com>
    Date:   Mon Feb 9 14:45:11 2009 -0800

    my version 1.4

    commit 15027957951b64cf874c3557a0f3547bd83b3ff6
    Merge: 4a447f7... a6b4c97...
    Author: Scott Chacon <schacon@gee-mail.com>
    Date:   Sun Feb 8 19:02:46 2009 -0800

        Merge branch 'experiment'

That shows the tagger information, the date the commit was tagged, and the annotation message before showing the commit information.
Branching. CVS: simple process for creating branches on the repository. SVN: has no internal concept of a branch; branches are managed as copies of a directory. Git: very simple process for creating local and remote branches.
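Merging. Figure (two panels): before the merge, commit 4 sits on a branch beside the mainline commits 1, 2, 3, 5, 6; after the merge, commit 7 joins the branch back into the mainline.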
Branching in SVN
Branching in Git. Branches are lightweight, movable pointers to commits. The default branch is master (the counterpart of the trunk). Images from http://git-scm.com/book/en/git-branching-basic-branching-and-merging
Branching in Git. Initial layout for three commits. New branch pointer (iss53). Images from http://git-scm.com/book/en/git-branching-basic-branching-and-merging
Branching in Git. New commit on the branch. Hotfix branch on master. Images from http://git-scm.com/book/en/git-branching-basic-branching-and-merging
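The notion of a branch as a movable pointer can be made concrete with a short Python sketch. It is a toy model with hypothetical class names (`Commit`, `BranchRepo`), not Git's implementation; the commit labels C0 to C4 are chosen here to mirror the iss53/hotfix scenario in the images.

```python
class Commit:
    """A commit knows its message and its parent commits, nothing else."""

    def __init__(self, message, parents=()):
        self.message = message
        self.parents = tuple(parents)


class BranchRepo:
    """Toy repository in which a branch is just a name pointing at a commit."""

    def __init__(self):
        self.branches = {"master": Commit("C0")}
        self.head = "master"  # name of the branch that is checked out

    def commit(self, message):
        """Create a commit and move only the current branch pointer forward."""
        new = Commit(message, [self.branches[self.head]])
        self.branches[self.head] = new
        return new

    def branch(self, name):
        """Creating a branch copies a pointer; no files are copied."""
        self.branches[name] = self.branches[self.head]

    def checkout(self, name):
        if name not in self.branches:
            raise KeyError(name)
        self.head = name


repo = BranchRepo()
repo.commit("C1")
repo.commit("C2")
repo.branch("iss53")       # new pointer to C2
repo.checkout("iss53")
repo.commit("C3")          # only iss53 moves
repo.checkout("master")
repo.branch("hotfix")
repo.checkout("hotfix")
repo.commit("C4")          # hotfix moves, master still points at C2

for name, tip in sorted(repo.branches.items()):
    print(name, "->", tip.message)
```

Because creating a branch only adds one entry to a table of pointers, branching is cheap enough to do for every issue or task.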
Merging in Git Images from http://git-scm.com/book/en/git-branching-basic-branching-and-merging
Branching & Merging in Git. Key concept, well supported. Local workflow: people create a branch for any issue, task, or assignment they deal with, sometimes called a topic branch. Optional: rebase instead of merge to obtain a linear history; recommended only for the local repository. Remote repository: merge with master. For an integration manager with a blessed repository: pull request.
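What happens to the branch pointers when a topic branch is merged into master can be sketched as follows. This Python model is a simplified assumption-laden illustration: `Commit`, `is_ancestor`, and `merge` are hypothetical helpers, file contents and conflicts are ignored, and only the shape of the history is modeled.

```python
class Commit:
    def __init__(self, message, parents=()):
        self.message = message
        self.parents = tuple(parents)


def is_ancestor(older, newer):
    """Return True if `older` is reachable from `newer` via parent links."""
    stack, seen = [newer], set()
    while stack:
        current = stack.pop()
        if current is older:
            return True
        if id(current) in seen:
            continue
        seen.add(id(current))
        stack.extend(current.parents)
    return False


def merge(branches, target, source):
    """Merge branch `source` into branch `target`; report what kind of merge it was."""
    ours, theirs = branches[target], branches[source]
    if is_ancestor(theirs, ours):
        return "already up to date"
    if is_ancestor(ours, theirs):
        branches[target] = theirs  # just move the pointer forward
        return "fast-forward"
    # Histories diverged: a new commit with two parents ties them together.
    branches[target] = Commit(f"Merge branch '{source}'", [ours, theirs])
    return "merge commit"


c2 = Commit("C2")
c4 = Commit("C4", [c2])                       # hotfix work
c5 = Commit("C5", [Commit("C3", [c2])])       # iss53 work
branches = {"master": c2, "hotfix": c4, "iss53": c5}

first = merge(branches, "master", "hotfix")   # master has no commits of its own
second = merge(branches, "master", "iss53")   # master and iss53 have diverged
print(first, "/", second)
```

A rebase would instead replay the topic branch's commits on top of master so that the later merge becomes a fast-forward, which is how it yields a linear history.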
What is missing so far? Documentation of problems and bug reports. Work assignments: who does what, and by when. Issue tracking.
GitHub Issue Tracker. Filter by open and closed issues, assignees, labels, and milestones. Sort by issue age, number of comments, and update time. Milestones / labels.
GitHub Workflow: Code review & Pull request. A pull request starts a conversation around proposed changes. Additional commits may be added to the branch before merging into master. Pull Request = Code + Issue + Code Comments.
Software Engineering is Team Work. Enabling technology for productivity must support parallelization; must support communication (documentation as preserved communication); and must support management of tasks & people: What needs to be done? When? By whom? What has been done? By whom?