No sooner has your project grown beyond the basic skeleton of folder structures than you come across the need for version control for your source code. The rudimentary approach of making backup folders by date or keeping patch files has its fair share of issues when it comes to maintaining the code, or going back to a specific change you made a week ago while retaining most of the latest changes. Version control gives you finer control over things: you keep adding changes to your files, and the version control tool lets you go back and revert changes whenever required. Most people are familiar with Subversion as an excellent tool for version control, and I am sure they have all heard of Git as well but haven't really moved to it for various reasons.
Git has become wildly popular as a tool for version control. It is the brainchild of Linus Torvalds, and the fact that it is used to version control the Linux kernel, which runs to millions of lines of code and is one of the biggest open source projects of its kind, speaks volumes about the capability of Git. It is largely command line driven, but several good (and free) clients are also available for you to try. It differs from Subversion in that it is a distributed version control system, as against the single server model of svn. At any given time, the git repo on your own machine is all you need to work with the source. Your changes can later be moved to, merged into or pushed to other code repositories as and when they are available. So to continue with some work, you need not rely on any central location for making the latest code available to you. That's sweet!
I have always been a Subversion user, primarily because work never demanded a switch to Git. So the only option was to start using it on a pet project and understand how it works, how it differs from svn, etc.
Over time I think I have gained a better understanding of git. One can learn it by drawing parallels between git and svn, but that only creates more confusion: the idea is to understand git as a distributed version control system, and comparing it with svn rather misses that point.
Initializing a git repository
There are three ways of achieving this. The first is to initialize a new git repository (repo for short) in whatever folder you are in and start maintaining code there. The second is to clone from a local folder that already houses a git repo. The third is to clone a git repo from one of the many public git URLs hosted on GitHub. This is a location where projects are hosted publicly and people can collaborate by cloning the repo, contributing their changes and pushing the changes back to the public repo.
git init - Creates a git repo in current directory.
git clone /path/to/project - clones an existing local project into a new directory, named after the project, under the current directory
git clone [git-url] - clones a publicly hosted project onto local disk. Creates a directory of the same name as the project
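Here is a small sketch of the first two ways, run in a throwaway folder so nothing on disk gets touched. The names project, project-copy and README.txt are made up for illustration, and the third way is left as a comment since it needs network access and a real URL.

```shell
#!/bin/sh
set -e

# Work in a throwaway folder.
work=$(mktemp -d)
cd "$work"

# Way 1: create a fresh repo in a new folder ("project" is a made-up name).
mkdir project
cd project
git init -q
git config user.name "Your Name"        # repo-local identity, needed to commit
git config user.email you@yourmail.com
echo "my project" > README.txt
git add README.txt
git commit -q -m "Initial commit"
cd ..

# Way 2: clone from a local folder that already houses a repo.
git clone -q "$work/project" project-copy
ls project-copy

# Way 3 (not run here, needs network): git clone [git-url]
```

The clone is a full repo in its own right, with its own .git folder and the complete history of the original.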
Starting to use git
Once the git repo is created, you can carry on with work as usual and then check in the code when done. The key difference here (from svn) is that git has a two-step commit process. The staging area and the actual committed code are treated as separate things. So for a change to be committed, it has to be added to the stage first. Adding a file to the stage tells git to include its current changes in the next commit; for a brand new file, this is also what makes git start tracking it. Changes that are not staged won't make it into the repo on commit. git provides simple commands to add, check and commit changes. In case you miss something during a commit, git’s commit command also has an amend flag, which makes your change part of the last check-in.
Git has branch names for the code you are working with. The default is “master”, something similar to trunk in svn. You can create new branches and name them as per your convenience. We will see this later.
git status - lists files that have changed since last commit or ones which haven't yet been added to stage
git add [ filename | file1 file2 file3 | *.txt | * ] - stages files for the next commit (and starts tracking new ones). Use multiple file names or wildcards to add more than one file at a time
git commit -m "message" - commits the staged changes to the current branch ("master" in this case) with the given message
git commit --amend - folds the staged changes into the last commit. This can be used to change the commit message, add missed-out code, etc.
git show - shows the last commit: its message, author and the changes that were checked in
git show --name-only - will list just the file names and leave out other details.
Before your first commit, it is best to add some config information to git. This identifies you by a name and email, and this info is added to every commit you make. With the --global flag it applies to every repo you work on from your user account.
git config --global user.name "Your Name"
git config --global user.email you@yourmail.com
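To see the two-step commit and the amend flag in action, here is a minimal sketch in a scratch repo. The file names notes.txt and todo.txt are made up, and I set the identity per repo (without --global) so the sketch does not touch your real git settings.

```shell
#!/bin/sh
set -e

# Scratch repo in a temporary folder.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Your Name"          # repo-local identity for this sketch
git config user.email you@yourmail.com

echo "hello" > notes.txt
echo "buy milk" > todo.txt
git status --short        # both files are listed as untracked (??)

# Step 1: stage only notes.txt. Step 2: commit what is staged.
git add notes.txt
git commit -q -m "Add notes"

# Oops, todo.txt was left out. Stage it and fold it into the last commit.
git add todo.txt
git commit -q --amend -m "Add notes and todo list"

git show --name-only      # one commit, listing both files
```

Note that amend rewrites the last commit rather than adding a new one, so the history still has a single commit at the end.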
Note that this commit is only on the local master, and is good only if you are the only user of the codebase. If it needs to be shared with others, it has to be sent out to a remote repository. For this, simply add a remote repo to the current git setup and push the code to the external URL.
git remote add [remote-name] [git-url] - This adds a repo url identified by the given name
git fetch [remote-name] - downloads the latest commits from the remote into your local repo, without changing the files in your working directory
git push [remote-name] master - pushes all local changes in master to the remote repo identified by given name
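A quick way to try these without a hosted account is to let a bare repo on local disk play the part of the remote. In the sketch below, shared.git, project and app.txt are made-up names, and the local path stands in for the [git-url] you would normally use.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
cd "$work"

# A bare repo stands in for the hosted remote (assumption for this sketch).
git init -q --bare shared.git

# Our working repo with one commit on master.
git init -q project
cd project
git symbolic-ref HEAD refs/heads/master
git config user.name "Your Name"
git config user.email you@yourmail.com
echo "v1" > app.txt
git add app.txt
git commit -q -m "First version"

# Register the remote under the name "origin" and send master to it.
git remote add origin "$work/shared.git"
git push -q origin master

# Fetch updates our local view of the remote; working files stay as they are.
git fetch -q origin
git branch -r             # lists origin/master
```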
Branching and Merging
As with other version control tools, you may need to branch out existing code to try out experimental code or add some feature to see how it integrates with the rest of the project. git has branches for this. git also allows you to switch between branches, so that the directory in which you are working shows only the code that belongs to the branch you are currently on. If you switch branches halfway, the changes you committed a while back are no longer visible since you are on a separate branch now. Those changes are still on the other branch, so don't worry about it. The “branch” command lists all branches in the current repo. The default branch is master. You can switch to a different branch using “checkout”. Merging is also simple enough: it merges changes from the mentioned branch into the current branch. If all goes well, you have the changes in the current branch. The other branch can be removed if it is no longer used.
git branch - lists all branches in the current repo
git branch [branchname] - creates a new branch of given name from current branch
git checkout [branchname] - switch to branch of given name
git checkout -b [branchname] - shortcut command for creating a new branch and switching to it as well
git merge [branchname] - merge contents of given branch into current branch. For this you have to be on the branch that should receive the changes, not on the branch being merged. If you try to merge a branch into itself, git won't complain. It will only say that the branch is "Already up-to-date."
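Putting the branch commands together, a minimal sketch could look like this. The branch name experiment and the file names are made up for illustration.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Your Name"
git config user.email you@yourmail.com
echo "base" > app.txt
git add app.txt
git commit -q -m "Base"

# Create a branch and switch to it in one go, then commit some work there.
git checkout -q -b experiment
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# Back on master the committed feature file is not visible.
git checkout -q master
ls                        # only app.txt

# Merge the experiment into master, then drop the branch.
git merge -q experiment
git branch -q -d experiment
ls                        # app.txt and feature.txt
```

Since master had no new commits of its own here, the merge simply moves master forward to the tip of experiment.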
Tagging
So we have done a lot of work and changes on the current code, and the current state of the project is good enough to push out, say, an alpha build. git provides a way to tag the code base for future reference.
git tag [tagname] [commit id]
The commit id comes from git log, which we can refer to for the entire list of changes made to the project so far. git will tag the state of the code identified by the commit id.
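As a small sketch of tagging an older commit: the tag name v0.1-alpha and the file app.txt are made up, and the commit id is captured in a variable instead of being copied by hand from the log.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Your Name"
git config user.email you@yourmail.com

echo "v1" > app.txt
git add app.txt
git commit -q -m "First version"
alpha=$(git rev-parse HEAD)   # commit id of the state we want to tag

echo "v2" > app.txt
git commit -q -am "Second version"

git log --oneline             # both commits with their short ids

# Tag the first commit even though work has moved on since.
git tag v0.1-alpha "$alpha"
git tag                       # lists the tags in the repo
```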
Using git in a team
All git commands seen so far work well if you are the only one working on the code. What if you and a friend are collaborating on the project and need to be aware of the code changes made by each other? This is where you need to understand git's way of working with distributed code. git allows you to pull changes from a location and push changes out.
git pull [remote-repo] - fetches the latest code from the remote (typically named origin, the public git location) and merges it into your current branch
git push [remote-repo] master - pushes code changes from the master branch to the remote (shares your changes with others). This is possible after adding a remote repo to your project, as we have seen in the earlier section.
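To get a feel for this on a single machine, here is a rough sketch with two working copies sharing one bare repo. The names alice, bob, shared.git and app.txt are all made up, and the bare repo on local disk stands in for the public git location.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
cd "$work"
git init -q --bare shared.git            # stand-in for the public repo

# Alice starts the project and pushes master.
git init -q alice
cd alice
git symbolic-ref HEAD refs/heads/master
git config user.name "Alice"
git config user.email alice@yourmail.com
echo "v1" > app.txt
git add app.txt
git commit -q -m "First version"
git remote add origin "$work/shared.git"
git push -q origin master
cd ..

# Bob clones the shared repo (clone sets up "origin" for him).
git clone -q -b master "$work/shared.git" bob

# Alice shares another change.
cd alice
echo "v2" > app.txt
git commit -q -am "Second version"
git push -q origin master
cd ..

# Bob pulls to pick up Alice's latest change.
cd bob
git pull -q origin master
cat app.txt
```

Bob had made no commits of his own, so the pull just moves his master forward to match what Alice pushed.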
Suppose you have the latest code and start making changes, but then need to discard your changes and replace them with what is in the repo. git can either check out an individual file again, thereby replacing the local copy, or fetch everything from the origin branch and revert all local changes in one go. Be careful when using these options, since uncommitted changes discarded this way cannot be recovered.
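The article does not name the exact commands for this, so treat the following as one common way of doing it rather than the only one: git checkout -- [filename] for a single file, and git fetch followed by git reset --hard for everything. The repo layout and file names are made up, and a bare repo on disk plays the origin.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
cd "$work"
git init -q --bare shared.git            # stand-in for origin
git init -q project
cd project
git symbolic-ref HEAD refs/heads/master
git config user.name "Your Name"
git config user.email you@yourmail.com
echo "v1" > app.txt
git add app.txt
git commit -q -m "First version"
git remote add origin "$work/shared.git"
git push -q origin master

# Option 1: discard local edits to a single file.
echo "oops" > app.txt
git checkout -- app.txt
cat app.txt                              # back to the committed content

# Option 2: discard everything, staged or not, and match origin's master.
echo "oops again" > app.txt
git add app.txt
git fetch -q origin
git reset -q --hard origin/master
git status --short                       # prints nothing, the tree is clean
```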
So far so good. Feel like a Git Pro already? Things can get confusing, so I made myself some easy-to-follow charts outlining the git commands that I'd use in different scenarios in a project. Apart from the usual committing and reverting of changes, we also have to deal with branching, merging and pushing code to a public repo. These small (hastily made) charts will help keep things in perspective. A picture does speak more than a thousand words.
[Chart: Simple git initialization]
[Chart: Clone a project and work on it]
[Chart: Revert some local changes]
[Chart: Basic branch and merge]
[Chart: Tag version of the project]
In the course of learning new things about Git, I stumbled upon some excellent reference material that is worth mentioning here. Further study of these resources will help you gain a better understanding of Git. The book “Pro Git” is definitely on my reading list.
1) http://rogerdudler.github.io/git-guide/ – A much better guide than this article
2) http://gitref.org/ – A good reference guide by the GitHub team
3) http://git-scm.com/book/en/v2 – The Pro Git book
There are a couple of UI tools to work with git as well. I haven't started using those yet, as I still need to get comfortable with the git command line. Maybe a post on Git UI clients will follow soon. For now, I hope this quick guide to Git helps you understand the concepts and basic usage!