Software Development: Version Control

Last month we looked at how to use version control when using Agile development. My conclusion was that you should be using Git. This is simply because with CI (Continuous Integration) there is a lot of branching and merging going on, and Git is the only version control system that allows a version to have two parents. This is not to say that you can't use other version control systems (and in fact I like SVN better in many ways - see below), just that Git keeps track of what needs to be merged for you.

This month I take a leisurely stroll back through time and look at all the version control systems I have used. I have a long personal history of using version control systems (generally being the administrator for such systems). I have used the best (and worst) but you should note that there are some excellent systems (like the proprietary Perforce and open-source Mercurial) that I have not used (yet?).

UNIX

I first experimented with version control while at Sydney University in the early 1980's using the Computer Science department's VAX 11/780. This ran a variation of UNIX that included a primitive version control system called SCCS (Source Code Control System) I think.

PVCS

I first used version control for my C source code in several MSDOS/C jobs during the mid-1980's. At the time the only serious option for MSDOS was PVCS (Polytron Version Control System) which I used at several companies.

I can't say I loved PVCS but it did the job. It efficiently stored changes to text files as "reverse deltas" and had all the basic features like branching and tagging.
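
To make the "reverse delta" idea concrete, here is a minimal Python sketch. It is not how PVCS was actually implemented; the class and function names (ReverseDeltaStore, make_delta, apply_delta) are invented for illustration. The point is that the newest revision is stored in full, and each older revision is stored only as the instructions needed to step back from the next newer one.

```python
from difflib import SequenceMatcher


def make_delta(source, target):
    """Return instructions that rebuild `target` (a list of lines) from `source`."""
    ops = []
    matcher = SequenceMatcher(a=source, b=target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))         # reuse source[i1:i2]
        elif tag in ("replace", "insert"):
            ops.append(("add", target[j1:j2]))   # lines only the target has
        # "delete": nothing to record, those source lines are just not copied
    return ops


def apply_delta(source, delta):
    """Rebuild the target lines from `source` plus a delta from make_delta."""
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(source[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


class ReverseDeltaStore:
    """Hypothetical store: newest revision in full, older ones as reverse deltas."""

    def __init__(self, first_version):
        self.head = list(first_version)  # newest revision, stored in full
        self.back_deltas = []            # back_deltas[k] turns revision k+1 into k

    def commit(self, new_version):
        new_version = list(new_version)
        # Record how to get from the new revision back to the previous one.
        self.back_deltas.append(make_delta(new_version, self.head))
        self.head = new_version

    def checkout(self, revision):
        """Revision 0 is the oldest; len(back_deltas) is the newest."""
        text = self.head
        for delta in reversed(self.back_deltas[revision:]):
            text = apply_delta(text, delta)
        return text
```

The design choice this illustrates is that fetching the latest revision (the common case) needs no delta processing at all, while older revisions cost progressively more work to reconstruct.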

CVS, etc

In the late 1980's I moved back to UNIX where I was a system administrator and system programmer. Under UNIX I tried SCCS, RCS (Revision Control System) and an early version of CVS (Concurrent Versions System), all of which worked but were difficult to use in one way or another.

TLIB

When I moved back to MSDOS/MSWindows systems in the early 1990's I used TLIB. This was similar to PVCS, but quite a bit better. However, this was still a command line driven system which I found tedious to use.

VSS

In the mid-1990's Microsoft included a GUI based version control system with their Windows IDE (Developer Studio). This seemed like a great idea to me after my experiences with command-line version control systems. However, Visual SourceSafe (VSS) turned out to be by far the worst product I have ever used - it was not only poorly designed and very confusing, but also had a tendency to lose and corrupt files and whole repositories! Moreover, its poor performance made multi-site development impractical. There were 3rd party extensions to VSS (I later used one called VSSConnect) that were developed purely to improve performance over the Internet, but even then the performance was barely acceptable.

ClearCase

In my next job I used ClearCase (originally developed by Atria, later owned by Rational and then IBM). This is the sort of product you would expect from IBM - thorough but confusing due to its plethora of features and options, and requiring a lot of work to maintain. Luckily, I got to work on a new project where I had the opportunity to try a new open-source version control system called Subversion (SVN).

SVN (Subversion)

I set up SVN as an Apache module on one of the company's servers and was amazed at the performance. Using an Apache server allowed SVN to easily work over the Internet since it used HTTP/WebDAV. (SVN also provides its own protocol and server called svnserve, but the Apache option has advantages.)

The team for this project was split between Australia and Belgium but the two developers in Belgium got great performance (through VPN over the Internet) even though the server was in Sydney. Generally we spent about 10 minutes a day updating and committing changes.

This success with SVN encouraged me to use SVN for my own personal projects. I put my HexEdit open-source software (see http://www.hexedit.com) into an SVN repository which was hosted on SourceForge.

SVN was the first version control system I actually enjoyed using. One reason was that there was a Windows shell extension called TSVN (TortoiseSVN) that allowed you to easily do all your version control tasks using Windows Explorer.

Another thing I like is that, even if you are disconnected from the repository (e.g. if the Internet connection is lost), you can still compare your current changes with the repo. This is because SVN keeps a local copy of all files as they were when you last updated from the repository.
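
The idea of a local "pristine" copy can be sketched in a few lines of Python. This is only a toy model of the behaviour described above, not SVN's real on-disk format; the WorkingCopy class and its method names are assumptions made up for the example.

```python
from difflib import unified_diff


class WorkingCopy:
    """Toy working copy that keeps a pristine snapshot next to the edited files."""

    def __init__(self):
        self.pristine = {}  # path -> lines as they were at the last update
        self.files = {}     # path -> lines as currently edited

    def update(self, repo_files):
        """The only step here that would need a connection to the server."""
        for path, lines in repo_files.items():
            self.pristine[path] = list(lines)
            self.files[path] = list(lines)

    def diff(self, path):
        """Compare local edits with the pristine snapshot; no server involved."""
        return list(unified_diff(
            self.pristine[path],
            self.files[path],
            fromfile=path + " (pristine)",
            tofile=path + " (working)",
            lineterm="",
        ))
```

After one call to update(), any number of edits can be diffed offline, at the cost of storing every file twice on the local disk.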

TFS

In my next job I found that I was again dealing with the horrible VSS. Luckily, the company decided they had had enough problems with VSS and moved to TFS. Now TFS is much better than VSS but still inferior in many ways to SVN. TFS does provide "shelving", which is a good idea, but I have not found it all that useful in practice.

TFS is more of a "centralized control" system than SVN. For example, it keeps track of all the files you have checked out into your WC (working copy) in its central database, whereas SVN only stores the actual files (the repo) in its central database and tracks things to do with the WC locally. To me the SVN approach makes more sense (conforming to the "Observer Design Pattern") and indeed many developers encounter problems when the local WC becomes inconsistent with TFS's idea of what it should contain.

Git

Finally, I came to try Git a few years ago as I was intrigued by its branching model. This solved the only annoying thing I found with SVN - the problem of merging changes between the trunk and a long term branch. I like to merge often (as Agile and CI say you should) but SVN forced you to manually keep track of which versions you had already merged between branches. Git automatically tracks your merges so you can't forget to merge or merge the same thing twice.
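
Why does a version with two parents make this tracking automatic? The following Python sketch is a simplified model of a commit graph, not Git's actual implementation, and the History class and its methods are hypothetical names. Once a merge commit records both parents, "what still needs merging" is just the set of commits reachable from the branch but not from the trunk.

```python
class History:
    """Toy commit graph: each commit id maps to a list of parent ids."""

    def __init__(self):
        self.parents = {}

    def commit(self, commit_id, *parents):
        # An ordinary commit has one parent; a merge commit has two.
        self.parents[commit_id] = list(parents)
        return commit_id

    def ancestors(self, commit_id):
        """Return the commit itself plus everything reachable through parents."""
        seen = set()
        stack = [commit_id]
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.parents[current])
        return seen

    def unmerged(self, source, target):
        """Commits on `source` that `target` has not yet incorporated."""
        return self.ancestors(source) - self.ancestors(target)


h = History()
h.commit("A")                # common starting point
h.commit("B", "A")           # work on the trunk
h.commit("C", "A")           # work on the long term branch
before = h.unmerged("C", "B")
h.commit("M", "B", "C")      # merge commit with two parents
after = h.unmerged("C", "M")
h.commit("D", "C")           # more branch work after the merge
later = h.unmerged("D", "M")
```

In this model nobody has to write down which revisions were merged: the second parent link on the merge commit carries that information, so a later merge only has to consider the new branch commit.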

There is a lot to like about Git but in all honesty I do not find it as enjoyable to use as SVN. First, there is a plethora of confusing commands and options. For example, I never found the ability to "stage" changes before actually committing them all that useful. It just adds another layer of complexity.

But the worst thing about Git is that it is all command line driven. I always find it much easier to remember how to use GUI software than to remember obscure command names and options. Luckily Atlassian provides a GUI for Git in the form of a free product called SourceTree.

One good thing about Git is that it has an excellent book called "Pro Git" that explains in detail how to use it. However the book does get a little evangelical in its praise for Git at times. For example, it goes on about atomic commits (SVN has atomic commits), how fast it is to clone a repo (SVN checkout is faster) and that it has the killer feature of lightweight branching (SVN has that too).

Then there is the fact that Git is distributed whereas SVN is centralized. Now people rave on and on about the advantages of distributed version control but I really don't see it. Sure, if you have an open-source project with one or more different "forks" then it's probably useful. Personally I prefer one central "master" copy of the source into which changes are merged as soon as possible. I think having multiple repositories floating around would lead to a merge nightmare and contravene the idea behind CI.

Anyway, I don't want to go into too much depth on the "centralized vs distributed" debate here (I may later). So that's all for now. Bye.

Software Development

Thursday 17 November 2016

Version Control - Personal Experiences
