DVCS: Modern Source Control, aka the Programmer’s Safety Net

Revision control is a key tool for modern software engineers. It gives the individual developer a safety net, and it gives teams a collaborative framework that lets many developers work on the same project without fear of stepping on each other’s toes.

Revision control isn’t a new idea. RCS and its descendant, CVS, date back to the 1980s, and they in turn were based on even older systems. That said, many programmers still aren’t using it. Eric Sink blames it on lack of training. Ben Collins-Sussman thinks it’s because 80% of programmers aren’t "Alphas". Andrew Smith (the number one hit on Google, I might add) thinks it’s because it takes too long to learn and it’s hard to set up a server. I’ll plead the Fifth and say I hope I can be a part of the solution instead of the problem!

In any case, up until the last few years, revision control systems were centralized. That is, there was a single central repository of code to which contributors connected, checking out code and checking in their changes. Subversion is the latest of these centralized systems. It was developed specifically to be CVS without the worst of the bugs, and to that end it is very successful. If you want great tools support, have a reasonably sized team, like non-mind-bending behavior, and only work across a local network anyway, Subversion is a great system.

However, many developers have become frustrated with centralized version control. Nobody wants to be accused of ‘breaking the build’, so naturally the frequency of checkins decreases. Likewise, to keep newbies from breaking the build, project administrators don’t give out commit access lightly. The end result is that developers lose the safety-net aspect of revision control. I’ve seen developers keep a copy of their source code outside of revision control because they’re so afraid they might check in something bad.

In addition, since core contributors are the only ones with commit access to the revision control system, most contributions must come as patches. These patches can be tricky to create at the best of times, and as a project scales up the problem becomes untenable. Just check out the Linux kernel mailing list to get a sense of the problem.

The answer to these problems is called a Distributed Version Control System, or DVCS. There are quite a few of these animals out there. Most recently, it seems as if the open source playing field is being dominated by three: Bazaar, Git, and Mercurial. All of these systems have their pluses and minuses, but they are all open source and work well enough to get the job done.

Distributed version control systems have quite a few things in common. Instead of using a single line or tree with named revision numbers to store the change history, they use a directed acyclic graph: each change records which change or changes it was built on. This basically means that you can have multiple valid lines of development at the same time, in different places, and join them up again later. Hence, distributed.
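
To make the graph idea concrete, here is a minimal sketch in Python. It is a toy model, not how Bazaar, Git, or Mercurial actually store history: the Commit class, the one-letter ids, and the helper names (ancestors, heads) are all made up for illustration. It shows two developers building on the same change, which leaves the graph with two valid "latest" revisions at once, and a merge that joins them back together.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """One node in the history graph (hypothetical model, not a real DVCS format)."""
    id: str
    parents: tuple = ()  # no parents = first commit; two or more = merge


def ancestors(commit_id, graph):
    """Return the ids of every commit reachable by following parent links."""
    seen = set()
    stack = [commit_id]
    while stack:
        current = stack.pop()
        for parent in graph[current].parents:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def heads(graph):
    """Return the ids of commits that no other commit lists as a parent."""
    referenced = {p for commit in graph.values() for p in commit.parents}
    return sorted(set(graph) - referenced)


# A shared starting history: a <- b
graph = {c.id: c for c in [Commit("a"), Commit("b", ("a",))]}

# Two developers each commit on top of "b", without talking to each other.
graph["c"] = Commit("c", ("b",))
graph["d"] = Commit("d", ("b",))
two_heads = heads(graph)  # both "c" and "d" are valid tips of the history

# A merge commit has two parents and joins the lines back together.
graph["m"] = Commit("m", ("c", "d"))
one_head = heads(graph)
merged_history = ancestors("m", graph)

print(two_heads, one_head, sorted(merged_history))
```

A linear revision counter has no way to say that "c" and "d" are both legitimate successors of "b". In the graph, that is just two nodes pointing at the same parent, and the merge is just a node with two parents.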

What this means to you (the developer) is that you get a local copy of the entire repository available to you at all times. That means you can check in, revert, merge, create branches, etc. without a network connection.

It also means that you always have access to that revision control sandbox. It allows you to ‘check in early, check in often’, and still not live in fear of breaking the build or disrupting somebody else’s work with your bad code. When your code is good and ready, you can review its entire change history, merge in any changes, and submit the entire changeset directly to the central repository or to a core committer as a patch.

Having a local copy of the repository also means that every developer location holds a more complete copy of your source code, history included, with a DVCS than it would with a traditional VCS.

I’ll get into the nitty-gritty of how to actually start using a DVCS (and how it’s arguably faster and easier than svn) in another post, but for now, just get out there and use something. Not using source control is like skydiving without a parachute.

References: