Tech Blog :: Using SVN Merge for Dev-Prod Web Deployments


Jul 23 '09 4:54pm

Note: SVN 1.5 and newer track merges automatically, which makes merging a little easier, so some of this may be obsolete.

We use Subversion (SVN) to manage our code, and typically branch the code from 'dev' to 'prod' at launch time, with each project's tech lead determining the best way to manage the two branches in parallel for subsequent development. SVN, like any decent versioning system, has branching and merging functionality, but until recently I had little success (and some disaster) trying to use it. So I used the cruder but tried-and-true method of locally rsync'ing dev to prod (with -rC, for recursive and ignoring SVN metadata) and committing the changes to each branch separately. That mostly works, but it has problems: it can be very tedious (using FileMerge to check every file and figure out which branch's version is correct); it makes SVN's logs very difficult to understand (no clear history for a file, and duplicate commits of everything); it's easy to make a mistake and mess up a branch; and it seemed unnecessary given SVN's built-in functionality. In some cases, the dev and prod branches got so out of whack that the developers gave up on merging them and did all development on local copies of the production branch, an inefficient (and dangerous) method that made branches pointless.

So on my latest post-launch project, I decided to figure out SVN merging. The first deployment took 2+ hours, with a splitting headache at the end; the second took a more lucid hour; now it takes 5 minutes, so I do this even for extremely urgent fixes. 'svn help merge' describes multiple ways to use the command; this is how I've gotten it to work.

SVN merge essentially diffs one branch (e.g. dev) from R1 (some revision) to R2 (some other revision, generally HEAD, the latest commit), then applies those differences as patches to a second branch (e.g. prod, in the same repository). The changes are applied locally, to a working copy of the second branch, and you then commit them to that branch. Since the branches are in the same repository, the revision numbers are sequential: changes to dev might be in revisions 101, 102 and 103, then merged to prod and committed as r104. Subsequent commits to dev start at r105, and so on. If all development is done on dev, R1 is generally the revision at which the second branch (prod) was last merged and committed.
So an example of the merge logic would be: take all changes to dev since r104 (up to HEAD), and copy them to prod. But I'm getting ahead of myself.
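That logic maps directly onto the command line. A minimal sketch, assuming a hypothetical helper called merge_cmd (it is not part of SVN) that only prints the command instead of running it:

```shell
# merge_cmd SOURCE DEST LAST_MERGED_REV
# Hypothetical helper: prints (does not run) the svn merge command that
# copies every change made on SOURCE after LAST_MERGED_REV into DEST.
merge_cmd() {
  printf 'svn merge -r%s:HEAD %s@HEAD %s\n' "$3" "$1" "$2"
}

# "Take all changes to dev since r104 and copy them to prod":
merge_cmd dev prod 104
```

Swapping the first two arguments gives the command for the opposite direction.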

The most important thing to keep track of is the revision at which each branch or merge is done. I keep a log in a Google Doc; you can also run 'svn log' each time to get the same information.
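If a shared document feels heavy, a plain text file next to the working copies could serve the same purpose. This is only a sketch: the file name merge-log.txt, its one-line-per-merge format, and the two helper functions are assumptions made for illustration.

```shell
# Hypothetical merge log: one line per committed merge, in the form
#   DATE SOURCE DEST REVISION
MERGE_LOG="${MERGE_LOG:-merge-log.txt}"   # file name is an assumption

# record_merge SOURCE DEST REVISION
# Appends one entry after a merge has been committed.
record_merge() {
  printf '%s %s %s %s\n' "$(date +%Y-%m-%d)" "$1" "$2" "$3" >> "$MERGE_LOG"
}

# last_merge_rev
# Prints the revision of the most recently recorded merge,
# i.e. the number to start the next -rNNN:HEAD range from.
last_merge_rev() {
  tail -n 1 "$MERGE_LOG" | awk '{ print $4 }'
}
```

After committing a merge you would call something like `record_merge dev prod 951`, and read the number back with `last_merge_rev` before the next merge.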
Also before I get into syntax, some precautions:

  • always use the --dry-run option with 'svn merge' the first time, to check what it wants to do.
  • back up all the copies of the site that might be changed (local and on the servers, dev and prod).
  • very important: make sure your local working copies are all updated ('svn up'), so the HEAD revisions match.
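Two of these precautions (updating and the dry run) are easy to bundle into one step. The sketch below assumes a hypothetical helper called premerge and an SVN variable used to swap in a stand-in command. Backups are left out, because how to take them depends on the servers involved.

```shell
# SVN is the command to invoke. Setting SVN="echo svn" prints each step
# instead of running it, which is a cheap way to check the sketch itself.
SVN="${SVN:-svn}"

# premerge SOURCE DEST LAST_MERGED_REV
# Hypothetical helper: updates both working copies, then previews the
# merge with --dry-run. Nothing is changed in DEST by the preview.
premerge() {
  $SVN up "$1" "$2" || return 1
  $SVN merge --dry-run -r"$3":HEAD "$1@HEAD" "$2"
}
```

From the folder that holds both working copies, `premerge dev prod 900` would update dev and prod and then show what the merge intends to do.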

On this particular project, I have a /project folder, under which are dev and prod folders, each holding a working copy of its respective branch (of the same site). So /project gives me a good vantage point: I can 'svn up dev; svn up prod' to update both, and the folder names make for clear syntax.
For the initial launch, I had branched dev (via svn cp) to prod at r833. I had then done some post-launch debugging directly on the prod branch, so it needed to be merged back to dev, to resume development there.
The syntax (in a Unix terminal on OSX), from /project, was like this:
svn merge -r833:HEAD prod@HEAD dev

The logic: Take all the changes on prod since revision #833, and copy them to dev.
(It can take a while to run the merge, so don't worry if you don't see any response for a while.)
When that's done, run 'svn st dev' to make sure that everything looks right, then commit dev. Keep track of the new revision number; you'll need it later. In this case let's say it was r900.
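The new revision number can be picked out of the commit output rather than copied by eye. A small sketch, assuming svn's English-language output, where the last line of a successful commit reads "Committed revision N."; the helper name committed_rev is hypothetical.

```shell
# committed_rev
# Reads `svn commit` output on stdin and prints the new revision number,
# taken from the "Committed revision N." line (English locale assumed).
committed_rev() {
  sed -n 's/^Committed revision \([0-9][0-9]*\)\.$/\1/p'
}

# Intended use (not run here; the commit message is only an example):
#   svn commit dev -m "Merge prod changes since r833 into dev" | committed_rev
```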

With dev now an accurate copy of prod, I made some changes to the dev code, which brought dev to r950.
Fast forward a couple of days, and I want to deploy the new code from dev back to prod.
From /project again, with /project/dev and /project/prod updated and all wanted changes committed to the dev branch, I did:
svn merge -r900:HEAD dev@HEAD prod

Note r900 -- that's the committed revision from the last merge. Prod hasn't changed since r900, but dev has. All the changes I want are on dev between r900 and HEAD. So the command does exactly that: it copies all the changes committed on dev since r900 (dev is pegged @HEAD, which my updated working copy matches) to my local prod. Only committed changes are merged, which is why everything wanted on dev has to be committed first. Again, do a dry run first to make sure it's not doing anything wonky --
svn merge -r900:HEAD dev@HEAD prod --dry-run

-- then a live run, then commit again to prod.
I said dev was previously at r950, so the new merged revision on prod should be r951.
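When the dry-run output is long, it can help to flag conflicts automatically before the live run. The sketch below assumes that svn marks a text conflict with a C in the first status column and a property conflict with a C in the second; tree conflicts are not covered. The helper name has_conflicts is hypothetical.

```shell
# has_conflicts
# Reads `svn merge` (or `svn merge --dry-run`) output on stdin and
# succeeds (exit 0) if any line carries a C in the first or second
# status column, i.e. a text or property conflict.
has_conflicts() {
  grep -Eq '^(C|.C) '
}

# Intended use (not run here):
#   svn merge --dry-run -r900:HEAD dev@HEAD prod | has_conflicts \
#     && echo "conflicts ahead: sort them out before the live run"
```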

A week later dev's at r999, and I want to merge to prod again:
svn merge -r951:HEAD dev@HEAD prod

That creates r1000 on prod ... a few days later,
svn merge -r1000:HEAD dev@HEAD prod

... and so on. The key is to keep track of that merged revision number, because every change made on the other branch after that merge still needs to be merged over.

If I wanted to copy prod back to dev, let's say because users have uploaded files on the live server which I've committed to the prod branch, I could use the original prod-to-dev syntax, like so:
svn merge -rXXXX:HEAD prod@HEAD dev

(This time the source is the changes on prod and the destination is my local dev, the opposite of the dev-to-prod merges above.)

It could get tricky if the same files are being modified simultaneously on both branches, but in our work that shouldn't happen: one branch of a site is considered the active development branch at any time, everyone on the team is notified, and conflicts are handled locally. What I have done successfully is commit uploaded files to the prod branch and then merge from dev to prod; the new prod files (missing in the dev branch) were left alone, as they should be. I suspect that with more complicated simultaneous modifications the principle would be the same, and there would just be more to wrap your head around. I haven't tested scenarios like that yet.

Points to remember:

  • keep track of the revision # from the last committed merge
  • think in terms of copying changes in a source since some point to a destination.
  • only work on one active development branch at a time
  • work on updated local working copies
  • use --dry-run first and back everything up
  • have Advil available for the first few times.

The beauty of this approach, in the end, is a clear history of each file in the logs, no duplication, synchronized branches, and a much cleaner deployment process. I'm going to use it on all the future projects that I can, and recommend that my colleagues do the same.

(Please let me know if anything I've written here is unclear or blatantly wrong. ben at echoditto dot com.)
Good luck!

Thank you for the article. I have been using this SVN dev/prod branch workflow efficiently for a few months now.

I am looking for a tool that would automatically read the log of both branches and tell me which revisions of the dev branch (the trunk) still need to be merged to prod, and vice versa. If such a tool does not exist yet, I might end up writing it myself.
