Tech Blog :: How to add an unversioned live site into SVN


Dec 16 '09 11:13am

Recently at work, I've been moving a bunch of older sites into SVN. (The plan is to migrate everything to Git soon.) I've done it on 3 big sites so far and got it down to a simple process.

Some background on the setup I documented this for:
  • This is all done on a *nix server via an SSH'd bash shell.
  • These sites have live and staging versions which, because they weren't versioned, had drifted out of sync. Reconciling every difference between the two seemed prohibitively complicated, so after consulting with everyone who uses the staging site, I put the live site into SVN and pushed it over to staging.
  • These sites have various mini-sites working inside them, some Drupal and some Java apps which periodically write static HTML files to the server. Years of these files have accumulated to many gigabytes of data, most of which didn't need to be in SVN. So I made liberal use of svn:ignore.
  • These sites are based on a main server which pushes out to multiple load-balanced servers, which are what visitors actually interact with. Coordinating with our sysadmin to pause that export process briefly prevented complications in the last steps.
  • You might need sudo access to do some of these steps. Be careful about preserving the file ownership/permissions.
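
Since ownership and permissions are easy to lose along the way, one option is to snapshot them before starting and compare again before flipping the webroot. This is only a sketch: `snapshot_perms` is a hypothetical helper name, and it assumes GNU find (for `-printf`).

```shell
# Hypothetical helper: print "mode owner group path" for everything under
# a directory, skipping .svn, sorted by path so two snapshots can be diffed.
# Assumes GNU find, which provides -printf.
snapshot_perms() {
  local root="$1"
  ( cd "$root" && find . -name .svn -prune -o -printf '%m %u %g %p\n' | sort -k4 )
}

# Example use (paths as in the rest of this post):
#   snapshot_perms /var/www > ~/perms-before.txt
#   snapshot_perms /var/www-WC | diff ~/perms-before.txt -
```

An empty diff means modes and owners match for every path outside `.svn`.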

So here's how it goes. I'm using generic site and directory names; substitute as needed:
ssh webserver
 
## backup the webroot
# (-a keeps permissions, timestamps and symlinks; run it with sudo to keep ownership too)
cp -a /var/www /var/www-backup
# and/or
tar -czf ~/mysite-backup.tar.gz /var/www
 
# copy the site outside the live webroot (into home dir)
# (rsync parameters preserve links and permissions; see 'man rsync' for details)
mkdir ~/mysite-copy/
sudo rsync -rlpogt /var/www/ ~/mysite-copy/
 
# should be the same size
du -sh ~/mysite-copy; du -sh /var/www
 
# make svn space
svn mkdir http://svn/repo/mysite -m"creating import space for mysite"
svn mkdir http://svn/repo/mysite/tags -m"creating tags dir"
 
## import the folders that should be fully versioned from the copy
## don't import the folders that should be svn:ignore'd
## (here I'm making a date-stamped static "tag" copy and then copying to trunk for the future)
cd ~/mysite-copy
svn import somedir http://svn/repo/mysite/tags/mysite-webserver-MMDDYY/somedir -m"importing mysite somedir"
# ... repeat for other directories
 
  # check what's worth versioning in the root
  du -hc --max-depth 1
 
    ## might need to fix ownership and permissions for the import to work, e.g.
      sudo chgrp -R webgroup .
      sudo chmod -R g+rwx .
    ## (or just sudo the whole import)
    ## if an import fails, restart it (commits are atomic, so a failed import leaves nothing partial behind)
 
 
## checkout the [partially imported] site
mkdir ~/mysite-checkout
cd ~/mysite-checkout/
svn co http://svn/repo/mysite/tags/mysite-webserver-MMDDYY/ ~/mysite-checkout/
 
# copy everything over checkout
rsync -rlpogt ~/mysite-copy/ ~/mysite-checkout/ --progress
 
## put other folders in svn:ignore
## (can be done several ways, this requires manual editing but is simple)
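## find folders in the checkout root (now a full copy of mysite-copy), pipe thru sort, save to .svnignore
## (depth options go before -type so GNU find doesn't warn; no loop, so names with spaces survive)
find . -mindepth 1 -maxdepth 1 -type d | sed 's|^\./||' | sort > .svnignore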
 
## then edit the list: remove the folders already imported, and remove .svn
nano .svnignore
 
# once .svnignore is good,
svn propset svn:ignore -F .svnignore .
# (remember to keep those in sync, so the file is always a reference for propset'ing)
 
# confirm ignores, no changes
svn status
# or to see everything,
svn status --no-ignore
# DO NOT DO 'svn add *' !!! -- that will add the ignored dirs!
 
# add root files 
# (this only adds what appears in the status report; I'm borrowing this from someone else)
svn status | sed -ne "/^?/ {s/^? *//;s/ /\\\ /g;p;}" | xargs svn add
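
The one-liner above escapes spaces by hand so that xargs splits the paths correctly. A possible alternative, sketched here and not tested against every svn version, is to strip the status column and hand xargs NUL-separated paths. `unversioned_paths` is a hypothetical helper name, and filenames containing newlines are not handled.

```shell
# Hypothetical helper: read 'svn status' output on stdin and print only the
# paths of unversioned items (lines whose first column is '?').
unversioned_paths() {
  sed -n 's/^? *//p'
}

# Example use, run inside the working copy:
#   svn status | unversioned_paths | tr '\n' '\0' | xargs -0 svn add
```

Because the paths are NUL-separated when they reach xargs, spaces and quotes in filenames need no escaping.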
 
# commit
svn commit ~/mysite-checkout -m"A VERY DESCRIPTIVE MESSAGE"
 
## (I ran into conflicts here with case-sensitive duplicates; 
## in that case, identify and delete the extras, svn cleanup, commit again)
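
To track down the case-sensitive duplicates mentioned above, something like the following may help. It is a sketch under assumptions: `case_dupes` is a hypothetical helper name, and it relies on GNU uniq for the `-D` option (print every member of a repeated group).

```shell
# Hypothetical helper: read paths on stdin and print every path that has a
# sibling differing only in letter case (e.g. img/Logo.png vs img/logo.png).
# LC_ALL=C keeps the sort order and the case folding byte-based and predictable.
case_dupes() {
  LC_ALL=C sort -f | LC_ALL=C uniq -i -D
}

# Example use, run in the working copy root:
#   find . -name .svn -prune -o -print | case_dupes
```

Each group of clashing names is printed together, so it is easy to see which copy to delete before running `svn cleanup` and committing again.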
 
## make sure /trunk doesn't exist already (otherwise it'll go into a subfolder)
svn ls http://svn/repo/mysite/trunk
svn cp http://svn/repo/mysite/tags/mysite-webserver-MMDDYY http://svn/repo/mysite/trunk -m"copying mysite-webserver tag to trunk"
 
## make sure all subsequent commits go to the trunk
svn switch http://svn/repo/mysite/trunk
 
## checkout again into a directory parallel to the webroot
sudo mkdir /var/www-WC
cd /var/www-WC/
# (using sudo here, then fixing the permissions; might not be necessary in all cases)
sudo svn co http://svn/repo/mysite/trunk/ /var/www-WC/
sudo chgrp -R webgroup /var/www-WC
sudo chmod -R g+rw /var/www-WC/
 
## copy everything else (and recently updated files)
sudo rsync -rlpogt /var/www/ /var/www-WC/ --progress
## (run twice, 2nd time should have little/no output)
 
# check size & differences (excluding SVN data)
du -sh --exclude .svn /var/www/; du -sh --exclude .svn /var/www-WC/
diff --recursive --exclude .svn /var/www /var/www-WC
# (assuming it's identical or close enough...)
 
## commit anything newer...
 
## at this point it's worth checking with your sysadmin (if you're not a sysadmin yourself) that any load-balancer exports or the like are shut off
 
## MAKE SURE ALL RELEVANT STAKEHOLDERS ARE OK WITH THIS BEFORE DOING
## flip the webroot
sudo mv /var/www /var/www-OLD; sudo mv /var/www-WC /var/www;
 
## test it, du and diff again
## keep the -OLD directory for a while just in case, or tarball it and put it somewhere else
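
For the "tarball it" option, here is a minimal sketch that archives the old webroot and checks that the archive is readable before anything gets deleted. `archive_old_webroot` and the date-stamped file name are my own illustrative choices; the function never removes the source directory.

```shell
# Hypothetical helper: tar+gzip a directory into dest_dir, verify the archive
# can be listed end to end, then print the archive path.
archive_old_webroot() {
  local src="${1%/}" dest_dir="$2"
  local archive="$dest_dir/$(basename "$src")-$(date +%Y%m%d).tar.gz"
  # -C stores paths relative to the parent, e.g. www-OLD/... not /var/www-OLD/...
  tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
  # Listing the archive reads it through, which catches a truncated file.
  tar -tzf "$archive" > /dev/null || return 1
  printf '%s\n' "$archive"
}

# Example use (may need sudo to read every file):
#   archive_old_webroot /var/www-OLD ~/backups
```
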
That's pretty much it... let me know if I missed anything, if you have any questions, comments, etc. Good luck!
