Copyright © 2005 Michael L.H. Brouwer, Russell Brown
(TBA)
Table of Contents
List of Figures
After playing with Subversion and SVK for a long time, without really being able to use either for real work since our main repository was still using CVS, the author convinced his management to let him switch their SCM over to SVK directly instead of moving to Subversion, which would have been at best a step sideways from CVS.
The transition went smoothly and after the CVS repository was converted to Subversion using cvs2svn, we mirrored it to an SVK depot and set up a quick and easy bootstrap process for everyone to use. Almost all of the developers started using SVK immediately without any problems. A few of them asked the question:
Q: You expect me to use a piece of software without documentation for mission critical work?
A: It has documentation; look at the built-in help or go to the SVK wiki.
Of course I started to realize that without hanging out in the #svk IRC channel, the help and the wiki were really not sufficient to get someone going with SVK. So I started writing a guide to using SVK day by day on our internal wiki.
After a few days of manually updating the wiki and keeping the table of contents in sync with the actual content, I started to realize that a wiki isn't the best way to write a book, which is what this guide was turning into. I also watched people coming to #svk on a daily basis asking very similar questions about SVK which should have been answered in a document somewhere.
After some encouragement from clkao on IRC I decided to start with a copy of the Subversion book's DocBook XML sources and write a book about SVK. The idea was to collect into one place the information from the wiki, the FAQs, #svk, and the things I was putting on our internal wiki that really applied to SVK in general.
This book is that one place.
— , San Jose, 30 August, 2005
In the world of open-source software, the Concurrent Versions System (CVS) has long been the tool of choice for version control. And rightly so. CVS itself is free software, and its non-restrictive modus operandi and support for networked operation, which allow dozens of geographically dispersed programmers to share their work, fit the collaborative nature of the open-source world very well. CVS and its semi-chaotic development model have become cornerstones of open-source culture.
Like many tools that have lasted 25 years, CVS is starting to show its age. Subversion is a relatively new version control system designed to be the successor to CVS. The designers set out to win the hearts of CVS users in two ways: by creating an open-source system with a design (and “look and feel”) similar to CVS, and by attempting to fix most of CVS's noticeable flaws. While the result isn't necessarily the next great evolution in version control design, Subversion is very powerful, very usable, and very flexible.
For some people, a plain successor to CVS wasn't good enough. One of those people was Chia-liang Kao. He took a year off his regular work to sit down and write a version control system that would help raise his own productivity once he got back to doing paid work. The result of his labor, and more recently that of an entire community of users and developers, is SVK. While Subversion set out to take over CVS's user base, SVK attempts to provide an answer for many others, including people who had already defected to another version control system and users who had never before used version control. SVK is written in Perl and uses the underlying revision-tracking filesystem built by the Subversion project.
This book documents SVK version 1.04. We have made every attempt to be thorough in our coverage. However, SVK has a thriving and energetic development community — a number of features and improvements planned for future versions of SVK may change some of the commands and specific notes in this book.
This book is written for computer-literate folk who want to use SVK to manage their data. While SVK runs on a number of different operating systems, its primary user interface is command-line based. It is that command-line tool (svk) which is discussed and used in this book. For consistency, the examples in this book assume the reader is using a Unix-like operating system, and is relatively comfortable with Unix and command-line interfaces.
That said, the svk program also runs on
non-Unix platforms like Microsoft Windows. With a few minor
exceptions, such as the use of backward slashes
(\) instead of forward slashes
(/) for path separators, the input to and
output from this tool when run on Windows are identical to its
Unix counterpart. However, Windows users may find more success
by running the examples inside the Cygwin Unix emulation
environment.
Most readers are probably programmers or sysadmins who need to track changes to source code. This is the most common use for SVK, and therefore it is the scenario underlying all of the book's examples. But SVK can be used to manage changes to any sort of information: images, music, databases, documentation, and so on. To SVK, all data is just data.
While this book is written with the assumption that the reader has never used version control, we've also tried to make it easy for users of CVS or Subversion to make a painless leap into SVK. Special sidebars may discuss CVS or Subversion from time to time, and a special appendix summarizes most of the differences between CVS, Subversion and SVK.
This book aims to be useful to people of widely different backgrounds—from people with no previous experience in version control to experienced sysadmins. Depending on your own background, certain chapters may be more or less important to you. The following can be considered a “recommended reading list” for various types of readers:
The assumption here is that you've probably used CVS before, and are dying to get a Subversion server up and running ASAP. Chapters 5 and 6 will show you how to create your first repository and make it available over the network. After that's done, chapter 3 and appendix A are the fastest routes to learning the SVK client while drawing on your CVS or Subversion experience.
Your administrator has probably set up Subversion already, and you need to learn how to use SVK as a client. If you've never used a version control system (like CVS or Subversion), then chapters 2 and 3 are a vital introduction. If you're already an old hand at CVS or Subversion, chapter 3 and appendix A are the best place to start.
Whether you're a user or administrator, eventually your project will grow larger. You're going to want to learn how to do more advanced things with SVK, such as how to use branches and perform merges (chapter 4), how to use SVK's property support, how to configure runtime options (chapter 7), and other things. Chapters 4 and 7 aren't vital at first, but be sure to read them once you're comfortable with the basics.
Presumably, you're already familiar with SVK, and now want to either extend it or add new tests or fixes to it. Chapter 8 is just for you.
The book ends with reference material: chapter 9 is a reference guide for all SVK commands, and the appendices cover a number of useful topics. These are the chapters you're most likely to come back to after you've finished the book.
This section covers the various conventions used in this book.
Constant width: Used for commands, command output, and switches
Constant width italic: Used for replaceable items in code and text
Italic: Used for file and directory names
This icon designates a note relating to the surrounding text.
This icon designates a helpful tip relating to the surrounding text.
This icon designates a warning relating to the surrounding text.
Note that the source code examples are just that—examples. While they will compile with the proper compiler incantations, they are intended to illustrate the problem at hand, not necessarily serve as examples of good programming style.
The chapters that follow and their contents are listed here:
Covers the history of SVK as well as its features, architecture, components, and install methods. Also includes a quick-start guide.
Explains the basics of version control and different versioning models, along with SVK's depot, working copies and revisions.
Walks you through a day in the life of an SVK user. It demonstrates how to use SVK to obtain, modify, and commit data.
Discusses branches, merges, and tagging, including best practices for branching and merging, common use cases, how to undo changes, and how to easily swing from one branch to the next.
Describes the basics of the SVK depot, how to create, configure and maintain a depot, how to set up a shared repository, and the tool you can use to do all this.
Explains how to configure a Subversion server for
use with SVK, and the three ways to access your
repository: HTTP, the
svn protocol, and local access. It
also covers the details of authentication, authorization
and anonymous access.
Explores the SVK client environment variables, file
and directory properties, how to
ignore files in your working copy,
and lastly how to handle vendor branches.
Describes the internals of SVK, the $SVKROOT administrative areas from a programmer's point of view. Shows how to write new tests for SVK and most importantly, how to contribute to the development of SVK.
Explains in great detail every subcommand of svk with plenty of examples for the whole family!
Covers the similarities and differences between SVK and Subversion.
Addresses common problems and difficulties using and building SVK.
Discusses tools that support or use Subversion, including alternative client programs, repository browser tools, and so on.
This book started out as a branch of the Version Control With Subversion book. Over time it was morphed into an SVK-specific book by the authors. As such, it has always been under a free license. (See Appendix D, Copyright.) In fact, the book was written in the public eye, as a part of SVK. This means two things:
You will always find the latest version of this book in
the book's own Subversion repository at svn://svn.clkao.org/svkbook/trunk.
You can distribute and make changes to this book however you wish—it's under a free license. Of course, rather than distribute your own private version of this book, we'd much rather you send feedback and patches to the SVK developer community. See the section called “” to learn about joining this community.
A relatively recent online version of this book can be found
at http://svkbook.elixus.org.
This book would not be possible (nor very useful) if SVK did not exist. For that, the authors would like to thank Chia-liang Kao for having the vision of writing an Open Source version control system with the power, speed and interoperability that SVK has.
In addition, this book wouldn't have existed, or at least would have taken much longer to write, if it weren't for the Subversion book written by Ben Collins-Sussman, Brian W. Fitzpatrick and C. Michael Pilato.
The original book on which this book is based is titled Version Control With Subversion and is Copyright 2002, 2003, 2004, 2005 by Ben Collins-Sussman, Brian W. Fitzpatrick and C. Michael Pilato.
We would also like to thank the countless people who contributed to the SVK book with reviews, suggestions and fixes: While this is undoubtedly not a complete list, this book would be incomplete and incorrect without the help of: Gary Hoo, Jesse Vincent, Michael Hendricks, Chris Russo and the entire SVK community.
Finally we would like to thank the countless people who contributed to the original subversion book with informal reviews, suggestions, and fixes: While this is undoubtedly not a complete list, this book would be incomplete and incorrect without the help of: Jani Averbach, Ryan Barrett, Francois Beausoleil, Jennifer Bevan, Matt Blais, Zack Brown, Martin Buchholz, Brane Cibej, John R. Daily, Peter Davis, Olivier Davy, Robert P. J. Day, Mo DeJong, Brian Denny, Joe Drew, Nick Duffek, Ben Elliston, Justin Erenkrantz, Shlomi Fish, Julian Foad, Chris Foote, Martin Furter, Dave Gilbert, Eric Gillespie, Matthew Gregan, Art Haas, Greg Hudson, Alexis Huxley, Jens B. Jorgensen, Tez Kamihira, David Kimdon, Mark Benedetto King, Andreas J. Koenig, Nuutti Kotivuori, Matt Kraai, Scott Lamb, Vincent Lefevre, Morten Ludvigsen, Paul Lussier, Bruce A. Mah, Philip Martin, Feliciano Matias, Patrick Mayweg, Gareth McCaughan, Jon Middleton, Tim Moloney, Mats Nilsson, Joe Orton, Amy Lyn Pilato, Kevin Pilch-Bisson, Dmitriy Popkov, Michael Price, Mark Proctor, Steffen Prohaska, Daniel Rall, Tobias Ringstrom, Garrett Rooney, Joel Rosdahl, Christian Sauer, Larry Shatzer, Russell Steicke, Sander Striker, Erik Sjoelund, Johan Sundstroem, John Szakmeister, Mason Thomas, Eric Wadsworth, Colin Watson, Alex Waugh, Chad Whitacre, Josef Wolf, Blair Zajac, and the entire Subversion community.
Version control is the art of managing changes to information. It has long been a critical tool for programmers, who typically spend their time making small changes to software and then undoing those changes the next day. But the usefulness of version control software extends far beyond the bounds of the software development world. Anywhere you can find people using computers to manage information that changes often, there is room for version control. And that's where SVK comes into play.
This chapter contains a high-level introduction to SVK—what it is; what it does; how to get it.
SVK is a free/open-source version control system. That is, SVK manages files and directories over time. A tree of files is placed into a depot on the user's machine. The depot remembers every change ever made to your files and directories, and also to other files and directories that you mirror from other places. This allows you to recover older versions of your data, or examine the history of how your data changed. In this regard, many people think of a version control system as a sort of “time machine”.
Also central to many uses of Version Control is the concept of a repository. A repository is like a filesystem that can either be hosted on a remote server or locally on the same machine as the user's depot. While SVK doesn't currently provide a repository server, it has been designed to be able to work with repositories created by other Version Control systems; in particular Subversion repositories.
SVK can access repositories across networks, which allows them to be used by people on different computers. This is convenient for those who like a centralized repository model, and SVK supports this fully. Others prefer more detached or distributed models, and SVK is at home in these environments too. Whichever model is used, because the work is versioned, you can always track back through the changes that have been made, and undo them if required.
Some version control systems are also software configuration management (SCM) systems. These systems are specifically tailored to manage trees of source code, and have many features that are specific to software development—such as natively understanding programming languages, or supplying tools for building software. SVK, however, is not one of these systems. It is a general system that can be used to manage any collection of files. For you, those files might be source code—for others, anything from grocery shopping lists to digital video mix-downs and beyond.
The idea of source control management has been around for many years. Early on, the centralized repository scheme became the standard for many developer teams. In more recent times, distributed source control has become part of the normal workflow. CVS is a widely known and used standard for source control management. Because of some of its shortcomings, a team of developers decided to create an improved successor to CVS called Subversion (svn). While Subversion is still a centralized repository system, it was a new design and was developed to be more flexible than its predecessor. Because of this, developers are able to take key components and connect them together in ways that the original designers never dreamt of. SVK is one of those dreams.
This particular dream came from Chia-liang Kao, who took a year out from work to develop SVK.
While Subversion aimed to take the basic Source Control model already provided by CVS and improve on its design and implementation, SVK aims to open up other Source Control techniques and features offered by other Source Control Systems and models.
Because some of these features are very different to those offered by fully-centralized systems such as CVS and Subversion, there may be a number of techniques or terms that are unfamiliar to you. These will be explained later.
SVK provides the following key features:
If a repository is being used, all read-only operations are available without a connection to the repository.
It is possible to use SVK without any connection to a repository whatsoever.
Merges between branches are tracked automatically, and therefore do not need manual lists of revision numbers to be specified.
SVK performs very well when compared with other version control systems.
SVK can mirror repositories created by a number of Source Control Systems other than Subversion.
Subversion expresses file differences using a binary differencing algorithm, which works identically on both text (human-readable) and binary (human-unreadable) files. Both types of files are stored equally compressed in the repository, and differences are transmitted in both directions across the network.
Figure 1.1, “SVK's Architecture” illustrates what one might call a “mile-high” view of SVK's design.
Because of the extremely flexible nature of SVK, it's very difficult to pin down one definitive way of describing its architecture, either in words or graphically. The way in which it all works depends largely on the source control model that you're currently using, and that gets even more complicated when you consider that you could be using more than one model with multiple projects in the same SVK installation.
On one end is a Subversion repository that holds all of your versioned data. On the other end is your Subversion client program, which manages local reflections of portions of that versioned data (called “working copies”). Between these extremes are multiple routes through various Repository Access (RA) layers. Some of these routes go across computer networks and through network servers which then access the repository. Others bypass the network altogether and access the repository directly.
SVK is a set of Perl 5 modules written on top of Subversion's Perl bindings. In order to run SVK you will need to install Subversion with Perl bindings. Subversion itself is built on a portability layer called APR (Apache Portable Runtime library). This means SVK should work on any operating system that supports Perl 5 and on which the Apache httpd server runs: Windows, Linux, all flavors of BSD, Mac OS X, Netware, and others.
The easiest way to get SVK is to download a binary
package built for your operating system. SVK's website
(http://svk.elixus.org) often has
these packages available for download, posted by volunteers.
The site usually contains graphical installer packages for users
of Apple and Microsoft operating systems. If you run a Unix-like
operating system, you can use your system's native package
distribution system (RPMs, DEBs, the ports tree, etc.) to get
SVK.
Alternately, you can build SVK directly from source
code. Below are instructions from the SVK wiki at http://svk.elixus.org/?BuildingSvkInYourHomeDirectory,
on how to build SVK in your home directory without root
access:
#!/bin/sh
# Install Perl
curl -O http://www.cpan.org/src/stable.tar.gz
tar xzf stable.tar.gz
cd perl-5*
sh Configure -ders -Dprefix=$HOME/local
make
make test # I didn't really do this step :P
make install
cd ..
export PATH=$HOME/local/bin:$PATH
# Install CPAN Modules
rm -f $HOME/local/lib/perl5/5.8.6/CPAN/Config.pm
rm -fr .cpan
perl -e 'print "\n\n\n\n\n\n\n\nfollow\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n5\n4\n1\n\nq\n"' > answers
perl -MCPAN -eshell < answers
rm answers
cpan Bundle::CPAN < /dev/null
cpan LWP < /dev/null
cpan ExtUtils::AutoInstall < /dev/null
cpan Module::Build < /dev/null
cpan Module::Install < /dev/null
cpan Module::Signature < /dev/null
cpan SVN::Simple::Edit < /dev/null
cpan version < /dev/null
cpan Sort::amic < /dev/null
cpan PerlIO::via::symlink < /dev/null
cpan IO::Digest < /dev/null
cpan Date::Parse < /dev/null
cpan File::Type < /dev/null
cpan PerlIO::eol < /dev/null
cpan Locale::Maketext::Simple < /dev/null
cpan Locale::Maketext::Lexicon < /dev/null
cpan FreezeThaw < /dev/null
cpan HTML::Element < /dev/null
cpan IPC::Run3 < /dev/null
cpan Pod::HTML_Elements < /dev/null
cpan Text::Diff < /dev/null
cpan XML::ValidWriter < /dev/null
cpan VCP::Dest::svk < /dev/null
# Install Apache 2.0
wget http://apache.mirrors.pair.com/httpd/httpd-2.0.52.tar.gz
tar xzf httpd-2.0.52.tar.gz
# Enter the unpacked source tree before configuring
cd httpd-2.0.52
./configure --prefix=$HOME/local/apache2 --enable-mods-shared='headers rewrite dav ssl'
make
make install
cd ..
# XXX Apache configuration/startup juju goes here...
# Install SWIG
# The tarball version must match the one unpacked and cleaned up below (1.3.24)
wget http://optusnet.dl.sourceforge.net/sourceforge/swig/swig-1.3.24.tar.gz
tar xzf swig-1.3.24.tar.gz
cd SWIG-1.3.24/
./configure --with-perl5=$HOME/local/bin/perl5.8.6 --prefix=$HOME/local
make
make install
cd ..
# Install Subversion
wget http://subversion.tigris.org/tarballs/subversion-1.2.1.tar.gz
tar xzf subversion-1.2.1.tar.gz
cd subversion-1.2.1
./configure \
SWIG=$HOME/local/bin/swig \
PERL=$HOME/local/bin/perl5.8.6 \
--prefix=$HOME/local \
--with-apxs=$HOME/local/apache2/bin/apxs
make
make swig-pl
make check-swig-pl
make install
make install-swig-pl
cd ..
# Install SVN-Mirror
svn co svn://svn.clkao.org/member/clkao/modules/SVN-Mirror/ SVN-Mirror
cd SVN-Mirror
perl Makefile.PL
make
make test
make install
cd ..
# Install VCP
wget http://search.cpan.org/CPAN/authors/id/A/AU/AUTRIJUS/VCP-autrijus-snapshot-0.9-20050110.tar.gz
tar xzf VCP-autrijus-snapshot-0.9-20050110.tar.gz
cd VCP-autrijus-snapshot-0.9-20050110
perl Makefile.PL
make
make test
make install
cd ..
# Install SVK
svn co svn://svn.clkao.org/svk/trunk svk
cd svk/
perl Makefile.PL
make
make test
make install
cd ..
rm -fr SVN-Mirror SWIG-1.3.24 perl-5.8.6 stable.tar.gz \
subversion-1.2.1* svk swig-1.3.24.tar.gz
svk help commands
# Done!
SVK, once installed, has a number of different pieces. The following is a quick overview of what you get. Don't be alarmed if the brief descriptions leave you scratching your head—there are plenty more pages in this book devoted to alleviating that confusion.
The command-line client program.
A tool for creating, tweaking or repairing a Subversion repository. It is technically part of Subversion, but svk needs it to be installed in order to create depots.
Assuming you have SVK installed correctly, you should be ready to start. The next two chapters will walk you through the use of svk, SVK's command-line client program.
Some people have trouble absorbing a new technology by reading the sort of “top down” approach provided by this book. This section is a very short introduction to SVK, and is designed to give “bottom up” learners a fighting chance. If you're one of those folks who prefers to learn by experimentation, the following demonstration will get you up and running. Along the way, we give links to the relevant chapters of this book.
If you're new to the entire concept of version control or to the “copy-modify-merge” model used by CVS, Subversion and SVK, then you should read Chapter 2, Basic Concepts before going any further.
The following example assumes that you have svk, the SVK command-line client, and svnadmin, the administrative tool which is part of Subversion, ready to go. It also assumes you are using Subversion 1.2 or later (run svnadmin --version to check) and SVK 1.00 or later (run svk --version to check).
SVK stores all versioned data in a depot. To begin, create
the default depot (make sure to answer
y<return> to the question svk asks
you):
$ svk depotmap --init
Repository /Users/sally/.svk/local does not exist, create? (y/n)y
$ ls ~/.svk
cache config local
This command creates a new directory
~/.svk[1] which contains SVK's administrative
files, and a Subversion repository called local.
SVK has no concept of a “project”. The depot is just a virtual versioned filesystem, a large tree that can hold anything you wish. Some people prefer to store only one project in a depot, and others prefer to store multiple projects in a depot by placing them into separate directories. The merits of each approach are discussed in the section called “”. Either way, the depot only manages files and directories, so it's up to humans to interpret particular directories as “projects”.
In this example, we assume that you already have some sort
of project (a collection of files and directories) that you wish
to import into your newly created SVK depot. Begin
by organizing them into a single directory
called myproject (or whatever you wish).
For reasons that will be clear later on (see
Chapter 4, Branching and Merging), your project's tree
structure should contain three top-level directories
named branches,
tags, and
trunk. The trunk
directory should contain all of your data,
while branches
and tags directories are empty:
/tmp/myproject/branches/
/tmp/myproject/tags/
/tmp/myproject/trunk/
                     foo.c
                     bar.c
                     Makefile
                     …
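One way to prepare such a layout from the shell is sketched here. The scratch directory created by mktemp is an assumption, chosen so that the sketch does not touch an existing /tmp/myproject; the file names simply mirror the listing above and are created empty as placeholders for real project data.

```shell
#!/bin/sh
# Build the recommended layout in a scratch directory (assumed location).
project=$(mktemp -d)/myproject

# Three top-level directories: branches and tags start out empty.
mkdir -p "$project/branches" "$project/tags" "$project/trunk"

# Placeholder files standing in for your real data; all of it goes in trunk.
touch "$project/trunk/foo.c" "$project/trunk/bar.c" "$project/trunk/Makefile"

# Show the resulting tree.
find "$project" | sort
```

With real data you would copy your existing files into trunk instead of creating empty ones.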
Once you have your tree of data ready to go, import it into the repository with the svk import command (see the section called “svk import”):
$ svk import --message "initial import" /tmp/myproject //myproject
Committed revision 1.
Import path //myproject initialized.
Committed revision 2.
Directory /tmp/myproject imported to depotpath //myproject as revision 2.
Now the depot contains this tree of data. As mentioned
earlier, you won't see your files by directly peeking into the
depot; they're all stored within a database. But the
depot's imaginary filesystem now contains a top-level
directory named myproject, which in turn
contains your data.
Note that the original /tmp/myproject
directory is unchanged; SVK is unaware of it. (In fact,
you can even delete that directory if you wish.) In order to
start manipulating repository data, you need to create a new
“working copy” of the data, a sort of private
workspace. Ask SVK to “check out” a working
copy of the myproject/trunk directory in
the depot:
$ svk checkout //myproject/trunk myproject
Syncing //myproject/trunk(/myproject/trunk) in /Users/sally/myproject to 2.
A   myproject/foo.c
A   myproject/bar.c
A   myproject/Makefile
…
Now you have an editable copy of part of the depot in a
new directory named myproject. You can edit
the files in your working copy and then commit those changes
back into the depot.
Enter your working copy and edit a file's contents.
Run svk diff to see unified diff output of your changes.
Run svk commit to commit the new version of your file to the depot.
Run svk update to bring your working copy “up-to-date” with the depot.
For a full tour of all the things you can do with your working copy, read Chapter 3, Guided Tour.
At this point, you have the option of making your depot available to others over a network. See Chapter 6, Server Configuration to learn about the different sorts of server processes available and how to configure them.
This chapter is a short, casual introduction to SVK. If you're new to version control, this chapter is definitely for you. We begin with a discussion of general version control concepts, work our way into the specific ideas behind SVK, and show some simple examples of SVK in use.
Even though the examples in this chapter show people sharing collections of program source code, keep in mind that SVK can manage any sort of file collection—it's not limited to helping computer programmers.
SVK is a system for tracking history. At its core is a depot[2], which is a central store of data. The depot stores information in the form of a filesystem tree—a typical hierarchy of files and directories.
So why is this interesting? So far, this sounds like the definition of a typical file system. And indeed, the depot is a kind of file system, but it's not your usual breed. What makes the SVK depot special is that it remembers every change ever written to it: every change to every file, and even changes to the directory tree itself, such as the addition, deletion, and rearrangement of files and directories.
When SVK reads data from the depot, it normally sees only the latest version of the filesystem tree. But it also has the ability to view previous states of the filesystem. For example, you can ask SVK historical questions like, “What did this directory contain last Wednesday?” or “Who was the last person to change this file, and what changes did they make?” These are the sorts of questions that are at the heart of any version control system: systems that are designed to record and track changes to data over time.
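The idea of a store that keeps every earlier state can be illustrated with a toy model in plain shell. This is only a sketch for intuition and is not how SVK stores data (SVK relies on the Subversion filesystem): the depot and work directories, the commit function, and the r1, r2, r3 snapshot names are all hypothetical, and each "revision" is simply a full copy of the tree.

```shell
#!/bin/sh
# Toy model only: every commit saves a full snapshot of the working tree.
depot=$(mktemp -d)   # hypothetical stand-in for a depot
work=$(mktemp -d)    # hypothetical stand-in for a working copy
rev=0

commit() {
    rev=$((rev + 1))
    mkdir "$depot/r$rev"
    cp -R "$work/." "$depot/r$rev/"
}

echo "first draft" > "$work/notes.txt"
commit                      # snapshot r1

echo "second draft" > "$work/notes.txt"
touch "$work/todo.txt"
commit                      # snapshot r2: a file changed, a file was added

rm "$work/todo.txt"
commit                      # snapshot r3: a file was deleted

# The latest state is r3, but earlier states remain readable.
echo "notes.txt as of r1: $(cat "$depot/r1/notes.txt")"
echo "files in r2:"
ls "$depot/r2"
```

A real version control system records who made each change and when, and stores differences rather than full copies, but the principle of being able to look back at any earlier tree is the same.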
The core mission of a version control system is to enable collaborative editing and sharing of data. But different systems use different strategies to achieve this.
All version control systems have to solve the same fundamental problem: how will the system allow users to share information, but prevent them from accidentally stepping on each other's feet? It's all too easy for users to accidentally overwrite each other's changes in the depot.
Consider the scenario shown in Figure 2.1, “The problem to avoid”. Suppose we have two co-workers, Harry and Sally. They each decide to edit the same repository file at the same time. If Harry saves his changes to the repository first, then it's possible that (a few moments later) Sally could accidentally overwrite them with her own new version of the file. While Harry's version of the file won't be lost forever (because the system remembers every change), any changes Harry made won't be present in Sally's newer version of the file, because she never saw Harry's changes to begin with. Harry's work is still effectively lost—or at least missing from the latest version of the file—and probably by accident. This is definitely a situation we want to avoid!
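The scenario can be reproduced with nothing more than file copies. In this sketch the "repository" is just a shared directory with no version control at all, so unlike a real depot it does not even keep Harry's version in its history; the directory names and file contents are invented for illustration.

```shell
#!/bin/sh
# Toy model: a shared file that everyone overwrites, with no merging.
top=$(mktemp -d)
mkdir "$top/repo" "$top/harry" "$top/sally"
printf 'line one\nline two\n' > "$top/repo/A"

# Both co-workers copy the same starting version of file A.
cp "$top/repo/A" "$top/harry/A"
cp "$top/repo/A" "$top/sally/A"

# Each edits a private copy.
echo "Harry's change" >> "$top/harry/A"
echo "Sally's change" >> "$top/sally/A"

# Harry saves first; a few moments later Sally saves over him.
cp "$top/harry/A" "$top/repo/A"
cp "$top/sally/A" "$top/repo/A"

# Check whether Harry's work survived in the latest version.
if grep -q "Harry's change" "$top/repo/A"; then
    result="kept"
else
    result="lost"
fi
echo "Harry's change in the latest version: $result"
```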
Many version control systems use a lock-modify-unlock model to address this problem. In such a system, the repository allows only one person to change a file at a time. First Harry must “lock” the file before he can begin making changes to it. Locking a file is a lot like borrowing a book from the library; if Harry has locked a file, then Sally cannot make any changes to it. If she tries to lock the file, the repository will deny the request. All she can do is read the file, and wait for Harry to finish his changes and release his lock. After Harry unlocks the file, his turn is over, and now Sally can take her turn by locking and editing. Figure 2.2, “The lock-modify-unlock solution” demonstrates this simple solution.
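A minimal sketch of the locking idea, assuming a lock is represented by a directory next to the file (mkdir either creates it or fails, so only one caller can hold it). The lock and unlock helpers are hypothetical and do not correspond to any svk command.

```shell
#!/bin/sh
# Toy lock-modify-unlock: the lock is a directory created with mkdir.
repo=$(mktemp -d)
echo "original text" > "$repo/A"

lock()   { mkdir "$repo/A.lock" 2>/dev/null; }   # fails if already locked
unlock() { rmdir "$repo/A.lock"; }

# Harry locks the file first.
if lock; then harry="granted"; else harry="denied"; fi

# Sally tries while Harry still holds the lock: the request is denied.
if lock; then sally="granted"; else sally="denied"; fi

# Harry makes his change and releases the lock.
echo "Harry's edit" >> "$repo/A"
unlock

# Now Sally can take her turn.
if lock; then sally_retry="granted"; else sally_retry="denied"; fi
echo "Sally's edit" >> "$repo/A"
unlock

echo "Harry: $harry, Sally: $sally, Sally after unlock: $sally_retry"
```

Note that Sally's edits are strictly serialized behind Harry's, even though the two changes do not touch the same lines.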
The problem with the lock-modify-unlock model is that it's a bit restrictive, and often becomes a roadblock for users:
Locking may cause administrative problems. Sometimes Harry will lock a file and then forget about it. Meanwhile, because Sally is still waiting to edit the file, her hands are tied. And then Harry goes on vacation. Now Sally has to get an administrator to release Harry's lock. The situation ends up causing a lot of unnecessary delay and wasted time.
Locking may cause unnecessary serialization. What if Harry is editing the beginning of a text file, and Sally simply wants to edit the end of the same file? These changes don't overlap at all. They could easily edit the file simultaneously, and no great harm would come, assuming the changes were properly merged together. There's no need for them to take turns in this situation.
Locking may create a false sense of security. Pretend that Harry locks and edits file A, while Sally simultaneously locks and edits file B. But suppose that A and B depend on one another, and the changes made to each are semantically incompatible. Suddenly A and B don't work together anymore. The locking system was powerless to prevent the problem—yet it somehow provided a false sense of security. It's easy for Harry and Sally to imagine that by locking files, each is beginning a safe, insulated task, and thus not bother discussing their incompatible changes early on.
SVK, Subversion, CVS, and other version control systems use a copy-modify-merge model as an alternative to locking. In this model, each user's client contacts the project repository and creates a personal working copy—a local reflection of the repository's files and directories. Users then work in parallel, modifying their private copies. Finally, the private copies are merged together into a new, final version. The version control system often assists with the merging, but ultimately a human being is responsible for making it happen correctly.
Here's an example. Say that Harry and Sally each create working copies of the same project, copied from the repository. They work concurrently, and make changes to the same file A within their copies. Sally saves her changes to the repository first. When Harry attempts to save his changes later, the repository informs him that his file A is out-of-date. In other words, file A in the repository has changed since he last copied it. So Harry asks his client to merge any new changes from the repository into his working copy of file A. Chances are that Sally's changes don't overlap with his own; so once he has both sets of changes integrated, he saves his working copy back to the repository. Figure 2.3, “The copy-modify-merge solution” and Figure 2.4, “The copy-modify-merge solution (continued)” show this process.
But what if Sally's changes do overlap with Harry's changes? What then? This situation is called a conflict, and it's usually not much of a problem. When Harry asks his client to merge the latest repository changes into his working copy, his copy of file A is somehow flagged as being in a state of conflict: he'll be able to see both sets of conflicting changes, and manually choose between them. Note that software can't automatically resolve conflicts; only humans are capable of understanding and making the necessary intelligent choices. Once Harry has manually resolved the overlapping changes—perhaps after a discussion with Sally—he can safely save the merged file back to the repository.
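The merge step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how SVK or Subversion actually merge files: the function name merge3 is made up for this example, and it assumes that neither edit adds or removes lines, so the three versions can be compared position by position.

```python
def merge3(base, mine, theirs):
    """Merge two edited copies of `base`, line by line.

    Simplifying assumption: neither edit adds or removes lines.
    Returns (merged_lines, conflicts), where `conflicts` lists the
    indices of lines that both sides changed in different ways.
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles edits that keep the line count")
    merged, conflicts = [], []
    for i, (b, m, t) in enumerate(zip(base, mine, theirs)):
        if m == t:        # both sides agree, or neither touched the line
            merged.append(m)
        elif m == b:      # only the other side changed this line
            merged.append(t)
        elif t == b:      # only my side changed this line
            merged.append(m)
        else:             # both changed it differently: a human must decide
            merged.append(m)      # keep my line as a placeholder
            conflicts.append(i)   # and flag it for manual resolution
    return merged, conflicts


base = ["int a = 1;", "int b = 2;", "int c = 3;"]
harry = ["int a = 10;", "int b = 2;", "int c = 3;"]  # Harry edits line 0
sally = ["int a = 1;", "int b = 2;", "int c = 30;"]  # Sally edits line 2

merged, conflicts = merge3(base, harry, sally)
print(merged, conflicts)
```

Because Harry and Sally changed different lines, the merge keeps both edits and reports no conflicts. Had Sally also edited line 0, that index would appear in the conflict list and Harry would have to choose between the two versions himself.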
The copy-modify-merge model may sound a bit chaotic, but in practice, it runs extremely smoothly. Users can work in parallel, never waiting for one another. When they work on the same files, it turns out that most of their concurrent changes don't overlap at all; conflicts are infrequent. And the amount of time it takes to resolve conflicts is far less than the time lost by a locking system.
In the end, it all comes down to one critical factor: user communication. When users communicate poorly, both syntactic and semantic conflicts increase. No system can force users to communicate perfectly, and no system can detect semantic conflicts. So there's no point in being lulled into a false promise that a locking system will somehow prevent conflicts; in practice, locking seems to inhibit productivity more than anything else.
It's time to move from the abstract to the concrete. In this section, we'll show real examples of SVK being used.
You've already read about working copies; now we'll demonstrate how the SVK client creates and uses them.
An SVK working copy is an ordinary directory tree on your local system, containing a collection of files. You can edit these files however you wish, and if they're source code files, you can compile your program from them in the usual way. Your working copy is your own private work area: SVK will never incorporate changes from the depot, nor publish your own changes to the depot, until you explicitly tell it to do so.
After you've made some changes to the files in your working copy and verified that they work properly, SVK provides you with commands to “publish” your changes to the depot. If the depot contains changes not yet in your working copy[3], SVK provides you with commands to merge those changes into your working directory (by reading from the depot).
An SVK working copy doesn't contain any extra files, unlike
Subversion and CVS working copies. Instead SVK keeps track of
the state of your working copy in a subdirectory of your home
directory named .svk.
A typical SVK depot often holds the files (or source code) for several projects; usually, each project is a subdirectory in the depot's filesystem tree. In this arrangement, a user's working copy will usually correspond to a particular subtree of the depot.
For example, suppose you have two software projects,
paint and calc, for
which you wish to start keeping a history.
To add these 2 projects to your depot you would run the following commands:
$ svk import --message "Initial import of calc." calc //calc
Repository /Users/sally/.svk/local does not exist, create? (y/n)y
Committed revision 1.
Import path //calc initialized.
Committed revision 2.
Directory /Users/sally/calc imported to depotpath //calc as revision 2.
Then repeat the same steps for the paint project. Now you have 2 projects in the depot. Each project lives in its own top-level subdirectory, as shown in Figure 2.5, “The depot's filesystem”.
Since the calc and paint projects have been imported into the depot, it's safe to remove the directories we imported from, so let's run:
$ rm -rf calc
$ rm -rf paint
Now in order to get a working copy, you must
check out some subtree of the
depot[4].
(The term “check out” may sound like it has
something to do with locking or reserving resources, but it
doesn't; it simply creates a private copy of the project for
you.) For example, if you check out
/calc, you will get a working copy like
this:
$ svk checkout //calc
Syncing //calc(/calc) in /Users/sally/calc to 2.
A calc/button.c
A calc/Makefile
A calc/integer.c
$ ls -A calc
Makefile button.c integer.c
The list of letter A's indicates that SVK is adding
a number of items to your working copy. You now have an
editable copy of the depot's /calc
directory.
Suppose you make changes to button.c.
Since SVK knows the revision that the file in your working
copy was based on, SVK can tell that you've changed the file.
However, SVK does not make your changes public until you
explicitly tell it to. The act of publishing your changes is
more commonly known as committing (or
checking in) changes to the
depot.
To publish your changes to others, you can use SVK's commit command:
$ svk commit button.c
Committed revision 57.
Now your changes to button.c have
been committed to the depot; if another user checks out a
working copy of /calc, they will see
your changes in the latest version of the file.
Suppose you have a collaborator, Sally, who checked out a
working copy of /calc at the same time
you did. When you commit your change to
button.c, Sally's working copy is left
unchanged; SVK only modifies working copies at the
user's request.
To bring her project up to date, Sally can ask SVK to update her working copy, by using the SVK update command. This will incorporate your changes into her working copy, as well as any others that have been committed since she checked it out.
$ pwd
/Users/sally/calc
$ ls -A
Makefile integer.c button.c
$ svk update
U button.c
The output from the svk update command
indicates that SVK updated the contents of
button.c. Note that Sally didn't need to
specify which files to update; SVK uses the information
about the working copy stored inside
~/.svk/config, and further information from
the depot, to decide which files need to be brought up to
date.
An svk commit operation can publish changes to any number of files and directories as a single atomic transaction. In your working copy, you can change files' contents, create, delete, rename and copy files and directories, and then commit the complete set of changes as a unit.
In the depot, each commit is treated as an atomic transaction: either all the commit's changes take place, or none of them take place. SVK tries to retain this atomicity in the face of program crashes, system crashes, network problems, and other users' actions.
Each time the depot accepts a commit, this creates a new state of the filesystem tree, called a revision. Each revision is assigned a unique natural number, one greater than the number of the previous revision. The initial revision of a freshly created depot is numbered zero, and consists of nothing but an empty root directory.
Figure 2.6, “The depot” illustrates a nice way to visualize the depot. Imagine an array of revision numbers, starting at 0, stretching from left to right. Each revision number has a filesystem tree hanging below it, and each tree is a “snapshot” of the way the depot looked after a commit.
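The array of revision trees, and the all-or-nothing nature of a commit, can be modelled with a short Python sketch. The Depot class below is purely illustrative and is not SVK's storage format: each revision is a dictionary mapping a path to its content, and a commit builds the new tree on a private copy so that a failure part-way through leaves the existing revisions untouched. The file names and contents are made up.

```python
class Depot:
    """Toy depot: a list of snapshots, one per revision."""

    def __init__(self):
        # Revision 0 is nothing but an empty root directory.
        self.revisions = [{}]

    @property
    def head(self):
        return len(self.revisions) - 1

    def commit(self, changes):
        """Apply `changes` (path -> new content, or None to delete) as a unit."""
        tree = dict(self.revisions[-1])   # work on a private copy
        for path, content in changes.items():
            if content is None:
                del tree[path]            # KeyError if the path does not exist
            else:
                tree[path] = content
        # Only now does the new snapshot become visible as a revision.
        self.revisions.append(tree)
        return self.head


depot = Depot()
r1 = depot.commit({"calc/Makefile": "all:\n", "calc/button.c": "/* v1 */\n"})
r2 = depot.commit({"calc/button.c": "/* v2 */\n"})
try:
    # The second change is invalid, so the whole commit is discarded.
    depot.commit({"calc/integer.c": "/* new */\n", "calc/missing.c": None})
except KeyError:
    pass
print(depot.head)
```

The failed third commit creates no revision, so the depot stays at revision 2, and every earlier snapshot remains available at its own index in the list.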
It's important to note that working copies do not always correspond to any single revision in the depot; they may contain files from several different revisions. For example, suppose you check out a working copy from a depot whose most recent revision is 4:
calc/Makefile:4
     integer.c:4
     button.c:4
At the moment, this working directory corresponds exactly
to revision 4 in the depot. However, suppose you make a
change to button.c, and commit that
change. Assuming no other commits have taken place, your
commit will create revision 5 of the depot, and your
working copy will now look like this:
calc/Makefile:4
     integer.c:4
     button.c:5
Suppose that, at this point, Sally commits a change to
integer.c, creating revision 6. If you
use svk update to bring your working copy
up to date, then it will look like this:
calc/Makefile:6
     integer.c:6
     button.c:6
Sally's change to integer.c will
appear in your working copy, and your change will still be
present in button.c. In this example,
the text of Makefile is identical in
revisions 4, 5, and 6, but SVK will mark your working
copy of Makefile with revision 6 to
indicate that it is still current. So, after you do a clean
update at the top of your working copy, it will generally
correspond to exactly one revision in the depot.
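The bookkeeping in this example can be sketched as a small Python model. It only reproduces the behaviour described in the text (a commit bumps the committed files, an update bumps everything); the WorkingCopy class and its methods are invented for illustration and say nothing about how SVK stores this information.

```python
class WorkingCopy:
    """Toy working copy that tracks one revision number per file."""

    def __init__(self, files, revision):
        # Every file starts at the revision that was checked out.
        self.revs = {name: revision for name in files}

    def commit(self, names, new_revision):
        # A commit only bumps the files that were actually committed.
        for name in names:
            self.revs[name] = new_revision

    def update(self, head):
        # An update brings every file to the depot's latest revision.
        for name in self.revs:
            self.revs[name] = head

    def is_mixed(self):
        return len(set(self.revs.values())) > 1


wc = WorkingCopy(["Makefile", "integer.c", "button.c"], 4)
wc.commit(["button.c"], 5)          # your commit creates revision 5
after_commit = dict(wc.revs)
mixed_after_commit = wc.is_mixed()
wc.update(6)                        # Sally's commit created revision 6
print(after_commit, mixed_after_commit, wc.revs, wc.is_mixed())
```

After the commit, the working copy holds files at revisions 4 and 5; after the update, every file is at revision 6 and the working copy corresponds to a single revision again.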
For each working copy, SVK records
three essential pieces of information in the
hash: subsection of the
checkout: section in the
~/.svk/config file:
An absolute path to the working copy, or a subdirectory thereof, and
what revision your working copy is based on (this is called the working copy's working revision), and
the encoding in which the filenames are stored by the filesystem.
Given this information, by consulting the depot, SVK can tell which of the following four states a working file is in:
The file is unchanged in the working directory, and no changes to that file have been committed to the depot since its working revision. An svk commit of the file will do nothing, and an svk update of the file will do nothing.
The file has been changed in the working directory, and no changes to that file have been committed to the depot since its working revision. There are local changes that have not been committed to the depot, thus an svk commit of the file will succeed in publishing your changes, and an svk update of the file will do nothing.
The file has not been changed in the working directory, but it has been changed in the depot. The file should eventually be updated, to make it current with the depot revision. An svk commit of the file will do nothing, and an svk update of the file will fold the latest changes into your working copy.
The file has been changed both in the working directory, and in the depot. An svk commit of the file will fail with an “out-of-date” error. The file should be updated first; an svk update command will attempt to merge the public changes with the local changes. If SVK can't complete the merge in a plausible way automatically, it will ask the user how to resolve the conflict.
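The four states above amount to a two-by-two table: has the file changed locally, and has it changed in the depot? The following Python sketch spells that table out. The function name and the wording of the returned strings are made up for this example; it summarizes the list above rather than anything SVK exposes.

```python
def file_state(locally_modified, changed_in_depot):
    """Return (state, effect of commit, effect of update) for one working file."""
    if not locally_modified and not changed_in_depot:
        return ("unchanged, current", "does nothing", "does nothing")
    if locally_modified and not changed_in_depot:
        return ("locally changed, current", "publishes your changes", "does nothing")
    if not locally_modified and changed_in_depot:
        return ("unchanged, out-of-date", "does nothing", "folds in the latest changes")
    return ("locally changed, out-of-date", "fails as out-of-date", "attempts a merge")


# Print the whole table, one row per combination.
for local in (False, True):
    for remote in (False, True):
        state, commit, update = file_state(local, remote)
        print(f"{state:30} commit: {commit:24} update: {update}")
```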
This may sound like a lot to keep track of, but the svk status command will show you the state of any item in your working copy. For more information on that command, see the section called “svk status”.
As a general principle, Subversion tries to be as flexible as possible. One special kind of flexibility is the ability to have a working copy containing files and directories with a mix of different working revision numbers. Unfortunately, this flexibility tends to confuse a number of new users. If the earlier example showing mixed revisions perplexed you, here's a primer on both why the feature exists and how to make use of it.
One of the fundamental rules of Subversion is that a “push” action does not cause a “pull”, nor the other way around. Just because you're ready to submit new changes to the repository doesn't mean you're ready to receive changes from other people. And if you have new changes still in progress, then svn update should gracefully merge repository changes into your own, rather than forcing you to publish them.
The main side-effect of this rule is that it means a working copy has to do extra bookkeeping to track mixed revisions, and be tolerant of the mixture as well. It's made more complicated by the fact that directories themselves are versioned.
For example, suppose you have a working copy entirely at
revision 10. You edit the
file foo.html and then perform
an svn commit, which creates revision 15
in the repository. After the commit succeeds, many new
users would expect the working copy to be entirely at
revision 15, but that's not the case! Any number of changes
might have happened in the repository between revisions 10
and 15. It would be a lie to claim that we had a working
copy of revision 15; the client simply doesn't know. If, on
the other hand, svn commit were to
automatically download the newest changes, then it would be
possible to set the entire working copy to revision
15—but then we'd be breaking the fundamental rule
of “push” and “pull” remaining
separate actions. Therefore the only safe thing the
Subversion client can do is mark the one
file—foo.html—as being at
revision 15. The rest of the working copy remains at
revision 10. Only by running svn update
can the latest changes be downloaded, and the whole working
copy be marked as revision 15.
The fact is, every time you run svn commit, your working copy ends up with some mixture of revisions. The things you just committed are marked as having larger working revisions than everything else. After several commits (with no updates in between) your working copy will contain a whole mixture of revisions. Even if you're the only person using the repository, you will still see this phenomenon. To examine your mixture of working revisions, use the svn status --verbose command (see the section called “svk status” for more information).
Often, new users are completely unaware that their working copy contains mixed revisions. This can be confusing, because many client commands are sensitive to the working revision of the item they're examining. For example, the svn log command is used to display the history of changes to a file or directory (see the section called “svk log”). When the user invokes this command on a working copy object, they expect to see the entire history of the object. But if the object's working revision is quite old (often because svn update hasn't been run in a long time), then the history of the older version of the object is shown.
If your project is sufficiently complex, you'll discover that it's sometimes nice to forcibly “backdate” portions of your working copy to an earlier revision; you'll learn how to do that in Chapter 3, Guided Tour. Perhaps you'd like to test an earlier version of a sub-module contained in a subdirectory, or perhaps you'd like to figure out when a bug first came into existence in a specific file. This is the “time machine” aspect of a version control system — the feature which allows you to move any portion of your working copy forward and backward in history.
However you make use of mixed revisions in your working copy, there are limitations to this flexibility.
First, you cannot commit the deletion of a file or directory which isn't fully up-to-date. If a newer version of the item exists in the repository, your attempt to delete will be rejected, to prevent you from accidentally destroying changes you've not yet seen.
Second, you cannot commit a metadata change to a directory unless it's fully up-to-date. You'll learn about attaching “properties” to items in Chapter 7, Advanced Topics. A directory's working revision defines a specific set of entries and properties, and thus committing a property change to an out-of-date directory may destroy properties you've not yet seen.
We've covered a number of fundamental SVK concepts in this chapter:
We've introduced the notions of the central repository, the client working copy, and the array of repository revision trees.
We've seen some simple examples of how two collaborators can use SVK to publish and receive changes from one another, using the “copy-modify-merge” model.
We've talked a bit about the way SVK tracks and manages information in a working copy.
At this point, you should have a good idea of how SVK works in the most general sense. Armed with this knowledge, you should now be ready to jump into the next chapter, which is a detailed tour of SVK's commands and features.
[2] Technically you can have more than one depot, but we'll talk about that later.
[3] More on how this can happen in Chapter 3.
[4] Actually we could have caused the
directory from which we imported to become a working copy
simply by adding the --to-checkout switch to
the import command.
[5] If your home directory is on a networked filesystem this is not strictly true, and it is in fact possible to share a depot amongst different users by having your depot on a shared volume. However in chapter 3 we will show better ways of working together on the same projects.
[6] Yes you can have more than one depot, but don't worry about it for now. Chances are you won't ever need more than one.
Now we will go into the details of using SVK. By the time you reach the end of this chapter, you will be able to perform almost all the tasks you need to use SVK in a normal day's work. You'll start with an initial checkout of your code, and walk through making changes and examining those changes. You'll also see how to bring changes made by others into your working copy, examine them, and work through any conflicts that might arise.
Note that this chapter is not meant to be an exhaustive list of all SVK's commands—rather, it's a conversational introduction to the most common SVK tasks you'll encounter. This chapter assumes that you've read and understood Chapter 2, Basic Concepts and are familiar with the general model of SVK. For a complete reference of all commands, see Chapter 9, SVK Complete Reference.
Before reading on, here is the most important command you'll ever need when using SVK: svk help. The SVK command-line client is self-documenting—at any time, a quick svk help <subcommand> will describe the syntax, switches, and behavior of the subcommand.
You use svk import to import a new project into an SVK depot. While this is most likely the very first thing you will do when you set up your depot, it's not something that happens very often. For a detailed description of import, see the section called “svk import” later in this chapter.
You use svk mirror to mirror a remote Subversion repository into an SVK depot[7]. SVK will then allow you to check out working copies from your local mirror as well as commit changes back to the mirrored repository. Basically, this makes SVK a Subversion client with a local cache, which, as it turns out, makes SVK orders of magnitude faster than the basic Subversion client. In addition, this opens up the possibility of creating local branches and tracking changes between those and the mirror in both directions. For more information on working with local branches, see Chapter 4, Branching and Merging.
Before we go on, you should know a bit about how to identify a particular revision in your depot. As you learned in the section called “Revisions”, a revision is a “snapshot” of the depot at a particular moment in time. As you continue to commit and grow your depot, you need a mechanism for identifying these snapshots.
You specify these revisions by using the
--revision (-r) switch plus
the revision you want (svk <subcommand>
--revision REV) or you can specify a range by
separating two revisions with a colon (svk
<subcommand> --revision REV1:REV2). And SVK
lets you refer to these revisions by number, keyword, or
date.
When you create a new SVK depot, it begins its life at revision zero and each successive commit increases the revision number by one. After your commit completes, the SVK client informs you of the new revision number:
$ svk commit --message "Corrected number of cheese slices."
Committed revision 3.
If at any point in the future you want to refer to that revision (we'll see how and why we might want to do that later in this chapter), you can refer to it as “3”.
If, however, you were committing from a working copy that was a direct checkout of a mirrored depotpath, things would be a little more complicated. This is because the mirrored repository has its own idea of revision numbers, which is distinct from the local depot's idea of revision numbers.
$ svk commit --message "Corrected number of cheese slices."
Commit into mirrored path: merging back directly.
Merging back to mirror source http://svn.sally.org/calc.
Merge back committed as revision 45.
Syncing http://svn.sally.org/calc
Retrieving log information from 45 to 45
Committed revision 15 from revision 45.
What happened here is that SVK first committed the change you made to the original mirrored repository. After that, it automatically performed an svk sync to bring that change back into the local depot. This ensures that you can never commit anything to a mirrored path that isn't also in the mirrored repository itself.
Above we showed you how to use the
--revision switch to refer to a specific
revision number. By default the revision numbers you specify
are the ones in your depot. Sometimes you want to refer to a
particular revision of the mirrored repository instead.
Luckily, SVK keeps a mapping from local to remote revision
numbers and allows you to specify either local depot revision
numbers or revision numbers in the mirrored repository when
performing operations. To refer to a revision of the mirrored
repository, you only need to add an @ right after the revision number. To get
the log message for the revision we just committed above you
would use:
$ svk log --revision 45@
----------------------------------------------------------------------
r15 (orig r45): sally | 2005-07-23 14:49:11 -0700
Corrected number of cheese slices.
----------------------------------------------------------------------
Notice how the log output shows both the local depot's
revision number, r15, and the mirrored repository's
revision number, r45. This can be a useful aid when you need to
relate a local revision to its counterpart in the mirrored repository.
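The relationship between the two numbering schemes can be sketched in Python. This is a toy model under stated assumptions, not SVK's implementation: the Mirror class, its method names and the starting revision numbers (14 locally, 44 remotely) are invented so that the result matches the transcript above.

```python
class Mirror:
    """Toy mirror that remembers which local revision holds each remote one."""

    def __init__(self, local_head, remote_head):
        self.local_head = local_head
        self.remote_head = remote_head
        self.local_for_remote = {}   # remote revision -> local revision

    def commit_through_mirror(self):
        # Step 1: the change is committed to the mirrored repository first.
        self.remote_head += 1
        # Step 2: a sync pulls that remote revision into the local depot.
        self.local_head += 1
        self.local_for_remote[self.remote_head] = self.local_head
        return self.local_head, self.remote_head

    def resolve(self, spec):
        """Turn a spec such as '15' or '45@' into a local revision number."""
        if spec.endswith("@"):
            return self.local_for_remote[int(spec[:-1])]
        return int(spec)


mirror = Mirror(local_head=14, remote_head=44)
local, remote = mirror.commit_through_mirror()
print(f"Committed revision {local} from revision {remote}.")
print(mirror.resolve("45@"), mirror.resolve("15"))
```

Both specs resolve to the same local revision, which is why the log output can show r15 and r45 side by side.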
The SVK client understands a number of
revision keywords. These keywords
can be used instead of integer arguments to the
--revision switch, and are resolved into
specific revision numbers by SVK:
For every file and directory in your working copy SVK keeps track of the revision you last updated to. You can refer to this as the “BASE” revision.
HEAD: The latest revision in the depot.
BASE: The last revision an item in a working copy was updated to.
BASE can be used to refer to local
paths, but not to DEPOTPATHs.
Here are some examples of revision keywords in action. Don't worry if the commands don't make sense yet; we'll be explaining these commands as we go through the chapter:
$ svk diff --revision BASE:HEAD foo.c
# shows the changes in the depot not yet in your working copy.
$ svk log --revision HEAD
# shows log message for the latest depot commit
$ svk diff --revision HEAD
# compares your working file (with local mods) to the latest version
# in the depot.
$ svk diff --revision BASE:HEAD foo.c
# compares your “pristine” foo.c (no local mods) with the
# latest version in the depot
$ svk log --revision BASE:HEAD
# shows all commit logs since you last updated
These keywords allow you to perform many common (and helpful) operations without having to look up specific revision numbers or remember the exact revision of your working copy.
Anywhere that you specify a revision number or revision keyword, you can also specify a date inside curly braces “{}”. You can even access a range of changes in the depot using both dates and revisions together!
Here are examples of the date formats that SVK accepts. Remember to use quotes around any date that contains spaces.
$ svk checkout --revision {2002-02-17}
$ svk checkout --revision {15:30}
$ svk checkout --revision {15:30:00.200000}
$ svk checkout --revision {"2002-02-17 15:30"}
$ svk checkout --revision {"2002-02-17 15:30 +0230"}
$ svk checkout --revision {2002-02-17T15:30}
$ svk checkout --revision {2002-02-17T15:30Z}
$ svk checkout --revision {2002-02-17T15:30-04:00}
$ svk checkout --revision {20020217T1530}
$ svk checkout --revision {20020217T1530Z}
$ svk checkout --revision {20020217T1530-0500}
…
When you specify a date as a revision, SVK finds the most recent revision of the depot as of that date:
$ svk log --revision {2005-07-23}
----------------------------------------------------------------------
r12 (orig r41): sally | 2005-07-22 10:06:17 -0700
…
You can also use a range of dates. SVK will find all revisions between both dates, inclusive:
$ svk log --revision {2005-07-20}:{2005-07-29}
…
As we pointed out, you can also mix dates and revisions:
$ svk log --revision {2005-07-20}:4040
Users should be aware of a subtlety that can become quite a stumbling-block when dealing with dates in SVK. Since the timestamp of a revision is stored as a property of the revision—an unversioned, modifiable property—revision timestamps can be changed to represent complete falsifications of true chronology, or even removed altogether. This will wreak havoc on the internal date-to-revision conversion that SVK performs.
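To see why altered timestamps cause trouble, it helps to sketch how a date might be turned into a revision. The Python below is an assumption about the general technique (a binary search over commit times), not a description of SVK's internals, and all timestamps in it are made up. The search is only valid while the timestamps increase with the revision number, which is exactly the property that editing them can break.

```python
from bisect import bisect_right
from datetime import datetime


def revision_as_of(timestamps, when):
    """Return the newest revision committed at or before `when`.

    `timestamps[i]` is the commit time of revision i. The binary search
    assumes the list is sorted, i.e. that nobody has edited the
    revision timestamps out of order.
    """
    index = bisect_right(timestamps, when) - 1
    if index < 0:
        raise LookupError("no revision exists that early")
    return index


timestamps = [
    datetime(2005, 7, 20, 9, 0),    # revision 0
    datetime(2005, 7, 21, 16, 30),  # revision 1
    datetime(2005, 7, 22, 10, 6),   # revision 2
    datetime(2005, 7, 24, 8, 15),   # revision 3
]

# A bare date means midnight at the start of that day.
print(revision_as_of(timestamps, datetime(2005, 7, 23)))
```

Here the date 2005-07-23 resolves to revision 2, which was committed on the previous day, just as in the svk log example earlier in this section.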
Most of the time, you will want to start using SVK by creating a mirror of a remote Subversion repository containing your project. Mirroring a repository creates a copy of it on your local machine. This copy contains as much of the original history of the Subversion repository as you want. For this example let's mirror everything:
$ svk mirror svn://svn.clkao.org/svk //mirror/svk
Committed revision 1.
$ svk sync //mirror/svk
Retrieving log information from 1 to 1281
Committed revision 2 from revision 1.
Committed revision 3 from revision 2.
…
Committed revision 1282 from revision 1281.
When mirroring a project, you can choose to mirror only the trunk or the branch you are interested in rather than the entire project. In most cases, though, you will want to mirror the entire project tree, including the trunk, branches and tags directories. Since branches and tags are cheap copies on the remote server, the mirror will also store them as cheap copies and thus not use significantly more disk space.
If you choose not to do this and mirror just a single branch or trunk, and you later decide you need access to other branches or tags in the remote repository, separately mirroring those will cause your local depot to contain disjoint mirrors of the individual branches and tags, and copies that were originally cheap copies in the remote repository will become full-blown copies in your local depot. In addition, you might run into problems with merge tracking across remote branches.
Creating a mirror of a large Subversion repository can take quite a while. However, rest assured that once you have the mirror, keeping it up to date is very fast. We will also discuss several ways to speed up getting an initial mirror on a large number of machines; see ##TODO.
Now that we have a mirror of the Subversion repository in our depot we are ready to create a working copy:
$ svk checkout //mirror/svk/trunk
Syncing //mirror/svk/trunk(/mirror/svk/trunk) in /Users/sally/trunk to 1282.
A trunk/utils
A trunk/utils/extract-docs
A trunk/utils/extract-message-catalog
A trunk/utils/svk-ediff.el
A trunk/SIGNATURE
A trunk/pkg
…
Although the above example checks out the trunk directory, you can just as easily check out any deep subdirectory of a depot by specifying the subdirectory in the checkout DEPOTPATH:
$ svk checkout //mirror/svk/trunk/lib/SVK/Command
Syncing //mirror/svk/trunk(/mirror/svk/trunk) in /Users/sally/Command to 1282.
A Command/Propdel.pm
A Command/Checkout.pm
A Command/Revert.pm
A Command/Cat.pm
…
Since SVK uses a “copy-modify-merge” model instead of “lock-modify-unlock” (see Chapter 2, Basic Concepts), you're already able to start making changes to the files and directories in your working copy. Your working copy is just like any other collection of files and directories on your system. You can edit and change them, move them around, or even delete the entire working copy and forget about it, though in that case you should run svk checkout --detach to let SVK know you removed it.
While your working copy is “just like any other collection of files and directories on your system”, you need to let SVK know if you're going to be rearranging anything inside of your working copy. If you want to copy or move an item in a working copy, you should use svk copy