Revision Control Systems

Source: migo@homemail.com--web/talk-rcs--main--0
License: GNU GPL or GNU FDL, at your choice.

In quest of the ideal RCS

We are looking for a revision control system that works well as a:
  • Concurrent development system - allow multiple developers to work on the same code and data base at the same time.

  • History system - make it easy to retrieve and report all code and data changes at any point in time.

  • Source distribution system - make it easy for all interested parties to get the source.

  • Branching system - allow multiple lines of development (like testing, beta and release lines).

  • Merge system - allow code developed for a related development line to be merged in.

  • Backup / replication system - allow easy backup and mirroring of all code and data changes.

  • Fast, elegant, transparent and easy to use.
We are looking for the best and will not settle for the merely tolerable.

Free Revision Control Systems

CVS - Concurrent Versions System

  • De-facto standard revision control system.
  • Centralized, one fixed repository per project.
  • No file/dir renames, symlinks, permissions.
  • Very limited support for branches.
  • Very limited support for merging.
  • No atomic changesets.
  • Well-developed infrastructure and tool integration.
  • Outdated, but works well when it works.
  • There are better alternatives to consider.

Subversion - evolution

  • Modern evolution of CVS.
  • Centralized, one repository per project.
  • Tracks directory changes such as
    renames, symlinks. No permissions.
  • Branch is implemented as a copy.
  • Tag is implemented as a copy.
  • Copies are cheap: data is actually copied only on write.
  • Limited support for merging.
  • Changesets are not first class citizens.
  • Reuses Apache 2 for the server.
  • Could be considered for new projects.

GNU Arch - revolution

  • Not an evolution. Revolution in RCS.
  • Decentralized, multiple repositories.
  • Tracks directory changes such as
    renames, symlinks, permissions.
  • Several file id methods.
  • Branching is just tagging, very cheap.
  • Strong support for merging.
  • Changesets are first class citizens.
  • Dumb file storage, no server required.
  • Supports major network protocols.
  • Could be considered for new projects.

Arch features

  • Changesets are at the core of the design and make its distributed nature possible.

  • Separation of logical repository names from their physical locations (easily movable repositories).

  • Global namespace of repositories around the world, with no central authority involved.

  • Possibility to build repositories around the developers and not around the servers.

  • "Dumb" repositories that hold data and may be accessed using regular protocols like ssh/http/ftp, with no special servers involved.

  • Easy mirroring of repositories from anywhere to anywhere, including local disk.

  • Advanced methods of merging between branches in the same or different repositories.

  • Specialized namespace of revisions, category--branch--version--revision.
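
The revision namespace in the last item can be illustrated with a small shell sketch. It assumes a fully qualified name of the form archive/category--branch--version--revision, and it assumes that the category and branch parts contain no "--" themselves. The function name split_revision and the sample revision patch-2 are made up for illustration; the archive and version names are taken from the tutorial part of this talk.

```shell
#!/bin/sh
# Split a fully qualified Arch revision name into its parts using only
# POSIX parameter expansion. Illustration of the naming scheme, not a tla tool.
split_revision() {
    full=$1
    archive=${full%%/*}      # everything before the first "/"
    rest=${full#*/}          # category--branch--version--revision
    category=${rest%%--*}    # text before the first "--"
    rest=${rest#*--}
    branch=${rest%%--*}
    rest=${rest#*--}
    version=${rest%%--*}
    revision=${rest#*--}     # whatever remains after the version
    printf '%s\n' "archive=$archive" "category=$category" \
        "branch=$branch" "version=$version" "revision=$revision"
}

# Hypothetical sample name; prints the five parts, one per line.
split_revision 'muli@hacker.org--2004/primes--devel--0--patch-2'
```

For the sample name this prints archive=muli@hacker.org--2004, category=primes, branch=devel, version=0 and revision=patch-2. The archive name may itself contain "--", which is why it is split off at the "/" first.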

Centralization versus Decentralization

The first graph applies to CVS and Subversion; all three graphs apply to Arch.

Centralization scenarios

Normal operations

  • Set up the server and give commit access to chosen developers.

Features

  • Good for closed, in-company development.
  • Good for the cathedral style of development.
  • Easier to find the authorized source.
  • Easier to find the authorized developers.
  • Harder to participate.
  • Mandatory authority.

Decentralization scenarios

Normal operations

  • External branches and forks.
  • Mirroring archives for faster read-only access.
  • Development while disconnected from the Internet.

Package distribution and portage systems

  • A Debian package maintainer may manage the debian/ subdirectory and update it to follow the mainstream, without submitting it back.
  • Software distributions may construct a set of ports from multiple repositories and maintain such a portage system in an RCS that allows this.

Software customization

  • A site may maintain its own web theme in RCS and keep the web software synchronized with the mainstream; submitting changes back is optional. One example is maintaining a custom template set for ArchZoom.

Developers and contributors

Usual CVS or Subversion workflow

  • check out the project as an anonymous user
  • fix or add a new feature in the working tree without using RC
    • create and submit a patch
    • wait until it is applied in the mainstream
  • repeat until trust is earned
  • be granted write access
  • check out the project again as a real user
  • develop with RC involved

Usual Arch workflow

  • fork (branch) the project in a personal archive
  • fix or add a new feature with all the benefits of RC
  • submit a merge request to the mainstream
  • star-merge the mainstream's updates
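
The Arch workflow above can be sketched as a short script. This is only an outline under stated assumptions: it uses commands that appear in the tutorial part of this talk (tag, get, commit, star-merge), the archive and branch names are the tutorial's examples, and the helper names run and contribute are made up. By default it only prints the commands; setting RUN=1 would execute them and requires tla, a default archive of your own, and the mainstream archive to be registered.

```shell
#!/bin/sh
# Dry-run sketch of the Arch contributor workflow.
MAINLINE='muli@hacker.org--2004/primes--devel--0'   # mainstream version (example)
MYBRANCH='primes--pipeline--0'                      # branch in my default archive

# Print a command; execute it only when RUN=1 is set.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = 1 ]; then "$@"; fi
}

contribute() {
    run tla tag -S "$MAINLINE" "$MYBRANCH"   # fork into the personal archive
    run tla get "$MYBRANCH" primes           # check out the fork
    run cd primes
    # ... hack, then record the work under full revision control:
    run tla commit -s 'add --pipe|-p command line option'
    # after submitting the merge request, keep following the mainstream:
    run tla star-merge "$MAINLINE"
    run tla commit -s 'merge with mainline'
}

contribute
```

Every step after the fork is recorded in the contributor's own archive, so no write access to the mainstream archive is needed at any point.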

Arch command set - Some numbers

Some "scary" numbers

tla, the Arch main tool, has more than 100 commands!

In comparison, svn, the Subversion main tool, has about 30 commands (and about 30 command aliases, plus about 15 svnadmin commands). The cvs command line tool also has about 30 commands.

Arch command set - Main commands

In reality, only a few commands are used in day-to-day work:

tla help
tla import
tla commit
tla update
tla add
tla delete
tla move

These have the same meaning as in svn and cvs (but cvs does not support move).

Other common commands have different names:

tla get              # ~~ cvs checkout ~~ svn checkout
tla changes          # ~~ cvs -n update ~~ svn status
tla changes --diffs  # ~~ cvs diff ~~ svn diff
tla logs             # ~~ cvs log ~~ svn log
tla replay           # ~~ svn merge
tla star-merge       # no cvs and svn equivalent
tla missing          # no cvs and svn equivalent
tla delta            # no cvs and svn equivalent
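
For readers coming from Subversion, the two lists above can be folded into a small lookup helper. This is a convenience sketch only: the function name tla_for is made up, and the mapping contains nothing beyond what the lists state (svn command names are used as keys).

```shell
#!/bin/sh
# tla_for COMMAND: print the tla counterpart of an svn command name,
# following the lists above. Returns 1 for commands not covered there.
tla_for() {
    case $1 in
        checkout) echo 'tla get' ;;
        status)   echo 'tla changes' ;;
        diff)     echo 'tla changes --diffs' ;;
        log)      echo 'tla logs' ;;
        merge)    echo 'tla replay' ;;
        help|import|commit|update|add|delete|move)
                  echo "tla $1" ;;          # same name in both tools
        *)        echo "no equivalent listed for '$1'" >&2
                  return 1 ;;
    esac
}
```

For example, tla_for diff prints tla changes --diffs, and tla_for commit prints tla commit.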

Arch command set - Revision library

Arch can optimize away expensive accesses to the archive. This is done with a revision library on the local disk. Not only does this speed up operations, it also lets you use Arch's features when you are offline.

The most popular revisions (including changesets and whole trees) may be stored in the library at the user's request. To learn more about the commands dealing with the revision library, see "tla help | grep library".
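
A minimal sketch of setting up a revision library follows. The library path and the revision name patch-1 are placeholders, and the two command names (my-revision-library, library-add) are given as found in tla 1.x; check them against "tla help | grep library" on your installation. The script only prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Dry-run sketch: create a revision library and cache one revision in it.
LIBRARY="$HOME/arch-library"                                  # placeholder path
REVISION='muli@hacker.org--2004/primes--devel--0--patch-1'    # placeholder revision

# Print a command; execute it only when RUN=1 is set.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = 1 ]; then "$@"; fi
}

setup_library() {
    run mkdir -p "$LIBRARY"
    run tla my-revision-library "$LIBRARY"   # tell arch where the library lives
    run tla library-add "$REVISION"          # store this revision locally
}

setup_library
```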

Arch command set - Useful commands

One frequent task is to fix something small and urgent while in the middle of a large change. This is trivial in arch:

# hack some incomplete code in file1 and file2
tla undo
# fix urgent bug in file1
tla commit
tla redo
# continue to work on file1 and file2

The inventory command is handy for several purposes. The following generates a list of all directories and source files in the tree.

tla inventory --both --source

Arch tutor

This practical part is intended to be shown live; however, the slides accurately describe the interactive show. You may create two local users, muli and tali, and run the commands from this tutorial. Every # hack file step in the slides may be replaced with echo >>file if you like.
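
If you follow along, the echo >>file replacement can be wrapped in a tiny helper so that each "# hack" step becomes a real command. The function name hack is made up for this sketch; it appends one marker line to every file it is given, creating the file if needed.

```shell
#!/bin/sh
# hack FILE...: simulate an edit by appending a marker line to each file.
# Stand-in for the "# hack ..." comments in the tutorial slides.
hack() {
    for f in "$@"; do
        echo "# edited on $(date)" >> "$f"
    done
}
```

With this defined, "# hack primes, README" can be typed as hack primes README, which gives tla changes and tla commit something to report.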

Arch tutor - Starting a project

Hacker Muli discovers an O(1) algorithm to calculate prime numbers and starts his ambitious project primes:

mkdir ~/primes
cd ~/primes
# hack primes script
# hack README file

Muli reads that arch is very good and decides to try it. He introduces himself to arch (done once):

tla my-id 'Muli Verner <muli@hacker.org>'
mkdir ~/archives
tla make-archive muli@hacker.org--2004 ~/archives/muli@hacker.org--2004
tla my-default-archive muli@hacker.org--2004

Muli imports the project:

tla archive-setup primes--devel--0
tla init-tree primes--devel--0
tla add README primes
tla import

Arch tutor - Developing a project

Muli makes the upper limit configurable:

# hack primes, README
tla changes  # to review changes to the tree, or --diffs
tla commit -s 'accept upper limit argument from the command line'

Muli finds a way to speed up the program by a factor of 2:

# hack primes script
tla commit -s 'highly optimize calculations'

Muli wants to see some project statistics:

tla abrowse
tla revisions --summary --date  # requires archive
tla logs --summary --date  # requires tree only

Arch tutor - Getting a project

Meanwhile, on the other side of the planet, another hacker, Tali, finds Muli's project and falls in love with it. She registers the archive and checks out the sources:

tla register-archive ~muli/archives/muli@hacker.org--2004
tla get muli@hacker.org--2004/primes--devel--0 primes
cd primes

From time to time Tali updates the project:

tla missing --summary --date   # review the changes
tla update

Arch tutor - External participation

However, Tali finds that some missing functionality prevents the script from being used in a pipeline. She starts her own branch to fix this:

tla my-id 'Tali Cosmic <tali@scientist.com>'
mkdir ~/archives
tla make-archive tali@scientist.com--2004 ~/archives/tali@scientist.com--2004
tla my-default-archive tali@scientist.com--2004

tla tag -S muli@hacker.org--2004/primes--devel--0 primes--pipeline--0
tla join-branch primes--pipeline--0
tla set-tree-version primes--pipeline--0

Tali makes the output header optional:

# hack primes, README
tla commit -s 'add --noheader|-n command line option'

And adds an option to print primes one per line:

# hack primes, README
tla commit -s 'add --pipe|-p command line option'

Arch tutor - Merging, tagging

Tali sends a merge request to Muli by email, and Muli sees that the changes are very useful. He registers Tali's archive and starts to star-merge from her pipeline branch from time to time:

tla register-archive ~tali/archives/tali@scientist.com--2004
tla missing --summary --date tali@scientist.com--2004/primes--pipeline--0
tla star-merge tali@scientist.com--2004/primes--pipeline--0
tla commit -s 'merge with Tali'

Now Muli is ready to release version 0.0.1. He makes some tree rearrangements:

mkdir bin doc
tla add bin doc
tla mv README doc/
tla mv primes bin/
tla add doc/THANKS
# hack doc/README, doc/THANKS
tla commit -s 'rearrange files and prepare to 0.0.1'

Finally, Muli tags the 0.0.1 version:

tla tag -S primes--devel--0 primes--release--0.0.1

Arch-Magic

An infrastructure built around GNU Arch and written entirely in Perl. The components presented in the following sections are the Arch-Perl library, ArchZoom and ArchWay.

The idea is to provide similar functionality through different user interfaces.

Arch-Perl Library

Allows creating Arch front-ends. Available on CPAN.

Check out a project tree from the repository:

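use strict;
use warnings;
use Arch::Session;

# Start a session and name the branch to fetch.
my $session = Arch::Session->new;
my $branch = 'migo@homemail.com--Perl-GPL/arch-perl--devel';

# Check out the latest revision of the branch into /tmp/arch-perl.
my $tree = $session->get_tree($branch, "/tmp/arch-perl");
print "Got the latest snapshot of version ", $tree->get_version, "\n";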

Automatically generate a ChangeLog from the project tree logs:

use strict;
use warnings;
use Arch::Tree;

# Print the date, summary and body of every log in the project tree.
foreach my $log (Arch::Tree->new->get_logs) {
        print "-" x 80, "\n";
        print $log->date, "\n", $log->summary, "\n\n";
        print $log->body;
}

ArchZoom

A web-based repository browser, similar to CVSWeb or ViewCVS.

ArchZoom lets you browse local and remote archives, branches, changesets and trees. It features interactively expandable views, syntax highlighting, file and directory history, and more.

ArchWay

A GTK+ based GUI that consists of several specialized tools.

ArchWay lets you visually inspect remote archives, branches, changesets and snapshots, and manage local projects.

Thank you