Mercurial subrepos

Mercurial supports repository hierarchies through its subrepository feature. With it you can create parent-child dependencies between repositories and work in the hierarchy as if all the repos were part of the same tree.

The easiest example is an application in the top-level repo with its libraries in subrepos.

The top-level repo keeps a list of subrepo directory paths and the locations where each repo can be found when pushing or pulling. This information is stored in a file called .hgsub in the repository root.

Each line in the file specifies one subrepo:

lib/liba = location/of/remote/repo

The left-hand side is the local directory path in the repo and the right-hand side is the remote location used for push and pull.
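To make the format concrete, here is a minimal Python sketch that reads .hgsub-style text into a mapping from local path to source. The function name parse_hgsub is hypothetical, and the handling of blank lines and # comments is an assumption of this sketch; Mercurial's own parser handles more cases.

```python
def parse_hgsub(text):
    """Map each local subrepo path to its source, one 'path = source' per line."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption of this sketch: blank lines and '#' comments are ignored.
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so sources containing '=' stay intact.
        path, sep, source = line.partition("=")
        if not sep:
            raise ValueError(f"not a 'path = source' line: {raw!r}")
        mapping[path.strip()] = source.strip()
    return mapping


example = "lib/liba = location/of/remote/repo\n"
subrepos = parse_hgsub(example)
print(subrepos)
```

The key is the directory inside the working copy; the value is whatever was written on the right-hand side, whether that is a path, a URL, or a symbolic name.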

It is possible to use full SSH/HTTP URLs as the remote location, but I’ve found that this introduces a problem. This information is stored in the repository, so all clones share it. If the repositories live on a machine that can be accessed only from the local network, what’s the point of storing that address in the repo, where it confuses everybody else?

There is a way to move the real address out of the repo. The [subpaths] section in .hgrc lets us map symbolic names in .hgsub to real paths. The left-hand side of each entry is a regular expression that selects which symbolic names to rewrite, and the right-hand side can use backreferences to the groups captured by that expression.

With this indirection it is possible to come up with naming conventions for the symbolic names. Of course it is still possible to keep the real URLs in .hgsub as well, but I like the idea of writing just the name of the component as a symbolic name and putting the real repository path in the .hgrc config file.

The naming convention I’m using is the component type followed by the last item of the repository path, which is also the name of the component:

fancy = src/django-fancy

The local path fancy is a subrepo. Here src is the type of the repo and django-fancy is the name of the component, which is also the last part of the repo path on the server. This information is used in .hgrc:

[subpaths]
src/(.*) = ssh://hg@hg.server.foo/projects/\1
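To see what this rule does to the symbolic name, the rewrite can be mimicked with Python's re.sub. This is only a sketch of the idea, not Mercurial's implementation; the resolve helper and the SUBPATHS list are hypothetical names, and the rule itself is copied from the example above.

```python
import re

# Rule copied from the [subpaths] example: (pattern, replacement).
SUBPATHS = [(r"src/(.*)", r"ssh://hg@hg.server.foo/projects/\1")]


def resolve(source, rules=SUBPATHS):
    """Rewrite a symbolic .hgsub source using pattern/replacement rules."""
    for pattern, replacement in rules:
        # \1 in the replacement is filled with the first captured group.
        source = re.sub(pattern, replacement, source, count=1)
    return source


resolved = resolve("src/django-fancy")
print(resolved)
```

A source that matches no pattern comes back unchanged, so real URLs and symbolic names can live side by side in the same .hgsub.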

With this system it is easy to handle new subrepos consistently and to use different server mappings inside an internal network.