Distributed Version Tree

Conary keeps track of versions in a tree structure, much like a source code control system. The difference between Conary and many source code control systems is that Conary does not need all the branches of a tree to be kept in a single place. For example, if rPath maintains a kernel at rpath.com, and you, working for example.com, want to maintain a branch from that kernel, your branch could be stored on your machines, with the root of that branch connected to the tree stored on rPath's machines.

Figure 1. Distributed Branches

Repository

Conary stores everything in a distributed repository, instead of in package files. The repository is a network-accessible database that contains files for multiple packages, and multiple versions of these packages, on multiple development branches. Nothing is ever removed from the repository once it has been added. In simple terms, Conary is like a source control system married to a package system.

Files

When Conary stores a file in the repository, it tracks it by a unique file identifier rather than by name. Among other things, this allows Conary to track changes to file names—the file name is merely one piece of metadata associated with the file, just like the ownership, permission, timestamp, and contents. If you think of the repository as a filesystem, the file identifier is like an inode number.
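The inode analogy can be illustrated with a minimal sketch. This is not Conary's storage code; `FileMetadata`, `rename`, and the identifier `file-0001` are hypothetical names, and the repository is modeled as a plain dictionary keyed by file identifier.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class FileMetadata:
    """Hypothetical per-file record: the path is just one more field."""
    path: str
    owner: str
    mode: int
    contents: bytes


def rename(repo: Dict[str, FileMetadata], file_id: str, new_path: str) -> Dict[str, FileMetadata]:
    """Return a copy of the mapping in which only the path of file_id changes."""
    updated = dict(repo)
    updated[file_id] = replace(repo[file_id], path=new_path)
    return updated


# The key (file identifier) stays the same across the rename.
repo = {"file-0001": FileMetadata("/etc/foo.conf", "root", 0o644, b"x=1\n")}
renamed = rename(repo, "file-0001", "/etc/foo/foo.conf")
```

Because the identifier is the key, a rename is recorded as a metadata change on the same file rather than as a deletion plus an unrelated new file.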

Troves, Packages, and Components

When you build software with Conary, it collects the files into components, and then collects the components into one or more packages. Components and packages are both called troves. A trove is (generically) a collection of files or other troves.

A package does not directly contain files; a package references components, and the components reference files. Every component's name is constructed from the name of its container package, a : character, and a suffix describing the component. Conary has several standard component suffixes: :source, :runtime, :devel, :docs, and so forth. Conary automatically assigns files to components during the build process, but you can overrule its assignments and create arbitrary component suffixes as appropriate.
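As a rough illustration of this containment and naming scheme, the sketch below models a package that references components, which in turn reference files. The class names `Package` and `Component` and the file identifiers are assumptions made for the example, not Conary's actual classes.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class Component:
    """A trove that references files (by hypothetical file identifier)."""
    package: str
    suffix: str
    file_ids: List[str] = field(default_factory=list)

    @property
    def name(self) -> str:
        # Component name = container package name, ':', suffix.
        return f"{self.package}:{self.suffix}"


@dataclass
class Package:
    """A trove that references components, never files directly."""
    name: str
    components: List[Component] = field(default_factory=list)

    def add_component(self, suffix: str, file_ids: Iterable[str] = ()) -> Component:
        comp = Component(self.name, suffix, list(file_ids))
        self.components.append(comp)
        return comp

    def all_file_ids(self) -> List[str]:
        # Files are reachable only through the components.
        return [fid for comp in self.components for fid in comp.file_ids]


pkg = Package("mozilla")
runtime = pkg.add_component("runtime", ["f1", "f2"])
devel = pkg.add_component("devel", ["f3"])
```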

Figure 2. Package Structure

Package Structure


One component, with the suffix :source, holds all source files (archives, patches, and build instructions); the other components hold files to be installed. The :source component is not included in any package, since several different packages can be built from the same source component. For example, the mozilla:source component builds the packages mozilla, mozilla-mail, mozilla-chat, and so forth. The version structure in Conary's repositories always tells exactly which source component was used to build any other component.

Labels and Versions

Conary uses strongly descriptive strings to compose the version and branch structure. The amount of description makes them quite long, so Conary hides as much of the string as possible for normal use. Conary version strings act somewhat like domain names, in that for normal use you need only a short portion. For example, the version /conary.rpath.com@rpl:trunk/2.2.3-4-2 can usually be referred to and displayed as 2.2.3-4-2. The entire version string uniquely identifies both the source of a package and its intended context. These longer names are globally unique, preventing any confusion.

Let's dissect the version string /conary.rpath.com@rpl:trunk/2.2.3-4-2. The first part, conary.rpath.com@rpl:trunk, is a label. The label holds:

  • The repository host name: conary.rpath.com

  • Branch name: rpl:trunk

    • Namespace: rpl. A high-level context specifier that allows branch names to be reused by independent groups. rPath will maintain a registry of namespace identifiers to prevent conflicts. Use local for branches that will never need to be shared with other organizations.

    • Tag: trunk. This is the only portion of the label that is essentially arbitrary; it is defined by the owner of the namespace it is part of.

The next part, 2.2.3-4-2, is called the revision and contains the more traditional version information.

  • Upstream version string: 2.2.3. This is the version number or string assigned by the upstream maintainer. Conary checks only three things about it: whether this upstream version already exists (to decide which source count to use; see below), that it starts with a numeric character (to distinguish versions from labels when abbreviating versions), and that it does not contain the - character (because the - character separates the upstream version string from the next data element). The upstream version string is there primarily to present useful information to the user. Conary never tries to determine whether one upstream version is “newer” or “older” than another. Instead, the ordering specified by the repository's version tree determines what Conary thinks is older or newer; the most recent commit to the branch is the newest.

  • Source count: 4. Incremented each time a version of the sources with the same upstream version string is checked in. It is similar to the release number used by traditional packaging systems.

  • Build count: 2. How many times the source component that this component comes from has been built. This number is not provided for source components, because it is meaningless in that context.
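The dissection above can be sketched as a small parser. This is an illustrative sketch, not Conary's own parsing code: the names `Label`, `Revision`, `parse_label`, `parse_revision`, and `parse_version` are hypothetical, and the parser handles only the simple unbranched form /label/revision with plain integer counts.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Label:
    host: str
    namespace: str
    tag: str


@dataclass(frozen=True)
class Revision:
    upstream: str
    source_count: int
    build_count: Optional[int]  # None for source components


def parse_label(text: str) -> Label:
    """Split 'host@namespace:tag' into its three parts."""
    host, _, branch = text.partition("@")
    namespace, _, tag = branch.partition(":")
    if not (host and namespace and tag):
        raise ValueError(f"not a full label: {text!r}")
    return Label(host, namespace, tag)


def parse_revision(text: str) -> Revision:
    """Split 'upstream-source[-build]' on the '-' separator."""
    parts = text.split("-")
    if len(parts) not in (2, 3):
        raise ValueError(f"not a revision: {text!r}")
    upstream = parts[0]
    # The upstream version must start with a numeric character.
    if not upstream[:1].isdigit():
        raise ValueError(f"upstream version must start with a digit: {text!r}")
    build = int(parts[2]) if len(parts) == 3 else None
    return Revision(upstream, int(parts[1]), build)


def parse_version(text: str) -> Tuple[Label, Revision]:
    """Parse the simple form '/label/revision' only (assumption of this sketch)."""
    if not text.startswith("/"):
        raise ValueError("a full version string starts with '/'")
    label_text, _, revision_text = text[1:].partition("/")
    return parse_label(label_text), parse_revision(revision_text)


label, revision = parse_version("/conary.rpath.com@rpl:trunk/2.2.3-4-2")
```

Note that the sketch never compares upstream versions with each other, in line with the rule that ordering comes from the repository's version tree rather than from the version string.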

Conary describes branch structure by appending version strings, separated by a / character. The first step to make a release is to create a branch that specifies what is in the release. Let's create the release-1 branch off the trunk: /conary.rpath.com@rpl:trunk/2.2.3-4/release-1 (note that because we are branching the source, there is no build count).

In this branch, release-1 is a label. The label inherits the repository and namespace of the node it branches from; in this case, the full label is conary.rpath.com@rpl:release-1.

The first change that is committed to this branch can be specified in somewhat shortened form as /conary.rpath.com@rpl:trunk/2.2.3-4/release-1/5. Because the upstream version is the same as that of the node from which the branch descends, the upstream version may be omitted, and only the Conary-specific part (here, the source count 5) provided. Users will normally see this version expressed as 2.2.3-5, so this string, still long even when it has been shortened by elision, will not degrade the user experience. Conary also keeps track of individual labels in the repository, so conary.rpath.com@rpl:release-1/2.2.3-5 (note that there is no leading / character) also refers to the same version.
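The two inheritance rules just described (a bare tag inherits the repository and namespace of its parent; an elided revision inherits the parent's upstream version) can be sketched as follows. The function names `expand_label` and `expand_revision` are hypothetical, and the sketch covers only the cases shown in the text.

```python
def expand_label(parent_label: str, short: str) -> str:
    """Expand a bare tag using the host and namespace of the parent label."""
    if "@" in short:
        return short  # already a full label
    host, _, branch = parent_label.partition("@")
    namespace, _, _tag = branch.partition(":")
    return f"{host}@{namespace}:{short}"


def expand_revision(parent_revision: str, short: str) -> str:
    """Fill in the parent's upstream version when only a count is given."""
    if "-" in short:
        return short  # upstream version was given explicitly
    upstream = parent_revision.split("-")[0]
    return f"{upstream}-{short}"


full_label = expand_label("conary.rpath.com@rpl:trunk", "release-1")
full_revision = expand_revision("2.2.3-4", "5")
```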

Figure 3. Branch Structure

Label Search Path and Branch Affinity

When you ask Conary to install a new trove on your system, but do not specify exactly which version to install, Conary will search its installLabelPath, which is just an ordered list of labels, to find the trove. However, once you have a trove installed on the system, from any branch, updates to that trove will come from that branch. We call this branch affinity.

For example, let's assume that gimp 2.2.2 is in the distribution, and that the distribution label (conary.rpath.com@rpl:release1) is first in the installLabelPath. Then conary update gimp will get gimp 2.2.2. However, let's say that someone is building the development version of gimp into our “contrib” repository on a branch named /conary.rpath.com@rpl:something/contrib.rpath.com@rpl:gimpdevel, which has the label contrib.rpath.com@rpl:gimpdevel. You could then run conary update gimp=contrib.rpath.com@rpl:gimpdevel and you would get the development version of gimp. Then, even if gimp 2.2.3 were later built into our distribution repository, future runs of conary update gimp would continue to fetch the latest version of gimp from /conary.rpath.com@rpl:something/contrib.rpath.com@rpl:gimpdevel; that is, from the exact branch that the label contrib.rpath.com@rpl:gimpdevel specified at the time when you originally updated to that label. You could then ask Conary to return to the stable version with conary update gimp=conary.rpath.com@rpl:release1.

Now, gimp stable and development versions can generally be installed at the same time. Assuming the development version has been packaged not to conflict with the stable version (in other words, the paths do not overlap), you can use the --keep-existing option to get both installed at once. So assuming that you already had gimp 2.2.2 installed from conary.rpath.com@rpl:release1, you could run conary update --keep-existing gimp=contrib.rpath.com@rpl:gimpdevel and you would have both versions of gimp installed at the same time. After running that command, running conary update gimp would update both branches of gimp. If you wanted to update only one branch of gimp, you would have to specify which branch to update: conary update gimp=contrib.rpath.com@rpl:gimpdevel.
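The lookup rules in this section can be summarized in a short sketch. It is a simplified model, not Conary's resolver: `branches_to_update` and its arguments are hypothetical, and branches are identified by their labels for brevity, even though affinity is really to the exact branch.

```python
from typing import Dict, List, Optional, Set


def branches_to_update(
    trove: str,
    install_label_path: List[str],
    installed: Dict[str, List[str]],   # trove name -> branches it is installed from
    available: Dict[str, Set[str]],    # label -> trove names found on that label
    requested: Optional[str] = None,   # the part after '=' on the command line
) -> List[str]:
    """Decide where updates for a trove come from under the rules above."""
    if requested is not None:
        return [requested]              # an explicit label always wins
    if installed.get(trove):
        return list(installed[trove])   # branch affinity: stay where we are
    for label in install_label_path:    # fresh install: first match in the path
        if trove in available.get(label, set()):
            return [label]
    return []


release = "conary.rpath.com@rpl:release1"
devel = "contrib.rpath.com@rpl:gimpdevel"
available = {release: {"gimp"}, devel: {"gimp"}}

fresh = branches_to_update("gimp", [release], {}, available)
sticky = branches_to_update("gimp", [release], {"gimp": [devel]}, available)
both = branches_to_update("gimp", [release], {"gimp": [release, devel]}, available)
only_devel = branches_to_update("gimp", [release], {"gimp": [release, devel]}, available, requested=devel)
```

In this model, `both` corresponds to the state after conary update --keep-existing, where a plain conary update gimp updates every installed branch.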

Shadows

The most powerful way to manage local changes is (of course) to build changes from source code. Conary makes this possible in two ways. One way is a simple branch, just like you would do with any source code control software. Unfortunately, this is not always the best solution.

Imagine a stock 2.6 Linux kernel packaged in Conary, being maintained on the /linux26 branch (we have omitted the repository host name and namespace identifier from the label for brevity) of the kernel:source package, currently at version 2.6.5-1 (note that because it is a source package, there is no build count). You have one patch that you want to add relative to that version, and then you wish to track that maintenance branch, keeping your own change up to date with the maintenance branch, and building new versions as you go.

If you create a new branch from /linux26/2.6.5-1, say /linux26/2.6.5-1/mybranch, all the work you do is relative to that one version. Creating a new branch does not help you, because the new branch goes off in its own direction from one point in development, rather than tracking changes. Therefore, when the new version /linux26/2.6.6-1 is committed to the repository, the only way to represent that version in your branch would be to manually compare the changes and apply them all, bring your patch up to date, and commit your changes to your branch. This is time-consuming, and the branch structure does not represent what is really happening in that case.

Note that you do not want to re-branch and create /linux26/2.6.6-1/mybranch, because mybranch would then be a label that means both /linux26/2.6.5-1/mybranch and /linux26/2.6.6-1/mybranch, which is almost certainly not what you intended. You would then have to specify the entire branch name (/linux26/2.6.6-1/mybranch instead of just mybranch) when installing.

Conary introduces a new concept: a shadow. A shadow acts primarily as a repository for local changes to a tree. A shadow tracks changes relative to a particular upstream version string and source count, and is designed to allow you to merge changes and follow development. The name of a shadow is the name of the parent branch with //shadowname appended; for example, /branch//shadow. (Keep in mind for the rest of this discussion that /branch will really be something like /conary.rpath.com@rpl:linux and //shadow will really be something like //conary.example.com@rpl:myshadow.)

Both /branch/1.2.3-3 and /branch//shadow/1.2.3-3 refer to exactly the same contents. Changes are represented with a dotted source count, so the first change to /branch/1.2.3-3 that you check in on the /branch//shadow shadow will be called /branch//shadow/1.2.3-3.1. When you build binaries, you will have versions like /branch//shadow/1.2.3-3.1-1.1 where the build count has also been dotted.

If you update to a new upstream source version on your shadow without merging to the parent branch, 0 is used as a placeholder for the parent source count. So if you check in version 1.2.4 on this shadow, you will get /branch//shadow/1.2.4-0.1 as your version. The same thing happens for build count; if the source version /branch/1.2.4-1 exists, but the build version /branch/1.2.4-1-1 does not exist when you build on your shadow, you will get versions that look like /branch//shadow/1.2.4-1.1-0.1.
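The dotted-count rules can be reproduced with a few lines of string formatting. This is only a sketch of the naming scheme described above, with hypothetical helper names (`shadow_count`, `shadow_source_revision`, `shadow_build_revision`); it does not talk to a repository.

```python
from typing import Optional


def shadow_count(parent_count: Optional[int], shadow_count_value: int) -> str:
    """Dotted count: parent count (0 if the parent has none), '.', shadow count."""
    parent = 0 if parent_count is None else parent_count
    return f"{parent}.{shadow_count_value}"


def shadow_source_revision(upstream: str, parent_source_count: Optional[int], shadow_commit: int) -> str:
    """Revision of the Nth source change checked in on the shadow."""
    return f"{upstream}-{shadow_count(parent_source_count, shadow_commit)}"


def shadow_build_revision(source_revision: str, parent_build_count: Optional[int], shadow_build: int) -> str:
    """Revision of the Nth build made on the shadow from a shadow source revision."""
    return f"{source_revision}-{shadow_count(parent_build_count, shadow_build)}"


# First change on the shadow relative to /branch/1.2.3-3
first_change = shadow_source_revision("1.2.3", 3, 1)
# Built on the shadow where the parent build count is 1
first_build = shadow_build_revision(first_change, 1, 1)
# New upstream version checked in only on the shadow
new_upstream = shadow_source_revision("1.2.4", None, 1)
# Parent source 1.2.4-1 exists but has never been built on the parent
unbuilt_parent = shadow_build_revision(shadow_source_revision("1.2.4", 1, 1), None, 1)
```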

So, to track changes to the /linux26 branch of the kernel:source package, you create the mypatch shadow of the /linux26 branch, /linux26//mypatch, and therefore /linux26//mypatch/2.6.5-1 now exists. Commit a patch to the shadow, and /linux26//mypatch/2.6.5-1.1 exists. Later, when the linux26 branch is updated to version 2.6.6-1, you merely need to update your shadow, modify the patch to apply to the new kernel source code if necessary, and commit your new changes to the shadow, where they will be named /linux26//mypatch/2.6.6-1.1. You can use the shadow branch name /linux26//mypatch just like you can use the branch name /linux26; you can install that branch, and conary update will use the same rules to find the latest version on the shadow that it uses to find the latest version on the branch. This includes affinity: Conary will look for the latest version on the shadow that you have installed; it will not switch to a different branch, nor will it look up the tree and pick a version off the branch (or shadow) from which the shadow was created.

Because re-branching (creating the same branch name again starting from a different root) creates multiple instances of labels, one for each branch instance, you really only want to use branches for truly divergent development, where there is no possibility at all that you will ever want to synchronize the branch with its parent. The main use for branches is to keep one or more old versions of a library (or less commonly, an application) available for the sake of compatibility, while moving forward with the more recent version; for example, gtk 1.2 and gtk 2. Shadows do not require that you ever merge or re-shadow, but they keep that option open in case it is ever useful. In case of any doubt, use a shadow, since shadows will also work for divergent development, as long as you do not want Conary to automatically install both branches at once.

Flavors

Conary has a unified approach to handling multiple architectures and modified configurations. It has a very fine-grained view of architecture and configuration. Architectures are viewed as an instruction set, including settings for optional capabilities. Configuration is set with system-wide flags, and per-package flags for configuration that is very package-specific. Each separate architecture/configuration combination built is called a flavor.

Using flavors, the same source package can be built multiple times with different architecture and configuration settings. For example, it could be built once for x86 with i686 and SSE2 enabled, and once for x86 with i686 enabled but SSE2 disabled. Each of those architecture builds could be done twice, once with PAM enabled, and once with PAM disabled. All these versions, built from exactly the same sources, are stored together in the repository.

At install time, Conary picks the most appropriate flavor of a component to install for the local machine and configuration (unless you override Conary's choice, of course). Furthermore, if two flavors of a component do not have overlapping files, and both are compatible with the local machine and configuration, both can be installed. For example, library files for the i386 family are kept in /lib and /usr/lib, but for x86_64 they are kept in /lib64 and /usr/lib64, so there is no reason that they should not both be installed, and since the AMD64 platform can run both, it is convenient to have them both installed.

When you update a trove, Conary has flavor affinity—that is, it tries to pick (from the available flavors of the latest version of that trove) the flavor that most closely matches what you currently have installed that is compatible with your system. Like branch affinity, you can override flavor affinity if you choose.
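To make flavor selection and flavor affinity more concrete, here is a deliberately simplified model. It assumes a flavor is just a set of enabled flags and that a flavor is compatible when the system supports every flag it enables; Conary's real flavor matching is more detailed than this, and `pick_flavor` is a hypothetical name.

```python
from typing import FrozenSet, Iterable, Optional

Flavor = FrozenSet[str]


def pick_flavor(
    available: Iterable[Flavor],
    system: Flavor,
    installed: Optional[Flavor] = None,
) -> Optional[Flavor]:
    """Pick a compatible flavor; with affinity, prefer the closest to `installed`."""
    candidates = [f for f in available if f <= system]  # assumed compatibility rule
    if not candidates:
        return None
    if installed is None:
        # No affinity: take the flavor that uses the most system capabilities.
        return max(candidates, key=len)
    # Flavor affinity: fewest flags that differ from what is installed.
    return min(candidates, key=lambda f: len(f ^ installed))


# The four builds from the example: SSE2 on/off, PAM on/off.
available = [
    frozenset({"x86", "i686", "sse2", "pam"}),
    frozenset({"x86", "i686", "sse2"}),
    frozenset({"x86", "i686", "pam"}),
    frozenset({"x86", "i686"}),
]
system = frozenset({"x86", "i686", "pam"})  # a machine without SSE2

default_choice = pick_flavor(available, system)
affinity_choice = pick_flavor(available, system, installed=frozenset({"x86", "i686"}))
```

On this machine the SSE2 builds are never candidates; without affinity the PAM-enabled build is chosen, while a system that already has the PAM-disabled build installed stays on it.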