In place of the fragile script metadata provided by traditional package management systems, Conary introduces a concept called dynamic tags. Files managed by Conary can have sets of arbitrary text tags that describe them. Some of these tags are defined by Conary (for example, shlib is reserved for shared library files; changes to files with this tag cause Conary to update /etc/ld.so.conf and run ldconfig), and others can be more arbitrary.
Tag names are intended to be shared between repositories and distributions as much as is reasonably possible. When tag names are shared, tags will not introduce arbitrary incompatibilities in packaging. If one distribution needs something special done for any particular type of file, it should modify or replace the tag handler for that tag, but should leave the tag name the same. In order to allow tag semantics to be shared between repositories and distributions, it is likely that rPath will host a formal global tag registry in the future. At this time, the tag registry is kept manually in the Conary wiki.
By convention, a tag is a noun or noun phrase describing the file; it is not a description of what to do to the file. That is, file is-a tag. For example, a shared library is tagged as shlib instead of as ldconfig. Similarly, an info file is tagged as info-file, not as install-info.
Conary can be explicitly directed to apply a tag to a file, and it can also automatically apply tags to files based on a tag description file. A tag description file provides the name of the tag, a set of regular expressions that determine which files the tag applies to, the path of the tag handler program that Conary runs to process changes involving tagged files, and a list of actions that the handler cares about. Conary then calls the handler at appropriate times to handle the changes involving the tagged files.
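The four parts of a tag description (name, regular expressions, handler path, and actions) can be modeled in a few lines of Python. This is a minimal sketch, not Conary's actual file format or API: the TagDescription class, the tags_for function, the action strings, the patterns, and the handler paths are all hypothetical and exist only for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class TagDescription:
    """Hypothetical in-memory form of a tag description file."""
    name: str            # tag name, a noun such as "shlib"
    handler: str         # path of the tag handler program
    include: list        # regular expressions that select files
    actions: set = field(default_factory=set)  # actions the handler cares about

    def matches(self, path):
        # The tag applies if any regular expression matches the whole path.
        return any(re.fullmatch(pattern, path) for pattern in self.include)


def tags_for(path, descriptions):
    """Return the names of all tags that apply automatically to path."""
    return {d.name for d in descriptions if d.matches(path)}


# Illustrative descriptions; patterns, paths, and action names are made up.
DESCRIPTIONS = [
    TagDescription(
        name="shlib",
        handler="/usr/libexec/example/tags/shlib",
        include=[r"/usr/lib/.*\.so(\..*)?", r"/lib/.*\.so(\..*)?"],
        actions={"files-updated", "files-removed"},
    ),
    TagDescription(
        name="info-file",
        handler="/usr/libexec/example/tags/info-file",
        include=[r"/usr/share/info/.*\.info(\.gz)?"],
        actions={"files-updated", "files-about-to-be-removed"},
    ),
]
```

Calling tags_for on a path and DESCRIPTIONS returns the set of tag names whose patterns match that path, which is the automatic half of tagging; explicit tagging would simply add names to that set.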
Actions include changes involving either the tagged files or the tag handlers. Conary will pass in lists of affected files whenever it makes sense, and will coalesce actions rather than running all possible actions once for every file or component installed.
The current list of possible actions is:
Tagged files have been installed or updated; Conary provides a list of all installed or updated tagged files.
Tagged files are going to be removed; Conary provides a list of all tagged files to be removed.
Tagged files have been removed; Conary provides a list of filenames that were removed.
The tag handler or tag description has been installed or updated; Conary provides a list of all tagged files already installed on the system.
The tag handler or tag description will be removed; Conary provides a list of all the tagged files already installed on the system to facilitate cleanup.
Because the tag description files list the actions they handle, the tag handler API can be expanded easily while maintaining backward compatibility with old handlers.
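The coalescing and the backward-compatibility property described above can be sketched together. The code below assumes a hypothetical plan_handler_calls function and made-up action names; it does not reflect Conary's internal implementation. It groups per-file events into one handler call per tag and action, and drops any action that a tag's description does not declare.

```python
from collections import defaultdict


def plan_handler_calls(events, handled_actions):
    """Coalesce per-file events into one call per (tag, action).

    events: iterable of (tag, action, path) tuples from one operation.
    handled_actions: dict mapping a tag name to the set of actions that
        the tag's description file declares (hypothetical action names).
    Returns a dict mapping (tag, action) to a sorted list of paths.
    """
    calls = defaultdict(list)
    for tag, action, path in events:
        # Actions the handler did not declare are skipped. This is what
        # lets new actions be introduced without breaking old handlers.
        if action in handled_actions.get(tag, set()):
            calls[(tag, action)].append(path)
    return {key: sorted(paths) for key, paths in calls.items()}


# Illustrative input: three shared libraries and one info file change.
EVENTS = [
    ("shlib", "files-updated", "/usr/lib/libb.so.1"),
    ("shlib", "files-updated", "/usr/lib/liba.so.1"),
    ("shlib", "files-removed", "/usr/lib/libold.so.0"),
    ("info-file", "files-updated", "/usr/share/info/x.info"),
]
HANDLED = {
    "shlib": {"files-updated"},       # an older handler: no removal support
    "info-file": {"files-updated"},
}
PLAN = plan_handler_calls(EVENTS, HANDLED)
```

In this sketch the shlib handler would be invoked once with both updated libraries rather than once per file, and the removal event is never delivered because that handler did not declare it.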
Writing each script once, rather than duplicating it across many packages, reduces script bugs. Practically speaking, it avoids whole classes of common bugs that cause package upgrades to break installed software and, even more importantly from a provisioning standpoint, bugs that would cause rollbacks to fail. It also makes bugs much easier to fix when they do occur, without the “trigger” scripts that traditional package management often needs to work around script bugs. Finally, it allows components to be installed across distributions: as long as the distributions agree on the semantics of the tags, the actions taken for any particular tag will be correct for the distribution on which the package is being installed.
Calling tag handlers when they have been updated makes recovery from bugs in older versions of tag handlers relatively benign; Conary needs to install only a single new tag handler with the capability to recover from the effects of the bug. Older versions of packages with tagged files will use the new, fixed tag handler, which allows you to revert those packages to older versions as desired, without fear of re-introducing bugs created by old versions of scripts.
Furthermore, storing the scripts as files in the filesystem instead of as metadata in a package database means:
they can be modified to suit local system peculiarities, and those modifications will be tracked just like other configuration file modifications;
they are easier for system administrators to inspect; and
they are more readily available for system administrators to use for custom tasks.
There are two other kinds of troves that we did not discuss when we introduced the trove concept: groups and filesets.
Filesets are troves that contain only files, but those files come from components in the repository. They allow custom re-arrangements of any set of files in the repository. (They have no analog at all in the classical package model.) Each fileset's name is prefixed with fileset-, and that prefix is reserved for filesets only.
Filesets are useful primarily for creating small embedded systems. With traditional packaging systems, you are essentially limited to installing a system, then creating an archive containing only the files you want; this limits the options for upgrading the system. With Conary, you can instead create a fileset that references the files, and you can update that fileset whenever the components on which it is based are updated, and use Conary to update even very thin embedded images.
The desire to be able to create working filesets was a major motivation for using file-specific metadata instead of trove-specific metadata wherever possible. For example, files in filesets maintain their tags, which means that exactly the right actions will be taken for the fileset. If Conary had package scripts like traditional package managers, it would be impossible to automatically determine which parts (if any) of the script should be included in the fileset. (As already discussed, scripts have other problems that tags solve; this is just another one of the architectural reasons that tags are preferable to scripts.)
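Why per-file metadata makes filesets workable can be shown with a small model. This is a sketch under stated assumptions: FileEntry, make_fileset, and the component and file names are hypothetical, and the real repository data model is certainly richer. The point is only that a fileset reuses file entries unchanged, so the tags travel with the files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileEntry:
    path: str
    tags: frozenset  # metadata lives on the file, not on the containing trove


def make_fileset(name, components, wanted_paths):
    """Build a hypothetical fileset from files of existing components.

    components: dict mapping a component name to a list of FileEntry.
    wanted_paths: paths to include in the fileset.
    """
    if not name.startswith("fileset-"):
        raise ValueError("fileset names must start with 'fileset-'")
    available = {f.path: f for files in components.values() for f in files}
    missing = sorted(set(wanted_paths) - set(available))
    if missing:
        raise KeyError(f"not in any component: {missing}")
    # Entries are reused unchanged, so every file keeps its tags.
    return {"name": name, "files": [available[p] for p in sorted(wanted_paths)]}


# Illustrative components; names and contents are made up.
COMPONENTS = {
    "glibc:runtime": [
        FileEntry("/lib/libc.so.6", frozenset({"shlib"})),
        FileEntry("/usr/bin/ldd", frozenset()),
    ],
    "bash:runtime": [FileEntry("/bin/bash", frozenset())],
}
TINY = make_fileset("fileset-tiny", COMPONENTS, ["/lib/libc.so.6", "/bin/bash"])
```

Because the shlib tag is still attached to /lib/libc.so.6 inside the fileset, the same handler would run for it as for the original component; nothing has to be extracted from a package-level script.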
Groups are troves that contain any other kind of trove, and the troves are found in the repository. (The task lists used by apt are similar to groups, as are the components used by anaconda, the Red Hat installation program.) Each group's name is prefixed with group-, and that prefix is reserved for groups only.
Groups are useful for any situation in which you want to create a group of components that should be versioned and managed together. Groups are versioned like any trove, including packages and components. Also, a group references only specific versions of troves. Therefore, if you install a precise version of a group, you know exactly which versions of the included components are installed; if you update a group, you know exactly which versions of the included components have been updated.
If you have a group installed and you then erase a component of the group without changing the group itself, the local changeset for the group will show the removal of that component from the group. This makes groups a powerful mechanism administrators can use to easily browse the state of installed systems.
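The idea that a group pins exact versions, and that local removals show up as a difference against the group, can be illustrated as follows. This is only a sketch: removed_from_group is a hypothetical helper, the version strings are invented, and an actual local changeset records far more than a list of names.

```python
def removed_from_group(group_members, installed):
    """List group members that are no longer installed locally.

    group_members: dict mapping a trove name to the exact version the
        group references.
    installed: dict mapping a trove name to the installed version.
    """
    return sorted(name for name in group_members if name not in installed)


# Illustrative group; names and version strings are made up.
GROUP_CORE = {"bash": "3.0-1", "glibc": "2.3-1", "info": "4.7-1"}

# Installing the group installs exactly the referenced versions.
INSTALLED = dict(GROUP_CORE)

# The administrator then erases one member without changing the group.
del INSTALLED["info"]

REMOVED = removed_from_group(GROUP_CORE, INSTALLED)
```

Here REMOVED contains only the erased member, which is the kind of information an administrator would read off the group's local changeset.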
The relationship between all four kinds of troves is illustrated as follows:
Groups and filesets are built from :source components just like packages. The contents of a group or fileset are specified as plain text in a source file; then the group or fileset is built just like a package.
This means that groups and filesets can be branched and shadowed just like packages. So if you have a local branch with only one modified package on it, and then you want to create a branch of the whole distribution containing your package, you can branch the group that represents the whole distribution, changing only one line to point to your locally changed package. You do not have to have a full local branch of any of the other packages or components.
Furthermore, when the distribution from which you have branched is updated, your modification to the group can easily follow the updates, so you can keep your distribution in sync without having to copy all the packages and components.
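A derived group that changes one line and then follows upstream can be sketched as a mapping with a single override. The derive_group function, the trove names, and the version strings below are hypothetical, and merging the local package's own changes with a new upstream version of that package is outside the scope of this sketch.

```python
def derive_group(upstream_members, overrides):
    """Return upstream group members with local overrides applied.

    upstream_members: dict mapping a trove name to the upstream version.
    overrides: dict mapping a trove name to a locally changed version.
    """
    unknown = sorted(set(overrides) - set(upstream_members))
    if unknown:
        raise KeyError(f"override for troves not in the group: {unknown}")
    members = dict(upstream_members)   # everything else is taken as-is
    members.update(overrides)          # the "one changed line"
    return members


# Illustrative data; names and version strings are made up.
UPSTREAM_V1 = {"bash": "3.0-1", "glibc": "2.3-1", "mypkg": "1.0-1"}
UPSTREAM_V2 = {"bash": "3.0-2", "glibc": "2.3-1", "mypkg": "1.0-1"}
LOCAL_OVERRIDES = {"mypkg": "1.0-1.local"}

LOCAL_V1 = derive_group(UPSTREAM_V1, LOCAL_OVERRIDES)
# When upstream publishes a new group, the same override is reapplied.
LOCAL_V2 = derive_group(UPSTREAM_V2, LOCAL_OVERRIDES)
```

The derived group picks up the upstream bash update automatically while keeping the local package, and no other package had to be copied or branched to get there.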