Replicating Volumes (Creating Read-only Volumes)

Replication refers to creating a read-only copy of a read/write volume and distributing the copy to one or more additional file server machines. Replication makes a volume's contents accessible on more than one file server machine, which increases data availability. It can also increase system efficiency by reducing load on the network and File Server. Network load is reduced if a client machine's server preference ranks lead the Cache Manager to access the copy of a volume stored on the closest file server machine. Load on the File Server is reduced because it issues only one callback for all data fetched from a read-only volume, as opposed to a callback for each file fetched from a read/write volume. The single callback is sufficient for an entire read-only volume because the volume does not change except in response to administrator action, whereas each read/write file can change at any time.

Replicating a volume requires issuing two commands. First, use the vos addsite command to add one or more read-only site definitions to the volume's VLDB entry (a site is a particular partition on a file server machine). Then use the vos release command to clone the read/write source volume and distribute the clone to the defined read-only sites. You issue the vos addsite command only once for each read-only site, but must reissue the vos release command every time the read/write volume's contents change and you want to update the read-only volumes.

For users to have a consistent view of the file system, the release of updated volume contents to read-only sites must be atomic: either all read-only sites receive the new version of the volume, or all sites keep the version they currently have. The vos release command is designed to ensure that all copies of the volume's read-only version match both the read/write source and each other. In cases where problems such as machine or server process outages prevent successful completion of the release operation, AFS uses two mechanisms to alert you.

First, the command interpreter generates an error message on the standard error stream naming each read-only site that did not receive the new volume version. Second, during the release operation the Volume Location (VL) Server marks site definitions in the VLDB entry with flags (New release and Old release) that indicate whether or not the site has the new volume version. If any flags remain after the operation completes, it was not successful. The Cache Manager refuses to access a read-only site marked with the Old release flag, which potentially imposes a greater load on the sites marked with the New release flag. It is important to investigate and eliminate the cause of the failure and then to issue the vos release command as many times as necessary to complete the release without errors.

The pattern of site flags remaining in the volume's VLDB entry after a failed release operation can help determine the point at which the operation failed. Use the vos examine or vos listvldb command to display the VLDB entry. The VL Server sets the flags in concert with the Volume Server's operations, as follows:

  1. Before the operation begins, the VL Server sets the New release flag on the read/write site definition in the VLDB entry and the Old release flag on read-only site definitions (unless the read-only site has been defined since the last release operation and has no actual volume, in which case its site flag remains Not released).

  2. If necessary, the Volume Server creates a temporary copy (a clone) of the read/write source called the ReleaseClone (see the following discussion of when the Volume Server does or does not create a new ReleaseClone). It assigns the ReleaseClone its own volume ID number, which the VL Server records in the RClone field of the source volume's VLDB entry.

  3. The Volume Server distributes a copy of the ReleaseClone to each read-only site defined in the VLDB entry. As the site successfully receives the new clone, the VL Server sets the site's flag in the VLDB entry to New release.

  4. When all the read-only copies are successfully released, the VL Server clears all the New release site flags. The ReleaseClone is no longer needed, so the Volume Server deletes it and the VL Server erases its ID from the VLDB entry.
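The flag sequence above suggests a simple way to read a failed release from the site lines of the VLDB entry. The following is a minimal sketch, not part of AFS: the function name release_stage is hypothetical, and it assumes that vos examine prints each site as a line containing "RW Site" or "RO Site", with a flagged site ending in "-- New release", "-- Old release" or "-- Not released". Check the output format in your own cell before relying on it.

```shell
# Classify the state of a release from `vos examine` site lines read on stdin.
# Assumption: flagged site lines end in "-- New release", "-- Old release",
# or "-- Not released"; unflagged site lines carry no such suffix.
release_stage() {
    awk '
        / RO Site/ && /-- Old release/ { n_old++ }
        / RO Site/ && /-- New release/ { n_new++ }
        / RW Site/ && /-- New release/ { rw_new = 1 }
        END {
            if (n_old + n_new + rw_new == 0)
                print "clean: no release flags remain"
            else if (n_new == 0)
                print "failed before any read-only site was updated"
            else if (n_old > 0)
                print "failed partway through distribution"
            else
                print "all sites updated, but flags were not cleared"
        }
    '
}
```

A typical use would be to pipe the VLDB entry into the function, for example `vos examine <volume name or ID> | release_stage`. Sites marked Not released are ignored, because they had no volume to update before the operation began.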

By default, the Volume Server determines automatically whether or not it needs to create a new ReleaseClone.

To override the default behavior, forcing the Volume Server to create and release a new ReleaseClone to the read-only sites, include the -f flag. This is appropriate if, for example, the data at the read/write site has changed since the existing ReleaseClone was created during the previous release operation.

Using Read-only Volumes Effectively

For maximum effectiveness, replicate only volumes that satisfy two criteria:

  • The volume's contents are heavily used. Examples include a volume housing binary files for text editors or other popular application programs, and volumes mounted along heavily traversed directory paths such as the paths leading to user home directories. It is an inefficient use of disk space to replicate volumes for which the demand is low enough that a single File Server can easily service all requests.

  • The volume's contents change infrequently. As noted, file system consistency demands that the contents of read-only volumes must match each other and their read/write source at all times. Each time the read/write volume changes, you must issue the vos release command to update the read-only volumes. This can become tedious (and easy to forget) if the read/write volume changes frequently.

Explicitly mounting a read-only volume (creating a mount point that names a volume with a .readonly extension) is not generally necessary or appropriate. The Cache Manager has a built-in bias to access the read-only version of a replicated volume whenever possible. As described in more detail in The Rules of Mount Point Traversal, when the Cache Manager encounters a mount point it reads the volume name inside it and contacts the VL Server for a list of the sites that house the volume. In the normal case, if the mount point resides in a read-only volume and names a read/write volume (one that does not have a .readonly or .backup extension), the Cache Manager always attempts to access a read-only copy of the volume. Thus there is normally no reason to force the Cache Manager to access a read-only volume by mounting it explicitly.

It is a good practice to place a read-only volume at the read/write site, for a couple of reasons. First, the read-only volume at the read/write site requires only a small amount of disk space, because it is a clone rather than a copy of all of the data (see About Clones and Cloning). Only if a large number of files are removed or changed in the read/write volume does the read-only copy occupy much disk space. That normally does not happen because the appropriate response to changes in a replicated read/write volume is to reclone it. The other reason to place a read-only volume at the read/write site is that the Cache Manager does not attempt to access the read/write version of a replicated volume if all read-only copies become inaccessible. If the file server machine housing the read/write volume is the only accessible machine, the Cache Manager can access the data only if there is a read-only copy at the read/write site.

The number of read-only sites to define depends on several factors. Perhaps the main trade-off is between the level of demand for the volume's contents and how much disk space you are willing to use for multiple copies of the volume. Of course, each prospective read-only site must have enough available space to accommodate the volume. The limit on the number of read-only copies of a volume is determined by the maximum number of site definitions in a volume's VLDB entry, which is defined in the OpenAFS Release Notes. The site housing the read/write and backup versions of the volume counts as one site, and each read-only site counts as an additional site (even the read-only site defined on the same file server machine and partition as the read/write site counts as a separate site). Note also that the Volume Server permits only one read-only copy of a volume per file server machine.
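Two of the constraints above, one read-only copy per file server machine and enough free space at each site, are easy to check before defining sites. The following is a rough sketch under stated assumptions: the function names are hypothetical, the site list is a plain text list you maintain yourself (one "machine partition" pair per line, including any read-only site at the read/write site), and vos partinfo is assumed to print lines of the form "Free space on partition /vicepa: 27301 K blocks out of total 549197".

```shell
# Print each machine that appears more than once in a proposed list of
# read-only sites. Input (stdin): one "machine partition" pair per line.
duplicate_ro_machines() {
    awk '{ seen[$1]++ } END { for (m in seen) if (seen[m] > 1) print m }' | sort
}

# Extract the free kilobyte count from `vos partinfo` output on stdin.
# Assumption: the output line has the form
#   Free space on partition /vicepa: 27301 K blocks out of total 549197
partition_free_kb() {
    sed -n 's/^Free space on partition [^:]*: *\([0-9][0-9]*\) K blocks.*/\1/p'
}

# Succeed if a volume of $1 kilobytes fits into $2 kilobytes of free space.
fits() {
    [ "$1" -le "$2" ]
}
```

For example, `vos partinfo <machine name> <partition name> | partition_free_kb` prints a number that can be compared with the volume size reported by vos examine, and any output from duplicate_ro_machines indicates a site list that the Volume Server would reject.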

Replication Scenarios

The instructions in the following section explain how to replicate a volume for which no read-only sites are currently defined. However, you can also use the instructions in other common situations:

  • If you are releasing a new clone to sites that already exist, you can skip Step 2. It can still be useful to issue the vos examine command, however, to verify that the desired read-only sites are defined.

  • If you are adding new read-only sites to existing ones, perform all of the steps. In Step 3, issue the vos addsite command for the new sites only.

  • If you are defining sites but do not want to release a clone to them yet, stop after Step 3 and continue when you are ready.

  • If you are removing one or more sites before releasing a new clone to the remaining sites, follow the instructions for site removal in Removing Volumes and their Mount Points and then start with Step 4.

To replicate a read/write volume (create a read-only volume)

  1. Verify that you are listed in the /usr/afs/etc/UserList file. If necessary, issue the bos listusers command, which is fully described in To display the users in the UserList file.

       % bos listusers <machine name>
    
  2. Select one or more sites at which to replicate the volume. There are several factors to consider:

    • How many sites are already defined. As previously noted, it is usually appropriate to define a read-only site at the read/write site. Also, the Volume Server permits only one read-only copy of a volume per file server machine. To display the volume's current sites, issue the vos examine command, which is described fully in Displaying One Volume's VLDB Entry and Volume Header.

         % vos examine <volume name or ID>
      

      The final lines of output display the volume's site definitions from the VLDB.

    • Whether your cell dedicates any file server machines to housing read-only volumes only. In general, only very large cells use read-only server machines.

    • Whether a site has enough free space to accommodate the volume. A read-only volume requires the same amount of space as the read/write version (unless it is at the read/write site itself). The first line of output from the vos examine command displays the read/write volume's current size in kilobyte blocks, as shown in Displaying One Volume's VLDB Entry and Volume Header.

      To display the amount of space available on a file server machine's partitions, use the vos partinfo command, which is described fully in Creating Read/write Volumes.

         % vos partinfo <machine name> [<partition name>]
      
  3. Issue the vos addsite command to define each new read-only site in the VLDB.

       % vos addsite <machine name> <partition name> <volume name or ID>
    

    where

    ad

    Is the shortest acceptable abbreviation of addsite.

    machine name

    Defines the file server machine for the new site.

    partition name

    Names a disk partition on the machine machine name.

    volume name or ID

    Identifies the read/write volume to be replicated, either by its complete name or its volume ID number.

  4. (Optional) Verify that the fs process (which incorporates the Volume Server) is functioning normally on each file server machine where you have defined a read-only site, and that the vlserver process (the Volume Location Server) is functioning correctly on each database server machine. Knowing that they are functioning eliminates two possible sources of failure for the release. Issue the bos status command on each file server machine housing a read-only site for this volume and on each database server machine. The command is described fully in Displaying Process Status and Information from the BosConfig File.

       % bos status <machine name> fs vlserver
    
  5. Issue the vos release command to clone the read/write source volume and distribute the clone to each read-only site.

       % vos release <volume name or ID> [-f]
    

    where

    rel

    Is the shortest acceptable abbreviation of release.

    volume name or ID

    Identifies the read/write volume to clone, either by its complete name or volume ID number. The read-only version is given the same name with a .readonly extension. All read-only copies share the same read-only volume ID number.

    -f

    Creates and releases a brand new clone.

  6. (Optional) Issue the vos examine command to verify that no site definition in the VLDB entry is marked with an Old release or New release flag. The command is described fully in Displaying One Volume's VLDB Entry and Volume Header.

       % vos examine <volume name or ID>
    

If any flags appear in the output from Step 6, repeat Steps 4 and 5 until the Volume Server does not produce any error messages during the release operation and the flags no longer appear. Do not issue the vos release command when you know that the read/write site or any read-only site is inaccessible due to network, machine or server process outage.
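The core of the procedure (Steps 3, 5 and 6) can be sketched as a small shell function. This is an illustration under stated assumptions rather than a supported tool: the function name replicate_volume and the VOS variable are hypothetical, with VOS existing only so that the sequence can be exercised against a stub instead of the real vos binary. The flag check assumes that vos examine marks flagged sites with "-- Old release" or "-- New release". The function deliberately does not retry, because a failed release should be investigated before the command is reissued.

```shell
# Hypothetical override: set VOS to a stub command for dry runs.
: "${VOS:=vos}"

# Usage: replicate_volume <volume> [<machine> <partition> ...]
# Each <machine> <partition> pair names a NEW read-only site to define.
replicate_volume() {
    volume=$1
    shift
    # Step 3: define each new read-only site in the VLDB.
    while [ "$#" -ge 2 ]; do
        "$VOS" addsite "$1" "$2" "$volume" || return 1
        shift 2
    done
    # Step 5: clone the read/write source and distribute the clone.
    "$VOS" release "$volume" || return 1
    # Step 6: report failure if any site definition is still flagged.
    # Assumption: flagged sites end in "-- Old release" or "-- New release".
    if "$VOS" examine "$volume" | grep -Eq -- '-- (Old|New) release'; then
        echo "release of $volume incomplete: flags remain in the VLDB entry" >&2
        return 1
    fi
}
```

Calling the function with only a volume name skips site definition and releases to the sites that already exist, which matches the first scenario in Replication Scenarios.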