The Four Roles for File Server Machines

In cells that have more than one server machine, not all server machines have to perform exactly the same functions. There are four possible roles a machine can assume, determined by which server processes it is running: simple file server machine, database server machine, binary distribution machine, and system control machine. A machine can assume more than one role by running all of the relevant processes. Each role is described more completely in the sections that follow.

If a cell has a single server machine, it assumes the simple file server and database server roles. The instructions in the OpenAFS Quick Beginnings also have you configure it as the system control machine and binary distribution machine for its system type, but it does not actually perform those functions until you install another server machine.

It is best to keep the binaries for all of the AFS server processes in the /usr/afs/bin directory, even if not all processes are running. You can then change which roles a machine assumes simply by starting or stopping the processes that define the role.

Simple File Server Machines

A simple file server machine runs only the server processes that store and deliver AFS files to client machines, monitor process status, and pick up binaries and configuration files from the cell's binary distribution and system control machines.

In general, only cells with more than three server machines need to run simple file server machines. In cells with three or fewer machines, all of them are usually database server machines (to benefit from replicating the administrative databases); see Database Server Machines.

The following processes run on a simple file server machine:

  • The BOS Server (bosserver process)

  • The fs process, which combines the File Server, Volume Server, and Salvager processes so that they can coordinate their operations on the data in volumes and avoid the inconsistencies that can result from multiple simultaneous operations on the same data

  • A client portion of the Update Server that picks up binary files from the binary distribution machine of its AFS system type (the upclientbin process)

  • A client portion of the Update Server that picks up common configuration files from the system control machine (the upclientetc process)

Database Server Machines

A database server machine runs the four processes that maintain the AFS replicated administrative databases: the Authentication Server, Backup Server, Protection Server, and Volume Location (VL) Server, which maintain the Authentication Database, Backup Database, Protection Database, and Volume Location Database (VLDB), respectively. To review the functions of these server processes and their databases, see AFS Server Processes and the Cache Manager.

If a cell has more than one server machine, it is best to run more than one database server machine, but more than three are rarely necessary. Replicating the databases in this way yields the same benefits as replicating volumes: increased availability and reliability of information. If one database server machine or process goes down, the information in the database is still available from others. The load of requests for database information is spread across multiple machines, preventing any one from becoming overloaded.

Unlike replicated volumes, however, replicated databases do change frequently. Consistent system performance demands that all copies of the database always be identical, so it is not possible to record changes in only some of them. To synchronize the copies of a database, the database server processes use AFS's distributed database technology, Ubik. See Replicating the OpenAFS Administrative Databases.

It is critical that the AFS server processes on every server machine in a cell know which machines are the database server machines. The database server processes in particular must maintain constant contact with their peers in order to coordinate the copies of the database. The other server processes often need information from the databases. Every file server machine keeps a list of its cell's database server machines in its local /usr/afs/etc/CellServDB file. Cells that use the United States edition of AFS can use the system control machine to distribute this file (see The System Control Machine).

The following processes define a database server machine:

  • The Authentication Server (kaserver process)

  • The Backup Server (buserver process)

  • The Protection Server (ptserver process)

  • The VL Server (vlserver process)

Database server machines can also run the processes that define a simple file server machine, as listed in Simple File Server Machines. One database server machine can act as the cell's system control machine, and any database server machine can serve as the binary distribution machine for its system type; see The System Control Machine and Binary Distribution Machines.

Binary Distribution Machines

A binary distribution machine stores and distributes the binary files for the AFS processes and command suites to all other server machines of its system type. Each file server machine keeps its own copy of AFS server process binaries on its local disk, by convention in the /usr/afs/bin directory. For consistent system performance, however, all server machines must run the same version (build level) of a process. For instructions for checking a binary's build level, see Displaying A Binary File's Build Level. The easiest way to keep the binaries consistent is to have a binary distribution machine of each system type distribute them to its system-type peers.

The process that defines a binary distribution machine is the server portion of the Update Server (upserver process). The client portion of the Update Server (upclientbin process) runs on the other server machines of that system type and references the binary distribution machine.

Binary distribution machines usually also run the processes that define a simple file server machine, as listed in Simple File Server Machines. One binary distribution machine can act as the cell's system control machine, and any binary distribution machine can serve as a database server machine; see The System Control Machine and Database Server Machines.

The System Control Machine

The system control machine stores and distributes system configuration files shared by all of the server machines in the cell. Each file server machine keeps its own copy of the configuration files on its local disk, by convention in the /usr/afs/etc directory. For consistent system performance, however, all server machines must use the same files. The easiest way to keep the files consistent is to have the system control machine distribute them. You make changes only to the copy stored on the system control machine, as directed by the instructions in this document.

For a list of the configuration files stored in the /usr/afs/etc directory, see Common Configuration Files in the /usr/afs/etc Directory.

The OpenAFS Quick Beginnings configures a cell's first server machine as the system control machine. If you wish, you can reassign the role to a different machine that you install later, but you must then change the client portion of the Update Server (upclientetc) process running on all other server machines to refer to the new system control machine.

The following process defines the system control machine:

  • The server portion of the Update Server (upserver process)

The client portion of the Update Server (upclientetc process) runs on the other server machines and references the system control machine.

The system control machine can also run the processes that define a simple file server machine, as listed in Simple File Server Machines. It can also serve as a database server machine, and by convention acts as the binary distribution machine for its system type. A single upserver process can distribute both configuration files and binaries. See Database Server Machines and Binary Distribution Machines.
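The process lists in the preceding sections can be summarized as a small lookup table. The following Python sketch is illustrative only: the names ROLE_PROCESSES and processes_for are hypothetical helpers, not part of AFS, and the rule that a machine does not run the Update Server client for a role it serves itself is taken from the bos status examples later in this section.

```python
# Hypothetical summary of the process lists in this section.
# ROLE_PROCESSES and processes_for are illustrative names, not AFS tools.
ROLE_PROCESSES = {
    "simple file server": ["bosserver", "fs", "upclientbin", "upclientetc"],
    "database server": ["kaserver", "buserver", "ptserver", "vlserver"],
    "binary distribution": ["upserver"],
    "system control": ["upserver"],
}


def processes_for(roles):
    """Return the sorted process instances implied by a set of roles."""
    needed = set()
    for role in roles:
        needed.update(ROLE_PROCESSES[role])
    # A machine that distributes configuration files does not also pick
    # them up from itself, and likewise for binaries.
    if "system control" in roles:
        needed.discard("upclientetc")
    if "binary distribution" in roles:
        needed.discard("upclientbin")
    return sorted(needed)


if __name__ == "__main__":
    print(processes_for(["simple file server", "system control"]))
```

A table like this is only a planning aid; the BosConfig file on each machine remains the record of which processes the BOS Server actually runs.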

To locate database server machines

  1. Issue the bos listhosts command.

       % bos listhosts <machine name>
    

    The machines listed in the output are the cell's database server machines. For complete instructions and example output, see To display a cell's database server machines.

  2. (Optional) Issue the bos status command to verify that a machine listed in the output of the bos listhosts command is actually running the processes that define it as a database server machine. For complete instructions, see Displaying Process Status and Information from the BosConfig File.

       % bos status <machine name> buserver kaserver ptserver vlserver
    

    If the specified machine is a database server machine, the output from the bos status command includes the following lines:

       Instance buserver, currently running normally.
       Instance kaserver, currently running normally.
       Instance ptserver, currently running normally.
       Instance vlserver, currently running normally.
    

To locate the system control machine

  1. Issue the bos status command for any server machine. Complete instructions appear in Displaying Process Status and Information from the BosConfig File.

       % bos status <machine name> upserver upclientbin upclientetc -long
    

    The output you see depends on the machine you have contacted: a simple file server machine, the system control machine, or a binary distribution machine. See Interpreting the Output from the bos status Command.

To locate the binary distribution machine for a system type

  1. Issue the bos status command for a file server machine of the system type you are checking (to determine a machine's system type, issue the fs sysname or sys command as described in Displaying and Setting the System Type Name). Complete instructions for the bos status command appear in Displaying Process Status and Information from the BosConfig File.

       % bos status <machine name> upserver upclientbin upclientetc -long
    

    The output you see depends on the machine you have contacted: a simple file server machine, the system control machine, or a binary distribution machine. See Interpreting the Output from the bos status Command.

Interpreting the Output from the bos status Command

Interpreting the output of the bos status command is most straightforward for a simple file server machine. There is no upserver process, so the output includes the following message:

   bos: failed to get instance info for 'upserver' (no such entity)

A simple file server machine runs the upclientbin process, so the output includes a message like the following. It indicates that fs7.example.com is the binary distribution machine for this system type.

   Instance upclientbin, (type is simple) currently running normally.
   Process last started at Wed Mar 10  23:37:09 1999 (1 proc start)
   Command 1 is '/usr/afs/bin/upclient fs7.example.com -t 60 /usr/afs/bin'

A simple file server machine also runs the upclientetc process, so the output includes a message like the following. It indicates that fs1.example.com is the system control machine.

   Instance upclientetc, (type is simple) currently running normally.
   Process last started at Mon Mar 22  05:23:49 1999 (1 proc start)
   Command 1 is '/usr/afs/bin/upclient fs1.example.com -t 60 /usr/afs/etc'
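
The machine name and directory can be read out of a "Command 1 is" line mechanically. The following Python sketch assumes the line has the same shape as the two examples above (a single-quoted upclient command whose first argument is the machine name); the function name parse_upclient_command is hypothetical.

```python
import shlex


def parse_upclient_command(line):
    """Return (machine, directories) from a "Command 1 is '...'" line.

    Assumes the quoted command looks like the examples in this section:
    the upclient binary, then the machine it contacts, then options and
    one or more absolute directory paths.
    """
    start = line.index("'") + 1
    end = line.rindex("'")
    args = shlex.split(line[start:end])
    machine = args[1]
    # Treat every remaining argument that is an absolute path as a
    # directory being fetched; option values such as "60" are skipped.
    directories = [arg for arg in args[2:] if arg.startswith("/")]
    return machine, directories


if __name__ == "__main__":
    line = "   Command 1 is '/usr/afs/bin/upclient fs1.example.com -t 60 /usr/afs/etc'"
    print(parse_upclient_command(line))
```

A machine named alongside /usr/afs/bin is the binary distribution machine for the system type, and a machine named alongside /usr/afs/etc is the system control machine.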

The Output on the System Control Machine

If you have issued the bos status command for the system control machine, the output includes an entry for the upserver process similar to the following:

   Instance upserver, (type is simple) currently running normally.
   Process last started at Mon Mar 22 05:23:54 1999 (1 proc start)
   Command 1 is '/usr/afs/bin/upserver'

If you are using the default configuration recommended in the OpenAFS Quick Beginnings, the system control machine is also the binary distribution machine for its system type, and a single upserver process distributes both kinds of updates. In that case, the output includes the following messages:

   bos: failed to get instance info for 'upclientbin' (no such entity)
   bos: failed to get instance info for 'upclientetc' (no such entity)

If the system control machine is not a binary distribution machine, the output includes an error message for the upclientetc process, but a complete listing for the upclientbin process (in this case it refers to the machine fs5.example.com as the binary distribution machine):

   Instance upclientbin, (type is simple) currently running normally.
   Process last started at Mon Mar 22  05:23:49 1999 (1 proc start)
   Command 1 is '/usr/afs/bin/upclient fs5.example.com -t 60 /usr/afs/bin'
   bos: failed to get instance info for 'upclientetc' (no such entity)

The Output on a Binary Distribution Machine

If you have issued the bos status command for a binary distribution machine, the output includes an entry for the upserver process similar to the following and an error message for the upclientbin process:

   Instance upserver, (type is simple) currently running normally.
   Process last started at Mon Apr 5 05:23:54 1999 (1 proc start)
   Command 1 is '/usr/afs/bin/upserver'
   bos: failed to get instance info for 'upclientbin' (no such entity)

Unless this machine also happens to be the system control machine, a message like the following references the system control machine (in this case, fs3.example.com):

   Instance upclientetc, (type is simple) currently running normally.
   Process last started at Mon Apr 5 05:23:49 1999 (1 proc start)
   Command 1 is '/usr/afs/bin/upclient fs3.example.com -t 60 /usr/afs/etc'
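
The interpretation rules in this section reduce to checking which of the three instances have an entry in the output. The following Python sketch is one way to express them, assuming the output has been captured as text in the format shown above; the function name update_roles is hypothetical, and a machine that runs upserver together with both client instances is not covered by this section, so the sketch reports no role for it.

```python
import re

# An instance has an entry if a line starts with "Instance <name>,".
# The "failed to get instance info" error lines do not match.
ENTRY = re.compile(r"^\s*Instance (\w+),", re.MULTILINE)


def update_roles(bos_status_output):
    """Return the distribution roles implied by captured output of
    bos status <machine> upserver upclientbin upclientetc -long."""
    present = set(ENTRY.findall(bos_status_output))
    if "upserver" not in present:
        # No upserver entry: the machine only receives updates.
        return []
    roles = []
    if "upclientetc" not in present:
        roles.append("system control")
    if "upclientbin" not in present:
        roles.append("binary distribution")
    return roles


if __name__ == "__main__":
    sample = (
        "Instance upserver, (type is simple) currently running normally.\n"
        "bos: failed to get instance info for 'upclientbin' (no such entity)\n"
        "bos: failed to get instance info for 'upclientetc' (no such entity)\n"
    )
    print(update_roles(sample))
```

The sketch tests only for the presence of an entry, not for whether the process is currently running normally, so it describes how the machine is configured rather than its current health.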