Controlling and Checking Process Status

To define the AFS server processes that run on a server machine, use the bos create command to create entries for them in the local /usr/afs/local/BosConfig file. The BOS Server monitors the processes listed in the BosConfig file that are marked with the Run status flag, and automatically attempts to restart them if they fail. After creating process entries, you use other commands from the bos suite to stop and start processes or change the status flag as desired.

Never edit the BosConfig file directly; always use bos commands to change it. Similarly, it is not a good practice to run server processes without listing them in the BosConfig file, or to stop them using process termination commands such as the UNIX kill command.

The Information in the BosConfig File

A process's entry in the BosConfig file includes the following information:

  • The process's name. The recommended conventional names are defined in both the OpenAFS Quick Beginnings and Creating and Removing Processes. The name of a simple process usually matches the name of its binary file (for example, ptserver for the Protection Server).

  • Its type, which is one of the following:

    simple

    A process that runs independently of any other on the server machine. If several simple processes fail at the same time, the BOS Server can restart them in any order. All standard AFS processes except the fs process are simple.

    fs

    A process type reserved for the server process for which the conventional name is also fs. This process combines three components: the File Server, the Volume Server, and the Salvager.

    cron

    A process that runs at a defined time rather than continuously. There are no standard processes of this type.

  • Its status flag, which tells the BOS Server whether it performs the following two actions with respect to the process:

    • Start the process during BOS Server initialization

    • Restart the process if the process fails

    The two possible values are Run (which directs the BOS Server to perform these actions) and NotRun (which directs the BOS Server to ignore the process). The BOS Server itself never changes the setting of this flag, even if the process fails repeatedly. Also, this flag is for internal use only; it does not appear in the bos status command's output.

  • Its command parameters, which are the commands that the BOS Server runs to start the process.

    • A simple process has one: the complete pathname to its binary file

    • The fs process has three: the complete pathnames to each of the three component processes (/usr/afs/bin/fileserver, /usr/afs/bin/volserver, and /usr/afs/bin/salvager)

    • A cron process has two: the first is the complete pathname to its binary file, and the second is the time at which the BOS Server runs it
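The fields listed above can be pictured as a small record with one validation rule per process type. The following Python sketch is only an illustration of that structure: the class name ProcessEntry and its field names are hypothetical, and it does not reproduce the actual on-disk format of the BosConfig file.

```python
from dataclasses import dataclass
from typing import Tuple

# Number of command parameters the text gives for each process type.
EXPECTED_PARAMS = {"simple": 1, "fs": 3, "cron": 2}
STATUS_FLAGS = ("Run", "NotRun")


@dataclass(frozen=True)
class ProcessEntry:
    """Illustrative model of one process entry (not the on-disk format)."""
    name: str
    type: str
    status_flag: str
    command_params: Tuple[str, ...]

    def __post_init__(self):
        # Reject anything the text does not describe.
        if self.type not in EXPECTED_PARAMS:
            raise ValueError(f"unknown process type: {self.type!r}")
        if self.status_flag not in STATUS_FLAGS:
            raise ValueError(f"unknown status flag: {self.status_flag!r}")
        expected = EXPECTED_PARAMS[self.type]
        if len(self.command_params) != expected:
            raise ValueError(
                f"{self.type} process takes {expected} command parameter(s), "
                f"got {len(self.command_params)}"
            )


# A simple process: one parameter, the pathname of its binary file.
ptserver = ProcessEntry("ptserver", "simple", "Run", ("/usr/afs/bin/ptserver",))

# The fs process: three parameters, one per component process.
fs = ProcessEntry(
    "fs", "fs", "Run",
    ("/usr/afs/bin/fileserver", "/usr/afs/bin/volserver", "/usr/afs/bin/salvager"),
)
```

Constructing an entry with the wrong number of command parameters for its type (for example, a cron entry with no time) raises ValueError in this sketch.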

In addition to process definitions, the BosConfig file also records automatic restart times for processes that have new binaries, and for all server processes including the BOS Server. See Setting the BOS Server's Restart Times.

How the BOS Server Uses the Information in the BosConfig File

Whenever the BOS Server starts or restarts, it reads the BosConfig file to learn which processes it is to start and monitor. It transfers the information into kernel memory and does not read the BosConfig file again until it next restarts. This implies that the BOS Server's memory state can change independently of the BosConfig file. You can, for example, stop a process but leave its status flag in the BosConfig file as Run, or start a process even though its status flag in the BosConfig file is NotRun.
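The split between the flag recorded in the BosConfig file and the state the BOS Server holds in memory can be sketched as a toy model. Everything in the following Python example is hypothetical (the class BosServerModel and its method names are invented for illustration); it only mimics the behavior described above and is not how the BOS Server is implemented.

```python
class BosServerModel:
    """Toy model: flags recorded in BosConfig versus run state held in memory."""

    def __init__(self, config_flags):
        # config_flags maps process name -> "Run" or "NotRun" (the BosConfig flag).
        self.config_flags = dict(config_flags)
        self.memory_state = {}
        self.restart()

    def restart(self):
        """On start or restart, memory state is rebuilt from the config flags."""
        self.memory_state = dict(self.config_flags)

    def set_flag_and_state(self, name, flag):
        """Change both the config flag and the in-memory state."""
        self.config_flags[name] = flag
        self.memory_state[name] = flag

    def set_state_only(self, name, flag):
        """Change only the in-memory state; the config flag is left alone."""
        self.memory_state[name] = flag

    def is_running(self, name):
        return self.memory_state[name] == "Run"


bos = BosServerModel({"ptserver": "Run", "vlserver": "NotRun"})
bos.set_state_only("ptserver", "NotRun")  # stopped, but its flag is still Run
bos.set_state_only("vlserver", "Run")     # started, but its flag is still NotRun

# Processes whose memory state no longer matches the recorded flag.
diverged = {
    name for name in bos.config_flags
    if bos.config_flags[name] != bos.memory_state[name]
}

bos.restart()  # after a restart, the recorded flags determine the state again
```

In this model, both processes are in the diverged set before the restart, and the restart returns ptserver to running and vlserver to not running, because only the recorded flags survive a restart.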

About Starting and Stopping the Database Server Processes

When you start or stop a database server process (Authentication Server, Backup Server, Protection Server, or Volume Location Server) for more than a short time, you must follow the instructions in the OpenAFS Quick Beginnings for installing or removing a database server machine. Here is a summary of the tasks you must perform to preserve correct AFS functioning.

  • Start or stop all four database server processes on that machine. All AFS server processes and the Cache Manager processes expect all four database server processes to be running on each machine listed in the CellServDB file. There is no way to indicate in the file that a machine is running only some of the database server processes.

  • Add or remove the machine in the /usr/afs/etc/CellServDB file on all server machines and the /usr/vice/etc/CellServDB file on all client machines.

  • Restart the database server processes on the other database server machines to force an election of a new Ubik coordinator for each one.
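The first two tasks amount to a consistency rule: a machine listed in the CellServDB file runs all four database server processes, and a machine that is not listed runs none of them. The following Python sketch checks that rule against data you supply. It is an illustration only: the function name, the example machine names, and the input dictionaries are hypothetical, and the process names kaserver, buserver, and vlserver are assumed here as the conventional names (only ptserver is named in this section). The sketch does not query a live cell.

```python
# Assumed conventional names of the four database server processes.
DB_PROCESSES = frozenset({"kaserver", "buserver", "ptserver", "vlserver"})


def check_db_consistency(cellservdb_machines, running_by_machine):
    """Return a dict of machine -> problem description for rule violations."""
    problems = {}
    machines = set(cellservdb_machines) | set(running_by_machine)
    for machine in sorted(machines):
        running_db = DB_PROCESSES & set(running_by_machine.get(machine, ()))
        listed = machine in cellservdb_machines
        if listed and running_db != DB_PROCESSES:
            # Listed machines are expected to run all four processes.
            missing = ", ".join(sorted(DB_PROCESSES - running_db))
            problems[machine] = f"listed in CellServDB but not running: {missing}"
        elif not listed and running_db:
            # Unlisted machines are expected to run none of them.
            extra = ", ".join(sorted(running_db))
            problems[machine] = f"not in CellServDB but running: {extra}"
    return problems


problems = check_db_consistency(
    cellservdb_machines={"db1.example.com", "db2.example.com"},
    running_by_machine={
        "db1.example.com": {"kaserver", "buserver", "ptserver", "vlserver"},
        "db2.example.com": {"ptserver", "vlserver"},
        "fs1.example.com": {"fs"},
    },
)
```

With the sample data, only db2.example.com is reported, because it is listed in the CellServDB file but runs just two of the four processes.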

About Starting and Stopping the Update Server

In the conventional cell configuration, one server machine of each system type acts as a binary distribution machine, running the server portion of the Update Server (upserver process) to distribute the contents of its /usr/afs/bin directory. The other server machines of its system type run an instance of the Update Server client portion (by convention called upclientbin) that references the binary distribution machine.

It is conventional for the first server machine you install to act as the system control machine, running the server portion of the Update Server (upserver process) to distribute the contents of its /usr/afs/etc directory. All other server machines run an instance of the Update Server client portion (by convention called upclientetc) that references the system control machine.

It is simplest not to move binary distribution or system control responsibilities to a different machine unless you completely decommission a machine that is currently serving in one of those roles. Running the Update Server usually imposes very little processing load. If you must move the functionality, perform the following related tasks.

  • If you replace the system control machine, you must stop the upclientetc process on every other server machine and define a new one that references the new system control machine.

  • If you replace a binary distribution machine, you must stop the upclientbin process on every other server machine of its system type and define a new one that references the new binary distribution machine (unless you are no longer running any server machines of that system type).
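The replacement tasks above repeat the same few commands on every affected machine, so it can help to generate the command lines first and review them before running anything. The following Python sketch builds such a list without executing it. It rests on several assumptions that are not stated in this section: that the old entry is removed with bos delete so the conventional name can be reused, that the client binary is /usr/afs/bin/upclient and takes the source machine followed by the directory, and that the machine names shown are placeholders. Check each command against the bos and upclient reference pages before use.

```python
def plan_upclient_replacement(instance, directory, new_source, client_machines):
    """Return bos command lines (as argument lists) that redefine an upclient.

    instance:   conventional process name, e.g. "upclientetc" or "upclientbin"
    directory:  directory the client fetches, e.g. "/usr/afs/etc"
    new_source: machine that now runs the server portion (upserver)
    """
    plan = []
    for machine in client_machines:
        if machine == new_source:
            continue  # the new source runs upserver, not this client process
        # Stop the old client and wait for it to exit.
        plan.append(["bos", "stop", "-server", machine,
                     "-instance", instance, "-wait"])
        # Assumption: remove the old entry so the name can be reused.
        plan.append(["bos", "delete", "-server", machine, "-instance", instance])
        # Define a new client that references the new source machine.
        plan.append([
            "bos", "create", "-server", machine, "-instance", instance,
            "-type", "simple",
            "-cmd", f"/usr/afs/bin/upclient {new_source} {directory}",
        ])
    return plan


plan = plan_upclient_replacement(
    instance="upclientetc",
    directory="/usr/afs/etc",
    new_source="sc2.example.com",
    client_machines=["fs1.example.com", "sc2.example.com", "fs2.example.com"],
)
for command in plan:
    print(" ".join(command))
```

Keeping each command as an argument list rather than a single string means the -cmd value, which contains spaces, stays one argument if the list is later passed to a process launcher. For a binary distribution machine, the same function would be called with upclientbin, /usr/afs/bin, and only the machines of the relevant system type.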