Using the fstrace Command Suite

This section describes the fstrace commands that system administrators employ to trace Cache Manager activity for debugging purposes. It assumes the reader is familiar with the Cache Manager concepts described in Administering Client Machines and the Cache Manager.

The fstrace command suite monitors the internal activity of the Cache Manager and enables you to record, or trace, its operations in detail. The operations, which are termed events, comprise the cm event set. Examples of cm events are fetching files and looking up information for a listing of files and subdirectories using the UNIX ls command.

Following are the fstrace commands and their respective functions:

About the fstrace Command Suite

The fstrace command suite replaces and greatly expands the functionality formerly provided by the fs debug command. Its intended use is to aid in diagnosis of specific Cache Manager problems, such as client machine hangs, cache consistency problems, clock synchronization errors, and failures to access a volume or AFS file. Therefore, it is best not to keep fstrace logging enabled at all times, unlike the logging for AFS server processes.

Most of the messages in the trace log correspond to low-level Cache Manager operations. It is likely that only personnel familiar with the AFS source code can interpret them. If you have an AFS source license, you can attempt to interpret the trace yourself, or work with the AFS Product Support group to resolve the underlying problems. If you do not have an AFS source license, it is probably more efficient to contact the AFS Product Support group immediately in case of problems. They can instruct you to activate fstrace tracing if appropriate.

The log can grow in size very quickly; this can use valuable disk space if you are writing to a file in the local file space. Additionally, if the size of the log becomes too large, it can become difficult to parse the results for pertinent information.

When AFS tracing is enabled, each time a cm event occurs, a message is written to the trace log, cmfx. To diagnose a problem, read the output of the trace log and analyze the operations executed by the Cache Manager. The default size of the trace log is 60 KB, but you can increase or decrease it.

To use the fstrace command suite, you must first enable tracing and reserve, or allocate, space for the trace log with the fstrace setset command. With this command, you can set the cm event set to one of three states to enable or disable tracing for the event set and to allocate or deallocate space for the trace log in the kernel:

active

Enables tracing for the event set and allocates space for the trace log.

inactive

Temporarily disables tracing for the event set; however, the event set continues to allocate space occupied by the log to which it sends data.

dormant

Disables tracing for the event set; furthermore, the event set releases the space occupied by the log to which it sends data. When the cm event set that sends data to the cmfx trace log is in this state, the space allocated for that log is freed or deallocated.

Both event sets and trace logs can be designated as persistent, which prevents accidental resetting of an event set's state or clearing of a trace log. The designation is made as the kernel is compiled and cannot be changed.

If an event set such as cm is persistent, you can change its state only by including the -set argument to the fstrace setset command. (That is, you cannot change its state along with the state of all other event sets by issuing the fstrace setset command with no arguments.) Similarly, if a trace log such as cmfx is persistent, you can clear it only by including either the -set or -log argument to the fstrace clear command (you cannot clear it along with all other trace logs by issuing the fstrace clear command with no arguments.)

When a problem occurs, set the cm event set to active using the fstrace setset command. When tracing is enabled on a busy AFS client, the volume of events being recorded is significant; therefore, when you are diagnosing problems, restrict AFS activity as much as possible to minimize the amount of extraneous tracing in the log. Because tracing can have a negative impact on system performance, leave cm tracing in the dormant state when you are not diagnosing problems.

If a problem is reproducible, clear the cmfx trace log with the fstrace clear command and reproduce the problem. If the problem is not easily reproduced, keep the state of the event set active until the problem recurs.

To view the contents of the trace log and analyze the cm events, use the fstrace dump command to copy the content lines of the trace log to standard output (stdout) or to a file.

Note

If a particular command or process is causing problems, determine its process id (PID). Search the output of the fstrace dump command for the PID to find only those lines associated with the problem.

Requirements for Using the fstrace Command Suite

Except for the fstrace help and fstrace apropos commands, which require no privilege, issuing the fstrace commands requires that the issuer be logged in as the local superuser root on the local client machine. Before issuing an fstrace command, verify that you have the necessary privilege.

The Cache Manager catalog must be in place so that logging can occur. The fstrace command suite uses the standard UNIX catalog utilities. The default location is /usr/vice/etc/C/afszcm.cat. It can be placed in another directory by placing the file elsewhere and using the proper NLSPATH and LANG environment variables.

Using fstrace Commands Effectively

To use fstrace commands most effectively, configure them as indicated:

  • Store the fstrace binary in a local disk directory.

  • When you dump the fstrace log to a file, direct it to one on the local disk.

  • The trace can grow large in just a few minutes. Before attempting to dump the log to a local file, verify that you have enough room. Be particularly careful if you are using disk quotas on partitions in the local file system.

  • Attempt to limit Cache Manager activity on the AFS client machine other than the problem operation. This reduces the amount of extraneous data in the trace.

  • Activate the fstrace log for the shortest possibly period of time. If possible activate the trace immediately before performing the problem operation, deactivate it as soon as the operation completes, and dump the trace log to a file immediately.

  • If possible, obtain UNIX process ID (PID) of the command or program that initiates the problematic operation. This enables the person analyzing the trace log to search it for messages associated with the PID.

Activating the Trace Log

To start Cache Manager tracing on an AFS client machine, you must first configure

  • The cmfx kernel trace log using the fstrace setlog command

  • The cm event set using the fstrace setset command

The fstrace setlog command sets the size of the cmfx kernel trace log in kilobytes. The trace log occupies 60 kilobytes of kernel by default. If the trace log already exists, it is cleared when this command is issued and a new log of the given size is created. Otherwise, a new log of the desired size is created.

The fstrace setset command sets the state of the cm kernel event set. The state of the cm event set determines whether information on the events in that event set is logged.

After establishing kernel tracing on the AFS client machine, you can check the state of the event set and the size of the kernel buffer allocated for the trace log. To display information about the state of the cm event set, issue the fstrace lsset command. To display information about the cmfx trace log, use the fstrace lslog command. See the instructions in Displaying the State of a Trace Log or Event Set.

To configure the trace log

  1. Become the local superuser root on the machine, if you are not already, by issuing the su command.

       % su root
       Password: <root_password>
    
  2. Issue the fstrace setlog command to set the size of the cmfx kernel trace log.

       # fstrace setlog  [-log <log_name>+]  -buffersize <1-kilobyte_units>
    

The following example sets the size of the cmfx trace log to 80 KB.

   # fstrace setlog cmfx 80 

To set the event set

  1. Become the local superuser root on the machine, if you are not already, by issuing the su command.

       % su root
       Password: <root_password>
    
  2. Issue the fstrace setset command to set the state of event sets.

       % fstrace setset [-set <set_name>+] [-active] [-inactive]  \
                        [-dormant] 
    

The following example activates the cm event set.

   # fstrace setset cm -active

Displaying the State of a Trace Log or Event Set

An event set must be in the active state to be included in the trace log. To display an event set's state, use the fstrace lsset command. To set its state, issue the fstrace setset command as described in To set the event set.

To display size and allocation information for the trace log, issue the fstrace lslogcommand with the -long argument.

To display the state of an event set

  1. Become the local superuser root on the machine, if you are not already, by issuing the su command.

       % su root
       Password: <root_password>
    
  2. Issue the fstrace lsset command to display the available event set and its state.

       # fstrace lsset  [-set <set_name>+]
    

The following example displays the event set and its state on the local machine.

   # fstrace lsset cm
   Available sets:
   cm active

The output from this command lists the event set and its states. The three event states for the cm event set are:

active

Tracing is enabled.

inactive

Tracing is disabled, but space is still allocated for the corresponding trace log (cmfx).

dormant

Tracing is disabled, and space is no longer allocated for the corresponding trace log (cmfx).Disables tracing for the event set.

To display the log size

  1. Become the local superuser root on the machine, if you are not already, by issuing the su command.

       % su root
       Password: <root_password>
    
  2. Issue the fstrace lslog command to display information about the kernel trace log.

       # fstrace lslog  [-set <set_name>+]  [-log <log_name>]  [-long] 
    

The following example uses the -long flag to display additional information about the cmfx trace log.

   # fstrace lslog cmfx -long
   Available logs:
   cmfx : 60 kbytes (allocated)

The output from this command lists information on the trace log. When issued without the -long flag, the fstrace lslog command lists only the name of the log. When issued with the -long flag, the fstrace lslog command lists the log, the size of the log in kilobytes, and the allocation state of the log.

There are two allocation states for the kernel trace log:

allocated

Space is reserved for the log in the kernel. This indicates that the event set that writes to this log is either active (tracing is enabled for the event set) or inactive (tracing is temporarily disabled for the event set); however, the event set continues to reserve space occupied by the log to which it sends data.

unallocated

Space is not reserved for the log in the kernel. This indicates that the event set that writes to this log is dormant (tracing is disabled for the event set); furthermore, the event set releases the space occupied by the log to which it sends data.

Dumping and Clearing the Trace Log

After the Cache Manager operation you want to trace is complete, use the fstrace dump command to dump the trace log to the standard output stream or to the file named by the -file argument. Or, to dump the trace log continuously, use the -follow argument (combine it with the -file argument if desired). To halt continuous dumping, press an interrupt signal such as <Ctrl-c>.

To clear a trace log when you no longer need the data in it, issue the fstrace clear command. (The fstrace setlog command also clears an existing trace log automatically when you use it to change the log's size.)

To dump the contents of a trace log

  1. Become the local superuser root on the machine, if you are not already, by issuing the su command.

       % su root
       Password: <root_password>
    
  2. Issue the fstrace dump command to dump trace logs.

       # fstrace dump [-set <set_name>+]  [-follow <log_name>]  \
                      [-file <output_filename>]  \
                      [-sleep <seconds_between_reads>]
    

At the beginning of the output of each dump is a header specifying the date and time at which the dump began. The number of logs being dumped is also displayed if the -follow argument is not specified. The header appears as follows:

   AFS Trace Dump --
   Date: date time
   Found n logs.

where date is the starting date of the trace log dump, time is the starting time of the trace log dump, and n specifies the number of logs found by the fstrace dump command.

The following is an example of trace log dump header:

   AFS Trace Dump --
   Date: Fri Apr 16 10:44:38 1999
   Found 1 logs.

The contents of the log follow the header and are comprised of messages written to the log from an active event set. The messages written to the log contain the following three components:

  • The timestamp associated with the message (number of seconds from an arbitrary start point)

  • The process ID or thread ID associated with the message

  • The message itself

A trace log message is formatted as follows:

   time timestamp, pid pid:event message

where timestamp is the number of seconds from an arbitrary start point, pid is the process ID number of the Cache Manager event, and event message is the Cache Manager event which corresponds with a function in the AFS source code.

The following is an example of a dumped trace log message:

   time 749.641274, pid 3002:Returning code 2 from 19

For the messages in the trace log to be most readable, the Cache Manager catalog file needs to be installed on the local disk of the client machine; the conventional location is /usr/vice/etc/C/afszcm.cat. Log messages that begin with the string raw op, like the following, indicate that the catalog is not installed.

   raw op 232c, time 511.916288, pid 0
   p0:Fri Apr 16 10:36:31 1999

Every 1024 seconds, a current time message is written to each log. This message has the following format:

   time timestamp, pid pid: Current time: unix_time

where timestamp is the number of seconds from an arbitrary start point, pid is the process ID number, and unix_time is the standard time format since January 1, 1970.

The current time message can be used to determine the actual time associated with each log message. Determine the actual time as follows:

  1. Locate the log message whose actual time you want to determine.

  2. Search backward through the dump record until you come to a current time message.

  3. If the current time message's timestamp is smaller than the log message's timestamp, subtract the former from the latter. If the current time message's timestamp is larger than the log message's timestamp, add 1024 to the latter and subtract the former from the result.

  4. Add the resulting number to the current time message's unix_time to determine the log message's actual time.

Because log data is stored in a finite, circular buffer, some of the data can be overwritten before being read. If this happens, the following message appears at the appropriate place in the dump:

   Log wrapped; data missing.

Note

If this message appears in the middle of a dump, which can happen under a heavy work load, it indicates that not all of the log data is being written to the log or some data is being overwritten. Increasing the size of the log with the fstrace setlog command can alleviate this problem.

To clear the contents of a trace log

  1. Become the local superuser root on the machine, if you are not already, by issuing the su command.

       % su root
       Password: <root_password>
    
  2. Issue the fstrace clear command to clear logs by log name or by event set.

       # fstrace clear  [-set <set_name>+]  [-log <log_name>+]
    

The following example clears the cmfx log used by the cm event set on the local machine.

   # fstrace clear cm

The following example also clears the cmfx log on the local machine.

   # fstrace clear cmfx

Examples of fstrace Commands

This section contains an extensive example of the use of the fstrace command suite, which is useful for gathering a detailed trace of Cache Manager activity when you are working with AFS Product Support to diagnose a problem. The Product Support representative can guide you in choosing appropriate parameter settings for the trace.

Before starting the kernel trace log, try to isolate the Cache Manager on the AFS client machine that is experiencing the problem accessing the file. If necessary, instruct users to move to another machine so as to minimize the Cache Manager activity on this machine. To minimize the amount of unrelated AFS activity recorded in the trace log, place both the fstrace binary and the dump file must reside on the local disk, not in AFS. You must be logged in as the local superuser root to issue fstrace commands.

Before starting a kernel trace, issue the fstrace lsset command to check the state of the cm event set.

   # fstrace lsset cm

If tracing has not been enabled previously or if tracing has been turned off on the client machine, the following output is displayed:

   Available sets:
   cm inactive

If tracing has been turned off and kernel memory is not allocated for the trace log on the client machine, the following output is displayed:

   Available sets:
   cm inactive (dormant)

If the current state of the cm event set is inactive or inactive (dormant), turn on kernel tracing by issuing the fstrace setset command with the -active flag.

   # fstrace setset cm -active

If tracing is enabled currently on the client machine, the following output is displayed:

   Available sets:
   cm active

If tracing is enabled currently, you do not need to use the fstrace setset command. Do issue the fstrace clear command to clear the contents of any existing trace log, removing prior traces that are not related to the current problem.

   # fstrace clear cm

After checking on the state of the event set, issue the fstrace lslog command with the -long flag to check the current state and size of the kernel trace log .

   # fstrace lslog cmfx -long

If tracing has not been enabled previously or the cm event set was set to active or inactive previously, output similar to the following is displayed:

   Available logs:
   cmfx : 60 kbytes (allocated)

The fstrace tracing utility allocates 60 kilobytes of memory to the trace log by default. You can increase or decrease the amount of memory allocated to the kernel trace log by setting it with the fstrace setlog command. The number specified with the -buffersize argument represents the number of kilobytes allocated to the kernel trace log. If you increase the size of the kernel trace log to 100 kilobytes, issue the following command.

   # fstrace setlog cmfx 100

After ensuring that the kernel trace log is configured for your needs, you can set up a file into which you can dump the kernel trace log. For example, create a dump file with the name cmfx.dump.file.1 using the following fstrace dump command. Issue the command as a continuous process by adding the -follow and -sleep arguments. Setting the -sleep argument to 10 dumps output from the kernel trace log to the file every 10 seconds.

   # fstrace dump -follow cmfx -file cmfx.dump.file.1 -sleep 10
   AFS Trace Dump -
      Date: Fri Apr 16 10:54:57 1999
   Found 1 logs.
   time 32.965783, pid 0: Fri Apr 16 10:45:52 1999
   time 32.965783, pid 33657: Close 0x5c39ed8 flags 0x20 
   time 32.965897, pid 33657: Gn_close vp 0x5c39ed8 flags 0x20 (returns
   0x0) 
   time 35.159854, pid 10891: Breaking callback for 5bd95e4 states 1024
   (volume 0)
   time 35.407081, pid 10891: Breaking callback for 5c0fadc states 1024
   (volume 0)
       .                         .
       .                         .
       .                         .
   time 71.440456, pid 33658: Lookup adp 0x5bbdcf0 name g3oCKs fid (756
   4fb7e:588d240.2ff978a8.6) 
   time 71.440569, pid 33658: Returning code 2 from 19 
   time 71.440619, pid 33658: Gn_lookup vp 0x5bbdcf0 name g3oCKs (returns
   0x2) 
   time 71.464989, pid 38267: Gn_open vp 0x5bbd000 flags 0x0 (returns 0x
   0) 
   AFS Trace Dump - Completed