Maintaining the Backup Database

The Backup Database stores all of the configuration and tracking information that the Backup System uses when dumping and restoring data. If a hardware failure or other problem on a database server machine corrupts or damages the database, it is relatively easy to recreate the configuration information (the dump hierarchy and lists of volume sets and Tape Coordinator port offset numbers). However, restoring the dump tracking information (dump records) is more complicated and time-consuming. To protect yourself against loss of data, back up the Backup Database itself to tape on a regular schedule.

Another potential concern is that the Backup Database can grow large rather quickly, because the Backup System keeps very detailed and cross-referenced records of dump operations. Backup operations become less efficient if the Backup Server has to navigate through a large number of obsolete records to find the data it needs. To keep the database to a manageable size, use the backup deletedump command to delete obsolete records, as described in Removing Obsolete Records from the Backup Database. If you later find that you have removed records that you still need, you can use the backup scantape command to read the information from the dump and tape labels on the corresponding tapes back into the database, as instructed in To scan the contents of a tape.

Backing Up and Restoring the Backup Database

Because of the importance of the information in the Backup Database, it is best to back it up to tape or other permanent media on a regular basis. As for the other AFS administrative databases, the recommended method is to use a utility designed to back up a machine's local disk, such as the UNIX tar command. For instructions, see Backing Up and Restoring the Administrative Databases.
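
As a minimal sketch of the local-disk approach, the function below archives the two Backup Database files with tar. It assumes the files are named bdb.DB0 and bdb.DBSYS1 and reside in /usr/afs/db, the names used in the repair procedure later in this section. The function name archive_backup_db and the archive path in the usage line are hypothetical. The referenced instructions remain the authority on when it is safe to copy the files.

```shell
#!/bin/sh
# Sketch: archive the Backup Database files from a database directory.
# archive_backup_db is a hypothetical helper, not an AFS command.
archive_backup_db() {
    db_dir=$1      # directory holding the database files, e.g. /usr/afs/db
    archive=$2     # tar file to create
    # Fail early if either database file is missing.
    for f in bdb.DB0 bdb.DBSYS1; do
        [ -f "$db_dir/$f" ] || { echo "missing $db_dir/$f" >&2; return 1; }
    done
    # -C changes to the directory so the archive stores bare file names.
    tar -cf "$archive" -C "$db_dir" bdb.DB0 bdb.DBSYS1
}
```

A call might look like: archive_backup_db /usr/afs/db /var/tmp/bdb-backup.tar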

In the rare event that the Backup Database seems damaged or corrupted, you can use the backup dbverify command to check its status. If it is corrupted, use the backup savedb command to repair some types of damage. Then use the backup restoredb command to return the corrected database to the local disks of the database server machines. For instructions, see Checking for and Repairing Corruption in the Backup Database.

Checking for and Repairing Corruption in the Backup Database

In rare cases, the Backup Database can become damaged or corrupted, perhaps because of disk or other hardware errors. Use the backup dbverify command to check the integrity of the database. If it is corrupted, the most efficient way to repair it is to use the backup savedb command to copy the database to tape. The command automatically repairs several types of corruption, and you can then use the backup restoredb command to transfer the repaired copy of the database back to the local disks of the database server machines.

The backup savedb command also removes orphan blocks, which are ranges of memory that the Backup Server preallocated in the database but cannot use. Orphan blocks do not interfere with database access, but do waste disk space. The backup dbverify command reports the existence of orphan blocks if you include the -detail flag.

To verify the integrity of the Backup Database

  1. Verify that you are authenticated as a user listed in the /usr/afs/etc/UserList file. If necessary, issue the bos listusers command, which is fully described in To display the users in the UserList file.

       % bos listusers <machine name>
    
  2. Issue the backup dbverify command to check the integrity of the Backup Database.

       % backup dbverify [-detail]
    

    where

    db

    Is the shortest acceptable abbreviation of dbverify.

    -detail

    Reports the existence of orphan blocks and other information about the database, as described on the backup dbverify reference page in the OpenAFS Administration Reference.

    The output reports one of the following messages:

    • Database OK indicates that the Backup Database is undamaged.

    • Database not OK indicates that the Backup Database is damaged. To recover from the problem, use the instructions in To repair corruption in the Backup Database.
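
The check in Step 2 can be scripted. The sketch below classifies the text produced by backup dbverify using the two messages listed above. The helper name classify_dbverify is hypothetical, and the sketch assumes that the message appears verbatim somewhere in the command's standard output.

```shell
#!/bin/sh
# Sketch: classify backup dbverify output read from standard input.
# Prints "ok", "damaged", or "unknown"; returns 0 only for "ok".
classify_dbverify() {
    output=$(cat)
    case $output in
        *"Database not OK"*) echo damaged; return 1 ;;
        *"Database OK"*)     echo ok;      return 0 ;;
        *)                   echo unknown; return 2 ;;
    esac
}
```

A call might look like: backup dbverify | classify_dbverify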

To repair corruption in the Backup Database

  1. Log in as the local superuser root on each database server machine in the cell.

  2. If the Tape Coordinator for the tape device that is to perform the operation is not already running, open a connection to the appropriate Tape Coordinator machine and issue the butc command, for which complete instructions appear in To start a Tape Coordinator process.

       % butc [<port offset>] [-noautoquery]
    
  3. If writing to tape, place a tape in the appropriate device.

  4. Working on one of the machines, issue the backup command to enter interactive mode.

       # backup -localauth
    

    where -localauth constructs a server ticket from the local /usr/afs/etc/KeyFile file. This flag enables you to issue a privileged command while logged in as the local superuser root but without AFS administrative tokens.

  5. Verify that no backup operations are actively running. If necessary, issue the (backup) status command as described in To check the status of a Tape Coordinator process. Repeat for each Tape Coordinator port offset in turn.

       backup> status -portoffset <TC port offset>
    
  6. Issue the (backup) savedb command to repair corruption in the database as it is written to tape or a file.

       backup> savedb [-portoffset <TC port offset>]
    

    where

    sa

    Is the shortest acceptable abbreviation of savedb.

    -portoffset

    Specifies the port offset number of the Tape Coordinator handling the tape or backup data file for this operation. You must provide this argument unless the default value of 0 (zero) is appropriate.

  7. Exit interactive mode.

       backup> quit
    
  8. On each machine in turn, issue the bos shutdown command to shut down the Backup Server process. Include the -localauth flag because you are logged in as the local superuser root, but do not necessarily have administrative tokens. For complete command syntax, see To stop processes temporarily.

       # /usr/afs/bin/bos shutdown <machine name> buserver -localauth -wait
    
  9. On each machine in turn, issue the following commands to remove the Backup Database.

       # cd /usr/afs/db
       # rm bdb.DB0
       # rm bdb.DBSYS1
    
  10. On each machine in turn, starting with the machine with the lowest IP address, issue the bos start command to restart the Backup Server process, which creates a zero-length copy of the Backup Database as it starts. For complete command syntax, see To start processes by changing their status flags to Run.

       # /usr/afs/bin/bos start <machine name> buserver -localauth
    
  11. Working on one of the machines, issue the backup command to enter interactive mode.

       # backup -localauth
    

    where -localauth constructs a server ticket from the local /usr/afs/etc/KeyFile file.

  12. Issue the (backup) addhost command to create an entry in the new, empty database for the Tape Coordinator process handling the tape or file from which you are reading the repaired copy of the database (presumably the process you started in Step 2 and which performed the backup savedb operation in Step 6). For complete syntax, see Step 8 in To configure a Tape Coordinator machine.

       backup> addhost <tape machine name> [<TC port offset>]
    
  13. Issue the (backup) restoredb command to copy the repaired database to the database server machines.

       backup> restoredb [-portoffset <TC port offset>]
    

    where

    res

    Is the shortest acceptable abbreviation of restoredb.

    -portoffset

    Specifies the port offset number of the Tape Coordinator handling the tape or backup data file for this operation. You must provide this argument unless the default value of 0 (zero) is appropriate.

  14. (Optional) Exit interactive mode if you do not plan to issue any additional backup commands.

       backup> quit
    
  15. (Optional) If desired, enter Ctrl-d or issue the exit command to leave the root shell on each database server machine. You can also issue the Ctrl-c signal on the Tape Coordinator machine to stop the process.
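
The order of operations in this procedure matters, so it can help to write the plan out before running anything. The following sketch only prints non-interactive equivalents of Steps 6 through 13; it does not execute them. The function name print_repair_plan, the machine names in the usage line, and the choice to pass -localauth on each backup command line instead of using interactive mode are assumptions made for illustration. List the database server machines in ascending IP address order, as Step 10 requires.

```shell
#!/bin/sh
# Sketch: print (not run) the repair sequence for a given Tape Coordinator.
# print_repair_plan is a hypothetical helper; review the output before use.
print_repair_plan() {
    tape_host=$1     # machine running the Tape Coordinator
    port_offset=$2   # Tape Coordinator port offset
    shift 2          # remaining arguments: database server machines
    # Step 6: save (and thereby repair) the database.
    echo "backup savedb -portoffset $port_offset -localauth"
    # Step 8: stop the Backup Server on every database server machine.
    for m in "$@"; do
        echo "/usr/afs/bin/bos shutdown $m buserver -localauth -wait"
    done
    # Step 9: remove the old database files on every machine.
    for m in "$@"; do
        echo "(on $m) rm /usr/afs/db/bdb.DB0 /usr/afs/db/bdb.DBSYS1"
    done
    # Step 10: restart the Backup Server, in the order given.
    for m in "$@"; do
        echo "/usr/afs/bin/bos start $m buserver -localauth"
    done
    # Steps 12 and 13: re-register the Tape Coordinator, then restore.
    echo "backup addhost -tapehost $tape_host -portoffset $port_offset -localauth"
    echo "backup restoredb -portoffset $port_offset -localauth"
}
```

A call might look like: print_repair_plan tape1.example.com 0 db1.example.com db2.example.com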

Removing Obsolete Records from the Backup Database

Whenever you recycle or relabel a tape using the backup dump or backup labeltape command, the Backup System automatically removes all of the dump records for the dumps contained on the tape and all other tapes in the dump set. However, obsolete records can still accumulate in the Backup Database over time. For example, when you discard a backup tape after using it the maximum number of times recommended by the manufacturer, the records for dumps on it remain in the database. Similarly, the Backup System does not remove a dump's record automatically when the dump reaches its expiration date; it removes the record only when you subsequently recycle or relabel the tape that contains the dump. Finally, if a backup operation halts in the middle, the records for any volumes successfully written to tape before the halt remain in the database.

A very large Backup Database can make backup operations less efficient because the Backup Server has to navigate through a large number of records to find the ones it needs. To remove obsolete records, use the backup deletedump command. Either identify individual dumps by dump ID number, or specify the removal of all dumps created during a certain time period. Keep in mind that you cannot remove the record of an appended dump except by removing the record of its initial dump, which removes the records of all associated appended dumps. Removing records of a dump makes it impossible to restore data from the corresponding tapes or from any dump that refers to the deleted dump as its parent, directly or indirectly. That is, restore operations must begin with the full dump and continue with each incremental dump in order. If you have removed the records for a specific dump, you cannot restore any data from later incremental dumps.
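
Before deleting a record, it can be useful to see which other dumps depend on it. The sketch below walks each dump's chain of parents and lists every dump that refers to the deleted dump directly or indirectly. The two-column input ("dump ID, parent ID", with a parent ID of 0 for a full dump) is an assumed, simplified format prepared by hand, and affected_dumps is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: list dumps that can no longer be restored once one record is deleted.
# Input on stdin: "<dumpid> <parentid>" per line; parentid 0 marks a full dump.
# affected_dumps is a hypothetical helper, not an AFS command.
affected_dumps() {
    awk -v gone="$1" '
        { parent[$1] = $2; order[NR] = $1 }
        END {
            for (i = 1; i <= NR; i++) {
                id = order[i]
                p = parent[id]
                steps = 0
                # Walk up the parent chain; the step limit guards against
                # malformed input that contains a cycle.
                while (p != 0 && p != "" && steps++ < NR) {
                    if (p == gone) { print id; break }
                    p = parent[p]
                }
            }
        }'
}
```

With a full dump 100, an incremental 101 based on 100, and an incremental 102 based on 101, the call affected_dumps 100 lists both 101 and 102.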

Another way to truncate the Backup Database is to include the -archive argument to the backup savedb command. After a copy of the database is written to tape or to a backup data file, the Backup Server deletes the dump records for all dump operations with timestamps prior to the date and time you specify. However, issuing the backup deletedump command with only the -to argument is equivalent in effect and is simpler because it does not require starting a Tape Coordinator process as the backup savedb command does. For further information on the -archive argument to the backup savedb command, see the command's reference page in the OpenAFS Administration Reference.

If you later need to access deleted dump records, and the corresponding tapes still exist, you can use the -dbadd argument to the backup scantape command to scan their contents into the database, as instructed in To scan the contents of a tape.

To delete dump records from the Backup Database

  1. Verify that you are authenticated as a user listed in the /usr/afs/etc/UserList file. If necessary, issue the bos listusers command, which is fully described in To display the users in the UserList file.

       % bos listusers <machine name>
    
  2. (Optional) Issue the backup command to enter interactive mode, if you want to delete multiple records or issue additional commands. The interactive prompt appears in the following steps.

       % backup
    
  3. (Optional) Issue the backup dumpinfo command to list information from the Backup Database that can help you decide which records to delete. For detailed instructions, see To display dump records.

       backup> dumpinfo [<no. of dumps>] [-id <dump id>] [-verbose]
    
  4. Issue the backup deletedump command to delete one or more dump sets.

       backup> deletedump [-dumpid <dumpid>+] [-from <date time>]  \
                          [-to <date time>] 
    

    where

    dele

    Is the shortest acceptable abbreviation of deletedump.

    -dumpid

    Specifies the dump ID of each initial dump to delete from the Backup Database. The records for all associated appended dumps are also deleted. Provide either this argument or the -to (and optionally, -from) argument.

    -from

    Specifies the beginning of a range of dates; the record for any dump created during the indicated period of time is deleted.

    To delete all records created before the time indicated with the -to argument, omit this argument. Otherwise, provide a value in the following format:

    mm/dd/yyyy [hh:MM]

    where the month (mm), day (dd), and year (yyyy) are required. You can omit the hour and minutes (hh:MM) to indicate the default of midnight (00:00 hours). If you provide them, use 24-hour format (for example, the value 14:36 represents 2:36 p.m.).

    You must provide the -to argument along with this one.

    Note

    A plus sign follows this argument in the command's syntax statement because it accepts a multiword value, which does not need to be enclosed in double quotes or other delimiters, not because it accepts multiple dates. Provide only one date (and optionally, time) definition.

    -to

    Specifies the end of a range of dates; the record of any dump created during the range is deleted from the Backup Database.

    To delete all records created after the date you specify with the -from argument, specify the value NOW. To delete every dump record in the Backup Database, provide the value NOW and omit the -from argument. Otherwise, provide a date value in the same format as described for the -from argument. Valid values for the year (yyyy) range from 1970 to 2037; higher values are not valid because the latest possible date in the standard UNIX representation is in early 2038. The command interpreter automatically reduces any later date to the maximum value in 2038.

    If you omit the time portion (hh:MM), it defaults to 59 seconds after midnight (00:00:59 hours). Similarly, the backup command interpreter automatically adds 59 seconds to any time value you provide. In both cases, adding 59 seconds compensates for how the Backup Database and backup dumpinfo command represent dump creation times in hours and minutes only. For example, the Database records a creation timestamp of 20:55 for any dump operation that begins between 20:55:00 and 20:55:59. Automatically adding 59 seconds to a time thus includes the records for all dumps created during that minute.

    Provide either this argument, or the -dumpid argument. This argument is required if the -from argument is provided.

    Note

    A plus sign follows this argument in the command's syntax statement because it accepts a multiword value, which does not need to be enclosed in double quotes or other delimiters, not because it accepts multiple dates. Provide only one date (and optionally, time) definition.
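
To make the 59-second adjustment for the -to argument concrete, the sketch below prints the cutoff that a given value stands for. The helper name effective_to is hypothetical. It checks only the shape of the value, assuming two-digit month, day, hour, and minute fields; it does not check that the date exists on the calendar or falls within the valid range of years.

```shell
#!/bin/sh
# Sketch: show the cutoff that a -to value stands for after the
# 59-second adjustment. effective_to is a hypothetical helper.
effective_to() {
    value=$1
    if [ "$value" = NOW ]; then
        echo NOW
        return 0
    fi
    # Accept mm/dd/yyyy with an optional hh:MM part.
    echo "$value" | grep -Eq '^[0-9]{2}/[0-9]{2}/[0-9]{4}( [0-9]{2}:[0-9]{2})?$' || {
        echo "expected mm/dd/yyyy [hh:MM]" >&2
        return 1
    }
    case $value in
        *" "*) echo "$value:59" ;;        # time given: add 59 seconds
        *)     echo "$value 00:00:59" ;;  # date only: 59 seconds past midnight
    esac
}
```

For example, effective_to "04/23/1999 20:55" prints 04/23/1999 20:55:59, so a dump whose creation time is recorded as 20:55 on that date falls inside the range. The date is illustrative only.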