The School of Informatics and Computing manages backups on a variety of central school servers as well as individual research servers. All system backups managed by the IT staff are described on the Backup Schedules and Methods page. If you have a system that needs to be backed up and isn't listed there, please contact us.
We use two primary backup methods: the Tivoli Storage Manager (TSM) software, or local scripts that write to the Massive Data Storage Service (MDSS). We can also assist users who prefer to do their own backups or who have unusual requirements that do not fit well within the normally supported options.
Also note that we do not generally back up data on individual workstations. You are encouraged to save any data that needs to be backed up on the central networked storage servers. Please contact us if you have any questions about the storage and backup options for workstations.
What follows is a discussion of the available options along with their advantages, disadvantages, and limitations.
Option 1: Tivoli Storage Manager (TSM)
Tivoli Storage Manager (TSM) is a backup system provided by UITS as part of the Intelligent Infrastructure (II) program. The backup client software is installed on the server and the data is backed up to UITS owned and managed storage hardware.
With TSM, a single full backup of the data is performed when the server is added to the system, and only nightly incremental backups are done after that. As long as a file remains on the server, a configurable number of copies of that file are saved, subject to the retention window. When a file is removed, its saved copies are still retained as defined by the retention window. When you set this up, you define the retention window and the number of copies of each file to retain. By default, TSM uses a 35-day retention window and keeps 2 copies of each file, but both parameters are configurable.
The advantage of this system is that after the first full backup it only does nightly incremental backups. An unchanged file is therefore never backed up more than once, which keeps the load on the system to a minimum. Backups are also stored in a secure facility and automatically replicated to multiple data centers in Bloomington and Indianapolis.
The primary disadvantage of this system is that it isn't free. You have to purchase the software ($250/CPU), pay yearly maintenance on the software ($50/CPU/year), and pay data transfer and storage fees. See the TSM Cost Estimator Tool if you want to get a better idea of the costs involved.
If you want long-term archiving of data (e.g., many months or years), you can also arrange for periodic snapshots that are retained as long as required, subject to the associated storage charges.
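As a rough illustration of the software portion of these costs, the sketch below adds the one-time license fee to the yearly maintenance fee. The function name is hypothetical, and the assumption that maintenance is billed for every year of use, including the first, is ours rather than a stated UITS rule. Data transfer and storage fees are not modeled, so the TSM Cost Estimator Tool remains the place to get real figures.

```python
def tsm_software_cost(cpus: int, years: int,
                      license_per_cpu: float = 250.0,
                      maintenance_per_cpu_year: float = 50.0) -> float:
    """Estimate TSM software cost in dollars.

    Covers only the license ($250/CPU) and maintenance ($50/CPU/year) quoted
    above. ASSUMPTION: maintenance is charged for each of `years` years.
    Data transfer and storage fees are deliberately left out.
    """
    if cpus < 0 or years < 0:
        raise ValueError("cpus and years must be non-negative")
    return cpus * (license_per_cpu + maintenance_per_cpu_year * years)


# Example: a 2-CPU server kept on TSM for 3 years.
# 2 * (250 + 50 * 3) = 800
print(tsm_software_cost(cpus=2, years=3))
```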
The TSM system supports Linux, Windows, and Mac OS X servers.
Option 2: Local Backup Scripts Utilizing the MDSS
We have tar-based backup scripts that are well suited to the task of doing periodic backups of research servers. With this system, you can specify the directory to be backed up, the frequency of backups, and the retention window. It also allows pre- and post-backup commands to be specified for doing things like database and repository dumps prior to the backups.
One big advantage of this system is that it is free. There is no cost to use the software, and the storage space on the MDSS is provided at no cost by IU as long as you do not need many TBs of storage space. This provides effective disaster recovery as well as longer-term archiving, and you can specify arbitrary backup schedules and retention windows. Also, data from the backups is written directly to the MDSS without first being written locally, so there is no need for local storage space on the server to stage backups.
A disadvantage of this system is that every backup is a full backup, so doing nightly backups of large filesystems may not be practical. The frequency at which we can do backups will be a function of your needs weighed against the load the backups place on your server and the storage backend. However, weekly backups of filesystems on the order of 100-200 GB, or monthly backups of data on the order of 1 TB, are certainly feasible.
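Because every backup is a full copy, the space held on the MDSS grows with both the backup frequency and the retention window. The following back-of-the-envelope sketch shows the arithmetic. The function names are hypothetical, and it assumes uncompressed archives of constant size and a backup that expires once it reaches the retention age, so treat the result as an approximate upper bound rather than a measurement.

```python
import math


def retained_full_backups(retention_days: int, interval_days: int) -> int:
    """Approximate number of full backups held at any one time."""
    if retention_days <= 0 or interval_days <= 0:
        raise ValueError("retention_days and interval_days must be positive")
    return math.ceil(retention_days / interval_days)


def storage_needed_gb(data_gb: float, retention_days: int, interval_days: int) -> float:
    """Approximate MDSS space used, ignoring compression (an assumption)."""
    return data_gb * retained_full_backups(retention_days, interval_days)


# Weekly backups of a 200 GB filesystem kept for about 3 months (90 days):
# ceil(90 / 7) = 13 full copies, so roughly 13 * 200 = 2600 GB.
print(retained_full_backups(90, 7), storage_needed_gb(200, 90, 7))
```

A calculation like this is one way to judge whether a requested schedule stays well below the "many TBs" range mentioned above.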
If you would like to have backups of data on your server set up, please let us know the following:
- The directories to be backed up
- The required backup frequency (e.g., weekly, monthly, or biannually)
- The retention window (e.g., 3 months or 1 year)
- Whether you have things like MySQL databases or Subversion repositories that require special dumps to prepare them for backup.
- Files or directories that should be excluded from the backups. See the Exclude Notes below for notes on files and directories that are automatically excluded from backups.
We will work with you to define a backup schedule that satisfies your requirements without imposing undue load on your server or the MDSS.
Note that this system is currently only deployed on Linux servers but it should be possible to use the same system on Mac OS X systems as well as Windows systems with Cygwin.
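To make the pieces of a backup request concrete, here is a minimal sketch of how such a job could be described and run. It is not our actual script: the BackupJob class, its field names, and run_backup are hypothetical, and the transfer to the MDSS is represented only by a generic writable stream. It illustrates running pre-backup commands (such as a database dump), streaming a tar archive without staging it on local disk, and then running post-backup commands.

```python
from __future__ import annotations

import io
import subprocess
import tarfile
from dataclasses import dataclass, field
from typing import BinaryIO


@dataclass
class BackupJob:
    """Hypothetical description of one backup request."""
    directories: list[str]
    frequency: str                      # e.g. "weekly"
    retention: str                      # e.g. "3 months"
    pre_commands: list[list[str]] = field(default_factory=list)
    post_commands: list[list[str]] = field(default_factory=list)


def run_backup(job: BackupJob, sink: BinaryIO) -> None:
    """Run pre-commands, stream a gzip-compressed tar to `sink`, run post-commands."""
    for cmd in job.pre_commands:        # e.g. dump a database to a file first
        subprocess.run(cmd, check=True)
    # "w|gz" is tarfile's streaming mode: it never seeks, so the archive can be
    # written straight to a pipe or socket instead of a local staging file.
    with tarfile.open(fileobj=sink, mode="w|gz") as archive:
        for directory in job.directories:
            archive.add(directory)
    for cmd in job.post_commands:       # e.g. remove the temporary dump
        subprocess.run(cmd, check=True)
```

In a real deployment the sink would be whatever stream carries data to the storage backend; for a quick local trial an io.BytesIO() buffer or an open file works.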
Exclude Notes
We exclude a number of things from our standard backups in order to eliminate the unnecessary backup of temporary and cache data. Furthermore, we have a NOBACKUP mechanism in place that lets individual users exclude their own data from the backups. Here is a list of some of these excludes:
- *.NOBACKUP - We exclude any file or directory that has the .NOBACKUP file extension from both TSM and MDSS backups. This gives users a simple way to create data that is excluded from the backups. We recommend using this feature for things like large datasets or downloads (e.g., .iso files) that are easily recreated or re-downloaded and that would put unnecessary load on the backups.
- .Trash - We exclude any directory named .Trash from the MDSS backups. This is the conventional name on Linux for the directory that holds deleted files.
- Cache Directories - Many applications like Firefox and Thunderbird store frequently changing cache data that puts unnecessary load on the daily incremental backups. We exclude any directory named cache, Cache, or opcache that is contained within a Profiles or "dot" directory from the incremental TSM backups. For example, we exclude the Cache directory under ~/.mozilla/firefox.
- Windows Excludes - The TSM system defines a number of default excludes for Windows systems. These exclude a variety of files and directories that are known to be temporary files not requiring backup. We apply all of these default excludes on the Windows systems we back up using TSM.
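The non-Windows rules above can be summarized in a short sketch. This is only an illustration of the rules as described, not the configuration we actually deploy: the is_excluded function and the "tsm"/"mdss" labels are hypothetical, every path component is treated as if it could be a directory, and the Windows default excludes are not modeled.

```python
from pathlib import PurePosixPath

CACHE_NAMES = {"cache", "Cache", "opcache"}


def is_excluded(path: str, system: str) -> bool:
    """Return True if `path` would be skipped by the given backup system.

    `system` is "tsm" or "mdss" (hypothetical labels for the two options).
    """
    parts = PurePosixPath(path).parts

    # *.NOBACKUP applies to both systems, for files and directories alike.
    if any(part.endswith(".NOBACKUP") for part in parts):
        return True

    # .Trash directories are excluded from the MDSS backups.
    if system == "mdss" and ".Trash" in parts:
        return True

    # Cache directories are excluded from TSM backups only when they sit
    # inside a Profiles directory or a "dot" directory.
    if system == "tsm":
        for i, part in enumerate(parts):
            if part in CACHE_NAMES and any(
                parent == "Profiles"
                or (parent.startswith(".") and parent not in {".", ".."})
                for parent in parts[:i]
            ):
                return True

    return False


# Hypothetical paths used only for illustration.
print(is_excluded("/home/u/datasets.NOBACKUP/big.iso", "mdss"))
print(is_excluded("/home/u/.mozilla/firefox/Cache/entry", "tsm"))
print(is_excluded("/home/u/project/cache/entry", "tsm"))
```

The first two calls report an excluded path; the third does not, because that cache directory is not inside a Profiles or dot directory.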