The traces consist of a condensed version of each file system's data blocks, including a hash of each block's contents, all the block pointers and most of the directory information. The traces do not include the actual contents of files nor the file names. There is sufficient information to reconstruct the structure of the file system and to track the daily changes to this structure over time.
The daily snapshots of a Plan 9 file server are constructed using a copy-on-write scheme, enabling unchanged blocks to be shared between multiple snapshots. The size of the storage containing the snapshots is as follows:
| | bootes | emelie |
|---|---|---|
| Number of blocks | 45 million | 26 million |
| Block size | 6 KB | 16 KB |
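As a rough check on what these figures imply, the raw capacity of each store is the block count multiplied by the block size. The sketch below assumes the block sizes are kilobytes of 1024 bytes and treats the rounded block counts as exact; the names `servers` and `capacity_bytes` are illustrative only.

```python
# Rough capacity estimate from the table above.
# Assumptions: block sizes are kilobytes of 1024 bytes; block counts are rounded.
KIB = 1024

servers = {
    "bootes": {"blocks": 45_000_000, "block_size": 6 * KIB},
    "emelie": {"blocks": 26_000_000, "block_size": 16 * KIB},
}


def capacity_bytes(blocks: int, block_size: int) -> int:
    """Total bytes addressed by `blocks` blocks of `block_size` bytes each."""
    return blocks * block_size


for name, s in servers.items():
    gib = capacity_bytes(s["blocks"], s["block_size"]) / 2**30
    print(f"{name}: about {gib:.0f} GiB")
```

Under these assumptions the stores come to roughly 257 GiB for bootes and 397 GiB for emelie.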
The following two graphs depict the size of the active file system and the cumulative size of the snapshots. In addition, we have copied the snapshot data onto a new archival storage system, called Venti, that removes duplicate blocks and compresses each block's contents. When stored on Venti, the snapshot data is only a small multiple of the active file system size.
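The idea of removing duplicate blocks and compressing what remains can be illustrated with a toy content-addressed store. This is a minimal sketch, not Venti's implementation: the `BlockStore` class is hypothetical, and SHA-1 and zlib are stand-ins chosen for illustration, since this page does not say which hash or compressor is used.

```python
import hashlib
import zlib


class BlockStore:
    """Toy content-addressed store: each block is keyed by a hash of its contents."""

    def __init__(self):
        # Maps digest (bytes) -> compressed block contents (bytes).
        self._blocks = {}

    def put(self, data: bytes) -> bytes:
        """Store a block and return its digest; identical blocks are stored once."""
        digest = hashlib.sha1(data).digest()  # assumed hash, for illustration
        if digest not in self._blocks:
            self._blocks[digest] = zlib.compress(data)  # assumed compressor
        return digest

    def get(self, digest: bytes) -> bytes:
        """Return the original contents of a previously stored block."""
        return zlib.decompress(self._blocks[digest])

    def stored_bytes(self) -> int:
        """Bytes actually held, after deduplication and compression."""
        return sum(len(b) for b in self._blocks.values())


# Two "snapshots" that share an unchanged block.
store = BlockStore()
unchanged = b"A" * 6144
snapshot1 = [store.put(unchanged), store.put(b"B" * 6144)]
snapshot2 = [store.put(unchanged), store.put(b"C" * 6144)]
print(len(store._blocks), "unique blocks,", store.stored_bytes(), "bytes stored")
```

Because a block's address is derived from its contents, writing the same block from a later snapshot costs no additional storage, which is why the deduplicated total can stay close to the size of the active file system.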
Using Venti, the size of the snapshot data is reduced by three factors: elimination of duplicate blocks, elimination of block fragmentation, and compression of the block contents. The following table presents the percent reduction for each of these factors.
| | bootes | emelie |
|---|---|---|
| Elimination of duplicates | 27.8% | 31.3% |
| Elimination of fragments | 10.2% | 25.4% |
| Data Compression | 33.8% | 54.1% |
| Total Reduction | 59.7% | 76.5% |
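One way to read the table is that each factor applies to whatever the previous one left over, so the reductions compound rather than add. That is an assumption, not something the text states. Under it, the emelie column reproduces its 76.5% total, while the bootes column comes out near 57.1% rather than the listed 59.7%, so the published totals may have been measured directly or computed differently. The function name `combined_reduction` is illustrative.

```python
def combined_reduction(*reductions_percent: float) -> float:
    """Overall percent reduction if each step shrinks what the previous step left.

    Assumes the factors are independent and applied in sequence.
    """
    remaining = 1.0
    for r in reductions_percent:
        remaining *= 1.0 - r / 100.0
    return (1.0 - remaining) * 100.0


# Duplicates, fragments, compression, in the order given in the table.
print(f"emelie: {combined_reduction(31.3, 25.4, 54.1):.1f}%")
print(f"bootes: {combined_reduction(27.8, 10.2, 33.8):.1f}%")
```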
The trace data is available via ftp. It is also mirrored at http://pdos.csail.mit.edu/p9trace/. We currently provide trace data for the bootes file server covering the period Feb 27th 1990 through Feb 24th 2001 and for the emelie file server covering the period Nov 7th 1996 through May 1st 2001. The total size of the trace data is approximately 3GB per file server. To make the data more manageable, it has been broken up into multiple files, each containing information on 1 million blocks.
The ftp directory also contains a description of the format of the trace data and source code for a trace file parser.
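To locate the information for a particular block, one could map its block number to the file that covers it. The sketch below assumes that each file covers a consecutive range of 1 million block numbers and uses a hypothetical zero-based file index; the actual file naming and layout are defined by the format description in the ftp directory, not by this example.

```python
BLOCKS_PER_FILE = 1_000_000  # from the text: each file covers 1 million blocks


def trace_file_index(block_number: int) -> int:
    """Hypothetical 0-based index of the trace file covering this block."""
    return block_number // BLOCKS_PER_FILE


def trace_file_count(total_blocks: int) -> int:
    """Number of files needed to cover `total_blocks` blocks (ceiling division)."""
    return -(-total_blocks // BLOCKS_PER_FILE)


print(trace_file_index(12_345_678))  # file covering block 12,345,678
print(trace_file_count(45_000_000))  # files for a store of about 45 million blocks
```

With the rounded block counts given earlier, this suggests on the order of 45 files for bootes and 26 for emelie.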
Last modified: 11/16/2001