author rsc <devnull@localhost> 2005-07-13 13:23:47 +0000
committer rsc <devnull@localhost> 2005-07-13 13:23:47 +0000
commit bdf5b5cde1d463ce11b1e62eba68dab25f5b7422
tree 3906d7e7b28aa29f910c696a0f98c4c514b5eb9d
parent df03d60c047a3010f800b26883763764be8d5744
new man pages
-rw-r--r-- man/man8/venti-backup.8 | 106
-rw-r--r-- man/man8/venti-fmt.8 | 346
-rw-r--r-- man/man8/venti.8 | 431
3 files changed, 883 insertions, 0 deletions
diff --git a/man/man8/venti-backup.8 b/man/man8/venti-backup.8
new file mode 100644
index 00000000..d707091b
--- /dev/null
+++ b/man/man8/venti-backup.8
@@ -0,0 +1,106 @@
+.TH VENTI-BACKUP 8
+.SH NAME
+rdarena, wrarena \- copy arenas between venti servers
+.SH SYNOPSIS
+.PP
+.B venti/rdarena
+[
+.B -v
+]
+.I arenapart
+.I arenaname
+.PP
+.B venti/wrarena
+[
+.B -o
+.I fileoffset
+]
+[
+.B -h
+.I host
+]
+.I arenafile
+[
+.I clumpoffset
+]
+.SH DESCRIPTION
+.PP
+.I Rdarena
+extracts the named
+.I arena
+from the arena partition
+.I arenapart
+and writes this arena to standard output.
+This command is typically used to back up an arena to external media.
+The
+.B -v
+option generates more verbose output on standard error.
+.PP
+.I Wrarena
+writes the blocks contained in the arena
+.I arenafile
+(typically, the output of
+.IR rdarena )
+to a Venti server.
+It is typically used to reinitialize a Venti server from backups of the arenas.
+For example,
+.IP
+.EX
+venti/rdarena /dev/sdC0/arenas arena.0 >external.media
+venti/wrarena -h venti2 external.media
+.EE
+.LP
+writes the blocks contained in
+.B arena.0
+to the Venti server
+.B venti2
+(typically not the one using
+.BR /dev/sdC0/arenas ).
+.PP
+The
+.B -o
+option specifies that the arena starts at byte
+.I fileoffset
+(default
+.BR 0 )
+in
+.IR arenafile .
+This is useful for reading directly from
+the Venti arena partition:
+.IP
+.EX
+venti/wrarena -h venti2 -o 335872 /dev/sdC0/arenas
+.EE
+.LP
+(In this example, 335872 is the offset shown in the Venti
+server's index list (344064) minus one block (8192).
+You will need to substitute your own arena offsets
+and block size.)
+.PP
+Finally, the optional
+.I clumpoffset
+argument specifies that the writing should begin with the
+clump starting at
+.I clumpoffset
+within the arena.
+.I Wrarena
+prints the offset it stopped at (because there were no more data blocks).
+This could be used to incrementally back up a Venti server
+to another Venti server:
+.IP
+.EX
+last=`{cat last}
+venti/wrarena -h venti2 -o 335872 /dev/sdC0/arenas $last >output
+awk '/^end offset/ { print $3 }' output >last
+.EE
+.LP
+Of course, one would need to add wrapper code to keep track
+of which arenas have been processed.
+See
+.B /sys/src/cmd/venti/backup.example
+for a version that does this.
+.SH SOURCE
+.B \*9/src/cmd/venti/srv
+.SH SEE ALSO
+.IR venti (7),
+.IR venti (8)
diff --git a/man/man8/venti-fmt.8 b/man/man8/venti-fmt.8
new file mode 100644
index 00000000..4c130331
--- /dev/null
+++ b/man/man8/venti-fmt.8
@@ -0,0 +1,346 @@
+.TH VENTI-FMT 8
+.SH NAME
+buildindex,
+checkarenas,
+checkindex,
+conf,
+fmtarenas,
+fmtindex,
+fmtisect,
+syncindex \- prepare and maintain a venti server
+.SH SYNOPSIS
+.PP
+.B venti/fmtarenas
+[
+.B -4Z
+]
+[
+.B -a
+.I arenasize
+]
+[
+.B -b
+.I blocksize
+]
+.I name
+.I file
+.PP
+.B venti/fmtisect
+[
+.B -1Z
+]
+[
+.B -b
+.I bucketsize
+]
+.I name
+.I file
+.PP
+.B venti/fmtindex
+[
+.B -a
+]
+.I venti.conf
+.PP
+.B venti/conf
+[
+.B -w
+]
+.I partition
+[
+.I configfile
+]
+.if t .sp 0.5
+.PP
+.B venti/buildindex
+[
+.B -B
+.I blockcachesize
+]
+[
+.B -Z
+]
+.I venti.conf
+.I tmp
+.PP
+.B venti/checkindex
+[
+.B -f
+]
+[
+.B -B
+.I blockcachesize
+]
+.I venti.conf
+.I tmp
+.PP
+.B venti/checkarenas
+[
+.B -afv
+]
+.I file
+.PP
+.B venti/copy
+[
+.B -f
+]
+.I src
+.I dst
+.I score
+[
+.I type
+]
+.SH DESCRIPTION
+These commands aid in the setup, maintenance, and debugging of
+venti servers.
+See
+.IR venti (7)
+for an overview of the venti system and
+.IR venti (8)
+for an overview of the data structures used by the venti server.
+.PP
+Note that the units for the various sizes in the following
+commands can be specified by appending
+.LR k ,
+.LR m ,
+or
+.LR g
+to indicate kilobytes, megabytes, or gigabytes respectively.
+.SS Formatting
+To prepare a server for its initial use, the arena partitions and
+the index sections must be formatted individually, with
+.I fmtarenas
+and
+.IR fmtisect .
+Then the
+collection of index sections must be combined into a venti
+index with
+.IR fmtindex .
+.PP
+.I Fmtarenas
+formats the given
+.IR file ,
+typically a disk partition, into an arena partition.
+The arenas in the partition are given names of the form
+.IR name%d ,
+where
+.I %d
+is replaced with a sequential number starting at 0.
+.PP
+Options to
+.I fmtarenas
+are:
+.TP
+.BI -a " arenasize
+The arenas are of
+.I arenasize
+bytes. The default is
+.BR 512M ,
+which was selected to provide a balance
+between the number of arenas and the ability to copy an arena to external
+media such as recordable CDs and tapes.
+.TP
+.BI -b " blocksize
+The size, in bytes, for read and write operations to the file.
+The size is recorded in the file, and is used by applications that access the arenas.
+The default is
+.BR 8k .
+.TP
+.B -4
+Create a `version 4' arena partition for backwards compatibility with old servers.
+The default is version 5, used by the current venti server.
+.TP
+.B -Z
+Do not zero the data sections of the arenas.
+Using this option reduces the formatting time
+but should only be used when it is known that the file was already zeroed.
+(Version 4 only; version 5 sections are not and do not need to be zeroed.)
+.PD
+.PP
+.I Fmtisect
+formats the given
+.IR file ,
+typically a disk partition, as a venti index section with the specified
+.IR name .
+Each of the index sections in a venti configuration must have a unique name.
+.PP
+Options to
+.I fmtisect
+are:
+.TP
+.BI -b " bucketsize
+The size of an index bucket, in bytes.
+All the index sections within an index must have the same bucket size.
+The default is
+.BR 8k .
+.TP
+.B -1
+Create a `version 1' index section for backwards compatibility with old servers.
+The default is version 2, used by the current venti server.
+.TP
+.B -Z
+Do not zero the index.
+Using this option reduces the formatting time
+but should only be used when it is known that the file was already zeroed.
+(Version 1 only; version 2 sections are not and do not need to be zeroed.)
+.PD
+.PP
+.I Fmtindex
+reads the configuration file
+.I venti.conf
+and initializes the index sections to form a usable index structure.
+The arena files and index sections must have previously been formatted
+using
+.I fmtarenas
+and
+.I fmtisect
+respectively.
+.PP
+The function of a venti index is to map a SHA1 fingerprint to a location
+in the data section of one of the arenas. The index is composed of
+blocks, each of which contains the mapping for a fixed range of possible
+fingerprint values.
+.I Fmtindex
+determines the mapping between SHA1 values and the blocks
+of the collection of index sections. Once this mapping has been determined,
+it cannot be changed without rebuilding the index.
+The basic assumption in the current implementation is that the index
+structure is sufficiently empty that individual blocks of the index will rarely
+overflow. The total size of the index should be about 2% to 10% of
+the total size of the arenas, but the exact percentage depends both on the
+index block size and the compressed size of blocks stored.
+See the discussion in
+.IR venti (8)
+for more.
+.PP
+.I Fmtindex
+also computes a mapping between a linear address space and
+the data section of the collection of arenas. The
+.B -a
+option can be used to add additional arenas to an index.
+To use this feature,
+add the new arenas to
+.I venti.conf
+after the existing arenas and then run
+.I fmtindex
+.BR -a .
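+.PP
+As a sketch of that procedure, assume a second arena partition at the
+hypothetical path
+.B /tmp/disks/arenas1
+has already been formatted with
+.IR fmtarenas .
+Its
+.B arenas
+line is appended to
+.I venti.conf
+after the existing one, and the index is then updated:
+.IP
+.EX
+% cat venti.conf
+index main
+isect /tmp/disks/isect0
+arenas /tmp/disks/arenas
+arenas /tmp/disks/arenas1
+% venti/fmtindex -a venti.conf
+.EE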
+.PP
+A copy of the above mappings is stored in the header for each of the index sections.
+These copies enable
+.I buildindex
+to restore a single index section without rebuilding the entire index.
+.PP
+To make it easier to bootstrap servers, the configuration
+file can be stored in otherwise empty space
+at the beginning of any venti partition using
+.IR conf .
+A partition so branded with a configuration file can
+be used in place of a configuration file when invoking any
+of the venti commands.
+By default,
+.I conf
+prints the configuration stored in
+.IR partition .
+When invoked with the
+.B -w
+flag,
+.I conf
+reads a configuration file from
+.I configfile
+(or else standard input)
+and stores it in
+.IR partition .
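+.PP
+For instance, assuming an arena partition at the hypothetical path
+.B /tmp/disks/arenas
+and a configuration file named
+.BR venti.conf ,
+the following would store the configuration in the partition
+and then print it back:
+.IP
+.EX
+% venti/conf -w /tmp/disks/arenas venti.conf
+% venti/conf /tmp/disks/arenas
+.EE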
+.SS Checking and Rebuilding
+.PP
+.I Buildindex
+populates the index for the Venti system described in
+.IR venti.conf .
+The index must have previously been formatted using
+.IR fmtindex .
+This command is typically used to build a new index for a Venti
+system when the old index becomes too small, or to rebuild
+an index after media failure.
+Small errors in an index can usually be fixed with
+.IR checkindex .
+.PP
+The
+.I tmp
+file, usually a disk partition, must be large enough to store a copy of the index.
+This temporary space is used to perform a merge sort of index entries
+generated by reading the arenas.
+.PP
+Options to
+.I buildindex
+are:
+.TP
+.BI -B " blockcachesize
+The amount of memory, in bytes, to use for caching raw disk accesses while running
+.IR buildindex .
+(This is not a property of the created index.)
+The default is 8k.
+.TP
+.B -Z
+Do not zero the index.
+This option should only be used when it is known that the index was already zeroed.
+(Version 1 indexes only; see the discussion in
+.I fmtisect
+above.)
+.PD
+.PP
+.I Checkindex
+examines the Venti index described in
+.IR venti.conf .
+The program detects various error conditions including:
+blocks that are not indexed, index entries for blocks that do not exist,
+and duplicate index entries.
+If requested, an attempt can be made to fix errors that are found.
+.PP
+The
+.I tmp
+file, usually a disk partition, must be large enough to store a copy of the index.
+This temporary space is used to perform a merge sort of index entries
+generated by reading the arenas.
+.PP
+Options to
+.I checkindex
+are:
+.TP
+.BI -B " blockcachesize
+The amount of memory, in bytes, to use for caching raw disk accesses while running
+.IR checkindex .
+The default is 8k.
+.TP
+.B -f
+Attempt to fix any errors that are found.
+.PD
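+.PP
+A cautious sequence, sketched here with a hypothetical scratch partition
+.BR /tmp/disks/tmp ,
+is to run
+.I checkindex
+once to report problems and again with
+.B -f
+only if repairs are wanted:
+.IP
+.EX
+% venti/checkindex venti.conf /tmp/disks/tmp
+% venti/checkindex -f venti.conf /tmp/disks/tmp
+.EE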
+.PP
+.I Checkarenas
+examines the Venti arenas contained in the given
+.IR file .
+The program detects various error conditions, and optionally attempts
+to fix any errors that are found.
+.PP
+Options to
+.I checkarenas
+are:
+.TP
+.B -a
+For each arena, scan the entire data section.
+If this option is omitted, only the end section of
+the arena is examined.
+.TP
+.B -f
+Attempt to fix any errors that are found.
+.TP
+.B -v
+Increase the verbosity of output.
+.PD
+.SH SOURCE
+.B \*9/src/cmd/venti/srv
+.SH SEE ALSO
+.IR venti (7),
+.IR venti (8)
+.SH BUGS
+.I Buildindex
+should allow an individual index section to be rebuilt.
+The merge sort could be performed in the space used to store the
+index rather than requiring a temporary file.
diff --git a/man/man8/venti.8 b/man/man8/venti.8
new file mode 100644
index 00000000..2327529f
--- /dev/null
+++ b/man/man8/venti.8
@@ -0,0 +1,431 @@
+.TH VENTI 8
+.SH NAME
+venti.conf \- venti configuration
+.SH DESCRIPTION
+Venti is a SHA1-addressed archival storage server.
+See
+.IR venti (7)
+for a full introduction to the system.
+This page documents the structure and operation of the server.
+.PP
+A venti server requires multiple disks or disk partitions,
+each of which must be properly formatted before the server
+can be run.
+.SS Disk
+The venti server maintains three disk structures, typically
+stored on raw disk partitions:
+the append-only
+.IR "data log" ,
+which holds, in sequential order,
+the contents of every block written to the server;
+the
+.IR index ,
+which helps locate a block in the data log given its score;
+and optionally the
+.IR "bloom filter" ,
+a concise summary of which scores are present in the index.
+The data log is the primary storage.
+For robustness, it should be stored on
+a device that provides RAID functionality.
+The index and the bloom filter are optimizations
+employed to access the data log efficiently and can be rebuilt
+if lost or damaged.
+.PP
+The data log is logically split into sections called
+.IR arenas ,
+typically sized for easy offline backup
+(e.g., 500MB).
+A data log may comprise many disks, each storing
+one or more arenas.
+Such disks are called
+.IR "arena partitions" .
+Arena partitions are filled in the order given in the configuration.
+.PP
+The index is logically split into block-sized pieces called
+.IR buckets ,
+each of which is responsible for a particular range of scores.
+An index may be split across many disks, each storing many buckets.
+Such disks are called
+.IR "index sections" .
+.PP
+The index must be sized so that no bucket is full.
+When a bucket fills, the server must be shut down and
+the index made larger.
+Since scores appear random, each bucket will contain
+approximately the same number of entries.
+Index entries are 40 bytes long. Assuming that a typical block
+being written to the server is 8192 bytes and compresses to 4096
+bytes, the active index is expected to be about 1% of
+the active data log.
+Storing smaller blocks increases the relative index footprint;
+storing larger blocks decreases it.
+To allow variation in both block size and the random distribution
+of scores to buckets, the suggested index size is 5% of
+the active data log.
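+.PP
+As a worked example under those same assumptions,
+a 500MB arena holds about 128,000 compressed blocks:
+.IP
+.EX
+entries per arena   500MB / 4096 bytes     \(ap 128,000
+active index        128,000 \(mu 40 bytes     \(ap 5MB (1% of 500MB)
+suggested index     5% of 500MB            = 25MB
+.EE
+.LP
+A hypothetical server with ten such arenas would therefore
+want roughly 250MB of index sections.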
+.PP
+The (optional) bloom filter is a large bitmap that is stored on disk but
+also kept completely in memory while the venti server runs.
+It helps the venti server efficiently detect scores that are
+.I not
+already stored in the index.
+The bloom filter starts out zeroed.
+Each score recorded in the bloom filter is hashed to choose
+.I nhash
+bits to set in the bloom filter.
+A score is definitely not stored in the index if any of its
+.I nhash
+bits is not set.
+The bloom filter thus has two parameters:
+.I nhash
+(maximum 32)
+and the total bitmap size
+(maximum 512MB, 2\s-2\u32\d\s+2 bits).
+.PP
+The bloom filter should be sized so that
+.I nhash
+\(mu
+.I nblock
+\(<=
+0.7 \(mu
+.IR b ,
+where
+.I nblock
+is the expected number of blocks stored on the server
+and
+.I b
+is the bitmap size in bits.
+The false positive rate of the bloom filter when sized
+this way is approximately 2\s-2\u\-\fInhash\fR\d\s+2.
+Values of
+.I nhash
+less than 10 are not very useful;
+values greater than 24 are probably a waste of memory.
+.I Fmtbloom
+(see
+.IR venti-fmt (8))
+can be given either
+.I nhash
+or
+.IR nblock ;
+if given
+.IR nblock ,
+it will derive an appropriate
+.IR nhash .
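+.PP
+As a rough illustration, reading the sizing rule as
+.I nhash
+\(mu
+.I nblock
+\(<=
+0.7 \(mu
+.I b
+and picking
+.I nhash
+= 16 (an arbitrary value inside the useful range)
+with the maximum bitmap of 2\s-2\u32\d\s+2 bits:
+.IP
+.EX
+nblock \(<= 0.7 \(mu 4,294,967,296 / 16 \(ap 188,000,000
+.EE
+.LP
+That is on the order of 1.5TB of 8192-byte blocks before compression.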
+.SS Memory
+Venti can make effective use of large amounts of memory
+for various caches.
+.PP
+The
+.I "lump cache
+holds recently-accessed venti data blocks, which the server refers to as
+.IR lumps .
+The lump cache should be at least 1MB but can profitably be much larger.
+The lump cache can be thought of as the level-1 cache:
+read requests handled by the lump cache can
+be served instantly.
+.PP
+The
+.I "block cache
+holds recently-accessed
+.I disk
+blocks from the arena partitions.
+The block cache needs to be able to simultaneously hold two blocks
+from each arena plus four blocks for the currently-filling arena.
+The block cache can be thought of as the level-2 cache:
+read requests handled by the block cache are slower than those
+handled by the lump cache, since the lump data must be extracted
+from the raw disk blocks and possibly decompressed, but no
+disk accesses are necessary.
+.PP
+The
+.I "index cache
+holds recently-accessed or prefetched
+index entries.
+The index cache needs to be able to hold index entries
+for three or four arenas, at least, in order for prefetching
+to work properly. Each index entry is 50 bytes.
+Assuming 500MB arenas of
+128,000 blocks that are 4096 bytes each after compression,
+the minimum index cache size is about 6MB.
+The index cache can be thought of as the level-3 cache:
+read requests handled by the index cache must still go
+to disk to fetch the arena blocks, but the costly random
+access to the index is avoided.
+.PP
+The size of the index cache determines how long venti
+can sustain its `burst' write throughput, during which time
+the only disk accesses on the critical path
+are sequential writes to the arena partitions.
+For example, if you want to be able to sustain 10MB/s
+for an hour, you need enough index cache to hold entries
+for 36GB of blocks. Assuming 8192-byte blocks,
+you need room for almost five million index entries.
+Since index entries are 50 bytes each, you need 250MB
+of index cache.
+If the background index update process can make a single
+pass through the index in an hour, which is possible,
+then you can sustain the 10MB/s indefinitely (at least until
+the arenas are all filled).
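+.PP
+Worked through without rounding up, the same estimate is:
+.IP
+.EX
+data written    10MB/s \(mu 3600s            = 36GB
+index entries   36GB / 8192 bytes         \(ap 4.4 million
+index cache     4.4 million \(mu 50 bytes    \(ap 220MB
+.EE
+.LP
+The 250MB figure above rounds the entry count up to five million,
+which leaves some headroom.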
+.PP
+The
+.I "bloom filter
+requires memory equal to its size on disk,
+as discussed above.
+.PP
+A reasonable starting allocation is to
+divide memory equally (in thirds) between
+the bloom filter, the index cache, and the lump and block caches;
+the third of memory allocated to the lump and block caches
+should be split unevenly, with more (say, two thirds)
+going to the block cache.
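+.PP
+For example, on a hypothetical machine with 768MB set aside for venti,
+that rule of thumb gives 256MB to the bloom filter
+(fixed when the filter is formatted) and suggests
+configuration lines along these lines:
+.IP
+.EX
+icmem 256m
+bcmem 170m
+mem 85m
+.EE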
+.SS Network
+The venti server announces two network services, one
+(conventionally TCP port
+.BR venti ,
+17034) serving
+the venti protocol as described in
+.IR venti (7),
+and one serving HTTP
+(conventionally TCP port
+.BR http ,
+80).
+.PP
+The venti web server provides the following
+URLs for accessing status information:
+.TP
+.B /index
+A summary of the usage of the arenas and index sections.
+.TP
+.B /xindex
+An XML version of
+.BR /index .
+.TP
+.B /storage
+Brief storage totals.
+.TP
+.BI /set/ variable
+The current integer value of
+.IR variable .
+Variables are:
+.BR compress ,
+whether or not to compress blocks
+(for debugging);
+.BR logging ,
+whether to write entries to the debugging logs;
+.BR stats ,
+whether to collect run-time statistics;
+.BR icachesleeptime ,
+the time in milliseconds between successive updates
+of megabytes of the index cache;
+.BR arenasumsleeptime ,
+the time in milliseconds between reads while
+checksumming an arena in the background.
+The two sleep times should be (but are not) managed by venti;
+they exist to provide more experience with their effects.
+The other variables exist only for debugging and
+performance measurement.
+.TP
+.BI /set/ variable / value
+Set
+.I variable
+to
+.IR value .
+.TP
+.BI /graph/ name / param / param / \fR...
+A PNG image graphing the named run-time statistic over time.
+The details of names and parameters are undocumented;
+see
+.B httpd.c
+in the venti sources.
+.TP
+.B /log
+A list of all debugging logs present in the server's memory.
+.TP
+.BI /log/ name
+The contents of the debugging log with the given
+.IR name .
+.TP
+.B /flushicache
+Force venti to begin flushing the index cache to disk.
+The request response will not be sent until the flush
+has completed.
+.TP
+.B /flushdcache
+Force venti to begin flushing the arena block cache to disk.
+The request response will not be sent until the flush
+has completed.
+.PD
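+.PP
+For example, assuming the HTTP service is reachable on its
+default port on the local system, the following sketch reads the
+.B logging
+variable, sets it to 1 (assumed here to mean enabled),
+and then forces an index cache flush:
+.IP
+.EX
+% hget http://$sysname/set/logging
+% hget http://$sysname/set/logging/1
+% hget http://$sysname/flushicache
+.EE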
+.PP
+Requests for other files are served by consulting a
+directory named in the configuration file
+(see
+.B webroot
+below).
+.SS Configuration File
+A venti configuration file
+enumerates the various index sections and
+arenas that constitute a venti system.
+The components are indicated by the name of the file, typically
+a disk partition, in which they reside. The configuration
+file is the only place where file names are used. Internally,
+venti uses the names assigned when the components were formatted
+with
+.I fmtarenas
+or
+.I fmtisect
+(see
+.IR venti-fmt (8)).
+In particular, only the configuration needs to be
+changed if a component is moved to a different file.
+.PP
+The configuration file consists of lines in the form described below.
+Lines starting with
+.B #
+are comments.
+.TP
+.BI index " name
+Names the index for the system.
+.TP
+.BI arenas " file
+.I File
+is an arena partition, formatted using
+.IR fmtarenas .
+.TP
+.BI isect " file
+.I File
+is an index section, formatted using
+.IR fmtisect .
+.PP
+After formatting a venti system using
+.IR fmtindex ,
+the order of arenas and index sections should not be changed.
+Additional arenas can be appended to the configuration;
+run
+.I fmtindex
+with the
+.B -a
+flag to update the index.
+.PP
+The configuration file also holds configuration parameters
+for the venti server itself.
+These are:
+.TF httpaddr netaddr
+.TP
+.BI mem " size
+lump cache size
+.TP
+.BI bcmem " size
+block cache size
+.TP
+.BI icmem " size
+index cache size
+.TP
+.BI addr " netaddr
+network address to announce venti service
+(default
+.BR tcp!*!venti )
+.TP
+.BI httpaddr " netaddr
+network address to announce HTTP service
+(default
+.BR tcp!*!http )
+.TP
+.B queuewrites
+queue writes in memory
+(default is not to queue)
+.TP
+.BI webroot " dir
+directory tree containing files for HTTP server
+to consult for unrecognized URLs
+.PD
+.PP
+The units for the various cache sizes above can be specified by appending a
+.LR k ,
+.LR m ,
+or
+.LR g
+(case-insensitive)
+to indicate kilobytes, megabytes, or gigabytes respectively.
+.SS Command Line
+Options to
+.I venti
+are:
+.TP
+.BI -c " config
+The server configuration file
+(default
+.BR venti.conf )
+.TP
+.BI -o " line
+Set a server parameter, using the same syntax
+as in the configuration file.
+The
+.B -o
+options override the configuration file.
+.TP
+.B -d
+Produce various debugging information on standard error.
+Implies
+.BR -s .
+.TP
+.B -L
+Enable logging. By default all logging is disabled.
+Logging slows server operation considerably.
+.TP
+.B -s
+Do not run in the background.
+Normally,
+the foreground process will exit once the Venti server
+is initialized and ready for connections.
+.PD
+.SH EXAMPLE
+A simple configuration:
+.IP
+.EX
+% cat venti.conf
+index main
+isect /tmp/disks/isect0
+isect /tmp/disks/isect1
+arenas /tmp/disks/arenas
+mem 10M
+bcmem 20M
+icmem 30M
+%
+.EE
+.PP
+Format the index sections, the arena partition, and
+finally the main index:
+.IP
+.EX
+% venti/fmtisect isect0. /tmp/disks/isect0 &
+% venti/fmtisect isect1. /tmp/disks/isect1 &
+% venti/fmtarenas arenas0. /tmp/disks/arenas &
+% wait
+% venti/fmtindex venti.conf
+%
+.EE
+.PP
+Start the server and check the storage statistics:
+.IP
+.EX
+% venti/venti
+% hget http://$sysname/storage
+.EE
+.SH "SEE ALSO"
+.IR venti (1),
+.IR venti (3),
+.IR venti (7),
+.IR venti-backup (8),
+.IR venti-fmt (8)
+.br
+Sean Quinlan and Sean Dorward,
+``Venti: a new approach to archival storage'',
+.I "Usenix Conference on File and Storage Technologies" ,
+2002.
+.SH BUGS
+Setting up a venti server is too complicated.
+.PP
+Venti should not require the user to decide how to
+partition its memory usage.