Diffstat (limited to 'man/man8')
-rw-r--r-- man/man8/venti.8 232
-rw-r--r-- man/man8/ventiaux.8 504
2 files changed, 736 insertions, 0 deletions
diff --git a/man/man8/venti.8 b/man/man8/venti.8
new file mode 100644
index 00000000..be112a6a
--- /dev/null
+++ b/man/man8/venti.8
@@ -0,0 +1,232 @@
+.TH VENTI 8
+.SH NAME
+venti \- an archival block storage server
+.SH SYNOPSIS
+.B venti/venti
+[
+.B -dsw
+]
+[
+.B -a
+.I ventiaddress
+]
+[
+.B -B
+.I blockcachesize
+]
+[
+.B -c
+.I config
+]
+[
+.B -C
+.I cachesize
+]
+[
+.B -h
+.I httpaddress
+]
+[
+.B -I
+.I icachesize
+]
+.PP
+.B venti/sync
+[
+.B -h
+.I host
+]
+.SH DESCRIPTION
+.I Venti
+is a block storage server intended for archival data.
+In a Venti server,
+the SHA1 hash of a block's contents acts as the block
+identifier for read and write operations.
+This approach enforces a write-once policy, preventing accidental or
+malicious destruction of data. In addition, duplicate copies of a
+block are coalesced, reducing the consumption of storage and
+simplifying the implementation of clients.
+.PP
+Storage for
+.I venti
+consists of a data log and an index, both of which
+can be spread across multiple files.
+The files containing the data log are themselves divided into self-contained sections called arenas.
+Each arena contains a large number of data blocks and is sized to
+facilitate operations such as copying to removable media.
+The index provides a mapping between a SHA1 fingerprint and
+the location of the corresponding block in the data log.
+.PP
+The index and data log are typically stored on raw disk partitions.
+To improve robustness, the data log should be stored on
+a device that provides RAID functionality. The index does
+not require such protection, since, if necessary, it
+can be regenerated from the data log.
+The performance of
+.I venti
+is typically limited by the random access performance
+of the index. This performance can be improved by spreading the
+index across multiple disks.
+.PP
+The storage for
+.I venti
+is initialized using
+.IR fmtarenas ,
+.IR fmtisect ,
+and
+.I fmtindex
+(see
+.IR ventiaux (8)).
+A configuration file,
+.IR venti.conf (6),
+ties the index sections and data arenas together.
+.PP
+A Venti
+server is accessed via an undocumented network protocol.
+Two client applications are included in this distribution:
+.IR vac (1)
+and
+.IR vacfs (4).
+.I Vac
+copies files from a Plan 9 file system to Venti, creating an
+archive and returning the fingerprint of the root.
+This archive can be mounted in Plan 9 using
+.IR vacfs .
+These two commands enable a rudimentary backup system.
+A future release will include a Plan 9 file system that uses
+Venti as a replacement for the WORM device of
+.IR fs (4).
+.PP
+The
+.I venti
+server provides rudimentary status information via
+a built-in http server. The files it serves are:
+.TP
+.B stats
+Various internal statistics.
+.TP
+.B index
+An enumeration of the index sections and all non-empty arenas, including various statistics.
+.TP
+.B storage
+A summary of the state of the data log.
+.TP
+.B xindex
+An enumeration of the index sections and all non-empty arenas, in XML format.
+.PP
+Several auxiliary utilities (see
+.IR ventiaux (8))
+aid in maintaining the storage for Venti.
+With the exception of
+.IR rdarena ,
+these utilities should generally be run after killing the
+.I venti
+server.
+The utilities are:
+.TP
+.I checkarenas
+Check the integrity of Venti arenas, optionally fixing any errors found.
+.TP
+.I checkindex
+Check the integrity of a Venti index, optionally fixing any errors found.
+.TP
+.I buildindex
+Rebuild a Venti index from scratch.
+.TP
+.I rdarena
+Extract a Venti arena and write it to standard output.
+.PD
+.PP
+Options to
+.I venti
+are:
+.TP
+.BI -a " ventiaddress
+The network address on which the server listens for incoming connections.
+The default is
+.LR tcp!*!venti .
+.TP
+.BI -B " blockcachesize
+The size, in bytes, of memory allocated to caching raw disk blocks.
+.TP
+.BI -c " config
+Specifies the
+Venti
+configuration file.
+Defaults to
+.LR venti.conf .
+.TP
+.BI -C " cachesize
+The size, in bytes, of memory allocated to caching
+Venti
+blocks.
+.TP
+.BI -d
+Produce various debugging information on standard error.
+.TP
+.BI -h " httpaddress
+The network address of Venti's built-in
+http
+server.
+The default is
+.LR tcp!*!http .
+.TP
+.BI -I " icachesize
+The size, in bytes, of memory allocated to caching the index mapping fingerprints
+to locations in
+.IR venti 's
+data log.
+.TP
+.B -s
+Do not run in the background.
+Normally,
+the foreground process will exit once the Venti server
+is initialized and ready for connections.
+.TP
+.B -w
+Enable write buffering. This option increases the performance of writes to
+.I venti
+at the cost of returning success to the client application before the
+data has been written to disk.
+The server implements a
+.I sync
+rpc that waits for completion of all the writes buffered at the time
+the rpc was received.
+Applications such as
+.IR vac (1)
+and the
+.I sync
+command described below
+use this rpc to make sure that the data is correctly written to disk.
+Use of this option is recommended.
+.PD
+.PP
+The units for the various cache sizes above can be specified by appending a
+.LR k ,
+.LR m ,
+or
+.LR g
+to indicate kilobytes, megabytes, or gigabytes respectively.
+The command line options override options found in the
+.IR venti.conf (6)
+file.
+.PP
+.I Sync
+connects to a running Venti server and executes a sync rpc
+(described with the
+.B -w
+option above).
+If sync exits successfully, it means that all writes buffered at the
+time the command was issued are now on disk.
+.SH SOURCE
+.B /sys/src/cmd/venti
+.SH "SEE ALSO"
+.IR venti.conf (6),
+.IR ventiaux (8),
+.IR vac (1),
+.IR vacfs (4).
+.br
+Sean Quinlan and Sean Dorward,
+``Venti: a new approach to archival storage'',
+.IR "Usenix Conference on File and Storage Technologies" ,
+2002.
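
Note on venti.8: the content-addressing scheme in the DESCRIPTION section can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: the `ToyVenti` class and its `write` and `read` methods are hypothetical names invented for illustration and are not part of Venti, its network protocol, or its on-disk format. It models just one idea from the text, namely that the SHA1 hash of a block's contents names the block, so identical blocks coalesce and stored data is never overwritten.

```python
import hashlib


class ToyVenti:
    """Toy in-memory model of a content-addressed, write-once block store.

    Hypothetical illustration only; not the Venti protocol or storage layout.
    """

    def __init__(self):
        self._blocks = {}  # score (hex SHA1 digest) -> block contents

    def write(self, data: bytes) -> str:
        """Store a block and return its score (the SHA1 of its contents)."""
        score = hashlib.sha1(data).hexdigest()
        # Identical contents hash to the same score, so duplicates coalesce
        # and an existing block is never replaced.
        self._blocks.setdefault(score, bytes(data))
        return score

    def read(self, score: str) -> bytes:
        """Return the block named by score, verifying it on the way out."""
        data = self._blocks[score]
        if hashlib.sha1(data).hexdigest() != score:
            raise ValueError("block does not match its score")
        return data

    def __len__(self):
        return len(self._blocks)
```

Writing the same bytes twice returns the same score and leaves a single stored block, which is the property the man page credits with reducing storage consumption.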
diff --git a/man/man8/ventiaux.8 b/man/man8/ventiaux.8
new file mode 100644
index 00000000..fb6d8522
--- /dev/null
+++ b/man/man8/ventiaux.8
@@ -0,0 +1,504 @@
+.TH VENTIAUX 8
+.SH NAME
+buildindex,
+checkarenas,
+checkindex,
+conf,
+copy,
+fmtarenas,
+fmtindex,
+fmtisect,
+rdarena,
+rdarenablocks,
+read,
+wrarenablocks,
+write \- Venti maintenance and debugging commands
+.SH SYNOPSIS
+.B venti/buildindex
+[
+.B -B
+.I blockcachesize
+]
+[
+.B -Z
+]
+.I venti.config
+.I tmp
+.PP
+.B venti/checkarenas
+[
+.B -afv
+]
+.I file
+.PP
+.B venti/checkindex
+[
+.B -f
+]
+[
+.B -B
+.I blockcachesize
+]
+.I venti.config
+.I tmp
+.PP
+.B venti/conf
+[
+.B -w
+]
+.I partition
+[
+.I configfile
+]
+.PP
+.B venti/copy
+[
+.B -f
+]
+.I src
+.I dst
+.I score
+[
+.I type
+]
+.PP
+.B venti/fmtarenas
+[
+.B -Z
+]
+[
+.B -a
+.I arenasize
+]
+[
+.B -b
+.I blocksize
+]
+.I name
+.I file
+.PP
+.B venti/fmtindex
+[
+.B -a
+]
+.I venti.config
+.PP
+.B venti/fmtisect
+[
+.B -Z
+]
+[
+.B -b
+.I blocksize
+]
+.I name
+.I file
+.PP
+.B venti/rdarena
+[
+.B -v
+]
+.I arenapart
+.I arenaname
+.PP
+.B venti/read
+[
+.B -h
+.I host
+]
+.I score
+[
+.I type
+]
+.PP
+.B venti/wrarena
+[
+.B -o
+.I fileoffset
+]
+[
+.B -h
+.I host
+]
+.I arenafile
+[
+.I clumpoffset
+]
+.PP
+.B venti/write
+[
+.B -h
+.I host
+]
+[
+.B -t
+.I type
+]
+[
+.B -z
+]
+.SH DESCRIPTION
+These commands aid in the setup, maintenance, and debugging of
+Venti servers.
+See
+.IR venti (8)
+and
+.IR venti.conf (6)
+for an overview of the data structures stored by Venti.
+.PP
+Note that the units for the various sizes in the following
+commands can be specified by appending
+.LR k ,
+.LR m ,
+or
+.LR g
+to indicate kilobytes, megabytes, or gigabytes respectively.
+.PP
+.I Buildindex
+populates the index for the Venti system described in
+.IR venti.config .
+The index must have previously been formatted using
+.IR fmtindex .
+This command is typically used to build a new index for a Venti
+system when the old index becomes too small, or to rebuild
+an index after media failure.
+Small errors in an index can usually be fixed with
+.IR checkindex .
+.PP
+The
+.I tmp
+file, usually a disk partition, must be large enough to store a copy of the index.
+This temporary space is used to perform a merge sort of index entries
+generated by reading the arenas.
+.PP
+Options to
+.I buildindex
+are:
+.TP
+.BI -B " blockcachesize
+The amount of memory, in bytes, to use for caching raw disk accesses while running
+.IR buildindex .
+(This is not a property of the created index.)
+The default is 8k.
+.TP
+.B -Z
+Do not zero the index.
+This option should only be used when it is known that the index was already zeroed.
+.PD
+.PP
+.I Checkarenas
+examines the Venti arenas contained in the given
+.IR file .
+The program detects various error conditions, and optionally attempts
+to fix any errors that are found.
+.PP
+Options to
+.I checkarenas
+are:
+.TP
+.B -a
+For each arena, scan the entire data section.
+If this option is omitted, only the end section of
+the arena is examined.
+.TP
+.B -f
+Attempt to fix any errors that are found.
+.TP
+.B -v
+Increase the verbosity of output.
+.PD
+.PP
+.I Checkindex
+examines the Venti index described in
+.IR venti.config .
+The program detects various error conditions including:
+blocks that are not indexed, index entries for blocks that do not exist,
+and duplicate index entries.
+If requested, an attempt can be made to fix errors that are found.
+.PP
+The
+.I tmp
+file, usually a disk partition, must be large enough to store a copy of the index.
+This temporary space is used to perform a merge sort of index entries
+generated by reading the arenas.
+.PP
+Options to
+.I checkindex
+are:
+.TP
+.BI -B " blockcachesize
+The amount of memory, in bytes, to use for caching raw disk accesses while running
+.IR checkindex .
+The default is 8k.
+.TP
+.B -f
+Attempt to fix any errors that are found.
+.PD
+.PP
+.I Fmtarenas
+formats the given
+.IR file ,
+typically a disk partition, into a number of
+Venti
+arenas.
+The arenas are given names of the form
+.IR name%d ,
+where
+.I %d
+is replaced with a sequential number starting at 0.
+.PP
+Options to
+.I fmtarenas
+are:
+.TP
+.BI -a " arenasize
+The arenas are of
+.I arenasize
+bytes. The default is 512 megabytes, which was selected to provide a balance
+between the number of arenas and the ability to copy an arena to external
+media such as recordable CDs and tapes.
+.TP
+.BI -b " blocksize
+The size, in bytes, for read and write operations to the file.
+The size is recorded in the file, and is used by applications that access the arenas.
+The default is 8k.
+.TP
+.B -Z
+Do not zero the data sections of the arenas.
+Using this option reduces the formatting time
+but should only be used when it is known that the file was already zeroed.
+.PD
+.PP
+.I Fmtindex
+takes the
+.IR venti.conf (6)
+file
+.I venti.config
+and initializes the index sections to form a usable index structure.
+The arena files and index sections must have previously been formatted using
+.I fmtarenas
+and
+.I fmtisect
+respectively.
+.PP
+The function of a Venti index is to map a SHA1 fingerprint to a location
+in the data section of one of the arenas. The index is composed of
+blocks, each of which contains the mapping for a fixed range of possible
+fingerprint values.
+.I Fmtindex
+determines the mapping between SHA1 values and the blocks
+of the collection of index sections. Once this mapping has been determined,
+it cannot be changed without rebuilding the index.
+The basic assumption in the current implementation is that the index
+structure is sufficiently empty that individual blocks of the index will rarely
+overflow. The total size of the index should be about 2% to 10% of
+the total size of the arenas, but the exact size depends on both the index block size
+and the compressed size of the blocks stored in Venti.
+.PP
+.I Fmtindex
+also computes a mapping between a linear address space and
+the data section of the collection of arenas. The
+.B -a
+option can be used to add additional arenas to an index.
+To use this feature,
+add the new arenas to
+.I venti.config
+after the existing arenas and then run
+.I fmtindex
+.BR -a .
+.PP
+A copy of the above mappings is stored in the header for each of the index sections.
+These copies enable
+.I buildindex
+to restore a single index section without rebuilding the entire index.
+.PP
+.I Fmtisect
+formats the given
+.IR file ,
+typically a disk partition, as a Venti index section with the specified
+.IR name .
+One or more formatted index sections are combined into a Venti
+index using
+.IR fmtindex .
+Each of the index sections within an index must have a unique name.
+.PP
+Options to
+.I fmtisect
+are:
+.TP
+.BI -b " blocksize
+The size, in bytes, for read and write operations to the file.
+All the index sections within an index must have the same block size.
+The default is 8k.
+.TP
+.B -Z
+Do not zero the index.
+Using this option reduces the formatting time
+but should only be used when it is known that the file was already zeroed.
+.PD
+.PP
+.I Rdarena
+extracts the arena named
+.I arenaname
+from the arena partition
+.I arenapart
+and writes this arena to standard output.
+This command is typically used to back up an arena to external media.
+The
+.B -v
+option generates more verbose output on standard error.
+.PP
+.I Wrarena
+writes the blocks contained in the arena
+.I arenafile
+(typically, the output of
+.IR rdarena )
+to a Venti server.
+It is typically used to reinitialize a Venti server from backups of the arenas.
+For example,
+.IP
+.EX
+venti/rdarena /dev/sdC0/arenas arena.0 >external.media
+venti/wrarena -h venti2 external.media
+.EE
+.LP
+writes the blocks contained in
+.B arena.0
+to the Venti server
+.B venti2
+(typically not the one using
+.BR /dev/sdC0/arenas ).
+.PP
+The
+.B -o
+option specifies that the arena starts at byte
+.I fileoffset
+(default
+.BR 0 )
+in
+.IR arenafile .
+This is useful for reading directly from
+the Venti arena partition:
+.IP
+.EX
+venti/wrarena -h venti2 -o 335872 /dev/sdC0/arenas
+.EE
+.LP
+(In this example, 335872 is the offset shown in the Venti
+server's index list (344064) minus one block (8192).
+You will need to substitute your own arena offsets
+and block size.)
+.PP
+Finally, the optional
+.I clumpoffset
+argument specifies that the writing should begin with the
+clump starting at
+.I clumpoffset
+within the arena.
+.I Wrarena
+prints the offset it stopped at (because there were no more data blocks).
+This could be used to incrementally back up a Venti server
+to another Venti server:
+.IP
+.EX
+last=`{cat last}
+venti/wrarena -h venti2 -o 335872 /dev/sdC0/arenas $last >output
+awk '/^end offset/ { print $3 }' output >last
+.EE
+.LP
+Of course, one would need to add wrapper code to keep track
+of which arenas have been processed.
+See
+.B /sys/src/cmd/venti/backup.example
+for a version that does this.
+.PP
+.I Read
+and
+.I write
+read and write blocks from a running Venti server.
+They are intended to ease debugging of the server.
+The default
+.I host
+is the environment variable
+.BR $venti ,
+followed by the network metaname
+.BR $venti .
+The
+.I type
+is the decimal type of block to be read or written.
+If no
+.I type
+is specified for
+.IR read ,
+all types are tried, and a command line is printed to
+show the type that eventually worked.
+If no
+.I type
+is specified for
+.IR write ,
+.B VtDataType
+(13)
+is used.
+.I Read
+reads the block named by
+.I score
+(a SHA1 hash)
+from the Venti server and writes it to standard output.
+.I Write
+reads a block from standard input and attempts to write
+it to the Venti server.
+If successful, it prints the score of the block on the server.
+.PP
+.I Copy
+walks the entire tree of blocks rooted at
+.IR score ,
+copying all the blocks visited during the walk from
+the Venti server at network address
+.I src
+to the Venti server at network address
+.IR dst .
+If
+.I type
+(a decimal block type for
+.IR score )
+is omitted, all types will be tried in sequence
+until one is found that works.
+The
+.B -f
+flag runs the copy in ``fast'' mode: if a block is already on
+.IR dst ,
+the walk does not descend below it, on the assumption that all its
+children are also already on
+.IR dst .
+Without this flag, the copy often transfers many times more
+data than necessary.
+.PP
+To make it easier to bootstrap servers, the configuration
+file can be stored at the beginning of any Venti partition using
+.IR conf .
+A partition so branded with a configuration file can
+be used in place of a configuration file when invoking any
+of the venti commands.
+By default,
+.I conf
+prints the configuration stored in
+.IR partition .
+When invoked with the
+.B -w
+flag,
+.I conf
+reads a configuration file from
+.I configfile
+(or else standard input)
+and stores it in
+.IR partition .
+.SH SOURCE
+.B /sys/src/cmd/venti
+.SH "SEE ALSO"
+.IR venti (8),
+.IR venti.conf (6)
+.SH BUGS
+.I Buildindex
+should allow an individual index section to be rebuilt.
+The merge sort could be performed in the space used to store the
+index rather than requiring a temporary file.
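
Note on ventiaux.8: the size suffixes and the `wrarena -o` arithmetic described above can be checked with a short sketch. The helper names `parse_size` and `wrarena_file_offset` are hypothetical and do not exist in the Venti sources. The sketch also assumes binary units (k = 1024 bytes), which is consistent with the page's own example where one 8k block is 8192 bytes but is not stated outright in the text.

```python
# Assumed multipliers: the man page says kilobytes, megabytes, and gigabytes;
# binary units are inferred from its "8k block = 8192 bytes" example.
_UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(text: str) -> int:
    """Convert a size such as '8k' or '512m' to a number of bytes."""
    text = text.strip().lower()
    if text and text[-1] in _UNITS:
        return int(text[:-1]) * _UNITS[text[-1]]
    return int(text)  # no suffix: already a byte count


def wrarena_file_offset(index_offset: int, blocksize: str = "8k") -> int:
    """Value for wrarena -o: the arena offset from the index list minus one block."""
    return index_offset - parse_size(blocksize)
```

With the figures from the man page, `wrarena_file_offset(344064)` reproduces the 335872 used in the `venti/wrarena -o` examples; other installations would substitute their own arena offset and block size.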