.TH VENTI-FMT 8
.SH NAME
buildindex,
checkarenas,
checkindex,
conf,
fmtarenas,
fmtbloom,
fmtindex,
fmtisect,
syncindex \- prepare and maintain a venti server
.SH SYNOPSIS
.PP
.B venti/fmtarenas
[
.B -4Z
]
[
.B -a
.I arenasize
]
[
.B -b
.I blocksize
]
.I name
.I file
.PP
.B venti/fmtisect
[
.B -1Z
]
[
.B -b
.I bucketsize
]
.I name
.I file
.PP
.B venti/fmtbloom
[
.B -n
.I nblocks
|
.B -N
.I nhash
]
[
.B -s
.I size
]
.I file
.PP
.B venti/fmtindex
[
.B -a
]
.I venti.conf
.PP
.B venti/conf
[
.B -w
]
.I partition
[
.I configfile
]
.if t .sp 0.5
.PP
.B venti/buildindex
[
.B -bd
] [
.B -i
.I isect
] ... [
.B -M
.I imemsize
]
.I venti.conf
.PP
.B venti/checkindex
[
.B -f
]
[
.B -B
.I blockcachesize
]
.I venti.conf
.I tmp
.PP
.B venti/checkarenas
[
.B -afv
]
.I file
.SH DESCRIPTION
These commands aid in the setup, maintenance, and debugging of
venti servers.
See
.MR venti (7)
for an overview of the venti system and
.MR venti (8)
for an overview of the data structures used by the venti server.
.PP
Sizes given to the following commands may carry a suffix of
.LR k ,
.LR m ,
or
.LR g
to indicate kilobytes, megabytes, or gigabytes respectively.
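.PP
As an illustration only, the suffix rule can be sketched in Python.
The name
.I parse_size
is hypothetical, and the sketch assumes binary multiples
(1k = 1024 bytes) and case-insensitive suffixes,
neither of which this page states;
the venti commands' own parsing may differ.
.IP
.EX
```python
# Hypothetical helper; not part of venti.
# Assumes binary multiples and case-insensitive suffixes.
UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}

def parse_size(text):
    """Convert a string such as '8k' or '512M' to a byte count."""
    text = text.strip()
    if not text:
        raise ValueError("empty size")
    suffix = text[-1].lower()
    if suffix in UNITS:
        return int(text[:-1]) * UNITS[suffix]
    return int(text)
```
.EE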
.SS Formatting
To prepare a server for its initial use, the arena partitions and
the index sections must be formatted individually, with
.I fmtarenas
and
.IR fmtisect .
Then the
collection of index sections must be combined into a venti
index with
.IR fmtindex .
.PP
.I Fmtarenas
formats the given
.IR file ,
typically a disk partition, into an arena partition.
The arenas in the partition are given names of the form
.IR name%d ,
where
.I %d
is replaced with a sequential number starting at 0.
.PP
Options to
.I fmtarenas
are:
.TP
.BI -a " arenasize
The arenas are of
.I arenasize
bytes. The default is
.BR 512M ,
which was selected to provide a balance
between the number of arenas and the ability to copy an arena to external
media such as recordable CDs and tapes.
.TP
.BI -b " blocksize
The size, in bytes, for read and write operations to the file.
The size is recorded in the file, and is used by applications that access the arenas.
The default is
.BR 8k .
.TP
.B -4
Create a `version 4' arena partition for backwards compatibility with old servers.
The default is version 5, used by the current venti server.
.TP
.B -Z
Do not zero the data sections of the arenas.
Using this option reduces the formatting time
but should only be used when it is known that the file was already zeroed.
(Version 4 only; version 5 sections are not and do not need to be zeroed.)
.PD
.PP
.I Fmtisect
formats the given
.IR file ,
typically a disk partition, as a venti index section with the specified
.IR name .
Each of the index sections in a venti configuration must have a unique name.
.PP
Options to
.I fmtisect
are:
.TP
.BI -b " bucketsize
The size of an index bucket, in bytes.
All the index sections within an index must have the same bucket size.
The default is
.BR 8k .
.TP
.B -1
Create a `version 1' index section for backwards compatibility with old servers.
The default is version 2, used by the current venti server.
.TP
.B -Z
Do not zero the index.
Using this option reduces the formatting time
but should only be used when it is known that the file was already zeroed.
(Version 1 only; version 2 sections are not and do not need to be zeroed.)
.PD
.PP
.I Fmtbloom
formats the given
.I file
as a Bloom filter
(see
.MR venti (7) ).
The options are:
.TF "\fL-s\fI size"
.PD
.TP
.BI -n " nblocks \fR| " -N " nhash
The number of blocks expected to be indexed by the filter
or the number of hash functions to use.
If the
.B -n
option
is given, it is used, along with the total size of the filter,
to compute an appropriate
.IR nhash .
.TP
.BI -s " size
The size of the Bloom filter. The default is the total size of the file.
In either case,
.I size
is rounded down to a power of two.
.PD
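.PP
The rounding and the relationship between
.I nblocks
and
.I nhash
can be sketched in Python.
This is a sketch under stated assumptions, not the
.I fmtbloom
implementation: the function names are hypothetical,
the hash count uses the textbook Bloom filter estimate
k = (m/n) ln 2 with m in bits,
and any header space in the file is ignored.
.IP
.EX
```python
import math

def round_down_pow2(n):
    """Largest power of two that is <= n (n must be positive)."""
    if n < 1:
        raise ValueError("size must be positive")
    return 1 << (n.bit_length() - 1)

def textbook_nhash(size_bytes, nblocks):
    """Textbook estimate k = (m / n) * ln 2, with m in bits.

    This is the standard Bloom filter formula, assumed here for
    illustration; fmtbloom may compute nhash differently.
    """
    bits = 8 * round_down_pow2(size_bytes)
    return max(1, round(bits / nblocks * math.log(2)))
```
.EE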
.PP
The
.I file
argument in the commands above can be of the form
.IB file : lo - hi
to specify a range of the file.
.I Lo
and
.I hi
are specified in bytes but can have the usual
.LR k ,
.LR m ,
or
.LR g
suffixes.
Either
.I lo
or
.I hi
may be omitted.
This notation eliminates the need to
partition raw disks on non-Plan 9 systems.
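.PP
One way to read the range notation is sketched below in Python.
The names
.I parse_offset
and
.I parse_range
are hypothetical, and the sketch assumes binary multiples for the suffixes
and that the range follows the last colon in the argument;
the venti commands may parse the argument differently.
.IP
.EX
```python
# Hypothetical parser for the file:lo-hi notation; not venti code.
# Assumes binary multiples for the k, m and g suffixes.
UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}

def parse_offset(text):
    """Return a byte offset, or None if the field was omitted."""
    if text == "":
        return None
    suffix = text[-1].lower()
    if suffix in UNITS:
        return int(text[:-1]) * UNITS[suffix]
    return int(text)

def parse_range(arg):
    """Split 'file:lo-hi' into (file, lo, hi); plain names pass through."""
    name, colon, span = arg.rpartition(":")
    if not colon or "-" not in span:
        return arg, None, None
    lo, _, hi = span.partition("-")
    try:
        return name, parse_offset(lo), parse_offset(hi)
    except ValueError:
        # Not a numeric range after all; treat the whole thing as a name.
        return arg, None, None
```
.EE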
.PP
.I Fmtindex
reads the configuration file
.I venti.conf
and initializes the index sections to form a usable index structure.
The arena files and index sections must have previously been formatted
using
.I fmtarenas
and
.I fmtisect
respectively.
.PP
The function of a venti index is to map a SHA1 fingerprint to a location
in the data section of one of the arenas. The index is composed of
blocks, each of which contains the mapping for a fixed range of possible
fingerprint values.
.I Fmtindex
determines the mapping between SHA1 values and the blocks
of the collection of index sections. Once this mapping has been determined,
it cannot be changed without rebuilding the index.
The basic assumption in the current implementation is that the index
structure is sufficiently empty that individual blocks of the index will rarely
overflow. The total size of the index should be about 2% to 10% of
the total size of the arenas, but the exact percentage depends both on the
index block size and the compressed size of blocks stored.
See the discussion in
.MR venti (8)
for more.
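.PP
The sizing guideline above is simple arithmetic, sketched here in Python.
The function name is hypothetical, and the result is only the
2% to 10% range quoted above, not a recommendation for a particular
block size or compression ratio.
.IP
.EX
```python
def index_size_range(arena_bytes):
    """Return the (low, high) index sizes implied by the 2% to 10% guideline."""
    return arena_bytes * 2 // 100, arena_bytes * 10 // 100
```
.EE
.PP
For 100 gigabytes of arenas this gives roughly 2 to 10 gigabytes of index.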
.PP
.I Fmtindex
also computes a mapping between a linear address space and
the data section of the collection of arenas. The
.B -a
option can be used to add additional arenas to an index.
To use this feature,
add the new arenas to
.I venti.conf
after the existing arenas and then run
.I fmtindex
.BR -a .
.PP
A copy of the above mappings is stored in the header for each of the index sections.
These copies enable
.I buildindex
to restore a single index section without rebuilding the entire index.
.PP
To make it easier to bootstrap servers, the configuration
file can be stored in otherwise empty space
at the beginning of any venti partition using
.IR conf .
A partition so branded with a configuration file can
be used in place of a configuration file when invoking any
of the venti commands.
By default,
.I conf
prints the configuration stored in
.IR partition .
When invoked with the
.B -w
flag,
.I conf
reads a configuration file from
.I configfile
(or else standard input)
and stores it in
.IR partition .
.SS Checking and Rebuilding
.PP
.I Buildindex
populates the index for the Venti system described in
.IR venti.conf .
The index must have previously been formatted using
.IR fmtindex .
This command is typically used to build a new index for a Venti
system when the old index becomes too small, or to rebuild
an index after media failure.
Small errors in an index can usually be fixed with
.IR checkindex ,
but
.I checkindex
requires a large temporary workspace and
.I buildindex
does not.
.PP
Options to
.I buildindex
are:
.TF "\fL-M\fI imemsize"
.PD
.TP
.B -b
Reinitialize the Bloom filter, if any.
.TP
.B -d
`Dumb' mode; run all three passes.
.TP
.BI -i " isect
Only rebuild index section
.IR isect ;
may be repeated to rebuild multiple sections.
The name
.L none
is special and just reads the arenas.
.TP
.BI -M " imemsize
The amount of memory, in bytes, to use for caching raw disk accesses while running
.IR buildindex .
(This is not a property of the created index.)
The usual suffixes apply.
The default is 256M.
.PD
.PP
.I Checkindex
examines the Venti index described in
.IR venti.conf .
The program detects various error conditions including:
blocks that are not indexed, index entries for blocks that do not exist,
and duplicate index entries.
If requested, an attempt can be made to fix errors that are found.
.PP
The
.I tmp
file, usually a disk partition, must be large enough to store a copy of the index.
This temporary space is used to perform a merge sort of index entries
generated by reading the arenas.
.PP
Options to
.I checkindex
are:
.TP
.BI -B " blockcachesize
The amount of memory, in bytes, to use for caching raw disk accesses while running
.IR checkindex .
The default is 8k.
.TP
.B -f
Attempt to fix any errors that are found.
.PD
.PP
.I Checkarenas
examines the Venti arenas contained in the given
.IR file .
The program detects various error conditions, and optionally attempts
to fix any errors that are found.
.PP
Options to
.I checkarenas
are:
.TP
.B -a
For each arena, scan the entire data section.
If this option is omitted, only the end section of
the arena is examined.
.TP
.B -f
Attempt to fix any errors that are found.
.TP
.B -v
Increase the verbosity of output.
.PD
.SH SOURCE
.B \*9/src/cmd/venti/srv
.SH SEE ALSO
.MR venti (7) ,
.MR venti (8)
.SH BUGS
.I Buildindex
should allow an individual index section to be rebuilt.