mirror of
https://github.com/9fans/plan9port.git
synced 2025-01-12 11:10:07 +00:00
gone
This commit is contained in:
parent
c6f92061c0
commit
b7fcec2b29
1 changed files with 0 additions and 360 deletions
|
@ -1,360 +0,0 @@
|
|||
.TH VENTI.CONF 7
|
||||
.SH NAME
|
||||
venti.conf \- venti configuration
|
||||
.SH DESCRIPTION
|
||||
Venti is a SHA1-addressed archival storage server.
|
||||
See
|
||||
.IR venti (7)
|
||||
for a full introduction to the system.
|
||||
This page documents the structure and operation of the server.
|
||||
.PP
|
||||
A venti server requires multiple disks or disk partitions,
|
||||
each of which must be properly formatted before the server
|
||||
can be run.
|
||||
.SS Disk
|
||||
The venti server maintains three disk structures, typically
|
||||
stored on raw disk partitions:
|
||||
the append-only
|
||||
.IR "data log" ,
|
||||
which holds, in sequential order,
|
||||
the contents of every block written to the server;
|
||||
the
|
||||
.IR index ,
|
||||
which helps locate a block in the data log given its score;
|
||||
and optionally the
|
||||
.IR "bloom filter" ,
|
||||
a concise summary of which scores are present in the index.
|
||||
The data log is the primary storage.
|
||||
To improve the robustness, it should be stored on
|
||||
a device that provides RAID functionality.
|
||||
The index and the bloom filter are optimizations
|
||||
employed to access the data log efficiently and can be rebuilt
|
||||
if lost or damaged.
|
||||
.PP
|
||||
The data log is logically split into sections called
|
||||
.IR arenas ,
|
||||
typically sized for easy offline backup
|
||||
(e.g., 500MB).
|
||||
A data log may comprise many disks, each storing
|
||||
one or more arenas.
|
||||
Such disks are called
|
||||
.IR "arena partitions" .
|
||||
Arena partitions are filled in the order given in the configuration.
|
||||
.PP
|
||||
The index is logically split into block-sized pieces called
|
||||
.IR buckets ,
|
||||
each of which is responsible for a particular range of scores.
|
||||
An index may be split across many disks, each storing many buckets.
|
||||
Such disks are called
|
||||
.IR "index sections" .
|
||||
.PP
|
||||
The index must be sized so that no bucket is full.
|
||||
When a bucket fills, the server must be shut down and
|
||||
the index made larger.
|
||||
Since scores appear random, each bucket will contain
|
||||
approximately the same number of entries.
|
||||
Index entries are 40 bytes long. Assuming that a typical block
|
||||
being written to the server is 8192 bytes and compresses to 4096
|
||||
bytes, the active index is expected to be about 1% of
|
||||
the active data log.
|
||||
Storing smaller blocks increases the relative index footprint;
|
||||
storing larger blocks decreases it.
|
||||
To allow variation in both block size and the random distribution
|
||||
of scores to buckets, the suggested index size is 5% of
|
||||
the active data log.
|
||||
.PP
|
||||
The (optional) bloom filter is a large bitmap that is stored on disk but
|
||||
also kept completely in memory while the venti server runs.
|
||||
It helps the venti server efficiently detect scores that are
|
||||
.I not
|
||||
already stored in the index.
|
||||
The bloom filter starts out zeroed.
|
||||
Each score recorded in the bloom filter is hashed to choose
|
||||
.I nhash
|
||||
bits to set in the bloom filter.
|
||||
A score is definitely not stored in the index of any of its
|
||||
.I nhash
|
||||
bits are not set.
|
||||
The bloom filter thus has two parameters:
|
||||
.I nhash
|
||||
(maximum 32)
|
||||
and the total bitmap size
|
||||
(maximum 512MB, 2\s-2\u32\d\s+2 bits).
|
||||
.PP
|
||||
The bloom filter should be sized so that
|
||||
.I nhash
|
||||
\(ti
|
||||
.I nblock
|
||||
\(ti
|
||||
0.7
|
||||
\(<=
|
||||
0.7 \(ti
|
||||
.IR b ,
|
||||
where
|
||||
.I nblock
|
||||
is the expected number of blocks stored on the server
|
||||
and
|
||||
.I b
|
||||
is the bitmap size in bits.
|
||||
The false positive rate of the bloom filter when sized
|
||||
this way is approximately 2\s-2\u\-\fInblock\fR\d\s+2.
|
||||
.I Nhash
|
||||
less than 10 are not very useful;
|
||||
.I nhash
|
||||
greater than 24 are probably a waste of memory.
|
||||
.I Fmtbloom
|
||||
(see
|
||||
.IR venti-fmt (8))
|
||||
can be given either
|
||||
.I nhash
|
||||
or
|
||||
.IR nblock ;
|
||||
if given
|
||||
.IR nblock ,
|
||||
it will derive an appropriate
|
||||
.IR nhash .
|
||||
.SS Memory
|
||||
Venti can make effective use of large amounts of memory
|
||||
for various caches.
|
||||
.PP
|
||||
The
|
||||
.I "lump cache
|
||||
holds recently-accessed venti data blocks, which the server refers to as
|
||||
.IR lumps .
|
||||
The lump cache should be at least 1MB but can profitably be much larger.
|
||||
The lump cache can be thought of as the level-1 cache:
|
||||
read requests handled by the lump cache can
|
||||
be served instantly.
|
||||
.PP
|
||||
The
|
||||
.I "block cache
|
||||
holds recently-accessed
|
||||
.I disk
|
||||
blocks from the arena partitions.
|
||||
The block cache needs to be able to simultaneously hold two blocks
|
||||
from each arena plus four blocks for the currently-filling arena.
|
||||
The block cache can be thought of as the level-2 cache:
|
||||
read requests handled by the block cache are slower than those
|
||||
handled by the lump cache, since the lump data must be extracted
|
||||
from the raw disk blocks and possibly decompressed, but no
|
||||
disk accesses are necessary.
|
||||
.PP
|
||||
The
|
||||
.I "index cache
|
||||
holds recently-accessed or prefetched
|
||||
index entries.
|
||||
The index cache needs to be able to hold index entries
|
||||
for three or four arenas, at least, in order for prefetching
|
||||
to work properly. Each index entry is 50 bytes.
|
||||
Assuming 500MB arenas of
|
||||
128,000 blocks that are 4096 bytes each after compression,
|
||||
the minimum index cache size is about 6MB.
|
||||
The index cache can be thought of as the level-3 cache:
|
||||
read requests handled by the index cache must still go
|
||||
to disk to fetch the arena blocks, but the costly random
|
||||
access to the index is avoided.
|
||||
.PP
|
||||
The size of the index cache determines how long venti
|
||||
can sustain its `burst' write throughput, during which time
|
||||
the only disk accesses on the critical path
|
||||
are sequential writes to the arena partitions.
|
||||
For example, if you want to be able to sustain 10MB/s
|
||||
for an hour, you need enough index cache to hold entries
|
||||
for 36GB of blocks. Assuming 8192-byte blocks,
|
||||
you need room for almost five million index entries.
|
||||
Since index entries are 50 bytes each, you need 250MB
|
||||
of index cache.
|
||||
If the background index update process can make a single
|
||||
pass through the index in an hour, which is possible,
|
||||
then you can sustain the 10MB/s indefinitely (at least until
|
||||
the arenas are all filled).
|
||||
.PP
|
||||
The
|
||||
.I "bloom filter
|
||||
requires memory equal to its size on disk,
|
||||
as discussed above.
|
||||
.PP
|
||||
A reasonable starting allocation is to
|
||||
divide memory equally (in thirds) between
|
||||
the bloom filter, the index cache, and the lump and block caches;
|
||||
the third of memory allocated to the lump and block caches
|
||||
should be split unevenly, with more (say, two thirds)
|
||||
going to the block cache.
|
||||
.SS Network
|
||||
The venti server announces two network services, one
|
||||
(conventionally TCP port
|
||||
.BR venti ,
|
||||
17034) serving
|
||||
the venti protocol as described in
|
||||
.IR venti (7),
|
||||
and one serving HTTP
|
||||
(conventionally TCP port
|
||||
.BR venti ,
|
||||
80).
|
||||
.PP
|
||||
The venti web server provides the following
|
||||
URLs for accessing status information:
|
||||
.TP
|
||||
.B /index
|
||||
A summary of the usage of the arenas and index sections.
|
||||
.TP
|
||||
.B /xindex
|
||||
An XML version of
|
||||
.BR /index .
|
||||
.TP
|
||||
.B /storage
|
||||
Brief storage totals.
|
||||
.TP
|
||||
.BI /set/ variable
|
||||
The current integer value of
|
||||
.IR variable .
|
||||
Variables are:
|
||||
.BR compress ,
|
||||
whether or not to compress blocks
|
||||
(for debugging);
|
||||
.BR logging ,
|
||||
whether to write entries to the debugging logs;
|
||||
.BR stats ,
|
||||
whether to collect run-time statistics;
|
||||
.BR icachesleeptime ,
|
||||
the time in milliseconds between successive updates
|
||||
of megabytes of the index cache;
|
||||
.BR arenasumsleeptime ,
|
||||
the time in milliseconds between reads while
|
||||
checksumming an arena in the background.
|
||||
The two sleep times should be (but are not) managed by venti;
|
||||
they exist to provide more experience with their effects.
|
||||
The other variables exist only for debugging and
|
||||
performance measurement.
|
||||
.TP
|
||||
.BI /set/ variable / value
|
||||
Set
|
||||
.I variable
|
||||
to
|
||||
.IR value .
|
||||
.TP
|
||||
.BI /graph/ name / param / param / \fR...
|
||||
A PNG image graphing the named run-time statistic over time.
|
||||
The details of names and parameters are undocumented;
|
||||
see
|
||||
.B httpd.c
|
||||
in the venti sources.
|
||||
.TP
|
||||
.B /log
|
||||
A list of all debugging logs present in the server's memory.
|
||||
.TP
|
||||
.BI /log/ name
|
||||
The contents of the debugging log with the given
|
||||
.IR name .
|
||||
.TP
|
||||
.B /flushicache
|
||||
Force venti to begin flushing the index cache to disk.
|
||||
The request response will not be sent until the flush
|
||||
has completed.
|
||||
.TP
|
||||
.B /flushdcache
|
||||
Force venti to begin flushing the arena block cache to disk.
|
||||
The request response will not be sent until the flush
|
||||
has completed.
|
||||
.PD
|
||||
.PP
|
||||
Requests for other files are served by consulting a
|
||||
directory named in the configuration file
|
||||
(see
|
||||
.B webroot
|
||||
below).
|
||||
.SS Configuration File
|
||||
A venti configuration file
|
||||
enumerates the various index sections and
|
||||
arenas that constitute a venti system.
|
||||
The components are indicated by the name of the file, typically
|
||||
a disk partition, in which they reside. The configuration
|
||||
file is the only location that file names are used. Internally,
|
||||
venti uses the names assigned when the components were formatted
|
||||
with
|
||||
.I fmtarenas
|
||||
or
|
||||
.I fmtisect
|
||||
(see
|
||||
.IR venti-fmt (8)).
|
||||
In particular, only the configuration needs to be
|
||||
changed if a component is moved to a different file.
|
||||
.PP
|
||||
The configuration file consists of lines in the form described below.
|
||||
Lines starting with
|
||||
.B #
|
||||
are comments.
|
||||
.TP
|
||||
.BI index " name
|
||||
Names the index for the system.
|
||||
.TP
|
||||
.BI arenas " file
|
||||
.I File
|
||||
is an arena partition, formatted using
|
||||
.IR fmtarenas .
|
||||
.TP
|
||||
.BI isect " file
|
||||
.I File
|
||||
is an index section, formatted using
|
||||
.IR fmtisect .
|
||||
.PP
|
||||
After formatting a venti system using
|
||||
.IR fmtindex ,
|
||||
the order of arenas and index sections should not be changed.
|
||||
Additional arenas can be appended to the configuration;
|
||||
run
|
||||
.I fmtindex
|
||||
with the
|
||||
.B -a
|
||||
flag to update the index.
|
||||
.PP
|
||||
The configuration file also holds configuration parameters
|
||||
for the venti server itself.
|
||||
These are:
|
||||
.TF httpaddr netaddr
|
||||
.TP
|
||||
.BI mem " size
|
||||
lump cache size
|
||||
.TP
|
||||
.BI bcmem " size
|
||||
block cache size
|
||||
.TP
|
||||
.BI icmem " size
|
||||
index cache size
|
||||
.TP
|
||||
.BI addr " netaddr
|
||||
network address to announce venti service
|
||||
(default
|
||||
.BR tcp!*!venti )
|
||||
.TP
|
||||
.BI httpaddr " netaddr
|
||||
network address to announce HTTP service
|
||||
(default
|
||||
.BR tcp!*!http )
|
||||
.TP
|
||||
.B queuewrites
|
||||
queue writes in memory
|
||||
(default is not to queue)
|
||||
.PD
|
||||
See the server description in
|
||||
.IR venti (8)
|
||||
for explanations of these variables.
|
||||
.SH EXAMPLE
|
||||
.IP
|
||||
.EX
|
||||
index main
|
||||
isect /tmp/disks/isect0
|
||||
isect /tmp/disks/isect1
|
||||
arenas /tmp/disks/arenas
|
||||
mem 10M
|
||||
bcmem 20M
|
||||
icmem 30M
|
||||
.EE
|
||||
.SH "SEE ALSO"
|
||||
.IR venti (8),
|
||||
.IR venti-fmt (8)
|
||||
.SH BUGS
|
||||
Setting up a venti server is too complicated.
|
||||
.PP
|
||||
Venti should not require the user to decide how to
|
||||
partition its memory usage.
|
Loading…
Reference in a new issue