Yeah, I was basically treating the lower process runs (<64) as in-memory
performance and the higher process runs as a mix (since, for example,
the 128 run deals with ~8 GB of data and I only have 1.25 GB of RAM).
> You'll find that running
> echo 80 0 0 0 500 3000 90 0 0 > /proc/sys/vm/bdflush
> will boost your dbench throughput muchly.
Yeah, actually, I've been sorta "brute-forcing" the bdflush and
max-readahead space (or the part of it that I chose for a start) over
the past few days for bonnie++ and dbench. The idea was to use these
quicker-running benchmarks to get a general idea of good values to use
and then zero in on the final values with longer, more real-world load.
I was thinking that bonnie++ would at least give me an idea of
sequential read/write performance for files larger than RAM (one part of
the typical workload I see is moving large files out to multiple [32-64
or so] machines at the same time) and that dbench would give me an idea
of performance for many small read/write operations, both for cached and
on-disk data (another aspect of the workload I see is reading/writing
many small files from multiple machines, such as postprocessing the
results of some large computational run). Oh, I don't think I actually
mentioned that I'm looking to tune fileservers here for medium-sized
(100-200 node) computational clusters and that in the end there will be
something much more powerful than a single SCSI disk in the backend.
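
For what it's worth, the brute-forcing is roughly along the lines of the sketch below. It is only an illustration under some assumptions: the field positions in the bdflush line are taken from the quoted "80 0 0 0 500 3000 90 0 0" example (nfract first, interval fifth, age_buffer sixth, nfract_sync seventh, everything else left at 0), the /proc/sys/vm/max-readahead path is assumed, the grid holds only values that appear in the results table rather than the whole space, and the names GRID, bdflush_line, sweep and apply are made up for the example.

```python
import itertools

# Values that show up in the top-10 table; an assumption, not
# necessarily the full space that was swept.
GRID = {
    "nfract": [30, 50, 70],
    "interval": [100, 300, 500, 700, 900],
    "age_buffer": [1000],
    "nfract_sync": [50, 70, 90],
    "max_readahead": [63, 127, 255, 1023, 2047],
}


def bdflush_line(nfract, interval, age_buffer, nfract_sync):
    """Build a nine-field bdflush line, leaving the untuned fields at 0.

    Field positions are assumed from the quoted example line.
    """
    return f"{nfract} 0 0 0 {interval} {age_buffer} {nfract_sync} 0 0"


def sweep(grid=GRID):
    """Yield (key, bdflush line, max_readahead) for every combination."""
    names = ["nfract", "interval", "age_buffer", "nfract_sync", "max_readahead"]
    for values in itertools.product(*(grid[name] for name in names)):
        nfract, interval, age_buffer, nfract_sync, readahead = values
        # Same nfract-interval-age_buffer-nfract_sync-max_readahead key
        # format as the row labels in the results table.
        key = "-".join(str(v) for v in values)
        yield key, bdflush_line(nfract, interval, age_buffer, nfract_sync), readahead


def apply(line, readahead,
          bdflush_path="/proc/sys/vm/bdflush",
          readahead_path="/proc/sys/vm/max-readahead"):
    """Write one combination to the (assumed) tunable paths; needs root."""
    with open(bdflush_path, "w") as f:
        f.write(line + "\n")
    with open(readahead_path, "w") as f:
        f.write(f"{readahead}\n")
```

Each combination from sweep() would be passed to apply() and followed by a benchmark run, with the key kept alongside the reported throughput.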
FWIW, the top 10 bdflush/max-readahead combinations for dbench (sorted
by the 128-process result) that I've seen so far are:
                          16       32       64      128
                    -------- -------- -------- --------
70-900-1000-90-2047  208.056  159.598  144.721  122.514
30-100-1000-50-127   113.829  101.820  110.699  120.017
70-500-1000-90-2047  209.547  150.172  142.556  115.979
30-300-1000-90-63    108.862  118.443  109.060  112.901
30-100-1000-50-63    113.904   96.648  113.969  112.021
50-700-1000-90-63    208.062  137.579  134.504  111.656
30-500-1000-50-255   111.955   97.373  115.360  111.004
30-100-1000-70-1023  115.110   99.823  122.720  110.016
70-300-1000-90-1023  220.096  169.194  160.025  109.753
70-700-1000-90-255   208.468  146.202  140.098  109.618
(with the numbers on the left being
nfract-interval-age_buffer-nfract_sync-max_readahead, the column entries
being the non-adjusted MB/s that dbench reports, and the columns being
the number of processes). Unfortunately, these are a bit bunk because I
haven't run the tests enough times to average the results to remove
variance between runs.
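
To get past that, the repeated runs could be averaged with something like the sketch below. It is a rough outline, not what I've actually run: it assumes each dbench result has been recorded as a (config, nprocs, MB/s) tuple, and the function names summarize and rank are made up for the example.

```python
import statistics
from collections import defaultdict


def summarize(runs):
    """Group repeated runs and compute per-group statistics.

    runs: iterable of (config, nprocs, mb_per_s) tuples, one per dbench run.
    Returns {(config, nprocs): (mean, stdev, count)}.
    """
    samples = defaultdict(list)
    for config, nprocs, mb_per_s in runs:
        samples[(config, nprocs)].append(mb_per_s)

    summary = {}
    for key, values in samples.items():
        mean = statistics.mean(values)
        # The sample standard deviation needs at least two runs.
        stdev = statistics.stdev(values) if len(values) > 1 else None
        summary[key] = (mean, stdev, len(values))
    return summary


def rank(summary, nprocs=128):
    """Return (config, (mean, stdev, count)) pairs for one process count,
    sorted by mean throughput, highest first."""
    rows = [(config, stats)
            for (config, n), stats in summary.items() if n == nprocs]
    return sorted(rows, key=lambda row: row[1][0], reverse=True)
```

Keeping the standard deviation next to the mean makes it easier to tell whether two neighboring rows in the ranking really differ or are within run-to-run noise.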
If you have any suggestions on better ways than dbench to somewhat
quickly simulate performance for many clients hitting a fileserver at
the same time, I'd love to hear them.
-- Jason Holmes