3ware RAID array tests with Linux 2.4.19 on a P4DPE with twin 2.4 GHz CPUs
Why are we doing these tests?
We have an opportunity to use a 10 Gbit WAN between Sunnyvale and Chicago for testing, and we want to show 1 GByte/sec file transfers across this link. That means we need servers at each end of the link that can source and sink data at that rate. We know we can achieve WAN speeds of more than ~200 MByte/sec using optimal TCP parameters and twin SysKonnect cards in a single server, so we need to find a disk subsystem that can read and write at least 200 MByte/sec; we will then put four or five such servers at each site. One challenge will be finding a hardware/software configuration that lets us hammer the NICs and the disks simultaneously at this rate.
Server Details
The hardware configuration is a Supermicro P4DPE-GE with dual P4 2.4 GHz CPUs, 2 GB ECC DDR RAM, and twin 3ware 7850 IDE/RAID controllers. Sixteen 120 GB Western Digital 7200 RPM IDE disks are attached to the 3ware controllers; a seventeenth disk is the system disk. The 3ware controllers are seated in slots #1 and #3 on the motherboard.
The software is Red Hat Linux with a 2.4.19 kernel. The 3ware drivers are version 1.02.00.27 and the firmware is version 1.05.01.034. The benchmarking software is bonnie++ version 1.02c.
bonnie++ command used for all measurements:
bonnie++ -d /raid -s 5000:64k -mServer -r2048 -x10 -u0 -f -q
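For reference, the bonnie++ flags used here mean the following (see the bonnie++ manual page):

#   -d /raid      directory to run the tests in (the RAID mount point)
#   -s 5000:64k   5000 MB test file, with a 64 kB chunk size for block I/O
#   -mServer      machine label reported in the output
#   -r2048        RAM size in MB, used to check the test file is large enough
#   -x10          repeat the whole test 10 times
#   -u0           run as user id 0 (root)
#   -f            fast mode: skip the slow per-character I/O tests
#   -q            quiet mode: machine-readable output only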
All numbers are for block read and write speeds as reported by bonnie++ (which we independently verified with our own benchmark code, obtaining the same numbers). In the configuration described above the 3ware cards are in PCI slots 1 and 3; better performance is obtained with the cards in slots 1 and 6, i.e. not sharing the same PCI I/O controller. We purchased our systems from ACME Computer. Contact me if you'd like detailed information on the price (around $8k).
1) First, with disks in RAID5 on the 3ware controllers, 64 kB block size, software RAID-0 device chunk size 512 kB:

XFS filesystem: read @ 140-202 MB/sec
write @ 80-88 MB/sec (11 tests)
JFS filesystem: read @ 180-207 MB/sec
write @ 62-73 MB/sec (2 tests)
reiserfs filesystem: read @ 145-177 MB/sec
write @ 106-107 MB/sec (2 tests)

In general we note the wide variation in read rates in all the tests, and for all filesystem types.
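For reference, a software RAID-0 stripe over the two 3ware units would typically be built with the 2.4-era raidtools. The sketch below is only illustrative; the device names (/dev/sda and /dev/sdb for the two 3ware arrays) are assumptions, not necessarily our exact configuration.

# /etc/raidtab (illustrative): stripe the two 3ware units into one md device
raiddev /dev/md0
    raid-level            0
    nr-raid-disks         2
    persistent-superblock 1
    chunk-size            512     # stripe chunk size in kB
    device                /dev/sda
    raid-disk             0
    device                /dev/sdb
    raid-disk             1

# Build the md device, make an XFS filesystem on it, and mount it at /raid
mkraid /dev/md0
mkfs.xfs /dev/md0
mount /dev/md0 /raid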
2) As above, but with a patch applied:

XFS filesystem: read @ 142-172 MB/sec
write @ 80-86 MB/sec (2 tests)

After noting these results, we ran the remaining tests without this patch.
3) Now with disks in RAID0 on the 3ware controllers, 64 kB block size, software RAID-0 device chunk size 512 kB:

XFS filesystem: read @ 189-198 MB/sec
write @ 160-168 MB/sec (2 tests)
JFS filesystem: read @ 209-218 MB/sec
write @ 117-152 MB/sec (2 tests)
reiserfs filesystem: read @ 188-189 MB/sec
write @ 139-140 MB/sec (2 tests)
4) Repositioning one of the 3ware controllers to Slot 6 (133 MHz):
XFS filesystem: read @ 224-255 MB/sec
write @ 181-193 MB/sec
(2 tests)
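As a generic check (not specific to this system), lspci shows which PCI bus segment each 3ware card ends up on; the grep pattern assumes the cards identify themselves as "3ware" in the lspci output.

lspci | grep -i 3ware    # list the 3ware controllers with their bus:slot numbers
lspci -t                 # show the PCI bus topology as a tree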
5) Using various chunk sizes in the software RAID

Using the XFS filesystem (except where indicated), which has a 4 kB block size.
Chunk size for software stripe (kB) | Read speed (MB/sec) | Write speed (MB/sec)
4                                   | 160-161             | 181-193
64                                  | 172-177             | 180-192
128                                 | 195-200             | 180-192
256                                 | 226-227             | 180-192
512                                 | 224-255             | 181-193
1024                                | 190-249             | 180-191
512 (JFS)                           | 256-256             | 149-173
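For reference, changing the stripe chunk size means recreating the md device and the filesystem on top of it. A sketch of one iteration with the raidtools, assuming the illustrative raidtab above and that the array holds nothing worth keeping:

umount /raid
raidstop /dev/md0
# edit /etc/raidtab and set, e.g.,  chunk-size  256
mkraid --really-force /dev/md0    # destroys any existing data on the array
mkfs.xfs -f /dev/md0              # or mkfs.jfs /dev/md0 for the JFS row
mount /dev/md0 /raid
bonnie++ -d /raid -s 5000:64k -mServer -r2048 -x10 -u0 -f -q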
6) Now with the 3ware RAID0 arrays reconfigured to use 128 kB chunks, 512 kB software stripe chunk size:

XFS filesystem: read @ 216-245 MB/sec
write @ 179-191 MB/sec (2 tests)
JFS filesystem: read @ 256-271 MB/sec
write @ 148-167 MB/sec (2 tests)

The JFS read result is the best yet.
7) At the suggestion of Dan Yocum: turn hyperthreading OFF in the BIOS

XFS filesystem: read @ 213-224 MB/sec
write @ 179-187 MB/sec (2 tests)

This change appears (within measurement errors) to make little difference.
8) At the suggestion of Bruce Allen, increase the read-ahead parameters:
echo 256 > /proc/sys/vm/max-readahead
echo 128 > /proc/sys/vm/min-readahead
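These are 2.4-kernel VM tunables; the current values can be checked with:

cat /proc/sys/vm/max-readahead /proc/sys/vm/min-readahead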
Results are in the table below: the change seems to have improved the write speed!
Chunk size for software stripe (kB) | Read speed (MB/sec) | Write speed (MB/sec)
512 (without read-ahead mod)        | 224-255             | 181-193
512 (with the above mod)            | 225-229             | 195-209
General comment: I'm quoting these bandwidth numbers to three-digit precision, but I don't believe they are more accurate than +/- 10 MB/sec.
9) Applied the following "Yocum Settings":
echo "# Configure bdflush params" >> /etc/sysctl.conf
echo "vm.bdflush = 100 1200 128 512 15 5000 500 1884 2" >> /etc/sysctl.conf
echo "#configure vm readahead for 3ware controllers" >> /etc/sysctl.conf
echo "vm.max-readahead = 256" >> /etc/sysctl.conf
echo "vm.min-readahead = 128" >> /etc/sysctl.conf
echo "# Increase the number of file handles" >> /etc/sysctl.conf
echo "fs.file-max = 32768 " >> /etc/sysctl.conf
echo "# Increase the amount of maximum shared memory" >> /etc/sysctl.conf
echo "kernel.shmmax=1073741824" >> /etc/sysctl.conf
Chunk size for software stripe (kB) | Read speed (MB/sec) | Write speed (MB/sec)
512 (without Yocum Settings)        | 224-255             | 181-193
512 (with Yocum Settings)           | 228-232             | 196-208
Again, some marginal improvement in write speed.
10) CPU Load

CPU loads are around 80% during the block read tests and 88% during block writes. That is 80%/88% of each of the two CPUs, or 160%/176% of the combined 200% capacity, which leaves 20%/12% of the system for the NICs. This will be marginal: we know that 1.8 Gbit/sec on two NICs requires 23% of each CPU, i.e. 23% of the system, which barely fits in the 20% left during reads and exceeds the 12% left during writes.
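For reference, CPU utilisation like this can be watched while a benchmark is running, for example with vmstat in a second terminal (a generic approach; not necessarily how the numbers above were obtained):

vmstat 5    # sample user/system/idle CPU percentages every 5 seconds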
CONCLUSION:
We are not able to reproduce the excellent numbers described at http://home.fnal.gov/~yocum/storageServerTechnicalNote.html. It appears that the best performance for read-orientated and mixed workloads is obtained with JFS, and for write-orientated workloads with XFS.

Julian Bunn (with many thanks to Jan Lindheim for the really tricky Linux work!), November 2002.