Date:      Thu, 21 Sep 1995 00:13:46 +1000
From:      Bruce Evans <bde@zeta.org.au>
To:        bde@zeta.org.au, rgrimes@GndRsh.aac.dev.com
Cc:        current@FreeBSD.ORG, rkw@dataplex.net, wollman@lcs.mit.edu
Subject:   Re: Which SUP files are available and where ?
Message-ID:  <199509201413.AAA31890@godzilla.zeta.org.au>

>> On a disk that has an iozone speed of 4-5MB/s
>> here, the throughput of `cvs -Q co bin' is 30K/sec (2562K, 85.05 real,
>> 12.69 user, 20.96 sys) (the cvs repository is on a separate disk).  The
>> throughput of `cp -pR bin bin~' is 79K/sec (2562K, 33.41 real, 0.10 user,
>> 3.04 sys).  The throughput of `cp -pR bin separate_slower_disk/bin' is
>> 56K (2562K, 46.50 real, 0.10 user, 3.50 sys).  Abysmal results like this
>> are typical for accessing small files.

>Why is this?  And what can be done to speed it up?  Is this all
>metadata overhead?  I think you are losing on the write side here, not
>the read side.  How fast does cp -pR go when the destination is an MFS?

I think it's mostly from synchronous writes, but there are apparently
still performance bugs in the caches.  Some other benchmarks on the same
file system (all this is on a 16MB DX2/66 system):

	du -a bin | wc:	1.36 real 0.05 user 0.34 sys
	du -a bin | wc:	1.36 real 0.05 user 0.34 sys
	# (598 files)

The drive LED is on for most of the time during the second test.  This
shows that the results of the first test are not being cached.  I have
run similar tests under Minix and Linux on the old Minix file system
that show up to 20000 files being cached in only 1MB of buffer cache
(the [iv]node cache gets wiped out but the buffer cache doesn't).  The
old Minix file system has only 48 bytes of metadata per file so 1MB can
cache about 3 times as many files as for ufs.  FreeBSD seems to be
limited to caching the metadata for only about 326 files :-(.
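The arithmetic behind that factor of three, spelled out (a sketch: the
48-byte figure is the old Minix FS one quoted above; the ufs per-file
figure is inferred from the 3x claim, not measured):

```shell
# Files whose metadata fits in 1MB of buffer cache, assuming
# ~48 bytes/file (old Minix FS) vs. roughly 3x that (~144) for ufs.
# (expr does integer division only.)
expr 1048576 / 48     # old Minix FS: → 21845
expr 1048576 / 144    # ufs (inferred figure): → 7281
```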

The cache performance bugs used to be that `numvnodes' was too small
(it was 0x459 for the above test), and that only a limited number of
vnodes can be kept in memory (because the blocks containing the vnodes
aren't kept in memory), and that thrashing of the vnode cache thrashed
the buffer cache (because buffers are attached to vnodes), and that
only a limited number of directories can be kept in memory (because
the buffers for the directories are attached to vnodes).  Minix and
Linux didn't have these problems because they cache blocks as blocks
and don't attach them to vnodes.

It may be that caching all the metadata found by an operation such
as `du -a /' is a poor use of memory.  However, with raw i/o speeds
of 5MB/sec and metadata access speeds of perhaps 100K/sec, as is
typical for modern SCSI drives and controllers, caching metadata
should have a much higher priority than caching data.  The correct
balance between caching metadata and leaving memory free for
running programs is less clear.

Now another benchmark:

	diff -r bin bin~:	21.10 real 0.84 user 1.98 sys
	diff -r bin bin~:	22.08 real 0.87 user 2.19 sys
	du -a bin | wc:		 0.30 real 0.07 user 0.21 sys

This shows that reading scattered files is almost as slow as writing
them.  The du time has now improved!  Somehow the caches' effectiveness
has been improved so that no i/o is done for the du.

	tar cf /dev/null bin:	 5.73 real 0.17 user 1.03 sys
	tar cf /dev/null bin:	 2.02 real 0.14 user 0.81 sys

The time for the first tar probably represents the file system's speed
when there is no data in the cache.  It is only 447K/sec.  The drive LED
was on for part of the time for the second tar, so it's not clear what
is being measured.  The speed is now 1268K/sec, almost 25% of the raw
disk speed.  I think most of the data was in the cache but most of the
metadata wasn't, so the speed is limited to that of du -a, so it is much
too slow.
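For reference, the throughput figures here are just tree size over
elapsed time; a sketch of the computation, using the sizes and times
from the two tar runs above:

```shell
# 2562K read in 5.73s (first tar) and 2.02s (second tar);
# awk handles the floating point that expr cannot.
awk 'BEGIN { printf "%.0fK/sec\n", 2562 / 5.73 }'   # → 447K/sec
awk 'BEGIN { printf "%.0fK/sec\n", 2562 / 2.02 }'   # → 1268K/sec
```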

>I don't use cp -pR, I use cpio -pdamu --block-size=16 and that cruises
>along pretty well.  I have never measured the speed, but I do know
>it is significantly faster than cp -pR.

If it is faster, then cpio must be doing a better job of buffering than
the file system.  cp probably loses compared with a tar or cpio pipeline
with a large block size by doing only one file at a time.  However:

	find bin | time \
	cpio -pdamu --block-size=16 bin~:	31.47 real 0.23 user 3.43 sys

has much the same speed as cp -pR.
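For completeness, here is the sort of tar pipeline meant above, run on
a hypothetical throwaway tree (flags are illustrative; a large block
size would go on both ends of the pipe):

```shell
# Build a small demonstration tree, then copy it through a pipeline
# instead of cp -pR.  -C changes directory; p preserves modes.
mkdir -p src/sub dst
echo hello > src/sub/file
tar cf - -C src . | tar xpf - -C dst
```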

Here are the above benchmarks packaged in a script:
---
#!/bin/sh
time cvs -Q co bin
time cvs -Q co bin
time cp -pR bin bin~
time cp -pR bin/* bin~
time du -a bin >/dev/null
time du -a bin >/dev/null
time diff -r bin bin~
time diff -r bin bin~
time du -a bin >/dev/null
time du -a bin >/dev/null
time tar cf /dev/null bin
time tar cf /dev/null bin
time rm -rf bin~
find bin | time cpio -pdamu --block-size=16 bin~
find bin | time cpio -pdamu --block-size=16 bin~
time rm -rf bin bin~
---
and the results of the script run in /tmp/bde on freefall:
---
       77.31 real         6.33 user        14.95 sys
       19.36 real         0.57 user         1.16 sys
       46.90 real         0.06 user         2.03 sys
       36.63 real         0.07 user         1.61 sys
        3.25 real         0.04 user         0.17 sys
        0.20 real         0.04 user         0.13 sys
        9.18 real         0.28 user         1.11 sys
       27.99 real         0.45 user         1.19 sys
        1.17 real         0.04 user         0.14 sys
        0.22 real         0.05 user         0.13 sys
        3.85 real         0.09 user         0.58 sys
        1.70 real         0.11 user         0.49 sys
       20.08 real         0.02 user         0.55 sys
264 blocks
       65.07 real         0.14 user         2.40 sys
264 blocks
       86.87 real         0.17 user         2.70 sys
       40.94 real         0.06 user         1.03 sys
---
and the results on my system again:
---
       84.35 real        12.61 user        20.56 sys
       19.85 real         1.08 user         1.69 sys
       34.23 real         0.17 user         3.02 sys
       21.58 real         0.10 user         2.59 sys
        1.64 real         0.13 user         0.22 sys
        0.29 real         0.09 user         0.19 sys
       14.09 real         0.95 user         1.65 sys
       21.99 real         0.81 user         2.07 sys
        1.45 real         0.07 user         0.26 sys
        0.30 real         0.09 user         0.18 sys
        8.62 real         0.13 user         0.98 sys
        2.31 real         0.15 user         0.87 sys
       14.69 real         0.06 user         0.69 sys
264 blocks
       28.87 real         0.19 user         3.31 sys
264 blocks
       37.45 real         0.26 user         4.02 sys
       26.93 real         0.09 user         1.56 sys
---

There are similar anomalies for the cp, diff and cpio pairs, and my system
at least is otherwise unloaded, so these must be due to caching strategies:

1) the second cp is significantly faster than the first.
2) the second diff is significantly slower than the first.
3) the second cpio is significantly slower than the first.
4) freefall seems to be relatively slower at cpio.  Perhaps it was
   using the file system that /tmp is on for something else when
   it was doing the cpio.
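One way to check whether these pair-to-pair differences are stable
would be to repeat each timed pass a few more times, e.g. (a sketch on
a hypothetical tree; the later times should agree with each other if
caching alone explains the differences):

```shell
# Make a small tree and time repeated traversals of it; after the
# first pass everything should be cached, so passes 2 and 3 should
# take about the same time.
mkdir -p tree/sub
touch tree/a tree/sub/b
for pass in 1 2 3; do
	time du -a tree >/dev/null
done
```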

Bruce


