Date: Fri, 13 Nov 1998 09:47:03 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
To: Marc Slemko <marcs@znep.com>
Cc: David Wolfskill <dhw@whistle.com>, hackers@FreeBSD.ORG
Subject: Re: dump(8) very slow
Message-ID: <199811131747.JAA05198@apollo.backplane.com>
References: <Pine.BSF.4.05.9811130909500.12077-100000@alive.znep.com>
:Right, or you can just pipe dump through something like "team" which will
:buffer it. Really, that is the biggest problem I find with dump. On many
:systems, it isn't that it can't read quickly enough to fill the tape it is
:that there isn't enough buffering to let it do so smoothly, and when the
:tape isn't streaming it isn't a happy camper.
:
:You do, however, lose some things like reliable end of tape detection.
For BEST I wrote something similar to amanda (though never having
used amanda I couldn't compare them beyond that). The machine I've
been running all those SCSI tests on is our backup1.ba.best.com box.
It's idle during the day, active at night.
The backup system is separated into two parts: The first piece runs remote
dumps over ssh and writes the dump files to the disk buffer, doing up to
8 dumps in parallel, and the second piece copies the dump files from
the disk buffer to the two tape units. The disk buffer consists of four
9G drives striped together into a single ccd partition, using a 64 KByte
stripe.
Since the backups are occurring through encrypted ssh connections, the
machine tends to be CPU bound rather than I/O bound. We plan to upgrade
it to a FreeBSD-current dual-Pentium box soon. A PPro 200 can handle
around 2.5 MBytes/sec worth of ssh-encrypted data, and the dumps running
on the remote machines cannot generally do better than 400 KB/sec because
the machines are active, plus I gzip -2 the dump output on
the remote machine before it comes back through the ssh pipe. Exabyte
tape drives are not very space efficient if you can't stream them, hence
the 36 GB 'whole dumps' disk buffer.
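
As a rough check on the "CPU bound" claim, the quoted figures can be plugged
into a few lines of Python. This is only a sketch: it assumes the 400 KB/sec
figure is the rate each dump delivers through the ssh pipe, that all 8 dumps
hit that ceiling at once, and that 1 MByte = 1000 KB. The names are
illustrative, not part of the actual backup system.

```python
# Rough ingest estimate using the figures quoted in the message.
PARALLEL_DUMPS = 8          # "up to 8 dumps in parallel"
PER_DUMP_KB_S = 400         # ceiling per remote dump, KB/sec
SSH_CPU_LIMIT_MB_S = 2.5    # ssh-encrypted data a PPro 200 can handle
KB_PER_MB = 1000            # assumption: decimal units, good enough here


def offered_load_mb_s(dumps=PARALLEL_DUMPS, per_dump_kb_s=PER_DUMP_KB_S):
    """Aggregate rate the remote dumps could deliver, in MBytes/sec."""
    return dumps * per_dump_kb_s / KB_PER_MB


def is_cpu_bound(dumps=PARALLEL_DUMPS):
    """True when the offered load exceeds what ssh decryption can absorb."""
    return offered_load_mb_s(dumps) > SSH_CPU_LIMIT_MB_S


if __name__ == "__main__":
    print("offered load (MB/s):", offered_load_mb_s())
    print("ssh CPU limit (MB/s):", SSH_CPU_LIMIT_MB_S)
    print("CPU bound:", is_cpu_bound())
```

Under these assumptions the eight dumps could offer about 3.2 MBytes/sec,
which is more than the 2.5 MBytes/sec the CPU can decrypt, so the box runs out
of CPU before it runs out of disk or network.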
What I like the most about the system is that if a tape unit has an error,
the dumps related to that tape are simply left in the disk buffer while
dumps for other tapes run merrily on their way, allowing a human to fix
the problem without interrupting the dump process or having to redump
a machine. If the disk buffer fills up, my frontend simply goes into a
sleep-retry-loop until the backend drains it. Since these Exabytes can
do 3 MBytes/sec, the disk buffer usually does not fill up.
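
The sleep-retry behaviour of the frontend could look something like the
following Python sketch. The message does not say how the frontend decides
the buffer is full or how long it sleeps, so the free-space check, the
60-second poll interval, and the function name are all assumptions made for
illustration.

```python
import shutil
import time


def wait_for_space(path, needed_bytes, poll_seconds=60,
                   free_bytes=lambda p: shutil.disk_usage(p).free,
                   sleep=time.sleep):
    """Block until `path` has at least `needed_bytes` free.

    Returns the number of sleep/retry rounds that were needed.
    `free_bytes` and `sleep` are injectable so the loop can be
    exercised without a real disk or real delays.
    """
    retries = 0
    while free_bytes(path) < needed_bytes:
        sleep(poll_seconds)   # give the tape backend time to drain
        retries += 1
    return retries
```

A frontend built this way would call wait_for_space() on the disk buffer's
mount point before starting each remote dump, and simply proceed once the
backend has copied enough dump files off to tape.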
I also like the system because I do not depend on realtime pipes at all,
so I can guarantee 100% streaming to two tape units. Worst case disk I/O
is 3*2 + 2.5 = 8.5 MBytes/sec, well within the 30 MByte/sec bandwidth
of the ccd (even assuming a 2x loss in efficiency from the parallelism).
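
The worst-case arithmetic above can be written out as a small Python sketch.
All figures come from the message; the variable names are only illustrative,
and the 2x penalty is applied by halving the ccd's 30 MBytes/sec.

```python
# Worst-case disk buffer load, using the figures quoted in the message.
TAPE_UNITS = 2              # two Exabyte drives reading from the buffer
TAPE_MB_S = 3.0             # each can stream about 3 MBytes/sec
INGEST_MB_S = 2.5           # ssh-limited rate of dumps written to the buffer
CCD_MB_S = 30.0             # bandwidth of the striped ccd
PARALLELISM_PENALTY = 2.0   # assumed 2x efficiency loss from mixed I/O

worst_case_mb_s = TAPE_UNITS * TAPE_MB_S + INGEST_MB_S
usable_ccd_mb_s = CCD_MB_S / PARALLELISM_PENALTY
headroom_mb_s = usable_ccd_mb_s - worst_case_mb_s

if __name__ == "__main__":
    print("worst-case load (MB/s):", worst_case_mb_s)
    print("usable ccd bandwidth (MB/s):", usable_ccd_mb_s)
    print("headroom (MB/s):", headroom_mb_s)
```

Even with the ccd derated to 15 MBytes/sec, the 8.5 MBytes/sec worst case
leaves about 6.5 MBytes/sec of headroom.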
And, I might add, I think for smaller backup system installations, using
one or two DMA IDE drives (on their own controllers) would work very well
for this sort of linear transfer problem. Not that I do... I much prefer
SCSI. But it would work just fine and IBM's come out with some huge
low cost, high speed IDE drives.
-Matt
Matthew Dillon Engineering, HiWay Technologies, Inc. & BEST Internet
Communications & God knows what else.
<dillon@backplane.com> (Please include original email in any response)
