From owner-freebsd-hackers Fri Nov 13 09:47:30 1998
Received: (from majordom@localhost) by hub.freebsd.org (8.8.8/8.8.8) id JAA07533 for freebsd-hackers-outgoing; Fri, 13 Nov 1998 09:47:30 -0800 (PST) (envelope-from owner-freebsd-hackers@FreeBSD.ORG)
Received: from apollo.backplane.com (apollo.backplane.com [209.157.86.2]) by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id JAA07527 for ; Fri, 13 Nov 1998 09:47:29 -0800 (PST) (envelope-from dillon@apollo.backplane.com)
Received: (from dillon@localhost) by apollo.backplane.com (8.9.1/8.9.1) id JAA05198; Fri, 13 Nov 1998 09:47:03 -0800 (PST) (envelope-from dillon)
Date: Fri, 13 Nov 1998 09:47:03 -0800 (PST)
From: Matthew Dillon
Message-Id: <199811131747.JAA05198@apollo.backplane.com>
To: Marc Slemko
Cc: David Wolfskill , hackers@FreeBSD.ORG
Subject: Re: dump(8) very slow
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

:Right, or you can just pipe dump through something like "team" which will
:buffer it. Really, that is the biggest problem I find with dump. On many
:systems, it isn't that it can't read quickly enough to fill the tape it is
:that there isn't enough buffering to let it do so smoothly, and when the
:tape isn't streaming it isn't a happy camper.
:
:You do, however, lose some things like reliable end of tape detection.

For BEST I wrote something similar to amanda (though never having used
amanda I couldn't compare them beyond that). The machine I've been running
all those SCSI tests on is our backup1.ba.best.com box. It's idle during
the day, active at night.

The backup system is separated into two parts: The first piece runs remote
dumps over ssh and writes the dump files to the disk buffer, doing up to 8
dumps in parallel, and the second piece copies the dump files from the disk
buffer to the two tape units. The disk buffer consists of four 9G drives
striped together into a single ccd partition, using a 64 KByte stripe.

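The first piece described above (remote dumps over ssh, written to the disk buffer, at most 8 at a time) might be sketched roughly as follows. This is only an illustration, not the actual BEST code: the function names, the file naming scheme, the ".partial" rename step, and the exact dump command line are all assumptions made for the example.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

MAX_PARALLEL = 8  # from the message: up to 8 dumps in parallel


def dump_command(host, filesystem):
    """Build a hypothetical remote dump command line.

    Assumption: a level 0 dump written to stdout on the remote side.
    The real system's flags are not given in the message.
    """
    return ["ssh", host, f"dump -0 -f - {filesystem}"]


def fetch_dump(host, filesystem, buffer_dir, run=subprocess.run):
    """Run one remote dump and store its output as a file in the disk buffer."""
    target = Path(buffer_dir) / f"{host}{filesystem.replace('/', '_')}.dump"
    partial = target.with_suffix(".partial")
    with open(partial, "wb") as out:
        # The dump stream goes straight to disk, never directly to tape.
        run(dump_command(host, filesystem), stdout=out, check=True)
    # Assumed convention: only complete dumps get the final name, so a
    # tape-writing backend could safely ignore files still in progress.
    partial.rename(target)
    return target


def run_frontend(jobs, buffer_dir, run=subprocess.run):
    """Fetch all (host, filesystem) jobs, at most MAX_PARALLEL at once."""
    with ThreadPoolExecutor(max_workers=MAX_PARALLEL) as pool:
        futures = [pool.submit(fetch_dump, h, fs, buffer_dir, run)
                   for h, fs in jobs]
        return [f.result() for f in futures]
```

Because the dumps land in ordinary files, the second piece can read them at whatever rate keeps the tape streaming, independent of how fast any single remote dump arrives.
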
Since the backups are occurring through encrypted ssh connections, the
machine tends to be cpu bound rather than I/O bound. We plan to upgrade it
to a FreeBSD-current dual-Pentium box soon. A PPro 200 can handle around
2.5 MBytes/sec worth of ssh encrypted data, and the dumps running on the
remote machines cannot generally do better than 400 KB/sec because the
machines are active, plus I gzip -2 the dump output on the remote machine
before it comes back through the ssh pipe. Exabyte tape drives are not very
space efficient if you can't stream them, hence the 36 GB 'whole dumps'
disk buffer.

What I like the most about the system is that if a tape unit has an error,
the dumps related to that tape are simply left in the disk buffer while
dumps for other tapes run merrily on their way, allowing a human to fix the
problem without interrupting the dump process or having to redump a
machine. If the disk buffer fills up, my frontend simply goes into a
sleep-retry loop until the backend drains it. Since these Exabytes can do
3 MBytes/sec, the disk buffer usually does not fill up.

I also like the system because I do not depend on realtime pipes at all, so
I can guarantee 100% streaming to two tape units. Worst case disk I/O is
3*2 + 2.5 = 8.5 MBytes/sec, well within the 30 MByte/sec bandwidth of the
ccd (even assuming a 2x loss in efficiency from the parallelism).

And, I might add, I think for smaller backup system installations, using
one or two DMA IDE drives (on their own controllers) would work very well
for this sort of linear transfer problem. Not that I do... I much prefer
SCSI. But it would work just fine, and IBM's come out with some huge low
cost, high speed IDE drives.

-Matt

Matthew Dillon Engineering, HiWay Technologies, Inc. & BEST Internet
Communications & God knows what else.
(Please include original email in any response)
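
The worst-case disk I/O figure in the message can be checked with a few lines of Python. This is just the message's own arithmetic written out. The constant and function names are illustrative, and treating the "2x loss in efficiency" as halving the usable ccd bandwidth is an assumed reading of that remark.

```python
TAPE_RATE_MB_S = 3.0       # per Exabyte drive, from the message
TAPE_UNITS = 2             # two tape units
SSH_INGEST_MB_S = 2.5      # what a PPro 200 can decrypt, from the message
CCD_BANDWIDTH_MB_S = 30.0  # striped disk buffer, from the message
PARALLEL_LOSS = 2.0        # assumed: "2x loss" halves usable bandwidth


def worst_case_disk_io():
    """Both tapes reading at full rate while dumps arrive at full rate."""
    return TAPE_RATE_MB_S * TAPE_UNITS + SSH_INGEST_MB_S


def usable_bandwidth():
    """ccd bandwidth after the assumed loss from parallel access."""
    return CCD_BANDWIDTH_MB_S / PARALLEL_LOSS


if __name__ == "__main__":
    demand = worst_case_disk_io()
    supply = usable_bandwidth()
    print(f"worst case demand: {demand} MBytes/sec")
    print(f"usable bandwidth:  {supply} MBytes/sec")
    print(f"headroom:          {supply - demand} MBytes/sec")
```

Under these assumptions the demand of 8.5 MBytes/sec stays below the 15 MBytes/sec left after the 2x penalty, which matches the "well within" claim.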