From owner-freebsd-current Tue May 7 18:09:54 2002
Delivered-To: freebsd-current@freebsd.org
Received: from akira.wossname.net (12-222-90-39.client.insightBB.com [12.222.90.39])
	by hub.freebsd.org (Postfix) with ESMTP id D556F37B400;
	Tue, 7 May 2002 18:09:44 -0700 (PDT)
Received: from localhost (localhost [127.0.0.1])
	by akira.wossname.net (Postfix) with ESMTP id 6C0B95553;
	Tue, 7 May 2002 20:08:50 -0500 (EST)
Subject: Is anyone else having trouble with dump(8) on -current?
From: Benjamin Lewis
Reply-To: bhlewis@wossname.net
To: freebsd-current@freebsd.org
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
X-Mailer: Ximian Evolution 1.0.3
Date: 07 May 2002 20:08:50 -0500
Message-Id: <1020820130.97599.43.camel@akira.wossname.net>
Mime-Version: 1.0
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Hello,

As the subject implies, I've been having some odd troubles with
/sbin/dump on a fairly current -CURRENT:

FreeBSD akira.wossname.net 5.0-CURRENT FreeBSD 5.0-CURRENT #1: Sat May 4 17:50:36 EST 2002
root@akira.wossname.net:/usr/obj/usr/src-all/current/src/sys/AKIRA i386

I'm running SMP on a dual 1GHz Athlon, Tyan Tiger S2460 based system.
dmesg output available upon request. Aside from this single problem,
-CURRENT has been remarkably stable on this system.

The sources were cvsup-ed just prior to that date and installworld
completed soon after:

-r-xr-xr-x 2 root wheel 394760 May 4 18:00 /sbin/dump*

Now, on to the problem. I use amanda for backups, and since mid-April
I've been seeing items like the following in the backup report:

/-- akira.woss /var lev 0 FAILED [/sbin/dump returned 3]
sendbackup: start [akira.wossname.net:/var level 0]
sendbackup: info BACKUP=/sbin/dump
sendbackup: info RECOVER_CMD=/sbin/restore -f... -
sendbackup: info end
| DUMP: Date of this level 0 dump: Mon Apr 29 00:11:50 2002
| DUMP: Date of last level 0 dump: the epoch
| DUMP: Dumping /dev/da0s3e (/var) to standard output
| DUMP: mapping (Pass I) [regular files]
| DUMP: mapping (Pass II) [directories]
| DUMP: estimated 36490 tape blocks.
| DUMP: dumping (Pass III) [directories]
| DUMP: slave couldn't reopen disk: Interrupted system call
| DUMP: The ENTIRE dump is aborted.
sendbackup: error [/sbin/dump returned 3]
\--------

The actual failed partition varies; no single filesystem consistently
fails. I can occasionally reproduce the error by running a dump by
hand. More often, I cannot.

The filesystems that fail are all on a single disk (da0):

da0: Fixed Direct Access SCSI-2 device
da0: 40.000MB/s transfers (20.000MHz, offset 16, 16bit), Tagged Queueing Enabled
da0: 34732MB (71132959 512 byte sectors: 255H 63S/T 4427C)

No filesystems on the other SCSI disk ever fail. The Fujitsu is a
replacement for a disk that recently went bad and the filesystems were
restored from tape. The dump problems happened soon, but not
immediately, after the new disk was brought on-line. After a few
successful backup runs I decided to update the system from the early
March sources it had been running, and that's when this problem began.
I've rebuilt the system from current-at-the-time sources several times
since, whenever the mailing lists suggested it was safe-ish.

I was just playing around, trying to duplicate the problem and the
following happened, 2 failures and then a success:

$ dump 0af /dev/null /
DUMP: Date of this level 0 dump: Tue May 7 19:59:49 2002
DUMP: Date of last level 0 dump: the epoch
DUMP: Dumping /dev/da0s1a (/) to /dev/null
DUMP: mapping (Pass I) [regular files]
DUMP: mapping (Pass II) [directories]
DUMP: estimated 693225 tape blocks.
DUMP: slave couldn't reopen disk: Interrupted system call
DUMP: The ENTIRE dump is aborted.
$ dump 0af /dev/null /
DUMP: Date of this level 0 dump: Tue May 7 19:59:53 2002
DUMP: Date of last level 0 dump: the epoch
DUMP: Dumping /dev/da0s1a (/) to /dev/null
DUMP: mapping (Pass I) [regular files]
DUMP: mapping (Pass II) [directories]
DUMP: estimated 693225 tape blocks.
DUMP: slave couldn't reopen disk: Interrupted system call
DUMP: The ENTIRE dump is aborted.
$ dump 0af /dev/null /
DUMP: Date of this level 0 dump: Tue May 7 19:59:54 2002
DUMP: Date of last level 0 dump: the epoch
DUMP: Dumping /dev/da0s1a (/) to /dev/null
DUMP: mapping (Pass I) [regular files]
DUMP: mapping (Pass II) [directories]
DUMP: estimated 693225 tape blocks.
DUMP: dumping (Pass III) [directories]
DUMP: dumping (Pass IV) [regular files]
DUMP: DUMP: 711739 tape blocks on 1 volume
DUMP: finished in 176 seconds, throughput 4043 KBytes/sec
DUMP: Closing /dev/null
DUMP: DUMP IS DONE

If anyone has some insight into this, I'd very much appreciate the
help.

Thanks,
-Ben
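
For reference, "Interrupted system call" is the standard error text for
errno EINTR. The sketch below shows the usual defensive pattern of
retrying a call that fails with EINTR. It assumes the failing call
behind "slave couldn't reopen disk" is an open(2) on the disk device;
that is an inference from the message text, not something checked
against dump's source. The helper name open_retry_eintr and its
max_tries parameter are hypothetical, made up for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper, NOT taken from dump's source: open a path and
 * retry while open(2) fails with EINTR, for at most max_tries attempts
 * (max_tries should be >= 1).  Returns a file descriptor on success,
 * or -1 with errno left as set by the last failed attempt.
 */
int
open_retry_eintr(const char *path, int flags, int max_tries)
{
	int fd = -1;

	for (int i = 0; i < max_tries; i++) {
		fd = open(path, flags);
		if (fd >= 0 || errno != EINTR)
			break;	/* success, or an error other than EINTR */
	}
	return (fd);
}
```

A caller might use it as open_retry_eintr("/dev/da0s1a", O_RDONLY, 5).
A retry like this would only work around the symptom; it says nothing
about why the open is being interrupted in the first place.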