Date: Mon, 16 Sep 2002 12:28:57 -0500
To: freebsd-questions@freebsd.org
From: "Jack L. Stone"
Subject: Server Lockups during Backups ad0 to ad1

I am running FreeBSD 4.5-RELEASE.

For some months I have been having hard lockups (requiring reboots) when I try to back up the ad0 HD to the ad1 HD on a busy server. It happens with either tar or dump/restore. The funny thing is that a dd of ad0 to ad1 doesn't lock up (so far, but I have done dozens).

I could sure use some ideas on how to figure this out. I don't know if it is hardware, software, or both. Maybe it is related to the ata driver bug(?), but I thought that popped up in 4.6+.

Here's the error log recorded. Any help is really appreciated, as this is NOT GOOD!

BTW, the BIOS doesn't even show the ad1 IDE drive on the first reboot, which takes a hard switch reset. Then it takes a complete power down/power up to bring the ad1 HD back in the BIOS....OUCH!
This is scary for a server that is not very old. It locked up just a few minutes ago and I did the above, PLUS, fsck found errors to fix.

12:23PM up 39 mins, 5 users, load averages: 0.18, 0.17, 0.10

ERROR LOG
==============================================================================
Sep 16 10:38:23 sage-one /kernel: ad1: WRITE command timeout tag=0 serv=0 - resetting
Sep 16 10:40:07 sage-one /kernel: ata0: resetting devices .. done
Sep 16 10:40:07 sage-one /kernel: ad1: WRITE command timeout tag=0 serv=0 - resetting
Sep 16 10:40:07 sage-one /kernel: ata0: resetting devices .. done
Sep 16 10:40:07 sage-one /kernel: ad1s1f: hard error writing fsbn 23516351 of 5466688-5466943 (ad1s1 bn 23516351; cn 1463 tn 210 sn 26)ata0-slave: timeout waiting for command=ef s=01 e=04
Sep 16 10:40:07 sage-one /kernel: ad1: timeout waiting for DRQ - resetting
Sep 16 10:40:07 sage-one /kernel: ata0: resetting devices .. done
Sep 16 10:40:07 sage-one /kernel: ad1: timeout waiting for DRQ - resetting
Sep 16 10:40:07 sage-one /kernel: ata0: resetting devices .. done
Sep 16 10:40:07 sage-one /kernel: swap_pager: indefinite wait buffer: device: #ad/0x20001, blkno: 392, size: 4096
==============================================================================

Best regards,
Jack L. Stone,
Administrator

SageOne Net
http://www.sage-one.net
jackstone@sage-one.net
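
For anyone triaging a log like the one above, here is a minimal Python sketch that tallies kernel messages per source (ad1, ata0, swap_pager) and counts how many mention a timeout. It is only an illustration, not part of the original report: the names LOG, LINE_RE, summarize and count_timeouts are hypothetical, the sample is a subset of the lines quoted above, and the regular expression assumes the syslog layout shown there ("<date> <host> /kernel: <source>: <message>").

```python
import re
from collections import Counter

# A few lines copied from the error log above (not the full log).
LOG = """\
Sep 16 10:38:23 sage-one /kernel: ad1: WRITE command timeout tag=0 serv=0 - resetting
Sep 16 10:40:07 sage-one /kernel: ata0: resetting devices .. done
Sep 16 10:40:07 sage-one /kernel: ad1: timeout waiting for DRQ - resetting
Sep 16 10:40:07 sage-one /kernel: swap_pager: indefinite wait buffer: device: #ad/0x20001, blkno: 392, size: 4096
"""

# Assumed layout: "<Mon> <day> <hh:mm:ss> <host> /kernel: <source>: <message>"
LINE_RE = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]+\s+\S+\s+/kernel:\s+(?P<source>[\w-]+):\s+(?P<msg>.*)$"
)


def summarize(log_text):
    """Return a Counter of kernel messages keyed by source (device or subsystem)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            counts[match.group("source")] += 1
    return counts


def count_timeouts(log_text):
    """Return the number of parsed kernel messages whose text mentions 'timeout'."""
    total = 0
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match and "timeout" in match.group("msg"):
            total += 1
    return total


if __name__ == "__main__":
    for source, count in summarize(LOG).most_common():
        print(source, count)
    print("timeouts:", count_timeouts(LOG))
```

Run against the full /var/log/messages, a tally like this would show whether the timeouts are confined to ad1 or also appear on ad0, which may help narrow the problem to one drive rather than the whole ata0 channel.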