From: Harald Schmalzbauer
To: current@freebsd.org
Date: Thu, 18 Sep 2003 04:25:08 +0200
Subject: 5.1-rel deleted its own MBR
Message-Id: <200309180425.15164@harrymail>

Hi all,

a big, mysterious bug is lurking somewhere.
(Machine: C3, 256MB, 2x 30GB 2.5" IDE, SIL0680 controller)

One of my drives failed. I recovered the following from messages:

Sep 16 01:47:44 tek kernel: ad4: WRITE command timeout tag=0 serv=0 - resetting
Sep 16 01:47:45 tek kernel: ata2: resetting devices ..
Sep 16 01:47:45 tek kernel: ad4: removed from configuration
Sep 16 01:47:45 tek kernel: ar0: WARNING - mirror lost
Sep 16 01:47:45 tek kernel: ad4: deleted from ar0 disk0
Sep 16 01:47:45 tek kernel: done

This was at 1:47, but the machine ran until about 5:30. Then it died (no
message!).

When I tried to reboot, the BIOS complained about a missing MBR. And indeed,
when I opened the server and connected the drives to another box, BOTH drives
had no partition table! I got a correct bsdlabel from both, ad6 and ad6s1.

How can this happen? Bug in ata? Bug in GEOM?
Nobody was logged in, and nobody else can log in, so the machine itself must
have deleted it. I'm sure of that!

My fix was to boot the fixit CD and write a new partition table with:

    fdisk -I -B -b /boot/boot1 ar0
    fdisk -u ar0    (to change the starting sector from 63 to 0)

fsck found a few errors, but the server is up and running again.

Søren: I remember you're planning better RAID management support. Will it be
possible to control ar0 through the controller's BIOS in the future?
When I rebuilt the array with the BIOS (which took 6 hours!), FreeBSD still
reported a degraded RAID1! This was really annoying.

Thanks,

-Harry
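P.S. For anyone who wants to check a drive for the same symptom, here is a minimal sketch of how the first sector could be inspected. It is not part of the original diagnosis. The function names `inspect_mbr` and `read_first_sector` are made up for illustration, and it assumes the classic 512-byte MBR layout (four 16-byte partition entries at offset 446, signature 0x55 0xAA at offset 510).

```python
import struct

SECTOR_SIZE = 512
PART_TABLE_OFFSET = 446          # four 16-byte partition entries start here
BOOT_SIGNATURE = b"\x55\xaa"     # bytes 510-511 of a valid MBR


def inspect_mbr(sector: bytes) -> dict:
    """Report the boot signature and non-empty partition entries of one sector."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("expected exactly 512 bytes")

    has_signature = sector[510:512] == BOOT_SIGNATURE
    partitions = []
    for i in range(4):
        start = PART_TABLE_OFFSET + 16 * i
        entry = sector[start:start + 16]
        # Entry layout: boot flag, CHS start (3 bytes, skipped), type,
        # CHS end (3 bytes, skipped), LBA start, sector count.
        flag, ptype, lba_start, count = struct.unpack("<B3xB3xII", entry)
        if ptype != 0:  # type 0 marks an unused slot
            partitions.append({
                "slot": i + 1,
                "active": flag == 0x80,
                "type": ptype,
                "start": lba_start,
                "sectors": count,
            })
    return {"signature": has_signature, "partitions": partitions}


def read_first_sector(path: str) -> bytes:
    """Hypothetical helper: read sector 0 from a device node or image file."""
    with open(path, "rb") as f:
        return f.read(SECTOR_SIZE)
```

Something like `inspect_mbr(read_first_sector("/dev/ad6"))` (run as root, read-only) would return an empty partition list for a drive in the state described above. A FreeBSD slice shows up with type 0xA5.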