Date:      Wed, 22 Oct 2003 02:07:24 +0200
From:      Harald Schmalzbauer <h@schmalzbauer.de>
To:        Kris Kennaway <kris@obsecurity.org>, Dimitry Andric <dimitry@andric.com>
Cc:        FreeBSD-Current List <freebsd-current@freebsd.org>
Subject:   Re: MBR zapped when panicking?
Message-ID:  <200310220207.31139@harrymail>
In-Reply-To: <20031021185435.GA66921@rot13.obsecurity.org>
References:  <13041066290.20031021201724@andric.com> <20031021185435.GA66921@rot13.obsecurity.org>

On Tuesday 21 October 2003 20:54, Kris Kennaway wrote:
> On Tue, Oct 21, 2003 at 08:17:24PM +0200, Dimitry Andric wrote:
> > Hi,
> >
> > Today I had a -CURRENT machine panic on me with a page fault, and
> > something happened that I have seen before: the machine refused to
> > come up afterwards. Closer inspection revealed that the MBR on the
> > boot disk was totally zapped, filled with seemingly random characters.
>
> This is a known bug in the ATA driver.  Tor Egge provided a workaround
> patch here a few weeks ago.  I didn't try it because I can't afford to
> trash my disks like that again.

Uhmm, at least a confirmation!
Some weeks ago (2003-09-18) I wrote the attached mail.

Thanks,

-Harry

>
> Kris

----- Attached message: 5.1-rel deleted it's own MBR -----

From: Harald Schmalzbauer <h@schmalzbauer.de>
To: current@freebsd.org
Subject: 5.1-rel deleted it's own MBR
Date: Thu, 18 Sep 2003 04:25:08 +0200
Message-Id: <200309180425.15164@harrymail>

Hi all,

A big, mysterious bug is lingering somewhere. (Machine: C3, 256MB, 2x 30GB 2.5"
IDE, SIL0680 controller)
One of my drives failed; the following was recovered from messages:

Sep 16 01:47:44 tek kernel: ad4: WRITE command timeout tag=0 serv=0 - resetting
Sep 16 01:47:45 tek kernel: ata2: resetting devices ..
Sep 16 01:47:45 tek kernel: ad4: removed from configuration
Sep 16 01:47:45 tek kernel: ar0: WARNING - mirror lost
Sep 16 01:47:45 tek kernel: ad4: deleted from ar0 disk0
Sep 16 01:47:45 tek kernel: done
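
Since the machine kept running for hours after these messages, it can help to scan the log for such events automatically. The following is only a minimal sketch in Python, assuming syslog-style lines shaped like the ones quoted above; the function name `disk_alarms` and the list of alarm phrases are hypothetical and are not part of any FreeBSD tool.

```python
import re

# Matches syslog-style kernel lines such as
# "Sep 16 01:47:45 tek kernel: ar0: WARNING - mirror lost"
# Only ad<N> (ATA disk) and ar<N> (ATA RAID) devices are considered.
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+\S+\s+kernel:\s+"
    r"(?P<dev>a[dr]\d+):\s+(?P<msg>.*)$"
)

# Hypothetical list of phrases worth an alert, taken from the log above.
ALARMS = ("command timeout", "removed from configuration", "mirror lost")


def disk_alarms(lines):
    """Return (timestamp, device, message) for every alarming log line."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and any(phrase in m.group("msg") for phrase in ALARMS):
            hits.append((m.group("stamp"), m.group("dev"), m.group("msg")))
    return hits
```

Fed with the six log lines above, this sketch would flag the timeout on ad4, its removal from the configuration, and the lost mirror on ar0.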


This was at 1:47, but the machine ran until about 5:30. Then it died (no
message!).
When I tried to reboot, the BIOS complained about a missing MBR. And indeed,
when I opened the server and connected the drives to another box, BOTH drives
had no partition table!
I got a correct bsdlabel from both ad6 and ad6s1.
How can this happen?
Bug in ata?
Bug in GEOM?
Nobody was logged in, and nobody else can log in, so the machine itself must
have deleted it.
That's for sure!
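
For anyone who wants to check a suspect disk the same way, here is a minimal sketch in Python that inspects a raw 512-byte first sector. It assumes the classic PC MBR layout (four 16-byte partition entries at offset 446, boot signature 0x55 0xAA at offset 510); the function name `inspect_mbr` is hypothetical, and how you obtain the sector (for example from a disk image) is up to you.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446      # first of four 16-byte partition entries
SIGNATURE_OFFSET = 510  # boot signature bytes 0x55 0xAA


def inspect_mbr(sector):
    """Return (has_signature, partitions) for a raw first sector."""
    if len(sector) < SECTOR_SIZE:
        raise ValueError("need a full 512-byte sector")
    has_signature = sector[SIGNATURE_OFFSET:SIGNATURE_OFFSET + 2] == b"\x55\xaa"
    partitions = []
    for i in range(4):
        entry = sector[TABLE_OFFSET + 16 * i:TABLE_OFFSET + 16 * (i + 1)]
        # Entry layout: flag(1) CHS start(3) type(1) CHS end(3)
        #               LBA start(4) sector count(4), little-endian
        flag, ptype, start, count = struct.unpack("<B3xB3xII", entry)
        if ptype != 0:  # type 0 marks an unused slot
            partitions.append({
                "slot": i + 1,
                "active": flag == 0x80,
                "type": ptype,
                "start": start,
                "sectors": count,
            })
    return has_signature, partitions
```

A sector with no signature and no non-empty entries would match the "no partition table" symptom described above; a sector filled with random bytes would typically show no signature or nonsensical entries. A FreeBSD slice normally appears with partition type 0xA5.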

My fix was to boot the fixit CD and write a new MBR with:

fdisk -I -B -b /boot/boot1 ar0
fdisk -u ar0 (to change the starting sector from 63 to 0)

fsck found a few errors but the server is up and running again.

Søren: I remember you're planning better RAID management support. Will it be
possible to control ar0 through the controller's BIOS in the future?
When I rebuilt the array with the BIOS (which took 6 hours!), FreeBSD still
reported a degraded RAID1! This was really annoying.

Thanks,

-Harry
