From: "Jan Catrysse"
Date: Mon, 3 Dec 2007 15:30:45 +0100
Subject: RE: DOH! ata_alloc_composite failed!

> -----Original Message-----
> From: Jan Catrysse
> Sent: Thursday, November 22, 2007 2:18 PM
> To: 'freebsd-questions@freebsd.org'
> Subject: DOH! ata_alloc_composite failed!
>
> Dear subscribers,
>
> I am currently running a production server:
> FreeBSD 6.2 STABLE
> Onboard Intel ICH8R Raid 1 with 2x SATA300 500GB HDD Using
> ATA for Raid1
>
> In my today's /var/log/messages I found:
> Nov 22 03:01:33 www kernel: DOH! ata_alloc_composite failed!
> Nov 22 03:01:33 www last message repeated 28 times
>
> I've seen several topics on similar problems but without
> resolution. Other users have seen the problem when there is
> heavy I/O, the UID check done by one of the Periodic/Security
> scripts has been mentioned. Not impossible because the error
> occurs around 03:00 hours, when Periodic/Daily is run.

You should definitely upgrade to RELENG_6. There was a problem with this
earlier; it was fixed a long time ago but most likely never put into the
6.2 branch.

> Does anyone have a clue what this could be?
> Should I be worrying for my data?

The kernel retries failed reads and writes, so if a retry succeeds there
is no problem. Otherwise the application that issued the request gets the
error back from the kernel, and the data on disk ends up as whatever the
application decides to do about the failed read or write.

> How can I do some more diagnostics without putting the server
> down (if possible)?

No, you will need to instrument the kernel, depending on what you want
to know.

> Please see my next post / question on Raid Synchronisation /
> Data Consistency.
>
> Kind regards,
> Jan Catrysse

I have received some feedback from the ATA(4) developer and pasted it
inline in the post above.

Regards
Jan Catrysse
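
P.S. To make the point about failed writes concrete: once the kernel gives up retrying, the error surfaces as a failed write(2) or fsync(2) in the application, and it is up to the application to notice. Below is a minimal sketch of what that check can look like. The helper name write_durably is hypothetical and not taken from any application mentioned in this thread; only the standard POSIX calls write(2) and fsync(2) are assumed.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original thread): write the whole
 * buffer, then ask the kernel to flush it to the device.
 * Returns 0 on success, -1 on failure with errno set by the failing
 * call (for example EIO for a device-level I/O error).
 */
int
write_durably(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted by a signal, try again */
			return -1;		/* real failure: the caller decides */
		}
		p += n;			/* short write: continue with the rest */
		len -= (size_t)n;
	}

	/* Buffered data can still fail when it actually reaches the disk. */
	if (fsync(fd) < 0)
		return -1;
	return 0;
}
```

An application that ignores the return value here would carry on as if the data were safely on disk, which is the case the developer's answer warns about.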