From owner-freebsd-stable@FreeBSD.ORG  Mon Jan 23 08:53:55 2006
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
X-Original-To: freebsd-stable@freebsd.org
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 4758E16A41F
	for <freebsd-stable@freebsd.org>; Mon, 23 Jan 2006 08:53:55 +0000 (GMT)
	(envelope-from mse_software@charter.net)
Received: from mxsf30.cluster1.charter.net (mxsf30.cluster1.charter.net
	[209.225.28.230])
	by mx1.FreeBSD.org (Postfix) with ESMTP id AFC5243D45
	for <freebsd-stable@freebsd.org>; Mon, 23 Jan 2006 08:53:54 +0000 (GMT)
	(envelope-from mse_software@charter.net)
Received: from mxip02a.cluster1.charter.net (mxip02a.cluster1.charter.net
	[209.225.28.132])
	by mxsf30.cluster1.charter.net (8.12.11/8.12.11) with ESMTP id
	k0N8rrrX010644
	for <freebsd-stable@freebsd.org>; Mon, 23 Jan 2006 03:53:53 -0500
Received: from 68-113-23-60.dhcp.knwk.wa.charter.com (HELO yak.mseubanks.net)
	([68.113.23.60])
	by mxip02a.cluster1.charter.net with ESMTP; 23 Jan 2006 03:53:53 -0500
X-IronPort-AV: i="4.01,210,1136178000"; 
	d="scan'208"; a="1857319761:sNHT20858752"
From: "Michael S. Eubanks" <mse_software@charter.net>
To: freebsd-stable@freebsd.org
In-Reply-To: <44B2CAEF-A9E7-454B-A232-292B58083952@stromnet.org>
References: <991F35AA-151B-4AEA-82BD-5F4AEDF28424@stromnet.org>
	<a78074950511180117r6d64db25o4ae37c0c5998e002@mail.gmail.com>
	<74994962-5050-47BD-897B-DE3880B9EBD5@stromnet.org>
	<a78074950511180943r57fd9d03r64efcc705001bc35@mail.gmail.com>
	<A6F22EE2-B1E6-44B5-B4C2-E77E1A24FEBB@stromnet.org>
	<1132353600.903.19.camel@genius1.i.cz>
	<20051118231351.GA46946@holestein.holy.cow>
	<1132356649.903.32.camel@genius1.i.cz>
	<8A4DAD5D-44CF-42DD-A113-340226284533@stromnet.org>
	<268C3DEB-7569-4C18-BC35-1C5F36EF8EC4@stromnet.org>
	<1137967081.40786.36.camel@yak.mseubanks.net>
	<1DA0C9DF-BB42-415B-8851-FFB91CD0F1AC@stromnet.org>
	<1137975447.40786.83.camel@yak.mseubanks.net>
	<44B2CAEF-A9E7-454B-A232-292B58083952@stromnet.org>
Content-Type: text/plain; charset=ISO-8859-1
Date: Mon, 23 Jan 2006 00:53:51 -0800
Message-Id: <1138006431.44108.15.camel@yak.mseubanks.net>
Mime-Version: 1.0
X-Mailer: Evolution 2.2.3 FreeBSD GNOME Team Port 
Content-Transfer-Encoding: 8bit
Subject: Re: Page fault, GEOM problem??
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: mse_software@charter.net
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>, 
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 23 Jan 2006 08:53:55 -0000

On Mon, 2006-01-23 at 06:43 +0100, Johan Ström wrote:
> On 23 jan 2006, at 01.17, Michael S. Eubanks wrote:
> 
> 
> > On Sun, 2006-01-22 at 23:51 +0100, Johan Ström wrote:
> >
> > ...snip...
> >
> >
> >> On 22 jan 2006, at 22.58, Michael S. Eubanks wrote:
> >> As far as I know, this card doesn't have any RAID functionality (I've
> >> never read anything about it on the web, on the card's box, or
> >> anywhere else).
> >> I'm running GENERIC, which does include ataraid..
> >> What does your dmesg identify your card as?
> >>
> >> atapci0: <Promise PDC40518 SATA150 controller> port 0xb800-0xb87f,
> >> 0xb400-0xb4ff mem 0xfb800000-0xfb800fff,0xfb000000-0xfb01ffff irq 19
> >> at device 12.0 on pci0
> >>
> >> Is it the same PDC chipset?
> >>
> >> --
> >> Johan
> >>
> >>
> >
> > No, I have a different controller.  My mistake.  I think what is
> > happening is the DMA read command is failing, therefore causing the
> > device to be disconnected, and the kernel can't write to the disk from
> > that point on (this is somewhat obvious given the output below).
> >
> >
> >>> Nov 29 20:36:54 elfi kernel: subdisk10: detached
> >>> Nov 29 20:36:54 elfi kernel: ad10: detached
> >>> Nov 29 20:36:54 elfi kernel: unknown: TIMEOUT - READ_DMA48 retrying
> >>> (1 retry left) LBA=426562704
> >>> Nov 29 20:36:54 elfi kernel: GEOM_MIRROR: Device gm0s1: provider
> >>> ad10s1 disconnected.
> >>>
> >
> > The message seen from the last line above is generated in any of the
> > following scenarios (from g_mirror.c):
> >   1. Device wasn't running yet, but disk disappear.
> >   2. Disk was active and disapppear.
> >   3. Disk disappear during synchronization process.
> >
> >
> >>> Nov 29 20:36:54 elfi kernel: GEOM_MIRROR: Request failed (error=6).
> >>> ad10s1[WRITE(offset=134356992, length=16384)]
> >>>
> >
> > As far as recovering the disk, I remember seeing something about
> > booting to single user mode and using fsck after a core dump in a
> > previous post. I'm assuming the disks worked initially and that you
> > were able to label them etc?  Is there any possibility that the disk
> > state may be altered by a power saving feature or setting in the BIOS
> > and FreeBSD just doesn't know when it happens until the next time it
> > tries to access the disk?
> >
> 
> For recovery, I've always done a direct reboot; gmirror rebuilds the
> mirror and fsck is run. There have never been any problems reading
> labels etc.; the only problems have been these sporadic crashes and the
> read/write performance (see earlier in the thread).
> 
> This is a server, so all BIOS power-saving settings are (or should be)
> turned off. The BIOS should thus never put the disks to sleep.
> 
> Thanks for trying to help!

Wish I could be of more help. :)  Have you tried toggling the ATA DMA
sysctl flags?  I've seen similar posts in the past where read timeouts
were caused by DMA being enabled.

# sysctl -a | grep dma
...
hw.ata.ata_dma: 1      <=== Try turning this one off (1 ==> 0).
hw.ata.atapi_dma: 1
...
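
If the sysctl refuses to change at runtime (I believe hw.ata.ata_dma is a
boot-time tunable on 5.x/6.x, but that is an assumption you should check
against ata(4) on your release), the setting would go in /boot/loader.conf
instead.  A minimal sketch, assuming that is the case:

```sh
# /boot/loader.conf
# Assumption: hw.ata.ata_dma is honoured as a loader tunable on this release.
hw.ata.ata_dma="0"      # disable DMA for ATA disks from the next boot on
```

After a reboot, "sysctl hw.ata.ata_dma" should report 0.  Expect disk
throughput to drop noticeably in PIO mode, so treat this as a diagnostic
step rather than a fix: if the READ_DMA48 timeouts stop, that points at
the controller/driver DMA path rather than at the disk itself.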

-Michael