From: Paul Mather
To: freebsd-geom@freebsd.org
Date: Thu, 18 Nov 2004 23:49:21 -0500
Message-Id: <1100839762.5421.21.camel@zappa.Chelsea-Ct.Org>
Subject: geom_mirror synchronisation hangs at boot
List-Id: GEOM-specific discussions and implementations

I had another
one of those occasional "TIMEOUT - WRITE_DMA" messages. This time it
was on a 6.0-CURRENT system (last built 2004-11-12) with a geom_mirror
setup, and it caused one of the providers to be removed from the mirror
and to operate in degraded mode:

Nov 18 14:15:07 zappa kernel: ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=49981679
Nov 18 14:15:10 zappa kernel: ad0: FAILURE - WRITE_DMA timed out
Nov 18 14:15:10 zappa kernel: GEOM_MIRROR: Cannot update metadata on disk ad0 (error=5).
Nov 18 14:15:10 zappa kernel: GEOM_MIRROR: Device raid1: provider ad0 disconnected.
Nov 18 14:15:10 zappa kernel: GEOM_MIRROR: Request failed (error=1). ad0[WRITE(offset=809639936, length=2048)]

As with my geom_vinum episode, the drive wasn't really down. So, like
my geom_vinum case, I decided to reboot to make the drive magically be
recognised as alive. (In retrospect, an atacontrol detach/attach would
have been better. I'll remember that in future. :)

When the system rebooted, all appeared well. The drive was recognised
and I could hear reconstruction kick in as ad0 was rebuilt.
Unfortunately, a moment later, when it came to probe my atapicam
devices (cd0 and da0 ZIP drive), the sounds of furious reconstruction
vanished and the machine locked up.

I am wondering if the atapicam device probing interfered with/was
affected by the rebuilding that had kicked off on ad0. Is there some
way of delaying reconstruction such that it begins after some timed
delay---rather like the way the background fsck is delayed for 60
seconds to get most of the boot sequence completed before it begins
pounding on the drives? I've never yet had any problems reconstructing
a mirror when the OS is up and running.

In the end, the only way I seemed to be able to render my system
bootable was to download and burn the latest FreeSBIE BETA CD, boot
that, and then "kldload geom_mirror" to initiate reconstruction once
the system was running.
:-(

I know there are various gmirror options to disable
auto-synchronisation. Is there anything that can be twiddled in the
loader to enable/disable auto-synchronisation? That would be really
handy.

Here is the typical order of probing of ATA devices in my system. Note
that cd0 and da0 come after the GEOM_MIRROR discovery/initialisation:

Nov 18 19:36:44 zappa kernel: ad0: 24405MB [49585/16/63] at ata0-master UDMA33
Nov 18 19:36:44 zappa kernel: acd0: DVDR at ata0-slave UDMA33
Nov 18 19:36:44 zappa kernel: ata1-slave: FAILURE - SETFEATURES SET TRANSFER MODE status=1 error=4
Nov 18 19:36:44 zappa kernel: ad2: 24405MB [49585/16/63] at ata1-master UDMA33
Nov 18 19:36:44 zappa kernel: afd0: REMOVABLE at ata1-slave BIOSPIO
Nov 18 19:36:44 zappa kernel: GEOM_MIRROR: Device raid1 created (id=1030107361).
Nov 18 19:36:44 zappa kernel: GEOM_MIRROR: Device raid1: provider ad0 detected.
Nov 18 19:36:44 zappa kernel: GEOM_MIRROR: Device raid1: provider ad2 detected.
Nov 18 19:36:44 zappa kernel: GEOM_MIRROR: Device raid1: provider ad2 activated.
Nov 18 19:36:44 zappa kernel: GEOM_MIRROR: Device raid1: provider ad0 activated.
Nov 18 19:36:44 zappa kernel: GEOM_MIRROR: Device raid1: provider mirror/raid1 launched.
Nov 18 19:36:44 zappa kernel: da0 at ata1 bus 0 target 1 lun 0
Nov 18 19:36:44 zappa kernel: da0: Removable Direct Access SCSI-0 device
Nov 18 19:36:44 zappa kernel: da0: 3.300MB/s transfers
Nov 18 19:36:44 zappa kernel: da0: 96MB (196608 512 byte sectors: 64H 32S/T 96C)
Nov 18 19:36:44 zappa kernel: cd0 at ata0 bus 0 target 1 lun 0
Nov 18 19:36:44 zappa kernel: cd0: Removable CD-ROM SCSI-0 device
Nov 18 19:36:44 zappa kernel: cd0: 33.000MB/s transfers
Nov 18 19:36:44 zappa kernel: cd0: Attempt to query device size failed: NOT READY, Medium not present
Nov 18 19:36:44 zappa kernel: Mounting root from ufs:/dev/mirror/raid1a

In the boot hang on auto-synchronisation, the hang would occur just
before the "Mounting root" line.

Cheers,

Paul.
-- 
e-mail: paul@gromit.dlib.vt.edu

"Without music to decorate it, time is just a bunch of boring
production deadlines or dates by which bills must be paid."
        --- Frank Vincent Zappa