From: "Frank B. Scholl"
To: freebsd-geom@freebsd.org
Date: Wed, 21 Jun 2006 08:50:58 +0200
Subject: GEOM_MIRROR after crash not identical
Message-Id: <200606210850.58525.frank.b.scholl@web.de>

hello list,

i just wanted to describe what happened to me last night. i have an ultra 10 running with a highpoint ide controller, with udma100 drives on each channel. booting is done via compact flash on the onboard controller. the two udma100 drives form a mirror, which is encrypted with geli.
after creating the device with gmirror and inserting the other disk, everything ran fine. then i needed to power down the machine to add more drives. after the machine came up, the mirror was degraded. it always failed to insert the second disk as a valid provider.

wouldn't be that bad, i thought: just remount the degraded mirror read-only and back up the data, and after that let's see what can be done. well, after mounting i saw a lot of data missing, to be exact 180gb out of 300gb total. so i thought it might be a problem with the filesystem and tried to fsck it. that didn't work either: there are a lot of unreadable blocks on the device, and geom_geli on top of geom_mirror didn't stop flooding my logs.

the problem: dma timeouts and interrupt storms en masse, with a controller that beforehand had worked flawlessly for _weeks_ without any reboot. so i forced hw.ata.ata_dma and hw.ata.atapi_dma to zero, which circumvented the timeout problem. changing cables, which seems to be common practice in such cases, didn't help either, btw.

so far, i had only worked on the first disk, and i decided to go to sleep after having lost 180gb of data to a mirror device. next morning, i woke up and had quite a good idea: let's try the same thing again, mounting the degraded mirror, but with the second provider only. so i unplugged the first disk, booted, and see, it worked.

so now.. how can it be that after a single reboot the two providers are not exactly the same? the thing is.. the last data i have on the first provider is from 14 june, while on the second provider i have everything up to 20 june. the machine was powered down yesterday and had been running for at least this whole month, if not longer. i went through the logfiles and there was not a single hint that geom_mirror complained about an inconsistency.

any ideas? my data is back, so i don't cry anymore. is this a problem due to my platform choice, namely sparc64?

thanks for an answer,
cheers,
frank scholl