From: Alexander Motin
Date: Wed, 05 Jan 2011 14:41:34 +0200
Message-ID: <4D2466FE.3000307@FreeBSD.org>
To: Pawel Jakub Dawidek
Cc: svn-src-projects@freebsd.org, src-committers@freebsd.org, Warner Losh
Subject: Re: svn commit: r216984 - projects/graid/head/sys/geom/raid
In-Reply-To: <20110105083906.GB1740@garage.freebsd.pl>

On 05.01.2011 10:39, Pawel Jakub Dawidek wrote:
> On Wed, Jan 05, 2011 at 12:19:40AM +0000, Warner Losh wrote:
>> Author: imp
>> Date: Wed Jan 5 00:19:40 2011
>> New Revision: 216984
>> URL: http://svn.freebsd.org/changeset/base/216984
>>
>> Log:
>> First pass at error recovery: if the first disk that we get errors on
>> has a problem, try from the second one. Note info about possible bad
>> sector remap attempt through write, and some ideas on when to eject
>> the subdisk from the disk.
>
> My ideas what to do on I/O error mostly matches yours:
> - On read error, read from the other disk, write the data back to the
> first disk. Before you return the data up, you must wait for write to
> complete. If you won't wait, you can lose race with new write request
> going into the same area and you will overwrite new data with the old
> one.

In the design document we have planned a range locking mechanism for use
here and during synchronization/rebuild.

> - Count read errors and mark disk as broken after some number of errors.
> If you get I/O errors because your requests time out you really want
> to disconnect the misbehaving disk or your entire array would suffer
> (read from the first disk, wait for timeout, read from the second
> disk).

It is planned.

> - On write error you want to mark disk as broken immediately, as from
> now on it has stale data and can't be trusted.

Right. As a further step, we have discussed the idea of keeping such disks
in the array, marking them as dirty and avoiding reads from them. If the
main disk instantly fails, a partially broken disk is probably better than
nothing.

> How do you plan to detect if there was unclean shutdown and you need to
> synchronize the disks?

It depends on the metadata format. Intel metadata, according to the Linux
sources, seems to have some flags related to this case. I plan to implement
the logic used by gmirror (dirty on first write, clean on close or after a
timeout) using those flags and the metadata sequence numbers.

> Do you plan to support some kind of dirty bitmap to be able to optimize
> synchronization time after unclean shutdown? If you do, you might want
> to look at HAST. I implemented dirty bitmap handling based on DRBD
> ideas, which gives the lowest overhead I can think of.

I've thought about it, but it depends on the metadata formats. At the
moment I don't know of any that support it.

-- 
Alexander Motin
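
For illustration only, the three error-handling rules discussed above (retry a failed read on the other disk and write the data back before returning, count read errors up to a limit, eject a disk on the first write error) could be sketched roughly as below. This is not the graid code: it is a synchronous in-memory model, and every name in it (struct subdisk, struct mirror, mirror_read, mirror_write, MAX_READ_ERRORS and its value of 3) is a hypothetical assumption made for the sketch. Range locking and request timeouts are not modelled; the write-back is simply finished before mirror_read() returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NDISKS          2
#define NSECTORS        8
#define SECTOR_SIZE     16  /* tiny sectors keep the sketch short */
#define MAX_READ_ERRORS 3   /* hypothetical ejection threshold */

/* Hypothetical in-memory model of one mirror component. */
struct subdisk {
    unsigned char data[NSECTORS][SECTOR_SIZE];
    bool bad_sector[NSECTORS]; /* simulated unreadable sectors */
    bool fail_writes;          /* simulated write failure */
    bool broken;               /* ejected: no further I/O is sent to it */
    int  read_errors;
};

struct mirror {
    struct subdisk disk[NDISKS];
};

static int
disk_read(struct subdisk *d, size_t sec, unsigned char *buf)
{
    if (d->bad_sector[sec])
        return (-1);
    memcpy(buf, d->data[sec], SECTOR_SIZE);
    return (0);
}

static int
disk_write(struct subdisk *d, size_t sec, const unsigned char *buf)
{
    if (d->fail_writes)
        return (-1);
    memcpy(d->data[sec], buf, SECTOR_SIZE);
    d->bad_sector[sec] = false; /* model a successful sector remap */
    return (0);
}

/* Write to every live disk; a disk that fails a write is ejected at once. */
int
mirror_write(struct mirror *m, size_t sec, const unsigned char *buf)
{
    int ok = 0;

    assert(m != NULL && buf != NULL && sec < NSECTORS);
    for (int i = 0; i < NDISKS; i++) {
        struct subdisk *d = &m->disk[i];

        if (d->broken)
            continue;
        if (disk_write(d, sec, buf) == 0)
            ok++;
        else
            d->broken = true; /* stale data from now on */
    }
    return (ok > 0 ? 0 : -1);
}

/* Read from the first disk that answers, then repair the ones that failed. */
int
mirror_read(struct mirror *m, size_t sec, unsigned char *buf)
{
    bool failed[NDISKS] = { false };
    int src = -1;

    assert(m != NULL && buf != NULL && sec < NSECTORS);
    for (int i = 0; i < NDISKS && src < 0; i++) {
        struct subdisk *d = &m->disk[i];

        if (d->broken)
            continue;
        if (disk_read(d, sec, buf) == 0) {
            src = i;
        } else {
            failed[i] = true;
            if (++d->read_errors >= MAX_READ_ERRORS)
                d->broken = true;
        }
    }
    if (src < 0)
        return (-1);

    /* The write-back completes here, before the data is returned. */
    for (int i = 0; i < NDISKS; i++) {
        struct subdisk *d = &m->disk[i];

        if (failed[i] && !d->broken && disk_write(d, sec, buf) != 0)
            d->broken = true;
    }
    return (0);
}
```

In a real asynchronous implementation the same ordering would have to be enforced by holding the request (or a range lock) until the repair write finishes, which is the race described in the first quoted item.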