From owner-freebsd-geom@FreeBSD.ORG Thu Jun 10 20:05:38 2004
From: Lukas Ertl
To: geom@FreeBSD.org
Date: Thu, 10 Jun 2004 22:05:22 +0200 (CEST)
Message-ID: <20040610214726.G23746@leelou.in.tern>
Subject: Correct GEOM bio handling

Hi there,

I've run into a problem with how to correctly handle struct bios in
GEOM. I have the following scenario in vinum:

A plex needs to be synced because its data is out of date. The solution
I was thinking of is to create a kthread that reads the data from a
'good' plex and writes it out to the 'bad' plex. Ideally, 'normal'
requests (which are not part of this rebuild process) would already be
accepted while the rebuild is still ongoing. Of course, this could be a
problem if the new data is later overwritten by the rebuild process.

So I was thinking of cloning the incoming bio, checking whether the
adjusted offsets are beyond where the rebuild process currently is,
and, if they are, putting the clone on a 'waitlist'. The rebuilding
kthread picks the clone up once the rebuild pointer has advanced far
enough and then sends it down.

The rebuilding itself works fine, and the requests on the waitqueue are
detected, but they seem to be ignored once I g_io_request() them, and
the process that initiated them is stuck.

So I'm thinking that I'm missing some important detail in this bio
handling, and I could use some input from you guys.

Thank you,
le

--
Lukas Ertl                          http://homepage.univie.ac.at/l.ertl/
le@FreeBSD.org                      http://people.freebsd.org/~le/
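A minimal sketch of the deferral scheme described above, in C: clone the
incoming bio, compare its offset against the rebuild pointer, and either
pass it down or park it for the rebuild kthread. The softc layout, field
names and locking are assumptions made up for illustration, not actual
gvinum code; only g_clone_bio(), g_io_request(), g_io_deliver() and
g_std_done() are the real GEOM primitives.

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/bio.h>
    #include <sys/lock.h>
    #include <sys/mutex.h>
    #include <geom/geom.h>

    struct plex_softc {                             /* hypothetical */
            struct mtx               sc_lock;
            off_t                    sc_rebuild_ptr; /* how far the rebuilder got */
            struct bio_queue_head    sc_waitq;       /* deferred clones */
            struct g_consumer       *sc_bad;         /* consumer of the bad plex */
    };

    static void
    plex_start(struct bio *bp)
    {
            struct plex_softc *sc = bp->bio_to->geom->softc;
            struct bio *cbp;

            cbp = g_clone_bio(bp);
            if (cbp == NULL) {
                    g_io_deliver(bp, ENOMEM);
                    return;
            }
            cbp->bio_done = g_std_done;     /* completes the parent bio */

            mtx_lock(&sc->sc_lock);
            if (cbp->bio_offset >= sc->sc_rebuild_ptr) {
                    /* Region not rebuilt yet: park the clone for the kthread. */
                    bioq_insert_tail(&sc->sc_waitq, cbp);
                    mtx_unlock(&sc->sc_lock);
                    return;
            }
            mtx_unlock(&sc->sc_lock);

            /* Region already rebuilt: pass the clone down as usual. */
            g_io_request(cbp, sc->sc_bad);
    }

The rebuild kthread would later drain sc_waitq with bioq_first() and
bioq_remove() and g_io_request() each clone once sc_rebuild_ptr has
passed it. Note that a clone which is never handed to g_io_request()
or finished with g_io_deliver() keeps its parent from completing, so
the issuing process stays blocked in biowait().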
From owner-freebsd-geom@FreeBSD.ORG Thu Jun 10 20:13:17 2004
From: "Poul-Henning Kamp"
To: Lukas Ertl
Cc: geom@freebsd.org
Date: Thu, 10 Jun 2004 22:13:08 +0200
Message-ID: <65669.1086898388@critter.freebsd.dk>
In-Reply-To: <20040610214726.G23746@leelou.in.tern>
Subject: Re: Correct GEOM bio handling

In message <20040610214726.G23746@leelou.in.tern>, Lukas Ertl writes:

>A plex needs to be synced because its data is out of date. The solution
>I was thinking of is to create a kthread that reads the data from a
>'good' plex and writes it out to the 'bad' plex. Ideally, 'normal'
>requests (which are not part of this rebuild process) would already be
>accepted while the rebuild is still ongoing.

Normally what you do is block the bad plex for reading but not for
writing. That means all normal writes also go to the bad plex, no
matter where on the bad plex they are located. Your rebuilder then
reads from the good plex and writes to the bad one in a sequential
fashion, and when it is done, the bad plex is good too and can be
released for reading.

Some implementations use compressed bitmaps, so the rebuilder knows it
can skip the regions where a normal write has already happened. Some
even use "parasitic rebuild", where normal reads are written to the
bad plex as well (if it is not already up to date there) in order to
save on I/O operations.

>The rebuilding itself works fine, and the requests on the waitqueue are
>detected, but they seem to be ignored once I g_io_request() them, and
>the process that initiated them is stuck.

Can you find where they are? Are they on the I/O list? What if you set
debugflags=4, can you see where they went?

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
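A sketch of what a start routine could look like under this scheme,
assuming a hypothetical softc with one consumer per plex (sc_good and
sc_bad are made-up names; the GEOM calls and the clone/g_std_done
pattern are real):

    static void
    plex_start(struct bio *bp)
    {
            struct plex_softc *sc = bp->bio_to->geom->softc;
            struct bio *cbp1, *cbp2;

            switch (bp->bio_cmd) {
            case BIO_READ:
                    /* The bad plex is blocked for reading: serve from the good one. */
                    cbp1 = g_clone_bio(bp);
                    if (cbp1 == NULL) {
                            g_io_deliver(bp, ENOMEM);
                            break;
                    }
                    cbp1->bio_done = g_std_done;
                    g_io_request(cbp1, sc->sc_good);
                    break;
            case BIO_WRITE:
                    /* Writes go to both plexes, regardless of the rebuild pointer. */
                    cbp1 = g_clone_bio(bp);
                    cbp2 = g_clone_bio(bp);
                    if (cbp1 == NULL || cbp2 == NULL) {
                            if (cbp1 != NULL)
                                    g_destroy_bio(cbp1);
                            if (cbp2 != NULL)
                                    g_destroy_bio(cbp2);
                            g_io_deliver(bp, ENOMEM);
                            break;
                    }
                    cbp1->bio_done = g_std_done;
                    cbp2->bio_done = g_std_done;
                    g_io_request(cbp1, sc->sc_good);
                    g_io_request(cbp2, sc->sc_bad);
                    break;
            default:
                    g_io_deliver(bp, EOPNOTSUPP);
                    break;
            }
    }

The rebuild kthread then simply walks the plex from offset 0 upwards,
reading from sc_good and writing to sc_bad; when it reaches the end,
the bad plex can be opened for reading again. (A real implementation
would still need to serialize the rebuilder against normal writes that
touch the region currently being copied.)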
From owner-freebsd-geom@FreeBSD.ORG Thu Jun 10 22:08:05 2004
From: Lukas Ertl
To: Poul-Henning Kamp
Cc: geom@FreeBSD.org
Date: Fri, 11 Jun 2004 00:07:33 +0200 (CEST)
Message-ID: <20040610235840.Y24264@leelou.in.tern>
In-Reply-To: <65669.1086898388@critter.freebsd.dk>
Subject: Re: Correct GEOM bio handling

On Thu, 10 Jun 2004, Poul-Henning Kamp wrote:

> Normally what you do is block the bad plex for reading but not for
> writing. That means all normal writes also go to the bad plex, no
> matter where on the bad plex they are located.

Ah, OK, that makes sense.

>> The rebuilding itself works fine, and the requests on the waitqueue are
>> detected, but they seem to be ignored once I g_io_request() them, and
>> the process that initiated them is stuck.
>
> Can you find where they are? Are they on the I/O list?

They are neither on the bio_down queue nor on the bio_up queue. They
vanished. :-)

> What if you set debugflags=4, can you see where they went?

I assume you meant debugflags=2, since that's the bio debug level, but
I don't see them there either. If I biowait() on them, I can see that
BIO_DONE isn't set, so biowait() never returns.

Anyway, if I let writes go through unconditionally and reject all
reads, then I probably don't have this problem at all.

thanks,
le

--
Lukas Ertl                          http://homepage.univie.ac.at/l.ertl/
le@FreeBSD.org                      http://people.freebsd.org/~le/

From owner-freebsd-geom@FreeBSD.ORG Fri Jun 11 07:35:06 2004
From: Pawel Jakub Dawidek
To: Poul-Henning Kamp
Cc: geom@freebsd.org, Lukas Ertl
Date: Fri, 11 Jun 2004 09:34:52 +0200
Message-ID: <20040611073452.GB12007@darkness.comp.waw.pl>
In-Reply-To: <65669.1086898388@critter.freebsd.dk>
Subject: Re: Correct GEOM bio handling

On Thu, Jun 10, 2004 at 10:13:08PM +0200, Poul-Henning Kamp wrote:
+> Some implementations use compressed bitmaps, so the rebuilder knows it
+> can skip the regions where a normal write has already happened.

I'm using bitmaps to do synchronization in geom_mirror. It works quite
well.

+> Some even use "parasitic rebuild", where normal reads are written to
+> the bad plex as well (if it is not already up to date there) in order
+> to save on I/O operations.

Nice idea, I need to put it into geom_mirror. :)

--
Pawel Jakub Dawidek                       http://www.FreeBSD.org
pjd@FreeBSD.org                           http://garage.freebsd.pl
FreeBSD committer                         Am I Evil? Yes, I Am!
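Neither the gvinum nor the geom_mirror bitmap code appears in this
thread; the following is only a small self-contained sketch of the
idea: one bit per fixed-size region, where a normal write marks the
regions it completely covers so the sequential rebuilder can skip them.
All names and the 64 KB region size are assumptions for illustration,
and the map is a plain bitmap rather than the compressed one mentioned
above.

    #include <sys/types.h>
    #include <stdint.h>

    #define REGION_SHIFT    16                      /* 64 KB per bit (assumption) */
    #define REGION_SIZE     ((off_t)1 << REGION_SHIFT)

    struct rebuild_map {
            uint8_t *rm_bits;                       /* 1 = region may be skipped */
            size_t   rm_nregions;
    };

    /* Called from the write path: mark regions completely covered by the write. */
    static void
    rebuild_map_mark(struct rebuild_map *rm, off_t offset, off_t length)
    {
            off_t end = offset + length;
            size_t first = (size_t)((offset + REGION_SIZE - 1) >> REGION_SHIFT);

            for (size_t r = first; r < rm->rm_nregions; r++) {
                    if (((off_t)(r + 1) << REGION_SHIFT) > end)
                            break;                  /* region not fully covered */
                    rm->rm_bits[r / 8] |= 1 << (r % 8);
            }
    }

    /* The rebuild loop asks this before copying region 'r' from the good plex. */
    static int
    rebuild_map_skippable(const struct rebuild_map *rm, size_t r)
    {
            return ((rm->rm_bits[r / 8] & (1 << (r % 8))) != 0);
    }

A "parasitic rebuild" would additionally write data read from the good
plex out to the bad plex and then call rebuild_map_mark() for the
regions that copy fully covers.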