From: Scott Long
Date: Sat, 17 Apr 2010 20:46:55 -0600
To: Jeff Roberson
Cc: Attilio Rao, Poul-Henning Kamp, Giovanni Trematerra, freebsd-arch@freebsd.org
Subject: Re: [PATCH] Syncer rewriting
Message-Id: <91973FF7-4067-43ED-A20C-14B7B7D78449@samsco.org>
References: <29917.1271406183@critter.freebsd.dk>

On Apr 17, 2010, at 8:08 PM, Jeff Roberson wrote:

> On Sat, 17 Apr 2010, Scott Long wrote:
>
>> On Apr 16, 2010, at 2:23 AM, Poul-Henning Kamp wrote:
>>>
>>>> - The standard syncer may be further improved getting rid of the
>>>> bufobj. It should actually handle a list of vnodes rather than a list
>>>> of bufobj. However similar optimizations may be done after the patch
>>>> is ready to enter the tree.
>>>
>>> That would be the wrong direction: we need the bufobj because for instance
>>> a RAID5 geom module does not have a vnode for the parity data.
>>>
>>> If you force the syncer to only work on vnodes, then we need a parallel
>>> mechanism for non-filesystem disk users.
>>
>> It's been 5-6 (7?) years since you invented the bufobj, but I still haven't seen
>> anything in GEOM use it as you suggest. You used to have a saying about
>> premature optimization... I'd like to see Attilio's work move forward despite this.
>>
>
> I tend to agree. I also think the syncer is inherently a vnode centric
> operation. RAID5 should have its own rules and optimizations for managing
> its dirty data. It would have to anyway to keep the disk state consistent.
> Wouldn't it be a write through cache anyway and only keep clean data in core?

No, the fundamental idea behind RAID-5 caching is that it should try to
hold onto write buffers in an effort to collect enough to do a full
stripe write, instead of having to do a read-modify-write. So yes,
dirty buffers must be cached. However, I agree that the caching and
syncing policy here is likely to be completely different from what the
syncer might think is appropriate.

Scott
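
To make the full-stripe versus read-modify-write distinction above concrete, here is a minimal sketch in C. It is only an illustration under stated assumptions: every name in it (stripe_cache, stripe_write, parity_full, parity_rmw, flush_reads_needed), the 4-chunk stripe, and the 8-byte chunk size are hypothetical and are not taken from GEOM, the bufobj code, or any FreeBSD API. The read count uses the simplest read-modify-write model (old data for each dirty chunk plus the old parity).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NDATA 4   /* data chunks per stripe (hypothetical value) */
#define CHUNK 8   /* bytes per chunk, kept tiny for illustration */

/* Hypothetical in-core cache for one stripe's pending writes. */
struct stripe_cache {
	uint8_t data[NDATA][CHUNK];
	bool    dirty[NDATA];
};

/* Hold a chunk write in core instead of sending it to disk right away. */
static void
stripe_write(struct stripe_cache *sc, int idx, const uint8_t *buf)
{
	assert(idx >= 0 && idx < NDATA);
	memcpy(sc->data[idx], buf, CHUNK);
	sc->dirty[idx] = true;
}

/* True once every data chunk of the stripe has a cached dirty buffer. */
static bool
stripe_is_full(const struct stripe_cache *sc)
{
	for (int i = 0; i < NDATA; i++)
		if (!sc->dirty[i])
			return (false);
	return (true);
}

/* Full-stripe write: parity is the XOR of the cached chunks, no reads. */
static void
parity_full(const struct stripe_cache *sc, uint8_t *parity)
{
	memset(parity, 0, CHUNK);
	for (int i = 0; i < NDATA; i++)
		for (int j = 0; j < CHUNK; j++)
			parity[j] ^= sc->data[i][j];
}

/*
 * Read-modify-write: the old data and old parity must first be read
 * from disk; the new parity is old_parity ^ old_data ^ new_data.
 */
static void
parity_rmw(uint8_t *parity, const uint8_t *old_data, const uint8_t *new_data)
{
	for (int j = 0; j < CHUNK; j++)
		parity[j] ^= old_data[j] ^ new_data[j];
}

/* Disk reads needed before flushing, under the simple RMW model. */
static int
flush_reads_needed(const struct stripe_cache *sc)
{
	int ndirty = 0;

	if (stripe_is_full(sc))
		return (0);
	for (int i = 0; i < NDATA; i++)
		if (sc->dirty[i])
			ndirty++;
	/* Old data for each dirty chunk, plus the old parity chunk. */
	return (ndirty > 0 ? ndirty + 1 : 0);
}
```

In this model a stripe with a single dirty chunk costs two reads before it can be flushed, while a completely dirty stripe costs none, which is why such a cache would want to hold dirty buffers until the stripe fills rather than flush them on the syncer's schedule.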