From: Marko Zec <zec@tel.fer.hr>
To: Terry Lambert
cc: freebsd-fs@FreeBSD.org
cc: David Schultz
cc: freebsd-stable@FreeBSD.org
Date: Sat, 19 Apr 2003 11:53:03 +0200
Subject: Re: PATCH: Forcible delaying of UFS (soft)updates
In-Reply-To: <3EA0A647.BEC5931A@mindspring.com>
References: <3E976EBD.C3E66EF8@tel.fer.hr> <200304182348.58356.zec@tel.fer.hr> <3EA0A647.BEC5931A@mindspring.com>
Message-Id: <200304191153.03970.zec@tel.fer.hr>

On Saturday 19 April 2003 03:28, Terry Lambert wrote:
> I think that it is important that the value of syncer_delayno
> needs to continue to be incremented once a second, and that the
> modified sched_sync(), which with your patch no longer does this,
> needs to use its own counter.

If you look again at the _unpatched_ syncer loop, you will clearly see
that syncer_delayno is not guaranteed to be incremented only once a
second.  If speedup_syncer() increases the value of rushjob, then
syncer_delayno will be incremented up to rushjob times within a single
second in the syncer loop, depending on whether the buffers can be
flushed fast enough.  The expedited syncing proceeds until rushjob
drops back to 0.  My patch neither invented nor changed that model at
all.
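To make that concrete, here is a rough user-space sketch of the loop --
written from memory rather than lifted from kern/vfs_subr.c, with plain
counters standing in for the real vnode worklists -- showing that the
hand of the wheel already advances more than one bucket per second
whenever rushjob is positive:

/*
 * Stand-alone model of the unpatched syncer wheel, NOT the real
 * kern/vfs_subr.c code: buckets are plain counters instead of vnode
 * worklists, and "flushing" a bucket just clears its counter.
 */
#include <stdio.h>

#define SYNCER_MAXDELAY 32

static int syncer_delayno;              /* the "hand" of the wheel */
static int rushjob;                     /* bumped by speedup_syncer() */
static int pending[SYNCER_MAXDELAY];    /* stand-in for the worklists */

/* One wakeup of the syncer, i.e. what happens roughly once a second. */
static void
syncer_wakeup(void)
{
        for (;;) {
                pending[syncer_delayno] = 0;    /* "flush" this bucket */
                syncer_delayno = (syncer_delayno + 1) % SYNCER_MAXDELAY;
                if (rushjob > 0) {
                        rushjob--;      /* keep going, don't sleep yet */
                        continue;
                }
                break;                  /* back to sleep for ~1 second */
        }
}

int
main(void)
{
        int second;

        rushjob = 4;    /* pretend speedup_syncer() was called */
        for (second = 1; second <= 3; second++) {
                int before = syncer_delayno;

                syncer_wakeup();
                printf("second %d: syncer_delayno advanced %d -> %d\n",
                    second, before, syncer_delayno);
        }
        return (0);
}

Compile and run it: with rushjob at 4 the hand jumps five buckets in the
first "second", then settles back to one bucket per second.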
> In other words, I think that you need to implement a two handed
> clock algorithm, to keep the buckets from getting too deep with
> work items, and in case there is some dependency which is not
> being accounted for that has been working because there is an
> implicit delay of 1 second or more in vn_syncer_add_to_worklist()
> calls (your patch would break this, so without taking this into
> account, we would have to retest all of soft updates).

Again, please look at the _unmodified_ syncer code.  My patch did not
change a thing regarding the possibility for the syncer to try to flush
more than one syncer_workitem_pending queue in a second.

> > > Then when you write things out, they write out of order.
> >
> > Uhh.. NO!
>
> Uh, yes; potentially they do.  See the implicit dependency
> situation described above.  There are other cases, too, but
> they are much more complicated.  I wish Kirk would speak up
> in more technical detail about the problems you are potentially
> introducing; they require a deep understanding of the soft
> updates code.

I wish also...

> > > The purpose of the wheel is to allow placing of operations at
> > > some relative offset in the future to an outstanding operation,
> > > to ensure ordering.
> >
> > True.  And this has not changed with my patch.
>
> No, it has changed.  It's changed both in the depth of the
> queue entries, and it's changed in the relative spacing of
> implicit dependencies, and it's changed in the relative depth,
> for two or more dependent operations with future offsets.

How?  By simply stopping the softupdates clock for a couple of seconds
(ok, minutes :) more than usual?

> In the depth case, when the code runs, it is going to stall the
> system for a really long time, relatively, because there are
> a number of worklists which are *substantially* deep, because
> vn_syncer_add_to_worklist() was using a syncer_delayno that has
> been assumed to be updated once a second, and never changed
> during your stall.  This means that the worklist represented by
> syncer_workitem_pending[syncer_delayno] is going to contain
> *almost all work* that was enqueued in the interim.
>
> The problem with this is that in the for(;;) loop in sched_sync(),
> in the "if (LIST_FIRST(slp) == vp)" code block, you are likely
> to run yourself into a panic.  See the comment about "sync_fsync()
> moves it to a different slot so we are safe"?  That comment is no
> longer true.

Is it possible you are confusing the sync_fsync() routine in
kern/vfs_subr.c (which I didn't touch) with the modified fsync()
handler in kern/vfs_syscalls.c?

[the rest of the debate deleted]

Can we please either slowly conclude this discussion, or provide a
feasible alternative to the proposed patch?  I am starting to feel like
we are wasting a tremendous amount of time here while going nowhere.

Marko
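P.S.  Since a good part of this argument hinges on how
vn_syncer_add_to_worklist() picks a bucket relative to syncer_delayno,
here is a second toy model covering just that slot selection -- again
written from memory with made-up numbers, not copied from the kernel --
so that we are at least arguing about the same mechanics:

/*
 * Toy model of the slot selection done by vn_syncer_add_to_worklist():
 * an entry queued with a given delay lands "delay" buckets ahead of
 * the current hand position.  Everything here is simplified; the point
 * is only to show how the chosen slot depends on syncer_delayno.
 */
#include <stdio.h>

#define SYNCER_MAXDELAY 32
#define SYNCER_MASK     (SYNCER_MAXDELAY - 1)

static int syncer_delayno;              /* the hand of the wheel */
static int depth[SYNCER_MAXDELAY];      /* entries per bucket */

static void
add_to_worklist(int delay)
{
        int slot;

        if (delay > SYNCER_MAXDELAY - 2)
                delay = SYNCER_MAXDELAY - 2;
        slot = (syncer_delayno + delay) & SYNCER_MASK;
        depth[slot]++;
}

int
main(void)
{
        int i, max, busiest;

        /*
         * Queue 1000 entries with the same delay.  If the hand never
         * moves between calls they all pile into one bucket; uncomment
         * the increment below and they spread around the wheel.
         */
        for (i = 0; i < 1000; i++) {
                add_to_worklist(30);
                /* syncer_delayno = (syncer_delayno + 1) % SYNCER_MAXDELAY; */
        }

        max = busiest = 0;
        for (i = 0; i < SYNCER_MAXDELAY; i++)
                if (depth[i] > max) {
                        max = depth[i];
                        busiest = i;
                }
        printf("busiest bucket is slot %d with %d of 1000 entries\n",
            busiest, max);
        return (0);
}

With the hand frozen, all 1000 entries land in the same slot; let the
hand advance between calls and they spread around the wheel instead.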