Date: Sat, 19 Sep 1998 08:10:50 +0000 (GMT)
From: Terry Lambert <tlambert@primenet.com>
To: ken@plutotech.com (Kenneth D. Merry)
Cc: tlambert@primenet.com, andreas@klemm.gtn.com, current@FreeBSD.ORG
Subject: Re: panic: newdirrem inum 48733 should be 48732 (SMP+SOFTUPDATES)
Message-ID: <199809190810.BAA19634@usr08.primenet.com>
In-Reply-To: <199809190140.TAA13870@panzer.plutotech.com> from "Kenneth D. Merry" at Sep 18, 98 07:40:34 pm

> > People seem to be having problems with soft updates because of CAM.
>
> I doubt the problems are *because* of CAM.  It's more likely that the
> problems are tickled/exacerbated/demonstrated by CAM.  The pattern of I/O
> that CAM does is different than the old SCSI layer; CAM handles a far
> greater number of outstanding transactions.  (as many as your combination
> of drives and controllers can handle)

If people have applied L. Chen's patch, and are still having the
problem, I'd have to guess that what's happening is that CAM is
not honoring the order of requests in its commits to disk.

It is *imperative* to the soft updates technology that async writes
occur in the order they are requested to occur; the main provision
of soft updates is to not advance the soft clock "wheel" until the
previous wheel slot writes have been committed, such that all
physical writes occur in dependency order.

> > It looks like there is a commit guarantees in the old code that CAM
> > is failing to honor correctly...
>
> Well, if you can point out the problem...
>
> It's entirely possible that there is some problem with the interaction
> between the two, but I wouldn't be so sure.

It is very suspicious to me that the problem didn't occur before CAM.

It seems to me that CAM is perhaps being overly ambitious (from the
point of view of soft updates, to be sure) in optimizing write
request ordering.  Perhaps this has to do with multiple outstanding
write requests in a tagged command queue spanning a dependency
boundary?

If so, then CAM needs to export a command pipeline synchronization
primitive, which is then called from the "syncer" between advances
of the soft clock wheel...

This is all supposition, of course; we really need a detailed
description of the architectural differences between the CAM and
pre-CAM code.
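
To make the supposition above concrete, here is a toy model in Python.
It is not FreeBSD or CAM code, and every name in it (ReorderingQueue,
drain, run_syncer, in_dependency_order) is hypothetical.  It assumes a
queue that may freely reorder whatever writes are outstanding at the
same time, and shows how draining the queue between wheel slots would
keep writes from crossing a dependency boundary.

```python
# Toy model only: not FreeBSD code. All names here are hypothetical.

class ReorderingQueue:
    """Stands in for a tagged command queue that is free to reorder
    whatever writes are outstanding at the same time."""

    def __init__(self):
        self.outstanding = []   # (slot, block) pairs issued, not yet on disk
        self.committed = []     # (slot, block) pairs in the order they hit disk

    def submit(self, slot, block):
        self.outstanding.append((slot, block))

    def drain(self):
        """Hypothetical pipeline synchronization primitive: returns only
        after every outstanding write has been committed."""
        # Model the device's freedom to reorder by sorting on block number.
        self.outstanding.sort(key=lambda w: w[1])
        self.committed.extend(self.outstanding)
        self.outstanding = []


def run_syncer(slots, barrier_between_slots):
    """Issue each wheel slot's writes; optionally drain between slots."""
    q = ReorderingQueue()
    for slot_no, blocks in enumerate(slots):
        for block in blocks:
            q.submit(slot_no, block)
        if barrier_between_slots:
            q.drain()
    q.drain()  # flush whatever is still outstanding at the end
    return q.committed


def in_dependency_order(committed):
    """True if no write from a later slot reached disk before an earlier one."""
    slot_numbers = [slot for slot, _ in committed]
    return slot_numbers == sorted(slot_numbers)


# Slot 0 must be on disk before slot 1; slot 1 happens to use lower blocks.
slots = [[900, 950], [100, 150]]
print(in_dependency_order(run_syncer(slots, barrier_between_slots=False)))  # False
print(in_dependency_order(run_syncer(slots, barrier_between_slots=True)))   # True
```

Without the barrier, both slots are outstanding together and the
slot 1 writes reach disk first; with it, each slot is committed before
the next is issued.  Whether CAM actually reorders in this way is
exactly the open question.
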
Producing that description will be moderately difficult, since the
pre-CAM code was poorly documented, so even if the CAM code is a
known quantity (which I am sure it is, to its authors), it will be
hard to quantify the difference in behaviour resulting from the
difference in architecture.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message
