From owner-freebsd-hackers@FreeBSD.ORG Tue Feb 12 13:11:35 2008 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2517116A469; Tue, 12 Feb 2008 13:11:35 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail18.syd.optusnet.com.au (mail18.syd.optusnet.com.au [211.29.132.199]) by mx1.freebsd.org (Postfix) with ESMTP id 9F37813C448; Tue, 12 Feb 2008 13:11:34 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from c211-30-219-213.carlnfd3.nsw.optusnet.com.au (c211-30-219-213.carlnfd3.nsw.optusnet.com.au [211.30.219.213]) by mail18.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id m1CDBSUA008800 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 13 Feb 2008 00:11:32 +1100 Date: Wed, 13 Feb 2008 00:11:28 +1100 (EST) From: Bruce Evans X-X-Sender: bde@delplex.bde.org To: Poul-Henning Kamp In-Reply-To: <4299.1202816854@critter.freebsd.dk> Message-ID: <20080212234018.O92415@delplex.bde.org> References: <4299.1202816854@critter.freebsd.dk> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Mailman-Approved-At: Tue, 12 Feb 2008 13:45:56 +0000 Cc: freebsd-fs@freebsd.org, freebsd-hackers@freebsd.org, Pav Lucistnik , Andriy Gapon , Bruce Evans Subject: Re: fs/udf: vm pages "overlap" while reading large dir X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 12 Feb 2008 13:11:35 -0000 On Tue, 12 Feb 2008, Poul-Henning Kamp wrote: > In message <47B181F2.2070808@icyb.net.ua>, Andriy Gapon writes: > >> 2.3. this code passes to bread blkno that is calculated as 4*sector, >> where sector is a number of a physical 2048-byte sector. [**] >> [**] - I think that this is a requirement of buffcache system, because >> internally it performs many calculations that seem to assume that block >> size is always 512. > > Yes, the buf-cache works in 512 bytes units throughout. I missed the dbtob() conversions in vfs_bio.c when I replied previously So blkno cannot be a cookie; it must be for a 512-block. So how did the cases where bsize != DEV_BSIZE in getblk() ever work? >> 3.1. for a fresh buf getlbk would assign the following: >> bsize = bo->bo_bsize; >> offset = blkno * bsize; > > That's clearly wrong. If units were always 512-blocks, then anything except bsize = DEV_BSIZE would be clearly wrong. Things aren't that simple (but probably should be). Even RELENG_3 has bsize = f_iosize (possibly != 512) for non-disks. That seems to include nfs(client). In fact, nfs_getcacheblk() does weird scaling which seems to be mainly to compensate for for weird scaling here. It calls getblk() with a bn arg that seems to be f_iosize units. Then at then end, for the VREG case, it sets bp->b_blkno to this bn scaled to normal DEV_BSIZE units. bp->b_blkno seems to have DEV_BSIZE units for all uses of it in nfs. > We need to assert that the blkno is aligned to the start of a sector > and use the 512 byte units, so I guess it would be: > > offset = dbtob(blkno); > KASSERT(!(offset & (bsize - 1)), ("suitable diagnostic")); Barely worth checking. The current bug has nothing to do with this. The offset is just wrong at this point, after using a scale factor that is inconsistent with the units of blkno, for all (?) disk (and other?) file systems whose sector size isn't 512. Bruce