Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 29 Aug 2021 01:29:44 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Alexander Lochmann <alexander.lochmann@tu-dortmund.de>
Cc:        freebsd-fs <freebsd-fs@freebsd.org>, Horst Schirmeier <horst@schirmeier.de>
Subject:   Re: Various unprotected accesses to buf and vnode
Message-ID:  <YSq42Cb48SMv%2BsIO@kib.kiev.ua>
In-Reply-To: <46649402-d28a-6f81-f0a8-39180b681f4c@tu-dortmund.de>
References:  <55f3661e-2173-793e-4834-bbcd79d3d99e@tu-dortmund.de> <YSkxgXyXZfNvrXA/@kib.kiev.ua> <380bdcc8-bede-2a64-8e5e-031552231d82@tu-dortmund.de> <YSqhe3WI8dVvUq7g@kib.kiev.ua> <46649402-d28a-6f81-f0a8-39180b681f4c@tu-dortmund.de>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sat, Aug 28, 2021 at 11:16:15PM +0200, Alexander Lochmann wrote:
> 
> 
> On 28.08.21 22:50, Konstantin Belousov wrote:
> > On Sat, Aug 28, 2021 at 08:53:19PM +0200, Alexander Lochmann wrote:
> > > On 27.08.21 20:40, Konstantin Belousov wrote:
> > > > > - Read of buf.b_blkno in cluster_write():
> > > > > According to the documentation b_lock is needed.
> > > > > Is b_blkno maybe a read-only element of struct buf?
> > > > No, b_blkno is not read-only, it is the disk block number for the block,
> > > > as opposed to b_lbklno which is logical file block number.  The buffer
> > > > is instantiated with b_blkno == b_lblkno, and when the buffer is mapped
> > > > to the real disk block, b_blkno is updated.
> > > > 
> > > > Could you show me the backtrace of the situation where cluster_write()
> > > > is called with unlocked buffer?
> > > > 
> > > > > 
> > > If you don't mind, you can find them here:
> > > https://thasos.cs.tu-dortmund.de/bugs/ctx-buf-b_blkno-r-cex.html
> > I see some banner "Counterexamples", two buttons, and then a blank space.
> > Both under Firefox and Chrome under FreeBSD.
> > 
> > I went as far as to ask and see what happens on Chrome under Windows, it
> > looks the same.
> Oh, I'm sorry.
> Pls click on "Show Member List", and then select an entry. For both cases,
> there should be only one entry.
> (JavaScript must be enabled for this to work.)
Ok, I see some call sequences (?), but again all of them are ffs_write()
(one is ext2_write) calling into cluster_write().  There the buffer lock
is owned.

Show me the specific call sequence where it is not.

> > 
> > > (Pls ignore the line 'Hypothesis 1: ....'.)
> > > > > - Write to vnode.v_bufobj.bo_object:
> > > > > https://github.com/freebsd/freebsd-src/blob/main/sys/vm/vnode_pager.c#L291
> > > > > According to the documentation, '[...] the vnode lock which embeds the
> > > > > bufobj' is needed. However, interlock is taken in line 276.
> > > > > Is the interlock equivalent to the vnode lock?
> > > > > (I assume 'the vnode lock' refers to vnode.v_lock.)
> > > > vnode_pager_alloc() must be called with the vnode exclusively locked.
> > > > This is asserted at the beginning of the function.
> > > > 
> > > Yeah, I see that check: ASSERT_VOP_LOCKED(vp, "vnode_pager_alloc");.
> > > However, our data says otherwise: According to our trace, the shared lock is
> > > taken.
> > > You may have a look at https://thasos.cs.tu-dortmund.de/bugs/ctx-vnode-v_bufobj.bo_object-w-cex.html.
> > > 'EMBSAME(vnode.v_lock[r])' refers to the shared vnode lock.
> > > A click on each lock description, e.g., EMBSAME(vnode.v_lock[r]), leads to
> > > the code where it was taken.
> > > (Pls ignore the line 'Hypothesis 1: ....'.)
Ah, yes, the calls from lookup and open would be with the shared lock.
Still, we lock the vnode interlock to avoid double-allocating the v_object
object, so it is fine.  Some mode of the vnode lock is required nonetheless,
because otherwise we might miss reclaim which guarantees that v_object 
is freed.

> > > > > 
> > > > > - Is buf.b_bufobj a read-only element?
> > > > You should scope the question.
> > > > 
> > > > While buffer is owned by a vnode, b_bufobj is constant.  Since buffers are
> > > > type-stable, they migrate between vnodes as cache finds it required to
> > > > reclaim and reuse.  There, b_bufobj is changed.
> > > > 
> > > I'm referring to getdirtybuf(): msleep(&bp->b_xflags,
> > > BO_LOCKPTR(bp->b_bufobj),PRIBIO | PDROP, "getbuf", 0);
> > And?  The msleep() just uses the address as the sleep chain id.
> > The getdirtybuf() function fails if it really slept, thus dropping bo lock
> > and allowing the buffer identity to change (Really not identity, because
> > vnode is locked, but allowing the buffer to be written out and moved to
> > clean queue).



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?YSq42Cb48SMv%2BsIO>