Date:      Thu, 19 Jun 2025 21:42:01 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Sanchit Sahay <ss19723@nyu.edu>
Cc:        freebsd-fs@freebsd.org, freebsd-hackers@freebsd.org
Subject:   Re: Corrupted bp->b_lblkno on bread() // Life-cycle of a buf obj?
Message-ID:  <aFRZ-Q62-WCx1Z7D@kib.kiev.ua>
In-Reply-To: <CAJ4siUBgGbeDKO8%2BW5JULfW8U0oLO6=xhjTr-utxuqV3N3Fnkg@mail.gmail.com>

On Tue, Jun 17, 2025 at 11:07:49PM -0400, Sanchit Sahay wrote:
> I'm working on porting a filesystem to FreeBSD, and am running into an
> issue that I'm having difficulty debugging. Any help would be appreciated.
> 
> When calling bread() with a blkno=lblkno, by the time the flow of
> control reaches the vop_strategy function, the value of lblkno changes from
> 0 to a seemingly random value.
There is something strange in this sentence.  First you claim that
b_blkno == b_lblkno, then you claim that b_lblkno changes from 0 to some
random value.

So, is it 0 or b_blkno?

> 
> Having inspected this with gdb,
> 
> On frame 9:
> 
> #9  0xffff0000c3e72930 in hfs_strategy ()
> 1488            kdb_enter("lblk random", "lblk random");
> 
> (kgdb) p ap->a_bp->b_lblkno
> $10 = -281474971149872
> 
> On frame 10:
> 
> #10 0xffff0000009387b0 in VOP_STRATEGY_APV () at vnode_if.c:2423
> 2423                    rc = vop->vop_strategy(a);
> 
> (kgdb) p a->a_bp->b_lblkno
> $11 = 0
And the same pattern occurs there.

> 
> This flow is triggered when calling bread() like so:
> 
> retval = bread(vp, blockNum, block->blockSize, NOCRED, &bp);
> 
> The stack trace is:
> 
> #9  0xffff0000c3e72930 in hfs_strategy (ap=0xffff00009bbd1058)
> #10 0xffff0000009387b0 in VOP_STRATEGY_APV (
> #11 0xffff00000054bbcc in VOP_STRATEGY (vp=0xffff000000a08fc5,
> #12 bufstrategy (bo=<optimized out>, bp=0xffff0000404990c8)
> #13 0xffff00000054d6f0 in bstrategy (bp=0xffff0000404990c8)
> #14 breadn_flags
> 
> There seems to be no code run between these two stacks, the a_bp in both
> these frames points to the same memory address. No other fields are
> modified between these two frames.
> 
> Because of this seemingly random lblkno value, VOP_BMAP is not triggered,
> and the read returns arbitrary results.
> 
> This issue only occurs when I have the kernel compiled with these
> additional flags (as suggested by the handbook for debugging deadlocks):
> 
> options INVARIANTS
> options INVARIANT_SUPPORT
> options WITNESS
> options WITNESS_SKIPSPIN
> options DEBUG_LOCKS
> options DEBUG_VFS_LOCKS
> options DIAGNOSTIC
> 
> Without these flags enabled, this lblkno corruption does not take place,
> and the bread returns a valid read. I don't see any conditional code that
> these flags enable which would cause such an issue.
And this smells like a KBI (Kernel Binary Interface) issue, since DEBUG_LOCKS
changes the layout of struct lock, which is embedded into struct buf,
the structure you have problems with.

How do you build your fs code? As a module?  If yes, you must use the same
set of opt_*.h headers as used for the kernel build.

> 
> Any tips on how to investigate this further would be greatly appreciated,
> or if I am missing something about the lifecycle of the buffer object that
> might cause it to "reset" certain fields.
> 
> Thanks
> Sanchit Sahay

