Date:      Fri, 26 Apr 2019 11:20:01 -0600
From:      Alan Somers <asomers@freebsd.org>
Cc:        FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject:   Re: buf(9) woes: when does bcopy do nothing at all?
Message-ID:  <CAOtMX2gGyzMMuCHqjZxXiicqV_1Jx%2BU-Yyr92fYoEPdDi9PzwA@mail.gmail.com>
In-Reply-To: <CAOtMX2gdw%2BeQQU_-DC%2BEgimbCyw6ynbX1haGLUmn1dApk4rMZw@mail.gmail.com>
References:  <CAOtMX2gdw%2BeQQU_-DC%2BEgimbCyw6ynbX1haGLUmn1dApk4rMZw@mail.gmail.com>

On Thu, Apr 25, 2019 at 9:31 PM Alan Somers <asomers@freebsd.org> wrote:
>
> How is it possible that bcopy() doesn't affect its output array at
> all?  While investigating a data corruption issue in fuse, I narrowed
> the problem down to a bcopy operation that apparently has no effect.
> The code in question is:
>
> bcopy(cp, iov->iov_base, cnt);
> r = memcmp(cp, iov->iov_base, cnt);
> if (r)
>     printf("uiomove: miscompare\n");
>
> Rationally, I would expect that line never to be printed.  But it
> does.  The destination is always all zeros, even though the source is
> not.  I can only guess that there's something wrong about the way that
> I'm using buf(9), because the output is part of a buffer allocated
> by bread(9).  I've been able to rule out:
>
> 1) Race conditions.  The bug is 100% reproducible, and doubling the
> bcopy or changing the timing in other ways has no effect.
> 2) Unmapped buffer.  I verified that the buf is not unmapped_buf.
> 3) Overlapping src and dst
> 4) Duplicated pages.  I verified that each of the buf's pages has a
> unique physical address
> 5) Bad RAM.  My machine passes memtest86, and anyway the failure is
> too specific and reproducible to be caused by bad hardware.
>
> What could I be missing?  Do I need to do something to prepare the buf
> before I can use it?  The code that allocates the buffer is here:
> https://svnweb.freebsd.org/base/projects/fuse2/sys/fs/fuse/fuse_io.c?view=markup#l240
>
> -Alan

To answer phk's questions, I checked that src and dst don't overlap,
and the kernel's bcopy is actually a wrapper around memmove.
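
For illustration, here is a minimal userland sketch of those two points,
not the kernel code itself.  The helper names (ranges_overlap,
bcopy_sketch, first_mismatch) are hypothetical; only memmove comes from
the C library.  It assumes a flat address space where comparing pointers
as integers is meaningful.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero if [a, a+len) and [b, b+len) share a byte. */
static int
ranges_overlap(const void *a, const void *b, size_t len)
{
	uintptr_t pa = (uintptr_t)a;
	uintptr_t pb = (uintptr_t)b;

	if (len == 0)
		return (0);
	/* The ranges overlap when the start addresses are closer than len. */
	return (pa < pb ? pb - pa < len : pa - pb < len);
}

/*
 * Userland stand-in for a bcopy built on memmove.  Note the argument
 * order: bcopy takes (src, dst, len), memmove takes (dst, src, len).
 */
static void
bcopy_sketch(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);
}

/* Hypothetical helper: offset of the first differing byte, or len if equal. */
static size_t
first_mismatch(const void *a, const void *b, size_t len)
{
	const unsigned char *pa = a;
	const unsigned char *pb = b;
	size_t i;

	for (i = 0; i < len; i++)
		if (pa[i] != pb[i])
			break;
	return (i);
}
```

Reporting the offset from first_mismatch instead of a bare memcmp result
would show whether a miscompare starts at a page boundary, which might
help narrow down a mapping problem.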

To answer hps's question, this is on amd64, in a bhyve VM.

I solved the problem - part of it, anyway.  The user-visible problem
that originally led me down this rabbit-hole was an apparent cache
invalidation failure during writes on fusefs.  That turned out to be
caused by an off-by-one error that I just fixed in r346756.  However,
the miscompares remain.  Could those pages be mapped differently for
reading than for writing?  I don't know.  At this point, I'm not going
to put much more effort into investigating the problem; I've wasted
too much time already.

-Alan


