Date: Fri, 26 Apr 2019 11:20:01 -0600
From: Alan Somers <asomers@freebsd.org>
Cc: FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject: Re: buf(9) woes: when does bcopy do nothing at all?
Message-ID: <CAOtMX2gGyzMMuCHqjZxXiicqV_1Jx%2BU-Yyr92fYoEPdDi9PzwA@mail.gmail.com>
In-Reply-To: <CAOtMX2gdw%2BeQQU_-DC%2BEgimbCyw6ynbX1haGLUmn1dApk4rMZw@mail.gmail.com>
References: <CAOtMX2gdw%2BeQQU_-DC%2BEgimbCyw6ynbX1haGLUmn1dApk4rMZw@mail.gmail.com>

On Thu, Apr 25, 2019 at 9:31 PM Alan Somers <asomers@freebsd.org> wrote:
>
> How is it possible that bcopy() doesn't affect its output array at
> all?  While investigating a data corruption issue in fuse, I narrowed
> the problem down to a bcopy operation that apparently has no effect.
> The code in question is:
>
>     bcopy(cp, iov->iov_base, cnt);
>     r = memcmp(cp, iov->iov_base, cnt);
>     if (r)
>         printf("uiomove: miscompare\n");
>
> Rationally, I would expect that line never to be printed.  But it
> does.  The destination is always all zeros, even though the source is
> not.  I can only guess that there's something wrong about the way that
> I'm using buf(9), because the output is part of a buffer allocated
> by bread(9).  I've been able to rule out:
>
> 1) Race conditions.  The bug is 100% reproducible, and doubling the
> bcopy or changing the timing in other ways has no effect.
> 2) Unmapped buffer.  I verified that the buf is not unmapped_buf.
> 3) Overlapping src and dst
> 4) Duplicated pages.  I verified that each of the buf's pages has a
> unique physical address
> 5) Bad RAM.  My machine passes memtest86, and anyway the failure is
> too specific and reproducible to be caused by bad hardware.
>
> What could I be missing?  Do I need to do something to prepare the buf
> before I can use it?  The code that allocates the buffer is here:
> https://svnweb.freebsd.org/base/projects/fuse2/sys/fs/fuse/fuse_io.c?view=markup#l240
>
> -Alan

To answer phk's questions, I checked that src and dst don't overlap,
and the kernel's bcopy is actually a wrapper around memmove.  To answer
hps's question, this is on amd64, in a bhyve VM.

I solved the problem - part of it, anyway.  The user-visible problem
that originally led me down this rabbit-hole was an apparent cache
invalidation failure during writes on fusefs.  That turned out to be
caused by an off-by-one error that I just fixed in r346756.  However,
the miscompares remain.

Could those pages be mapped differently for reading than for writing?
I don't know.  At this point, I'm not going to put much more effort
into investigating the problem; I've wasted too much time already.

-Alan
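
For reference, the copy-then-compare check discussed in this message can
be reproduced in userland.  What follows is only a minimal sketch under
stated assumptions, not the fuse code: copy_and_verify() and
ranges_overlap() are hypothetical helper names, and memmove() stands in
for the kernel's bcopy(), which the message says is a wrapper around it.
The overlap test matters because a copy between overlapping ranges can
legitimately make the source differ from the destination afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: return nonzero if the n-byte ranges starting at
 * a and b share at least one byte.  Compares addresses as integers.
 */
static int
ranges_overlap(const void *a, const void *b, size_t n)
{
	uintptr_t pa = (uintptr_t)a;
	uintptr_t pb = (uintptr_t)b;

	return (pa < pb ? pb - pa < n : pa - pb < n);
}

/*
 * Hypothetical helper: copy n bytes from src to dst, then compare the
 * two ranges byte by byte.  Returns -1 if they match, otherwise the
 * offset of the first byte that differs.
 */
static long
copy_and_verify(const void *src, void *dst, size_t n)
{
	const unsigned char *s = src;
	const unsigned char *d = dst;
	size_t i;

	assert(src != NULL && dst != NULL);
	memmove(dst, src, n);
	for (i = 0; i < n; i++) {
		if (s[i] != d[i])
			return ((long)i);
	}
	return (-1);
}
```

Reporting the offset of the first differing byte, rather than only the
memcmp() result, shows whether a miscompare starts at the beginning of
the range or partway through it.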