Date:      Wed, 4 Jul 2012 12:06:33 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Pavlo <devgs@ukr.net>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: mmap() incoherency on hi I/O load (FS is zfs)
Message-ID:  <20120704090633.GH2337@deviant.kiev.zoral.com.ua>
In-Reply-To: <23856.1341389256.6316487571580649472@ffe17.ukr.net>
References:  <91943.1339669820.1305529125424791552@ffe15.ukr.net> <23856.1341389256.6316487571580649472@ffe17.ukr.net>



On Wed, Jul 04, 2012 at 11:07:36AM +0300, Pavlo wrote:
>
> --- Original message ---
> From: "Pavlo" <devgs@ukr.net>
> To: freebsd-fs@freebsd.org
> Date: 14 June 2012, 13:30:20
> Subject: mmap() incoherency on hi I/O load (FS is zfs)
>
> > There is a case where some parts of files that are mapped and then
> > modified get corrupted. By corrupted I mean that some data is fine (the
> > data written using write()/pwrite()) but some looks like it never existed.
> > It seems to have been in the buffers for some time: several processes that
> > used the shared pages simultaneously (access was synchronised, of course)
> > reported its existence. But after some time they (the processes) screamed
> > that it was now lost. Only the part of the data written with pwrite() was
> > there. Everything that was written via mmap() was zero.
> >
> > As I said, it occurs under high I/O load, when 4+ background processes
> > are indexing a huge amount of data. I also want to note that it never
> > occurred in the life of our project while we used mmap() under the same
> > I/O stress conditions, when the mapping covered either a whole file or
> > just a part (the header) starting from the beginning of the file. This is
> > the first time we mapped individual pages, just to save RAM, and this
> > popped up.
> >
> > The solution for this problem is msync() before any munmap(). But the
> > man page says:
> >
> >   The msync() system call is usually not needed since BSD implements a
> >   coherent file system buffer cache.  However, it may be used to associate
> >   dirty VM pages with file system buffers and thus cause them to be flushed
> >   to physical media sooner rather than later.
> >
> > Any thoughts? Thanks.
>
> So I tracked the issue to the place where it occurs. When I commit data to
> a file using mmap() and pwrite() side by side, sometimes the 'newest data'
> is overwritten by 'older data'. The 'older data' can be something written
> either through mmap() or with pwrite(). It never happens when I use only
> mmap() or only pwrite(). This issue also reproduces on UFS. I think there
> is a problem keeping mmapped pages and the FS cache in sync.
I am curious how you label data as newer or older.

I do admit the possibility of a race in the ZFS double-copy implementation
of mmap/cache coherency, but I am somewhat skeptical about the same
possibility for UFS. What you are saying might indicate that we lose the
modified/dirty bits for the page, but that would cause much more fireworks
than just an occasional race with write.

What version of the system? Does the machine swap?

>
> I will try to make a test that reliably reproduces the issue.
Yes, an isolated test case is the best route forward. It would either show
a bug or demonstrate a misunderstanding on your part.
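
As a rough illustration of what such an isolated test might look like, here
is a minimal sketch in C. It is an assumption about the scenario described
above, not code from the thread: the helper name mixed_write_check(), the
tag byte and the file path are all hypothetical. It writes the first half of
a page through a MAP_SHARED mapping and the second half with pwrite(),
unmaps without msync(), and then reads the page back with pread().

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from the thread.
 * Fills one page of `path` with a non-zero `tag`: the first half through a
 * shared mapping, the second half with pwrite(). The mapping is removed
 * WITHOUT msync(), then the page is read back with pread().
 * Returns 0 if every byte equals `tag`, 1 if some data was lost,
 * and -1 on a system call error.
 */
int
mixed_write_check(const char *path, unsigned char tag)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	size_t half = page / 2;
	unsigned char *buf, *map;
	int fd, rv = -1;

	buf = malloc(page);
	if (buf == NULL)
		return (-1);
	fd = open(path, O_RDWR | O_CREAT, 0600);
	if (fd == -1)
		goto out_buf;
	if (ftruncate(fd, (off_t)page) == -1)
		goto out_fd;

	/* Map a single page, as in the "individual pages" case. */
	map = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto out_fd;

	/* First half: store through the mapping. */
	memset(map, tag, half);

	/* Second half of the same page: pwrite() on the descriptor. */
	memset(buf, tag, half);
	if (pwrite(fd, buf, half, (off_t)half) != (ssize_t)half) {
		munmap(map, page);
		goto out_fd;
	}

	/* Deliberately no msync() before munmap(). */
	if (munmap(map, page) == -1)
		goto out_fd;

	/* Read the whole page back through the file descriptor. */
	if (pread(fd, buf, page, 0) != (ssize_t)page)
		goto out_fd;

	rv = 0;
	for (size_t i = 0; i < page; i++) {
		if (buf[i] != tag) {
			rv = 1;		/* some bytes were lost */
			break;
		}
	}
out_fd:
	close(fd);
out_buf:
	free(buf);
	return (rv);
}
```

On a system with coherent mmap/write behaviour this should always return 0.
To approximate the reported conditions, one could call it in a loop with
changing tag values from several processes while heavy background I/O is
running. A single pass like this is unlikely to hit a race on its own.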




