Date:      Wed, 29 Jun 2016 23:14:51 -0700
From:      Andrew Bates <andrewbates09@gmail.com>
To:        Paul Koch <paul.koch137@gmail.com>
Cc:        "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Subject:   Re: ZFS ARC and mmap/page cache coherency question
Message-ID:  <CAPi5Lmm6RtXQ6UxzcfoRKtGC-LfBLJAW0qOy6=F5fh3mg-OB5w@mail.gmail.com>
In-Reply-To: <20160630140625.3b4aece3@splash.akips.com>
References:  <20160630140625.3b4aece3@splash.akips.com>

Heya Paul,

How is your ZFS configured (zfs get all tank0)?

None of these are absolute or perfect, but if you haven't yet, I suggest
taking a look at the following:

* http://open-zfs.org/wiki/Performance_tuning
* https://www.joyent.com/blog/bruning-questions-zfs-record-size
* http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide

On Wed, Jun 29, 2016 at 9:06 PM, Paul Koch <paul.koch137@gmail.com> wrote:

>
> Posted this to -stable on the 15th June, but no feedback...
>
> We are trying to understand a performance issue when syncing large mmap'ed
> files on ZFS.
>
> Example test box setup:
>  FreeBSD 10.3-p5
>  Intel i7-5820K 3.30GHz with 64G RAM
>  6 * 2 Tbyte Seagate ST2000DM001-1ER164 in a ZFS stripe
>
> Read performance of a sequentially written large file on the pool is
> typically around 950Mbytes/sec using dd.
>
> Our software mmap's some large database files using MAP_NOSYNC, and we call
> fsync() every 10 minutes when we know the file system is mostly idle.  In
> our test setup, the database files are 1.1G, 2G, 1.4G, 12G, 4.7G and ~20
> small files (under 10M).  All of the memory pages in the mmap'ed files are
> updated every minute with new values, so the entire mmap'ed file needs to
> be synced to disk, not just fragments.
>
> When the 10 minute fsync() occurs, gstat typically shows very few disk
> reads and very high write speeds, which is what we expect.  But, every 80
> minutes we process the data in the large mmap'ed files and store it in
> highly compressed blocks of a ~300G file using pread/pwrite (i.e. not
> mmap'ed).
> After that, the performance of the next fsync() of the mmap'ed files falls
> off a cliff.  We are assuming it is because the ARC has thrown away the
> cached data of the mmap'ed files.  gstat shows lots of read/write
> contention and lots of things tend to stall waiting for disk.
>
> Is this just a lack of ZFS ARC and page cache coherency?
>
> Is there a way to prime the ARC with the mmap'ed files again before we call
> fsync() ?
>
> We've tried cat and read() on the mmap'ed files, but that doesn't seem to
> touch the disk at all and the fsync() performance is still poor, so it
> looks like the ARC is not being filled.  msync() doesn't seem to be much
> different.
> mincore() stats show the mmap'ed data is entirely incore and referenced.
>
>         Paul.



-- 
V/Respectfully,
Andrew M Bates


