Date: Mon, 4 Jul 2016 21:32:49 -0500
From: Karl Denninger <karl@denninger.net>
To: freebsd-hackers@freebsd.org
Subject: Re: ZFS ARC and mmap/page cache coherency question
Message-ID: <768b6169-70d9-5500-c455-563d8340972e@denninger.net>
In-Reply-To: <34cf2d30-8884-95b6-f852-457d55710daf@freebsd.org>
References: <20160630140625.3b4aece3@splash.akips.com>
 <CALXu0UfxRMnaamh%2Bpo5zp=iXdNUNuyj%2B7e_N1z8j46MtJmvyVA@mail.gmail.com>
 <20160703123004.74a7385a@splash.akips.com>
 <155afb8148f.c6f5294d33485.2952538647262141073@nextbsd.org>
 <45865ae6-18c9-ce9a-4a1e-6b2a8e44a8b2@denninger.net>
 <155b84da0aa.ad3af0e6139335.8627172617037605875@nextbsd.org>
 <7e00af5a-86cd-25f8-a4c6-2d946b507409@denninger.net>
 <34cf2d30-8884-95b6-f852-457d55710daf@freebsd.org>
On 7/4/2016 21:28, Allan Jude wrote:
> On 2016-07-04 22:26, Karl Denninger wrote:
>>
>> On 7/4/2016 18:45, Matthew Macy wrote:
>>>
>>> ---- On Sun, 03 Jul 2016 08:43:19 -0700 Karl Denninger <karl@denninger.net> wrote ----
>>> >
>>> > On 7/3/2016 02:45, Matthew Macy wrote:
>>> > >
>>> > > Cedric greatly overstates the intractability of resolving it. Nonetheless, since the initial import very little has been done to improve integration, and I don't know of anyone who is up to the task taking an interest in it. Consequently, mmap() performance is likely "doomed" for the foreseeable future. -M----
>>> >
>>> > Wellllll....
>>> >
>>> > I've done a fair bit of work here (see
>>> > https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=187594) and the
>>> > political issues are at least as bad as the coding ones.
>>> >
>>>
>>>
>>> Strictly speaking, the root of the problem is the ARC. Not ZFS per se. Have you ever tried disabling MFU caching to see how much worse LRU only is? I'm not really convinced the ARC's benefits justify its cost.
>>>
>>> -M
>>>
>> The ARC is very useful when it gets a hit as it avoids an I/O that would
>> otherwise take place.
>>
>> Where it sucks is when the system evicts working set to preserve ARC.
>> That's always wrong in that you're trading a speculative I/O (if the
>> cache is hit later) for a *guaranteed* one (to page out) and maybe *two*
>> (to page back in.)
>>
> ZFS is better behaved in 11.x, there is a sysctl vfs.zfs.arc_free_target
> that makes sure the ARC is reined in when there is memory pressure, by
> ensuring a minimum amount of actually free pages.
>
Oh, but.....

Again, go read the PR I linked (and the current version of the patch against 10-STABLE.) The issues are far more intertwined than that. Specifically, the dmu_tx cache decision (size of the write-back cache) is flat-out broken and inappropriate in essentially all cases, and the interaction of UMA and ARC is very destructive under a wide variety of workloads.

The patch has a hack-around for the dmu_tx problem and a reasonably effective fix for the UMA issues. Actually fixing dmu_tx, however, is nowhere near that easy, since it really needs to be computed per-zvol on an actual bytes-moved-per-unit-of-time basis.

Note that one of the patches in the set I developed is indeed arc_free_target (indeed it was the first approach I took) -- but without addressing the other two issues it doesn't solve the problem.

-- 
Karl Denninger
karl@denninger.net <mailto:karl@denninger.net>
/The Market Ticker/
/[S/MIME encrypted email preferred]/
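
The idea of sizing the write-back cache "per-zvol on an actual bytes moved per-unit-of-time basis" can be illustrated with a small sketch. This is not ZFS code and not what the patch in the PR does. The class name, the smoothing factor, the target window, and the floor and ceiling values are all hypothetical, chosen only to show the shape of a throughput-driven limit: each volume tracks a smoothed bytes-per-second rate and sets its dirty-data limit to roughly a fixed number of seconds of that rate.

```python
from dataclasses import dataclass


@dataclass
class WriteBackSizer:
    """Hypothetical per-volume write-back limit estimator (illustration only)."""

    target_seconds: float = 5.0   # assumed: seconds of writes worth buffering
    alpha: float = 0.25           # assumed: weight given to the newest sample
    min_bytes: int = 1 << 20      # assumed floor (1 MiB)
    max_bytes: int = 1 << 30      # assumed ceiling (1 GiB)
    rate: float = 0.0             # smoothed throughput in bytes per second
    seeded: bool = False          # True once the first sample has been seen

    def observe(self, bytes_moved: int, elapsed_seconds: float) -> None:
        """Fold one measurement of bytes actually written into the average."""
        if elapsed_seconds <= 0:
            raise ValueError("elapsed_seconds must be positive")
        sample = bytes_moved / elapsed_seconds
        if not self.seeded:
            # The first sample seeds the average directly.
            self.rate = sample
            self.seeded = True
        else:
            # Exponentially weighted moving average of throughput.
            self.rate = self.alpha * sample + (1.0 - self.alpha) * self.rate

    def limit(self) -> int:
        """Dirty-data limit in bytes: target_seconds of smoothed throughput,
        clamped to the configured floor and ceiling."""
        want = int(self.rate * self.target_seconds)
        return max(self.min_bytes, min(self.max_bytes, want))


# One sizer per volume, keyed by a made-up volume name.
sizers = {"tank/vol0": WriteBackSizer(), "tank/vol1": WriteBackSizer()}
sizers["tank/vol0"].observe(100 * 2**20, 1.0)  # a fast device
sizers["tank/vol1"].observe(4 * 2**20, 1.0)    # a slow device
```

With a single fixed limit, a slow volume can accumulate many seconds of pending writes while a fast one is throttled early. Keying the limit to measured throughput per volume is one way to avoid both, under the assumptions above.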
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?768b6169-70d9-5500-c455-563d8340972e>
