Date: Tue, 5 Jul 2016 12:50:16 -0500
From: Karl Denninger <karl@denninger.net>
To: freebsd-hackers@freebsd.org
Subject: Re: ZFS ARC and mmap/page cache coherency question
Message-ID: <31f4d30f-4170-0d04-bd23-1b998474a92e@denninger.net>
In-Reply-To: <155bc1260e6.12001bf18198857.6272515207330027022@nextbsd.org>
References: <20160630140625.3b4aece3@splash.akips.com>
 <CALXu0UfxRMnaamh%2Bpo5zp=iXdNUNuyj%2B7e_N1z8j46MtJmvyVA@mail.gmail.com>
 <20160703123004.74a7385a@splash.akips.com>
 <155afb8148f.c6f5294d33485.2952538647262141073@nextbsd.org>
 <45865ae6-18c9-ce9a-4a1e-6b2a8e44a8b2@denninger.net>
 <155b84da0aa.ad3af0e6139335.8627172617037605875@nextbsd.org>
 <7e00af5a-86cd-25f8-a4c6-2d946b507409@denninger.net>
 <155bc1260e6.12001bf18198857.6272515207330027022@nextbsd.org>
On 7/5/2016 12:19, Matthew Macy wrote:
>
> ---- On Mon, 04 Jul 2016 19:26:06 -0700 Karl Denninger <karl@denninger.net> wrote ----
> >
> > On 7/4/2016 18:45, Matthew Macy wrote:
> > >
> > > ---- On Sun, 03 Jul 2016 08:43:19 -0700 Karl Denninger <karl@denninger.net> wrote ----
> > > >
> > > > On 7/3/2016 02:45, Matthew Macy wrote:
> > > > >
> > > > > Cedric greatly overstates the intractability of resolving it.
> > > > > Nonetheless, since the initial import very little has been done
> > > > > to improve integration, and I don't know of anyone who is up to
> > > > > the task taking an interest in it. Consequently, mmap()
> > > > > performance is likely "doomed" for the foreseeable future.
> > > > >
> > > > > -M
> > > >
> > > > Wellllll....
> > > >
> > > > I've done a fair bit of work here (see
> > > > https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=187594) and the
> > > > political issues are at least as bad as the coding ones.
> > >
> > > Strictly speaking, the root of the problem is the ARC. Not ZFS per se.
> > > Have you ever tried disabling MFU caching to see how much worse LRU
> > > only is? I'm not really convinced the ARC's benefits justify its cost.
> > >
> > > -M
> >
> > The ARC is very useful when it gets a hit as it avoid an I/O that would
> > otherwise take place.
> >
> > Where it sucks is when the system evicts working set to preserve ARC.
> > That's always wrong in that you're trading a speculative I/O (if the
> > cache is hit later) for a *guaranteed* one (to page out) and maybe *two*
> > (to page back in.)
>
> The question wasn't ARC vs. no-caching. It was LRU only vs LRU + MFU.
> There are a lot of issues stemming from the fact that ZFS is a
> transactional object store with a POSIX FS on top. One is that it caches
> disk blocks as opposed to file blocks. However, if one could resolve that
> and have the page cache manage these blocks life would be much much
> better. However, you'd lose MFU. Hence my question.
>
> -M
>

I suspect there's an argument to be made there, but the present problems
make determining the impact difficult or impossible, as those effects
are swamped by the other issues.

I can fairly easily create workloads on the base code where simply
typing "vi <some file>", making a change and hitting ":w" results in a
stall of tens of seconds or more while the requested cache flush is run
down. I've resolved a good part (but not all instances) of this through
my work.

My understanding is that 11- has had additional work done to the base
code, but from what I can see in the commit logs and discussions, three
underlying issues are not addressed:

1. The VM system will page out working set while leaving ARC alone.
2. UMA reserved-but-not-in-use space is not policed adequately when
   memory pressure exists *before* the pager starts considering evicting
   working set.
3. The write-back cache is grossly inappropriate for many machine
   configurations and cannot be tuned adequately by hand (this is
   particularly true on a system with vdevs that have materially varying
   performance levels).

I have more-or-less stopped work on the tree on a forward basis since I
(1) got to a place with 10.2 that works for my production requirements,
resolving the problems, and (2) ran into what I deemed to be intractable
political issues within core on progress toward eradicating the root of
the problem.

I will probably revisit the situation with 11- at some point, as I'll
want to roll my production systems forward. However, I don't know when
that will be -- right now 11- is stable enough for some of my embedded
work (e.g. on the Raspberry Pi2) but is not on my server and
client-class machines. Indeed, just yesterday I got a lock-order
reversal panic while doing a shutdown after a kernel update on one of my
lab boxes running a just-updated 11- codebase.
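
The trade-off quoted earlier in this thread (evicting working set to
preserve ARC trades a speculative I/O for a guaranteed one, and maybe
two) can be put in rough numbers. What follows is only a back-of-envelope
sketch, not anything from the FreeBSD tree. The function names and the
probabilities are hypothetical, and it assumes every disk I/O costs the
same and that the evicted page is anonymous or dirty, so it must be
written out before its memory can be reused.

```python
def expected_ios_evict_arc(p_arc_rehit: float) -> float:
    """Expected extra disk I/Os if a clean ARC block is dropped instead.

    Dropping a clean cached block costs nothing up front; a re-read is
    needed only if the block is requested again (probability p_arc_rehit).
    """
    return p_arc_rehit * 1.0


def expected_ios_evict_working_set(p_page_reused: float) -> float:
    """Expected disk I/Os if a working-set page is paged out instead.

    Assumption: the page is anonymous or dirty, so the page-out write is
    guaranteed; a page-in read follows if the page is touched again
    (probability p_page_reused).
    """
    return 1.0 + p_page_reused * 1.0


if __name__ == "__main__":
    # Illustrative probabilities only; real values depend on the workload.
    for p in (0.1, 0.5, 0.9):
        arc = expected_ios_evict_arc(p)
        ws = expected_ios_evict_working_set(p)
        print(f"p={p:.1f}  drop ARC block: {arc:.1f} I/O   page out: {ws:.1f} I/O")
```

Under these assumptions, dropping the ARC block costs at most one I/O,
while paging out costs at least one, so preserving ARC at the expense of
working set never comes out ahead in this simple model.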
-- 
Karl Denninger
karl@denninger.net
/The Market Ticker/
/[S/MIME encrypted email preferred]/
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?31f4d30f-4170-0d04-bd23-1b998474a92e>
