Date: Fri, 24 Mar 2017 02:16:11 -0700
From: Mark Millard <markmi@dsl-only.net>
To: freebsd-hackers@freebsd.org
Cc: freebsd-arm <freebsd-arm@freebsd.org>
Subject: head -r315870 (e.g.): fork-then-swap-out [zero RES(ident memory)] questions tied to arm64 failures (zeroed memory pages)
Message-ID: <4D7B13F4-142E-4294-A6E7-11CCD4C92AC5@dsl-only.net>
In-Reply-To: <59C6BC12-1BC2-41D5-8B47-D0AD44D2CDF0@dsl-only.net>
References: <59C6BC12-1BC2-41D5-8B47-D0AD44D2CDF0@dsl-only.net>

On 2017-Mar-22, at 3:09 PM, Mark Millard <markmi@dsl-only.net> wrote:

> The later questions are associated with:
>
> Bugzilla 217239 and 217138 (which I now expect have a common cause)
> https://lists.freebsd.org/pipermail/freebsd-arm/2017-March/015867.html
> (and its thread)
>
> These are tied to some process memory pages being trashed (to
> be zero) in particular types of arm64 contexts. This is
> reproducible in multiple arm64 contexts. The context is head
> but I believe there are reports in the lists tied to 11 as
> well.
>
> [Unfortunately the above all very much shows a learn-as-I-go
> property. Also the list has a sub-exchange on my testing other
> devices to check for device failures that is not directly
> relevant here.]
>
> These are tied to problems with fork-then-swap-out-then-swap-in
> contexts on arm64. (Even though I've occasionally typed amd64
> accidentally in places in those materials.) Memory allocations
> from before the fork are involved, ones not yet accessed by
> the child side of the fork at the time of the fork.
>
> fork sets up copy-on-write so that the child process temporarily
> shares pages (those it does not write), or should.
>
> But what if the parent process or both parent and child are
> swapped-out just shortly after the fork (so, say, top -PCwaopid
> shows zero for RES(ident memory))? What is the handling of, say,
> the child swapping back in while the parent still is swapped
> out?
>
> I notice that the child can have a much smaller SWAP figure
> than the parent so it would appear that the parent swap-out
> has pages that the child does not.
>
> So what if the child needs some of those pages? What should
> happen? (Vs. what does happen on arm64 in specific types
> of contexts? More below.)
>
> I ask mostly to see if I can do more to give evidence of
> what is going on and what to test for any proposed fix.
> I'm not likely to find the underlying problem(s) for arm64
> directly, unlike my investigation that led to
> fork-trampoline being fixed in head's -r313772
> (2017-Feb-15).
>
> [ https://lists.freebsd.org/pipermail/freebsd-arm/2017-February/015656.html
> and its thread, including when its title changed in:
> https://lists.freebsd.org/pipermail/freebsd-arm/2017-February/015678.html
> .]
>
> Part of that unlikely-to-solve status is because the
> context seems to be bound to a lot of special conditions
> and interesting behaviors simultaneously:
>
> A) Both my original reproductions of problem reports on the
>    lists and the only (simple) programs for reproducing the
>    problems involve fork-then-swap-out [zero RES(ident
>    memory)]. Neither fork by itself nor swap-out/in by
>    itself has been sufficient.
>
> B) jemalloc's tcache being in use (__je_tcache_maxclass == 32*1024)
>    is part of every example of reproduction of the problem.
>
> C) allocations <= SMALL_MAXCLASS (SMALL_MAXCLASS==14*1024) get
>    the failure (but bigger ones work, both fitting inside
>    __je_tcache_maxclass and not). Again: every example
>    reproduction of the problem has this status.
>
> D) FreeBSD sometimes gets into a state where /etc/malloc.conf
>    doing tcache:false does not seem to disable tcache. (Rebooting
>    goes back to tcache:false working after such has been
>    observed.) [Related or independent? I've no clue.] Usually
>    tcache:false seems to work and so avoid the failures.
>
> E) Use of POSIX_MADV_WILLNEED on the problematical allocation(s)
>    in the child process after the fork but before the swap-outs
>    of the child and parent prevents the failures (no read or
>    write access to the memory from the child until after the
>    swap-in). Doing so just in the parent process does not prevent
>    the failures.
>
> F) Similar to (E) but read-accessing a byte or more of one or
>    more pages from the problematical allocations from the child
>    process after the fork but before the swap-out makes those
>    specific pages not fail. (The others still fail, if any.)
>    Done from the parent process instead does not avoid the
>    failures.
>
> G) In a sequence like: su creates a sh which then runs one
>    of my test programs that then forks off a child it can be
>    that all of the 4 processes show the zeroed memory area
>    like the child process does. su and sh need to have
>    swapped-out and back in for them to get failures. su and
>    sh die once they hit an assert that fails due to the zeroed
>    memory page(s). The asserts involve addresses also messed
>    up in the test program processes (parent and child).
>
> In my reading I've not been able to determine what to expect
> for fork-then-swap-out-and-back-in for pages that the child
> process had not accessed yet but which might not be around
> for later activity because of the parent process's own
> swapped-out status at the time.
>
> Note: While I usually run a non-debug kernel I've tried
> a debug kernel and it provided no notices of problems. I
> got no additional information from the attempt.
>
> [My usual KERNCONF file includes GENERIC and then disables
> various debug items.]
>
> The bugzilla reports have example code for showing the
> problems and various behaviors. The two in 217239 are
> probably of more interest than the first one on 217138.

I just updated the pine64+ 2GB to head -r315870 and it still
gets the trashed-with-zeros pages from sequences such as:

allocation(s) (tcache in use with fitting <= SMALL_MAXCLASS)
initialize them (to non-zero bytes)
fork
sleep/wait then swap-out [zero RES(ident memory)]
   (both parent and child in my tests)
(Note the lack of access so far on the child process side.)
swap-in

After swap-in both the Child and Parent see the indicated
allocation(s) as having only zero bytes instead of the
initialization values.

===
Mark Millard
markmi at dsl-only.net