Date:      Fri, 24 Mar 2017 02:16:11 -0700
From:      Mark Millard <markmi@dsl-only.net>
To:        freebsd-hackers@freebsd.org
Cc:        freebsd-arm <freebsd-arm@freebsd.org>
Subject:   head -r315870 (e.g.): fork-then-swap-out [zero RES(ident memory)] questions tied to arm64 failures (zeroed memory pages)
Message-ID:  <4D7B13F4-142E-4294-A6E7-11CCD4C92AC5@dsl-only.net>
In-Reply-To: <59C6BC12-1BC2-41D5-8B47-D0AD44D2CDF0@dsl-only.net>
References:  <59C6BC12-1BC2-41D5-8B47-D0AD44D2CDF0@dsl-only.net>

On 2017-Mar-22, at 3:09 PM, Mark Millard <markmi@dsl-only.net> wrote:

> The later questions are associated with:
> 
> Bugzilla 217239 and 217138 (which I now expect have a common cause)
> https://lists.freebsd.org/pipermail/freebsd-arm/2017-March/015867.html
> (and its thread)
> 
> These are tied to some process memory pages being trashed (to
> be zero) in particular types of arm64 contexts. This is
> reproducible in multiple arm64 contexts. The context is head
> but I believe there are reports in the lists tied to 11 as
> well.
> 
> [Unfortunately the above all very much shows a learn-as-I-go
> property. Also the list has a sub-exchange on my testing other
> devices to check for device failures that is not directly
> relevant here.]
> 
> These are tied to problems with fork-then-swap-out-then-swap-in
> contexts on arm64. (Even though I've occasionally typed amd64
> accidentally in places in those materials.) Memory allocations
> from before the fork are involved, ones not yet accessed by
> the child side of the fork at the time of the fork.
> 
> fork sets up copy-on-write so that the child process temporarily
> shares pages (those it does not write), or should.
> 
> But what if the parent process or both parent and child are
> swapped-out just shortly after the fork (so, say, top -PCwaopid
> shows zero for RES(ident memory))? What is the handling of, say,
> the child swapping back in while the parent still is swapped
> out?
> 
> I notice that the child can have a much smaller SWAP figure
> than the parent so it would appear that the parent swap-out
> has pages that the child does not.
> 
> So what if the child needs some of those pages? What should
> happen? (Vs. what does happen on arm64 in specific types
> of contexts? More below.)
> 
> I ask mostly to see if I can do more to give evidence of
> what is going on and what to test for any proposed fix.
> I'm not likely to find the underlying problem(s) for arm64
> directly, unlike my investigation that led to
> fork-trampoline being fixed in head's -r313772
> (2017-Feb-15).
> 
> [ https://lists.freebsd.org/pipermail/freebsd-arm/2017-February/015656.html
>  and its thread, including when its title changed in:
>  https://lists.freebsd.org/pipermail/freebsd-arm/2017-February/015678.html
> .]
> 
> Part of that unlikely-to-solve status is because the
> context seems to be bound to a lot of special conditions
> and interesting behaviors simultaneously:
> 
> A) Both my original reproductions of problem reports on the
>   lists and the only (simple) programs for reproducing the
>   problems involve fork-then-swap-out [zero RES(ident
>   memory)]. Neither fork by itself nor swap-out/in by
>   itself have been sufficient.
> 
> B) jemalloc's tcache being in use (__je_tcache_maxclass == 32*1024)
>   is part of every example of reproduction of the problem.
> 
> C) allocations <= SMALL_MAXCLASS (SMALL_MAXCLASS==14*1024) get
>   the failure (but bigger ones work, both fitting inside
>   __je_tcache_maxclass and not). Again: every example
>   reproduction of the problem has this status.
> 
> D) FreeBSD sometimes gets into a state where /etc/malloc.conf
>   doing tcache:false does not seem to disable tcache. (Rebooting
>   goes back to tcache:false working after such has been
>   observed.) [Related or independent? I've no clue.] Usually
>   tcache:false seems to work and so avoid the failures.
> 
> E) Use of POSIX_MADV_WILLNEED on the problematical allocation(s)
>   in the child process after the fork but before the swap-outs
>   of the child and parent prevents the failures (no read or
>   write access to the memory from the child until after the
>   swap-in). Doing so just in the parent process does not prevent
>   the failures.
> 
> F) Similar to (E) but read-accessing a byte or more of one or
>   more pages from the problematical allocations from the child
>   process after the fork but before the swap-out makes those
>   specific pages not fail. (The others still fail, if any.)
>   Done from the parent process instead does not avoid the
>   failures.
> 
> G) In a sequence like: su creates a sh which then runs one
>   of my test programs that then forks off a child it can be
>   that all of the 4 processes show the zeroed memory area
>   like the child process does. su and sh need to have
>   swapped-out and back in for them to get failures. su and
>   sh die once they hit an assert that fails due to the zeroed
>   memory page(s). The asserts involve addresses also messed
>   up in the test program processes (parent and child).
> 
> In my reading I've not been able to determine what to expect
> for fork-then-swap-out-and-back-in for pages that the child
> process had not accessed yet but which might not be around
> for later activity because of the parent process's own
> swapped-out status at the time.
> 
> Note: While I usually run a non-debug kernel I've tried
> a debug kernel and it provided no notices of problems. I
> got no additional information from the attempt.
> 
> [My usual KERNCONF file includes GENERIC and then disables
> various debug items.]
> 
> The bugzilla reports have example code for showing the
> problems and various behaviors. The two in 217239 are
> probably of more interest than the first one on 217138.
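
As a rough illustration of item (E) in the quoted text, the following is
a minimal sketch (not the code from the Bugzilla reports) of issuing
POSIX_MADV_WILLNEED on an allocation. It would be called in the child
process after the fork and before the swap-out. The helper name
advise_willneed and the rounding to page boundaries are assumptions made
here for illustration; the rounding is only there because some systems
require a page-aligned address.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L /* expose posix_madvise in strict -std= modes */
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (name is illustrative, not from the original tests):
 * advise the kernel that the pages covering [p, p + len) will be needed.
 * Returns 0 on success, -1 for unusable input, otherwise the error number
 * reported by posix_madvise().
 */
int advise_willneed(void *p, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || p == NULL || len == 0)
        return -1;

    /* Widen the range to whole pages: round start down and end up. */
    uintptr_t mask = (uintptr_t)page - 1;
    uintptr_t start = (uintptr_t)p & ~mask;
    uintptr_t end = ((uintptr_t)p + len + mask) & ~mask;

    return posix_madvise((void *)start, (size_t)(end - start),
                         POSIX_MADV_WILLNEED);
}
```

In the scenario described in (E), the child would call something like
advise_willneed(region, region_size) right after fork() returns 0, without
reading or writing the region itself.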

I just updated the pine64+ 2GB to head -r315870 and it
still gets the trashed-with-zeros pages from sequences
such as:

allocation(s) (tcache in use, with sizes fitting <= SMALL_MAXCLASS)
initialize them (to non-zero bytes)
fork
sleep/wait then swap-out [zero RES(ident memory)]
  (both parent and child in my tests)
(Note the lack of access so far on the child process
 side.)
swap-in


After swap-in both the Child and Parent see the indicated
allocation(s) as having only zero bytes instead of the
initialization values.
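
The sequence above can be sketched roughly as follows. This is only a
minimal illustration, not the test code attached to the Bugzilla reports:
the function name fork_then_check, the fill byte 0xA5, and the delay
parameter are all hypothetical. The sketch does not itself force a
swap-out; that has to be arranged externally (for example by memory
pressure during the delay) and confirmed with something like top. It
also assumes the system malloc is jemalloc with tcache enabled, as on
FreeBSD by default.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define FILL_BYTE 0xA5 /* arbitrary non-zero pattern (assumption) */

/* Return 1 if every byte still holds FILL_BYTE, else 0. */
static int region_intact(const unsigned char *p, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (p[i] != FILL_BYTE)
            return 0;
    return 1;
}

/*
 * Allocate and initialize before fork(); neither side touches the region
 * again until after the delay, during which both processes are meant to
 * be swapped out by external means.
 * Returns 0 if both processes still see the pattern, with bit 0 set if
 * the child saw a mismatch and bit 1 set if the parent did; -1 on a
 * setup failure.
 */
int fork_then_check(size_t len, unsigned int delay_seconds)
{
    unsigned char *region = malloc(len);
    if (region == NULL)
        return -1;
    memset(region, FILL_BYTE, len);

    pid_t pid = fork();
    if (pid < 0) {
        free(region);
        return -1;
    }
    if (pid == 0) {
        /* Child: no access to region before the wait. */
        sleep(delay_seconds);
        _exit(region_intact(region, len) ? 0 : 1);
    }

    /* Parent: wait as well, then collect the child's verdict. */
    sleep(delay_seconds);
    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)) {
        free(region);
        return -1;
    }
    int result = 0;
    if (WEXITSTATUS(status) != 0)
        result |= 1;
    if (!region_intact(region, len))
        result |= 2;
    free(region);
    return result;
}
```

A call such as fork_then_check(14 * 1024, 60) would correspond to an
allocation at the SMALL_MAXCLASS limit with a minute for the swap-out to
happen; the failure described above would show up as a result of 3.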

===
Mark Millard
markmi at dsl-only.net



