From: Mark Millard
Date: Fri, 24 Mar 2017 02:16:11 -0700
To: freebsd-hackers@freebsd.org
Cc: freebsd-arm
Subject: head -r315870 (e.g.): fork-then-swap-out [zero RES(ident memory)] questions tied to arm64 failures (zeroed memory pages)
Message-Id: <4D7B13F4-142E-4294-A6E7-11CCD4C92AC5@dsl-only.net>
In-Reply-To: <59C6BC12-1BC2-41D5-8B47-D0AD44D2CDF0@dsl-only.net>
References: <59C6BC12-1BC2-41D5-8B47-D0AD44D2CDF0@dsl-only.net>

On 2017-Mar-22, at 3:09 PM, Mark Millard wrote:

> The later questions are associated with:
>
> Bugzilla 217239 and 217138 (which I now expect have a common cause)
> https://lists.freebsd.org/pipermail/freebsd-arm/2017-March/015867.html
> (and its thread)
>
> These are tied to some process memory pages being trashed (to
> be zero) in particular types of arm64 contexts. This is
> reproducible in multiple arm64 contexts. The context is head
> but I believe there are reports in the lists tied to 11 as
> well.
>
> [Unfortunately the above all very much shows a learn-as-I-go
> property. Also the list has a sub-exchange on my testing other
> devices to check for device failures that is not directly
> relevant here.]
>
> These are tied to problems with fork-then-swap-out-then-swap-in
> contexts on arm64. (Even though I've occasionally typed amd64
> accidentally in places in those materials.) Memory allocations
> from before the fork are involved, ones not yet accessed by
> the child side of the fork at the time of the fork.
>
> fork sets up copy-on-write so that the child process temporarily
> shares pages (those it does not write), or should.
>
> But what if the parent process or both parent and child are
> swapped-out just shortly after the fork (so, say, top -PCwaopid
> shows zero for RES(ident memory))? What is the handling of, say,
> the child swapping back in while the parent still is swapped
> out?
>
> I notice that the child can have a much smaller SWAP figure
> than the parent so it would appear that the parent swap-out
> has pages that the child does not.
>
> So what if the child needs some of those pages? What should
> happen? (Vs. what does happen on arm64 in specific types
> of contexts? More below.)
>
> I ask mostly to see if I can do more to give evidence of
> what is going on and what to test for any proposed fix.
> I'm not likely to find the underlying problem(s) for arm64
> directly, unlike my investigation that led to
> fork-trampoline being fixed in head's -r313772
> (2017-Feb-15).
>
> [ https://lists.freebsd.org/pipermail/freebsd-arm/2017-February/015656.html
> and its thread, including when its title changed in:
> https://lists.freebsd.org/pipermail/freebsd-arm/2017-February/015678.html
> .]
>
> Part of that unlikely-to-solve status is because the
> context seems to be bound to a lot of special conditions
> and interesting behaviors simultaneously:
>
> A) Both my original reproductions of problem reports on the
> lists and the only (simple) programs for reproducing the
> problems involve fork-then-swap-out [zero RES(ident
> memory)]. Neither fork by itself nor swap-out/in by
> itself has been sufficient.
>
> B) jemalloc's tcache being in use (__je_tcache_maxclass == 32*1024)
> is part of every example of reproduction of the problem.
>
> C) allocations <= SMALL_MAXCLASS (SMALL_MAXCLASS==14*1024) get
> the failure (but bigger ones work, both fitting inside
> __je_tcache_maxclass and not). Again: every example
> reproduction of the problem has this status.
>
> D) FreeBSD sometimes gets into a state where /etc/malloc.conf
> doing tcache:false does not seem to disable tcache. (Rebooting
> goes back to tcache:false working after such has been
> observed.) [Related or independent? I've no clue.] Usually
> tcache:false seems to work and so avoids the failures.
>
> E) Use of POSIX_MADV_WILLNEED on the problematical allocation(s)
> in the child process after the fork but before the swap-outs
> of the child and parent prevents the failures (no read or
> write access to the memory from the child until after the
> swap-in). Doing so just in the parent process does not prevent
> the failures.
>
> F) Similar to (E) but read-accessing a byte or more of one or
> more pages from the problematical allocations from the child
> process after the fork but before the swap-out makes those
> specific pages not fail. (The others still fail, if any.)
> Done from the parent process instead does not avoid the
> failures.
>
> G) In a sequence like: su creates a sh which then runs one
> of my test programs that then forks off a child, it can be
> that all of the 4 processes show the zeroed memory area
> like the child process does. su and sh need to have
> swapped-out and back in for them to get failures. su and
> sh die once they hit an assert that fails due to the zeroed
> memory page(s). The asserts involve addresses also messed
> up in the test program processes (parent and child).
>
> In my reading I've not been able to determine what to expect
> for fork-then-swap-out-and-back-in for pages that the child
> process had not accessed yet but which might not be around
> for later activity because of the parent process's own
> swapped-out status at the time.
>
> Note: While I usually run a non-debug kernel I've tried
> a debug kernel and it provided no notices of problems. I
> got no additional information from the attempt.
>
> [My usual KERNCONF file includes GENERIC and then disables
> various debug items.]
>
> The bugzilla reports have example code for showing the
> problems and various behaviors. The two in 217239 are
> probably of more interest than the first one in 217138.

I just updated the pine64+ 2GB to head -r315870 and it still
gets the trashed-with-zeros pages from sequences such as:

allocation(s) (tcache in use with fitting <= SMALL_MAXCLASS)
initialize them (to non-zero bytes)
fork
sleep/wait then swap-out [zero RES(ident memory)]
   (both parent and child in my tests)
   (Note the lack of access so far on the child process side.)
swap-in

After swap-in both the Child and Parent see the indicated
allocation(s) as having only zero bytes instead of the
initialization values.

===
Mark Millard
markmi at dsl-only.net
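
For illustration only, the sequence above (allocate, initialize, fork,
idle, check after swap-in) can be sketched in C roughly as follows.
This is not the code attached to the bugzilla reports. The names
count_mismatches and run_fork_check are hypothetical, and the sketch
only idles: it assumes the swap-out and swap-in of both processes are
brought about externally during the idle period, since how to force
them is not described here.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: count bytes in buf[0..len) that no longer
 * hold the fill value written before the fork. */
size_t count_mismatches(const unsigned char *buf, size_t len,
                        unsigned char fill)
{
    size_t bad = 0;
    for (size_t i = 0; i < len; i++)
        if (buf[i] != fill)
            bad++;
    return bad;
}

/* Hypothetical driver: allocate and initialize before the fork, then
 * leave the memory untouched in both processes while they idle, and
 * only afterwards read it back.
 * Returns 0 if parent and child both see intact memory, 1 if either
 * sees changed bytes, -1 on a setup error. */
int run_fork_check(size_t len, unsigned char fill, unsigned idle_seconds)
{
    unsigned char *buf = malloc(len);
    if (buf == NULL)
        return -1;
    memset(buf, fill, len);          /* initialize to non-zero bytes */

    fflush(NULL);                    /* do not duplicate stdio buffers */
    pid_t pid = fork();
    if (pid < 0) {
        free(buf);
        return -1;
    }

    /* Both sides idle here without touching buf. The swap-out and
     * swap-in are assumed to happen during this window. */
    sleep(idle_seconds);

    size_t bad = count_mismatches(buf, len, fill);
    if (pid == 0)
        _exit(bad != 0);             /* child reports via exit status */

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        free(buf);
        return -1;
    }
    int child_bad = !WIFEXITED(status) || WEXITSTATUS(status) != 0;
    free(buf);
    return (bad != 0 || child_bad) ? 1 : 0;
}
```

A call such as run_fork_check(14 * 1024, 0xA5, 60) from main() uses an
allocation size at SMALL_MAXCLASS; the fill byte and the 60 second idle
time are arbitrary choices. A return value of 1 would correspond to the
zeroed-pages symptom, provided both processes really did reach zero
RES(ident memory) during the idle period.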