From: Mark Millard
To: freebsd-arm, freebsd-hackers@freebsd.org
Subject: The arm64 fork-then-swap-out-then-swap-in failures: a program source for exploring them
Date: Tue, 4 Apr 2017 20:00:45 -0700
Message-Id: <4DEA2D76-9F27-426D-A8D2-F07B16575FB9@dsl-only.net>
List-Id: "Porting FreeBSD to ARM processors."

Uncommenting/commenting parts of the program below allows exploring
the problems with fork-then-swap-out-then-swap-in on arm64.

Note: By swap-out I mean that zero RES(ident memory) results
      for the process(es) of interest, as shown by
      "top -PCwaopid".

I discovered recently that swapping-out just before the
fork() prevents the failure from the swapping after the
fork().

Note:

Without the fork() no problem happens. Without the later
swap-out no problem happens. Both are required. But some
activities before the fork() or between fork() and the
swap-out prevent the failures.

Some of the comments are based on a pine64+ 2GB context.
I use stress to force swap-outs during some sleeps in
the program. See also Bugzilla 217239 and 217138. (I now
expect that they have the same cause.)

In my environment I've seen the fork-then-swap-out/swap-in
failures on a pine64+ 2GB and a rpi3. They are repeatable
on both.

I do not have access to server-class machines, or any
other arm64 machines.

// swap_testing5.c

// Built via (cc was clang 4.0 in my case):
//
// cc -g -std=c11 -Wpedantic -o swaptesting5 swap_testing5.c
// -O0 and -O2 also get the problem.

// Note: jemalloc's tcache needs to be enabled to get the failure.
//       But FreeBSD can get into a state where /etc/malloc.conf
//       -> 'tcache:false' is ineffective. Also: the allocation
//       size needs to be sufficiently small (<= SMALL_MAXCLASS)
//       to see the problem. Other comments are based on a specific
//       context (pine64+ 2GB).

#include <signal.h>     // for raise(.), SIGABRT (induce core dump)
#include <unistd.h>     // for fork(), sleep(.)
#include <sys/types.h>  // for pid_t
#include <sys/wait.h>   // for wait(.)

extern void test_setup(void);       // Sets up the memory byte patterns.
extern void test_check(void);       // Tests the memory byte patterns.
extern void memory_willneed(void);  // For seeing if
                                    // posix_madvise(.,.,POSIX_MADV_WILLNEED)
                                    // makes a difference.

int main(void)
{
    sleep(30); // Potentially force swap-out here.
               // [Swap-out here does not avoid later failures.]

    test_setup();
    test_check(); // Before potential sleep(.)/swap-out or fork(.) [passes]

    sleep(30); // Potentially force swap-out here.
               // [Everything below passes if swapped-out here,
               //  no matter if there are later swap-outs
               //  or not.]

    pid_t pid = fork(); // To test no-fork use: = 0; no-fork does not fail.
    int wait_status = 0;

    // HERE: After fork; before sleep/swap-out/wait.

    // if (0 < pid) memory_willneed(); // Does not prevent either parent or
                                       // child failure if enabled.

    // if (0 == pid) memory_willneed(); // Prevents both the parent and the
                                        // child failure. Disable to see
                                        // failure of both parent and child.
                                        // [Presuming no prior swap-out: that
                                        //  would make everything pass.]

    // During sleep/wait: manually force this process to
    // swap out. I use something like:

    //     stress -m 1 --vm-bytes 1800M

    // in another shell and ^C'ing it after top shows the
    // swapped status desired. 1800M just happened to work
    // on the Pine64+ 2GB that I was using. I watch with
    // top -PCwaopid [checking for zero RES(ident memory)].

    if (0 < pid)
    {
        sleep(30); // Intend to swap-out during sleep.

        // test_check(); // Test in parent before child runs (longer sleep).
                         // This test fails if run for a failing region_size
                         // unless earlier preventing-activity happened.

        wait(&wait_status); // Only if test_check above passes or is
                            // disabled above.
    }
    if (-1 != wait_status && 0 <= pid)
    {
        if (0 == pid) { sleep(90); } // Intend to swap-out during sleep.

        test_check(); // Fails for small-enough region_size, both
                      // parent and child processes, unless earlier
                      // preventing-activity happened.
    }
}

// The memory and test code follows.

#include <stddef.h>    // for size_t, NULL
#include <stdlib.h>    // for malloc(.), free(.)
#include <sys/mman.h>  // for POSIX_MADV_WILLNEED, posix_madvise(.,.,.)

#define region_size (14u*1024u)
        // Bad dyn_region pattern, parent and child processes examples:
        //     256u, 2u*1024u, 4u*1024u, 8u*1024u, 9u*1024u, 12u*1024u, 14u*1024u
        // No failure examples:
        //     14u*1024u+1u, 15u*1024u, 16u*1024u, 32u*1024u, 256u*1024u*1024u

#define num_regions (256u*1024u*1024u/region_size)

typedef volatile unsigned char value_type;

struct region_struct { value_type array[region_size]; };
typedef struct region_struct region;

static region * volatile dyn_regions[num_regions] = {NULL,};

static value_type value(size_t v) { return (value_type)((v&0xFEu)|0x1u); }
        // value avoids zero values: the bad values are zeros.

void test_setup(void)
{
    for(size_t i=0u; i