From: Mark Millard
Date: Thu, 30 Mar 2017 20:54:45 -0700
To: freebsd-hackers@freebsd.org
Cc: freebsd-arm
Subject: Is this procstat -v output valid/expected? Explanation?

The following is based on a test-case program that:

A) Allocates lots of 14 KiByte "regions" with malloc, initializing
   each byte of each region to a never-zero pattern of bytes.
   "Lots" uses up most of 256 MiBytes across the regions. Once
   initialized, none of these bytes are written again (not even
   by the later child process).

B) Tests the byte patterns (SIGABRT if the pattern test fails).

C) Forks.

D) The parent waits for the child; the child sleeps 60 sec.
   Note: The full test also forces both processes to swap out
   during the sleep, but that is not being done here.

E) The child checks the byte patterns and exits (SIGABRT if the
   pattern test fails). The child does not write to any of the
   allocated regions.

F) The (former) parent checks the byte patterns and exits
   (SIGABRT if the pattern test fails). It does not write to any
   of the allocated regions during this activity.

[The context happens to be arm64.]
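
For readers who want to try the A) through F) flow without the full 256 MiByte program, here is a scaled-down sketch. It is not the posted test program: the names pattern, setup_regions, regions_ok and run_fork_check are made up for illustration, the region count and sleep time are parameters, and no swapping is forced. Only the byte pattern has the same shape as value() in the posted source.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Never-zero byte pattern (same shape as value() in the posted source). */
static unsigned char pattern(size_t v)
{
    return (unsigned char)((v & 0xFEu) | 0x1u);
}

/* Step A: allocate `count` regions of `size` bytes and fill each byte. */
static unsigned char **setup_regions(size_t count, size_t size)
{
    unsigned char **regions = malloc(count * sizeof *regions);
    if (regions == NULL) return NULL;
    for (size_t r = 0; r < count; r++) {
        regions[r] = malloc(size);
        if (regions[r] == NULL) {
            while (r > 0) free(regions[--r]);
            free(regions);
            return NULL;
        }
        for (size_t i = 0; i < size; i++) regions[r][i] = pattern(r + i);
    }
    return regions;
}

/* Steps B, E, F: read-only check of every byte; 1 if all match. */
static int regions_ok(unsigned char **regions, size_t count, size_t size)
{
    for (size_t r = 0; r < count; r++)
        for (size_t i = 0; i < size; i++)
            if (regions[r][i] != pattern(r + i)) return 0;
    return 1;
}

/* Runs steps A through F; returns 0 if every check passed, -1 otherwise. */
static int run_fork_check(size_t count, size_t size, unsigned sleep_secs)
{
    assert(count > 0 && size > 0);
    unsigned char **regions = setup_regions(count, size);
    if (regions == NULL) return -1;

    int result = regions_ok(regions, count, size) ? 0 : -1;   /* step B */
    if (result == 0) {
        pid_t pid = fork();                                   /* step C */
        if (pid < 0) {
            result = -1;
        } else if (pid == 0) {
            sleep(sleep_secs);                                /* step D, child */
            if (!regions_ok(regions, count, size)) raise(SIGABRT);
            _exit(0);                                         /* step E */
        } else {
            int status = 0;                                   /* step D, parent */
            if (waitpid(pid, &status, 0) != pid ||
                !WIFEXITED(status) || WEXITSTATUS(status) != 0)
                result = -1;
            else if (!regions_ok(regions, count, size))       /* step F */
                result = -1;
        }
    }
    for (size_t r = 0; r < count; r++) free(regions[r]);
    free(regions);
    return result;
}
```

Calling run_fork_check(256u*1024u*1024u/(14u*1024u), 14u*1024u, 60) from a small main() would approximate the sizes and sleep time described above; smaller arguments are enough to watch the mappings with procstat -v.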

Note the two instances of "67306" in the output below, taken from what
will become the parent process:

# procstat -v 6310
 PID          START             END PRT   RES  PRES REF SHD FLAG TP PATH
6310        0x10000         0x11000 r--     1    51   3   1 CN-- vn /root/c_tests/swaptesting2
6310        0x20000         0x21000 r-x     1    51   3   1 CN-- vn /root/c_tests/swaptesting2
6310        0x30000         0x40000 rw-    16     0   1   0 C--- vn /root/c_tests/swaptesting2
6310        0x40000         0x41000 r--     1    38   2   0 ---- df
6310        0x41000         0x75000 rw-    37    38   2   0 ---- df
6310     0x40030000      0x4004b000 r-x    27    29  59  27 CN-- vn /libexec/ld-elf.so.1
6310     0x4004b000      0x40052000 rw-     7     7   1   0 ---- df
6310     0x4005a000      0x4005b000 rw-     1     0   1   0 C--- vn /libexec/ld-elf.so.1
6310     0x4005b000      0x4005c000 rw-     1     1   1   0 ---- df
6310     0x4005c000      0x401b4000 r-x   344   376  59  27 CN-- vn /lib/libc.so.7
6310     0x401b4000      0x401c3000 ---     0     0   1   0 ---- df
6310     0x401c3000      0x401cf000 rw-    12     0   1   0 C--- vn /lib/libc.so.7
6310     0x401cf000      0x40202000 rw-    22 67306   2   0 ---- df
6310     0x40400000      0x50e00000 rw- 67284 67306   2   0 ---- df
6310 0xfffffffdf000  0xfffffffff000 rw-     3     3   1   0 ---D df
6310 0xfffffffff000 0x1000000000000 r-x     1     1  35   0 ---- ph

Later, after the fork (so child sleeping and parent waiting), it has
turned into:

# procstat -v 6310
 PID          START             END PRT   RES  PRES REF SHD FLAG TP PATH
6310        0x10000         0x11000 r--     1    51   5   1 CN-- vn /root/c_tests/swaptesting2
6310        0x20000         0x21000 r-x     1    51   5   1 CN-- vn /root/c_tests/swaptesting2
6310        0x30000         0x40000 rw-    16     0   1   0 C--- vn /root/c_tests/swaptesting2
6310        0x40000         0x41000 r--     1     1   2   0 CN-- df
6310        0x41000         0x75000 rw-    37    37   2   0 CN-- df
6310     0x40030000      0x4004b000 r-x    27    29  60  27 CN-- vn /libexec/ld-elf.so.1
6310     0x4004b000      0x40052000 rw-     7     0   1   0 C--- df
6310     0x4005a000      0x4005b000 rw-     1     0   2   0 CN-- vn /libexec/ld-elf.so.1
6310     0x4005b000      0x4005c000 rw-     1     0   1   0 C--- df
6310     0x4005c000      0x401b4000 r-x   344   376  60  27 CN-- vn /lib/libc.so.7
6310     0x401b4000      0x401c3000 ---     0     0   2   0 CN-- df
6310     0x401c3000      0x401cf000 rw-    12     0   2   0 CN-- vn /lib/libc.so.7
6310     0x401cf000      0x40202000 rw-    22    22   2   0 CN-- df
6310     0x40400000      0x50e00000 rw- 67284 67284   2   0 CN-- df
6310 0xfffffffdf000  0xfffffffff000 rw-     3     0   1   0 C--D df
6310 0xfffffffff000 0x1000000000000 r-x     1     1  36   0 ---- ph

The child never shows the large PRES figure for the range:

0x401cf000 0x40202000

But for the size of that range the earlier PRES==67306 seems odd,
as if it also spans the following range:

0x40400000 0x50e00000

In fact 22+67284==67306.

Another point I noticed is that SHD stays zero on the memory area
spanning the allocations (0x40400000 0x50e00000) and more.
(This was during the child's sleep.)

# procstat -v 6313
 PID          START             END PRT   RES  PRES REF SHD FLAG TP PATH
6313        0x10000         0x11000 r--     1    51   5   1 CN-- vn /root/c_tests/swaptesting2
6313        0x20000         0x21000 r-x     1    51   5   1 CN-- vn /root/c_tests/swaptesting2
6313        0x30000         0x40000 rw-    16     0   1   0 C--- vn /root/c_tests/swaptesting2
6313        0x40000         0x41000 r--     1     1   2   0 CN-- df
6313        0x41000         0x75000 rw-    37    37   2   0 CN-- df
6313     0x40030000      0x4004b000 r-x    27    29  60  27 CN-- vn /libexec/ld-elf.so.1
6313     0x4004b000      0x40052000 rw-     7     0   1   0 C--- df
6313     0x4005a000      0x4005b000 rw-     1     0   2   0 CN-- vn /libexec/ld-elf.so.1
6313     0x4005b000      0x4005c000 rw-     1     0   1   0 C--- df
6313     0x4005c000      0x401b4000 r-x   344   376  60  27 CN-- vn /lib/libc.so.7
6313     0x401b4000      0x401c3000 ---     0     0   2   0 CN-- df
6313     0x401c3000      0x401cf000 rw-    12     0   2   0 CN-- vn /lib/libc.so.7
6313     0x401cf000      0x40202000 rw-    22    22   2   0 CN-- df
6313     0x40400000      0x50e00000 rw- 67284 67284   2   0 CN-- df
6313 0xfffffffdf000  0xfffffffff000 rw-     3     0   1   0 C--D df
6313 0xfffffffff000 0x1000000000000 r-x     1     1  36   0 ---- ph

For:

0x40400000 0x50e00000

(and more) my first thought was that forking would create shadow
objects for copy-on-write, and so the shadow page count would be
non-zero in one or both of the parent and the child.
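
The page counts behind the PRES observation can be checked directly from the reported address ranges. The following is a minimal sketch, assuming 4 KiB pages (an assumption, though it is consistent with the 0x30000 to 0x40000 mapping showing RES 16). The names ASSUMED_PAGE_SIZE, pages_in_range and count_fits_range are illustrative only; they come from neither procstat nor the test program.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages (not stated in the post itself). */
#define ASSUMED_PAGE_SIZE UINT64_C(4096)

/* Number of pages in the half-open address range [start, end). */
static uint64_t pages_in_range(uint64_t start, uint64_t end)
{
    assert(end >= start);
    assert((end - start) % ASSUMED_PAGE_SIZE == 0);
    return (end - start) / ASSUMED_PAGE_SIZE;
}

/* True if a reported page count could fit inside the range on its own. */
static bool count_fits_range(uint64_t pages, uint64_t start, uint64_t end)
{
    return pages <= pages_in_range(start, end);
}
```

With the pre-fork figures, pages_in_range(0x401cf000, 0x40202000) is 51 and pages_in_range(0x40400000, 0x50e00000) is 68096. So PRES 67306 cannot describe the 51-page range by itself, while 22 + 67284 = 67306 matches the combined RES of the two ranges.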
However, I have never seen procstat -v report a non-zero SHD figure
for the range holding the allocations.

The REF==2 also seems odd: it holds from before the fork through
after it, while both the parent and the child process still exist.
It would seem that the REF counts are not per-process.

Context details:

# uname -paKU
FreeBSD pine64 12.0-CURRENT FreeBSD 12.0-CURRENT r315914M arm64 aarch64 1200027 1200027

FYI: the source code is. . . (Ignore comments tied to swapping
and its/the "problem" for this question.)

# more swap_testing2.c
// swap_testing2.c

// Built via (c++ was clang++ 4.0 in my case):
//
// cc -g -std=c11 -Wpedantic -o swaptesting2 swap_testing2.c
// -O0 and -O2 also get the problem.

#include <unistd.h>     // for fork(), sleep(.)
#include <sys/types.h>  // for pid_t
#include <sys/wait.h>   // for wait(.)
#include <signal.h>     // for raise(.), SIGABRT

extern void test_setup(void);         // Sets up the memory byte patterns.
extern void test_check(void);         // Tests the memory byte patterns.
extern void partial_test_check(void); // Tests just [0] of dyn_regions[0]

int main(void)
{
    test_setup();
    test_check(); // Before fork() [passes]

    pid_t pid = fork();
    int wait_status = 0;

    // After fork; before waitsleep/swap-out.

    //if (0==pid) partial_test_check();
                       // Even the above is sufficient by
                       // itself to prevent failure for
                       // region_size 1u through
                       // 4u*1024u!
                       // But 4u*1024u+1u and above fail
                       // with this access to memory.
                       // The failing test is of
                       // (*dyn_regions[0]).array[4096u].
                       // This test never fails here.

    if (0 [...]

#include <stddef.h>     // for size_t, NULL
#include <stdlib.h>     // for malloc(.), free(.)

#define region_size (14u*1024u)
                       // Bad dyn_regions patterns, parent and child
                       // processes:
                       //    256u, 2u*1024u, 4u*1024u, 8u*1024u,
                       //    9u*1024u, 12u*1024u, 14u*1024u
                       //    (but see the partial_test_check() call
                       //     notes above).
                       // Works:
                       //    14u*1024u+1u, 15u*1024u, 16u*1024u,
                       //    32u*1024u, 256u*1024u*1024u

#define num_regions (256u*1024u*1024u/region_size)

typedef volatile unsigned char value_type;

struct region_struct { value_type array[region_size]; };
typedef struct region_struct region;

static region * volatile dyn_regions[num_regions] = {NULL,};

static value_type value(size_t v) { return (value_type)((v&0xFEu)|0x1u); }
                       // value now avoids the zero value since the failures
                       // are zeros.

void test_setup(void)
{
    for(size_t i=0u; i