From: Mark Millard <markmi@dsl-only.net>
Subject: Re: arm64 fork/swap data corruptions: A ~110 line C program demonstrating an example (Pine64+ 2GB context) [Corrected subject: arm64!]
Date: Mon, 27 Mar 2017 20:59:01 -0700
To: Andrew Turner, freebsd-arm
Cc: FreeBSD Current, FreeBSD-STABLE Mailing List
In-Reply-To: <1C595969-C6E0-44A2-9086-E48C237263BC@dsl-only.net>

On 2017-Mar-21, at 7:21 PM, Mark Millard wrote:

> On 2017-Mar-18, at 9:10 PM, Mark Millard wrote:
>
>> On 2017-Mar-18, at 5:53 PM, Mark Millard wrote:
>>
>>> A new, significant discovery follows. . .
>>>
>>> While checking out use of procstat -v I ran into the following common property for the 3 programs that I looked at:
>>>
>>> A) My small test program that fails for a dynamically allocated space.
>>>
>>> B) sh reporting Failed assertion: "tsd_booted".
>>>
>>> C) su reporting Failed assertion: "tsd_booted".
>>>
>>> Here are example addresses from the area of incorrectly zeroed memory (A then B then C):
>>>
>>> (lldb) print dyn_region
>>> (region *volatile) $0 = 0x0000000040616000
>>>
>>> (lldb) print &__je_tsd_booted
>>> (bool *) $0 = 0x0000000040618520
>>>
>>> (lldb) print &__je_tsd_booted
>>> (bool *) $0 = 0x0000000040618520
>>
>> That last above was a copy/paste error. Correction:
>>
>> (lldb) print &__je_tsd_booted
>> (bool *) $0 = 0x000000004061d520
>>
>>> The first is from dynamic allocation ending up in the area. The other two are from libc.so.7 globals/statics ending up in the general area.
>>>
>>> It looks like something is trashing a specific memory area for some reason, rather independently of what the program specifics are.
>
> I probably should have noted that the processes involved were: child/parent, then grandparent, and then great grandparent. The grandparent was sh and the great grandparent was su.
>
> The ancestors in the process tree are being damaged, not just the instances of the program that demonstrates the problem.
>
>>> Other notes:
>>>
>>> At least for my small program showing failure:
>>>
>>> Being explicit about the combined conditions for failure for my test program. . .
>>>
>>> Both tcache enabled and allocations fitting in SMALL_MAXCLASS are required in order to make the program fail.
>>>
>>> Note:
>>>
>>> (lldb) print __je_tcache_maxclass
>>> (size_t) $0 = 32768
>>>
>>> which is larger than SMALL_MAXCLASS. I've not observed failures for sizes above SMALL_MAXCLASS but not exceeding __je_tcache_maxclass.
>>>
>>> Thus tcache use by itself does not seem sufficient for my program to get corruption of its dynamically allocated memory: the small allocation size also matters.
>>>
>>> Be warned that I cannot eliminate the possibility that the trashing changed what region of memory it trashed for larger allocations or when tcache is disabled.
>>
>> The pine64+ 2GB eventually got into a state where:
>>
>> /etc/malloc.conf -> tcache:false
>>
>> made no difference and the failure kept occurring with that symbolic link in place.
>>
>> But after a reboot of the pine64+ 2GB, /etc/malloc.conf -> tcache:false was again effective for my test program. (It was still present from before the reboot.)
>>
>> I checked the .core files and the allocated address assigned to dyn_region was the same in the tries before and after the reboot. (I had put in an additional raise(SIGABRT) so I'd always have a core file to look at.)
>>
>> Apparently /etc/malloc.conf -> tcache:false was being ignored before the reboot for some reason?
>
> I have also discovered that if the child process in an example like my program does a:
>
> (void) posix_madvise(dyn_region, region_size, POSIX_MADV_WILLNEED);
>
> after the fork but before the sleep/swap-out/wait then the problem does not happen. This is without any read or write access to the memory between the fork and sleep/swap-out/wait.
>
> By contrast such POSIX_MADV_WILLNEED use in the parent process does not change the failure behavior.
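For concreteness, here is a rough sketch of the shape of that failing sequence. It is not the actual ~110 line program attached to the bugzilla reports: dyn_region and region_size match the names quoted above, but the fill pattern, sleep length, and reporting are only placeholders, and the child-side posix_madvise workaround is shown commented out.

/* Rough sketch only: not the actual ~110 line bugzilla test program.
 * dyn_region and region_size match the names above; the fill pattern,
 * sleep length, and reporting are placeholders.  The processes need to
 * end up swapped out during the sleep for the failure to show. */
#include <signal.h>     /* raise, SIGABRT */
#include <stdio.h>      /* fprintf */
#include <stdlib.h>     /* malloc, exit */
#include <string.h>     /* memset */
#include <unistd.h>     /* fork, sleep, getpid */
#include <sys/mman.h>   /* posix_madvise, POSIX_MADV_WILLNEED */
#include <sys/types.h>
#include <sys/wait.h>   /* wait */

#define region_size (14u * 1024u)   /* small: fits in SMALL_MAXCLASS */

typedef struct region { unsigned char array[region_size]; } region;

static region * volatile dyn_region;   /* name as in the lldb output above */

int main(void)
{
    dyn_region = malloc(sizeof(region));   /* tcache-eligible small allocation */
    if (dyn_region == NULL)
        exit(1);
    memset(dyn_region->array, 0xbe, region_size);   /* arbitrary non-zero fill */

    pid_t child = fork();
    if (child == -1)
        exit(1);

    if (child == 0) {
        /* The child-side workaround described above; leaving it commented
         * out reproduces the zeroed-memory failure: */
        /* (void) posix_madvise(dyn_region, region_size, POSIX_MADV_WILLNEED); */
        sleep(120);                     /* placeholder: get swapped out here */
    } else {
        sleep(120);
        (void) wait(NULL);
    }

    size_t zeros = 0u;                  /* both processes check the region */
    for (size_t i = 0u; i < region_size; i++)
        if (dyn_region->array[i] == 0u)
            zeros++;
    if (zeros != 0u) {
        fprintf(stderr, "pid %ld: %zu of %u bytes unexpectedly zero\n",
                (long) getpid(), zeros, region_size);
        raise(SIGABRT);                 /* leave a .core file to inspect */
    }
    return 0;
}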
I've added another test program to bugzilla 217239 and 217138, one with thousands of 14 KiByte allocations. The test program usually ends up with them all being zeroed in the parent and child of the fork. But I've had a couple of runs where only a much smaller prefix was messed up, followed by the normal, expected values.

#define region_size (14u*1024u)
. . .
#define num_regions (256u*1024u*1024u/region_size)

So num_regions==18724, using up most of 256 MiBytes. Note: each region has its own 14 KiByte allocation.

But dyn_regions[1296].array[0] in one example was the first normal value. In another example dyn_regions[2180].array[4096] was the first normal value. The last is interesting for being part way through an allocation's space. That the boundary falls in the middle of an allocation but lines up with a 4 KiByte page size would seem odd for a pure-jemalloc issue.
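For reference, the rough shape of that second test is sketched below. This is a guess at one layout consistent with the #defines and the dyn_regions[i].array[j] accesses above, not the actual program attached to the bugzilla reports; the struct layout, fill pattern, and checking loop are placeholders.

/* Rough sketch of the shape only; the actual program is attached to
 * bugzilla 217239 and 217138.  The #defines and the dyn_regions[i].array[j]
 * accesses match the text above; the struct layout, fill pattern, and
 * checking loop are guesses/placeholders. */
#include <stdio.h>      /* printf */
#include <stdlib.h>     /* malloc, exit */
#include <string.h>     /* memset */
#include <unistd.h>     /* fork, sleep */
#include <sys/types.h>
#include <sys/wait.h>   /* wait */

#define region_size (14u*1024u)
#define num_regions (256u*1024u*1024u/region_size)   /* == 18724 */

static struct { unsigned char *array; } dyn_regions[num_regions];

int main(void)
{
    /* Thousands of separate 14 KiByte allocations, each filled non-zero. */
    for (size_t r = 0u; r < num_regions; r++) {
        dyn_regions[r].array = malloc(region_size);
        if (dyn_regions[r].array == NULL)
            exit(1);
        memset(dyn_regions[r].array, 0xbe, region_size);
    }

    pid_t child = fork();
    if (child == -1)
        exit(1);
    sleep(120);                 /* placeholder: get the processes swapped out */
    if (child != 0)
        (void) wait(NULL);

    /* Report the first byte that still holds the fill pattern, mirroring
     * the dyn_regions[...].array[...] observations above. */
    for (size_t r = 0u; r < num_regions; r++)
        for (size_t i = 0u; i < region_size; i++)
            if (dyn_regions[r].array[i] == 0xbeu) {
                printf("first normal value: dyn_regions[%zu].array[%zu]\n", r, i);
                return 0;
            }
    printf("all %u regions fully zeroed\n", num_regions);
    return 0;
}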
===
Mark Millard
markmi at dsl-only.net