From owner-freebsd-current@freebsd.org Sun Mar 19 04:10:39 2017 Return-Path: Delivered-To: freebsd-current@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 746BDD134E9 for ; Sun, 19 Mar 2017 04:10:39 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: from asp.reflexion.net (outbound-mail-211-173.reflexion.net [208.70.211.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 244B1B9B for ; Sun, 19 Mar 2017 04:10:38 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: (qmail 3309 invoked from network); 19 Mar 2017 04:11:27 -0000 Received: from unknown (HELO mail-cs-01.app.dca.reflexion.local) (10.81.19.1) by 0 (rfx-qmail) with SMTP; 19 Mar 2017 04:11:27 -0000 Received: by mail-cs-01.app.dca.reflexion.local (Reflexion email security v8.30.2) with SMTP; Sun, 19 Mar 2017 00:10:36 -0400 (EDT) Received: (qmail 2340 invoked from network); 19 Mar 2017 04:10:36 -0000 Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with (AES256-SHA encrypted) SMTP; 19 Mar 2017 04:10:36 -0000 Received: from [192.168.1.111] (c-67-170-167-181.hsd1.or.comcast.net [67.170.167.181]) by iron2.pdx.net (Postfix) with ESMTPSA id 70116EC894B; Sat, 18 Mar 2017 21:10:35 -0700 (PDT) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 10.2 \(3259\)) Subject: Re: arm64 fork/swap data corruptions: A ~110 line C program demonstrating an example (Pine64+ 2GB context) [Corrected subject: arm64!] From: Mark Millard In-Reply-To: <76DD9882-B4BD-4A16-A8E1-5A5FBB5A21F5@dsl-only.net> Date: Sat, 18 Mar 2017 21:10:34 -0700 Cc: FreeBSD Current , FreeBSD-STABLE Mailing List Content-Transfer-Encoding: 7bit Message-Id: References: <01735A68-FED6-4E63-964F-0820FE5C446C@dsl-only.net> <16B3D614-62E1-4E58-B409-8DB9DBB35BCB@dsl-only.net> <5BEAFC6C-DA80-4D7B-AB55-977E585D1ACC@dsl-only.net> <10F50F1C-FD26-4142-9350-966312822438@dsl-only.net> <76DD9882-B4BD-4A16-A8E1-5A5FBB5A21F5@dsl-only.net> To: Andrew Turner , freebsd-arm X-Mailer: Apple Mail (2.3259) X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 19 Mar 2017 04:10:39 -0000 On 2017-Mar-18, at 5:53 PM, Mark Millard wrote: > A new, significant discovery follows. . . > > While checking out use of procstat -v I ran > into the following common property for the 3 > programs that I looked at: > > A) My small test program that fails for > a dynamically allocated space. > > B) sh reporting Failed assertion: "tsd_booted". > > C) su reporting Failed assertion: "tsd_booted". > > Here are example addresses from the area of > incorrectly zeroed memory (A then B then C): > > (lldb) print dyn_region > (region *volatile) $0 = 0x0000000040616000 > > (lldb) print &__je_tsd_booted > (bool *) $0 = 0x0000000040618520 > > (lldb) print &__je_tsd_booted > (bool *) $0 = 0x0000000040618520 That last above was a copy/paste error. Correction: (lldb) print &__je_tsd_booted (bool *) $0 = 0x000000004061d520 > The first is from dynamic allocation ending up > in the area. The other two are from libc.so.7 > globals/statics ending up in the general area. > > It looks like something is trashing a specific > memory area for some reason, rather independently > of what the program specifics are. > > > Other notes: > > At least for my small program showing failure: > > Being explicit about the combined conditions for failure > for my test program. . . > > Both tcache enabled and allocations fitting in SMALL_MAXCLASS > are required in order to make the program fail. > > Note: > > lldb) print __je_tcache_maxclass > (size_t) $0 = 32768 > > which is larger than SMALL_MAXCLASS. I've not observed > failures for sizes above SMALL_MAXCLASS but not exceeding > __je_tcache_maxclass. > > Thus tcache use by itself does not seen sufficient for > my program to get corruption of its dynamically allocated > memory: the small allocation size also matters. > > > Be warned that I can not eliminate the possibility that > the trashing changed what region of memory it trashed > for larger allocations or when tcache is disabled. The pine64+ 2GB eventually got into a state where: /etc/malloc.conf -> tcache:false made no difference and the failure kept occurring with that symbolic link in place. But after a reboot of the pin46+ 2GB /etc/malloc.conf -> tcache:false was again effective for my test program. (It was still present from before the reboot.) I checked the .core files and the allocated address assigned to dyn_region was the same in the tries before and after the reboot. (I had put in an additional raise(SIGABRT) so I'd always have a core file to look at.) Apparently /etc/malloc.conf -> tcache:false was being ignored before the reboot for some reason? === Mark Millard markmi at dsl-only.net