From: Mark Millard
Date: Sun, 10 May 2020 19:53:19 -0700
Subject: Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311
To: "vangyzen@freebsd.org", svn-src-head@freebsd.org, FreeBSD Current, FreeBSD Hackers, FreeBSD PowerPC ML
Cc: Brandon Bergren, Justin Hibbits
List-Id: Porting FreeBSD to the PowerPC
X-List-Received-Date:
Mon, 11 May 2020 02:53:29 -0000

[A new kind of experiment and partial results.]

Some of the example contexts with zeroed memory page(s) include a
page that should not be changing after initialization in my context
(jemalloc global variables). For such examples I have attempted the
following:

A) Run gdb.
B) Attach to one of the live example processes.
C) Check that the page is not zeroed yet.
   (print/x __je_sz_size2index_tab)
D) Protect the page containing the start of __je_sz_size2index_tab,
   using 0x1 as the PROT_READ mask.
   (print (int)mprotect(ADDRESS,1,0x1))
E) Detach.

The hope was to discover which of the following was involved:

A) User-space code trying to write the page should get a SIGSEGV.
   In this case I'd likely be able to see what code was attempting
   the write.

B) Kernel code doing something odd to the content or mapping of
   memory would not (or need not) lead to SIGSEGV. In this case
   I'd be unlikely to see what code led to the zeros on the page.

So far I've gotten only one failure example: nfsd during its handling
of a SIGUSR1. Previous nfs mounts and dismounts worked fine, not
asserting, indicating that at the time the page was not zeroed.

I got no evidence of SIGSEGV from an attempted user-space write to
the page. But the nfsd.core shows the page as zeroed and the assert
having caused abort(). That suggests the kernel side of things for
what leads to the zeros.

It turns out that just before the "unregistration()" activity is
"killchildren()" activity:

(gdb) list
971
972	static void
973	nfsd_exit(int status)
974	{
975		killchildren();
976		unregistration();
977		exit(status);
978	}

(frame #12) used via:

(gdb) list cleanup
954	/*
955	 * Cleanup master after SIGUSR1.
956	 */
957	static void
958	cleanup(__unused int signo)
959	{
960		nfsd_exit(0);
961	}

. . . and (for master):

	(void)signal(SIGUSR1, cleanup);

This suggests the possibility that the zeroed pages could be
associated with killing the child processes.
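
As a side check on the premise behind case A of the mprotect experiment (that a user-space write to a page protected with PROT_READ is reported as a signal instead of silently succeeding), here is a minimal C sketch. It is not nfsd or jemalloc code: the anonymous mmap'ed page is only a stand-in for the page holding __je_sz_size2index_tab, and the names page_start and signal_from_write_to_readonly_page are hypothetical helpers made up for the illustration. The write is done in a forked child so the parent can report which signal, if any, ended it.

```c
#define _DEFAULT_SOURCE /* glibc: keep MAP_ANON visible under strict -std modes */
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: round an address down to the start of its page,
 * since mprotect() is given a page-aligned address. */
void *page_start(const void *addr)
{
    uintptr_t page_size = (uintptr_t)sysconf(_SC_PAGESIZE);
    return (void *)((uintptr_t)addr & ~(page_size - 1));
}

/*
 * Hypothetical helper: returns the signal that terminated a child which
 * wrote to a PROT_READ page, 0 if the child exited normally, or -1 if
 * the setup failed.
 */
int signal_from_write_to_readonly_page(void)
{
    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *page = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANON, -1, 0);
    if (page == MAP_FAILED)
        return -1;

    page[16] = 0x2a; /* stand-in for data initialized once and then left alone */

    /* mmap() returns page-aligned memory, so rounding down must give it back. */
    assert(page_start(&page[16]) == (void *)page);

    /* Same shape of request as the gdb step: length 1, PROT_READ (0x1). */
    if (mprotect(page_start(&page[16]), 1, PROT_READ) != 0) {
        munmap(page, page_size);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(page, page_size);
        return -1;
    }
    if (pid == 0) {
        /* Child: suppress core files, then attempt the forbidden write. */
        struct rlimit no_core = { 0, 0 };
        setrlimit(RLIMIT_CORE, &no_core);
        volatile unsigned char *p = page;
        p[16] = 0;
        _exit(0); /* reached only if the write was not blocked */
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    munmap(page, page_size);
    if (waited < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Calling signal_from_write_to_readonly_page() from a small main() and printing the result is enough to try it. SIGSEGV is the expected result on FreeBSD and Linux; some other systems report SIGBUS for this kind of fault. Either way the sketch only covers user-space writes, so it says nothing about case B.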

(I've had a past aarch64 context where forking had problems with
pages that were initially common to parent and child processes. In
that context having the processes swap out [not just mostly paged
out] and then swap back in was involved in showing the problem. The
issue was fixed and was aarch64 specific. But it leaves me willing
to consider fork-related memory management as possibly odd in some
way for 32-bit powerpc.)

Notes . . .

Another possible kind of evidence: I've gone far longer with the
machine doing just normal background processing with nothing failing
on its own. This suggests that the (int)mprotect(ADDRESS,1,0x1) might
be changing the context -- or just doing the attach and detach in gdb
does. I've nothing solid in this area so I'll ignore it, other than
this note.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)