From: Mark Millard
Date: Fri, 8 May 2020 20:58:02 -0700
Subject: Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311
To: "vangyzen@freebsd.org", svn-src-head@freebsd.org, FreeBSD Current, FreeBSD Hackers, FreeBSD PowerPC ML
Cc: Brandon Bergren, Justin Hibbits

[I caused nfsd to have things shifted in memory some, to see
if it tracked content vs. page boundary for where the zeros
stop. Non-nfsd examples omitted.]

> . . .
>> nfsd hit an assert, failing ret == sz_size2index_compute(size)
>
> [Correction: That should have referenced sz_index2size_lookup(index).]
>
>> (also, but a different caller of sz_size2index):
>
> [Correction: The "also" comment should be ignored:
> sz_index2size_lookup(index) is referenced below.]
>
>>
>> (gdb) bt
>> #0 thr_kill () at thr_kill.S:4
>> #1 0x502b2170 in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52
>> #2 0x50211cc0 in abort () at /usr/src/lib/libc/stdlib/abort.c:67
>> #3 0x50206104 in sz_index2size_lookup (index=) at /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:200
>> #4 sz_index2size (index=) at /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:207
>> #5 ifree (tsd=0x50094018, ptr=0x50041028, tcache=0x50094138, slow_path=) at jemalloc_jemalloc.c:2583
>> #6 0x50205cac in __je_free_default (ptr=0x50041028) at jemalloc_jemalloc.c:2784
>> #7 0x50206294 in __free (ptr=0x50041028) at jemalloc_jemalloc.c:2852
>> #8 0x50287ec8 in ns_src_free (src=0x50329004, srclistsize=) at /usr/src/lib/libc/net/nsdispatch.c:452
>> #9 ns_dbt_free (dbt=0x50329000) at /usr/src/lib/libc/net/nsdispatch.c:436
>> #10 vector_free (vec=0x50329000, count=, esize=12, free_elem=) at /usr/src/lib/libc/net/nsdispatch.c:253
>> #11 nss_atexit () at /usr/src/lib/libc/net/nsdispatch.c:578
>> #12 0x5028d958 in __cxa_finalize (dso=0x0) at /usr/src/lib/libc/stdlib/atexit.c:240
>> #13 0x502117f8 in exit (status=0) at /usr/src/lib/libc/stdlib/exit.c:74
>> #14 0x10013f9c in child_cleanup (signo=) at /usr/src/usr.sbin/nfsd/nfsd.c:969
>> #15
>> #16 0x00000000 in ?? ()
>>
>> (gdb) up 3
>> #3 0x50206104 in sz_index2size_lookup (index=) at /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:200
>> 200 assert(ret == sz_index2size_compute(index));
>>
>> (ret is optimized out.)
>>
>> 197 JEMALLOC_ALWAYS_INLINE size_t
>> 198 sz_index2size_lookup(szind_t index) {
>> 199 	size_t ret = (size_t)sz_index2size_tab[index];
>> 200 	assert(ret == sz_index2size_compute(index));
>> 201 	return ret;
>> 202 }
>
> (gdb) print/x __je_sz_index2size_tab
> $3 = {0x0 }
>
> Also:
>
> (gdb) x/4x __je_arenas+16368/4
> 0x5030cab0 <__je_arenas+16368>: 0x00000000 0x00000000 0x00000000 0x00000000
> (gdb) print/x __je_arenas_lock
> $8 = {{{prof_data = {tot_wait_time = {ns = 0x0}, max_wait_time = {ns = 0x0}, n_wait_times = 0x0, n_spin_acquired = 0x0, max_n_thds = 0x0, n_waiting_thds = {repr = 0x0}, n_owner_switches = 0x0,
> prev_owner = 0x0, n_lock_ops = 0x0}, lock = 0x0, postponed_next = 0x0, locked = {repr = 0x0}}}, witness = {name = 0x0, rank = 0x0, comp = 0x0, opaque = 0x0, link = {qre_next = 0x0,
> qre_prev = 0x0}}, lock_order = 0x0}
> (gdb) print/x __je_narenas_auto
> $9 = 0x0
> (gdb) print/x malloc_conf
> $10 = 0x0
> (gdb) print/x __je_ncpus
> $11 = 0x0
> (gdb) print/x __je_manual_arena_base
> $12 = 0x0
> (gdb) print/x __je_sz_pind2sz_tab
> $13 = {0x0 }
> (gdb) print/x __je_sz_size2index_tab
> $1 = {0x0 , 0x1a, 0x1b , 0x1c }
>
>> Booting and immediately trying something like:
>>
>> service nfsd stop
>>
>> did not lead to a failure. But maybe after
>> a while it would, and that would be less
>> drastic than a reboot or power down.
>
> More detail:
>
> So, for rpcbind and nfsd at some point a large part of
> __je_sz_size2index_tab is being stomped on, as is all of
> __je_sz_index2size_tab and more.
>
> . . .
>
> For nfsd, it is similar (again showing the partially
> non-zero live process context instead of the all-zeros
> from the .core file):
>
> 0x5030cab0 <__je_arenas+16368>: 0x00000000 0x00000000 0x00000000 0x00000009
> 0x5030cac0 <__je_arenas_lock>: 0x00000000 0x00000000 0x00000000 0x00000000
> 0x5030cad0 <__je_arenas_lock+16>: 0x00000000 0x00000000 0x00000000 0x00000000
> 0x5030cae0 <__je_arenas_lock+32>: 0x00000000 0x00000000 0x00000000 0x00000000
> 0x5030caf0 <__je_arenas_lock+48>: 0x00000000 0x00000000 0x00000000 0x00000000
> 0x5030cb00 <__je_arenas_lock+64>: 0x00000000 0x502ff070 0x00000000 0x00000000
> 0x5030cb10 <__je_arenas_lock+80>: 0x500ebb04 0x00000003 0x00000000 0x00000000
> 0x5030cb20 <__je_arenas_lock+96>: 0x5030cb10 0x5030cb10 0x00000000 0x00000000
>
> Then the memory in the crash continues to be zero until:
>
> 0x5030d000 <__je_sz_size2index_tab+384>: 0x1a1b1b1b 0x1b1b1b1b 0x1b1b1b1b 0x1b1b1b1b
>
> Notice the interesting page boundary for where non-zero
> is first available again!
>
> Between __je_arenas_lock and __je_sz_size2index_tab are:
>
> 0x5030cb30 __je_narenas_auto
> 0x5030cb38 malloc_conf
> 0x5030cb3c __je_ncpus
> 0x5030cb40 __je_manual_arena_base
> 0x5030cb80 __je_sz_pind2sz_tab
> 0x5030ccc0 __je_sz_index2size_tab
> 0x5030ce80 __je_sz_size2index_tab
>
>
> Note: because __je_arenas is normally
> mostly zero for these contexts, I can
> not tell where the memory trashing
> started, only where it replaced non-zero
> values with zeros.
> . . .

I caused the memory content to shift some in nfsd.
The resulting location where the zeros stop, from the
failure, looks like:

(gdb) x/128x __je_sz_size2index_tab
0x5030cf00 <__je_sz_size2index_tab>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cf10 <__je_sz_size2index_tab+16>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cf20 <__je_sz_size2index_tab+32>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cf30 <__je_sz_size2index_tab+48>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cf40 <__je_sz_size2index_tab+64>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cf50 <__je_sz_size2index_tab+80>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cf60 <__je_sz_size2index_tab+96>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cf70 <__je_sz_size2index_tab+112>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cf80 <__je_sz_size2index_tab+128>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cf90 <__je_sz_size2index_tab+144>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cfa0 <__je_sz_size2index_tab+160>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cfb0 <__je_sz_size2index_tab+176>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cfc0 <__je_sz_size2index_tab+192>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cfd0 <__je_sz_size2index_tab+208>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cfe0 <__je_sz_size2index_tab+224>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030cff0 <__je_sz_size2index_tab+240>: 0x00000000 0x00000000 0x00000000 0x00000000
0x5030d000 <__je_sz_size2index_tab+256>: 0x18191919 0x19191919 0x19191919 0x19191919
0x5030d010 <__je_sz_size2index_tab+272>: 0x19191919 0x19191919 0x19191919 0x19191919
0x5030d020 <__je_sz_size2index_tab+288>: 0x19191919 0x19191919 0x19191919 0x19191919
0x5030d030 <__je_sz_size2index_tab+304>: 0x19191919 0x19191919 0x19191919 0x19191919
0x5030d040 <__je_sz_size2index_tab+320>: 0x191a1a1a 0x1a1a1a1a 0x1a1a1a1a 0x1a1a1a1a
0x5030d050 <__je_sz_size2index_tab+336>: 0x1a1a1a1a 0x1a1a1a1a 0x1a1a1a1a 0x1a1a1a1a
0x5030d060 <__je_sz_size2index_tab+352>: 0x1a1a1a1a 0x1a1a1a1a 0x1a1a1a1a 0x1a1a1a1a
0x5030d070 <__je_sz_size2index_tab+368>: 0x1a1a1a1a 0x1a1a1a1a 0x1a1a1a1a 0x1a1a1a1a
0x5030d080 <__je_sz_size2index_tab+384>: 0x1a1b1b1b 0x1b1b1b1b 0x1b1b1b1b 0x1b1b1b1b
0x5030d090 <__je_sz_size2index_tab+400>: 0x1b1b1b1b 0x1b1b1b1b 0x1b1b1b1b 0x1b1b1b1b
0x5030d0a0 <__je_sz_size2index_tab+416>: 0x1b1b1b1b 0x1b1b1b1b 0x1b1b1b1b 0x1b1b1b1b
0x5030d0b0 <__je_sz_size2index_tab+432>: 0x1b1b1b1b 0x1b1b1b1b 0x1b1b1b1b 0x1b1b1b1b
0x5030d0c0 <__je_sz_size2index_tab+448>: 0x1b1c1c1c 0x1c1c1c1c 0x1c1c1c1c 0x1c1c1c1c
0x5030d0d0 <__je_sz_size2index_tab+464>: 0x1c1c1c1c 0x1c1c1c1c 0x1c1c1c1c 0x1c1c1c1c
0x5030d0e0 <__je_sz_size2index_tab+480>: 0x1c1c1c1c 0x1c1c1c1c 0x1c1c1c1c 0x1c1c1c1c
0x5030d0f0 <__je_sz_size2index_tab+496>: 0x1c1c1c1c 0x1c1c1c1c 0x1c1c1c1c 0x1c1c1c1c

So, it is the page boundary that the end of the zeros
tracks, not the detailed placement of the memory contents.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in
early 2018-Mar)
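
The page-boundary observation can be checked with a little
arithmetic. What follows is a minimal C sketch, not part of the
original report. It assumes 4 KiB pages (an assumption, though
consistent with the 0x5030d000 boundary seen in both dumps), and
the helper names page_round_up, offset_of_boundary, and
check_examples are hypothetical. It takes the two observed start
addresses of __je_sz_size2index_tab and computes the table offset
at which the next page boundary falls.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, consistent with the 0x...d000 boundary in the dumps. */
#define PAGE_SIZE_ASSUMED 0x1000u

/* Round a 32-bit address up to the next page boundary (unchanged if aligned). */
uint32_t page_round_up(uint32_t addr) {
    return (addr + PAGE_SIZE_ASSUMED - 1u) & ~(PAGE_SIZE_ASSUMED - 1u);
}

/* Byte offset from a table's start address to the next page boundary. */
uint32_t offset_of_boundary(uint32_t table_addr) {
    return page_round_up(table_addr) - table_addr;
}

/* The two table placements reported in this thread. */
void check_examples(void) {
    /* Original placement: table at 0x5030ce80, non-zero data resumes at +384. */
    assert(page_round_up(0x5030ce80u) == 0x5030d000u);
    assert(offset_of_boundary(0x5030ce80u) == 384u);

    /* Shifted placement: table at 0x5030cf00, non-zero data resumes at +256. */
    assert(page_round_up(0x5030cf00u) == 0x5030d000u);
    assert(offset_of_boundary(0x5030cf00u) == 256u);
}
```

Both placements give the same boundary address, 0x5030d000, at
table offsets +384 and +256 respectively. That matches where
non-zero data resumes in the two dumps.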