From owner-freebsd-current@freebsd.org Fri Jul 24 13:03:20 2015
Date: Fri, 24 Jul 2015 06:03:17 -0700
From: David Wolfskill
To: Ian Lepore, current@freebsd.org
Subject: Re: Segmentation fault running ntpd
Message-ID: <20150724130317.GS27865@albert.catwhisker.org>
Mail-Followup-To: David Wolfskill, Ian Lepore, current@freebsd.org
References: <20150719183600.GF1217@albert.catwhisker.org>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha512;
	protocol="application/pgp-signature"; boundary="Rli/F08USW5HCFIx"
Content-Disposition: inline
In-Reply-To: <20150719183600.GF1217@albert.catwhisker.org>
User-Agent: Mutt/1.5.23 (2014-03-12)
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
X-List-Received-Date: Fri, 24 Jul 2015 13:03:20 -0000

--Rli/F08USW5HCFIx
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Sun, Jul 19, 2015 at 11:36:00AM -0700, David Wolfskill wrote:
> On Sun, Jul 19, 2015 at 10:24:11AM -0600, Ian Lepore wrote:
> > ...
> > Was there anything (at all) in /var/log/messages about ntpd?  Even the
> > routine messages (such as what interfaces it binds to) might give a bit
> > of a clue about how far it got in its init before it died.
> > ....
>
> Sorry; there might have been something yesterday...
> If I do get another recurrence, I'll try to gather a bit more
> information.
> ....

OK; got another one.
This time, I have the complete /var/log/messages for a verbose boot,
from that boot to just a bit after the ntpd crash; it's in
; as of the moment, that contains:

[PARENTDIR] Parent Directory                                 -
[   ] CANARY                  2015-03-22 10:03   15K
[   ] CANARY.gz               2015-03-22 10:03  6.3K
[   ] ntpd.core               2015-07-24 05:31   13M
[   ] ntpd.core.gz            2015-07-24 05:31  124K
[TXT] ntpd_crash_msgs.txt     2015-07-24 05:40  138K
[   ] ntpd_crash_msgs.txt.gz  2015-07-24 05:40   19K

This was running:

FreeBSD g1-245.catwhisker.org 11.0-CURRENT FreeBSD 11.0-CURRENT #133  r285836M/285836:1100077: Fri Jul 24 05:24:41 PDT 2015     root@g1-245.catwhisker.org:/common/S4/obj/usr/src/sys/CANARY  amd64

Trying "gdb /usr/obj/usr/src/usr.sbin/ntp/ntpd/ntpd ntpd.core" still
doesn't help much:

This GDB was configured as "amd64-marcel-freebsd"...(no debugging symbols found)...
Core was generated by `ntpd'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib/libm.so.5...(no debugging symbols found)...done.
...
Loaded symbols for /libexec/ld-elf.so.1
#0  0x00000008011cd6a0 in sbrk () from /lib/libc.so.7
[New Thread 801c07400 (LWP 100133/)]
[New Thread 801c06400 (LWP 100132/)]
(gdb) bt
#0  0x00000008011cd6a0 in sbrk () from /lib/libc.so.7
#1  0x00000008ccbd4f34 in ?? ()
#2  0x0000000000000005 in ?? ()
#3  0x0000000801800448 in ?? ()
#4  0x00000008011ca888 in sbrk () from /lib/libc.so.7
#5  0x00000008018000c8 in ?? ()
#6  0x00000008018000c0 in ?? ()
#7  0x0000000000000208 in ?? ()
#8  0x0000000801c32fb0 in ?? ()
#9  0x0000000000000001 in ?? ()
#10 0x0000000801cc20c8 in ?? ()
#11 0x0000000000000030 in ?? ()
#12 0x0000000801cc20c8 in ?? ()
#13 0x00007fffffffe480 in ?? ()
#14 0x00000008011cd240 in sbrk () from /lib/libc.so.7
#15 0x0000000000000280 in ?? ()
#16 0x00000008014bbc70 in malloc_message () from /lib/libc.so.7
#17 0x00000008018000c0 in ?? ()
#18 0x0000000801800448 in ?? ()
#19 0x0000000000000032 in ?? ()
#20 0x0000000801800458 in ?? ()
#21 0x00000008014bbc68 in malloc_message () from /lib/libc.so.7
#22 0x0000000801cc2000 in ?? ()
#23 0x00000008014bba60 in malloc_message () from /lib/libc.so.7
#24 0x0000000801cc20d8 in ?? ()
#25 0x00000000000000a0 in ?? ()
#26 0x0000000000000208 in ?? ()
#27 0x00007fffffffe4d0 in ?? ()
#28 0x00000008011bdd7a in _malloc_thread_cleanup () from /lib/libc.so.7
Previous frame inner to this frame (corrupt stack?)
(gdb)

I am presently suspecting that it's a bit dependent on ... well,
"timing".

I have my ~/.xsession set up so that once I've entered the passphrase(s)
for my SSH private key(s), scripts start running to establish
connections to other machines -- e.g., open an xterm locally, ssh over
to my mailhub and (re-)establish a tmux session on that machine where I
run mutt to read & write email (such as this message).

While that almost always Just Works in stable/10, it's rather ...
spottier ... in head -- I'd say it's about a 50% probability that it
will work, vs. the ssh connection attempt hanging, and eventually timing
out.  But if I've waited (say) 30 seconds or so, I can establish such a
connection easily.

Granted, I am using wireless (802.11), but I get a sense that "things"
are claimed to be "ready to go" a bit prematurely -- at least sometimes.

Peace,
david
-- 
David H. Wolfskill				david@catwhisker.org
Those who murder in the name of God or prophet are blasphemous cowards.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.
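
[Illustrative sketch, not part of the original message.]  One way to
probe the timing suspicion described above would be to poll for TCP
reachability before starting the ssh sessions, and to note how long it
takes before a connection first succeeds.  The following is a minimal
sketch in Python under stated assumptions: the function name
wait_for_tcp, its parameters, and the host name in the usage note are
all hypothetical, and nothing here is taken from the actual ~/.xsession
scripts.

```python
import socket
import time


def wait_for_tcp(host, port, deadline_s=30.0, interval_s=1.0,
                 attempt_timeout_s=2.0):
    """Return True once a TCP connection to (host, port) succeeds.

    Return False if no attempt succeeds within roughly deadline_s seconds.
    (Hypothetical helper for illustration only.)
    """
    give_up_at = time.monotonic() + deadline_s
    while True:
        try:
            # Each attempt gets its own short timeout, so a single hung
            # attempt cannot consume the whole deadline.
            with socket.create_connection((host, port),
                                          timeout=attempt_timeout_s):
                return True
        except OSError:
            # Refused, unreachable, DNS failure, or timeout: retry later.
            pass
        if time.monotonic() + interval_s > give_up_at:
            return False
        time.sleep(interval_s)
```

A session script could call something like
wait_for_tcp("mailhub.example.org", 22) (placeholder host name) and
start ssh only if it returns True.  If the first few attempts fail and
later ones succeed, that would be consistent with the network being
reported as ready before it actually is, though it would not by itself
explain the ntpd crash.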

--Rli/F08USW5HCFIx--