Date: Mon, 26 Oct 2015 23:06:44 +0000
From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 204048] stable/9: r289998: ntpd 4.2.8p4 DNS resolution misbehaves (occasional segfault)
Message-ID: <bug-204048-8@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204048

    Bug ID: 204048
   Summary: stable/9: r289998: ntpd 4.2.8p4 DNS resolution misbehaves (occasional segfault)
   Product: Base System
   Version: 9.3-RELEASE
  Hardware: amd64
        OS: Any
    Status: New
  Severity: Affects Only Me
  Priority: ---
 Component: bin
  Assignee: freebsd-bugs@FreeBSD.org
  Reporter: jdc@koitsu.org

Recent upgrade of ntpd 4.2.8p4 to stable/9 results in a daemon which behaves very very oddly. Said upgrade:

http://www.freshbsd.org/commit/freebsd/r289998

My log after several manual troubleshooting attempts -- note the intermixed segfaults:

Oct 26 15:38:05 icarus ntpd[1092]: giving up resolving host clock.isc.org: servname not supported for ai_socktype (9)
Oct 26 15:38:23 icarus ntpd[1116]: giving up resolving host clock.isc.org: servname not supported for ai_socktype (9)
pid 1139 (ntpd), uid 0: exited on signal 11 (core dumped)
Oct 26 15:39:07 icarus ntpd[1176]: giving up resolving host clock.isc.org: servname not supported for ai_socktype (9)
Oct 26 15:39:59 icarus ntpd[1209]: giving up resolving host ntp-1.cso.uiuc.edu: servname not supported for ai_socktype (9)
Oct 26 15:40:24 icarus ntpd[1268]: giving up resolving host clock.isc.org: servname not supported for ai_socktype (9)
pid 1294 (ntpd), uid 0: exited on signal 11 (core dumped)
pid 1312 (ntpd), uid 0: exited on signal 11 (core dumped)
Oct 26 15:44:09 icarus ntpd[1409]: giving up resolving host clock.isc.org: servname not supported for ai_socktype (9)
Oct 26 15:45:26 icarus ntpd[1490]: giving up resolving host 0.freebsd.pool.ntp.org: servname not supported for ai_socktype (9)
Oct 26 15:50:18 icarus ntpd[1656]: giving up resolving host tick.jrc.us: servname not supported for ai_socktype (9)

Segfaults are always here:

root@icarus:~ # gdb /usr/sbin/ntpd /ntpd.core
...
#0  0x000000080114d79d in _malloc_postfork () from /lib/libc.so.7
[New Thread 801807c00 (LWP 100797/ntpd)]
[New Thread 801807400 (LWP 100791/ntpd)]
(gdb) bt
#0  0x000000080114d79d in _malloc_postfork () from /lib/libc.so.7
#1  0x000000080114fb3e in _malloc_postfork () from /lib/libc.so.7
#2  0x00000008011523fe in _malloc_prefork () from /lib/libc.so.7
#3  0x0000000801154482 in calloc () from /lib/libc.so.7
#4  0x000000080117aba6 in __res_state () from /lib/libc.so.7
#5  0x000000080118698c in freeaddrinfo () from /lib/libc.so.7
#6  0x00000008011ab61a in nsdispatch () from /lib/libc.so.7
#7  0x0000000801187ffb in getaddrinfo () from /lib/libc.so.7
#8  0x0000000000474f04 in blocking_getaddrinfo ()
#9  0x0000000000473a43 in blocking_child_common ()
#10 0x00000000004737e9 in blocking_thread ()
#11 0x0000000800afee70 in pthread_getprio () from /lib/libthr.so.3
#12 0x0000000000000000 in ?? ()

Important:

The behaviour seen is very strange. Basically, the daemon starts, emits one of the aforementioned DNS errors, then proceeds to either a) exit, b) crash, or c) continue running. Sometimes when the daemon exits (possibly when crashing too), it restarts itself. There have been a couple times where ps -auxwww | grep ntp returns nothing, yet a few seconds later the daemon is found running.

Things I've tried which made no difference:

1. Removing -4 from $ntpd_flags (I set this because while my system has IPv6, I prefer using IPv4 everywhere)
2. Using /etc/ntp.conf (r289998) instead of my own ntp.conf

There is no workaround for this other than to roll back to something prior to r289998.

Googling turns up several reports of this problem, but all relate to people trying to use chroot'ing with ntpd (I DO NOT use this feature).

https://mail-index.netbsd.org/current-users/2014/01/26/msg024169.html
https://mail-index.netbsd.org/current-users/2014/06/01/msg024998.html

One report says that use of -O1 (on ARM) relieves the problem, but crashing is seen on VAX and other platforms. (My system uses gcc, not clang, just for the record)

Footnote: upgrading to stable/10 is not an option until the load average bug there is rectified (I am not the only one to report this problem). I can try to test out this ntpd on a VM running stable/10 to see if the problem there is reproducible or not.

My ntp.conf (w/ comments removed):

server clock.isc.org iburst
server ntp-1.cso.uiuc.edu iburst
server clock.psu.edu iburst
server tick.jrc.us iburst
server 0.us.pool.ntp.org iburst
restrict default limited kod nomodify notrap nopeer noquery
restrict 127.0.0.1
restrict 192.168.1.0 mask 255.255.255.0

My rc.conf ntp-related flags:

# ntpd_flags: temporary workaround for https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=199127
ntpd_enable="yes"
ntpd_config="/conf/ME/ntp.conf"
ntpd_sync_on_start="yes"
ntpd_flags="-4 ${ntpd_flags}"

-- 
You are receiving this mail because:
You are the assignee for the bug.
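
The number in parentheses at the end of each "giving up resolving host" line, (9), is the error code that getaddrinfo() returned. On FreeBSD that value corresponds to EAI_SERVICE, and "servname not supported for ai_socktype" is its gai_strerror() text. As a rough way to check whether a similar lookup fails outside ntpd, the sketch below calls getaddrinfo() through Python's socket module. The helper name try_resolve and its defaults (service "ntp", IPv4, UDP) are assumptions chosen for illustration; they are not taken from the ntpd source, and the hints ntpd passes internally may differ.

```python
import socket


def try_resolve(host, service="ntp", family=socket.AF_INET,
                socktype=socket.SOCK_DGRAM):
    """Hypothetical helper: run one getaddrinfo() lookup.

    Returns (True, [addresses]) on success, or
    (False, (error_code, message)) when libc reports a gai error.
    """
    try:
        infos = socket.getaddrinfo(host, service, family, socktype)
    except socket.gaierror as exc:
        # args[0] is the EAI_* code, args[1] the gai_strerror() text.
        return False, (exc.args[0], exc.args[1])
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string.
    return True, sorted({info[4][0] for info in infos})


# A numeric host and numeric port need neither DNS nor /etc/services,
# so this call only exercises the getaddrinfo() plumbing itself.
print(try_resolve("127.0.0.1", "123"))
```

Calling try_resolve("clock.isc.org") on the affected machine would show whether libc resolves the name with these hints when called from a single-threaded process. If it succeeds there, that might point toward the threaded resolver path in ntpd (the blocking_getaddrinfo frames in the backtrace) rather than the system resolver configuration, though this sketch alone cannot confirm that.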