Subject: Re: ntpd segfaults on start
From: Ian Lepore
To: Konstantin Belousov
Cc: "Rodney W. Grimes", Cy Schubert, Harlan Stenn, Vladimir Zakharov,
    freebsd-current@freebsd.org
Date: Mon, 09 Sep 2019 15:12:18 -0600
In-Reply-To: <20190909184446.GU2559@kib.kiev.ua>
References: <201909091630.x89GUjGX044288@gndrsh.dnsmgr.net>
    <20190909184446.GU2559@kib.kiev.ua>

On Mon, 2019-09-09 at 21:44 +0300, Konstantin Belousov wrote:
> On Mon, Sep 09, 2019 at 12:13:24PM -0600, Ian Lepore wrote:
> > On Mon, 2019-09-09 at 09:30 -0700, Rodney W. Grimes wrote:
> > > > On Sat, 2019-09-07 at 09:28 -0700, Cy Schubert wrote:
> > > > > In message <20190907161749.GJ2559@kib.kiev.ua>, Konstantin
> > > > > Belousov writes:
> > > > > > On Sat, Sep 07, 2019 at 08:45:21AM -0700, Cy Schubert
> > > > > > wrote:
> > > > > > > [...]
> > >
> > > Doesn't locking this memory down also protect ntpd from OOM kills?
> > > If so that is a MUST preserve functionality, as IMHO killing ntpd
> > > on a box that has it configured is a total no win situation.
> > >
> > Does it have that effect? I don't know. But I would argue that that's
> > a separate issue, and we should make that happen by adding
> > ntpd_oomprotect=YES to /etc/defaults/rc.conf
>
> Wiring process memory has no effect on OOM selection.
> More, because all potentially allocated pages are allocated for real
> after mlockall(), the size of the vmspace, as accounted by OOM, is the
> largest possible size from the whole lifetime.
>
> On the other hand, the code execution times are not predictable if the
> process's pages can be paged out. Under severe load next instruction
> might take several seconds or even minutes to start. It is quite unlike
> the scheduler delays. That introduces a jitter in the local time
> measurements and their usage as done in userspace. Wouldn't this affect
> the accuracy ?
>

IMO, there is a large gap between "in theory, paging could cause
indeterminate delays in code execution" and "time will be inaccurate on
your system". If a delay amounting to "seconds or even minutes" hit a
part of the code where it matters, the result would be a measurement
that the median filter discards as an outlier. There would be some
danger that if that kind of delay happened for too many polling cycles
in a row, you'd end up with no usable measurements after a while and
clock accuracy would suffer. Sub-second delays would be more worrisome
because they might not be rejected as outliers.

There are only a couple of code paths in FreeBSD ntpd processing where a
paging (or scheduling) delay could cause measurement inaccuracy:

 - When stepping the clock, the code that runs between calling
   clock_gettime() and calling clock_settime() to apply the step
   adjustment to the clock.

 - When beginning an exchange with or replying to a peer, the code that
   runs between obtaining system time for the outgoing Transmit
   Timestamp and actually transmitting that packet.

Stepping the clock typically only happens once at startup.
The ntpd code itself recognizes that this is a time-critical path (it
has comments to that effect), but unfortunately the code that runs is
scattered among several different .c files, so it's hard to say how
likely it is that the code in the critical section will all be in the
same page (or be already resident because other startup-time code
faulted in those pages). IMO, the right fix for this would be a kernel
interface that lets you apply a step-delta to the clock with a single
syscall (perhaps as an extension to the existing ntp_adjtime() using a
new mode flag).

On FreeBSD, the Receive timestamps are captured in the kernel and
delivered along with the packet to userland, and are retrieved by the
ntpd code from the SCM_BINTIME control message in the packet, so there
is no latency problem in the receive path. There isn't a corresponding
kernel mechanism for setting the outgoing timestamps, so whether it's
originating a request to a peer or replying to a request from a peer,
the transmit timestamp could be wrong due to:

 - paging delays
 - scheduler delays
 - network stack, outgoing queues, and driver delays

So the primary vulnerability is on the transmit path between obtaining
system time and the packet leaving the system. A quick glance at that
code makes me think that most of the data being touched has already
been referenced pretty recently while assembling the outgoing packet,
so it's unlikely that storing the timestamp into the outgoing packet,
or the other bit of work that happens after that, triggers a pagein
unless the system is pathologically overloaded. Naturally, obtaining
the timestamp and putting it into the packet is one of the last things
it does before sending, so the code path is relatively short, but it's
not clear to me whether the code involved is likely to all live in the
same page. Still, it's one of the heavily exercised paths within ntpd,
which should increase the odds of the pages being resident because of
recent use.
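
For illustration only, the clock-step window described earlier (read
the clock, add the offset, set the clock) might look roughly like the C
sketch below. This is not ntpd's actual code: the names timespec_shift()
and step_clock() are made up here, and only the standard POSIX
clock_gettime() and clock_settime() calls are assumed. Any page fault
or preemption between the two syscalls makes the applied step stale by
that amount.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/*
 * Hypothetical helper (not from ntpd): return ts shifted by delta_ns
 * nanoseconds, with tv_nsec normalized into [0, NSEC_PER_SEC).
 */
struct timespec
timespec_shift(struct timespec ts, long long delta_ns)
{
	long long sec = delta_ns / NSEC_PER_SEC;
	long long nsec = ts.tv_nsec + delta_ns % NSEC_PER_SEC;

	if (nsec >= NSEC_PER_SEC) {
		nsec -= NSEC_PER_SEC;
		sec++;
	} else if (nsec < 0) {
		nsec += NSEC_PER_SEC;
		sec--;
	}
	ts.tv_sec += (time_t)sec;
	ts.tv_nsec = (long)nsec;
	assert(ts.tv_nsec >= 0 && ts.tv_nsec < NSEC_PER_SEC);
	return (ts);
}

/*
 * Hypothetical sketch of a userland clock step. Requires privilege to
 * succeed. Returns 0 on success, -1 on failure.
 */
int
step_clock(long long delta_ns)
{
	struct timespec now;

	if (clock_gettime(CLOCK_REALTIME, &now) != 0)
		return (-1);
	/*
	 * Time-critical window: any delay from here until
	 * clock_settime() takes effect is not accounted for in the
	 * value being written.
	 */
	now = timespec_shift(now, delta_ns);
	return (clock_settime(CLOCK_REALTIME, &now));
}
```

A single syscall that takes the delta itself would let the kernel do
the read-add-write atomically, which is the point of the suggestion
above.
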
So, I'm not disputing the point that a sufficiently overloaded system
can lead to an indeterminate delay between *any* two instructions
executed in userland. What I've said above is more along the lines of
considering the usual situation, not the most pathological one. In the
most pathological cases, either the delays introduced are fairly minor
and you get some minor jitter in system time (ameliorated by the median
filtering built into ntpd), or the delays are major (a full second or
more) and get rejected as outliers, not affecting system time at all
unless the situation persists and prevents getting any good
measurements for many hours.

-- Ian