Date: Wed, 4 Feb 2015 16:29:41 +0200
From: Konstantin Belousov
To: Peter Wemm
Subject: Re: PSA: If you run -current, beware!
Message-ID: <20150204142941.GE42409@kib.kiev.ua>
References: <8089702.oYScRm8BTN@overcee.wemm.org>
In-Reply-To: <8089702.oYScRm8BTN@overcee.wemm.org>
Cc: 'freebsd-current'
List-Id: Discussions about the use of FreeBSD-current

On Tue, Feb 03, 2015 at 01:33:15PM -0800, Peter Wemm wrote:
> Sometime in the Dec 10th through Jan 7th timeframe a timing bug has been
> introduced to 11.x/head/-current. With HZ=1000 (the default for bare metal,
> not for a vm); the clocks stop just after 24 days of uptime. This means
> things like cron, sleep, timeouts etc stop working. TCP/IP won't time out or
> retransmit, etc etc. It can get ugly.
>
> The problem is NOT in 10.x/-stable.
>
> We hit this in the freebsd.org cluster, the builds that we used are:
> FreeBSD 11.0-CURRENT #0 r275684: Wed Dec 10 20:38:43 UTC 2014 - fine
> FreeBSD 11.0-CURRENT #0 r276779: Wed Jan 7 18:47:09 UTC 2015 - broken
>
> If you are running -current in a situation where it'll accumulate uptime, you
> may want to take precautions. A reboot prior to 24 days uptime (as horrible a
> workaround as that is) will avoid it.
>
> Yes, this is being worked on.

So the issue is reproducible within 3 minutes after boot with the following
change in kern_clock.c:

	volatile int ticks = INT_MAX - (/*hz*/1000 * 3 * 60);

It is fixed (in the proper meaning of the word, not merely worked around or
papered over) by the patch at the end of the mail.
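
As a rough check of the numbers above, here is a minimal sketch, assuming a
32-bit signed int tick counter that advances hz times per second. The helper
names are hypothetical and are not taken from kern_clock.c. It only shows why
"24 days" and "3 minutes" are both consistent with the counter reaching
INT_MAX.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical helper: whole days of uptime before a signed int tick
 * counter that started at 0 reaches INT_MAX, at hz ticks per second.
 */
static int64_t
days_until_tick_overflow(int hz)
{
	return ((int64_t)INT_MAX / hz / 86400);
}

/*
 * Hypothetical helper: whole seconds of uptime before the counter
 * reaches INT_MAX when it is initialized to start_ticks.
 */
static int64_t
seconds_until_tick_overflow(int start_ticks, int hz)
{
	return (((int64_t)INT_MAX - start_ticks) / hz);
}
```

With hz = 1000, the first helper yields 24 whole days, and the second yields
180 seconds for the initializer INT_MAX - (1000 * 3 * 60) used in the
reproducer.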

We already have a history of trying to enable the much less ambitious option
-fno-strict-overflow, see r259045 and the revert in r259422. I do not see
any other way than to try one more time. Too many places in the kernel
depend on correctly wrapping two's-complement arithmetic, among them the
callwheel and the scheduler.

diff --git a/sys/conf/kern.mk b/sys/conf/kern.mk
index c031b3a..eb7ce2f 100644
--- a/sys/conf/kern.mk
+++ b/sys/conf/kern.mk
@@ -158,6 +158,11 @@ INLINE_LIMIT?= 8000
 CFLAGS+= -ffreestanding
 
 #
+# Make signed arithmetic wrap.
+#
+CFLAGS+= -fwrapv
+
+#
 # GCC SSP support
 #
 .if ${MK_SSP} != "no" && \
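
To illustrate the kind of code that relies on wrapping tick arithmetic, here
is a minimal sketch. It is not the kernel's actual callwheel or scheduler
code, and the function names are hypothetical. Kernel code commonly subtracts
two int tick values directly, which ISO C leaves undefined on overflow unless
the compiler is told to wrap (as -fwrapv does). The sketch subtracts in
unsigned arithmetic instead, so it produces the intended wrapped result
without depending on that flag. The conversion back to int is
implementation-defined but wraps on the usual two's-complement compilers.

```c
#include <limits.h>

/*
 * Hypothetical helper: ticks elapsed from start to now.  The subtraction
 * is done in unsigned int, which is defined to wrap modulo 2^N, so the
 * result stays correct after the counter passes INT_MAX.
 */
static int
ticks_elapsed(int now, int start)
{
	return ((int)((unsigned int)now - (unsigned int)start));
}

/*
 * Hypothetical helper: nonzero once now has reached or passed deadline,
 * provided the two values are less than INT_MAX ticks apart.
 */
static int
deadline_passed(int now, int deadline)
{
	return (ticks_elapsed(now, deadline) >= 0);
}
```

For example, with start = INT_MAX - 5 and now = INT_MIN + 4 (the counter has
wrapped after ten more ticks), ticks_elapsed() returns 10 and
deadline_passed() reports that the deadline has been reached.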