From owner-freebsd-hackers@freebsd.org Tue Feb 2 14:53:28 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 11A54A9811C for ; Tue, 2 Feb 2016 14:53:28 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 9719181 for ; Tue, 2 Feb 2016 14:53:27 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u12ErMFM062640 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Tue, 2 Feb 2016 16:53:22 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u12ErMFM062640 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u12ErMgi062639; Tue, 2 Feb 2016 16:53:22 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Tue, 2 Feb 2016 16:53:22 +0200 From: Konstantin Belousov To: Jov Cc: freebsd-hackers@freebsd.org Subject: Re: Fwd: [BUGS] BUG #13900: stop standby failed with writer process hang(happen 3 times in 2 days) Message-ID: <20160202145322.GA91220@kib.kiev.ua> References: <20160130071346.31022.37189@wrigleys.postgresql.org> <20160131162929.GI91220@kib.kiev.ua> <20160201190415.GO91220@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.24 (2015-08-30) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Feb 2016 14:53:28 -0000 On Tue, Feb 02, 2016 at 09:36:19AM +0800, Jov wrote: > ???Thanks for your info,Belousov. > Would you please point which PR or which commit I can exam???I can not move > to 10 stable now,maybe I can avoid trigger this bug or add some monitor???. stable/10 is not fixed as well. I do not have a good suggestion rather then to sync kern_timeout.c with head. After that, you need one more patch. diff --git a/sys/kern/kern_timeout.c b/sys/kern/kern_timeout.c index 9c9d25f..e827665 100644 --- a/sys/kern/kern_timeout.c +++ b/sys/kern/kern_timeout.c @@ -1162,7 +1162,7 @@ _callout_stop_safe(struct callout *c, int safe, void (*drain)(void *)) int direct, sq_locked, use_lock; int not_on_a_list; - if (safe) + if ((safe & CS_DRAIN) != 0) WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK, c->c_lock, "calling %s", __func__); @@ -1170,7 +1170,7 @@ _callout_stop_safe(struct callout *c, int safe, void (*drain)(void *)) * Some old subsystems don't hold Giant while running a callout_stop(), * so just discard this check for the moment. */ - if (!safe && c->c_lock != NULL) { + if ((safe & CS_DRAIN) == 0 && c->c_lock != NULL) { if (c->c_lock == &Giant.lock_object) use_lock = mtx_owned(&Giant); else { @@ -1253,7 +1253,7 @@ again: return (-1); } - if (safe) { + if ((safe & CS_DRAIN) != 0) { /* * The current callout is running (or just * about to run) and blocking is allowed, so @@ -1370,7 +1370,7 @@ again: cc_exec_drain(cc, direct) = drain; } CC_UNLOCK(cc); - return (0); + return ((safe & CS_MIGRBLOCK) != 0); } CTR3(KTR_CALLOUT, "failed to stop %p func %p arg %p", c, c->c_func, c->c_arg); diff --git a/sys/kern/subr_sleepqueue.c b/sys/kern/subr_sleepqueue.c index bbbec92..12908f6 100644 --- a/sys/kern/subr_sleepqueue.c +++ b/sys/kern/subr_sleepqueue.c @@ -586,7 +586,8 @@ sleepq_check_timeout(void) * another CPU, so synchronize with it to avoid having it * accidentally wake up a subsequent sleep. */ - else if (callout_stop(&td->td_slpcallout) == 0) { + else if (_callout_stop_safe(&td->td_slpcallout, CS_MIGRBLOCK, NULL) + == 0) { td->td_flags |= TDF_TIMEOUT; TD_SET_SLEEPING(td); mi_switch(SW_INVOL | SWT_SLEEPQTIMO, NULL); diff --git a/sys/sys/callout.h b/sys/sys/callout.h index 3e71c87..51dc137 100644 --- a/sys/sys/callout.h +++ b/sys/sys/callout.h @@ -62,6 +62,9 @@ struct callout_handle { struct callout *callout; }; +#define CS_DRAIN 0x0001 +#define CS_MIGRBLOCK 0x0002 + #ifdef _KERNEL /* * Note the flags field is actually *two* fields. The c_flags @@ -81,7 +84,7 @@ struct callout_handle { */ #define callout_active(c) ((c)->c_flags & CALLOUT_ACTIVE) #define callout_deactivate(c) ((c)->c_flags &= ~CALLOUT_ACTIVE) -#define callout_drain(c) _callout_stop_safe(c, 1, NULL) +#define callout_drain(c) _callout_stop_safe(c, CS_DRAIN, NULL) void callout_init(struct callout *, int); void _callout_init_lock(struct callout *, struct lock_object *, int); #define callout_init_mtx(c, mtx, flags) \