Date: Wed, 13 Jul 2016 11:56:01 +0200 From: Hans Petter Selasky <hps@selasky.org> To: =?UTF-8?Q?Roger_Pau_Monn=c3=a9?= <royger@FreeBSD.org>, Gleb Smirnoff <glebius@FreeBSD.org> Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r302350 - in head: share/man/man9 sys/kern sys/sys Message-ID: <b955c4a9-2fbe-55ee-fa15-16702cbe2074@selasky.org> In-Reply-To: <20160713084947.a4nfb6obr475pah6@mac> References: <201607051847.u65IlIYf000901@repo.freebsd.org> <20160713084947.a4nfb6obr475pah6@mac>
next in thread | previous in thread | raw e-mail | index | archive | help
On 07/13/16 10:49, Roger Pau Monné wrote: > On Tue, Jul 05, 2016 at 06:47:18PM +0000, Gleb Smirnoff wrote: >> Author: glebius >> Date: Tue Jul 5 18:47:17 2016 >> New Revision: 302350 >> URL: https://svnweb.freebsd.org/changeset/base/302350 >> >> Log: >> The paradigm of a callout is that it has three consequent states: >> not scheduled -> scheduled -> running -> not scheduled. The API and the >> manual page assume that, some comments in the code assume that, and looks >> like some contributors to the code also did. The problem is that this >> paradigm isn't true. A callout can be scheduled and running at the same >> time, which makes API description ambigouous. In such case callout_stop() >> family of functions/macros should return 1 and 0 at the same time, since it >> successfully unscheduled future callout but the current one is running. >> Before this change we returned 1 in such a case, with an exception that >> if running callout was migrating we returned 0, unless CS_MIGRBLOCK was >> specified. >> >> With this change, we now return 0 in case if future callout was unscheduled, >> but another one is still in action, indicating to API users that resources >> are not yet safe to be freed. >> >> However, the sleepqueue code relies on getting 1 return code in that case, >> and there already was CS_MIGRBLOCK flag, that covered one of the edge cases. >> In the new return path we will also use this flag, to keep sleepqueue safe. >> >> Since the flag CS_MIGRBLOCK doesn't block migration and now isn't limited to >> migration edge case, rename it to CS_EXECUTING. >> >> This change fixes panics on a high loaded TCP server. >> >> Reviewed by: jch, hselasky, rrs, kib >> Approved by: re (gjb) >> Differential Revision: https://reviews.freebsd.org/D7042 > > This change triggers a KASSERT when resuming a Xen VM from suspension: > > panic: bogus refcnt 0 on lle 0xfffff800035b9c00 > cpuid = 0 > KDB: stack backtrace: > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe001d2fa800 > vpanic() at vpanic+0x182/frame 0xfffffe001d2fa880 > kassert_panic() at kassert_panic+0x126/frame 0xfffffe001d2fa8f0 > llentry_free() at llentry_free+0x136/frame 0xfffffe001d2fa920 > in_lltable_free_entry() at in_lltable_free_entry+0xb0/frame 0xfffffe001d2fa950 > arp_ifinit() at arp_ifinit+0x54/frame 0xfffffe001d2fa980 > netfront_backend_changed() at netfront_backend_changed+0x154/frame 0xfffffe001d2faa50 > xenwatch_thread() at xenwatch_thread+0x1a2/frame 0xfffffe001d2faa70 > fork_exit() at fork_exit+0x84/frame 0xfffffe001d2faab0 > fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe001d2faab0 > --- trap 0, rip = 0, rsp = 0, rbp = 0 --- > > This seems to be caused by the following snipped in xen-netfront, which is > used by netfront in order to send a gratuitous ARP after recovering from > migration, in order for the bridge in the host to cache the MAC of the > domain. > > TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { > if (ifa->ifa_addr->sa_family == AF_INET) { > arp_ifinit(ifp, ifa); > } > } > > FWIW, this was working fine before this change, so I'm afraid this change > triggered some kind of bug in the lle entries. > > Roger. > Hi, > static void > nd6_llinfo_settimer_locked(struct llentry *ln, long tick) > { > int canceled; > > LLE_WLOCK_ASSERT(ln); > > if (tick < 0) { > ln->la_expire = 0; > ln->ln_ntick = 0; > canceled = callout_stop(&ln->lle_timer); > } else { > ln->la_expire = time_uptime + tick / hz; > LLE_ADDREF(ln); > if (tick > INT_MAX) { > ln->ln_ntick = tick - INT_MAX; > canceled = callout_reset(&ln->lle_timer, INT_MAX, > nd6_llinfo_timer, ln); > } else { > ln->ln_ntick = 0; > canceled = callout_reset(&ln->lle_timer, tick, > nd6_llinfo_timer, ln); > } > } > if (canceled > 0) > LLE_REMREF(ln); > } Like stated in D7042, there are dependencies in the code that expects callout_reset() to work like this: int callout_reset() { atomic_lock(); retval = callout_stop(); restart callout(); atomic_unlock(); return (retval); } As you can see in the piece of code above. r302350 fundamentally changes that. D7042 has many open questions that are not answered and I think it should be reverted from head and 11-stable until the discussions are complete. I believe the problems found should be fixed in the TCP stack, using locked callouts instead of unlocked. --HPS
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?b955c4a9-2fbe-55ee-fa15-16702cbe2074>