From: Randall Stewart
Date: Fri, 13 Nov 2015 16:34:13 -0500
Subject: Re: svn commit: r290664 - in head: share/man/man9 sys/kern sys/sys
To: "Alexander V. Chernikov"
Cc: "src-committers@freebsd.org", "svn-src-all@freebsd.org", "svn-src-head@freebsd.org", Adrian Chadd, "imp@freebsd.org"
Message-Id: <343356B9-A02C-4DC6-A890-A2727436041C@netflix.com>
In-Reply-To: <278491447449232@web25j.yandex.ru>
References: <201511101449.tAAEnXIi065747@repo.freebsd.org> <1660421447413365@web19h.yandex.ru> <278491447449232@web25j.yandex.ru>
List-Id: SVN commit messages for the src tree for head/-current

My patch addresses the following:

On Nov 13, 2015, at 4:13 PM, Alexander V. Chernikov wrote:

>
>
> 13.11.2015, 23:59, "Randall Stewart":
>> Strange
>>
>> I went looking through all calls to callout stop with cscope and saw
>> no one paying attention to the return value... (which I thought was not good).
>
> 23:49 [0] m@fhead5 grep -R callout_stop sys | egrep '(=|\)'
> sys/netgraph/ng_base.c: rval = callout_stop(c);

This one does not need changing.

> sys/netpfil/pf/if_pfsync.c: if (callout_stop(&pd->pd_tmo)) {
> sys/netpfil/pf/if_pfsync.c: if (callout_stop(&pd->pd_tmo))

The above two I changed to > 0

> sys/dev/isci/isci_timer.c: /* callout_stop() will *not* keep the time

None of the ones in isci_timer.c check the return code or do anything different.

>
> sys/netinet6/nd6.c: canceled = callout_stop(&ln->lle_timer);
> sys/netinet6/in6.c: if (callout_stop(&lle->lle_timer))
> sys/net/if_llatbl.c: if (callout_stop(&lle->lle_timer))

The above needed the same

> sys/kern/subr_taskqueue.c: pending = !!callout_stop(&timeout_task->c);

same as above.. only I think the !! is strange :-)

> sys/kern/kern_exit.c: callout_stop(&p->p_itcallout) == 0) {

Hmm, I may have missed that one, let me check.

Ok, looking at that one, it does not need to be changed.. in fact it is more correct.
An already expired callout used to return 0 and now returns -1, so the == 0 test
matches only the currently-running case that this if code is looking for.

> sys/kern/subr_sleepqueue.c: else if (callout_stop(&td->td_slpcallout) == 0) {

This one again was causing extra work when the callout was already stopped and it returned 0.. it
would do a synchronize on the other CPU.. but if -1 comes back it says the callout is already
stopped.. so no synchronization is needed..

> sys/netinet/in.c: if (callout_stop(&lle->lle_timer))
> sys/netinet/tcp_timer.c: if (callout_stop(t_callout) &&

These two I made > 0, though the TCP one needs to change to use the new async_drain

>
> (not counting callout_drain() here)

drain is different: since it is done with safe set, it should wait for the completion of the
timeout. I don't know if you could ever get a 0 return from it..

R

>
>>
>> And yes I am running this in a lot of systems.
> Try this:
> 0:11 [0] fhead0# ifconfig vtnet0 alias 10.10.10.10/32
> 0:11 [0] fhead0# ifconfig vtnet0 -alias 10.10.10.10
> callout_stop() for lle 10.10.10.10 on vtnet0, lle_refcnt=1
> panic: bogus refcnt 0 on lle 0xfffff8001996c400
>>
>> R
>>
>>
>> On Nov 13, 2015, at 6:16 AM, Alexander V. Chernikov wrote:
>>
>>>
>>> 10.11.2015, 17:49, "Randall Stewart":
>>>>
>>>> Author: rrs
>>>> Date: Tue Nov 10 14:49:32 2015
>>>> New Revision: 290664
>>>> URL: https://svnweb.freebsd.org/changeset/base/290664
>>>>
>>>> Log:
>>>> Add new async_drain to the callout system. This is so-far not used but
>>>> should be used by TCP for sure in its cleanup of the IN-PCB (will be coming shortly).
>>>
>>> Randall, this commit introduced a change in callout_stop() which was not mentioned in the commit message.
>>> This change has broken lltable arp/nd handling: deleting an interface address causes an immediate panic.
>>> I also see other code/subsystems relying on the callout_stop() return value (netgraph, pfsync, iscsi).
>>> I was not able to find any discussion/analysis/testing for these in D4076, so this change does not look like it was properly tested prior to committing..
>>>
>>>
>>>
>>>>
>>>> Sponsored by: Netflix Inc.
>>>> Differential Revision: https://reviews.freebsd.org/D4076
>>>>
>>>> Modified:
>>>> head/share/man/man9/timeout.9
>>>> head/sys/kern/kern_timeout.c
>>>> head/sys/sys/callout.h
>>>>
>>>> Modified: head/share/man/man9/timeout.9
>>>> ==============================================================================
>>>> --- head/share/man/man9/timeout.9 Tue Nov 10 14:14:41 2015 (r290663)
>>>> +++ head/share/man/man9/timeout.9 Tue Nov 10 14:49:32 2015 (r290664)
>>>> @@ -35,6 +35,7 @@
>>>>  .Sh NAME
>>>>  .Nm callout_active ,
>>>>  .Nm callout_deactivate ,
>>>> +.Nm callout_async_drain ,
>>>>  .Nm callout_drain ,
>>>>  .Nm callout_handle_init ,
>>>>  .Nm callout_init ,
>>>> @@ -69,6 +70,8 @@ typedef void timeout_t (void *);
>>>>  .Ft void
>>>>  .Fn callout_deactivate "struct callout *c"
>>>>  .Ft int
>>>> +.Fn callout_async_drain "struct callout *c" "timeout_t *drain"
>>>> +.Ft int
>>>>  .Fn callout_drain "struct callout *c"
>>>>  .Ft void
>>>>  .Fn callout_handle_init "struct callout_handle *handle"
>>>> @@ -236,17 +239,42 @@ The function
>>>>  cancels a callout
>>>>  .Fa c
>>>>  if it is currently pending.
>>>> -If the callout is pending, then
>>>> +If the callout is pending and successfuly stopped, then
>>>>  .Fn callout_stop
>>>> -returns a non-zero value.
>>>> -If the callout is not set,
>>>> -has already been serviced,
>>>> -or is currently being serviced,
>>>> +returns a value of one.
>>>> +If the callout is not set, or
>>>> +has already been serviced, then
>>>> +negative one is returned.
>>>> +If the callout is currently being serviced and cannot be stopped,
>>>>  then zero will be returned.
>>>>  If the callout has an associated lock,
>>>>  then that lock must be held when this function is called.
>>>>  .Pp
>>>>  The function
>>>> +.Fn callout_async_drain
>>>> +is identical to
>>>> +.Fn callout_stop
>>>> +with one difference.
>>>> +When
>>>> +.Fn callout_async_drain
>>>> +returns zero it will arrange for the function
>>>> +.Fa drain
>>>> +to be called using the same argument given to the
>>>> +.Fn callout_reset
>>>> +function.
>>>> +.Fn callout_async_drain
>>>> +If the callout has an associated lock,
>>>> +then that lock must be held when this function is called.
>>>> +Note that when stopping multiple callouts that use the same lock it is possible
>>>> +to get multiple return's of zero and multiple calls to the
>>>> +.Fa drain
>>>> +function, depending upon which CPU's the callouts are running. The
>>>> +.Fa drain
>>>> +function itself is called from the context of the completing callout
>>>> +i.e. softclock or hardclock, just like a callout itself.
>>>> +p
>>>> +.Pp
>>>> +The function
>>>>  .Fn callout_drain
>>>>  is identical to
>>>>  .Fn callout_stop
>>>>
>>>> Modified: head/sys/kern/kern_timeout.c
>>>> ==============================================================================
>>>> --- head/sys/kern/kern_timeout.c Tue Nov 10 14:14:41 2015 (r290663)
>>>> +++ head/sys/kern/kern_timeout.c Tue Nov 10 14:49:32 2015 (r290664)
>>>> @@ -136,6 +136,7 @@ u_int callwheelsize, callwheelmask;
>>>>   */
>>>>  struct cc_exec {
>>>>  	struct callout *cc_curr;
>>>> +	void (*cc_drain)(void *);
>>>>  #ifdef SMP
>>>>  	void (*ce_migration_func)(void *);
>>>>  	void *ce_migration_arg;
>>>> @@ -170,6 +171,7 @@ struct callout_cpu {
>>>>  #define callout_migrating(c) ((c)->c_iflags & CALLOUT_DFRMIGRATION)
>>>>
>>>>  #define cc_exec_curr(cc, dir) cc->cc_exec_entity[dir].cc_curr
>>>> +#define cc_exec_drain(cc, dir) cc->cc_exec_entity[dir].cc_drain
>>>>  #define cc_exec_next(cc) cc->cc_next
>>>>  #define cc_exec_cancel(cc, dir) cc->cc_exec_entity[dir].cc_cancel
>>>>  #define cc_exec_waiting(cc, dir) cc->cc_exec_entity[dir].cc_waiting
>>>> @@ -679,6 +681,7 @@ softclock_call_cc(struct callout *c, str
>>>>
>>>>  	cc_exec_curr(cc, direct) = c;
>>>>  	cc_exec_cancel(cc, direct) = false;
>>>> +	cc_exec_drain(cc, direct) = NULL;
>>>>  	CC_UNLOCK(cc);
>>>>  	if (c_lock != NULL) {
>>>>  		class->lc_lock(c_lock, lock_status);
>>>> @@ -744,6 +747,15 @@ skip:
>>>>  	CC_LOCK(cc);
>>>>  	KASSERT(cc_exec_curr(cc, direct) == c, ("mishandled cc_curr"));
>>>>  	cc_exec_curr(cc, direct) = NULL;
>>>> +	if (cc_exec_drain(cc, direct)) {
>>>> +		void (*drain)(void *);
>>>> +
>>>> +		drain = cc_exec_drain(cc, direct);
>>>> +		cc_exec_drain(cc, direct) = NULL;
>>>> +		CC_UNLOCK(cc);
>>>> +		drain(c_arg);
>>>> +		CC_LOCK(cc);
>>>> +	}
>>>>  	if (cc_exec_waiting(cc, direct)) {
>>>>  		/*
>>>>  		 * There is someone waiting for the
>>>> @@ -1145,7 +1157,7 @@ callout_schedule(struct callout *c, int
>>>>  }
>>>>
>>>>  int
>>>> -_callout_stop_safe(struct callout *c, int safe)
>>>> +_callout_stop_safe(struct callout *c, int safe, void (*drain)(void *))
>>>>  {
>>>>  	struct callout_cpu *cc, *old_cc;
>>>>  	struct lock_class *class;
>>>> @@ -1225,19 +1237,22 @@ again:
>>>>  	 * stop it by other means however.
>>>>  	 */
>>>>  	if (!(c->c_iflags & CALLOUT_PENDING)) {
>>>> -		c->c_flags &= ~CALLOUT_ACTIVE;
>>>> -
>>>>  		/*
>>>>  		 * If it wasn't on the queue and it isn't the current
>>>>  		 * callout, then we can't stop it, so just bail.
>>>> +		 * It probably has already been run (if locking
>>>> +		 * is properly done). You could get here if the caller
>>>> +		 * calls stop twice in a row for example. The second
>>>> +		 * call would fall here without CALLOUT_ACTIVE set.
>>>>  		 */
>>>> +		c->c_flags &= ~CALLOUT_ACTIVE;
>>>>  		if (cc_exec_curr(cc, direct) != c) {
>>>>  			CTR3(KTR_CALLOUT, "failed to stop %p func %p arg %p",
>>>>  			    c, c->c_func, c->c_arg);
>>>>  			CC_UNLOCK(cc);
>>>>  			if (sq_locked)
>>>>  				sleepq_release(&cc_exec_waiting(cc, direct));
>>>> -			return (0);
>>>> +			return (-1);
>>>>  		}
>>>>
>>>>  		if (safe) {
>>>> @@ -1298,14 +1313,16 @@ again:
>>>>  				CC_LOCK(cc);
>>>>  			}
>>>>  		} else if (use_lock &&
>>>> -			   !cc_exec_cancel(cc, direct)) {
>>>> +			   !cc_exec_cancel(cc, direct) && (drain == NULL)) {
>>>>
>>>>  			/*
>>>>  			 * The current callout is waiting for its
>>>>  			 * lock which we hold. Cancel the callout
>>>>  			 * and return. After our caller drops the
>>>>  			 * lock, the callout will be skipped in
>>>> -			 * softclock().
>>>> +			 * softclock(). This *only* works with a
>>>> +			 * callout_stop() *not* callout_drain() or
>>>> +			 * callout_async_drain().
>>>>  			 */
>>>>  			cc_exec_cancel(cc, direct) = true;
>>>>  			CTR3(KTR_CALLOUT, "cancelled %p func %p arg %p",
>>>> @@ -1351,11 +1368,17 @@ again:
>>>>  #endif
>>>>  			CTR3(KTR_CALLOUT, "postponing stop %p func %p arg %p",
>>>>  			    c, c->c_func, c->c_arg);
>>>> +			if (drain) {
>>>> +				cc_exec_drain(cc, direct) = drain;
>>>> +			}
>>>>  			CC_UNLOCK(cc);
>>>>  			return (0);
>>>>  		}
>>>>  		CTR3(KTR_CALLOUT, "failed to stop %p func %p arg %p",
>>>>  		    c, c->c_func, c->c_arg);
>>>> +		if (drain) {
>>>> +			cc_exec_drain(cc, direct) = drain;
>>>> +		}
>>>>  		CC_UNLOCK(cc);
>>>>  		KASSERT(!sq_locked, ("sleepqueue chain still locked"));
>>>>  		return (0);
>>>>
>>>> Modified: head/sys/sys/callout.h
>>>> ==============================================================================
>>>> --- head/sys/sys/callout.h Tue Nov 10 14:14:41 2015 (r290663)
>>>> +++ head/sys/sys/callout.h Tue Nov 10 14:49:32 2015 (r290664)
>>>> @@ -81,7 +81,7 @@ struct callout_handle {
>>>>   */
>>>>  #define callout_active(c) ((c)->c_flags & CALLOUT_ACTIVE)
>>>>  #define callout_deactivate(c) ((c)->c_flags &= ~CALLOUT_ACTIVE)
>>>> -#define callout_drain(c) _callout_stop_safe(c, 1)
>>>> +#define callout_drain(c) _callout_stop_safe(c, 1, NULL)
>>>>  void callout_init(struct callout *, int);
>>>>  void _callout_init_lock(struct callout *, struct lock_object *, int);
>>>>  #define callout_init_mtx(c, mtx, flags) \
>>>> @@ -119,10 +119,11 @@ int callout_schedule(struct callout *, i
>>>>  int callout_schedule_on(struct callout *, int, int);
>>>>  #define callout_schedule_curcpu(c, on_tick) \
>>>>  	callout_schedule_on((c), (on_tick), PCPU_GET(cpuid))
>>>> -#define callout_stop(c) _callout_stop_safe(c, 0)
>>>> -int _callout_stop_safe(struct callout *, int);
>>>> +#define callout_stop(c) _callout_stop_safe(c, 0, NULL)
>>>> +int _callout_stop_safe(struct callout *, int, void (*)(void *));
>>>>  void callout_process(sbintime_t now);
>>>> -
>>>> +#define callout_async_drain(c, d) \
>>>> +	_callout_stop_safe(c, 0, d)
>>>>  #endif
>>>>
>>>>  #endif /* _SYS_CALLOUT_H_ */
>>
>> --------
>> Randall Stewart
>> rrs@netflix.com
>> 803-317-4952
>>
>>
>>

--------
Randall Stewart
rrs@netflix.com
803-317-4952
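
For readers following the thread, here is a minimal userland sketch of the return-value convention discussed above. It assumes only the three outcomes described in the timeout.9 diff (1 when a pending callout is stopped, 0 when it is currently being serviced and cannot be stopped, -1 when it is not set or already serviced) and the documented behaviour that a zero return from callout_async_drain arranges for the drain function to be called with the callout's argument. Everything named model_* below, the two drop_ref_* callers and count_drain are hypothetical stand-ins invented for illustration; this is not the kernel code and ignores locking, CPUs and migration entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical states for a toy callout; not the kernel's representation. */
enum model_state { MODEL_IDLE, MODEL_PENDING, MODEL_RUNNING };

struct model_callout {
	enum model_state state;
	void *arg;			/* argument the handler was armed with */
	void (*drain)(void *);		/* deferred drain callback, if any */
};

/*
 * Toy stand-in for callout_stop()/callout_async_drain():
 *    1  pending and successfully stopped
 *    0  currently being serviced; drain (if non-NULL) is remembered
 *   -1  not set, or already serviced
 */
static int
model_stop(struct model_callout *c, void (*drain)(void *))
{
	switch (c->state) {
	case MODEL_PENDING:
		c->state = MODEL_IDLE;
		return (1);
	case MODEL_RUNNING:
		c->drain = drain;
		return (0);
	default:
		return (-1);
	}
}

/* Models the handler completing: call the remembered drain exactly once. */
static void
model_finish(struct model_callout *c)
{
	void (*drain)(void *) = c->drain;

	assert(c->state == MODEL_RUNNING);
	c->state = MODEL_IDLE;
	c->drain = NULL;
	if (drain != NULL)
		drain(c->arg);
}

/* Old-style caller: any non-zero return is taken as "timer cancelled". */
static int
drop_ref_if_nonzero(struct model_callout *c, int refcnt)
{
	if (model_stop(c, NULL))
		refcnt--;	/* also taken for -1, which is the problem */
	return (refcnt);
}

/* Adjusted caller: only a positive return means the timer was cancelled. */
static int
drop_ref_if_positive(struct model_callout *c, int refcnt)
{
	if (model_stop(c, NULL) > 0)
		refcnt--;
	return (refcnt);
}

/* Example drain callback: counts invocations through its argument. */
static void
count_drain(void *arg)
{
	(*(int *)arg)++;
}
```

In this model an idle callout handed to drop_ref_if_nonzero loses a reference that no timer was holding, which is at least consistent with the "bogus refcnt 0" panic reported in the thread, while drop_ref_if_positive leaves the count alone. Calling model_stop with count_drain on a running callout returns 0, and the counter only moves once model_finish runs.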