From owner-svn-src-all@FreeBSD.ORG Fri Feb 13 22:27:46 2015 Return-Path: Delivered-To: svn-src-all@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id B6F6E65D for ; Fri, 13 Feb 2015 22:27:46 +0000 (UTC) Received: from mail-pd0-f171.google.com (mail-pd0-f171.google.com [209.85.192.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 743EE39F for ; Fri, 13 Feb 2015 22:27:46 +0000 (UTC) Received: by pdbfl12 with SMTP id fl12so22122626pdb.2 for ; Fri, 13 Feb 2015 14:27:46 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=netflix.com; s=google; h=content-type:mime-version:subject:from:in-reply-to:date:cc :message-id:references:to; bh=rYFPt1hjx02a79I+zHK+Hiro5J6ZtyyGELeL1DBgHHs=; b=Hs9W8FonpsDskpCepsl3JyahvEx0rqFyV5Lsslx+jD05ztidE2JuIs50q9tcjljPiy XZkgqN9kX1DAcDlVzJQyHRHUGW0m7HYFzeINEHwOtxSRFjoJlQhyp0VkwJ75atXAV5ZQ tVfAJg0vm03LB/LE06yDecfnsl37HSwmqF5Lc= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:content-type:mime-version:subject:from :in-reply-to:date:cc:message-id:references:to; bh=rYFPt1hjx02a79I+zHK+Hiro5J6ZtyyGELeL1DBgHHs=; b=GW+5hSkfv25Bey2q37ofV4FujVSbqQ2GLNnHdov0zbZ6WklSFoyG+IwpgZMkV3jvWH h+fxeRMAzEKMtDN2j3SwGKb7D9x69hDp+NL9CnoZ1NMPFGG9tiqig3UbcBU/pm2TXfnN A4jm5jtLzTyjU1I6vsU24TK8Hv7c2uIuHPbczEwvkJkqE8DqNcS8gRSdME4XloVM6ff9 dl2t0Oomujj7vqPC3wahaE+g6lyhbT/sQGtZjfTb6jwFwQy0QuFPxJ1/tEKDhVnRTMKN KnCLI/a22NGVCW8KEnKN5htg8ABXSHE/dA8xK2XvHzS5NNxxyOgqjjS3LAFe+6eEUAuC A8rQ== X-Gm-Message-State: ALoCoQlKBBdW8/ShGDEtUd71WkdnPhZ8xd3grTWhZuiPKtKARj0YWQgWVo++SAzfi4McLCTj/rrR X-Received: by 10.69.27.42 with SMTP id jd10mr19020947pbd.97.1423866466011; Fri, 13 Feb 2015 14:27:46 -0800 (PST) Received: from [10.64.24.11] ([69.53.236.236]) by mx.google.com with ESMTPSA id c9sm7667877pdm.51.2015.02.13.14.27.42 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 13 Feb 2015 14:27:44 -0800 (PST) Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\)) Subject: Re: svn commit: r278472 - in head/sys: netinet netinet6 From: Randall Stewart In-Reply-To: <20150213212128.GG15484@FreeBSD.org> Date: Fri, 13 Feb 2015 17:27:40 -0500 Message-Id: <89BA29F2-C617-40DE-90E5-F3EC9C17AEC8@netflix.com> References: <201502091928.t19JSC5P066293@svn.freebsd.org> <1903622.i3cFFNVYcL@ralph.baldwin.cx> <20150213212128.GG15484@FreeBSD.org> To: Gleb Smirnoff X-Mailer: Apple Mail (2.1878.6) Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: src-committers@freebsd.org, John Baldwin , Eric van Gyzen , svn-src-all@freebsd.org, Randall Stewart , svn-src-head@freebsd.org, zont@FreeBSD.org, rstone@FreeBSD.org X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 13 Feb 2015 22:27:46 -0000 Gleb: Ok here is the diff of the arp timer function that this changed made = (238990): **************************** arptimer(void *arg) { + struct llentry *lle =3D (struct llentry *)arg; struct ifnet *ifp; - struct llentry *lle; - int pkts_dropped; + size_t pkts_dropped; =20 - KASSERT(arg !=3D NULL, ("%s: arg NULL", __func__)); - lle =3D (struct llentry *)arg; + if (lle->la_flags & LLE_STATIC) { + LLE_WUNLOCK(lle); + return; + } + ifp =3D lle->lle_tbl->llt_ifp; CURVNET_SET(ifp->if_vnet); + + if (lle->la_flags !=3D LLE_DELETED) { + int evt; + + if (lle->la_flags & LLE_VALID) + evt =3D LLENTRY_EXPIRED; + else + evt =3D LLENTRY_TIMEDOUT; + EVENTHANDLER_INVOKE(lle_event, lle, evt); + } + + callout_stop(&lle->la_timer); + + /* XXX: LOR avoidance. We still have ref on lle. */ + LLE_WUNLOCK(lle); IF_AFDATA_LOCK(ifp); LLE_WLOCK(lle); - if (lle->la_flags & LLE_STATIC) - LLE_WUNLOCK(lle); - else { - if (!callout_pending(&lle->la_timer) && - callout_active(&lle->la_timer)) { - callout_stop(&lle->la_timer); - LLE_REMREF(lle); =20 - if (lle->la_flags !=3D LLE_DELETED) { - int evt; - - if (lle->la_flags & LLE_VALID) - evt =3D LLENTRY_EXPIRED; - else - evt =3D LLENTRY_TIMEDOUT; - EVENTHANDLER_INVOKE(lle_event, lle, = evt); - } - - pkts_dropped =3D llentry_free(lle); - ARPSTAT_ADD(dropped, pkts_dropped); - ARPSTAT_INC(timeouts); - } else { -#ifdef DIAGNOSTIC - struct sockaddr *l3addr =3D L3_ADDR(lle); - log(LOG_INFO, - "arptimer issue: %p, IPv4 address: = \"%s\"\n", lle, - inet_ntoa( - ((const struct sockaddr_in = *)l3addr)->sin_addr)); -#endif - LLE_WUNLOCK(lle); - } - } + LLE_REMREF(lle); + pkts_dropped =3D llentry_free(lle); IF_AFDATA_UNLOCK(ifp); + ARPSTAT_ADD(dropped, pkts_dropped); + ARPSTAT_INC(timeouts); CURVNET_RESTORE(); } ****************************** And I can see *what* the problem was.. If you look at the lines: - if (!callout_pending(&lle->la_timer) && - callout_active(&lle->la_timer)) { This is the *incorrect* way to write this code it should have been: if (callout_pending(&lle->la_timer) && !callout_active(&lle->la_timer)) { To properly do the MPSAFE thing.. take a look at the callout manual.. So the original author just mixed up the test..=20 The idea is after you get the lock you check if its pending again, if so someone has restarted it.. and if its not active, then someone has stopped it. They check if it was *not* pending.. which is almost always the case since the callout code removes the pending flag and thats what you want to be there. So not pending would always be true..=20 I don=92t see that this old code did the callout_deactive().. but I = think the callout_stop() does that I guess.. I think the problem was that the original code was wrong and that the fix yes, avoided one race but put in another. I do think a callout_async_drain is the right thing, but that will take = a while to get right. Peter Holm (thank you so much) has been pounding on this, = if you could find another test to add.. that would be great. I think this will = work the way it is.. R On Feb 13, 2015, at 4:21 PM, Gleb Smirnoff wrote: > 165863 -------- Randall Stewart rrs@netflix.com 803-317-4952