Message-ID: <4F0B0684.8040609@incunabulum.net>
Date: Mon, 09 Jan 2012 15:23:48 +0000
From: Bruce Simpson
To: John Baldwin
Cc: net@freebsd.org
Subject: Re: Deferring inp_freemoptions() to an asychronous task
References: <201112221115.10239.jhb@freebsd.org>
In-Reply-To: <201112221115.10239.jhb@freebsd.org>

John,

Sorry it's taken me so long to reply.

No objections in principle to your change, but this seems to point at a
more general issue with modern network controllers.

You've also stumbled on behaviour specific to how BSD has traditionally
dealt with broadcast/multicast sockets. The pcbinfo structure can't
really be disentangled from this. Of course, it doesn't help that we
have historically required these sockets to be bound to INADDR_ANY.

It might be useful to break reception out using a separate hash/tree,
rather than walking all sockets as is currently done, but legacy usage
needs to be supported. Interestingly enough, Microsoft has probably done
something similar, judging from what appears in MSDN.

John Baldwin wrote:
> I have a workload at work where a particular device driver can take a while to
> update its MAC filter table when adding or removing multicast link-layer
> addresses. One of the ways I've tackled fixing this is to change
> inp_freemoptions() so that it does all of its actual work asynchronously in a
> separate task. Currently it does its work synchronously; however, it can be
> invoked while the associated protocol holds a write lock on its pcbinfo lock
> (e.g. from in_pcbdetach() called from udp_detach()). This stalls all packet
> reception for that protocol since received packets need a read lock on the
> pcbinfo to look up the socket associated with a given (ip, port) tuple.

There is often a delay between asking for the group and actually getting
the hash filter entry set up in the MAC, so the operations are
asynchronous. Many applications seem to assume the operation is
instantaneous rather than deferred, which is probably naive. It is not
surprising that the same holds for taking down the hash filter entry.

thanks
Bruce