From owner-freebsd-net@FreeBSD.ORG Wed May 23 19:57:12 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 15EBD106564A for ; Wed, 23 May 2012 19:57:12 +0000 (UTC) (envelope-from markjdb@gmail.com) Received: from mail-pz0-f54.google.com (mail-pz0-f54.google.com [209.85.210.54]) by mx1.freebsd.org (Postfix) with ESMTP id DE4BA8FC15 for ; Wed, 23 May 2012 19:57:11 +0000 (UTC) Received: by dadv36 with SMTP id v36so11056793dad.13 for ; Wed, 23 May 2012 12:57:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:subject:message-id:mime-version:content-type :content-disposition:user-agent; bh=zYPdKRwFNpKb2SgvLC1KTn4t1qYj0lOWgjfkqJn8Xug=; b=x7kXB0Mp1XiN6bFQAmSKlQhCMTTkdjaDNp4mvqpX9G+HIvRfW22BjeWESRvCVxV5Q7 lzsP8VsvumZWSP0OQxGi/miUefpwqxuJ2bq/JD/M97T553McVLeU6soCjytakpcNi2st sqGUUBNC1WYtmREAyLr0AtnIjaskiPBm6ekgI1ubG4rALi48o09XkxmF8B5L5yNPUr7j DH9ptct0ZIKu7JHFcvN65WPqFCa4rHX5M82RukkRCzkBuwBXwF9+SmhqkbcwM+z4lRUE PDmTtbCPtaK8bZN6elpTs9KG4q46qpJU46hVPUTiqUKviK4zZ3vfN20xR7OP2HLWfSff Xs4g== Received: by 10.68.226.228 with SMTP id rv4mr1459330pbc.167.1337803031490; Wed, 23 May 2012 12:57:11 -0700 (PDT) Received: from oddish.sandvine.com ([64.7.137.182]) by mx.google.com with ESMTPS id wn3sm2797991pbc.74.2012.05.23.12.57.10 (version=TLSv1/SSLv3 cipher=OTHER); Wed, 23 May 2012 12:57:10 -0700 (PDT) Date: Wed, 23 May 2012 15:58:27 -0400 From: Mark Johnston To: freebsd-net@freebsd.org Message-ID: <20120523195827.GA56015@oddish.sandvine.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) Subject: in_ifaddr initialization X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 May 2012 19:57:12 -0000 Hello, I've run into an issue related to the ordering of the steps in in.c:in_ifinit(). in_ifinit() gets called when an inet address gets added to an interface, and (loosely) it does the following: 1. Set the address of the in_ifaddr 2. Call the driver's ioctl handler (which generally seems to result in a chip reset, either directly or from ether_ioctl()). 3. Fill out the rest of the in_ifaddr and add a route if necessary. My problem occurs with step 2 - an ixgbe(4) reset takes upwards of 250ms according to DTrace, and during this time another userland program may try to get information about the addresses associated with the interface. In particular, in_ifinit() hasn't locked the in_ifaddr in any way at this point, so the userland program will get the new IP address, but a netmask of 0. This ends up causing some problems in our application: it frequently sends messages over a specific ixgbe interface, and checks to see if the interface's address has changed before doing so. It then tries to bind a socket to the broadcast address of the interface, which causes problems if the interface's address has recently changed during the 250ms window. Thus I'm wondering if there's any problem filling out the entire in_ifaddr before calling the driver's ioctl handler (as in the patch below). I realize that this doesn't completely eliminate the race, but it makes the window much smaller. I would like to fix it properly, but I was hoping someone more familiar with this code might be able to comment on the following: - Does the reordering below have any other unintended consequences? The only problem that I can see is in the case that if_ioctl returns an error: if the ia is not new (e.g. if the user updated the netmask alone), it will get updated before a possible error in the ioctl handler. But in practice this should never be a problem since the SIFADDR handlers never return an error. To be safe, the old ia could be saved at the beginning of in_ifinit() and restored in the case of an error, but I don't think that's necessary. - Is there any objection to addressing this problem at all? It's an unlikely interaction since most drivers do resets quite quickly, but it'd be nice to fix this I think. It seems reasonable to expect address changes to be done atomically, but I'm not sure. Thanks, -Mark diff --git a/sys/netinet/in.c b/sys/netinet/in.c index c1cbcb1..4996d47 100644 --- a/sys/netinet/in.c +++ b/sys/netinet/in.c @@ -819,16 +819,6 @@ in_ifinit(struct ifnet *ifp, struct in_ifaddr *ia, struct sockaddr_in *sin, return (error); /* - * Give the interface a chance to initialize - * if this is its first address, - * and to validate the address if necessary. - */ - if (ifp->if_ioctl != NULL && - (error = (*ifp->if_ioctl)(ifp, SIOCSIFADDR, (caddr_t)ia)) != 0) - /* LIST_REMOVE(ia, ia_hash) is done in in_control */ - return (error); - - /* * Be compatible with network classes, if netmask isn't supplied, * guess it based on classes. */ @@ -843,9 +833,7 @@ in_ifinit(struct ifnet *ifp, struct in_ifaddr *ia, struct sockaddr_in *sin, } ia->ia_subnet = i & ia->ia_subnetmask; in_socktrim(&ia->ia_sockmask); - /* - * Add route for the network. - */ + ia->ia_ifa.ifa_metric = ifp->if_metric; if (ifp->if_flags & IFF_BROADCAST) { if (ia->ia_subnetmask == IN_RFC3021_MASK) @@ -861,6 +849,20 @@ in_ifinit(struct ifnet *ifp, struct in_ifaddr *ia, struct sockaddr_in *sin, return (0); flags |= RTF_HOST; } + + /* + * Give the interface a chance to initialize + * if this is its first address, + * and to validate the address if necessary. + */ + if (ifp->if_ioctl != NULL && + (error = (*ifp->if_ioctl)(ifp, SIOCSIFADDR, (caddr_t)ia)) != 0) + /* LIST_REMOVE(ia, ia_hash) is done in in_control */ + return (error); + + /* + * Add route for the network. + */ if (!vhid && (error = in_addprefix(ia, flags)) != 0) return (error);