From owner-freebsd-net@FreeBSD.ORG Sat Apr 19 11:17:26 2014
Delivered-To: freebsd-net@freebsd.org
Message-ID: <53525B43.9050602@oktetlabs.ru>
Date: Sat, 19 Apr 2014 15:17:23 +0400
From: Andrew Rybchenko
Organization: OKTET Labs
User-Agent: Mozilla/5.0 (X11; Linux i686 on x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0
MIME-Version: 1.0
To: aurfalien, freebsd-net@freebsd.org
Subject: Re: Solarflare LACP bug?
References: <2818B48D-3A2E-416A-875D-36DFD982D58A@gmail.com>
In-Reply-To: <2818B48D-3A2E-416A-875D-36DFD982D58A@gmail.com>
Content-Type: multipart/mixed;
 boundary="------------080705090505090908040007"
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD
X-List-Received-Date: Sat, 19 Apr 2014 11:17:26 -0000

This is a multi-part message in MIME format.
--------------080705090505090908040007
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 8bit

Hi,

On 04/16/2014 11:00 PM, aurfalien wrote:
> Hi,
>
> I’ve a Solarflare SFN5162F dual port 10Gb ethernet adapter.
>
> While the card works fine as individual ports, upon configuring LACP the machine suddenly reboots.
>
> Here are my commands;
>
> ifconfig sfxge0 up
> ifconfig sfxge1 up
> ifconfig lagg0 create
> * ifconfig lagg0 up laggproto lacp laggport sfxge0 laggport sfxge1 10.0.10.99/16
>
> * This is where the system reboots.

Please find the patch attached. It solves the problem for me.
I'll discuss it with Solarflare and then submit the patch to be pushed
to Subversion.

Regards,
Andrew.

> I believe this to be a bug, should i post this on freebsd-bugs@freebsd.org
>
> The only thing in /var/crash is minfree.
>
> - aurf
>
> "Janitorial Services"

--------------080705090505090908040007
Content-Type: text/x-patch;
 name="sfxge-lag-fix.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="sfxge-lag-fix.patch"

sfxge: check that port is started when MAC filter is set

MAC filter set may be called without softc_lock held in the case of
SIOCADDMULTI and SIOCDELMULTI ioctls. ioctl handler checks IFF_DRV_RUNNING
flag which implies port started, but it is not guaranteed to remain.
softc_lock shared lock can't be held in the case of these ioctls processing,
since it results in failure where kernel complains that non-sleepable
lock is held in sleeping thread.

Both problems are repeatable on LAG with LACP proto bring up.

Submitted by:   Andrew Rybchenko
Sponsored by:   Solarflare Communications, Inc.

diff -r 8dc01b10eb64 sys/dev/sfxge/sfxge_port.c
--- a/sys/dev/sfxge/sfxge_port.c	Tue Apr 15 10:32:43 2014 +0100
+++ b/sys/dev/sfxge/sfxge_port.c	Sat Apr 19 14:49:46 2014 +0400
@@ -357,10 +357,21 @@
 	struct sfxge_port *port = &sc->port;
 	int rc;
 
-	KASSERT(port->init_state == SFXGE_PORT_STARTED, ("port not started"));
-
 	mtx_lock(&port->lock);
-	rc = sfxge_mac_filter_set_locked(sc);
+	/*
+	 * The function may be called without softc_lock held in the
+	 * case of SIOCADDMULTI and SIOCDELMULTI ioctls. ioctl handler
+	 * checks IFF_DRV_RUNNING flag which implies port started, but
+	 * it is not guaranteed to remain. softc_lock shared lock can't
+	 * be held in the case of these ioctls processing, since it
+	 * results in failure where kernel complains that non-sleepable
+	 * lock is held in sleeping thread. Both problems are repeatable
+	 * on LAG with LACP proto bring up.
+	 */
+	if (port->init_state == SFXGE_PORT_STARTED)
+		rc = sfxge_mac_filter_set_locked(sc);
+	else
+		rc = 0;
 	mtx_unlock(&port->lock);
 	return rc;
 }

--------------080705090505090908040007--
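
For readers without a kernel tree at hand, the idea of the patch can be sketched in userland C. This is only an illustration under stated assumptions, not the driver code: `struct port`, `port_init()`, `mac_filter_set()` and the `filter_updates` counter are hypothetical stand-ins, and a pthread mutex takes the place of the kernel mutex. The point it shows is that the state is tested after the port lock is taken, and a port that is not started is treated as a successful no-op instead of tripping an assertion.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userland stand-ins for the driver's port state. */
enum port_state { PORT_STOPPED, PORT_STARTED };

struct port {
	pthread_mutex_t lock;		/* plays the role of port->lock */
	enum port_state init_state;
	int filter_updates;		/* counts applied filter updates */
};

static void
port_init(struct port *port)
{
	pthread_mutex_init(&port->lock, NULL);
	port->init_state = PORT_STOPPED;
	port->filter_updates = 0;
}

/* Stand-in for the *_locked() worker; the caller must hold port->lock. */
static int
mac_filter_set_locked(struct port *port)
{
	assert(port != NULL);
	port->filter_updates++;
	return (0);
}

/*
 * Check the port state under the lock. If the port was stopped
 * between the caller's own check and this call, do nothing and
 * report success rather than asserting.
 */
static int
mac_filter_set(struct port *port)
{
	int rc;

	pthread_mutex_lock(&port->lock);
	if (port->init_state == PORT_STARTED)
		rc = mac_filter_set_locked(port);
	else
		rc = 0;
	pthread_mutex_unlock(&port->lock);
	return (rc);
}
```

Calling `mac_filter_set()` on a freshly initialised (stopped) port returns 0 and leaves `filter_updates` untouched. Once `init_state` is set to `PORT_STARTED`, the same call applies the update.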