Date:      Tue, 9 May 2017 20:36:11 +0000
From:      Colin Percival <cperciva@tarsnap.com>
To:        Roger Pau Monné <royger@FreeBSD.org>
Cc:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   Re: svn commit: r301198 - head/sys/dev/xen/netfront
Message-ID:  <0100015beeed2897-85c5b43b-da66-4c04-86ae-d4ebd3fd93e8-000000@email.amazonses.com>
In-Reply-To: <20170509100912.h3ylwugahvfi5cc2@dhcp-3-128.uk.xensource.com>
References:  <201606021116.u52BGajD047287@repo.freebsd.org> <0100015bccba6abc-4c3b1487-25e3-4640-8221-885341ece829-000000@email.amazonses.com> <20170509100912.h3ylwugahvfi5cc2@dhcp-3-128.uk.xensource.com>

On 05/09/17 03:09, Roger Pau Monné wrote:
> On Wed, May 03, 2017 at 05:13:40AM +0000, Colin Percival wrote:
>> On 06/02/16 04:16, Roger Pau Monné wrote:
>>> Author: royger
>>> Date: Thu Jun  2 11:16:35 2016
>>> New Revision: 301198
>>> URL: https://svnweb.freebsd.org/changeset/base/301198
>>
>> I think this commit is responsible for panics I'm seeing in EC2 on T2 family
>> instances.  [...]
>> but under high traffic volumes I think a separate thread can already be
>> running in xn_rxeof, having dropped the RX lock while it passes a packet
>> up the stack.  This would result in two different threads trying to process
>> the same set of responses from the ring, with (unsurprisingly) bad results.
> 
> Hm, right, xn_rxeof drops the lock while pushing the packet up the stack.
> There's a "XXX" comment on top of that, could you try to remove the lock
> dropping and see what happens?
> 
> I'm not sure there's any reason to drop the lock here, I very much doubt
> if_input is going to sleep.

Judging by
$ grep -R -B 1 -A 1 if_input /usr/src/sys/dev
I'm pretty sure that we do indeed need to drop the lock.  If it's possible
to enter if_input while holding locks, there are a *lot* of network interface
drivers which are dropping locks unnecessarily...

>> 3. Why xn_ifinit_locked is consuming ring responses.
> 
> There might be pending RX packets on the ring, so netfront consumes them and
> signals netback. In the unlikely event that the RX ring was full when
> xn_ifinit_locked is called, not doing this would mean the RX queue would get
> stuck forever, since there's no guarantee netfront will receive event channel
> notifications.

In that case, I'm guessing it would be safe to skip this if another thread is
already running xn_rxeof and chewing through the packets on the ring?  It
would be easy to set a flag in xn_rxeof before we drop locks.
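
A minimal user-space sketch of that flag idea follows. It is not the netfront
code: struct rxq_sketch, rx_active, rxeof_sketch and ifinit_sketch are
hypothetical names, a pthread mutex stands in for the RX lock, and a callback
stands in for if_input. It only illustrates the locking pattern, assuming the
flag is always read and written with the lock held.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an RX queue; not the real netfront structure. */
struct rxq_sketch {
	pthread_mutex_t	lock;		/* stands in for the RX lock */
	bool		rx_active;	/* a thread is inside rxeof_sketch() */
	unsigned	rsp_cons;	/* next response to consume */
	unsigned	rsp_prod;	/* responses produced by the backend */
	unsigned	delivered;	/* packets passed up the stack */
	unsigned	skipped;	/* init calls that found the ring busy */
};

/* Stand-in for if_input(); always called with the lock dropped. */
typedef void (*deliver_fn)(struct rxq_sketch *);

static void
rxq_sketch_init(struct rxq_sketch *q)
{
	pthread_mutex_init(&q->lock, NULL);
	q->rx_active = false;
	q->rsp_cons = q->rsp_prod = q->delivered = q->skipped = 0;
}

/*
 * Called with q->lock held and returns with it held.  Returns false
 * without touching the ring if another thread is already draining it.
 */
static bool
rxeof_sketch(struct rxq_sketch *q, deliver_fn deliver)
{
	if (q->rx_active)
		return (false);
	q->rx_active = true;		/* set before the lock is ever dropped */
	while (q->rsp_cons != q->rsp_prod) {
		q->rsp_cons++;
		pthread_mutex_unlock(&q->lock);	/* dropped around the upcall */
		if (deliver != NULL)
			deliver(q);
		pthread_mutex_lock(&q->lock);
		assert(q->rx_active);	/* nobody else may have taken over */
		q->delivered++;
	}
	q->rx_active = false;
	return (true);
}

/* Stand-in for the init path: drain the ring unless someone else is. */
static bool
ifinit_sketch(struct rxq_sketch *q)
{
	bool ran;

	pthread_mutex_lock(&q->lock);
	ran = rxeof_sketch(q, NULL);
	if (!ran)
		q->skipped++;
	pthread_mutex_unlock(&q->lock);
	return (ran);
}

/* Simulates the init path running while the lock is dropped for an upcall. */
static void
concurrent_init_sketch(struct rxq_sketch *q)
{
	(void)ifinit_sketch(q);
}
```

Because the draining thread re-checks rsp_cons against rsp_prod after it
re-takes the lock, responses that arrive while the init path is skipped are
still consumed by that thread in this sketch. Whether the same holds for the
notification to netback in the real driver would need to be checked there.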

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid


