From owner-freebsd-stable@FreeBSD.ORG  Mon Jun 20 12:12:38 2005
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
X-Original-To: stable@freebsd.org
Delivered-To: freebsd-stable@FreeBSD.ORG
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id ABE0016A41C
	for <stable@freebsd.org>; Mon, 20 Jun 2005 12:12:38 +0000 (GMT)
	(envelope-from ltning@anduin.net)
Received: from anduin.net (anduin.net [212.12.46.226])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 655D943D49
	for <stable@freebsd.org>; Mon, 20 Jun 2005 12:12:38 +0000 (GMT)
	(envelope-from ltning@anduin.net)
Received: from eirik.unicore.no ([213.225.74.166] helo=[10.0.16.10])
	by anduin.net with esmtpa (Exim 4.50 (FreeBSD)) id 1DkL8n-000C9C-3n
	for stable@freebsd.org; Mon, 20 Jun 2005 14:12:37 +0200
Resent-Message-Id: <D6049B6F-31D3-42E9-ADDC-C5C092C9AC78@anduin.net>
Mime-Version: 1.0 (Apple Message framework v730)
Content-Type: text/plain; charset=ISO-8859-1; delsp=yes; format=flowed
Resent-Date: Mon, 20 Jun 2005 14:12:02 +0200
Message-Id: <A9D88C9D-B3F4-4FD3-A210-06A59EA15787@anduin.net>
Content-Transfer-Encoding: quoted-printable
Resent-To: stable@freebsd.org
From: =?ISO-8859-1?Q?Eirik_=D8verby?= <ltning@anduin.net>
Resent-From: =?ISO-8859-1?Q?Eirik_=D8verby?= <ltning@anduin.net>
Date: Mon, 20 Jun 2005 10:53:19 +0200
To: Robert Watson <rwatson@FreeBSD.org>
X-Mailer: Apple Mail (2.730)
Cc: 
Subject: Re: NFS-related hang in 5.4?
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>, 
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 20 Jun 2005 12:12:38 -0000

On 20 Jun 2005, at 10:38, Robert Watson wrote:


>
> On Mon, 20 Jun 2005, Eirik Øverby wrote:
>
>
>
>>> Hmm.  Looks like a bug in dummynet.  ipfw should not be directly
>>> re-injecting UDP traffic back into the input path from an
>>> outbound path, or it risks re-entering, generating lock order
>>> problems, etc. It should be getting dropped into the netisr queue
>>> to be processed from the netisr context.
>>>
>>>
>>
>> This problem would exist across all 5.4 installations, both i386
>> and amd64? Would it depend on heavy load, or could it
>> theoretically happen at any time when there's traffic? All three
>> of my fbsd5 servers (dual opteron, dual p3-1ghz, dual p3-700mhz)
>> are experiencing random hangs a few weeks apart; my impression
>> is that they are all stable when running in single-CPU mode.
>> All are using dummynet in a comparable manner. Ideas?
>>
>>
>
> Yes.  Basically, the network stack avoids recursion in processing
> for "complicated" packets by deferring processing of an offending
> packet to a thread called the 'netisr'.  Whenever the stack reaches
> a possible recursion point on a packet, it's supposed to queue the
> packet for processing 'later' in a per-protocol queue, unwind, and
> then when the netisr runs, pick up and continue processing.  In the
> stack trace you provide, dummynet appears to immediately invoke the
> in-bound network path from the out-bound network path, walking back
> into the network stack from the outbound path.  This is generally
> forbidden, for a variety of reasons:
>
> - We do allow the in-bound path to call the out-bound path, so that
>   protocols like TCP, and services like NFS can turn around packets
>   without a context switch.  If further recursion is permitted, the
>   stack may overflow.
>
> - Both paths may hold network stack locks over calls in either
>   direction -- specifically, we allow protocol locks to be held over
>   calls into the socket layer, as the protocol layer drives
>   operation; if a recursive call is made, deadlocks can occur due to
>   violating the lock order.  This is what is happening in your case.
>
> Pretty much all network code is entirely architecture-independent,
> so bugs typically span architectures, although race conditions can
> sometimes be hard to reproduce if they require precise timing and
> multiple processors.
>

So I'm lucky to have seen this one... Great ;)
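
To make the deferral idea above concrete, here is a toy model in
Python. It is only a sketch and not FreeBSD code: the class ToyStack,
its methods, the simulate() helper and the "hops" field are all
invented for illustration, and real netisr dispatch is considerably
more involved. It compares re-injecting a packet into the input path
directly from the output path with queueing it and letting a
netisr-like loop pick it up after the caller has unwound.

```python
from collections import deque


class ToyStack:
    """Hypothetical model of input/output paths with a deferred queue."""

    def __init__(self, defer):
        self.defer = defer        # True: queue re-injected packets; False: recurse
        self.queue = deque()      # stands in for a per-protocol netisr queue
        self.depth = 0            # current call nesting
        self.max_depth = 0        # deepest nesting observed
        self.delivered = []       # ids of packets that reached their endpoint

    def _enter(self):
        self.depth += 1
        self.max_depth = max(self.max_depth, self.depth)

    def _leave(self):
        self.depth -= 1

    def input(self, pkt):
        """In-bound path: deliver, or turn the packet around via output()."""
        self._enter()
        try:
            if pkt["hops"] == 0:
                self.delivered.append(pkt["id"])
            else:
                self.output(pkt)
        finally:
            self._leave()

    def output(self, pkt):
        """Out-bound path: the packet has to go back into the input path."""
        self._enter()
        try:
            nxt = {"id": pkt["id"], "hops": pkt["hops"] - 1}
            if self.defer:
                self.queue.append(nxt)   # queue it and unwind
            else:
                self.input(nxt)          # direct re-entry: nesting keeps growing
        finally:
            self._leave()

    def run_netisr(self):
        """Drain the queue from a fresh context, one packet at a time."""
        while self.queue:
            self.input(self.queue.popleft())


def simulate(defer, hops):
    """Push one packet through and report (max nesting, delivered ids)."""
    stack = ToyStack(defer)
    stack.input({"id": 1, "hops": hops})
    stack.run_netisr()
    return stack.max_depth, stack.delivered


print(simulate(False, 5))
print(simulate(True, 5))
```

With direct re-entry the nesting grows with every turn-around (11
levels for 5 hops in this model), while the queued variant never goes
deeper than 2 and still delivers the packet. The model only shows the
unbounded-recursion half of the problem; it does not try to reproduce
the lock order deadlock.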


>>> Is it possible to configure dummynet out of your configuration,
>>> and see if the problem goes away?
>>>
>>>
>>
>> I'm running a test right now, will let you know in the morning.
>>
>>
>
> Thanks.
>

I know enough not to call this a "confirmation", but disabling
dummynet did indeed allow me to finish the backup. I never made it
past 15 GB before; now the full 19 GB tar.gz file is done, and the
boxes are both still running. The funny thing is that I only disabled
dummynet on one of the boxes: the source of the backup, the box
that pushes data. The other box has pretty much 100% the same setup,
and is also i386. But as traffic shaping can only happen on outgoing
packets, I suppose that makes sense.

I can re-run the test if you wish, in order to gather more data
points. It's just too bad it takes a while ;)


/Eirik


>
> Robert N M Watson
>