From owner-freebsd-stable@FreeBSD.ORG  Sun Jun 19 18:06:04 2005
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
X-Original-To: stable@freebsd.org
Delivered-To: freebsd-stable@FreeBSD.ORG
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id E2A6716A42F;
	Sun, 19 Jun 2005 18:06:02 +0000 (GMT)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [204.156.12.53])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 1371A43E84;
	Sun, 19 Jun 2005 18:03:56 +0000 (GMT)
	(envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [204.156.12.50])
	by cyrus.watson.org (Postfix) with ESMTP id 99BEC46BD1;
	Sun, 19 Jun 2005 14:03:55 -0400 (EDT)
Date: Sun, 19 Jun 2005 19:06:35 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: =?ISO-8859-1?Q?Eirik_=D8verby?= <ltning@anduin.net>
In-Reply-To: <8149D7F8-3FA2-48F5-BF03-9AF813448BF0@anduin.net>
Message-ID: <20050619185338.J6413@fledge.watson.org>
References: <8149D7F8-3FA2-48F5-BF03-9AF813448BF0@anduin.net>
MIME-Version: 1.0
Content-Type: MULTIPART/MIXED; BOUNDARY="0-1985640926-1119204395=:6413"
Cc: stable@freebsd.org, mlaier@FreeBSD.org
Subject: Re: NFS-related hang in 5.4?
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>, 
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 19 Jun 2005 18:06:07 -0000

  This message is in MIME format.  The first part should be readable text,
  while the remaining parts are likely unreadable without MIME-aware tools.

--0-1985640926-1119204395=:6413
Content-Type: TEXT/PLAIN; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: QUOTED-PRINTABLE


On Sun, 19 Jun 2005, Eirik Øverby wrote:

> when doing large file transfers (backing up jails using tar+gzip to a
> neighboring server), NFS has a tendency to lock up on me. This usually
> happens after quite a while - like a few hours or so. Also, before the
> hang, performance is generally bad.

Hmm.  Looks like a bug in dummynet.  ipfw should not be directly
re-injecting UDP traffic back into the input path from an outbound path,
or it risks re-entering, generating lock order problems, etc. It should be
getting dropped into the netisr queue to be processed from the netisr
context.
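
As a rough illustration of the re-entrance problem described above, here is a minimal user-space sketch in Python. It is not kernel code: the class ToyStack, its methods, and the single non-recursive lock are all hypothetical stand-ins, and whether the real output path holds the same mutex that udp_input wants is an assumption made only to keep the model small. The sketch contrasts calling the input path directly from the output path with handing the packet to a queue that is drained later from a separate context.

```python
import threading
from collections import deque


class ToyStack:
    """Toy model of an output path that may re-enter the input path.

    All names are hypothetical; this only models the locking pattern.
    """

    def __init__(self, defer):
        self.lock = threading.Lock()   # non-recursive, stands in for a protocol mutex
        self.defer = defer             # True: queue the packet, False: re-inject directly
        self.queue = deque()           # stands in for a netisr-style queue
        self.delivered = []
        self.reentered = False

    def udp_input(self, pkt):
        # Use a non-blocking acquire so the sketch reports the problem
        # instead of hanging the way a blocking lock would.
        if not self.lock.acquire(blocking=False):
            self.reentered = True
            return
        try:
            self.delivered.append(pkt)
        finally:
            self.lock.release()

    def shaper_release(self, pkt):
        # Stand-in for a traffic shaper releasing a delayed inbound packet.
        if self.defer:
            self.queue.append(pkt)     # processed later, outside the output path
        else:
            self.udp_input(pkt)        # direct re-injection from the output path

    def udp_output(self, pkt, pending_inbound):
        # The output path holds the lock while the shaper runs.
        with self.lock:
            self.shaper_release(pending_inbound)

    def run_queue(self):
        # Drain the queue from a context that holds no protocol locks.
        while self.queue:
            self.udp_input(self.queue.popleft())


direct = ToyStack(defer=False)
direct.udp_output("out", "in")

deferred = ToyStack(defer=True)
deferred.udp_output("out", "in")
deferred.run_queue()
```

In the direct case the input path finds the lock already held by its own caller; in the deferred case the packet is delivered once the queue is drained, after the output path has released the lock.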

Is it possible to configure dummynet out of your configuration, and see if
the problem goes away?

Robert N M Watson

>
> KDB trace:
>
> db> trace
> Tracing pid 56 tid 100064 td 0xc1a18600
> kdb_enter(c096bad3,4,480758,c08dcbf9,f5) at kdb_enter+0x30
> siointr1(c1a8e000,c1a18600,c1a148d4,c1a12700,c1a12700) at siointr1+0xe7
> siointr(c1a8e000,0,0,4,c1a18600) at siointr+0x78
> intr_execute_handlers(c19bd090,d54807bc,d5480818,c08d05a3,34) at intr_execute_handlers+0x88
> lapic_handle_intr(34) at lapic_handle_intr+0x3a
> Xapic_isr1() at Xapic_isr1+0x33
> --- interrupt, eip = 0xc06b8490, esp = 0xd5480800, ebp = 0xd5480818 ---
> _mtx_lock_sleep(c0a1cd2c,c1a18600,0,0,0) at _mtx_lock_sleep+0xb0
> udp_input(c2d40000,14,c1a99000,1,0) at udp_input+0x257
> ip_input(c2d40000,0,0,0,0) at ip_input+0x590
> transmit_event(c1c64100,20940000,0,c1d58a80,7f4220) at transmit_event+0x107
> ready_event_wfq(c1c64100,20940000,0,c1d58a80,c06d860a) at ready_event_wfq+0x511
> dummynet_io(c2bd2e00,64,1,d54809c8,c2bd2e00) at dummynet_io+0x519
> ipfw_check_out(0,d5480a24,c1a99000,2,c1d1821c) at ipfw_check_out+0xf1
> pfil_run_hooks(c0a1c160,d5480a9c,c1a99000,2,c1d1821c) at pfil_run_hooks+0x138
> ip_output(c2bd2e00,0,0,0,0) at ip_output+0x593
> udp_output(c1d1821c,c2bd2e00,0,0,c1a18600) at udp_output+0x597
> udp_send(c2242654,0,c1e12100,0,0) at udp_send+0x30
> sosend(c2242654,0,0,c1e12100,0) at sosend+0x6f1
> nfs_send(c2242654,c1d57860,c1e12100,c2313900,1c) at nfs_send+0xc9
> nfs_request(c22cf108,c1e12a00,7,0,c20bb300) at nfs_request+0x342
> nfs_writerpc(c22cf108,d5480ca4,c20bb300,d5480c94,d5480c98) at nfs_writerpc+0x2a0
> nfs_doio(cbf75e08,c20bb300,0,c094f9b4,0) at nfs_doio+0x508
> nfssvc_iod(c0a21828,d5480d38,0,0,0) at nfssvc_iod+0x1db
> fork_exit(c07c5150,c0a21828,d5480d38) at fork_exit+0x80
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0x1, eip = 0, esp = 0xd5480d6c, ebp = 0 ---
>
> I cannot seem to kill process 56 (nfsiod), so I have to reset the box.
>
> Anyone got a clue? What can I do to ease debugging here? Next time it happens
> I can probably make a dump, at least I will have a debug kernel running then.
>
> /Eirik
--0-1985640926-1119204395=:6413--