Date: Sat, 6 Jul 2002 15:54:19 -0700 (PDT)
Message-Id: <200207062254.g66MsJPj000565@vashon.polstra.com>
To: stable@freebsd.org
From: John Polstra
Subject: Re: NFS errors at high hz values with TCP mounts
In-Reply-To: <200207062205.g66M5n1X000473@vashon.polstra.com>
References: <200207062205.g66M5n1X000473@vashon.polstra.com>
Organization: Polstra & Co., Seattle, WA

In article <200207062205.g66M5n1X000473@vashon.polstra.com>,
John Polstra wrote:
> In article ,
> John Polstra wrote:
> > Here's what happens when I try to copy a 512 kbyte file from the
> > hz=10000 client to a server that is NFS-mounted:
> >
> >     thin$ dd if=/dev/zero of=/mnt/foo count=1000
> >     dd: /mnt/foo: Resource temporarily unavailable
> >     61+0 records in
> >     60+0 records out
> >     30720 bytes transferred in 0.000996 secs (30843571 bytes/sec)
>
> I forgot to mention that this message appears in the dmesg output on
> the client machine:
>
>     nfs send error 35 for server strings:/usr/home/jdp
>
> It comes from sys/nfs/nfs_socket.c line 499.

Sorry for the extended conversation with myself. :-)

I think I found the bug. In nfs_connect() at line 300 of
sys/nfs/nfs_socket.c we have this code:

    so->so_rcv.sb_timeo = (5 * hz);
    so->so_snd.sb_timeo = (5 * hz);

But sb_timeo, a member of struct sockbuf, has type "short", which
overflows when hz is 10000: 5 * 10000 = 50000, which exceeds the
maximum of 32767 that a 16-bit short can hold.

I don't think it would break binary compatibility with existing 3rd
party modules to change it to type "long". Are there any contrary
opinions?

John
-- 
John Polstra
John D. Polstra & Co., Inc.                        Seattle, Washington USA
"Disappointment is a good sign of basic intelligence."  -- Chögyam Trungpa
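
The arithmetic behind the overflow described in the message can be sketched in a few lines of C. This is only an illustration, not kernel code: the helper names (timeo_ticks, fits_in_short, store_in_short) are hypothetical, and the sketch assumes a 16-bit two's-complement short. Only the expression 5 * hz is taken from the message.

```c
#include <assert.h>
#include <limits.h>

/* Ticks requested for a 5-second timeout, i.e. the value of (5 * hz).
 * Hypothetical helper; computed in long so the product itself cannot overflow. */
static long timeo_ticks(int hz)
{
    assert(hz > 0);
    return 5L * hz;
}

/* Nonzero if the tick count can be stored in a field of type short. */
static int fits_in_short(long ticks)
{
    return ticks >= SHRT_MIN && ticks <= SHRT_MAX;
}

/* Value a 16-bit two's-complement short would hold after the assignment.
 * The wraparound is computed explicitly (assumption: short is 16 bits),
 * so the result does not rely on implementation-defined conversion. */
static short store_in_short(long ticks)
{
    unsigned long low = (unsigned long)ticks & 0xFFFFUL;  /* keep the low 16 bits */
    long wrapped = (low >= 0x8000UL) ? (long)low - 0x10000L : (long)low;
    return (short)wrapped;
}
```

Under these assumptions, timeo_ticks(10000) is 50000, fits_in_short() rejects it, and store_in_short() yields -15536, so the timeout would become negative instead of 5 seconds. The largest hz for which 5 * hz still fits is 6553; with the usual hz=100 the stored value is 500, as intended.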