From owner-freebsd-current Thu Jul 29 13:44:28 1999
Delivered-To: freebsd-current@freebsd.org
Received: from skynet.ctr.columbia.edu (skynet.ctr.columbia.edu [128.59.64.70])
	by hub.freebsd.org (Postfix) with SMTP id BDEA91511D
	for ; Thu, 29 Jul 1999 13:44:15 -0700 (PDT)
	(envelope-from wpaul@skynet.ctr.columbia.edu)
Received: (from wpaul@localhost) by skynet.ctr.columbia.edu (8.6.12/8.6.9)
	id QAA17186; Thu, 29 Jul 1999 16:44:19 -0400
From: Bill Paul
Message-Id: <199907292044.QAA17186@skynet.ctr.columbia.edu>
Subject: Re: IRIX 6.5.4 NFS v3 TCP client + FreeBSD server = bewm
To: dillon@apollo.backplane.com (Matthew Dillon)
Date: Thu, 29 Jul 1999 16:44:18 -0400 (EDT)
Cc: peter@netplex.com.au, crossd@cs.rpi.edu, current@FreeBSD.ORG
In-Reply-To: <199907291725.KAA76001@apollo.backplane.com> from "Matthew Dillon" at Jul 29, 99 10:25:46 am
X-Mailer: ELM [version 2.4 PL24]
Content-Type: text
Content-Length: 3710
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Of all the gin joints in all the towns in all the world, Matthew Dillon
had to walk into mine and say:

> :Yes, we do. I've run into this problem elsewhere but a quick fix was needed
> :so it just got hacked. NT NFS clients tend to trigger it too.
> :
> :The problem is that the sanity check is a fair way away from where the problem
> :packet is generated. The bad reply is generated in the readdirplus routine,
> :gets replied (without checking) and cached. The client drops the (oversize)
> :packet, resends, and the nfsd replies from the cache and this time hits
> :the sanity check and panics.
> :
> :...
> :
> :I will have another look shortly. Anyway, the clue is that the server
> :readdirplus routine is the apparent culprit.
> :
> :Cheers,
> :-Peter
>
>     This makes a lot of sense. A report of du causing the panic, and
>     the good possibility that readdirplus is caching an oversized response
>     packet. Tell me what you come up with! I'll take a crack at it if you
>     don't find anything.
Caching doesn't enter into it. The problem is bad arithmetic. In
/sys/nfs/nfs_serv.c:nfsrv_readdirplus(), we have the following code:

		/*
		 * If either the dircount or maxcount will be
		 * exceeded, get out now. Both of these lengths
		 * are calculated conservatively, including all
		 * XDR overheads.
		 */
		len += (7 * NFSX_UNSIGNED + nlen + rem + NFSX_V3FH +
			NFSX_V3POSTOPATTR);
		dirlen += (6 * NFSX_UNSIGNED + nlen + rem);
		if (len > cnt || dirlen > fullsiz) {
			eofflag = 0;
			break;
		}

I observed that the value of "len" didn't agree with the actual amount
of data being consumed in the mbuf chain. It turns out that each time
through the loop, len is being incremented by 4 bytes too little. In
other words, 7 * NFSX_UNSIGNED should really be 8 * NFSX_UNSIGNED. When
I change 7 to 8, I no longer get oversized replies and everything adds up.

This sanity code is trying to add up the amount of data consumed by the
entryplus3 that gets generated for each directory entry. The entryplus3
is defined in nfs_prot.x like this:

struct entryplus3 {
	fileid3 fileid;
	filename3 name;
	cookie3 cookie;
	post_op_attr name_attributes;
	post_op_fh3 name_handle;
	entryplus3 *nextentry;
};

Unfortunately I haven't been able to wrap my brain around how this is
being counted up for the "len" calculation. Whatever it's doing, it's
off by 4 bytes. Possibly somebody forgot that "filename3" is a string,
which in XDR format consists of the string bytes, plus padding to a
longword boundary, *plus* a longword length value. Some comments would
have been useful here. (Hint, hint.)

What I don't know is whether the calculation for dirlen is also wrong.
Now that I've shown everyone the light, maybe somebody can tell me for
sure.
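The per-entry accounting can be checked independently of the kernel
code. The following is a minimal sketch, not the nfs_serv.c code: it adds
up the XDR size of one entryplus3 field by field, following the RFC 1813
definitions, and compares the result with the closed-form expression
using 7 or 8 longwords. The names XDR_UNIT, FATTR3_BYTES, FH3_MAX,
xdr_roundup(), entryplus3_wire_len() and formula_len() are made up for
illustration. It assumes that NFSX_UNSIGNED is 4, that NFSX_V3POSTOPATTR
covers the attributes_follow flag plus an 84-byte fattr3, and that the
file handle length is already a multiple of 4.

```c
#include <assert.h>
#include <stddef.h>

#define XDR_UNIT     4   /* one XDR longword (assumed equal to NFSX_UNSIGNED) */
#define FATTR3_BYTES 84  /* fixed-size fattr3 as laid out in RFC 1813 */
#define FH3_MAX      64  /* maximum NFSv3 file handle size (NFS3_FHSIZE) */

/* Round n up to the next multiple of XDR_UNIT (covers "nlen + rem"). */
static size_t xdr_roundup(size_t n)
{
	return (n + XDR_UNIT - 1) & ~(size_t)(XDR_UNIT - 1);
}

/* Field-by-field wire size of one entryplus3 in a READDIRPLUS reply. */
static size_t entryplus3_wire_len(size_t nlen, size_t fhlen)
{
	size_t len = 0;

	assert(fhlen <= FH3_MAX);
	len += XDR_UNIT;                      /* "value follows" flag for the list pointer */
	len += 2 * XDR_UNIT;                  /* fileid3 (64-bit) */
	len += XDR_UNIT + xdr_roundup(nlen);  /* filename3: length + bytes + padding */
	len += 2 * XDR_UNIT;                  /* cookie3 (64-bit) */
	len += XDR_UNIT + FATTR3_BYTES;       /* post_op_attr: flag + fattr3 */
	len += XDR_UNIT;                      /* post_op_fh3: handle_follows flag */
	len += XDR_UNIT + xdr_roundup(fhlen); /* nfs_fh3 opaque: length + data */
	return len;
}

/*
 * Closed-form expression in the style of the "len +=" line, with the
 * number of longwords as a parameter (7 in the original, 8 in the fix).
 * Assumes fhlen is a multiple of 4, as a fixed-size handle would be.
 */
static size_t formula_len(size_t units, size_t nlen, size_t fhlen)
{
	return units * XDR_UNIT + xdr_roundup(nlen) + fhlen +
	    (XDR_UNIT + FATTR3_BYTES);
}
```

With a 5-character name and a 32-byte handle (both arbitrary example
values), entryplus3_wire_len(5, 32) comes to 160 bytes,
formula_len(8, 5, 32) agrees, and formula_len(7, 5, 32) is 4 bytes
short, which is consistent with the discrepancy described above.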
-Bill

-- 
=============================================================================
-Bill Paul            (212) 854-6020 | System Manager, Master of Unix-Fu
Work:         wpaul@ctr.columbia.edu | Center for Telecommunications Research
Home:  wpaul@skynet.ctr.columbia.edu | Columbia University, New York City
=============================================================================
 "It is not I who am crazy; it is I who am mad!" - Ren Hoek, "Space Madness"
=============================================================================