Date:      Thu, 29 Jul 1999 16:44:18 -0400 (EDT)
From:      Bill Paul <wpaul@skynet.ctr.columbia.edu>
To:        dillon@apollo.backplane.com (Matthew Dillon)
Cc:        peter@netplex.com.au, crossd@cs.rpi.edu, current@FreeBSD.ORG
Subject:   Re: IRIX 6.5.4 NFS v3 TCP client + FreeBSD server = bewm
Message-ID:  <199907292044.QAA17186@skynet.ctr.columbia.edu>
In-Reply-To: <199907291725.KAA76001@apollo.backplane.com> from "Matthew Dillon" at Jul 29, 99 10:25:46 am

Of all the gin joints in all the towns in all the world, Matthew Dillon 
had to walk into mine and say:

> :Yes, we do.  I've run into this problem elsewhere but a quick fix was needed
> :so it just got hacked.  NT NFS clients tend to trigger it too.
> :
> :The problem is that the sanity check is a fair way away from where the problem
> :packet is generated.  The bad reply is generated in the readdirplus routine,
> :gets replied (without checking) and cached.  The client drops the (oversize)
> :packet, resends, and the nfsd replies from the cache and this time hits
> :the sanity check and panics.
> :
> :...
> :
> :I will have another look shortly.  Anyway, the clue is that the server
> :readdirplus routine is the apparent culprit.
> :
> :Cheers,
> :-Peter
> 
>     This makes a lot of sense.  A report of du causing the panic, and
>     the good possibility that readdirplus is caching an oversized response
>     packet.  Tell me what you come up with!  I'll take a crack at it if you
>     don't find anything.

Caching doesn't enter into it. The problem is bad arithmetic.

In /sys/nfs/nfs_serv.c:nfsrv_readdirplus(), we have the following
code:

                        /*
                         * If either the dircount or maxcount will be
                         * exceeded, get out now. Both of these lengths
                         * are calculated conservatively, including all
                         * XDR overheads.
                         */
                        len += (7 * NFSX_UNSIGNED + nlen + rem + NFSX_V3FH +
                                NFSX_V3POSTOPATTR);
                        dirlen += (6 * NFSX_UNSIGNED + nlen + rem);
                        if (len > cnt || dirlen > fullsiz) {
                                eofflag = 0;
                                break;
                        }


I observed that the value of "len" didn't agree with the actual amount
of data being consumed in the mbuf chain. It turns out that each
time through the loop, len is being incremented by 4 bytes too little.
In other words, 7 * NFSX_UNSIGNED should really be 8 * NFSX_UNSIGNED.
When I change 7 to 8, I no longer get oversized replies and everything
adds up.
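
To illustrate how a 4-byte undercount per entry turns into an oversized
reply, here is a minimal sketch of the same accounting pattern. It is not
the kernel code: bytes_emitted() and its parameters are hypothetical, the
fixed reply overhead is ignored, and "other" stands in for everything in
an entry that is not one of the counted longwords.

```c
#include <stddef.h>

#define NFSX_UNSIGNED 4   /* one XDR longword, in bytes */

/*
 * Hypothetical model of the loop's accounting: each entry is charged
 * words_charged longwords against the limit cnt, but really occupies
 * words_used longwords. "other" is the rest of the entry (name bytes,
 * attributes, file handle) and must be nonzero so the loop terminates.
 * Returns the number of bytes actually placed in the reply.
 */
static size_t
bytes_emitted(size_t cnt, size_t words_charged, size_t words_used,
    size_t other)
{
        size_t len = 0;         /* what the sanity check believes */
        size_t actual = 0;      /* what really went into the reply */

        for (;;) {
                len += words_charged * NFSX_UNSIGNED + other;
                if (len > cnt)
                        break;  /* same shape as the check quoted above */
                actual += words_used * NFSX_UNSIGNED + other;
        }
        return (actual);
}
```

With cnt = 1280 and other = 100, charging 7 longwords admits 10 entries
and emits 1320 bytes, which is over the limit; charging 8 admits 9
entries and emits 1188 bytes.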

This sanity code is trying to add up the amount of data that the
entryplus3 for each directory entry consumes in the reply. The entryplus3
is defined in nfs_prot.x like this:

struct entryplus3 {
        fileid3         fileid;
        filename3       name;
        cookie3         cookie;
        post_op_attr    name_attributes;
        post_op_fh3     name_handle;
        entryplus3      *nextentry;
};

Unfortunately I haven't been able to wrap my brain around how this is
being counted up for the "len" calculation. Whatever it's doing, it's
off by 4 bytes. Possibly somebody forgot that "filename3" is a string,
which in XDR format consists of the string bytes, plus padding to a longword
boundary, *plus* a longword length value. Some comments would have been
useful here. (Hint, hint.)
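
For what it's worth, here is one way to tally the longwords, as a sketch
based on my reading of the XDR encoding in RFC 1813 rather than on
anything confirmed in nfs_serv.c. The helper names are hypothetical. It
assumes attrlen already includes the 4-byte attributes_follow flag (which
I believe is how NFSX_V3POSTOPATTR is defined) and that the file handle
length is passed in rather than taken from NFSX_V3FH.

```c
#include <stddef.h>

#define XDR_UNIT 4      /* XDR encodes everything in 4-byte units */

/* Round n up to the next multiple of XDR_UNIT. */
static size_t
xdr_pad(size_t n)
{
        return ((n + XDR_UNIT - 1) & ~(size_t)(XDR_UNIT - 1));
}

/* An XDR string: a longword length, then the bytes, then padding. */
static size_t
xdr_string_size(size_t nlen)
{
        return (XDR_UNIT + xdr_pad(nlen));
}

/*
 * Hypothetical tally of one entryplus3 as it appears in the reply:
 *   1 longword   "value follows" marker for the list link (nextentry)
 *   2 longwords  fileid3 (64 bits)
 *   1 longword   filename3 length
 *   2 longwords  cookie3 (64 bits)
 *   1 longword   post_op_fh3 handle_follows flag
 *   1 longword   nfs_fh3 opaque length
 * for 8 longwords in all, plus the padded name bytes, the padded file
 * handle bytes, and the attributes (flag included in attrlen).
 */
static size_t
entryplus3_size(size_t nlen, size_t fhlen, size_t attrlen)
{
        return (8 * XDR_UNIT + xdr_pad(nlen) + xdr_pad(fhlen) + attrlen);
}
```

For example, a 5-byte name with a 32-byte handle and 88 bytes of
attributes comes to 32 + 8 + 32 + 88 = 160 bytes. If dircount is meant to
exclude the attributes and the file handle, as RFC 1813 describes it,
dropping the two handle longwords from this tally leaves 6, which would
match the dirlen line, but that is an inference and not something tested
here.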

What I don't know is whether the calculation for dirlen is also
wrong. Now that I've shown everyone the light, maybe somebody can
tell me for sure.

-Bill

-- 
=============================================================================
-Bill Paul            (212) 854-6020 | System Manager, Master of Unix-Fu
Work:         wpaul@ctr.columbia.edu | Center for Telecommunications Research
Home:  wpaul@skynet.ctr.columbia.edu | Columbia University, New York City
=============================================================================
 "It is not I who am crazy; it is I who am mad!" - Ren Hoek, "Space Madness"
=============================================================================

