Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 25 Jan 2013 10:06:54 -0500 (EST)
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        freebsd-fs@freebsd.org, yongari@freebsd.org, Christian Gusenbauer <c47g@gmx.at>
Subject:   Re: 9.1-stable crashes while copying data from a NFS mounted directory
Message-ID:  <1436756651.2351478.1359126414802.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <20130125113606.GO2522@kib.kiev.ua>

next in thread | previous in thread | raw e-mail | index | archive | help
Konstantin Belousov wrote:
> On Thu, Jan 24, 2013 at 09:39:38PM -0500, Rick Macklem wrote:
> > John Baldwin wrote:
> > > On Thursday, January 24, 2013 4:22:12 pm Konstantin Belousov
> > > wrote:
> > > > On Thu, Jan 24, 2013 at 09:50:52PM +0100, Christian Gusenbauer
> > > > wrote:
> > > > > On Thursday 24 January 2013 20:37:09 Konstantin Belousov
> > > > > wrote:
> > > > > > On Thu, Jan 24, 2013 at 07:50:49PM +0100, Christian
> > > > > > Gusenbauer
> > > > > > wrote:
> > > > > > > On Thursday 24 January 2013 19:07:23 Konstantin Belousov
> > > > > > > wrote:
> > > > > > > > On Thu, Jan 24, 2013 at 08:03:59PM +0200, Konstantin
> > > > > > > > Belousov wrote:
> > > > > > > > > On Thu, Jan 24, 2013 at 06:05:57PM +0100, Christian
> > > > > > > > > Gusenbauer wrote:
> > > > > > > > > > Hi!
> > > > > > > > > >
> > > > > > > > > > I'm using 9.1 stable svn revision 245605 and I get
> > > > > > > > > > the
> > > > > > > > > > panic below
> > > > > > > > > > if I execute the following commands (as single
> > > > > > > > > > user):
> > > > > > > > > >
> > > > > > > > > > # swapon -a
> > > > > > > > > > # dumpon /dev/ada0s3b
> > > > > > > > > > # mount -u /
> > > > > > > > > > # ifconfig age0 inet 192.168.2.2 mtu 6144 up
> > > > > > > > > > # mount -t nfs -o rsize=32768 data:/multimedia /mnt
> > > > > > > > > > # cp /mnt/Movies/test/a.m2ts /tmp
> > > > > > > > > >
> > > > > > > > > > then the system panics almost immediately. I'll
> > > > > > > > > > attach
> > > > > > > > > > the stack
> > > > > > > > > > trace.
> > > > > > > > > >
> > > > > > > > > > Note, that I'm using jumbo frames (6144 byte) on a
> > > > > > > > > > 1Gbit
> > > > > > > > > > network,
> > > > > > > > > > maybe that's the cause for the panic, because the
> > > > > > > > > > bcopy
> > > > > > > > > > (see stack
> > > > > > > > > > frame #15) fails.
> > > > > > > > > >
> > > > > > > > > > Any clues?
> > > > > > > > >
> > > > > > > > > I tried a similar operation with the nfs mount of
> > > > > > > > > rsize=32768 and mtu
> > > > > > > > > 6144, but the machine runs HEAD and em instead of age.
> > > > > > > > > I
> > > > > > > > > was unable
> > > > > > > > > to reproduce the panic on the copy of the 5GB file
> > > > > > > > > from
> > > > > > > > > nfs mount.
> > > > > > >
> > > > > > > Hmmm, I did a quick test. If I do not change the MTU, so
> > > > > > > just
> > > > > > > configuring
> > > > > > > age0 with
> > > > > > >
> > > > > > > # ifconfig age0 inet 192.168.2.2 up
> > > > > > >
> > > > > > > then I can copy all files from the mounted directory
> > > > > > > without
> > > > > > > any
> > > > > > > problems, too. So it's probably age0 related?
> > > > > >
> > > > > > From your backtrace and the buffer printout, I see somewhat
> > > > > > strange thing.
> > > > > > The buffer data address is 0xffffff8171418000, while kernel
> > > > > > faulted
> > > > > > at the attempt to write at 0xffffff8171413000, which is is
> > > > > > lower
> > > > > > then
> > > > > > the buffer data pointer, at the attempt to bcopy to the
> > > > > > buffer.
> > > > > >
> > > > > > The other data suggests that there were no overflow of the
> > > > > > data
> > > > > > from the
> > > > > > server response. So it might be that mbuf_len(mp) returned
> > > > > > negative number
> > > > > > ? I am not sure is it possible at all.
> > > > > >
> > > > > > Try this debugging patch, please. You need to add INVARIANTS
> > > > > > etc
> > > > > > to the
> > > > > > kernel config.
> > > > > >
> > > > > > diff --git a/sys/fs/nfs/nfs_commonsubs.c
> > > > > > b/sys/fs/nfs/nfs_commonsubs.c
> > > > > > index efc0786..9a6bda5 100644
> > > > > > --- a/sys/fs/nfs/nfs_commonsubs.c
> > > > > > +++ b/sys/fs/nfs/nfs_commonsubs.c
> > > > > > @@ -218,6 +218,7 @@ nfsm_mbufuio(struct nfsrv_descript *nd,
> > > > > > struct uio
> > > > > > *uiop, int siz) }
> > > > > >  				mbufcp = NFSMTOD(mp, caddr_t);
> > > > > >  				len = mbuf_len(mp);
> > > > > > + KASSERT(len > 0, ("len %d", len));
> > > > > >  			}
> > > > > >  			xfer = (left > len) ? len : left;
> > > > > >  #ifdef notdef
> > > > > > @@ -239,6 +240,8 @@ nfsm_mbufuio(struct nfsrv_descript *nd,
> > > > > > struct uio
> > > > > > *uiop, int siz) uiop->uio_resid -= xfer;
> > > > > >  		}
> > > > > >  		if (uiop->uio_iov->iov_len <= siz) {
> > > > > > + KASSERT(uiop->uio_iovcnt > 1, ("uio_iovcnt %d",
> > > > > > + uiop->uio_iovcnt));
> > > > > >  			uiop->uio_iovcnt--;
> > > > > >  			uiop->uio_iov++;
> > > > > >  		} else {
> > > > > >
> > > > > > I thought that server have returned too long response, but
> > > > > > it
> > > > > > seems to
> > > > > > be not the case from your data. Still, I think the patch
> > > > > > below
> > > > > > might be
> > > > > > due.
> > > > > >
> > > > > > diff --git a/sys/fs/nfsclient/nfs_clrpcops.c
> > > > > > b/sys/fs/nfsclient/nfs_clrpcops.c index be0476a..a89b907
> > > > > > 100644
> > > > > > --- a/sys/fs/nfsclient/nfs_clrpcops.c
> > > > > > +++ b/sys/fs/nfsclient/nfs_clrpcops.c
> > > > > > @@ -1444,7 +1444,7 @@ nfsrpc_readrpc(vnode_t vp, struct uio
> > > > > > *uiop, struct
> > > > > > ucred *cred, NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED);
> > > > > >  			eof = fxdr_unsigned(int, *tl);
> > > > > >  		}
> > > > > > - NFSM_STRSIZ(retlen, rsize);
> > > > > > + NFSM_STRSIZ(retlen, len);
> > > > > >  		error = nfsm_mbufuio(nd, uiop, retlen);
> > > > > >  		if (error)
> > > > > >  			goto nfsmout;
> > > > >
> > I think this patch is appropriate, although I don't see it as too
> > critical. It just tightens the "sanity check" on the read reply
> > length (which should never exceed what the client requested).
> I agree, but client cannot control the server response.
Righto. However, the server can put the correct "size" at the beginning
of the reply and then follow it with the wrong amount of data. Those are
the cases that the EBADRPC reply from nfsm_mbufuio() hopefully catches.

But, I agree that changing "rsize" to "len" is correct in this case and
would catch the case where the server replies with too large a "size".

> Anyway, I
> think
> there is too much things that could go wrong if the server actively
> exploit the client code.
> 
> >
> > nfsm_mbufuio() shouldn't transfer more than the uio structure can
> > handle, even if the replied read size is larger than requested.
> Yes, this what happen, I suppose, due to the decrement of the
> uio_iovcnt
> and the EBADRPC error return at the beginning of loop. But IMO the
> situation should be catched and asserted instead. This is why I added
> KASSERT(uio_iovcnt > 1) before the decrement.
> 
> I do not think that we should both add my KASSERT for iovcnt and leave
> the EBADRPC return. What is your preference there ?
> 
Well, if a server sends a reply with "size == 16384", but then follows
it with 32768bytes of data, I don't think that should panic the client,
since it isn't a client bug. I think this is where the EBADRPC reply for
iovcnt will happen.
The reverse: A reply with "size == 32768", but followed by 16384bytes of
data, would hit the end of the mbuf list. I don't think that should be
a KASSERT() either since, again, it is a server bug and not a client one.

Now, the case where the mbuf list is bogus (bad m_len or ???) I see
the argument for KASSERT()s, since it is a bug somewhere in the client
machine and that bug may soon (or have already) corrupted other data
structures, even if there is no damage done for this case once caught.
I'm fine with KASSERT()s for these. (A KASSERT() panic with a message
is probably easier to debug than a bcopy() crash, although you did the
latter amazingly well;-)

> >
> > It does seem that nfsm_mbufuio() should apply a sanity check on
> > m_len. I think m_len == 0 is ok, but negative or very large should
> > be checked for. Maybe just return EBADRPC after a printf() instead
> > of a KASSERT(), as a safety belt against a trashed m_len from a
> > driver
> > or ???
> The same as with the overflowed size, it would only hide another bug
> in
> the kernel. Probably, the assert m_len >= 0 and other useful
> assertions
> should be performed in the central place in the network stack, to
> catch
> an error earlier and for all consumers.
> 
Yes. That sounds like a good idea to me. To be honest, there are a lot
of places where a bogus mbuf list can trash the NFS code, so catching
it in a central place would be good, I think.

Thanks for looking into this, rick

> >
> > rick
> >
> > > > > I applied your patches and now I get a
> > > > >
> > > > > panic: len -4
> > > > > cpuid = 1
> > > > > KDB: enter: panic
> > > > > Dumping 377 out of 6116
> > > > > MB:..5%..13%..22%..34%..43%..51%..64%..73%..81%..94%
> > > > >
> > > > This means that the age driver either produced corrupted mbuf
> > > > chain,
> > > > or filled wrong negative value into the mbuf len field. I am
> > > > quite
> > > > certain that the issue is in the driver.
> > > >
> > > > I added the net@ to Cc:, hopefully you could get help there.
> > >
> > > And I've cc'd Pyun who has written most of this driver and is
> > > likely
> > > the one
> > > most familiar with its handling of jumbo frames.
> > >
> > > --
> > > John Baldwin
> > > _______________________________________________
> > > freebsd-fs@freebsd.org mailing list
> > > http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> > > To unsubscribe, send any mail to
> > > "freebsd-fs-unsubscribe@freebsd.org"



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?1436756651.2351478.1359126414802.JavaMail.root>