Date:      Sun, 4 Jun 2017 23:19:49 +0000
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Bryan Drewery <bdrewery@FreeBSD.org>
Cc:        FreeBSD FS <freebsd-fs@FreeBSD.org>
Subject:   Re: NFS panic: newnfs_copycred: negative nfsc_ngroups
Message-ID:  <YTXPR01MB0189942CAB532478E7002855DDF50@YTXPR01MB0189.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <c8df390d-590f-5843-8f62-b284d295240a@FreeBSD.org>
References:  <2b7a77df-8291-d399-6d1f-c454fbb2a5d9@FreeBSD.org>, <c8df390d-590f-5843-8f62-b284d295240a@FreeBSD.org>


[-- Attachment #1 --]
You could try the attached little patch (untested) and see if the panics
go away. It is weird that no one else seems to see this, but I can see that
it might be possible for the code to create an open structure and not
initialize the nfso_cred structure in it. This patch makes sure it is set to
the credentials at the time the open is created, which I think is harmless.

If this stops the crashes, I can easily come up with a better patch for this
and commit it to head.
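
One observation that seems to fit the uninitialized-structure theory: every field in the kgdb dump of *nfscr quoted below holds the same 32-bit value. The sketch below just reinterprets those decimal values as raw bits. It assumes the junk pattern is 0xdeadc0de, which as far as I know is what INVARIANTS kernels write into freed memory, and the helper names (looks_like_junk, raw_bits, describe) are made up for this illustration and are not part of the NFS code.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Assumed junk-fill pattern. It is chosen here because it matches the
 * values in the dump; treat the link to INVARIANTS as an assumption.
 */
#define JUNK_PATTERN	0xdeadc0deU

/* Return 1 if a 32-bit field, viewed as raw bits, equals the pattern. */
static int
looks_like_junk(uint32_t raw)
{
	return (raw == JUNK_PATTERN);
}

/* Reinterpret a signed value (as kgdb printed nfsc_ngroups) as raw bits. */
static uint32_t
raw_bits(int32_t v)
{
	return ((uint32_t)v);
}

/* Print one field in hex and flag it if it matches the pattern. */
static void
describe(const char *name, uint32_t raw)
{
	printf("%-13s 0x%08x%s\n", name, (unsigned)raw,
	    looks_like_junk(raw) ? "  <- junk pattern" : "");
}
```

Calling describe("nfsc_uid", 3735929054U) and describe("nfsc_ngroups", raw_bits(-559038242)) flags both fields, which suggests the structure was never written after allocation rather than being corrupted by a stray store.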

rick
________________________________________
From: Bryan Drewery <bdrewery@FreeBSD.org>
Sent: Saturday, June 3, 2017 8:24:11 PM
To: Rick Macklem
Cc: FreeBSD FS
Subject: Re: NFS panic: newnfs_copycred: negative nfsc_ngroups

On 6/3/2017 3:43 PM, Bryan Drewery wrote:
> Last reported here, but I forgot to follow up:
> https://lists.freebsd.org/pipermail/freebsd-current/2013-July/042996.html
>
> I still get this quite often.
>
> Server is: 10.2-RELEASE-p2
>
> Client is: 12.0-CURRENT #5 r318116M
>
> mount is (no soft or intr since 2013):
>
>> tank:/tank/distfiles/freebsd                        /mnt/distfiles      nfs             rw,bg,noatime,rsize=65536,wsize=65536,readahead=8,nfsv4,rdirplus    0       0
>
> I have a core for debugging...
>> (kgdb) bt
>> #0  __curthread () at ./machine/pcpu.h:232
>> #1  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:318
>> #2  0xffffffff803abf3c in db_fncall_generic (addr=<optimized out>, rv=<optimized out>, nargs=<optimized out>, args=<optimized out>) at /usr/src/sys/ddb/db_command.c:581
>> #3  db_fncall (dummy1=<optimized out>, dummy2=<optimized out>, dummy3=<optimized out>, dummy4=<optimized out>) at /usr/src/sys/ddb/db_command.c:629
>> #4  0xffffffff803abaaf in db_command (last_cmdp=<optimized out>, cmd_table=<optimized out>, dopager=<optimized out>) at /usr/src/sys/ddb/db_command.c:453
>> #5  0xffffffff803ab7e4 in db_command_loop () at /usr/src/sys/ddb/db_command.c:506
>> #6  0xffffffff803ae89f in db_trap (type=<optimized out>, code=<optimized out>) at /usr/src/sys/ddb/db_main.c:248
>> #7  0xffffffff80a9fda3 in kdb_trap (type=3, code=-61456, tf=<optimized out>) at /usr/src/sys/kern/subr_kdb.c:654
>> #8  0xffffffff80ee9286 in trap (frame=0xfffffe355f840540) at /usr/src/sys/amd64/amd64/trap.c:537
>> #9  <signal handler called>
>> #10 kdb_enter (why=0xffffffff81455661 "panic", msg=<optimized out>) at /usr/src/sys/kern/subr_kdb.c:444
>> #11 0xffffffff80a5d759 in vpanic (fmt=<optimized out>, ap=0xfffffe355f8406d0) at /usr/src/sys/kern/kern_shutdown.c:772
>> #12 0xffffffff80a5d59f in _kassert_panic (fatal=1, fmt=0xffffffff81434d8b "newnfs_copycred: negative nfsc_ngroups") at /usr/src/sys/kern/kern_shutdown.c:669
>> #13 0xffffffff80946ec2 in newnfs_copycred (nfscr=0xfffff8047b3eb530, cr=0xfffff80122cfa500) at /usr/src/sys/fs/nfs/nfs_commonport.c:244
>> #14 0xffffffff8094bddc in nfscl_getstateid (vp=<optimized out>, nfhp=0xfffff80501233902 "\233\262\tM\336\006\236\313\n", fhlen=28, mode=1, fords=<optimized out>, cred=<optimized out>, p=<optimized out>, stateidp=<optimized out>, lckpp=<optimized out>) at /usr/src/sys/fs/nfsclient/nfs_clstate.c:630
>> #15 0xffffffff8095ca88 in nfsrpc_read (vp=0xfffff8030bc209c0, uiop=0xfffffe355f840af8, cred=0xfffff80122cfa500, p=0x0, nap=0xfffffe355f8409d0, attrflagp=0xfffffe355f840aa4, stuff=<optimized out>) at /usr/src/sys/fs/nfsclient/nfs_clrpcops.c:1396
>> #16 0xffffffff8096b90a in ncl_readrpc (vp=0xfffff8030bc209c0, uiop=0xfffffe355f840af8, cred=0xfffff801b4913300) at /usr/src/sys/fs/nfsclient/nfs_clvnops.c:1375
>> #17 0xffffffff80976656 in ncl_doio (vp=0xfffff8030bc209c0, bp=0xfffffe349a268750, cr=<optimized out>, td=0x0, called_from_strategy=<optimized out>) at /usr/src/sys/fs/nfsclient/nfs_clbio.c:1643
>> #18 0xffffffff80978694 in nfssvc_iod (instance=<optimized out>) at /usr/src/sys/fs/nfsclient/nfs_clnfsiod.c:302
>> #19 0xffffffff80a1e394 in fork_exit (callout=0xffffffff80978420 <nfssvc_iod>, arg=0xffffffff81c7de64 <nfs_asyncdaemon+4>, frame=0xfffffe355f840c00) at /usr/src/sys/kern/kern_fork.c:1038
>> #20 <signal handler called>
>> (kgdb) p *nfscr
>> $3 = {nfsc_uid = 3735929054, nfsc_groups = {3735929054 <repeats 17 times>}, nfsc_ngroups = -559038242}
>> (kgdb) frame 17
>> #17 0xffffffff80976656 in ncl_doio (vp=0xfffff8030bc209c0, bp=0xfffffe349a268750, cr=<optimized out>, td=0x0, called_from_strategy=<optimized out>) at /usr/src/sys/fs/nfsclient/nfs_clbio.c:1643
>> (kgdb) p vp->v_mount->mnt_stat.f_mntonname
>> $8 = "/mnt/distfiles", '\000' <repeats 73 times>
>
> I had some bogus -domain in my nfsuserd options on the client that I
> removed after the recent panic. Not sure if it is relevant.
>

No, that had no impact. I've hit it 3 times since sending the last email.

--
Regards,
Bryan Drewery


[-- Attachment #2 --]
--- fs/nfsclient/nfs_clstate.c.credpanic	2017-06-04 10:46:09.273613000 -0400
+++ fs/nfsclient/nfs_clstate.c	2017-06-04 11:01:21.536093000 -0400
@@ -290,6 +290,19 @@ nfscl_open(vnode_t vp, u_int8_t *nfhp, i
 	    newonep);
 
 	/*
+	 * If nfhp != NULL && nop == NULL, a new Open structure was allocated
+	 * using *nop.  For this case, set the credentials in the Open, so
+	 * that they are never uninitialized.
+	 */
+	if (nfhp != NULL && nop == NULL) {
+		KASSERT(*newonep != 0, ("%s: new open was allocated\n",
+		    __func__));
+		KASSERT(op != NULL, ("%s: New open must be returned\n",
+		    __func__));
+		newnfs_copyincred(cred, &op->nfso_cred);
+	}
+
+	/*
 	 * Now, check the mode on the open and return the appropriate
 	 * value.
 	 */
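
For readers who want to see the idea of the patch outside the kernel, here is a minimal userland model, not the actual NFS client code. All names (model_cred, model_open, model_copyincred, model_open_init, model_cred_usable, MODEL_MAXGRPS) are hypothetical stand-ins, and the memset() only imitates junk-filled memory. It sketches why copying the credentials in when the Open is created keeps a later "negative ngroups" check from firing.

```c
#include <stdint.h>
#include <string.h>

#define MODEL_MAXGRPS	16	/* hypothetical limit for this model */

/* Stand-in for a credential with a group list. */
struct model_cred {
	uint32_t uid;
	int	 ngroups;
	uint32_t groups[MODEL_MAXGRPS + 1];
};

/* Stand-in for an Open structure with an embedded credential. */
struct model_open {
	struct model_cred cred;
};

/* Copy the caller's credentials in, clamping the group count. */
static void
model_copyincred(const struct model_cred *src, struct model_cred *dst)
{
	int i;

	dst->uid = src->uid;
	dst->ngroups = src->ngroups < MODEL_MAXGRPS + 1 ?
	    src->ngroups : MODEL_MAXGRPS + 1;
	for (i = 0; i < dst->ngroups; i++)
		dst->groups[i] = src->groups[i];
}

/* Model of the check that panicked: the group count must be sane. */
static int
model_cred_usable(const struct model_cred *c)
{
	return (c->ngroups >= 0 && c->ngroups <= MODEL_MAXGRPS + 1);
}

/*
 * Model of creating an Open: the memory starts out as junk, so the
 * credentials are set right away and are never left uninitialized.
 */
static void
model_open_init(struct model_open *op, const struct model_cred *cred)
{
	memset(op, 0xde, sizeof(*op));	/* imitate junk-filled memory */
	model_copyincred(cred, &op->cred);
}
```

In this model, a model_open that is only junk-filled fails model_cred_usable(), while one set up through model_open_init() passes. Whether the real code path can actually skip the initialization is exactly what the untested patch is meant to find out.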
