From owner-freebsd-sparc64@FreeBSD.ORG Thu Aug 12 11:11:48 2004
Return-Path:
Delivered-To: freebsd-sparc64@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 7EBAF16A4CE
	for ; Thu, 12 Aug 2004 11:11:48 +0000 (GMT)
Received: from ns.kt-is.co.kr (ns.kt-is.co.kr [211.218.149.125])
	by mx1.FreeBSD.org (Postfix) with ESMTP id C867143D1D
	for ; Thu, 12 Aug 2004 11:11:47 +0000 (GMT)
	(envelope-from yongari@kt-is.co.kr)
Received: from michelle.kt-is.co.kr (ns2.kt-is.co.kr [220.76.118.193])
	(authenticated bits=128)
	by ns.kt-is.co.kr (8.12.10/8.12.10) with ESMTP id i7CBBhAh062256
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=FAIL)
	for ; Thu, 12 Aug 2004 20:11:43 +0900 (KST)
Received: from michelle.kt-is.co.kr (localhost.kt-is.co.kr [127.0.0.1])
	by michelle.kt-is.co.kr (8.12.10/8.12.10) with ESMTP id i7CBBFRg013447
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO)
	for ; Thu, 12 Aug 2004 20:11:15 +0900 (KST)
	(envelope-from yongari@kt-is.co.kr)
Received: (from yongari@localhost)
	by michelle.kt-is.co.kr (8.12.10/8.12.10/Submit) id i7CBBE4h013446
	for sparc64@freebsd.org; Thu, 12 Aug 2004 20:11:14 +0900 (KST)
	(envelope-from yongari@kt-is.co.kr)
Date: Thu, 12 Aug 2004 20:11:14 +0900
From: Pyun YongHyeon
To: sparc64@freebsd.org
Message-ID: <20040812111114.GB12556@kt-is.co.kr>
References: <20040727110011.GA5553@kt-is.co.kr>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="61jdw2sOBCFtR2d/"
Content-Disposition: inline
In-Reply-To: <20040727110011.GA5553@kt-is.co.kr>
User-Agent: Mutt/1.4.1i
X-Filter-Version: 1.11a (ns.kt-is.co.kr)
Subject: Re: NFS panic and malloc(9) warning
X-BeenThere: freebsd-sparc64@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
Reply-To: yongari@kt-is.co.kr
List-Id: Porting FreeBSD to the Sparc
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 12 Aug 2004 11:11:48 -0000

--61jdw2sOBCFtR2d/
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Tue, Jul 27, 2004 at 08:00:11PM +0900, To sparc64@freebsd.org wrote:
> Hello All,
>
> As soon as I did 'cp' via NFS, my Ultra2 (NFS server) panicked.
> The panic is reproducible. Even though I got core files successfully,
> neither gdb6 nor kgdb understands the core (no trace info).
>
> 1st panic: trap: memory address not aligned
> nfs_getreq() + 0x1d0
> nfssrv_dorec() + 0xcc
> nfssvc_nfsd + 0x1c0 ...
>
> 2nd panic: trap: memory address not aligned
> nfsrv_write() + 0x178
> nfssvc_nfsd() + 0x850
> ...
>
> And there is always a malloc(9) warning during the NFS operation.
> Jul 27 16:51:36 daemon kernel: malloc(M_WAITOK) of "Mbuf", forcing M_NOWAIT with the following non-sleepable locks held:
> Jul 27 16:51:36 daemon kernel: exclusive sleep mutex nfsd_mtx r = 0 (0xc03cdc38) locked @ /usr/src/sys/nfsserver/nfs_srvsock.c:712
> Jul 27 16:51:36 daemon kernel: KDB: stack backtrace:
> Jul 27 16:51:36 daemon kernel: uma_zalloc_arg() at uma_zalloc_arg+0x40
> Jul 27 16:51:36 daemon kernel: nfsm_disct() at nfsm_disct+0x90
> Jul 27 16:51:36 daemon kernel: nfs_getreq() at nfs_getreq+0x44
> Jul 27 16:51:36 daemon kernel: nfsrv_dorec() at nfsrv_dorec+0xcc
> Jul 27 16:51:36 daemon kernel: nfssvc_nfsd() at nfssvc_nfsd+0x2e4
> Jul 27 16:51:36 daemon kernel: nfssvc() at nfssvc+0x144
> Jul 27 16:51:36 daemon kernel: syscall() at syscall+0x21c
> Jul 27 16:51:36 daemon kernel: -- syscall (155, FreeBSD ELF64, nfssvc) %o7=0x102bac --
> Jul 27 16:51:36 daemon kernel: userland() at 0x4039d848
> Jul 27 16:51:36 daemon kernel: user trace: trap %o7=0x102bac
> Jul 27 16:51:36 daemon kernel: pc 0x4039d848, sp 0x7fdffffe031
> Jul 27 16:51:36 daemon kernel: pc 0x1019a4, sp 0x7fdffffe0f1
> Jul 27 16:51:36 daemon kernel: pc 0x100f80, sp 0x7fdffffe4d1
> Jul 27 16:51:36 daemon kernel: pc 0x4020a8d4, sp 0x7fdffffe591
> Jul 27 16:51:36 daemon kernel: done
>

I have seen no reply, so I assume I am the only person seeing NFS panics on
sparc64.

How to reproduce:
A sparc64 NFS server exports a directory to an NFS client. The NFS client
mounts the directory over TCP (with either the NFS v2 or v3 protocol).
Then, on the client side, copy a big file (> 100MB) from the server's
exported directory to a subdirectory of the exported directory. The NFS
server panics immediately. When the sparc64 machine is configured as the
NFS client, the same operation panics the client too. With UDP transport
the panic doesn't happen.

With the following patch, I could no longer panic the sparc64 machine.

Patch summary:
1. We now have M_TRYWAIT == M_WAITOK, so I think there is no need to
   check for possible mixing.
2. Since nfsm_disct() is always called with the NFSD lock held, we can't
   allocate an mbuf with M_TRYWAIT. Doing so generates numerous WITNESS
   warnings. However, I am not sure it is safe not to sleep when
   allocating an mbuf under heavy memory pressure.
3. The patch is just one possible hack. I believe nfsm_dissect() should
   be completely rewritten in order to eliminate the need for
   nfs_realign().

Thanks.

Regards,
Pyun YongHyeon
-- 
Pyun YongHyeon

--61jdw2sOBCFtR2d/
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="nfs.patch"

--- sys/nfs/nfs_common.c.orig	Wed Apr 7 13:59:56 2004
+++ sys/nfs/nfs_common.c	Thu Aug 12 20:01:07 2004
@@ -166,7 +166,7 @@
 void *
 nfsm_disct(struct mbuf **mdp, caddr_t *dposp, int siz, int left)
 {
-	struct mbuf *mp, *mp2;
+	struct mbuf *mp, *mp2, *n;
 	int siz2, xfer;
 	caddr_t ptr;
 	void *ret;
@@ -187,7 +187,9 @@
 	} else if (siz > MHLEN) {
 		panic("nfs S too big");
 	} else {
-		MGET(mp2, M_TRYWAIT, MT_DATA);
+		MGET(mp2, M_NOWAIT, MT_DATA);
+		if (mp2 == NULL)
+			return (NULL);
 		mp2->m_next = mp->m_next;
 		mp->m_next = mp2;
 		mp->m_len -= left;
@@ -216,6 +218,26 @@
 		mp->m_len = siz;
 		*mdp = mp2;
 		*dposp = mtod(mp2, caddr_t);
+		/*
+		 * XXX
+		 * Forcibly align memory accesses on sparc64. Since function
+		 * nfs_realign() doesn't work well, we may need to completely
+		 * rewrite nfsm_disct() in the near future to eliminate
+		 * nfs_realign() in the frequent code path.
+		 */
+		if (!nfsm_aligned(*dposp, u_int32_t)) {
+			MGET(n, M_DONTWAIT, MT_DATA);
+			if (n == NULL)
+				return (NULL);
+			n->m_len = 0;
+			bcopy(mtod(mp2, caddr_t), mtod(n, caddr_t), mp2->m_len);
+			n->m_len += mp2->m_len;
+			n->m_next = mp2->m_next;
+			mp->m_next = n;
+			m_free(mp2);
+			*mdp = n;
+			*dposp = mtod(n, caddr_t);
+		}
 	}
 	return ret;
 }
--- sys/nfs/nfs_common.h.orig	Wed Apr 7 13:59:56 2004
+++ sys/nfs/nfs_common.h	Wed Aug 11 20:36:11 2004
@@ -118,4 +118,6 @@
 	nfsm_dcheck(t1, mrep); \
 } while (0)
 
+#define nfsm_aligned(p,t)	((((u_long)(p)) & (sizeof(t)-1)) == 0)
+
 #endif
--- sys/nfsserver/nfs_srvsock.c.orig	Mon Jul 26 20:33:08 2004
+++ sys/nfsserver/nfs_srvsock.c	Thu Aug 12 19:45:14 2004
@@ -94,7 +94,7 @@
 #define	NFS_MAXCWND	(NFS_CWNDSCALE * 32)
 
 struct callout	nfsrv_callout;
-static void	nfs_realign(struct mbuf **pm, int hsiz);	/* XXX SHARED */
+static void	nfs_realign(struct mbuf **, int, int);	/* XXX SHARED */
 static int	nfsrv_getstream(struct nfssvc_sock *, int);
 
 int32_t (*nfsrv3_procs[NFS_NPROCS])(struct nfsrv_descript *nd,
@@ -153,14 +153,14 @@
 	 * If this is a big reply, use a cluster else
 	 * try and leave leading space for the lower level headers.
 	 */
-	mreq->m_len = 6 * NFSX_UNSIGNED;
 	siz += RPC_REPLYSIZ;
-	if ((max_hdr + siz) >= MINCLSIZE) {
+	if (siz >= max_datalen) {
 		MCLGET(mreq, M_TRYWAIT);
 	} else
-		mreq->m_data += min(max_hdr, M_TRAILINGSPACE(mreq));
+		mreq->m_data += max_hdr;
 	NFSD_LOCK();
 	tl = mtod(mreq, u_int32_t *);
+	mreq->m_len = 6 * NFSX_UNSIGNED;
 	bpos = ((caddr_t)tl) + mreq->m_len;
 	*tl++ = txdr_unsigned(nd->nd_retxid);
 	*tl++ = nfsrv_rpc_reply;
@@ -235,7 +235,7 @@
  * with TCP. Use vfs.nfs.realign_count and realign_test to check this.
  */
 static void
-nfs_realign(struct mbuf **pm, int hsiz)	/* XXX COMMON */
+nfs_realign(struct mbuf **pm, int hsiz, int waitflag)	/* XXX COMMON */
 {
 	struct mbuf *m;
 	struct mbuf *n = NULL;
@@ -247,12 +247,14 @@
 	++nfs_realign_test;
 	while ((m = *pm) != NULL) {
 		if ((m->m_len & 0x3) || (mtod(m, intptr_t) & 0x3)) {
-			NFSD_UNLOCK();
-			MGET(n, M_TRYWAIT, MT_DATA);
+			if (waitflag == M_WAITOK)
+				NFSD_UNLOCK();
+			MGET(n, waitflag, MT_DATA);
 			if (m->m_len >= MINCLSIZE) {
-				MCLGET(n, M_TRYWAIT);
+				MCLGET(n, waitflag);
 			}
-			NFSD_LOCK();
+			if (waitflag == M_WAITOK)
+				NFSD_LOCK();
 			n->m_len = 0;
 			break;
 		}
@@ -508,8 +510,7 @@
 		if (mp) {
 			struct nfsrv_rec *rec;
 			rec = malloc(sizeof(struct nfsrv_rec),
-			    M_NFSRVDESC,
-			    waitflag == M_DONTWAIT ? M_NOWAIT : M_WAITOK);
+			    M_NFSRVDESC, waitflag);
 			if (!rec) {
 				if (nam)
 					FREE(nam, M_SONAME);
@@ -518,7 +519,7 @@
 				continue;
 			}
 			NFSD_LOCK();
-			nfs_realign(&mp, 10 * NFSX_UNSIGNED);
+			nfs_realign(&mp, 10 * NFSX_UNSIGNED, waitflag);
 			rec->nr_address = nam;
 			rec->nr_packet = mp;
 			STAILQ_INSERT_TAIL(&slp->ns_rec, rec, nr_link);
@@ -666,13 +667,12 @@
 		if (slp->ns_flag & SLP_LASTFRAG) {
 			struct nfsrv_rec *rec;
 			NFSD_UNLOCK();
-			rec = malloc(sizeof(struct nfsrv_rec), M_NFSRVDESC,
-			    waitflag == M_DONTWAIT ? M_NOWAIT : M_WAITOK);
+			rec = malloc(sizeof(struct nfsrv_rec), M_NFSRVDESC, waitflag);
 			NFSD_LOCK();
 			if (!rec) {
 				m_freem(slp->ns_frag);
 			} else {
-				nfs_realign(&slp->ns_frag, 10 * NFSX_UNSIGNED);
+				nfs_realign(&slp->ns_frag, 10 * NFSX_UNSIGNED, waitflag);
 				rec->nr_address = NULL;
 				rec->nr_packet = slp->ns_frag;
 				STAILQ_INSERT_TAIL(&slp->ns_rec, rec, nr_link);

--61jdw2sOBCFtR2d/--
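
The alignment test the patch adds to nfsm_disct() can be tried outside the
kernel. What follows is only a minimal userland sketch, not part of the
patch: IS_ALIGNED mirrors the patch's nfsm_aligned() but uses uintptr_t
instead of u_long, and realign_copy() and load_u32() are hypothetical
helper names. A plain caller-supplied buffer stands in for the freshly
allocated mbuf. It shows the same idea: test the low bits of the data
pointer against the size of the type, and if they are not zero, copy the
bytes into aligned storage before dereferencing them as 32-bit words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userland analogue of the patch's nfsm_aligned(); uintptr_t replaces u_long. */
#define IS_ALIGNED(p, t)	((((uintptr_t)(p)) & (sizeof(t) - 1)) == 0)

/*
 * Hypothetical helper: copy len bytes from a possibly misaligned src into
 * dst. The caller guarantees that dst is 32-bit aligned and holds cap bytes.
 * Returns dst on success, or NULL if the data does not fit.
 */
static void *
realign_copy(void *dst, size_t cap, const void *src, size_t len)
{
	assert(IS_ALIGNED(dst, uint32_t));
	if (len > cap)
		return (NULL);
	memcpy(dst, src, len);
	return (dst);
}

/* Hypothetical helper: read a host-order 32-bit word via an aligned pointer. */
static uint32_t
load_u32(const void *p)
{
	assert(IS_ALIGNED(p, uint32_t));
	return (*(const uint32_t *)p);
}
```

A caller would check IS_ALIGNED(ptr, uint32_t) on the incoming data pointer
and, only when that check fails, pass the data through realign_copy() into
a uint32_t array before reading words from it with load_u32(). Like the
patch, the sketch reports failure by returning NULL rather than waiting for
space.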