Date: Tue, 27 Jul 2004 20:00:11 +0900
From: Pyun YongHyeon <yongari@kt-is.co.kr>
To: sparc64@freebsd.org
Subject: NFS panic and malloc(9) warning
Message-ID: <20040727110011.GA5553@kt-is.co.kr>

Hello All,

As soon as I did a 'cp' via NFS, my Ultra2 (the NFS server) panicked.
The panic is reproducible. Even though I got core files successfully,
neither gdb6 nor kgdb understands the core (no trace info).

1st panic:

trap: memory address not aligned
nfs_getreq() + 0x1d0
nfsrv_dorec() + 0xcc
nfssvc_nfsd() + 0x1c0

objdump said the following.

....
0000000000000480 <nfs_getreq>:
nfs_getreq():
/usr/src/sys/nfsserver/nfs_srvsock.c:286
/*
 * Parse an RPC request
 * - verify it
 * - fill in the cred struct.
 */
....
/usr/src/sys/nfsserver/nfs_srvsock.c:373
     650:       c4 06 80 00     ld  [ %i2 ], %g2
     654:       b4 06 a0 04     add  %i2, 4, %i2
     658:       c4 26 20 bc     st  %g2, [ %i0 + 0xbc ]
....

So the panicking code is located at line 373.

   364          /*
   365           * XXX: This credential should be managed using crget(9)
   366           * and related calls.  Right now, this tramples on any
   367           * extensible data in the ucred, fails to initialize the
   368           * mutex, and worse.  This must be fixed before FreeBSD
   369           * 5.3-RELEASE.
   370           */
   371          bzero((caddr_t)&nd->nd_cr, sizeof (struct ucred));
   372          nd->nd_cr.cr_ref = 1;
   373          nd->nd_cr.cr_uid = fxdr_unsigned(uid_t, *tl++);
   374          nd->nd_cr.cr_gid = fxdr_unsigned(gid_t, *tl++);
   375          len = fxdr_unsigned(int, *tl);

2nd panic:

trap: memory address not aligned
nfsrv_write() + 0x178
nfssvc_nfsd() + 0x850

objdump said the following.

....
0000000000002600 <nfsrv_write>:
nfsrv_write():
/usr/src/sys/nfsserver/nfs_serv.c:1059
...
/usr/src/sys/nfsserver/nfs_serv.c:1106
                off = fxdr_hyper(tl);
    2778:       c2 02 00 00     ld  [ %o0 ], %g1
    277c:       c4 02 20 04     ld  [ %o0 + 4 ], %g2

So the panicking code is located at line 1106.

  1103          NFSD_LOCK();
  1104          if (v3) {
  1105                  tl = nfsm_dissect(u_int32_t *, 5 * NFSX_UNSIGNED);
  1106                  off = fxdr_hyper(tl);
  1107                  tl += 3;
  1108                  stable = fxdr_unsigned(int, *tl++);
  1109          } else {
  1110                  tl = nfsm_dissect(u_int32_t *, 4 * NFSX_UNSIGNED);
  1111                  off = (off_t)fxdr_unsigned(u_int32_t, *++tl);
  1112                  tl += 2;

In both cases the trapping instruction is a 32-bit load through tl.

And there is always a malloc(9) warning during the NFS operation.

Jul 27 16:51:36 daemon kernel: malloc(M_WAITOK) of "Mbuf", forcing M_NOWAIT with the following non-sleepable locks held:
Jul 27 16:51:36 daemon kernel: exclusive sleep mutex nfsd_mtx r = 0 (0xc03cdc38) locked @ /usr/src/sys/nfsserver/nfs_srvsock.c:712
Jul 27 16:51:36 daemon kernel: KDB: stack backtrace:
Jul 27 16:51:36 daemon kernel: uma_zalloc_arg() at uma_zalloc_arg+0x40
Jul 27 16:51:36 daemon kernel: nfsm_disct() at nfsm_disct+0x90
Jul 27 16:51:36 daemon kernel: nfs_getreq() at nfs_getreq+0x44
Jul 27 16:51:36 daemon kernel: nfsrv_dorec() at nfsrv_dorec+0xcc
Jul 27 16:51:36 daemon kernel: nfssvc_nfsd() at nfssvc_nfsd+0x2e4
Jul 27 16:51:36 daemon kernel: nfssvc() at nfssvc+0x144
Jul 27 16:51:36 daemon kernel: syscall() at syscall+0x21c
Jul 27 16:51:36 daemon kernel: -- syscall (155, FreeBSD ELF64, nfssvc) %o7=0x102bac --
Jul 27 16:51:36 daemon kernel: userland() at 0x4039d848
Jul 27 16:51:36 daemon kernel: user trace: trap %o7=0x102bac
Jul 27 16:51:36 daemon kernel: pc 0x4039d848, sp 0x7fdffffe031
Jul 27 16:51:36 daemon kernel: pc 0x1019a4, sp 0x7fdffffe0f1
Jul 27 16:51:36 daemon kernel: pc 0x100f80, sp 0x7fdffffe4d1
Jul 27 16:51:36 daemon kernel: pc 0x4020a8d4, sp 0x7fdffffe591
Jul 27 16:51:36 daemon kernel: done

However, this warning is really strange. The source code at line 712 is

   707          m = rec->nr_packet;
   708          free(rec, M_NFSRVDESC);
   709          NFSD_UNLOCK();
   710          MALLOC(nd, struct nfsrv_descript *, sizeof (struct nfsrv_descript),
   711                  M_NFSRVDESC, M_WAITOK);
   712          NFSD_LOCK();
   713          nd->nd_md = nd->nd_mrep = m;

So when malloc(9) is called there is no nfsd_mtx lock held.

The system is -CURRENT (July 26 2004).
Any clues?

Regards,
Pyun YongHyeon
-- 
Pyun YongHyeon <http://www.kr.freebsd.org/~yongari>
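
For illustration only: both traps above are 32-bit loads (ld) through
the tl pointer, which sparc64 requires to be 4-byte aligned. The sketch
below shows one way to decode big-endian XDR words with no alignment
requirement on the source pointer. It is not the NFS server code and not
a proposed patch; the helper names xdr_get32() and xdr_get64() are
hypothetical and do not exist in the FreeBSD tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: decode a big-endian (XDR) 32-bit word starting
 * at p.  Only single-byte loads are issued, so p may have any alignment.
 */
static uint32_t
xdr_get32(const void *p)
{
	const unsigned char *b = p;

	assert(p != NULL);
	return (((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	    ((uint32_t)b[2] << 8) | (uint32_t)b[3]);
}

/*
 * Hypothetical helper: decode a big-endian 64-bit XDR "hyper" as two
 * consecutive 32-bit words, most significant word first.
 */
static uint64_t
xdr_get64(const void *p)
{
	const unsigned char *b = p;

	assert(p != NULL);
	return (((uint64_t)xdr_get32(b) << 32) | (uint64_t)xdr_get32(b + 4));
}
```

Calling xdr_get32(buf + 1) on a byte buffer reads the four bytes
starting at buf[1] and does not raise an alignment trap on
strict-alignment machines, at the cost of four byte loads instead of one
word load.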