Date: Tue, 27 Jul 2004 20:00:11 +0900
From: Pyun YongHyeon
To: sparc64@freebsd.org
Reply-To: yongari@kt-is.co.kr
Subject: NFS panic and malloc(9) warning
Message-ID: <20040727110011.GA5553@kt-is.co.kr>

Hello All,

As soon as I did a 'cp' via NFS, my Ultra2 (the NFS server) panicked.
The panic is reproducible.
Even though I got core files successfully, neither gdb6 nor kgdb
understands the core (no trace info).

1st panic:

trap: memory address not aligned
nfs_getreq() + 0x1d0
nfsrv_dorec() + 0xcc
nfssvc_nfsd() + 0x1c0

objdump said the following:

....
0000000000000480 <nfs_getreq>:
nfs_getreq():
/usr/src/sys/nfsserver/nfs_srvsock.c:286
/*
 * Parse an RPC request
 * - verify it
 * - fill in the cred struct.
 */
....
/usr/src/sys/nfsserver/nfs_srvsock.c:373
     650:   c4 06 80 00     ld  [ %i2 ], %g2
     654:   b4 06 a0 04     add  %i2, 4, %i2
     658:   c4 26 20 bc     st  %g2, [ %i0 + 0xbc ]
....

So the panicking code is at line 373 (0x480 + 0x1d0 = 0x650).

    364         /*
    365          * XXX: This credential should be managed using crget(9)
    366          * and related calls.  Right now, this tramples on any
    367          * extensible data in the ucred, fails to initialize the
    368          * mutex, and worse.  This must be fixed before FreeBSD
    369          * 5.3-RELEASE.
    370          */
    371         bzero((caddr_t)&nd->nd_cr, sizeof (struct ucred));
    372         nd->nd_cr.cr_ref = 1;
    373         nd->nd_cr.cr_uid = fxdr_unsigned(uid_t, *tl++);
    374         nd->nd_cr.cr_gid = fxdr_unsigned(gid_t, *tl++);
    375         len = fxdr_unsigned(int, *tl);

2nd panic:

trap: memory address not aligned
nfsrv_write() + 0x178
nfssvc_nfsd() + 0x850

objdump said the following:

....
0000000000002600 <nfsrv_write>:
nfsrv_write():
/usr/src/sys/nfsserver/nfs_serv.c:1059
...
/usr/src/sys/nfsserver/nfs_serv.c:1106
                off = fxdr_hyper(tl);
    2778:   c2 02 00 00     ld  [ %o0 ], %g1
    277c:   c4 02 20 04     ld  [ %o0 + 4 ], %g2

So the panicking code is at line 1106 (0x2600 + 0x178 = 0x2778).

   1103         NFSD_LOCK();
   1104         if (v3) {
   1105                 tl = nfsm_dissect(u_int32_t *, 5 * NFSX_UNSIGNED);
   1106                 off = fxdr_hyper(tl);
   1107                 tl += 3;
   1108                 stable = fxdr_unsigned(int, *tl++);
   1109         } else {
   1110                 tl = nfsm_dissect(u_int32_t *, 4 * NFSX_UNSIGNED);
   1111                 off = (off_t)fxdr_unsigned(u_int32_t, *++tl);
   1112                 tl += 2;

And there is always a malloc(9) warning during the NFS operation.

Jul 27 16:51:36 daemon kernel: malloc(M_WAITOK) of "Mbuf", forcing M_NOWAIT with the following non-sleepable locks held:
Jul 27 16:51:36 daemon kernel: exclusive sleep mutex nfsd_mtx r = 0 (0xc03cdc38) locked @ /usr/src/sys/nfsserver/nfs_srvsock.c:712
Jul 27 16:51:36 daemon kernel: KDB: stack backtrace:
Jul 27 16:51:36 daemon kernel: uma_zalloc_arg() at uma_zalloc_arg+0x40
Jul 27 16:51:36 daemon kernel: nfsm_disct() at nfsm_disct+0x90
Jul 27 16:51:36 daemon kernel: nfs_getreq() at nfs_getreq+0x44
Jul 27 16:51:36 daemon kernel: nfsrv_dorec() at nfsrv_dorec+0xcc
Jul 27 16:51:36 daemon kernel: nfssvc_nfsd() at nfssvc_nfsd+0x2e4
Jul 27 16:51:36 daemon kernel: nfssvc() at nfssvc+0x144
Jul 27 16:51:36 daemon kernel: syscall() at syscall+0x21c
Jul 27 16:51:36 daemon kernel: -- syscall (155, FreeBSD ELF64, nfssvc) %o7=0x102bac --
Jul 27 16:51:36 daemon kernel: userland() at 0x4039d848
Jul 27 16:51:36 daemon kernel: user trace: trap %o7=0x102bac
Jul 27 16:51:36 daemon kernel: pc 0x4039d848, sp 0x7fdffffe031
Jul 27 16:51:36 daemon kernel: pc 0x1019a4, sp 0x7fdffffe0f1
Jul 27 16:51:36 daemon kernel: pc 0x100f80, sp 0x7fdffffe4d1
Jul 27 16:51:36 daemon kernel: pc 0x4020a8d4, sp 0x7fdffffe591
Jul 27 16:51:36 daemon kernel: done

However, this warning is really strange. The source code at line 712 is

    707         m = rec->nr_packet;
    708         free(rec, M_NFSRVDESC);
    709         NFSD_UNLOCK();
    710         MALLOC(nd, struct nfsrv_descript *, sizeof (struct nfsrv_descript),
    711                 M_NFSRVDESC, M_WAITOK);
    712         NFSD_LOCK();
    713         nd->nd_md = nd->nd_mrep = m;

So when malloc(9) is called there is no nfsd_mtx lock held.
The system is -CURRENT (July 26 2004).

Any clues?

Regards,
Pyun YongHyeon
-- 
Pyun YongHyeon