Date: Mon, 31 Jul 2006 17:45:26 GMT
From: Robert Watson <rwatson@FreeBSD.org>
To: rwatson@FreeBSD.org, freebsd-bugs@FreeBSD.org, rwatson@FreeBSD.org
Subject: Re: kern/100940: passing file descriptor over datagram UNIX domain socket crashes kernel
Message-ID: <200607311745.k6VHjQ9g091849@freefall.freebsd.org>

Synopsis: passing file descriptor over datagram UNIX domain socket crashes kernel

Responsible-Changed-From-To: freebsd-bugs->rwatson
Responsible-Changed-By: rwatson
Responsible-Changed-When: Mon Jul 31 17:27:10 UTC 2006
Responsible-Changed-Why:
Grab ownership of this PR, since I have a strong interest in the UNIX domain
socket code.

The problem as seen here with WITNESS in place on a 7.x kernel is:

tiger-1# ./fd_passing
Kernel page fault with the following non-sleepable locks held:
exclusive sleep mutex unp r = 0 (0xc0a56714) locked @ kern/uipc_usrreq.c:999
KDB: stack backtrace:
kdb_backtrace(1,c7193b04,c,c7164a20,e974fad4,...) at kdb_backtrace+0x29
witness_warn(5,0,c094452a) at witness_warn+0x192
trap(c7160008,c76d0028,28,0,c7497578,...) at trap+0x108
calltrap() at calltrap+0x5
--- trap 0xc, eip = 0xc06e38ff, esp = 0xe974fb1c, ebp = 0xe974fb44 ---
uipc_send(c76f4530,0,c73e0200,c7121160,c73e0500,c7164a20) at uipc_send+0xdb
sosend_generic(c76f4530,c7121160,e974fbe4,c73e0200,c73e0300,...) at sosend_generic+0x3e5
sosend(c76f4530,c7121160,e974fbe4,0,c73e0300,0,c7164a20) at sosend+0x3c
kern_sendit(c7164a20,3,e974fc5c,0,c73e0300,0) at kern_sendit+0x101
sendit(c7164a20,3,e974fc5c,0,c7121170,...) at sendit+0x87
sendmsg(c7164a20,e974fd04) at sendmsg+0x53
syscall(3b,3b,3b,bfbfed5c,bfbfed54,...) at syscall+0x256
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (28, FreeBSD ELF32, sendmsg), eip = 0x2812fab3, esp = 0xbfbfec1c, ebp = 0xbfbfece8 ---

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 06
fault virtual address   = 0x8
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc06e38ff
stack pointer           = 0x28:0xe974fb1c
frame pointer           = 0x28:0xe974fb44
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 901 (fd_passing)
[thread pid 901 tid 100075 ]
Stopped at      uipc_send+0xdb: movl    0x8(%ecx),%edi
db> bt
Tracing pid 901 tid 100075 td 0xc7164a20
uipc_send(c76f4530,0,c73e0200,c7121160,c73e0500,c7164a20) at uipc_send+0xdb
sosend_generic(c76f4530,c7121160,e974fbe4,c73e0200,c73e0300,...) at sosend_generic+0x3e5
sosend(c76f4530,c7121160,e974fbe4,0,c73e0300,0,c7164a20) at sosend+0x3c
kern_sendit(c7164a20,3,e974fc5c,0,c73e0300,0) at kern_sendit+0x101
sendit(c7164a20,3,e974fc5c,0,c7121170,...) at sendit+0x87
sendmsg(c7164a20,e974fd04) at sendmsg+0x53
syscall(3b,3b,3b,bfbfed5c,bfbfed54,...) at syscall+0x256
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (28, FreeBSD ELF32, sendmsg), eip = 0x2812fab3, esp = 0xbfbfec1c, ebp = 0xbfbfece8 ---
db> show alllocks
Process 901 (fd_passing) thread 0xc7164a20 (100075)
exclusive sleep mutex unp r = 0 (0xc0a56714) locked @ kern/uipc_usrreq.c:999

995             if (vp != NULL)
996                     vput(vp);
997             mtx_unlock(&Giant);
998             free(sa, M_SONAME);
999             UNP_LOCK();
1000            unp->unp_flags &= ~UNP_CONNECTING;
1001            return (error);
1002    }
1003
1004    static int
1005    unp_connect2(struct socket *so, struct socket *so2, int req)
1006    {

(gdb) l *0xc06e38ff
0xc06e38ff is in uipc_send (../../../kern/uipc_usrreq.c:609).
604                             error = ENOTCONN;
605                             break;
606                     }
607             }
608             unp2 = unp->unp_conn;
609             so2 = unp2->unp_socket;
610             if (unp->unp_addr != NULL)
611                     from = (struct sockaddr *)unp->unp_addr;
612             else
613                     from = &sun_noname;

The problem appears to be that unp_connect() can return with the socket
disconnected as a result of dropping the UNIX domain socket subsystem lock
while discarding the vnode reference for the remote socket, so that the
socket is disconnected before the send can proceed.  Probably the answer is
to add a check for a NULL unp->unp_conn pointer and return an appropriate
error, as the connect() and send() cannot be performed atomically.  I will
follow up with a patch.

http://www.freebsd.org/cgi/query-pr.cgi?pr=100940
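
For illustration only, the NULL check suggested above can be modelled in user space. This is a minimal sketch, not the eventual kernel patch: struct model_socket, struct model_unpcb and model_send_lookup_peer() are hypothetical stand-ins for the kernel structures and for the step at kern/uipc_usrreq.c:608-609, and the choice of ENOTCONN as the "appropriate error" is an assumption.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical, simplified stand-ins for the kernel's struct socket and
 * struct unpcb.  Only the fields needed for the illustration are modelled.
 */
struct model_socket {
	int id;
};

struct model_unpcb {
	struct model_unpcb  *unp_conn;   /* peer control block, NULL if disconnected */
	struct model_socket *unp_socket; /* socket that owns this control block */
};

/*
 * Models the peer lookup that faulted, with the proposed guard added.
 * On success the peer's socket is stored in *so2p and 0 is returned.
 * If the peer went away after the implicit connect, ENOTCONN (an assumed
 * error value) is returned instead of dereferencing a NULL pointer.
 */
static int
model_send_lookup_peer(struct model_unpcb *unp, struct model_socket **so2p)
{
	struct model_unpcb *unp2;

	unp2 = unp->unp_conn;
	if (unp2 == NULL)
		return (ENOTCONN);
	*so2p = unp2->unp_socket;
	return (0);
}
```

A caller would treat a non-zero return as a failed send and would not touch *so2p in that case. In the kernel, the check would presumably have to be made while the UNIX domain socket subsystem lock is held again after unp_connect() returns.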