Date: Fri, 8 Mar 2002 13:12:38 -0800 (PST)
From: Archie Cobbs <archie@dellroad.org>
To: freebsd-stable@freebsd.org
Cc: dillon@freebsd.org
Subject: M_NOWAIT is waiting anyway..?
Message-ID: <200203082112.g28LCcl48040@arch20m.dellroad.org>
I'm seeing a panic that suggests that the kernel malloc() implementation
is broken with respect to M_NOWAIT, hard to believe as that is. Here's
the trace:

> panic: tsleep1
> Debugger("panic")
> Stopped at      Debugger+0x34:  movb    $0,in_Debugger.426
> db> tr
> Debugger(c025f03b) at Debugger+0x34
> panic(c025f50c,c02befd0,1000000,ffffffff,34) at panic+0x70
> tsleep(c02befd0,4,c0279394,0,c02befd0) at tsleep+0x5e
> acquire(c02befd0,1000000,600,1,c02b3540) at acquire+0x88
> lockmgr(c02befd0,2,0,0,1) at lockmgr+0x248
> kmem_malloc(c02befa0,1000,1,0,c06fb700) at kmem_malloc+0x54
> malloc(c,c02b3540,1,0,c06fb700) at malloc+0x246
> typed_mem_realloc(c028e988,a9,c028e984,0,24) at typed_mem_realloc+0xa2
> pkt_new_nobuf(c1e8f7c0,c06fb73e,c02962b8,c024d84a,c06fb700) at pkt_new_nobuf+0x38
> ng_test_mbuf2pkt(c06fb700) at ng_test_mbuf2pkt+0x39
> ng_test_rx_int(c1ebb940,c06fb700,0,5,c1ebb920) at ng_test_rx_int+0x62
> ng_test_rcvdata(c1ebb940,c06fb700,0,c1f2eba0,c1ebb920) at ng_test_rcvdata+0xe6
> ng_send_dataq(c1ebb920,c06fb700,0) at ng_send_dataq+0x6f
> ngintr(c021a44f,0,c0290010,c0140010,c02b0010) at ngintr+0xd3
> swi_net_next(3581000,0,0,0,0) at swi_net_next
> vm_page_zero_idle(f,686,2,383f9ff,756e6547) at vm_page_zero_idle+0xdf
> idle_loop() at idle_loop+0x13

The important things to note are:

  - A netgraph soft interrupt is running during the idle process,
    so curproc == NULL
  - malloc() is being called with M_NOWAIT
  - tsleep() is being called anyway

This is on a 4.4-REL kernel, but it appears that the same thing
would happen in 4.5-REL as well.

This is of course completely broken, because M_NOWAIT tells malloc()
it should never sleep, returning NULL instead.

As always, this could be happening to me due to memory corruption,
which was my first thought, but after looking at the code, it does
appear that this can happen like so:

  1. malloc() is called with M_NOWAIT

  2. malloc() calls kmem_malloc(kmem_map, ...), which has this
     rather disturbing comment:

      * NOTE:  This routine is not supposed to block if M_NOWAIT is set, but
      *        I have not verified that it actually does not block.

  3. kmem_malloc() calls vm_map_lock(kmem_map)

  4. vm_map_lock() is a macro that calls
     lockmgr(&kmem_map->lock, LK_EXCLUSIVE, ..)

  5. kmem_map->lock.lk_flags does not include LK_NOWAIT (it is
     initialized by vm_map_init(), which calls lockinit() with
     LK_NOPAUSE but not LK_NOWAIT), so lockmgr() calls acquire()

  6. acquire() calls tsleep() -> panic because curproc == NULL

Is the above scenario correct? If so, it seems like a very serious
problem for me to be the first one to see it.. though that may be
because my kernel netgraph node is allocating enough memory to cause
malloc() to call kmem_malloc(), which normally does not happen.. ?

Thanks for any insights!

-Archie

__________________________________________________________________________
Archie Cobbs     *     Packet Design     *     http://www.packetdesign.com
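The six steps above can be modelled in a few lines of userland C. This is only a sketch, not kernel code: every sk_* name and both SK_* flag values are hypothetical stand-ins invented for illustration, a counter takes the place of tsleep(), and the "honoring" variant shows one possible way a no-wait request could be carried down to the lock (a try-lock that fails instead of sleeping). It is not a claim about how FreeBSD handles this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flag values for this sketch; not the kernel's M_* flags. */
#define SK_WAITOK 0x0
#define SK_NOWAIT 0x1

/* Toy stand-in for a map protected by a sleepable lock. */
struct sk_map {
    bool locked;  /* true while the lock is held */
    int  sleeps;  /* how many times a caller would have slept */
};

/* Blocking acquire: if the lock is held, record a "sleep" (the model's
 * stand-in for acquire() -> tsleep()) and then take the lock. */
static void
sk_map_lock(struct sk_map *map)
{
    if (map->locked)
        map->sleeps++;
    map->locked = true;
}

/* Non-blocking acquire: fail at once if the lock is already held. */
static bool
sk_map_trylock(struct sk_map *map)
{
    if (map->locked)
        return false;
    map->locked = true;
    return true;
}

static void
sk_map_unlock(struct sk_map *map)
{
    assert(map->locked);
    map->locked = false;
}

/* Mirrors the scenario in the message: the caller's flag never reaches
 * the lock, so a contended lock "sleeps" even for SK_NOWAIT. */
static void *
sk_alloc_ignoring_flag(struct sk_map *map, size_t size, int flags)
{
    void *p;

    (void)flags;            /* the flag is dropped here */
    sk_map_lock(map);
    p = malloc(size);
    sk_map_unlock(map);
    return p;
}

/* One possible alternative: a no-wait request uses the try-lock and
 * returns NULL on contention instead of sleeping. */
static void *
sk_alloc_honoring_flag(struct sk_map *map, size_t size, int flags)
{
    void *p;

    if (flags & SK_NOWAIT) {
        if (!sk_map_trylock(map))
            return NULL;
    } else {
        sk_map_lock(map);
    }
    p = malloc(size);
    sk_map_unlock(map);
    return p;
}
```

With the lock already held, sk_alloc_honoring_flag(&m, 16, SK_NOWAIT) returns NULL and leaves m.sleeps at zero, while sk_alloc_ignoring_flag(&m, 16, SK_NOWAIT) bumps m.sleeps, which corresponds to the tsleep() call that panics when curproc == NULL.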