Date: Fri, 8 Mar 2002 13:12:38 -0800 (PST)
From: Archie Cobbs <archie@dellroad.org>
To: freebsd-stable@freebsd.org
Cc: dillon@freebsd.org
Subject: M_NOWAIT is waiting anyway..?
Message-ID: <200203082112.g28LCcl48040@arch20m.dellroad.org>
I'm seeing a panic that suggests that the kernel malloc() implementation
is broken with respect to M_NOWAIT, hard to believe as that is.
Here's the trace:
> panic: tsleep1
> Debugger("panic")
> Stopped at Debugger+0x34: movb $0,in_Debugger.426
> db> tr
> Debugger(c025f03b) at Debugger+0x34
> panic(c025f50c,c02befd0,1000000,ffffffff,34) at panic+0x70
> tsleep(c02befd0,4,c0279394,0,c02befd0) at tsleep+0x5e
> acquire(c02befd0,1000000,600,1,c02b3540) at acquire+0x88
> lockmgr(c02befd0,2,0,0,1) at lockmgr+0x248
> kmem_malloc(c02befa0,1000,1,0,c06fb700) at kmem_malloc+0x54
> malloc(c,c02b3540,1,0,c06fb700) at malloc+0x246
> typed_mem_realloc(c028e988,a9,c028e984,0,24) at typed_mem_realloc+0xa2
> pkt_new_nobuf(c1e8f7c0,c06fb73e,c02962b8,c024d84a,c06fb700) at pkt_new_nobuf+0x38
> ng_test_mbuf2pkt(c06fb700) at ng_test_mbuf2pkt+0x39
> ng_test_rx_int(c1ebb940,c06fb700,0,5,c1ebb920) at ng_test_rx_int+0x62
> ng_test_rcvdata(c1ebb940,c06fb700,0,c1f2eba0,c1ebb920) at ng_test_rcvdata+0xe6
> ng_send_dataq(c1ebb920,c06fb700,0) at ng_send_dataq+0x6f
> ngintr(c021a44f,0,c0290010,c0140010,c02b0010) at ngintr+0xd3
> swi_net_next(3581000,0,0,0,0) at swi_net_next
> vm_page_zero_idle(f,686,2,383f9ff,756e6547) at vm_page_zero_idle+0xdf
> idle_loop() at idle_loop+0x13
The important things to note are:
- A netgraph soft interrupt is running during the idle process,
so curproc == NULL
- malloc() is being called with M_NOWAIT
- tsleep() is being called anyway
This is on a 4.4-REL kernel, but it appears that the same thing
would happen in 4.5-REL as well.
This is of course completely broken, because M_NOWAIT tells malloc()
it should never sleep, returning NULL instead.
As always, this could be happening to me due to memory corruption,
which was my first thought, but after looking at the code, it does
appear that this can happen like so:
1. malloc() is called with M_NOWAIT
2. malloc() calls kmem_malloc(kmem_map, ...), which has this
rather disturbing comment:
* NOTE: This routine is not supposed to block if M_NOWAIT is set, but
* I have not verified that it actually does not block.
3. kmem_malloc() calls vm_map_lock(kmem_map)
4. vm_map_lock() is a macro that calls
lockmgr(&kmem_map->lock, LK_EXCLUSIVE, ..)
5. kmem_map->lock.lk_flags does not include LK_NOWAIT (it is
initialized by vm_map_init(), which calls lockinit() with LK_NOPAUSE
but not LK_NOWAIT), so lockmgr() calls acquire()
6. acquire() calls tsleep() -> panic because curproc == NULL
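To make the sequence above concrete, here is a small userland model of it. This is only a sketch: every name prefixed with model_ or MODEL_ is made up for illustration and none of it is the actual kernel code. The blocking acquire just counts the sleep it would have needed. The second allocator shows one possible alternative (a try-lock when the no-wait flag is set); it is an assumption about how the flag could be honored, not a statement of how FreeBSD does or should handle it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag value for this model only; not the kernel's definition. */
#define MODEL_M_NOWAIT 0x1

struct model_lock {
    bool held;        /* true while someone owns the lock */
    int  sleep_calls; /* how many times an acquirer would have slept */
};

/* Blocking acquire: records the sleep instead of really sleeping. */
static void model_lock_blocking(struct model_lock *lk)
{
    if (lk->held) {
        lk->sleep_calls++;  /* in the kernel, tsleep() would run here */
        lk->held = false;   /* pretend the owner released it meanwhile */
    }
    lk->held = true;
}

/* Non-blocking acquire: fails immediately if the lock is busy. */
static bool model_lock_try(struct model_lock *lk)
{
    if (lk->held)
        return false;
    lk->held = true;
    return true;
}

static void model_unlock(struct model_lock *lk)
{
    assert(lk->held);
    lk->held = false;
}

static char model_page[4096]; /* stands in for the memory handed out */

/* Steps 2-6 as described: the flag never reaches the lock layer. */
static void *model_kmem_malloc_as_described(struct model_lock *map_lock,
                                            int flags)
{
    (void)flags;                    /* no-wait request is ignored here */
    model_lock_blocking(map_lock);  /* may sleep even with MODEL_M_NOWAIT */
    void *p = model_page;
    model_unlock(map_lock);
    return p;
}

/* Possible alternative: return NULL rather than sleep when no-wait is set. */
static void *model_kmem_malloc_trylock(struct model_lock *map_lock,
                                       int flags)
{
    if (flags & MODEL_M_NOWAIT) {
        if (!model_lock_try(map_lock))
            return NULL;            /* caller must cope with failure */
    } else {
        model_lock_blocking(map_lock);
    }
    void *p = model_page;
    model_unlock(map_lock);
    return p;
}
```

With the lock already held by someone else, the first variant records a sleep despite MODEL_M_NOWAIT, while the second returns NULL without sleeping.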
Is the above scenario correct? If so, it seems like too serious a
problem for me to be the first one to see it, though that may be
because my kernel netgraph node is allocating enough memory to cause
malloc() to call kmem_malloc(), which normally does not happen?
Thanks for any insights!
-Archie
__________________________________________________________________________
Archie Cobbs * Packet Design * http://www.packetdesign.com
