Date:      Mon, 13 Oct 2003 12:15:56 +0200 (CEST)
From:      Stefan Farfeleder <stefan@fafoe.narf.at>
To:        FreeBSD-gnats-submit@FreeBSD.org
Cc:        stefan@fafoe.narf.at
Subject:   kern/57945: [patch] Add locking to kqueue to make it MP-safe
Message-ID:  <20031013101556.7FFCA482@frog.fafoe.narf.at>
Resent-Message-ID: <200310131020.h9DAK9eE057443@freefall.freebsd.org>


>Number:         57945
>Category:       kern
>Synopsis:       [patch] Add locking to kqueue to make it MP-safe
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Oct 13 03:20:09 PDT 2003
>Closed-Date:
>Last-Modified:
>Originator:     Stefan Farfeleder
>Release:        FreeBSD 5.1-CURRENT i386
>Organization:
>Environment:
System: FreeBSD frog.fafoe.narf.at 5.1-CURRENT FreeBSD 5.1-CURRENT #14: Mon Oct 13 02:25:41 CEST 2003 freebsd@frog.fafoe.narf.at:/freebsd/frog/obj/freebsd/frog/src/sys/FROG i386

>Description:
The current kqueue implementation does not seem to be MP-safe.  The
kqueue facility does not have its own locks; I believe it was
intended to rely on Giant.  As code was moved out from under Giant,
kqueue was apparently forgotten.  The KNOTE() macro is spread over the
whole kernel, sometimes called with Giant held, sometimes not.  The
function knote_enqueue() (called via knote() and KNOTE_ACTIVATE())
inserts a node into a TAILQ; this operation isn't atomic.  If another
processor executes KNOTE() or kevent() at the same time, the queue might
get corrupted.
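
To illustrate the race described above, here is a minimal userland
sketch, not kernel code.  It assumes a pthread mutex as a stand-in for
a per-queue kernel mutex; the names demo_queue, enqueue_worker and
demo_run are made up for this example.  Several threads append to one
TAILQ, and because every insertion and counter update happens under
the mutex, the list walk at the end should agree with the counter.

```c
/* Userland analogy only: a pthread mutex stands in for a kernel mutex. */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/queue.h>

#define DEMO_MAX_THREADS 8

struct node {
	TAILQ_ENTRY(node) link;
};

struct demo_queue {			/* hypothetical, loosely modelled on struct kqueue */
	TAILQ_HEAD(, node) head;	/* list of queued nodes */
	int count;			/* number of queued nodes */
	pthread_mutex_t mtx;		/* protects head and count */
};

struct worker_arg {
	struct demo_queue *q;
	struct node *nodes;
	int n;
};

static void *
enqueue_worker(void *p)
{
	struct worker_arg *a = p;
	int i;

	for (i = 0; i < a->n; i++) {
		/* TAILQ_INSERT_TAIL is several stores, so it must be serialized. */
		pthread_mutex_lock(&a->q->mtx);
		TAILQ_INSERT_TAIL(&a->q->head, &a->nodes[i], link);
		a->q->count++;
		pthread_mutex_unlock(&a->q->mtx);
	}
	return (NULL);
}

/* Returns the number of nodes found on the list, or -1 on error or mismatch. */
static int
demo_run(int nthreads, int per_thread)
{
	struct demo_queue q;
	pthread_t tids[DEMO_MAX_THREADS];
	struct worker_arg args[DEMO_MAX_THREADS];
	struct node *n;
	int i, rc, started = 0, walked = 0;

	if (nthreads < 1 || nthreads > DEMO_MAX_THREADS || per_thread < 1)
		return (-1);
	TAILQ_INIT(&q.head);
	q.count = 0;
	rc = pthread_mutex_init(&q.mtx, NULL);
	assert(rc == 0);
	(void)rc;
	for (i = 0; i < nthreads; i++) {
		args[i].q = &q;
		args[i].n = per_thread;
		args[i].nodes = calloc(per_thread, sizeof(struct node));
		if (args[i].nodes == NULL)
			break;
		if (pthread_create(&tids[i], NULL, enqueue_worker, &args[i]) != 0) {
			free(args[i].nodes);
			break;
		}
		started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(tids[i], NULL);
	/* All writers are joined, so the list can be walked without the lock. */
	TAILQ_FOREACH(n, &q.head, link)
		walked++;
	for (i = 0; i < started; i++)
		free(args[i].nodes);
	pthread_mutex_destroy(&q.mtx);
	if (started != nthreads || walked != q.count)
		return (-1);
	return (walked);
}
```

Calling demo_run(4, 1000) should return 4000.  Removing the lock and
unlock calls in enqueue_worker() makes the list and the counter
unreliable in the same way as the unlocked TAILQ insertion above.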

>How-To-Repeat:
A good test is to uncomment -DUSE_QUEUE in the make(1) Makefile,
recompile it and do a buildworld with, say, -j8.  Almost always this
will result in a null pointer dereference in kqueue_scan() (this happens
rather seldom nowadays) or a deadlock on my dual Athlon MP box.  The
only thing I can do then is to break into ddb via the serial console:

%%
db> t
siointr1(c6506800,0,c05d9669,6a0,e402cca4) at siointr1+0xc9
siointr(c6506800) at siointr+0x35
Xfastintr4() at Xfastintr4+0xba
--- interrupt, eip = 0xc048f818, esp = 0xe402cc18, ebp = 0xe402cca4 ---
kqueue_scan(c7090594,4,bfbfeb40,0,c70d9850) at kqueue_scan+0x228
kevent(c70d9850,e402cd14,c05dca42,3ec,6) at kevent+0x1df
syscall(2f,2f,2f,bfbff918,bfbff928) at syscall+0x233
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (363, FreeBSD ELF32, kevent), eip = 0x806740b, esp = 0xbfbfeb0c, ebp = 0xbfbfeba8 ---
db> ps
  pid   proc     uarea   uid  ppid  pgrp  flag   stat  wmesg    wchan  cmd
 1637 c66afd3c e3db6000 1005  1635 83873 0004002 [SLP]piperd 0xc714f60c] as
 1636 c70dd5ac e404b000 1005  1635 83873 0000002 [LOCK  Giant c0619140] cc
 1635 c68f5974 e3e44000 1005  1626 83873 0004002 [SLP]wait 0xc68f5974] cc
 1634 c6a97b58 e3f12000 1005  1632 83873 0004002 [SLP]piperd 0xc66c6204] as
 1633 c6a971e4 e3f0d000 1005  1632 83873 0000002 [LOCK  Giant c0619140] cc
 1632 c6a975ac e3f0f000 1005  1627 83873 0004002 [SLP]wait 0xc6a975ac] cc
 1631 c712ad3c e40bd000 1005 98527 83873 0004002 [LOCK  Giant c0619140] sh
 1630 c6a97000 e3f0c000 1005  1628 83873 0004002 [LOCK  Giant c0619140] as
 1629 c68f5d3c e3e46000 1005  1628 83873 0004002 [LOCK  Giant c0619140] cc1
 1628 c66afb58 e3db5000 1005  1597 83873 0004002 [SLP]wait 0xc66afb58] cc
 1627 c7160d3c e413c000 1005  1375 83873 0004002 [SLP]wait 0xc7160d3c] sh
 1626 c71605ac e40ea000 1005  1137 83873 0004002 [SLP]wait 0xc71605ac] sh
 1625 c67e03c8 e3de3000 1005  1060 83873 0004002 [LOCK  Giant c0619140] sh
 1624 c70ae790 e3f98000 1005  1375 83873 0004002 [LOCK  Giant c0619140] sh
 1623 c6a8a1e4 e3ecd000 1005  1060 83873 0004002 [LOCK  Giant c0619140] sh
 1622 c70ae5ac e3f97000 1005  1620 83873 0004002 [LOCK  Giant c0619140] as
 1621 c6ab7000 e3f25000 1005  1620 83873 0004002 [LOCK  Giant c0619140] cc1
 1620 c67df1e4 e3dda000 1005  1610 83873 0004002 [SLP]wait 0xc67df1e4] cc
 1619 c70dd000 e4048000 1005  1613 83873 0004002 [SLP]piperd 0xc6a59b6c] as
 1618 c70ae1e4 e3f95000 1005  1613 83873 0004002 [LOCK  Giant c0619140] cc1
 1617 c712a000 e4068000 1005  1615 83873 0004002 [LOCK  Giant c0619140] as
 1616 c70dad3c e4047000 1005  1615 83873 0004002 [LOCK  Giant c0619140] cc1
 1615 c6a201e4 e3e77000 1005  1611 83873 0004002 [SLP]wait 0xc6a201e4] cc
 1614 c70acb58 e3f92000 1005  1375 83873 0004002 [LOCK  Giant c0619140] sh
 1613 c70da790 e4044000 1005  1612 83873 0004002 [SLP]wait 0xc70da790] cc
 1612 c6a96d3c e3f0b000 1005 98527 83873 0004002 [SLP]wait 0xc6a96d3c] sh
 1611 c6a8a3c8 e3ece000 1005  1060 83873 0004002 [SLP]wait 0xc6a8a3c8] sh
 1610 c70da974 e4045000 1005  1375 83873 0004002 [SLP]wait 0xc70da974] sh
 1609 c70d75ac e3fed000 1005  1607 83873 0004002 [SLP]piperd 0xc6b2e8bc] as
 1608 c6a20d3c e3ecb000 1005  1607 83873 0004002 [LOCK  Giant c0619140] cc1
 1607 c70dab58 e4046000 1005  1606 83873 0004002 [SLP]wait 0xc70dab58] cc
 1606 c6a1f974 e3e73000 1005  1137 83873 0004002 [SLP]wait 0xc6a1f974] sh
 1603 c70aeb58 e3fe8000 1005  1601 83873 0004002 [SLP]piperd 0xc66c42b0] as
 1602 c70ac790 e3f90000 1005  1601 83873 0004002 [LOCK  Giant c0619140] cc1
 1601 c6a1f000 e3e6e000 1005  1600 83873 0004002 [SLP]wait 0xc6a1f000] cc
 1600 c70dd3c8 e404a000 1005  1060 83873 0004002 [SLP]wait 0xc70dd3c8] sh
 1597 c6a1f1e4 e3e6f000 1005 98527 83873 0004002 [SLP]wait 0xc6a1f1e4] sh
 1578 c718e3c8 e413f000 1005  1575 83873 0004002 [SLP]piperd 0xc716b0ac] as
 1576 c712ab58 e40bc000 1005  1575 83873 0004002 [LOCK  Giant c0619140] cc1
 1575 c70da000 e4040000 1005  1572 83873 0004002 [SLP]wait 0xc70da000] cc
 1572 c66473c8 e1b40000 1005 98527 83873 0004002 [SLP]wait 0xc66473c8] sh
 1375 c71603c8 e40e9000 1005  1317 83873 0004002 [SLP]kqread 0xc6a93000] make
 1317 c715f974 e40e4000 1005 96810 83873 0004002 [SLP]wait 0xc715f974] sh
 1137 c718e000 e413d000 1005   184 83873 0004002 [LOCK  Giant c0619140] make
 1060 c715fd3c e40e6000 1005 99104 83873 0004002 [CPU 0] make
  184 c70dd1e4 e4049000 1005   183 83873 0004002 [SLP]wait 0xc70dd1e4] sh
  183 c70ae3c8 e3f96000 1005 99167 83873 0004002 [SLP]kqread 0xcd046000] make
99167 c712a790 e406c000 1005 96810 83873 0004002 [SLP]wait 0xc712a790] sh
99104 c67e0d3c e3e0f000 1005 96810 83873 0004002 [SLP]wait 0xc67e0d3c] sh
98527 c68f5000 e3e18000 1005 96820 83873 0004002 [SLP]kqread 0xc6fdd400] make
96820 c6ab73c8 e3f27000 1005 96810 83873 0004002 [SLP]wait 0xc6ab73c8] sh
96810 c718e5ac e4140000 1005 96271 83873 0004002 [SLP]kqread 0xc867b900] make
96271 c70acd3c e3f93000 1005 96264 83873 0004002 [SLP]wait 0xc70acd3c] sh
96264 c70d73c8 e3fec000 1005 96263 83873 0004002 [SLP]kqread 0xc6a8f100] make
96263 c70ac3c8 e3f8e000 1005 83901 83873 0004002 [SLP]wait 0xc70ac3c8] sh
83901 c6a97d3c e3f13000 1005 83899 83873 0004002 [SLP]kqread 0xc6feb700] make
83899 c712e1e4 e40bf000 1005 83873 83873 0004002 [SLP]wait 0xc712e1e4] sh
83873 c6ab7790 e3f29000 1005   593 83873 0004002 [SLP]kqread 0xc6fe4e00] make
83845 c6a1fd3c e3e75000 1001 83844 83845 0004002 [SLP]ttyin 0xc6802c40] bash
83844 c70da5ac e4043000 1001 83842 83842 0000100 [CV]select 0xc06418d4] sshd
83842 c70ddd3c e404f000    0   428 83842 0000100 [SLP]sbwait 0xc6819664] sshd
  593 c67e0b58 e3e0e000 1005   565   593 0004002 [SLP]wait 0xc67e0b58] bash
  582 c66af5ac e3db2000    0     0     0 0000204 [SLP]mdwait 0xc68d0e00] md0
  565 c67df790 e3ddd000 1001   564   565 0004002 [SLP]wait 0xc67df790] bash
  564 c68f53c8 e3e1a000 1001   562   562 0000100 [CV]select 0xc06418d4] sshd
  562 c68f55ac e3e42000    0   428   562 0000100 [SLP]sbwait 0xc6815a64] sshd
  561 c67df3c8 e3ddb000    0     1   561 0004002 [SLP]ttyin 0xc686cc10] getty
  560 c67dfb58 e3ddf000    0     1   560 0004002 [SLP]ttyin 0xc686d010] getty
  559 c67df974 e3dde000    0     1   559 0004002 [SLP]ttyin 0xc686d410] getty
  558 c67e0000 e3de1000    0     1   558 0004002 [SLP]ttyin 0xc686d810] getty
  557 c66493c8 e1b48000    0     1   557 0004002 [SLP]ttyin 0xc6680e10] getty
  556 c6649b58 e1b73000    0     1   556 0004002 [SLP]ttyin 0xc6754810] getty
  555 c67dfd3c e3de0000    0     1   555 0004002 [SLP]ttyin 0xc6754010] getty
  554 c67e01e4 e3de2000    0     1   554 0004002 [SLP]ttyin 0xc64d9a10] getty
  542 c66af1e4 e3db0000    0     1   542 0000000 [CV]select 0xc06418d4] inetd
  517 c66495ac e1b49000 1006   510   510 0004100 [LOCK  Giant c0619140] qmgr
  516 c6647000 e1b3e000 1006   510   510 0004100 [CV]select 0xc06418d4] pickup
  510 c66af000 e3d61000    0     1   510 0004100 [CV]select 0xc06418d4] master
  450 c66af974 e3db4000    0     1   450 0000000 [SLP]nanslp 0xc061b80c] cron
  428 c6649790 e1b4a000    0     1   428 0000100 [CV]select 0xc06418d4] sshd
  406 c66af790 e3db3000    0     1   406 0000000 [CV]select 0xc06418d4] ntpd
  284 c64eed3c e1b33000    0     1   284 0000000 [CV]select 0xc06418d4] rpcbind
  265 c66471e4 e1b3f000    0     1   265 0000000 [CV]select 0xc06418d4] syslogd
   37 c66475ac e1b41000    0     0     0 0000204 [SLP]syncer 0xc061b1c0] syncer
   36 c6647790 e1b42000    0     0     0 0000204 [SLP]vlruwt 0xc6647790] vnlru
   35 c6647974 e1b43000    0     0     0 0000204 [SLP]psleep 0xc0641d68] bufdaemon
   34 c6647b58 e1b44000    0     0     0 000020c [SLP]pgzero 0xc0645588] pagezero
   33 c6647d3c e1b45000    0     0     0 0000204 [SLP]psleep 0xc06455e0] vmdaemon
    9 c6649000 e1b46000    0     0     0 0000204 [SLP]psleep 0xc06455cc] pagedaemon
   32 c66491e4 e1b47000    0     0     0 0000204 new [IWAIT] irq8: rtc
   31 c646d5ac e1afa000    0     0     0 0000204 new [IWAIT] irq0: clk
   30 c646d790 e1afb000    0     0     0 0000204 new [IWAIT] irq3: sio1
   29 c646d974 e1afc000    0     0     0 0000204 new [IWAIT] irq4: sio0
   28 c646db58 e1afd000    0     0     0 0000204 [IWAIT] swi0: tty:sio
   27 c646dd3c e1afe000    0     0     0 0000204 new [IWAIT] irq7: ppc0
   26 c64ee000 e1b05000    0     0     0 0000204 [LOCK  Giant c0619140] irq11: fxp0
   25 c64ee1e4 e1b06000    0     0     0 0000204 new [IWAIT] irq15: ata1
   24 c64ee3c8 e1b07000    0     0     0 0000204 [IWAIT] irq14: ata0
    8 c64ee5ac e1b08000    0     0     0 0000204 [SLP]actask 0xc071224c] acpi_task2
    7 c64ee790 e1b09000    0     0     0 0000204 [SLP]actask 0xc071224c] acpi_task1
    6 c64ee974 e1b0a000    0     0     0 0000204 [SLP]actask 0xc071224c] acpi_task0
   23 c25ac1e4 e00c8000    0     0     0 0000204 new [IWAIT] irq9: acpi0
   22 c25ac3c8 e00c9000    0     0     0 0000204 new [IWAIT] swi3: cambio
   21 c25ac5ac e00ca000    0     0     0 0000204 new [IWAIT] swi2: camnet
   20 c25ac790 e00cb000    0     0     0 0000204 new [IWAIT] swi5:+
    5 c25ac974 e00cc000    0     1     0 0000204 [SLP]tqthr 0xc061cd68] taskqueue
   19 c25acb58 e00cd000    0     0     0 0000204 [IWAIT] swi7: acpitaskq
   18 c25acd3c e00f5000    0     0     0 0000204 new [IWAIT] swi6:+
   17 c646d000 e1af7000    0     0     0 0000204 [IWAIT] swi7: task queue
   16 c646d1e4 e1af8000    0     0     0 0000204 [SLP]- 0xc0608c60] random
    4 c646d3c8 e1af9000    0     0     0 0000204 [SLP]- 0xc06155e0] g_down
    3 c25a5000 e0071000    0     0     0 0000204 [SLP]- 0xc06155dc] g_up
    2 c25a51e4 e00c0000    0     0     0 0000204 [SLP]- 0xc06155d4] g_event
   15 c25a53c8 e00c1000    0     0     0 0000204 new [IWAIT] swi4: vm
   14 c25a55ac e00c2000    0     0     0 000020c [LOCK  Giant c0619140] swi8: tty:sio clock
   13 c25a5790 e00c3000    0     0     0 0000204 [IWAIT] swi1: net
   12 c25a5974 e00c4000    0     0     0 000020c [Can run] idle: cpu0
   11 c25a5b58 e00c5000    0     0     0 000020c [CPU 1] idle: cpu1
    1 c25a5d3c e00c6000    0     0     1 0004200 [SLP]wait 0xc25a5d3c] init
   10 c25ac000 e00c7000    0     0     0 0000204 [CV]ktrace 0xc0618b94] ktrace
    0 c0615680 c081f000    0     0     0 0000200 [SLP]sched 0xc0615680] swapper
db> show witness
Sleep locks:
0 taskqueue kthread -- last acquired @ /freebsd/testing/src/sys/kern/subr_taskqueue.c:253
0 g_xdown -- last acquired @ /freebsd/testing/src/sys/geom/geom_io.c:356
3  ATA queue lock -- last acquired @ /freebsd/testing/src/sys/dev/ata/ata-queue.c:174
9  Malloc Stats -- last acquired @ /freebsd/testing/src/sys/kern/kern_malloc.c:335
1  ATA disk bioqueue lock -- last acquired @ /freebsd/testing/src/sys/dev/ata/ata-disk.c:240
3  bio queue -- last acquired @ /freebsd/testing/src/sys/geom/geom_io.c:64
1  md bio queue -- last acquired @ /freebsd/testing/src/sys/dev/md/md.c:603
9  system map -- last acquired @ /freebsd/testing/src/sys/vm/vm_map.c:2910
10  kmem object -- last acquired @ /freebsd/testing/src/sys/vm/vm_object.c:433
11   vm page queue mutex -- last acquired @ /freebsd/testing/src/sys/vm/vm_fault.c:907
12    vnode interlock -- last acquired @ /freebsd/testing/src/sys/kern/vfs_subr.c:2176
13     vnode_free_list -- last acquired @ /freebsd/testing/src/sys/kern/vfs_subr.c:962
13     spechash -- last acquired @ /freebsd/testing/src/sys/kern/vfs_subr.c:1989
13     Syncer mtx -- last acquired @ /freebsd/testing/src/sys/kern/vfs_subr.c:1663
13     Name Cache -- last acquired @ /freebsd/testing/src/sys/kern/vfs_cache.c:849
12    UMA pcpu -- last acquired @ /freebsd/testing/src/sys/vm/uma_core.c:1390
13     KMAP ENTRY -- last acquired @ /freebsd/testing/src/sys/vm/uma_core.c:361
13     UMA zone -- last acquired @ /freebsd/testing/src/sys/vm/uma_core.c:1435
12    CMAPCADDR12 -- last acquired @ /freebsd/testing/src/sys/i386/i386/pmap.c:2475
10  vm object -- last acquired @ /freebsd/testing/src/sys/vm/vm_object.c:1787
11   vm page queue mutex -- (already displayed)
0 g_xup -- last acquired @ /freebsd/testing/src/sys/geom/geom_io.c:375
2  Giant -- last acquired @ /freebsd/testing/src/sys/kern/kern_event.c:444
3   kernel linker -- last acquired @ /freebsd/testing/src/sys/kern/kern_linker.c:430
3   malloc -- last acquired @ /freebsd/testing/src/sys/kern/kern_malloc.c:502
3   devstat -- last acquired @ /freebsd/testing/src/sys/kern/subr_devstat.c:190
3   eventhandler -- last acquired @ /freebsd/testing/src/sys/kern/subr_eventhandler.c:213
4    eventhandler list -- last acquired @ /freebsd/testing/src/sys/kern/kern_exit.c:210
3   ipflow list head -- last acquired @ /freebsd/testing/src/sys/netinet/ip_flow.c:288
3   vm object_list -- last acquired @ /freebsd/testing/src/sys/vm/vm_object.c:218
10   kmem object -- (already displayed)
10   vm object -- (already displayed)
3   ithread -- last acquired @ /freebsd/testing/src/sys/kern/kern_intr.c:265
3   sf_bufs list lock -- last acquired @ /freebsd/testing/src/sys/i386/i386/vm_machdep.c:577
9    Malloc Stats -- (already displayed)
9    system map -- (already displayed)
3   GEOM orphanage -- last acquired @ /freebsd/testing/src/sys/geom/geom_event.c:169
3   taskqueue list -- last acquired @ /freebsd/testing/src/sys/kern/subr_taskqueue.c:384
3   ipqlock -- last acquired @ /freebsd/testing/src/sys/netinet/ip_input.c:1237
3   accounting -- last acquired @ /freebsd/testing/src/sys/kern/kern_acct.c:232
3   rman head -- last acquired @ /freebsd/testing/src/sys/kern/subr_rman.c:110
3   ACPI semaphore -- last acquired @ /freebsd/testing/src/sys/dev/acpica/Osd/OsdSynch.c:320
3   rman -- last acquired @ /freebsd/testing/src/sys/kern/subr_rman.c:445
9    Malloc Stats -- (already displayed)
9    system map -- (already displayed)
3   devd -- last acquired @ /freebsd/testing/src/sys/kern/subr_bus.c:479
9    Malloc Stats -- (already displayed)
12   UMA pcpu -- (already displayed)
3   pseudofs_vncache -- last acquired @ /freebsd/testing/src/sys/fs/pseudofs/pseudofs_vncache.c:225
3   p_peers -- last acquired @ /freebsd/testing/src/sys/kern/kern_exit.c:252
3   bio queue -- (already displayed)
3   taskqueue -- last acquired @ /freebsd/testing/src/sys/kern/subr_taskqueue.c:210
3   pseudofs -- last acquired @ /freebsd/testing/src/sys/fs/pseudofs/pseudofs_fileno.c:86
3   acpica subsystem lock -- last acquired @ /freebsd/testing/src/sys/dev/acpica/Osd/OsdSynch.c:381
3   pbuf mutex -- last acquired @ /freebsd/testing/src/sys/vm/vm_pager.c:443
3   domain list -- last acquired @ /freebsd/testing/src/sys/kern/uipc_domain.c:114
3   dirhash list -- last acquired @ /freebsd/testing/src/sys/ufs/ufs/ufs_dirhash.c:342
4    dirhash -- last acquired @ /freebsd/testing/src/sys/ufs/ufs/ufs_dirhash.c:657
3   sem -- last acquired @ /freebsd/testing/src/sys/kern/sysv_sem.c:1158
3   mntvnode -- last acquired @ /freebsd/testing/src/sys/kern/vfs_subr.c:1054
12   vnode interlock -- (already displayed)
3   ufs ihash -- last acquired @ /freebsd/testing/src/sys/ufs/ufs/ufs_ihash.c:160
12   vnode interlock -- (already displayed)
3   mntid -- last acquired @ /freebsd/testing/src/sys/kern/vfs_subr.c:586
4    mountlist -- last acquired @ /freebsd/testing/src/sys/kern/vfs_subr.c:3500
3   ATA queue lock -- (already displayed)
3   netisr lock -- last acquired @ /freebsd/testing/src/sys/net/netisr.c:229
4    tcp -- last acquired @ /freebsd/testing/src/sys/netinet/tcp_usrreq.c:653
5     inp -- last acquired @ /freebsd/testing/src/sys/netinet/tcp_usrreq.c:670
6      random reseed -- last acquired @ /freebsd/testing/src/sys/dev/random/yarrow.c:172
6      radix node head -- last acquired @ /freebsd/testing/src/sys/netinet/if_ether.c:146
7       rtentry -- last acquired @ /freebsd/testing/src/sys/net/route.c:1170
8        ifaddr -- last acquired @ /freebsd/testing/src/sys/net/route.c:670
8        network driver -- last acquired @ /freebsd/testing/src/sys/dev/fxp/if_fxp.c:1279
9         Malloc Stats -- (already displayed)
9         if send queue -- last acquired @ /freebsd/testing/src/sys/dev/fxp/if_fxp.c:1320
9         mbuf PCPU list lock -- last acquired @ /freebsd/testing/src/sys/kern/subr_mbuf.c:645
10         mbuf subsystem general lists lock -- last acquired @ /freebsd/testing/src/sys/kern/subr_mbuf.c:676
9         system map -- (already displayed)
7       ifnet -- last acquired @ /freebsd/testing/src/sys/net/if.c:1172
6      arc4_mtx -- last acquired @ /freebsd/testing/src/sys/libkern/arc4random.c:137
6      ip_inq -- last acquired @ /freebsd/testing/src/sys/net/netisr.c:137
6      sellck -- last acquired @ /freebsd/testing/src/sys/kern/sys_generic.c:816
11     sleep mtxpool -- last acquired @ /freebsd/testing/src/sys/kern/kern_descrip.c:1807
4    arp_inq -- last acquired @ /freebsd/testing/src/sys/net/netisr.c:137
4    udp -- last acquired @ /freebsd/testing/src/sys/netinet/udp_usrreq.c:263
5     inp -- (already displayed)
5     UMA lock -- last acquired @ /freebsd/testing/src/sys/vm/uma_core.c:1201
9      Malloc Stats -- (already displayed)
9      system map -- (already displayed)
3   ACPI task -- last acquired @ /freebsd/testing/src/sys/dev/acpica/Osd/OsdSchedule.c:110
3   bdone lock -- last acquired @ /freebsd/testing/src/sys/kern/vfs_bio.c:3752
3   buffer daemon lock -- last acquired @ /freebsd/testing/src/sys/kern/vfs_bio.c:396
3   needsbuffer lock -- last acquired @ /freebsd/testing/src/sys/kern/vfs_bio.c:291
3   runningbufspace lock -- last acquired @ /freebsd/testing/src/sys/kern/vfs_bio.c:309
3   rtsock route_cb lock -- last acquired @ /freebsd/testing/src/sys/net/rtsock.c:190
3   buf queue lock -- last acquired @ /freebsd/testing/src/sys/kern/vfs_bio.c:1490
12   vnode interlock -- (already displayed)
3   linux osname -- last acquired @ /freebsd/frog/src/sys/compat/linux/linux_mib.c:225
3   fdesc -- last acquired @ /freebsd/testing/src/sys/kern/kern_descrip.c:1515
4    filedesc structure -- last acquired @ /freebsd/testing/src/sys/kern/kern_descrip.c:1837
9     mbuf PCPU list lock -- (already displayed)
5     pipe mutex -- last acquired @ /freebsd/testing/src/sys/kern/sys_pipe.c:1051
6      sellck -- (already displayed)
6      sigio lock -- last acquired @ /freebsd/testing/src/sys/kern/kern_descrip.c:587
7       process group -- last acquired @ /freebsd/testing/src/sys/kern/kern_fork.c:575
8        process lock -- last acquired @ /freebsd/testing/src/sys/kern/kern_condvar.c:456
9         ktrace -- last acquired @ /freebsd/testing/src/sys/kern/kern_fork.c:601
9         struct pargs.ref -- last acquired @ /freebsd/testing/src/sys/kern/kern_proc.c:1071
9         sigacts -- last acquired @ /freebsd/testing/src/sys/kern/kern_condvar.c:457
9         session -- last acquired @ /freebsd/testing/src/sys/kern/kern_fork.c:584
12         vnode interlock -- (already displayed)
10         uidinfo hash -- last acquired @ /freebsd/testing/src/sys/kern/kern_resource.c:878
11          sleep mtxpool -- (already displayed)
11          uidinfo struct -- last acquired @ order list:0
12           allprison -- last acquired @ /freebsd/testing/src/sys/kern/kern_jail.c:414
0 g_disk_done -- last acquired @ /freebsd/testing/src/sys/geom/geom_disk.c:183
3  bio queue -- (already displayed)
12 UMA pcpu -- (already displayed)
0 GEOM event stalling -- last acquired @ /freebsd/testing/src/sys/geom/geom_event.c:157
1  GEOM topology -- last acquired @ /freebsd/testing/src/sys/geom/geom_event.c:158
2   swapdev -- last acquired @ /freebsd/testing/src/sys/vm/swap_pager.c:2285
2   Giant -- (already displayed)
0 module subsystem sx lock -- last acquired @ /freebsd/testing/src/sys/kern/kern_module.c:331
0 sysctl lock -- last acquired @ /freebsd/testing/src/sys/kern/kern_sysctl.c:1273
2  swapdev -- (already displayed)
1  filelist lock -- last acquired @ /freebsd/testing/src/sys/kern/kern_descrip.c:1181
4   filedesc structure -- (already displayed)
1  rip -- last acquired @ /freebsd/testing/src/sys/netinet/raw_ip.c:743
1  allproc -- last acquired @ /freebsd/testing/src/sys/kern/kern_proc.c:255
2   Giant -- (already displayed)
0 kernel environment -- last acquired @ /freebsd/testing/src/sys/kern/kern_environment.c:288
0 proctree -- last acquired @ /freebsd/testing/src/sys/kern/kern_exit.c:573
1  allproc -- (already displayed)

Spin locks:

Locks which were never acquired:
pseudofs_fileno
msq
semid
ACPI global lock
strategy
bounce pages lock
umtx
securelevel mutex lock
jumbo mutex
UUID generator mutex lock
phys_pager list
dev_pager list
dev_pager create
swap_pager list
vm map sleep mutex
lockmgr
vm86 lock
%%

>Fix:
I have been using patches similar to this one for about a year; the
kqueue-enabled make(1) runs without problems with it.  Unfortunately the
mutex is acquired and released rather often in the loop in kqueue_scan()
to avoid deadlocks due to lock order reversals (LORs); I'm not sure it
has to be.  Maybe a spin lock would improve performance?  (I didn't do
any benchmarks.)

I mailed jlemon@ some time ago about the problem, but never got a
response.  FWIW, a very similar fix was committed to NetBSD in the
meantime.  
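
The locking pattern in the kqueue_scan() loop can be sketched in
userland as follows.  This is only an analogy under stated assumptions:
a pthread mutex replaces the kernel mutex, and scan_queue, item,
item_ready, sq_init, sq_enqueue and sq_scan are hypothetical names that
do not exist in the kernel.  The point is that an entry is unlinked
while the lock is held, the lock is dropped before the per-entry
callback runs (the callback may need other locks), and the lock is
taken again before the list is touched.  A marker entry bounds the pass.

```c
/* Userland analogy only; assumes a single scanner per queue at a time. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/queue.h>

struct item {
	TAILQ_ENTRY(item) link;
	int (*event)(struct item *);	/* hypothetical callback, may take other locks */
	int value;
};

struct scan_queue {
	TAILQ_HEAD(, item) head;
	int count;			/* queued items, not counting the marker */
	pthread_mutex_t mtx;		/* protects head and count */
};

/* Hypothetical callback: the item counts as "ready" if value is nonzero. */
static int
item_ready(struct item *it)
{
	return (it->value);
}

static void
sq_init(struct scan_queue *q)
{
	int rc;

	TAILQ_INIT(&q->head);
	q->count = 0;
	rc = pthread_mutex_init(&q->mtx, NULL);
	assert(rc == 0);
	(void)rc;
}

static void
sq_enqueue(struct scan_queue *q, struct item *it)
{
	pthread_mutex_lock(&q->mtx);
	TAILQ_INSERT_TAIL(&q->head, it, link);
	q->count++;
	pthread_mutex_unlock(&q->mtx);
}

/* One bounded pass over the queue; returns the number of ready items seen. */
static int
sq_scan(struct scan_queue *q, int maxevents)
{
	struct item marker = { .event = NULL };
	struct item *it;
	int delivered = 0;

	pthread_mutex_lock(&q->mtx);
	TAILQ_INSERT_TAIL(&q->head, &marker, link);
	while (delivered < maxevents) {
		it = TAILQ_FIRST(&q->head);
		TAILQ_REMOVE(&q->head, it, link);
		if (it == &marker) {
			/* Reached our own marker: the pass is complete. */
			pthread_mutex_unlock(&q->mtx);
			return (delivered);
		}
		q->count--;
		pthread_mutex_unlock(&q->mtx);
		/* The queue lock is not held while the callback runs. */
		if (it->event(it) != 0)
			delivered++;
		pthread_mutex_lock(&q->mtx);
	}
	/* Stopped early: the marker is still queued and must be removed. */
	TAILQ_REMOVE(&q->head, &marker, link);
	pthread_mutex_unlock(&q->mtx);
	return (delivered);
}
```

The cost of this shape is one unlock/lock pair per entry, which is the
overhead mentioned above.  Holding the lock across the callback would
avoid it but risks the lock order reversals the loop is trying to avoid.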

--- kqueue_lock.diff begins here ---
Index: src/sys/kern/kern_event.c
===================================================================
RCS file: /usr/home/ncvs/src/sys/kern/kern_event.c,v
retrieving revision 1.60
diff -u -r1.60 kern_event.c
--- src/sys/kern/kern_event.c	18 Jun 2003 18:16:39 -0000	1.60
+++ src/sys/kern/kern_event.c	13 Oct 2003 00:35:42 -0000
@@ -389,6 +389,7 @@
 		goto done2;
 	kq = malloc(sizeof(struct kqueue), M_KQUEUE, M_WAITOK | M_ZERO);
 	TAILQ_INIT(&kq->kq_head);
+	mtx_init(&kq->kq_mtx, "kqueue mutex", NULL, MTX_DEF);
 	FILE_LOCK(fp);
 	fp->f_flag = FREAD | FWRITE;
 	fp->f_type = DTYPE_KQUEUE;
@@ -662,7 +663,7 @@
 	struct kevent *kevp;
 	struct timeval atv, rtv, ttv;
 	struct knote *kn, marker;
-	int s, count, timeout, nkev = 0, error = 0;
+	int count, timeout, nkev = 0, error = 0;
 
 	FILE_LOCK_ASSERT(fp, MA_NOTOWNED);
 
@@ -704,15 +705,16 @@
 
 start:
 	kevp = kq->kq_kev;
-	s = splhigh();
+	mtx_lock(&kq->kq_mtx);
 	if (kq->kq_count == 0) {
 		if (timeout < 0) { 
 			error = EWOULDBLOCK;
+			mtx_unlock(&kq->kq_mtx);
 		} else {
 			kq->kq_state |= KQ_SLEEP;
-			error = tsleep(kq, PSOCK | PCATCH, "kqread", timeout);
+			error = msleep(kq, &kq->kq_mtx, PSOCK | PCATCH | PDROP,
+			    "kqread", timeout);
 		}
-		splx(s);
 		if (error == 0)
 			goto retry;
 		/* don't restart after signals... */
@@ -728,20 +730,22 @@
 		kn = TAILQ_FIRST(&kq->kq_head);
 		TAILQ_REMOVE(&kq->kq_head, kn, kn_tqe); 
 		if (kn == &marker) {
-			splx(s);
+			mtx_unlock(&kq->kq_mtx);
 			if (count == maxevents)
 				goto retry;
 			goto done;
 		}
+		kq->kq_count--;
+		mtx_unlock(&kq->kq_mtx);
 		if (kn->kn_status & KN_DISABLED) {
 			kn->kn_status &= ~KN_QUEUED;
-			kq->kq_count--;
+			mtx_lock(&kq->kq_mtx);
 			continue;
 		}
 		if ((kn->kn_flags & EV_ONESHOT) == 0 &&
 		    kn->kn_fop->f_event(kn, 0) == 0) {
 			kn->kn_status &= ~(KN_QUEUED | KN_ACTIVE);
-			kq->kq_count--;
+			mtx_lock(&kq->kq_mtx);
 			continue;
 		}
 		*kevp = kn->kn_kevent;
@@ -749,34 +753,34 @@
 		nkev++;
 		if (kn->kn_flags & EV_ONESHOT) {
 			kn->kn_status &= ~KN_QUEUED;
-			kq->kq_count--;
-			splx(s);
 			kn->kn_fop->f_detach(kn);
 			knote_drop(kn, td);
-			s = splhigh();
 		} else if (kn->kn_flags & EV_CLEAR) {
 			kn->kn_data = 0;
 			kn->kn_fflags = 0;
 			kn->kn_status &= ~(KN_QUEUED | KN_ACTIVE);
-			kq->kq_count--;
 		} else {
+			mtx_lock(&kq->kq_mtx);
+			kq->kq_count++;
 			TAILQ_INSERT_TAIL(&kq->kq_head, kn, kn_tqe); 
+			mtx_unlock(&kq->kq_mtx);
 		}
 		count--;
 		if (nkev == KQ_NEVENTS) {
-			splx(s);
 			error = copyout(&kq->kq_kev, ulistp,
 			    sizeof(struct kevent) * nkev);
 			ulistp += nkev;
 			nkev = 0;
 			kevp = kq->kq_kev;
-			s = splhigh();
-			if (error)
+			if (error) {
+				mtx_lock(&kq->kq_mtx);
 				break;
+			}
 		}
+		mtx_lock(&kq->kq_mtx);
 	}
 	TAILQ_REMOVE(&kq->kq_head, &marker, kn_tqe); 
-	splx(s);
+	mtx_unlock(&kq->kq_mtx);
 done:
 	if (nkev != 0)
 		error = copyout(&kq->kq_kev, ulistp,
@@ -900,6 +904,7 @@
 		}
 	}
 	FILEDESC_UNLOCK(fdp);
+	mtx_destroy(&kq->kq_mtx);
 	free(kq, M_KQUEUE);
 	fp->f_data = NULL;
 
@@ -1052,14 +1057,14 @@
 knote_enqueue(struct knote *kn)
 {
 	struct kqueue *kq = kn->kn_kq;
-	int s = splhigh();
 
 	KASSERT((kn->kn_status & KN_QUEUED) == 0, ("knote already queued"));
 
+	mtx_lock(&kq->kq_mtx);
 	TAILQ_INSERT_TAIL(&kq->kq_head, kn, kn_tqe); 
 	kn->kn_status |= KN_QUEUED;
 	kq->kq_count++;
-	splx(s);
+	mtx_unlock(&kq->kq_mtx);
 	kqueue_wakeup(kq);
 }
 
@@ -1067,14 +1072,14 @@
 knote_dequeue(struct knote *kn)
 {
 	struct kqueue *kq = kn->kn_kq;
-	int s = splhigh();
 
 	KASSERT(kn->kn_status & KN_QUEUED, ("knote not queued"));
 
+	mtx_lock(&kq->kq_mtx);
 	TAILQ_REMOVE(&kq->kq_head, kn, kn_tqe); 
 	kn->kn_status &= ~KN_QUEUED;
 	kq->kq_count--;
-	splx(s);
+	mtx_unlock(&kq->kq_mtx);
 }
 
 static void
Index: src/sys/sys/eventvar.h
===================================================================
RCS file: /usr/home/ncvs/src/sys/sys/eventvar.h,v
retrieving revision 1.4
diff -u -r1.4 eventvar.h
--- src/sys/sys/eventvar.h	18 Jul 2000 19:31:48 -0000	1.4
+++ src/sys/sys/eventvar.h	8 Mar 2003 19:42:34 -0000
@@ -35,6 +35,7 @@
 struct kqueue {
 	TAILQ_HEAD(kqlist, knote) kq_head;	/* list of pending event */
 	int		kq_count;		/* number of pending events */
+	struct		mtx kq_mtx;
 	struct		selinfo kq_sel;	
 	struct		filedesc *kq_fdp;
 	int		kq_state;
--- kqueue_lock.diff ends here ---
>Release-Note:
>Audit-Trail:
>Unformatted:


