Date: Sun, 31 Aug 2014 13:34:19 -0700
From: John-Mark Gurney <jmg@funkthat.com>
To: "Hiroo Ono (????????????)" <hiroo.ono+freebsd@gmail.com>
Cc: freebsd-current@freebsd.org
Subject: Re: Kernel page fault with non-sleepable locks held error with kernel r270837
Message-ID: <20140831203419.GU71691@funkthat.com>
In-Reply-To: <CANtk6Siqth%2BT_GGiW5OaE=cMJSNoBGToRx2QmcaiTNuLhmJ7Zg@mail.gmail.com>
References: <CANtk6SjbySdLt6m2zmkDSSeU3Hhisd-mzGKVaSSMOZJfUtnFXA@mail.gmail.com> <20140831064718.GT71691@funkthat.com> <CANtk6Siqth%2BT_GGiW5OaE=cMJSNoBGToRx2QmcaiTNuLhmJ7Zg@mail.gmail.com>

Hiroo Ono (????????????) wrote this message on Sun, Aug 31, 2014 at 20:43 +0900:
> Thank you for taking a look into this.
>
> 2014-08-31 15:47 GMT+09:00 John-Mark Gurney <jmg@funkthat.com>:
> > Hiroo Ono (????????????) wrote this message on Sun, Aug 31, 2014 at 14:01 +0900:
> >> During upgrading world and kernel from r26939 to r270837, I got the
> >> following problem.
> >> a) the arch is i386
> >> b) kernel is of r270837, userland is of r26939 (make kernel is done
> >> and rebooted, make installworld not yet).
> >> c) booting in single user mode is OK.
> >> d) during startup of multi-user mode, when dhclient is run, the
> >> following message appears, and the system freezes:
> >>
> >> Starting devd.
> >> wlan0: link state changed to UP
> >> Starting webcamd.
> >> Attached to ugen4.2[0]
> >> Starting webcready running for ugen4.2.0
> >> /usr/local/etc/rc.d/webcamd: WARNING: failed to start webcamd
> >> Starting dhclient.
> >> DHCPREQUEST on wlan0 to 255.255.255.255 port 67
> >> DHCPACK from 192.168.8.2
> >> Kernel page fault with the following non-sleepable locks held:
> >> exclusive sleep mutex so_rcv (so_rcv) r = 0 (0xc713f078) locked @
> >> /usr/src/sys/kern/kern_event.c:2005
> >
> > I'm puzzled by this line number... This line number doesn't do any
> > locks, it is in the function knlist_remove_inevent...
>
> The line 2005 is "mtx_lock((struct mtx *)arg);" of knlist_mtx_lock()
> https://svnweb.freebsd.org/base/head/sys/kern/kern_event.c?revision=268843&view=markup#l2005
>
> this function is assigned to (struct knlist *)->kn_lock in knlist_init()
> https://svnweb.freebsd.org/base/head/sys/kern/kern_event.c?revision=268843&view=markup#l2058

Sorry, turns out I had a local patch to my kern_event.c...

Can you find out what line the filt_soread is on? This will help figure
out if it's kn or so... If you could get the address of the page fault,
that would also be helpful...
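
For readers following the quoted explanation above (knlist_mtx_lock() being installed as the list's lock hook by knlist_init()), here is a rough userland sketch of that callback pattern. It is only a model under stated assumptions, not the kernel code: every name ending in _model is hypothetical, pthread mutexes stand in for kernel mutexes, and the `held` flag stands in for a lock assertion. The point it tries to show is that the lock is always taken inside the hook, which is probably why the witness report names the hook's line in kern_event.c rather than the caller's.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a socket receive buffer and its mutex. */
struct sockbuf_model {
	pthread_mutex_t mtx;
	int held;	/* 1 while mtx is held; lets the filter assert ownership */
	int cc;		/* bytes waiting to be read */
};

/* Hypothetical stand-in for the lock hooks a knote list carries. */
struct knlist_model {
	void (*lock)(void *);
	void (*unlock)(void *);
	void *lockarg;
};

/* Lock hook: the only place in this model where the mutex is acquired. */
static void
model_mtx_lock(void *arg)
{
	struct sockbuf_model *sb = arg;

	pthread_mutex_lock(&sb->mtx);
	sb->held = 1;
}

/* Unlock hook: clears the ownership flag before releasing the mutex. */
static void
model_mtx_unlock(void *arg)
{
	struct sockbuf_model *sb = arg;

	sb->held = 0;
	pthread_mutex_unlock(&sb->mtx);
}

/* Store the hooks and their argument in the list (assumed init step). */
static void
knlist_model_init_mtx(struct knlist_model *knl, struct sockbuf_model *sb)
{
	knl->lock = model_mtx_lock;
	knl->unlock = model_mtx_unlock;
	knl->lockarg = sb;
}

/* Read filter: expects the buffer lock to be held by the caller. */
static int
filt_read_model(struct sockbuf_model *sb)
{
	assert(sb->held);
	return (sb->cc > 0);
}

/* Registration path: lock through the hook, run the filter, unlock. */
static int
register_model(struct knlist_model *knl, struct sockbuf_model *sb)
{
	int ready;

	knl->lock(knl->lockarg);
	ready = filt_read_model(sb);
	knl->unlock(knl->lockarg);
	return (ready);
}
```

In this model a caller would initialise a sockbuf_model, pass it to knlist_model_init_mtx(), and then call register_model(); the filter runs with the mutex held, so a fault inside the filter would be reported with that mutex still owned.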

Ok, a similar fix was committed in r133794, and a quick look at the code
doesn't show any knote's that are allocated on the stack anymore...

> >> KDB stack backtrace:
> >> rapper+0x2d/frame 0xe8f42710
> >> kdb_backtrace(c11aaf80,0,c713f078,c119a9e8,7d5,...) at 0xc0b4b160 =
> >> kdb_backtrace+0x30/frame 0xe8f42778
> >> witness_warn(5,0,c136b0a0,76fb000,c1833d58,...) at 0xc8b68a52 =
> >> witness_warn+0x402/frame 0xe8f427c8
> >> trap_pfault(18,3fd,c0dcc2d0,c1f64a80,c75fa000,...) at 0xc102f46b =
> >> trap_pfault+0x5b/frame 0xe8f42840
> >> trap(e8f42988) at 0xc102edcf = trap+0x6cf/frame 0xe8f4297c
> >> calltrap() at 0xc1017c4c = calltrap+0x6/frame 0xe8f4297c
> >> filt_soread(c75f7828,0,c119a9e8,48d,0,...) at 0xc0b9837d =
> >> filt_soread+0x9d/frame 0xe8f429f0
> >> kqueue_register(c6f59310,1,1,4f5,0,...) at 0xc0ad1457 =
> >> kqueue_register+0x807/frame 0xe8f42a68
> >> kern_kevent(c6f59310,7,12c217ce1 = Xint0x80), eip =
>
> calltrap() seems to be invoked by
> SOCKBUF_LOCK_ASSERT(&so->so_rcv);
> of filt_soread() in sys/kern/uipc_socket.c
> https://svnweb.freebsd.org/base/head/sys/kern/uipc_socket.c?revision=270664&view=markup#l3250
>
> but I do not know where &so->so_rcv was previously locked.
> knlist_init_mtx (which then calls knlist_init) is called with
> so->so_rcv in sys/kern/uipc_socket.c in
> line 517: socreate()
> https://svnweb.freebsd.org/base/head/sys/kern/uipc_socket.c?revision=270664&view=markup#l517
> and
> line 606: sonewconn()
> https://svnweb.freebsd.org/base/head/sys/kern/uipc_socket.c?revision=270664&view=markup#l606
>
> so the problem may be around there.
> but, I cannot track any further. the system freezes, so I cannot deal with ddb.
>
> > But notice the knlist_remove_inevent doesn't appear in the back
> > trace...
> >
> > Can you confirm that your kern_event.c is:
> > __FBSDID("$FreeBSD: head/sys/kern/kern_event.c 268843 2014-07-18 14:27:04Z bapt
> > $");
>
> I checked that it was this revision.
>
> >> instruction poi = 0x28:0xe8f429f0 fff, type 0x1b
> >> DHCPREQUEST on wlan0 to 255.255.255.255 port 67
> >> DHCPACK from 192.168.8.2
> >>
> >> e) kernel configuration differs from GENERIC on the following point
> >> options VIMAGE
> >> options DDB_NUMSYM
> >> nocpu I486_CPU
> >> nooptions VESA
> >>
> _______________________________________________
> freebsd-current@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"

--
John-Mark Gurney    Voice: +1 415 225 5579

"All that I will do, has been done, All that I have, has not."