Date: Fri, 16 Nov 2018 17:56:54 +0100
From: Sylvain GALLIANO <sg@efficientip.com>
To: markj@freebsd.org
Cc: freebsd-current@freebsd.org
Subject: Re: Panic on kern_event.c
Message-ID: <CAHdyrku4OPRr1Ku0WF3XT3vK_dqvNzWN%2BMYz7pXTkiNJakfJGQ@mail.gmail.com>
In-Reply-To: <20181116154210.GB17379@raichu>
References: <CAHdyrkvqGp8PGFaCSGgeDFC7wBhjnHK4eL99WM5fMO_yZ_u5KA@mail.gmail.com> <20181107043503.GB30861@raichu> <CAHdyrkt42cn8%2BKqhp-jQ9iZNnreypMT1qybNTcFtx8JivKggZA@mail.gmail.com> <20181115221019.GA2514@raichu> <CAHdyrksHLvzXDkjoy2PpiTgb%2BmEKHJ979rwcW3RJx32qdAyJzg@mail.gmail.com> <20181116154210.GB17379@raichu>

On Fri, Nov 16, 2018 at 16:42, Mark Johnston <markj@freebsd.org> wrote:

> On Fri, Nov 16, 2018 at 03:47:39PM +0100, Sylvain GALLIANO wrote:
> > On Thu, Nov 15, 2018 at 23:10, Mark Johnston <markj@freebsd.org> wrote:
> >
> > > On Thu, Nov 08, 2018 at 05:05:03PM +0100, Sylvain GALLIANO wrote:
> > > > Hi,
> > > >
> > > > I replaced
> > > > << printf("XXX knote %p already in tailq status:%x kq_count:%d [%p %p] %u\n",kn,kn->kn_status,kq->kq_count,kn->kn_tqe.tqe_next,kn->kn_tqe.tqe_prev,__LINE__);
> > > > by
> > > > >> panic("XXX knote %p already in tailq status:%x kq_count:%d [%p %p] %u\n",kn,kn->kn_status,kq->kq_count,kn->kn_tqe.tqe_next,kn->kn_tqe.tqe_prev,__LINE__);
> > > >
> > > > Here is the stack during panic:
> > > > panic: XXX knote 0xfffff801e1c6ddc0 already in tailq status:1 kq_count:2
> > > > [0 0xfffff8000957a978] 2671
> > > >
> > >
> > > Could you please give the following patch a try?
> > >
> > > If possible, could you also ktrace one of the active syslog-ng processes
> > > for some time, perhaps 15 seconds, and share the kdump? I have been
> > > trying to reproduce the problem without any luck.
> > >
> > Unfortunately the patched kernel is not stable:
> > - some processes run at 100% CPU (STOP state) and cannot be killed
> > - sometimes the system completely freezes (needs a hard reboot)
> >
> > I cannot reproduce the issue as long as syslog-ng is under ktrace (even
> > after 10GB of ktrace file).
> > When I stop ktrace, the issue comes back after a few minutes.
>
> That's ok, I'd like to see part of the ktrace even if the problem
> doesn't occur; this bug appears to be a race condition, so it's not
> surprising that ktrace might hide it.
>

Lucky with ktrace this time, the issue occurred 2 times:

Nov 16 16:13:29 solid kernel: XXX knote 0xfffff8003282fb40 already in tailq status:1 kq_count:1 [0 0xfffff80032883138] 2671
Nov 16 16:14:39 solid kernel: XXX knote 0xfffff8003282f3c0 already in tailq status:1 kq_count:1 [0 0xfffff80032883138] 2671

ktrace.out.xz is located at:
https://drive.google.com/drive/folders/1MbqJQm12-KOYDbb4-9uNRTnAdsNqLaIP?usp=sharing