Date: Fri, 16 Nov 2018 17:56:54 +0100
From: Sylvain GALLIANO <sg@efficientip.com>
To: markj@freebsd.org
Cc: freebsd-current@freebsd.org
Subject: Re: Panic on kern_event.c
Message-ID: <CAHdyrku4OPRr1Ku0WF3XT3vK_dqvNzWN%2BMYz7pXTkiNJakfJGQ@mail.gmail.com>
In-Reply-To: <20181116154210.GB17379@raichu>
References: <CAHdyrkvqGp8PGFaCSGgeDFC7wBhjnHK4eL99WM5fMO_yZ_u5KA@mail.gmail.com> <20181107043503.GB30861@raichu> <CAHdyrkt42cn8%2BKqhp-jQ9iZNnreypMT1qybNTcFtx8JivKggZA@mail.gmail.com> <20181115221019.GA2514@raichu> <CAHdyrksHLvzXDkjoy2PpiTgb%2BmEKHJ979rwcW3RJx32qdAyJzg@mail.gmail.com> <20181116154210.GB17379@raichu>
On Fri, Nov 16, 2018 at 16:42, Mark Johnston <markj@freebsd.org> wrote:
> On Fri, Nov 16, 2018 at 03:47:39PM +0100, Sylvain GALLIANO wrote:
> > On Thu, Nov 15, 2018 at 23:10, Mark Johnston <markj@freebsd.org> wrote:
> >
> > > On Thu, Nov 08, 2018 at 05:05:03PM +0100, Sylvain GALLIANO wrote:
> > > > Hi,
> > > >
> > > > I replaced
> > > > << printf("XXX knote %p already in tailq status:%x kq_count:%d [%p %p] %u\n",kn,kn->kn_status,kq->kq_count,kn->kn_tqe.tqe_next,kn->kn_tqe.tqe_prev,__LINE__);
> > > > by
> > > > >> panic("XXX knote %p already in tailq status:%x kq_count:%d [%p %p] %u\n",kn,kn->kn_status,kq->kq_count,kn->kn_tqe.tqe_next,kn->kn_tqe.tqe_prev,__LINE__);
> > > >
> > > > Here is the stack during panic:
> > > > panic: XXX knote 0xfffff801e1c6ddc0 already in tailq status:1 kq_count:2 [0 0xfffff8000957a978] 2671
> > > >
> > > Could you please give the following patch a try?
> > >
> > > If possible, could you also ktrace one of the active syslog-ng
> > > processes for some time, perhaps 15 seconds, and share the kdump?
> > > I have been trying to reproduce the problem without any luck.
> > >
> > Unfortunately, the patched kernel is not stable:
> > - some processes run at 100% CPU (STOP state) and cannot be killed
> > - sometimes the system freezes completely (needs a hard reboot)
> >
> > I cannot reproduce the issue once syslog-ng is under ktrace (even
> > after 10GB of ktrace file).
> > When I stop ktrace, the issue comes back after a few minutes.
>
> That's ok, I'd like to see part of the ktrace even if the problem
> doesn't occur; this bug appears to be a race condition, so it's not
> surprising that ktrace might hide it.
>
Got lucky with ktrace this time; the issue occurred twice:
Nov 16 16:13:29 solid kernel: XXX knote 0xfffff8003282fb40 already in tailq status:1 kq_count:1 [0 0xfffff80032883138] 2671
Nov 16 16:14:39 solid kernel: XXX knote 0xfffff8003282f3c0 already in tailq status:1 kq_count:1 [0 0xfffff80032883138] 2671
ktrace.out.xz is located at:
https://drive.google.com/drive/folders/1MbqJQm12-KOYDbb4-9uNRTnAdsNqLaIP?usp=sharing
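
In case it helps when comparing these reports, here is a minimal userland sketch (not part of any kernel patch) that pulls the fields out of one diagnostic line, following the format string of the printf quoted earlier in this thread. The names knote_report and parse_knote_report are hypothetical, made up for illustration. Pointers are read as plain hex integers rather than with %p, because scanf's handling of %p is implementation-defined.

```c
#include <stdio.h>
#include <string.h>

/* Fields of one "XXX knote ... already in tailq" diagnostic line. */
struct knote_report {
	unsigned long long kn;        /* knote address (printed with %p) */
	unsigned int status;          /* kn->kn_status (printed with %x) */
	int kq_count;                 /* kq->kq_count (printed with %d) */
	unsigned long long tqe_next;  /* kn->kn_tqe.tqe_next (%p) */
	unsigned long long tqe_prev;  /* kn->kn_tqe.tqe_prev (%p) */
	unsigned int line;            /* __LINE__ (printed with %u) */
};

/*
 * Parse one log line. Any syslog prefix before "XXX knote" is skipped.
 * Returns 1 if all six fields were read, 0 otherwise.
 */
int
parse_knote_report(const char *msg, struct knote_report *r)
{
	const char *p = strstr(msg, "XXX knote");
	int n;

	if (p == NULL)
		return (0);
	n = sscanf(p,
	    "XXX knote %llx already in tailq status:%x kq_count:%d [%llx %llx] %u",
	    &r->kn, &r->status, &r->kq_count,
	    &r->tqe_next, &r->tqe_prev, &r->line);
	return (n == 6);
}
```

Called on the first of the two log lines above, this should give status 0x1, kq_count 1, tqe_next 0 and line 2671.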
