Date:      Wed, 17 Jun 2020 11:39:38 -0500
From:      Kyle Evans <kevans@freebsd.org>
To:        Mateusz Guzik <mjguzik@gmail.com>
Cc:        src-committers <src-committers@freebsd.org>, svn-src-all <svn-src-all@freebsd.org>,  svn-src-head <svn-src-head@freebsd.org>
Subject:   Re: svn commit: r361967 - head/sys/kern
Message-ID:  <CACNAnaG8jDT8svqG2NRrqEU0u4GirHSbvstd2hyvVouDZPHRtA@mail.gmail.com>
In-Reply-To: <CAGudoHFUpFb6wBc=wxzwGJrOzER-xd6gU6pW6mLJLi6%2BvgYbqA@mail.gmail.com>
References:  <202006091517.059FHNS9050196@repo.freebsd.org> <CACNAnaHLmwemMtHLNA5QwCbFxnEkVc7D-kw1TCqNLh3v5QMJQw@mail.gmail.com> <CAGudoHFUpFb6wBc=wxzwGJrOzER-xd6gU6pW6mLJLi6%2BvgYbqA@mail.gmail.com>

On Wed, Jun 17, 2020 at 10:21 AM Mateusz Guzik <mjguzik@gmail.com> wrote:
>
> pho@ reported triggering one of the asserts:
> https://people.freebsd.org/~pho/stress/log/mjguzik028.txt
>
> I did not have the time to properly investigate this yet and this does
> not reproduce for me.
>

Ah, it's good to know that he's reproduced it. There are only so many
places where we touch these. I can't quite envision how, but the only
scenario in which this would seem to be possible is doenterpgrp() ->
fixjobc(p, p->p_pgrp, 0), which adjusts some child with a different
process group without actually changing it and orphans the group; we
then manage to finalize killjobc() on a freshly-orphaned process that
hasn't had its p_pgrp nullified.

I haven't yet traced it through completely enough to determine if
there's any way that can even happen.
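
To make the invariant concrete, here is a minimal userspace sketch of what
the new asserts check. It is not the kernel code: struct model_pgrp,
model_adjustjobc() and model_double_leave_detected() are hypothetical names
made up for illustration, locking is left out, and a violated precondition
is reported through the return value where the kernel would panic via
MPASS().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pgrp; only the counter is modeled. */
struct model_pgrp {
	int pg_jobc;	/* count of members qualifying the group for job control */
	bool orphaned;	/* set at the point where the kernel calls orphanpg() */
};

/*
 * Same shape as pgadjustjobc() after r361967, minus PGRP_LOCK().
 * Returns false where the corresponding MPASS() would have fired.
 */
bool
model_adjustjobc(struct model_pgrp *pg, bool entering)
{
	assert(pg != NULL);
	if (entering) {
		if (pg->pg_jobc < 0)	/* MPASS(pgrp->pg_jobc >= 0) */
			return (false);
		pg->pg_jobc++;
	} else {
		if (pg->pg_jobc <= 0)	/* MPASS(pgrp->pg_jobc > 0) */
			return (false);
		if (--pg->pg_jobc == 0)
			pg->orphaned = true;
	}
	return (true);
}

/*
 * One balanced enter/leave pair followed by a second, unmatched leave.
 * Returns true if the unmatched leave is rejected.
 */
bool
model_double_leave_detected(void)
{
	struct model_pgrp pg = { .pg_jobc = 0, .orphaned = false };

	if (!model_adjustjobc(&pg, true))	/* 0 -> 1 */
		return (false);
	if (!model_adjustjobc(&pg, false))	/* 1 -> 0, group orphaned */
		return (false);
	if (!pg.orphaned)
		return (false);
	/* The extra decrement is what the second assert catches. */
	return (!model_adjustjobc(&pg, false));
}
```

The unmatched leave in model_double_leave_detected() corresponds to the
"latter branch with pgrp->pg_jobc == 0" case quoted below; the sketch says
nothing about which kernel path actually produces the extra decrement.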

> That said, I may either revert the assert (or degrade to a warning) or
> add some commentary if I don't sort this out this week.
>
> On 6/17/20, Kyle Evans <kevans@freebsd.org> wrote:
> > On Tue, Jun 9, 2020 at 10:17 AM Mateusz Guzik <mjg@freebsd.org> wrote:
> >>
> >> Author: mjg
> >> Date: Tue Jun  9 15:17:23 2020
> >> New Revision: 361967
> >> URL: https://svnweb.freebsd.org/changeset/base/361967
> >>
> >> Log:
> >>   Assert on pg_jobc state.
> >>
> >>   Stolen from NetBSD.
> >>
> >> Modified:
> >>   head/sys/kern/kern_proc.c
> >>
> >> Modified: head/sys/kern/kern_proc.c
> >> ==============================================================================
> >> --- head/sys/kern/kern_proc.c   Tue Jun  9 14:20:16 2020        (r361966)
> >> +++ head/sys/kern/kern_proc.c   Tue Jun  9 15:17:23 2020        (r361967)
> >> @@ -751,9 +751,11 @@ pgadjustjobc(struct pgrp *pgrp, int entering)
> >>  {
> >>
> >>         PGRP_LOCK(pgrp);
> >> -       if (entering)
> >> +       if (entering) {
> >> +               MPASS(pgrp->pg_jobc >= 0);
> >>                 pgrp->pg_jobc++;
> >> -       else {
> >> +       } else {
> >> +               MPASS(pgrp->pg_jobc > 0);
> >>                 --pgrp->pg_jobc;
> >>                 if (pgrp->pg_jobc == 0)
> >>                         orphanpg(pgrp);
> >
> > We seem to be doing something wrong here, but I'm still working on
> > reproducing it on a machine that actually has usable swap configured
> > to get a dump. I've hit one of these asserts just from hitting the
> > power button to shut off from within xfce (hitting the latter branch
> > with pgrp->pg_jobc == 0 IIRC) and another laptop panicked sometime
> > overnight -- I wasn't able to catch the second one at all, because
> > it's hooked up to a switch that loses its mind once that laptop
> > panics and it had to be rebooted before I could attend to it to
> > revive network for the other machines on the switch.
> >
> > Thanks,
> >
> > Kyle Evans
> >
>
>
> --
> Mateusz Guzik <mjguzik gmail.com>
