Date: Wed, 17 Jun 2020 11:39:38 -0500
From: Kyle Evans <kevans@freebsd.org>
To: Mateusz Guzik <mjguzik@gmail.com>
Cc: src-committers <src-committers@freebsd.org>, svn-src-all <svn-src-all@freebsd.org>, svn-src-head <svn-src-head@freebsd.org>
Subject: Re: svn commit: r361967 - head/sys/kern
Message-ID: <CACNAnaG8jDT8svqG2NRrqEU0u4GirHSbvstd2hyvVouDZPHRtA@mail.gmail.com>
In-Reply-To: <CAGudoHFUpFb6wBc=wxzwGJrOzER-xd6gU6pW6mLJLi6%2BvgYbqA@mail.gmail.com>
References: <202006091517.059FHNS9050196@repo.freebsd.org> <CACNAnaHLmwemMtHLNA5QwCbFxnEkVc7D-kw1TCqNLh3v5QMJQw@mail.gmail.com> <CAGudoHFUpFb6wBc=wxzwGJrOzER-xd6gU6pW6mLJLi6%2BvgYbqA@mail.gmail.com>
On Wed, Jun 17, 2020 at 10:21 AM Mateusz Guzik <mjguzik@gmail.com> wrote:
>
> pho@ reported triggering one of the asserts:
> https://people.freebsd.org/~pho/stress/log/mjguzik028.txt
>
> I did not have the time to properly investigate this yet and this does
> not reproduce for me.
>

Ah, it's good to know that he's reproduced it. There are only so many
places where we touch these. I can't quite envision how, but the only
scenario in which this would seem to be possible is:

doenterpgrp() -> fixjobc(p, p->p_pgrp, 0) -> adjusts some child with a
different process group without actually changing it, orphans the
group, then we manage to finalize killjobc() on a freshly-orphaned
process, which hasn't had its p_pgrp nullified.

I haven't yet traced it through completely enough to determine whether
there's any way that can even happen.

> That said, I may either revert the assert (or degrade to a warning) or
> add some commentary if I don't sort this out this week.
>
> On 6/17/20, Kyle Evans <kevans@freebsd.org> wrote:
> > On Tue, Jun 9, 2020 at 10:17 AM Mateusz Guzik <mjg@freebsd.org> wrote:
> >>
> >> Author: mjg
> >> Date: Tue Jun 9 15:17:23 2020
> >> New Revision: 361967
> >> URL: https://svnweb.freebsd.org/changeset/base/361967
> >>
> >> Log:
> >>   Assert on pg_jobc state.
> >>
> >>   Stolen from NetBSD.
> >>
> >> Modified:
> >>   head/sys/kern/kern_proc.c
> >>
> >> Modified: head/sys/kern/kern_proc.c
> >> ==============================================================================
> >> --- head/sys/kern/kern_proc.c Tue Jun 9 14:20:16 2020 (r361966)
> >> +++ head/sys/kern/kern_proc.c Tue Jun 9 15:17:23 2020 (r361967)
> >> @@ -751,9 +751,11 @@ pgadjustjobc(struct pgrp *pgrp, int entering)
> >>  {
> >>
> >>         PGRP_LOCK(pgrp);
> >> -       if (entering)
> >> +       if (entering) {
> >> +               MPASS(pgrp->pg_jobc >= 0);
> >>                 pgrp->pg_jobc++;
> >> -       else {
> >> +       } else {
> >> +               MPASS(pgrp->pg_jobc > 0);
> >>                 --pgrp->pg_jobc;
> >>                 if (pgrp->pg_jobc == 0)
> >>                         orphanpg(pgrp);
> >
> > We seem to be doing something wrong here, but I'm still working on
> > reproducing it on a machine that actually has usable swap configured
> > to get a dump. I've hit one of these asserts just from hitting the
> > power button to shut off from within xfce (hitting the latter branch
> > with pgrp->pg_jobc == 0 IIRC) and another laptop panicked sometime
> > overnight -- I wasn't able to catch the second one at all, because
> > it's hooked up to a switch that loses its mind once that laptop
> > panics and it had to be rebooted before I could attend to it to
> > revive network for the other machines on the switch.
> >
> > Thanks,
> >
> > Kyle Evans
> >
>
>
> --
> Mateusz Guzik <mjguzik gmail.com>
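
For reference, the condition the new asserts check can be modelled in
userspace. This is only a minimal sketch, not the kernel code:
struct model_pgrp, model_pgadjustjobc() and model_selfcheck() are
hypothetical names made up for illustration, locking is omitted, and a
false return stands in for the panic that MPASS() would cause. It shows
that the second assert fires whenever a "leaving" adjustment arrives
without a matching "entering" one; it says nothing about which kernel
path actually produces that sequence.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical userspace stand-in for the kernel's struct pgrp; only the
 * job-control counter is modelled, and there is no PGRP_LOCK().
 */
struct model_pgrp {
	int	pg_jobc;	/* the counter r361967 asserts on */
	bool	orphaned;	/* set where the kernel calls orphanpg() */
};

/*
 * Follows the control flow of pgadjustjobc() after r361967. Where the
 * kernel would trip an MPASS(), this returns false and leaves the
 * counter untouched.
 */
static bool
model_pgadjustjobc(struct model_pgrp *pgrp, bool entering)
{
	if (entering) {
		if (pgrp->pg_jobc < 0)		/* MPASS(pg_jobc >= 0) */
			return (false);
		pgrp->pg_jobc++;
	} else {
		if (pgrp->pg_jobc <= 0)		/* MPASS(pg_jobc > 0) */
			return (false);
		--pgrp->pg_jobc;
		if (pgrp->pg_jobc == 0)
			pgrp->orphaned = true;	/* orphanpg(pgrp) */
	}
	return (true);
}

/* One balanced enter/leave pair, then an unmatched leave. */
void
model_selfcheck(void)
{
	struct model_pgrp pg = { 0, false };

	assert(model_pgadjustjobc(&pg, true));		/* 0 -> 1 */
	assert(model_pgadjustjobc(&pg, false));		/* 1 -> 0, orphaned */
	assert(pg.orphaned);
	assert(!model_pgadjustjobc(&pg, false));	/* would panic */
}
```

The unmatched leave at the end corresponds to the case described in the
quoted text above, where the latter branch is entered with
pgrp->pg_jobc == 0.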