Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 18 Aug 2020 16:13:30 +0200
From:      Mateusz Guzik <mjguzik@gmail.com>
To:        Mark Johnston <markj@freebsd.org>
Cc:        current@freebsd.org
Subject:   Re: "panic: malloc(M_WAITOK) in non-sleepable context" after r364296 -> r364341
Message-ID:  <CAGudoHF7o=DqmXXwjbQuk5EY62_WP9nDos=wgCOYtvEnUMMx3Q@mail.gmail.com>
In-Reply-To: <20200818140419.GB28906@raichu>
References:  <20200818124419.GO1394@albert.catwhisker.org> <CAGudoHEEnZVaBwY1L-wjmDcrsHyP4pE3-0qyurh-kWsCO9Edhg@mail.gmail.com> <20200818125825.GP1394@albert.catwhisker.org> <CAGudoHHJLoQhU%2Bb4%2Btt_dmChYj=qXA%2BiVaC5mFW8sFxoqq54pw@mail.gmail.com> <20200818134900.GA28906@raichu> <CAGudoHF0xTrw8h24iUOMr6YdL-z2Mce1z_ur0=bRvMQL1PD0Sg@mail.gmail.com> <20200818140419.GB28906@raichu>

next in thread | previous in thread | raw e-mail | index | archive | help
On 8/18/20, Mark Johnston <markj@freebsd.org> wrote:
> On Tue, Aug 18, 2020 at 03:55:15PM +0200, Mateusz Guzik wrote:
>> On 8/18/20, Mark Johnston <markj@freebsd.org> wrote:
>> > On Tue, Aug 18, 2020 at 03:24:52PM +0200, Mateusz Guzik wrote:
>> >> Try this:
>> >>
>> >> diff --git a/sys/kern/kern_malloc.c b/sys/kern/kern_malloc.c
>> >> index 46eb1c4347c..7b94ca7b880 100644
>> >> --- a/sys/kern/kern_malloc.c
>> >> +++ b/sys/kern/kern_malloc.c
>> >> @@ -618,8 +618,8 @@ void *
>> >>         unsigned long osize = size;
>> >>  #endif
>> >>
>> >> -       KASSERT((flags & M_WAITOK) == 0 || THREAD_CAN_SLEEP(),
>> >> -           ("malloc(M_WAITOK) in non-sleepable context"));
>> >> +       if ((flags & M_WAITOK) != 0)
>> >> +               WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK, NULL,
>> >> __func__);
>> >>
>> >>  #ifdef MALLOC_DEBUG
>> >>         va = NULL;
>> >> diff --git a/sys/vm/uma_core.c b/sys/vm/uma_core.c
>> >> index 37d78354200..2e1267ec02f 100644
>> >> --- a/sys/vm/uma_core.c
>> >> +++ b/sys/vm/uma_core.c
>> >> @@ -3355,8 +3355,8 @@ uma_zalloc_arg(uma_zone_t zone, void *udata, int
>> >> flags)
>> >>         uma_cache_bucket_t bucket;
>> >>         uma_cache_t cache;
>> >>
>> >> -       KASSERT((flags & M_WAITOK) == 0 || THREAD_CAN_SLEEP(),
>> >> -           ("uma_zalloc(M_WAITOK) in non-sleepable context"));
>> >> +       if ((flags & M_WAITOK) != 0)
>> >> +               WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK, NULL,
>> >> __func__);
>> >>
>> >>         /* Enable entropy collection for RANDOM_ENABLE_UMA kernel
>> >> option
>> >> */
>> >>         random_harvest_fast_uma(&zone, sizeof(zone), RANDOM_UMA);
>> >
>> > Doesn't it only print a warning if non-sleepable locks are held?
>> > THREAD_CAN_SLEEP() catches other cases, like epoch sections and worker
>> > threads that are not allowed to sleep (CAM doneq threads in this case).
>> > Otherwise uma_zalloc_debug() already checks with WITNESS.
>> >
>> > I'm inclined to simply revert the commit for now.  Alternately,
>> > disk_alloc() could be modified to handle M_NOWAIT/M_WAITOK flags, and
>> > that could be used in daregister() and other CAM periph drivers.
>> > daregister() already uses M_NOWAIT to allocate the driver softc itself.
>> >
>>
>> If WITNESS_WARN(.., WARN_SLEEPOK) does not check for all possible
>> blockers for going off CPU that's a bug. I do support reverting the
>> patch for now until the dust settles. I don't propose the above hack
>> as the final fix.
>
> Well, IMO the bug is that we have no subroutine to perform all of these
> checks (which includes those done by WITNESS_WARN(..., WARN_SLEEPOK)).
> WITNESS' responsibilities are more narrow.  I just meant that your patch
> effectively reverts Gleb's commit.
>

not exactly, as it still adds something for malloc.

I argued for a dedicated routine to perform all the relevant checks.
Note a dedicated routine instead of a handrolled call to
witness/panic/whatever can also report the exact blockers (if it knows
them), in this case what lock was taken and where. I would not mind
any extras (e.g., critnest).

>> As for the culrpit at hand, given the backtrace:
>> devfs_alloc() at devfs_alloc+0x28/frame 0xffffffff8218d570
>> make_dev_sv() at make_dev_sv+0x97/frame 0xffffffff8218d600
>> make_dev_s() at make_dev_s+0x3b/frame 0xffffffff8218d660
>> passregister() at passregister+0x3e7/frame 0xffffffff8218d8b0
>>
>> I think this is missing wait flags, resulting in M_WAITOK later, i.e.:
>
> Ah, I was looking at a different report.  All the more reason to revert
> for now.
>


-- 
Mateusz Guzik <mjguzik gmail.com>



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAGudoHF7o=DqmXXwjbQuk5EY62_WP9nDos=wgCOYtvEnUMMx3Q>