Date:      Wed, 14 Dec 2011 20:47:20 +0100
From:      Monthadar Al Jaberi <monthadar@gmail.com>
To:        John Baldwin <jhb@freebsd.org>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: loop inside uma_zfree critical section
Message-ID:  <CA%2BsBSoK-sT4t1Xi1NtUykH1O7j1zaQJWiHSqDGFJKiWYxHebcw@mail.gmail.com>
In-Reply-To: <CA%2BsBSoLOndp288yrH_w5W3MwhjtUZ4Dp2edGs-i9WQfK9oLvNg@mail.gmail.com>
References:  <CA%2BsBSoJrRf8t6KJQy6xwa_VoH67cYWo5ZUZKBTEwLwrx%2BiXknw@mail.gmail.com> <201112130935.33975.jhb@freebsd.org> <CA%2BsBSoLOndp288yrH_w5W3MwhjtUZ4Dp2edGs-i9WQfK9oLvNg@mail.gmail.com>

On Tue, Dec 13, 2011 at 4:50 PM, Monthadar Al Jaberi
<monthadar@gmail.com> wrote:
> On Tue, Dec 13, 2011 at 3:35 PM, John Baldwin <jhb@freebsd.org> wrote:
>> On Tuesday, December 13, 2011 7:46:34 am Monthadar Al Jaberi wrote:
>>> Hi,
>>>
>>> I am not sure why I am having this problem, but looking
>>> at the code I don't understand uma_core.c very well.
>>> So I hope someone can shed some light on this:
>>>
>>> I am running on an ARM board, running a kernel module
>>> that behaves like a wlan interface, so I tx and rx packets.
>>>
>>> For now tx is only sending beacon-like frames.
>>> This is done using ieee80211_beacon_alloc().
>>>
>>> Then in a callout task to generate periodic beacons:
>>>
>>>     m_dup(avp->beacon, M_DONTWAIT);
>>>     mtx_lock(...);
>>>     STAILQ_INSERT_TAIL(...);
>>>     taskqueue_enqueue(...);
>>>     mtx_unlock(...);
>>>     callout_schedule(...);
>>>
>>> On the RX side, the interrupt handler reads out the buffer
>>> and then places it on a queue to be handled by wlan-glue code.
>>> For now the wlan-glue code just frees the mbuf instead of
>>> calling the net80211 ieee80211_input() functions:
>>>
>>>     m_copyback(...);
>>>     /* Allocate new mbuf for next RX. */
>>>     MGETHDR(..., M_DONTWAIT, MT_DATA);
>>>     bzero((mtod(sc->Rx_m, void *)), MHLEN);
>>>     sc->Rx_m->m_len = 0; /* NB: m_gethdr does not set */
>>>     sc->Rx_m->m_data += 20; /* make headroom */
>>>
>>> Then I use a lockmgr lock inside my kernel module that should
>>> make sure that we are on either the TX or the RX path.
>>
>> Uh, you can't use a lockmgr lock in interrupt handlers or in
>> if_transmit/if_start routines.  You should most likely just be using a
>> plain mutex instead.  Also, new code shouldn't use lockmgr in general.
>> If you need a sleepable lock, use sx instead.  It has a more
>> straightforward API.
>
> Ok, I will change the interrupt handler to do something like this:
>
>    disable_interrupt();
>    taskqueue_enqueue(...); /* on new rx task queue */
>
> Then on the new rx proc:
>
>    sx_slock(...);
>    m_copyback(...);
>    enable_interrupt();
>    /* Allocate new mbuf for next RX. */
>    MGETHDR(..., M_DONTWAIT, MT_DATA);
>    bzero((mtod(sc->Rx_m, void *)), MHLEN);
>    sc->Rx_m->m_len = 0; /* NB: m_gethdr does not set */
>    sc->Rx_m->m_data += 20; /* make headroom */
>    sx_sunlock(...);
>
> I lock the TX/RX paths to make sure my code is thread safe.
> Also, because while programming my device (SPI communication)
> there will be a tsleep() waiting for the DMA interrupt,
> we could be preempted by e.g. a beacon_callout etc...
>

I implemented your suggestions, using sx and the modified interrupt handler
as specified above, but I still see the same problem as before.

>>
>>> The problem seems to be at [2]: somehow, after swapping
>>> buckets in uma_zfree, m_dup returns a pointer to
>>> an mbuf that is still being used by us; [1] and [3]
>>> have the same address.
>>> Then we call m_freem twice on the same mbuf, [4] and [5].
>>> And a loop occurs inside uma_zfree.
>>> Am I using mbufs in the wrong way? Shouldn't mbufs be thread safe?
>>> The problem seems to occur while swapping buckets.
>>
>> Hmm, the uma uses its own locking, so it should be safe, yes.  However,
>> you are correct about [1] and [3].  The thing is, after [1] the mbuf
>> shouldn't be in any buckets at all (it only gets put back into the bucket
>> during a free).  Are you sure the mbuf wasn't double free'd previously?

I rechecked, and I am almost certain that I don't double free the mbuf
before [1].
And it's not like it crashes at the beginning; it runs for a while
and then it crashes. So our code works for about a hundred beacons
sent/received between two ARM boards. It feels like something is
preempted, which would explain why the mbuf is still in the bucket (wrongly)?
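
To illustrate the double-free scenario being discussed, here is a minimal
userspace sketch in C. It is not the UMA code: struct bucket, bucket_free(),
bucket_alloc() and bucket_free_checked() are hypothetical names, and the
bucket is modelled as a plain LIFO array of pointers. Under those assumptions
it shows how freeing the same item twice makes two later allocations return
the same address, and how a linear scan at free time would catch the
duplicate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BUCKET_SIZE 8

/* Hypothetical stand-in for an allocator bucket: a LIFO stack of free items. */
struct bucket {
	void *items[BUCKET_SIZE];
	int count;
};

/* Push an item onto the bucket without checking for duplicates. */
static bool
bucket_free(struct bucket *b, void *item)
{
	if (b->count == BUCKET_SIZE)
		return (false);
	b->items[b->count++] = item;
	return (true);
}

/* Pop the most recently freed item, or NULL if the bucket is empty. */
static void *
bucket_alloc(struct bucket *b)
{
	if (b->count == 0)
		return (NULL);
	return (b->items[--b->count]);
}

/* Linear scan: is this item already sitting in the bucket? */
static bool
bucket_contains(const struct bucket *b, const void *item)
{
	for (int i = 0; i < b->count; i++)
		if (b->items[i] == item)
			return (true);
	return (false);
}

/* Debug variant of free: refuse an item that is already in the bucket. */
static bool
bucket_free_checked(struct bucket *b, void *item)
{
	if (bucket_contains(b, item))
		return (false);	/* duplicate free detected */
	return (bucket_free(b, item));
}
```

For example, after calling bucket_free() twice on the same pointer, two
bucket_alloc() calls in a row return that pointer both times, whereas the
second bucket_free_checked() call on it returns false.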

>
> From my log I can only see the mbuf being used once before
> by a beacon_callout and then it was freed by m_freem().
> So I can't see that it was freed twice before that.
>
> How can I go about debugging this?
>
>>
>> --
>> John Baldwin
>
>
>
> --
> Monthadar Al Jaberi



-- 
Monthadar Al Jaberi


