FreeBSD Mail Archives

Date:      Wed, 14 Dec 2011 15:10:49 -0500
From:      Arnaud Lacombe <lacombar@gmail.com>
To:        Monthadar Al Jaberi <monthadar@gmail.com>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: loop inside uma_zfree critical section
Message-ID:  <CACqU3MV-vXJ=GSV%2BOJWUwoymOQronbduTYgzbc82q=qPoXDSFA@mail.gmail.com>
In-Reply-To: <CA%2BsBSoK-sT4t1Xi1NtUykH1O7j1zaQJWiHSqDGFJKiWYxHebcw@mail.gmail.com>
References:  <CA%2BsBSoJrRf8t6KJQy6xwa_VoH67cYWo5ZUZKBTEwLwrx%2BiXknw@mail.gmail.com> <201112130935.33975.jhb@freebsd.org> <CA%2BsBSoLOndp288yrH_w5W3MwhjtUZ4Dp2edGs-i9WQfK9oLvNg@mail.gmail.com> <CA%2BsBSoK-sT4t1Xi1NtUykH1O7j1zaQJWiHSqDGFJKiWYxHebcw@mail.gmail.com>


Hi,

On Wed, Dec 14, 2011 at 2:47 PM, Monthadar Al Jaberi
<monthadar@gmail.com> wrote:
> On Tue, Dec 13, 2011 at 4:50 PM, Monthadar Al Jaberi
> <monthadar@gmail.com> wrote:
>> On Tue, Dec 13, 2011 at 3:35 PM, John Baldwin <jhb@freebsd.org> wrote:
>>> On Tuesday, December 13, 2011 7:46:34 am Monthadar Al Jaberi wrote:
>>>> Hi,
>>>>
>>>> I am not sure why I am having this problem, but looking
>>>> at the code I dont understand uma_core.c really good.
>>>> So I hope someone can shed a light on this:
>>>>
>>>> I am running on an arm board and and running a kernel module
>>>> that behaves like a wlan interface. so I tx and rx packets.
>>>>
>>>> For now tx is only only sending beacon like frames.
>>>> This is done through using ieee80211_beacon_alloc().
>>>>
>>>> Then in a callout task to generate periodic beacons:
>>>>
>>>> � � m_dup(avp->beacon, M_DONTWAIT);
>>>> � � mtx_lock(...);
>>>> � � STAILQ_INSERT_TAIL(...);
>>>> � � taskqueue_enqueue(...);
>>>> � � mtx_unlock(...);
>>>> � � callout_schedule(...);
>>>>
>>>> On the RX side, the interrupt handler will read out buffer
>>>> then place it on a queue to be handled by wlan-glue code.
>>>> For now wlan-glue code just frees the mbuf it instead of
>>>> calling net80211 ieee80211_input() functions:
>>>>
>>>> � � m_copyback(...);
>>>> � � /* Allocate new mbuf for next RX. */
>>>> � � MGETHDR(..., M_DONTWAIT, MT_DATA);
>>>> � � bzero((mtod(sc->Rx_m, void *)), MHLEN);
>>>> � � sc->Rx_m->m_len = 0; /* NB: m_gethdr does not set */
>>>> � � sc->Rx_m->m_data += 20; /* make headroom */
>>>>
>>>> Then I use a lockmgr inside my kernel module that should
>>>> make sure that we either are on TX or RX path.
>>>
>>> Uh, you can't use a lockmgr lock in interrupt handlers or in
>>> if_transmit/if_start routines. �You should most likely just be using a plain
>>> mutex instead. �Also, new code shouldn't use lockmgr in general. �If you
>>> need a sleepable lock, use sx instead. �It has a more straightforward API.
>>
>> Ok, I will change the interrupt handler to do something like this:
>>
>> � �disaple_interrupt();
>> � �taskqueue_enqueue(...); /* on new rx task queue */
>>
>> Then on the new rx proc:
>>
>> � �sx_slock(...);
>> � �m_copyback(...);
>> � �enable_interrupt();
>> � �/* Allocate new mbuf for next RX. */
>> � �MGETHDR(..., M_DONTWAIT, MT_DATA);
>> � �bzero((mtod(sc->Rx_m, void *)), MHLEN);
>> � �sc->Rx_m->m_len = 0; /* NB: m_gethdr does not set */
>> � �sc->Rx_m->m_data += 20; /* make headroom */
>> � �sx_sunlock(...);
>>
>> I lock TX/RX paths to make sure my code is threading safe.
>> Also because while programming my deivce (SPI communicatioin)
>> there will be a tsleep() waiting for the DMA interrupt and
>> thus we could be prempted by e.g. a beacon_callout etc...
>>
>
> I did implement your suggestions, using sx and modified interrupt handler
> as specified above. But still same problem as before.
>
>>>
>>>> The problem seems to be at [2], somehow after swapping
>>>> buckets in uma_zfree m_dup returns a pointer to
>>>> an mbuf that is still being used by us, [1] and [3]
>>>> have same address.
>>>> Then we call m_freem twice on same mbuf, [4] and [5].
>>>> And a loop occurs inside uma_free.
>>>> I am using mbufs in a wrong way? Shouldnt mbufs be thread safe?
>>>> Problem seems to occur while swapping buckets.
>>>
>>> Hmm, the uma uses its own locking, so it should be safe, yes. �However, you
>>> are correct about [1] and [3]. �The thing is, after [1] the mbuf shouldn't
>>> be in any buckets at all (it only gets put back into the bucket during a
>>> free). �Are you sure the mbuf wasn't double free'd previously?
>
> I rechecked and it is almost certain that I dont double free the mbuf
> before [1].
> And its not like it crashed in the beginning, it does run for a while
> and then it crashes. So our code works for like a hundred beacons sent/received
> between two arm boards. Its feels like something is preempted, which explains
> why the mbuf is still in the bucket (wrongly)?
>
are you running an INVARIANTS/DIAGNOSTICS/WITNESS/LOCK_DEBUG/...
enabled kernel ?

are you running on an SMP platform where there might be cache-coherency issue ?

Thanks,
 - Arnaud

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CACqU3MV-vXJ=GSV%2BOJWUwoymOQronbduTYgzbc82q=qPoXDSFA>

Header And Logo

Peripheral Links

Site Navigation

Header And Logo

Peripheral Links

Search

Site Navigation