Date:      Wed, 24 Nov 2010 17:23:04 -0800
From:      mdf@FreeBSD.org
To:        Alexander Kabaev <kabaev@gmail.com>
Cc:        freebsd-arch@freebsd.org
Subject:   Re: LOR with sysctl lock
Message-ID:  <AANLkTi=8HxTjmFjCih97FWTdAZ3-NQ1V6igiTBB7XgPv@mail.gmail.com>
In-Reply-To: <20101124184934.35b6766a@kan.dnsalias.net>
References:  <AANLkTinU78oZrw3UEKRQ3dCPFigch9CuSdsxt0=dGSBQ@mail.gmail.com> <20101124184934.35b6766a@kan.dnsalias.net>

On Wed, Nov 24, 2010 at 3:49 PM, Alexander Kabaev <kabaev@gmail.com> wrote:
> On Wed, 24 Nov 2010 11:53:58 -0800
> mdf@FreeBSD.org wrote:
>
>> The sysctl lock can cause "random" LOR warnings.  These usually show
>> on reboot/shutdown when sysctl_ctx_free() is called by a kernel
>> module, since the mod handler is called with the module lock.  The
>> reason for the LOR is that, at least theoretically, the sysctl lock is
>> the first lock in any hierarchy, because a SYSCTL_PROC handler can
>> take any locks it wants, and will be called with the sysctl lock held
>> shared.
>>
>> The below patch will fix the problem generically and with no changes
>> to other code.  I slightly prefer this to an explicit
>> sysctl_ctx_free_sched(9), because many times code doesn't know if some
>> caller holds *any* lock at all; this is especially true for mod
>> handlers who shouldn't be expected to know how FreeBSD locks calls to
>> the handler.
>>
>> I also note that the return value from sysctl_ctx_free(9) is almost
>> never checked on CURRENT, and the only places it is, the value is
>> merely used to print a warning.  The only exception is canbus_detach()
>> in pc98/pc98/canbus.c.  So I wonder if sysctl_ctx_free(9) should
>> return void and print a warning itself.
>>
>> Patch:
>> http://people.freebsd.org/~mdf/0001-Proposed-patch-to-fix-LORs-with-sysctl-lock.patch
>>
>> If there are no objections, I'd like to commit this next week.
>>
>> Thanks,
>> matthew
>
> Correct me if I am wrong, but doesn't this open a race where, say,
> device detach routine destroys the device softc and schedules sysctl
> context to be destroyed asynchronously via task queue? Since sysctl
> entries are still visible in between the point where softc is destroyed
> and the point where task queue picks the sysctl destroy task up, can any
> access to said sysctls potentially operate on now freed softc data?

D'oh, yeah.

I'm thinking of something a little more grand, now, that increments a
hold count on the sysctl oid and releases the lock before calling the
handler.  This would prevent the LOR, and allow sysctl_ctx_free(9) to
be run in-line, but it would then have to wait for any in-progress
sysctl calls to an oid to drain.
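
Roughly, the hold-count idea could look like the user-space sketch
below.  This is only an illustration under assumptions, not the
eventual kernel change: struct oid, oid_call(), oid_remove() and
read_int() are hypothetical names, and a pthread mutex and condition
variable stand in for the kernel's sysctl lock and sleep/wakeup.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a sysctl oid; not the kernel's struct sysctl_oid. */
struct oid {
	int	(*handler)(void *arg);	/* handler run on each access */
	void	*arg;			/* e.g. a pointer into a softc */
	int	holds;			/* handlers currently running */
	bool	dying;			/* set once removal has begun */
};

/* Assumption: one global lock and condvar model the sysctl lock. */
static pthread_mutex_t tree_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t drained = PTHREAD_COND_INITIALIZER;

/* Example handler: report the int that arg points at. */
int
read_int(void *arg)
{
	return (*(int *)arg);
}

/* Run the handler with tree_lock dropped; returns -1 if the oid is going away. */
int
oid_call(struct oid *o)
{
	int rv;

	pthread_mutex_lock(&tree_lock);
	if (o->dying) {
		pthread_mutex_unlock(&tree_lock);
		return (-1);
	}
	o->holds++;			/* pin the oid */
	pthread_mutex_unlock(&tree_lock);

	rv = o->handler(o->arg);	/* may take any locks it wants */

	pthread_mutex_lock(&tree_lock);
	assert(o->holds > 0);
	if (--o->holds == 0 && o->dying)
		pthread_cond_broadcast(&drained);
	pthread_mutex_unlock(&tree_lock);
	return (rv);
}

/* Refuse new callers, then wait for in-progress handlers to finish. */
void
oid_remove(struct oid *o)
{
	pthread_mutex_lock(&tree_lock);
	o->dying = true;
	while (o->holds > 0)
		pthread_cond_wait(&drained, &tree_lock);
	pthread_mutex_unlock(&tree_lock);
}
```

Because the handler runs with tree_lock released, it no longer sits
above every other lock in the order; and once oid_remove() returns,
no handler is still running, so the caller can free whatever arg
points to.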

I think this will work out; I'm planning to work on the code over the
Thanksgiving holiday and have something ready in a few days.

Thanks,
matthew
