Date:      Sat, 24 Nov 2012 17:47:15 -0800
From:      Alfred Perlstein <bright@mu.org>
To:        attilio@FreeBSD.org
Cc:        Mark Linimon <linimon@lonesome.com>, Oleksandr Tymoshenko <gonzo@bluezbox.com>, arch@freebsd.org
Subject:   Re: [RFC] sema_wait_sig
Message-ID:  <50B178A3.4070305@mu.org>
In-Reply-To: <CAJ-FndAetPqiZ0nCQTY1xAcqBJuuaq9dZfUhP9YXXw669o0WNQ@mail.gmail.com>
References:  <20121124193010.GB1627@lonesome.com> <50B12520.7040508@mu.org> <CAJ-FndBeLvsgXQ4fskRwdZh2qaWbn7-LrCOmJTjPcfnbmD7aYg@mail.gmail.com> <50B145C5.8070503@mu.org> <CAJ-FndC+jHO2uKg+d8GmzWGkQ8nZ6_iefgyGAeno0VqprR84Wg@mail.gmail.com> <50B16E7A.60900@mu.org> <CAJ-FndAetPqiZ0nCQTY1xAcqBJuuaq9dZfUhP9YXXw669o0WNQ@mail.gmail.com>

On 11/24/12 5:16 PM, Attilio Rao wrote:
> On Sun, Nov 25, 2012 at 1:03 AM, Alfred Perlstein <bright@mu.org> wrote:
>> On 11/24/12 4:38 PM, Attilio Rao wrote:
>>> On Sat, Nov 24, 2012 at 10:10 PM, Alfred Perlstein <bright@mu.org> wrote:
>>>> I don't understand why you are the one who is so upset.  Your first email
>>>> to me implied that I had 0 smp experience.
>>>>
>>>> Let me explain why this rototilling is unneeded.
>>>>
>>>> Go download a copy of linux and observe the following:
>>>> spin_lock(&mb_cache_spinlock);
>>>> spin_unlock(&mb_cache_spinlock);
>>>> spin_lock_irqsave, spin_unlock_irqrestore
>>>> up()
>>>> down(dqio_mutex)
>>>>
>>>> Those APIs have been available for a decade at least.
>>>>
>>>> I'll cut to the point on this.
>>>>
>>>> If you want to change HOW the underlying FreeBSD SMP API works to improve
>>>> performance, then please do!
>>>>
>>>> But if you want to change the actual KPI, then please realize that Linux
>>>> SMP does darn well with a KPI for SMP that's pretty much unchanged for
>>>> nearly 10 years.
>>>>
>>>> I would venture to say in this respect we've become what we used to mock
>>>> Linux for, an OS that gratuitously changes interfaces for the sake of
>>>> what is cool, versus what our vendors need.
>>> Keeping old mechanisms/duplicate/etc. around just because they existed
>>> 10 years ago is not a good reason once their KPI is not only redundant
>>> but also dangerous. And this seems to be your only "technical" reason
>>> opposed to my proposals.
>> Whoa, wait a second.
>>
>> A user just proposed using the infrastructure to port linux drivers.
>>
>> Additionally the following subsystems make use of sema(9):
>> infiniband stack (linux compat shim).
>> sysv ipc.
>> ata.
>> opensolaris compat shim.
>> xfs.
>>
>> What would be the point of removing this KPI?
> Did you see also how they are used?
> In some places they have a counter of 1, which means they can be
> effectively replaced by an sx lock.
> In all the other places, they are used with a counter of 0, which
> means they can be effectively replaced by mtx and sleep.
>
> Can you give me a reason for really keeping them?
>
> Also, if you think they would help a Linux compat shim layer, keep in
> mind the following:
> - a plan for something like that has been discussed for years and by
> several people and nothing concrete happened, with a lot of
> disagreement (both technical and philosophical)
> - there is no plan for doing so in the foreseeable future, nor is there
> agreement that it is really a good idea. So you prefer to have
> completely redundant (and unused in the end) code just because it may
> or may not happen to help a compat layer that doesn't exist and maybe
> will never exist? Please answer openly.

1) compat layer
/usr/src.local/sys/ofed/drivers/infiniband/core #
cddl/contrib/opensolaris

2) If a user expects semaphores and we tell them to "rethink" things,
then we're not providing the same facilities as every other non-BSD OS.

I guess that makes us "cool", but really it just seems out of touch.

The implementation is 176 lines of code + some headers.

The sad part to me is that the original user asked "hey I need
sema+signal" but we don't know the facility they really need.  Count of
1?  Count of 10?  Instead of just giving them a textbook CS semaphore we
tell them to "build your own using our primitives".

At some point an OS has to grow up and realize that by doing everything 
its own way it's not making itself special, so much as limiting its 
acceptance.

>> Those consumers would then just have to roll their own.
>>
>> Wouldn't that lead to duplicate code?
>>
>>       176 sys/kern/kern_sema.c
>>
>> It's not really a lot of code.
>>
>>
>>> Using disown for lockmgr is something very dangerous which should not
>>> be used outside of its specific case for the buffer cache. I really don't
>>> want to encourage its use outside of that and I'm sure people can build
>>> very dangerous policies using it (this is just an example, but it
>>> explains my point, I think).
>>> Maybe my proposed changes of mtx against rwlock are a bit too extreme,
>>> I could understand that and I'm very open to changing my mind on it,
>>> but I don't understand how it would be useful to keep lockmgr() and
>>> sema() around, honestly.
>>>
>>> It is just a burden of code duplication (in some places) and dangerous
>>> KPI (in others).
>> I agree that lockmgr is a very dangerous beast.   Whatever can be done
>> to get rid of the complexity would be good.
>>
>> If we could hide some of the lockmgr "features" behind an "I know what I'm
>> doing" fence or maybe an "only to be used with filesystem code" fence that
>> might be good.
> I don't agree, I would just like to have a clean KPI and force people
> to do the right things. That clean KPI already exists, we just need to
> convert current consumers to do their dirtiness in a "controlled
> environment".

Well, I was just trying to agree with you.  To be honest, I have no idea
what your plans are.

I did want to explain that merging sx+lockmgr was tried before, and it 
failed.  You may have more skill with it and succeed, but you should 
check source history and mailing lists for the edge cases that made 
replacing it entirely fail.


-Alfred


