Date:      Thu, 25 Oct 2012 12:05:42 -0700
From:      mdf@FreeBSD.org
To:        Andre Oppermann <andre@freebsd.org>
Cc:        src-committers@freebsd.org, John Baldwin <jhb@freebsd.org>, svn-src-user@freebsd.org, attilio@freebsd.org, Jeff Roberson <jroberson@jroberson.net>, Bruce Evans <brde@optusnet.com.au>
Subject:   Re: svn commit: r241889 - in user/andre/tcp_workqueue/sys: arm/arm cddl/compat/opensolaris/kern cddl/contrib/opensolaris/uts/common/dtrace cddl/contrib/opensolaris/uts/common/fs/zfs ddb dev/acpica dev/...
Message-ID:  <CAMBSHm8mDt2a_xw-3OaZLn=4SosRBaDCkDKd_zO_y_kCVUnbpA@mail.gmail.com>
In-Reply-To: <508965B3.2020705@freebsd.org>
References:  <201210221418.q9MEINkr026751@svn.freebsd.org> <201210241136.06154.jhb@freebsd.org> <CAJ-FndAG-Qp%2B1aQvoL7YRj=R151Qe9_wNrUeOAaDsdYao_-zCQ@mail.gmail.com> <201210241414.30723.jhb@freebsd.org> <CAJ-FndAu6BGeMMbtFTLaSqy82mbhM9CVEyJ3Lb1WhAogJr59yA@mail.gmail.com> <CAJ-FndBqRpkBhCntd2aqwVYPu%2B2EHGeuXr5srLtrNNDK-ButxA@mail.gmail.com> <508965B3.2020705@freebsd.org>

On Thu, Oct 25, 2012 at 9:15 AM, Andre Oppermann <andre@freebsd.org> wrote:
> I think we're completely overdoing it.

I agree, but in the opposite direction.  This is a solution looking
for a problem.

> On amd64 the size difference
> of 64B cache line aligning and padding all mutex, sx and rw_lock
> structures adds the tiny amount of 16K on a GENERIC kernel of 19MB.
> That is a 0.009% increase in size.  Of course dynamically allocated
> memory that includes a mutex grows a tiny bit as well.
>
> Hence I propose to unconditionally slap __aligned(CACHE_LINE_SIZE) into
> all locking structures and be done with it.  As an added benefit we
> don't have to worry about individual micro-optimizations on a case by
> case basis.

What problem are you trying to solve?  I understand all about cache
sharing, but if you force struct mtx to take its own cache line, I now
have no ability to put data accessed under the lock in the same cache
line.  You've de-optimized code and memory layout.  And, as alc@
said, you've ignored the mtx embedded in many dynamically allocated
structures.
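
To make the layout concern concrete, here is a minimal userland sketch.
It is not kernel code: struct fake_mtx is a hypothetical one-word
stand-in for struct mtx, the 64-byte CACHE_LINE_SIZE is an assumption
that matches amd64, and the GCC/Clang attribute is spelled out in place
of the kernel's __aligned() macro.  It compares a structure that packs
the protected data next to its lock with one whose lock type is forced
onto a line of its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64  /* assumption: 64-byte lines, as on amd64 */

/* Hypothetical one-word stand-in for struct mtx (not the kernel layout). */
struct fake_mtx {
    uintptr_t lock_word;
};

/* Lock and the data it protects packed together: one line fetch gets both. */
struct counter_packed {
    struct fake_mtx lock;
    uint64_t hits;
    uint64_t misses;
};

/* The same lock with the alignment baked into the type itself. */
struct fake_mtx_aligned {
    uintptr_t lock_word;
} __attribute__((aligned(CACHE_LINE_SIZE)));

/* The lock now fills line 0, so the protected data is pushed to line 1. */
struct counter_split {
    struct fake_mtx_aligned lock;
    uint64_t hits;
    uint64_t misses;
};

/* Aligning the type rounds its size up to a full cache line. */
static_assert(sizeof(struct fake_mtx_aligned) == CACHE_LINE_SIZE,
    "aligned lock type occupies a whole line");

/*
 * Cache-line index of a member offset, counted from the start of the
 * enclosing structure (assumes the structure itself starts on a line).
 */
static inline size_t
line_of(size_t offset)
{
    return (offset / CACHE_LINE_SIZE);
}
```

With these definitions, line_of(offsetof(struct counter_packed, hits))
is 0, the same line as the lock, while the split variant puts hits on
line 1 and grows the structure to two full cache lines.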

If certain, specific global mutexes will benefit, then they can be
explicitly allocated as __aligned and explicitly padded to a cache
line.  No other mtx except ones specifically identified as making a
performance difference should be touched.  There is no need for a
general solution.
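
The targeted alternative could look roughly like the sketch below.
Again this is userland C under stated assumptions: struct stub_mtx is a
hypothetical stand-in for struct mtx, hot_global_lock and ordinary_lock
are made-up names, and CACHE_LINE_SIZE is assumed to be 64.  Only the
one lock identified as contended is wrapped in an aligned, padded
container; every other lock keeps its natural size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64  /* assumption: 64-byte lines, as on amd64 */

/* Hypothetical one-word stand-in for struct mtx (not the kernel layout). */
struct stub_mtx {
    uintptr_t lock_word;
};

/*
 * Opt-in wrapper for a specific hot lock: aligned to a cache line and
 * explicitly padded so that nothing else can share that line.
 */
struct padded_mtx {
    struct stub_mtx mtx;
    char pad[CACHE_LINE_SIZE - sizeof(struct stub_mtx)];
} __attribute__((aligned(CACHE_LINE_SIZE)));

static_assert(sizeof(struct padded_mtx) == CACHE_LINE_SIZE,
    "padded lock occupies exactly one line");

/* A lock that measurement showed to be contended (hypothetical). */
static struct padded_mtx hot_global_lock;

/* Every other lock is left alone and stays one word in this sketch. */
static struct stub_mtx ordinary_lock;

/* Nonzero if the pointer sits on a cache-line boundary. */
static inline int
is_line_aligned(const void *p)
{
    return (((uintptr_t)p % CACHE_LINE_SIZE) == 0);
}
```

The cost of the padding is paid once, for the lock that needs it,
instead of by every structure that happens to embed a mutex.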

Thanks,
matthew