Date: Mon, 3 Jun 2013 22:43:19 -0600
From: Warner Losh <imp@bsdimp.com>
To: Adrian Chadd <adrian@freebsd.org>
Cc: Ed Schouten <ed@80386.nl>, freebsd-mips@freebsd.org, freebsd-arch@freebsd.org
Subject: Re: Kernelspace C11 atomics for MIPS
Message-ID: <A4020A8F-51DD-4C1F-902B-74C9F3C167D6@bsdimp.com>
In-Reply-To: <CAJ-Vmo=vNbT9majPCZ8ugzPsNzh46DTD4mMDX-cuxx9Og91ptw@mail.gmail.com>
References: <CAJOYFBD502MYbkVR2hnVDTYWOvOUr15=OPyjotNvv%2BZ09vQ1OQ@mail.gmail.com>
 <D02AF210-5129-40AB-9481-3F0A44575E98@bsdimp.com>
 <CAJ-Vmo=vNbT9majPCZ8ugzPsNzh46DTD4mMDX-cuxx9Og91ptw@mail.gmail.com>

On Jun 3, 2013, at 8:45 PM, Adrian Chadd wrote:

> Speaking of this; any idea why the SYNC operators have 8 NOPs
> following them?

Yes, that's the exact issue that I've had with them, but have never had
time to sort it out...

Warner

> I noticed that when going through disassemblies of various mips24k .o
> files.
>
> Adrian
>
> On 3 June 2013 10:53, Warner Losh <imp@bsdimp.com> wrote:
>>
>> On Jun 3, 2013, at 8:04 AM, Ed Schouten wrote:
>>
>>> Hi,
>>>
>>> As of r251230, it should be possible to use C11 atomics in
>>> kernelspace, by including <sys/stdatomic.h>! Even when not using Clang
>>> (but GCC 4.2), it is possible to use quite a large portion of the API.
>>> A couple of limitations:
>>>
>>> - The memory order argument is simply ignored, making all the calls do
>>> a full memory barrier.
>>> - At least Clang allows you to do arithmetic on C11 atomics directly
>>> (e.g. "a += 5" == "atomic_fetch_add(&a, 5)"), which is of course not
>>> possible to mimic.
>>> - The atomic functions only work on 1, 2, 4 and 8-byte types, which is
>>> probably a good thing.
>>>
>>> Amazingly, it turns out that it works on most of the architectures,
>>> with the exception of ARM and MIPS. To make MIPS work, we need to
>>> implement some of the __sync_* functions that are described here:
>>>
>>> http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Atomic-Builtins.html
>>>
>>> Some time ago I already added some of these functions to our
>>> libcompiler-rt in userspace, to make atomics work there.
>>> Unfortunately, these functions were quite horribly implemented, as I
>>> tried to build them on top of <machine/atomic.h>, which is far from
>>> trivial/efficient. It is also restricted to 4 and 8-byte types. That's
>>> why I thought: why not spend some time learning MIPS assembly and
>>> write some decent implementations for these functions?
>>>
>>> The result:
>>>
>>> http://80386.nl/pub/mips-stdatomic.txt
>>
>> The number of necessary syncs varies by processor type. There are also
>> newer synchronization instructions that make this as efficient as
>> possible for all mips32r2 and mips64r2-based machines. Older Caviums
>> at least, and maybe newer ones, also have their own variants. What you
>> have will mostly work for the processors we have to support. mips_sync
>> could therefore be better. Doing it before AND after seems like
>> overkill as well. Since sync is a fairly performance-killing assembler
>> instruction, how would you feel about allowing optimizations?
>>
>> This is my biggest single concern about the patch, but it is also my
>> current biggest concern about the MIPS atomic operators in general.
>>
>>> For now, please focus on sys/mips/mips/stdatomic.c. It implements all
>>> the __sync_* functions called by <stdatomic.h> for 1, 2, 4 and 8 byte
>>> types. There is some testing code in there as well, which can be
>>> ignored. This code disassembles to the following:
>>>
>>> http://80386.nl/pub/mips-stdatomic-disasm.txt
>>>
>>> As I don't own a MIPS system myself, I was thinking about tinkering a
>>> bit with qemu to see whether these functions work properly. My
>>> questions are:
>>>
>>> - Does anyone have any comments on the C code and/or the machine code
>>> generated? Are there some nifty tricks I can apply to make the machine
>>> code more efficient that I am unaware of?
>>> - Is there anyone interested in testing this code a bit more
>>> thoroughly on physical hardware?
>>> - Would anyone mind if I committed this to HEAD?
>>
>> I have some cavium gear I can easily test on, and some other stuff I
>> can less-easily test on.
>>
>> It wouldn't be horrible to commit to head, but it would affect
>> performance in many places.
>>
>> Don't commit the kern/bla.c standard change to conf/files, it looks
>> to be bogus :)
>>
>> Warner
>>
>> _______________________________________________
>> freebsd-mips@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-mips
>> To unsubscribe, send any mail to "freebsd-mips-unsubscribe@freebsd.org"
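
The C11 calls and the GCC __sync_* builtins that the quoted message
refers to can be sketched in userspace C roughly as below. This is only
an illustration under assumptions: it uses the hosted <stdatomic.h>
rather than the kernel's <sys/stdatomic.h>, it needs a compiler with
native C11 atomics, and the helper names (c11_fetch_add,
c11_fetch_add_relaxed, c11_add_operator, sync_fetch_add) are
hypothetical and do not come from the patch under discussion.

```c
#include <stdatomic.h>
#include <stdint.h>

/* C11 generic function: returns the value held before the addition. */
static inline uint32_t
c11_fetch_add(_Atomic uint32_t *p, uint32_t n)
{
	return atomic_fetch_add(p, n);
}

/*
 * Same operation with an explicit memory order. The quoted message says
 * the GCC 4.2 fallback ignores this argument and always issues a full
 * barrier; a compiler with native C11 atomics may honour it.
 */
static inline uint32_t
c11_fetch_add_relaxed(_Atomic uint32_t *p, uint32_t n)
{
	return atomic_fetch_add_explicit(p, n, memory_order_relaxed);
}

/*
 * Direct arithmetic on an atomic object, the form the quoted message
 * says cannot be mimicked without compiler support. The update is the
 * same atomic add, but the expression yields the new value, not the
 * old one.
 */
static inline uint32_t
c11_add_operator(_Atomic uint32_t *p, uint32_t n)
{
	return *p += n;
}

/* GCC 4.1-era builtin on a plain integer; also returns the old value. */
static inline uint32_t
sync_fetch_add(uint32_t *p, uint32_t n)
{
	return __sync_fetch_and_add(p, n);
}
```

Starting from a counter of 0, c11_fetch_add(&a, 5) returns 0 and leaves
5 in a, while c11_add_operator(&a, 5) on that same counter then returns
10. sync_fetch_add behaves like c11_fetch_add but takes an ordinary
uint32_t, which is why a header can fall back to it on older compilers.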