Date: Mon, 3 Jun 2013 16:04:23 +0200
From: Ed Schouten <ed@80386.nl>
To: freebsd-mips@freebsd.org
Cc: freebsd-arch@freebsd.org
Subject: Kernelspace C11 atomics for MIPS
Message-ID: <CAJOYFBD502MYbkVR2hnVDTYWOvOUr15=OPyjotNvv%2BZ09vQ1OQ@mail.gmail.com>

Hi,

As of r251230, it should be possible to use C11 atomics in kernelspace
by including <sys/stdatomic.h>! Even when not using Clang (but GCC 4.2),
it is possible to use quite a large portion of the API. A couple of
limitations:

- The memory order argument is simply ignored, so every call acts as a
  full memory barrier.
- At least Clang allows you to do arithmetic on C11 atomics directly
  (e.g. "a += 5" == "atomic_fetch_add(&a, 5)"), which is of course not
  possible to mimic.
- The atomic functions only work on 1, 2, 4 and 8-byte types, which is
  probably a good thing.

Amazingly, it turns out that it works on most of the architectures, with
the exception of ARM and MIPS. To make MIPS work, we need to implement
some of the __sync_* functions that are described here:

http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Atomic-Builtins.html

Some time ago I already added some of these functions to our
libcompiler-rt in userspace, to make atomics work there. Unfortunately,
these functions were quite horribly implemented, as I tried to build
them on top of <machine/atomic.h>, which is far from trivial/efficient.
That approach is also restricted to 4 and 8-byte types.

That's why I thought: why not spend some time learning MIPS assembly and
write some decent implementations for these functions? The result:

http://80386.nl/pub/mips-stdatomic.txt

For now, please focus on sys/mips/mips/stdatomic.c. It implements all
the __sync_* functions called by <stdatomic.h> for 1, 2, 4 and 8-byte
types. There is some testing code in there as well, which can be
ignored. This code disassembles to the following:

http://80386.nl/pub/mips-stdatomic-disasm.txt

As I don't own a MIPS system myself, I was thinking about tinkering a
bit with qemu to see whether these functions work properly. My questions
are:

- Does anyone have any comments on the C code and/or the machine code
  generated? Are there some nifty tricks I can apply to make the machine
  code more efficient that I am unaware of?
- Is there anyone interested in testing this code a bit more thoroughly
  on physical hardware?
- Would anyone mind if I committed this to HEAD?

Thanks,
-- 
Ed Schouten <ed@80386.nl>
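
For readers wondering how a 1-byte __sync_* operation can exist on a
machine whose atomic primitives work on whole words, here is a minimal
sketch of one common approach: do a compare-and-swap on the aligned
4-byte word that contains the byte, and retry until it succeeds. This is
an illustration only and is not taken from the linked patch. The names
sketch_fetch_and_add_1, sketch_selftest and union word_bytes are
hypothetical, and the sketch assumes a compiler that provides
__sync_val_compare_and_swap for 4-byte objects (GCC and Clang do on most
targets).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* View of one aligned 32-bit word as four individual bytes. */
union word_bytes {
	uint32_t w;
	uint8_t b[4];
};

/*
 * Hypothetical helper, not taken from the linked patch: atomically add
 * 'value' to the byte at 'mem' and return the previous byte, using only
 * a 4-byte compare-and-swap on the aligned word that contains it.
 */
static uint8_t
sketch_fetch_and_add_1(uint8_t *mem, uint8_t value)
{
	uintptr_t addr = (uintptr_t)mem;
	/* Round down to the containing 4-byte-aligned word. */
	volatile uint32_t *word = (volatile uint32_t *)(addr & ~(uintptr_t)3);
	size_t index = addr & 3;	/* Position of the byte in that word. */
	union word_bytes old, desired;

	do {
		old.w = *word;
		desired = old;
		/* Change only the target byte; neighbours stay as read. */
		desired.b[index] = (uint8_t)(old.b[index] + value);
		/* Retry if any byte of the word changed in the meantime. */
	} while (__sync_val_compare_and_swap(word, old.w, desired.w) != old.w);

	return (old.b[index]);
}

/* Small self-check: returns 0 when the neighbours are left untouched. */
static int
sketch_selftest(void)
{
	static union word_bytes storage = { .b = { 0x11, 0xfe, 0x33, 0x44 } };
	uint8_t prev = sketch_fetch_and_add_1(&storage.b[1], 5);

	assert(prev == 0xfe);
	return (storage.b[0] == 0x11 && storage.b[1] == 0x03 &&
	    storage.b[2] == 0x33 && storage.b[3] == 0x44) ? 0 : 1;
}
```

Indexing the byte through the union keeps the sketch independent of
endianness. The addition wraps within the byte (0xfe + 5 becomes 0x03)
and never carries into the neighbouring bytes. A hand-written MIPS
version would presumably use an ll/sc loop on the containing word
instead of a compiler builtin.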