Date:      Mon, 3 Jun 2013 16:04:23 +0200
From:      Ed Schouten <ed@80386.nl>
To:        freebsd-mips@freebsd.org
Cc:        freebsd-arch@freebsd.org
Subject:   Kernelspace C11 atomics for MIPS
Message-ID:  <CAJOYFBD502MYbkVR2hnVDTYWOvOUr15=OPyjotNvv%2BZ09vQ1OQ@mail.gmail.com>

Hi,

As of r251230, it should be possible to use C11 atomics in
kernelspace, by including <sys/stdatomic.h>! Even with GCC 4.2 instead
of Clang, it is possible to use quite a large portion of the API.
A couple of limitations:

- The memory order argument is simply ignored, making all the calls do
a full memory barrier.
- At least Clang allows you to do arithmetic on C11 atomics directly
(e.g. "a += 5" == "atomic_fetch_add(&a, 5)"), which is of course not
possible to mimic.
- The atomic functions only work on 1, 2, 4 and 8-byte types, which is
probably a good thing.
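
As a rough illustration of the portable spelling (explicit function
calls instead of operators applied to the atomic object), here is a
minimal sketch. The struct and function names are made up for this
example. It includes the userspace <stdatomic.h> so that it builds
outside the kernel; kernel code would presumably include
<sys/stdatomic.h> instead. The memory order arguments are written out
even though, per the first limitation above, they may simply be ignored.

```c
#include <stdatomic.h>	/* userspace stand-in for <sys/stdatomic.h> */

/* Hypothetical statistics record; not taken from the patch. */
struct pkt_stats {
	atomic_uint	packets;	/* 4-byte counter on common ABIs */
	atomic_ullong	bytes;		/* 8-byte counter */
};

void
pkt_stats_init(struct pkt_stats *s)
{
	/* atomic_init() instead of plain assignment to the atomic object. */
	atomic_init(&s->packets, 0);
	atomic_init(&s->bytes, 0);
}

/* Account for one packet; returns the packet count before the update. */
unsigned int
pkt_stats_record(struct pkt_stats *s, unsigned long long len)
{
	/* Function-call form; "s->bytes += len" would not be portable here. */
	atomic_fetch_add_explicit(&s->bytes, len, memory_order_relaxed);
	return (atomic_fetch_add_explicit(&s->packets, 1,
	    memory_order_relaxed));
}

unsigned long long
pkt_stats_bytes(struct pkt_stats *s)
{
	return (atomic_load_explicit(&s->bytes, memory_order_acquire));
}
```

Calling pkt_stats_record() twice with lengths 100 and 50 on a freshly
initialized record returns 0 and then 1, after which pkt_stats_bytes()
reports 150.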

Amazingly, it turns out that it works on most of the architectures,
with the exception of ARM and MIPS. To make MIPS work, we need to
implement some of the __sync_* functions that are described here:

http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Atomic-Builtins.html

Some time ago I already added some of these functions to our
libcompiler-rt in userspace, to make atomics work there.
Unfortunately, these functions were quite horribly implemented, as I
tried to build them on top of <machine/atomic.h>, which is far from
trivial/efficient. It is also restricted to 4 and 8-byte types. That's
why I thought: why not spend some time learning MIPS assembly and
write some decent implementations for these functions?

The result:

http://80386.nl/pub/mips-stdatomic.txt

For now, please focus on sys/mips/mips/stdatomic.c. It implements all
the __sync_* functions called by <stdatomic.h> for 1, 2, 4 and 8 byte
types. There is some testing code in there as well, which can be
ignored. This code disassembles to the following:

http://80386.nl/pub/mips-stdatomic-disasm.txt
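
To show why the 1 and 2-byte variants need extra care, here is a
minimal sketch of a 1-byte fetch-and-add emulated on top of a 4-byte
compare-and-swap. This is not the code from the linked stdatomic.c.
The function name is hypothetical, and the sketch leans on the
compiler's own word-sized __sync_val_compare_and_swap builtin as a
stand-in, so it is only useful where that builtin already exists. It
also assumes the compiler defines __BYTE_ORDER__ and that touching the
neighbouring three bytes of the aligned word is acceptable.

```c
#include <stdint.h>

/*
 * Hypothetical helper: atomically add "value" to the byte at "mem" and
 * return the byte's previous contents. Only a 32-bit CAS is used.
 */
uint8_t
sketch_fetch_and_add_1(uint8_t *mem, uint8_t value)
{
	uintptr_t addr = (uintptr_t)mem;
	/* Aligned 32-bit word that contains the target byte. */
	uint32_t *word = (uint32_t *)(addr & ~(uintptr_t)3);
	unsigned int idx = (unsigned int)(addr & 3);
	/* Assumption: __BYTE_ORDER__ is provided by the compiler. */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	unsigned int shift = (3 - idx) * 8;
#else
	unsigned int shift = idx * 8;
#endif
	uint32_t mask = (uint32_t)0xff << shift;
	uint32_t old, newval, seen;
	uint8_t byte;

	old = *(volatile uint32_t *)word;
	for (;;) {
		/* Extract the byte, update it, and splice it back in. */
		byte = (uint8_t)((old & mask) >> shift);
		newval = (old & ~mask) |
		    ((uint32_t)(uint8_t)(byte + value) << shift);
		seen = __sync_val_compare_and_swap(word, old, newval);
		if (seen == old)
			return (byte);
		/* Another writer changed the word; retry with its value. */
		old = seen;
	}
}
```

The loop retries whenever any byte of the containing word changes, not
only the target byte, which is one reason a hand-written implementation
can be preferable to this kind of layering.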

As I don't own a MIPS system myself, I was thinking about tinkering a
bit with qemu to see whether these functions work properly. My
questions are:

- Does anyone have any comments on the C code and/or the machine code
generated? Are there some nifty tricks I can apply to make the machine
code more efficient that I am unaware of?
- Is there anyone interested in testing this code a bit more
thoroughly on physical hardware?
- Would anyone mind if I committed this to HEAD?

Thanks,
--
Ed Schouten <ed@80386.nl>
