From owner-freebsd-amd64@FreeBSD.ORG Thu Jul 19 03:10:40 2012 Return-Path: Delivered-To: freebsd-amd64@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 70444106566B for ; Thu, 19 Jul 2012 03:10:40 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail34.syd.optusnet.com.au (mail34.syd.optusnet.com.au [211.29.133.218]) by mx1.freebsd.org (Postfix) with ESMTP id E292B8FC15 for ; Thu, 19 Jul 2012 03:10:39 +0000 (UTC) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail34.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q6J3ATmA012239 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 19 Jul 2012 13:10:31 +1000 Date: Thu, 19 Jul 2012 13:10:29 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Konstantin Belousov In-Reply-To: <20120718081003.GT2676@deviant.kiev.zoral.com.ua> Message-ID: <20120719111846.G780@besplex.bde.org> References: <201207172150.q6HLoFvB096742@freefall.freebsd.org> <20120718103210.E834@besplex.bde.org> <20120718035220.GO2676@deviant.kiev.zoral.com.ua> <20120718140523.X1862@besplex.bde.org> <20120718050958.GQ2676@deviant.kiev.zoral.com.ua> <20120718153554.O2195@besplex.bde.org> <20120718081003.GT2676@deviant.kiev.zoral.com.ua> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Mark Linimon , freebsd-amd64@freebsd.org Subject: Re: amd64/169927: siginfo, si_code for fpe errors when error occurs using the SSE math processor X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 Jul 2012 03:10:40 -0000 On Wed, 18 Jul 2012, Konstantin Belousov wrote: > On Wed, Jul 18, 2012 at 04:59:38PM +1000, Bruce Evans wrote: >> On Wed, 18 Jul 2012, Konstantin Belousov wrote: >> Putting the definiton in machine/pcpu.h should avoid changing the >> prerequistes for machine/pcb.h. >> >>> #ifndef _KERNEL >>> /* stuff that *used* to be included by user.h, or is now needed */ >>> >>> Please note the location in pcb.h an not in machine/pcpu.h, where it >>> cannot work for technical reasons (struct pcpu is not defined yet). >> >> Not applicable -- see above. > No, this cannot work. machine/pcpu.h defines PCPU_MD_FIELDS which is used > to provide md part of the struct pcpu. I see. Oops. But this problem is easy to avoid. There are at least the following alternatives: (1) hard-code the offset, as for curthread. This is ugly, but not too hard to maintain. The offset has been 4 * sizeof(struct thread *), for more than 10 years, and you can't change this without breaking the ABI. (2) #define the offset, as in raw asm code. The offset is #define'd in assym.s. This would give too much namespace pollution and bloat (unless a short assym.non-s is used). (3) use a macro instead of an inline function. Inline functions tend to require much more namespace pollution than macros, and this is an example of that. This is essentially the method that we already use. We define define the macro PCPU_GET(). We could define curpcb as PCPU_GET(curpcb). Instead, we write PCPU_GET(curpcb) everywhere that we want the value. I didn't like the roto-tilling of the source code for this and other pcpu fields, and we didn't do it for curproc (we roto-tilled curproc to curthread instead). Both curthread and curpcb are not as volatile as the other fields, so it is technically slightly incorrect to use the general volatile PCPU_GET() for them. We already use a de-volatalized PCPU_GET() for curthread. This was originally just an optimization and a de-obfuscation. PCPU_GET() expands to really horrible, hard-to debug code. It was so horrible and bloated that it broke compiling, so peter changed curthread to use an inline function in 2003. This is only done for some arches (just amd64 and i386?), and sys/pcpu.h still defines curthread as PCPU_GET(curthread) if it is not already defined. The inline function was originally equivalent to PCPU_GET(). It used a volatile asm and wasn't pure2. But since it was independent of PCPU_GET(), it was easier to optimize. To do the same thing for curpcb, provide a PCPU_NONVOLATILE_GET() which is the same as PCPU_GET() without the volatile in the asm, and #define curpcb as PCPU_NONVOLATILE_GET(curpcb). Or maybe start with the inline function used for curthread, and macroize it. This could be used for curthread too, but I think we want to keep it as an inline function since the macro expansions are still horrible. BTW, can you explain the locking for the pcpu access functions? It seems to be quite broken, with the volatile in the asms neither necessary nor sufficient. PCPU_PTR() seems to be almost unusable, but it is used. Here is the i386 version of __PCPU_GET(): % #define __PCPU_GET(name) __extension__ ({ \ % __pcpu_type(name) __res; \ % struct __s { \ % u_char __b[MIN(sizeof(__res), 4)]; \ % } __s; \ % \ % if (sizeof(__res) == 1 || sizeof(__res) == 2 || \ % sizeof(__res) == 4) { \ % __asm __volatile("mov %%fs:%1,%0" \ % : "=r" (__s) \ % : "m" (*(struct __s *)(__pcpu_offset(name)))); \ % *(struct __s *)(void *)&__res = __s; \ This part is more or less OK. It is subtle. The pcpu data starting at %fs:0 is volatile (or at least variable), and so is its address. Its address is not a C address, but is the constant 0 in %fs-space. %fs is equivalent to a volatile C pointer. When %fs changes, it is normally the responsibility of the caller to lock things so that this is not a problem or so that %fs can't change. (The %fs descriptor doesn't actually change, but the %fs data does). There are some important special cases where no locking is required: - for pc_curthread. This cannot change (except in context-switching code). The CPU running the code can change unless the caller locks things. Then %fs changes, but pc_curthread is switched so that it is the same for the new CPU as for the old CPU. - for pc_curpcb. This is switched like pc_curthread. - I don't know of any other pcpu variables that are as easy to use or as obviously correctly locked as the above 2. - counter variables. Now you want the %fs pointer to change to track the CPU. % } else { \ % __res = *__PCPU_PTR(name); \ __PCPU_PTR() seems to be almost unusable, and this is not one of the few cases where it is usable. Here it is used for wide or unusually-sized fields where atomic access is especially problematic, yet it does even less that the above to give atomicity. First consider using it in all cases to replace the above: - using it for fields like pc_curthead would be very broken. The pointer doesn't change, so if there is a context switch between reading the pointer and following it, the pointer reads the data for the wrong CPU. For pc_cuthread, the new pointer will give the old thread which is what curthread should return, but the old pointer will give some unrelated thread (whatever the old CPU is running now). - for counter variables, there is no problem. We don't care whether the pointer is for the new CPU or the old one, provided the access is atomic, and here I am only considering the case where accesses are natuarally atomic. (We actually use different access functions for counters so that things like incrementing counters is atomic.) Now consider it for wide data. The pointer doesn't change so the multiple accesses needed for the wide data are at least for the same CPU. This gives the same atomicity problems as for C pointers. The caller needs to lock. But PCPU_GET() is supposed to be a general interface. Its atomicity shouldn't depend on the size of the data. % } \ % __res; \ % }) What happens for arches where PCPU_GET() uses a C pointer? I don't see how this can work for fields that are switched like pc_curthread. Well, here is what happens in some cases: - on ia64, PCPU_GET() uses the C pointer pcpu which is a fixed register. curthread works because it doesn't use PCPU_GET() (it uses a __pure2 non-volatile asm similarly to i386). curpcb works beause it is not used. - sparc64 is similar to ia64 except it gives even more examples of the special switched fields: pcpu is C pointer in a fixed register; curpcb is a C pointer in another fixed register; curthread uses a __pure2 non-volatile asm similarly to ia64. Why is asm used for curthread on ia64 and sparc64? The asm seems to have no magic, but just follows the C pointer. Ah, this seems to be just bogus. Old versions did have magic -- they used a volatile asm and didn't use __pure2, but they were optimized like the x86 versions 2 years ago. So the problem seems to be well understood for the switched fields, and to not exist for counters. Other cases are hopefully handled by locking. Here are some that aren't: - PCPU_GET/SET(switchtime) in kern_resource.c. The accesses are only proc locked AFAIK. switchtime is uint64_t, so accesses to it are non-atomic on 32-bit arches. - PCPU_GET/SET(switchtime) in kern_synch.c and sched_*.c. Now the accesses are thread locked or sched_locked, which had better be enough (it is all that locks switching pc_curthread too?). - in clock code, there is a lot of thread locking, but this is localized and not all of the pcpu accesses are covered by it. Apparently, clock code depends on being called in fast interrupt handle context to give locking for pcpu. PCPU accesses are surprisingly rare except for counters, so there don't seem to be many problems. Most uses of PCP_PTR() are reasonable, since they are in places like subr_witness.c that should know to lock. I would remove the one in PCPU_GET(). This would break mainly the switchtime accesses on i386 and force callers to use PCPU_GET() and know that they need locking. Bruce