From owner-svn-src-all@FreeBSD.ORG Wed Jun 22 19:31:36 2011 Return-Path: Delivered-To: svn-src-all@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 802FB106566B; Wed, 22 Jun 2011 19:31:36 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail09.syd.optusnet.com.au (mail09.syd.optusnet.com.au [211.29.132.190]) by mx1.freebsd.org (Postfix) with ESMTP id 137F48FC19; Wed, 22 Jun 2011 19:31:35 +0000 (UTC) Received: from c122-106-165-191.carlnfd1.nsw.optusnet.com.au (c122-106-165-191.carlnfd1.nsw.optusnet.com.au [122.106.165.191]) by mail09.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id p5MJVPgO031151 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 23 Jun 2011 05:31:26 +1000 Date: Thu, 23 Jun 2011 05:31:25 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Warner Losh In-Reply-To: <82E2B828-97C9-4C35-A619-ACDB5C40E99B@bsdimp.com> Message-ID: <20110623050025.O1587@besplex.bde.org> References: <201106191913.p5JJDOqJ006272@svn.freebsd.org> <20110622063258.D2275@besplex.bde.org> <4E0128FF.6020804@rice.edu> <82E2B828-97C9-4C35-A619-ACDB5C40E99B@bsdimp.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: src-committers@FreeBSD.org, Alan Cox , Alan Cox , svn-src-all@FreeBSD.org, Attilio Rao , "Bjoern A. Zeeb" , Bruce Evans , svn-src-head@FreeBSD.org Subject: Re: svn commit: r223307 - head/sys/vm X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 Jun 2011 19:31:36 -0000 On Wed, 22 Jun 2011, Warner Losh wrote: > On Jun 22, 2011, at 3:26 AM, Attilio Rao wrote: > >> 2011/6/22 Warner Losh : >>> >>> On Jun 21, 2011, at 5:27 PM, Alan Cox wrote: >>> >>> On 06/21/2011 16:09, Attilio Rao wrote: >>> >>> 2011/6/21 Bruce Evans: >>> >>> On Tue, 21 Jun 2011, Bjoern A. Zeeb wrote: >>> >>> On Jun 19, 2011, at 7:13 PM, Alan Cox wrote: >>> >>> Hi Alan, >>> >>> Author: alc >>> >>> Date: Sun Jun 19 19:13:24 2011 >>> >>> New Revision: 223307 >>> >>> URL: http://svn.freebsd.org/changeset/base/223307 >>> >>> Log: >>> >>> Precisely document the synchronization rules for the page's dirty field. >>> >>> (Saying that the lock on the object that the page belongs to must be >>> >>> held >>> >>> only represents one aspect of the rules.) >>> >>> Eliminate the use of the page queues lock for atomically performing >>> >>> read- >>> >>> modify-write operations on the dirty field when the underlying >>> >>> architecture >>> >>> supports atomic operations on char and short types. >>> >>> Document the fact that 32KB pages aren't really supported. >>> >>> contrary to the tinderbox I'd like to point out that all mips kernels >>> >>> built by universe are broken with a SVN HEAD from earlier today. Could you >>> >>> please check and see if you can fix it? The errors I get are: >>> >>> vm_page.o: In function `vm_page_clear_dirty': >>> >>> /sys/vm/vm_page.c:(.text+0x18d0): undefined reference to `atomic_clear_8' >>> >>> /sys/vm/vm_page.c:(.text+0x18d0): relocation truncated to fit: R_MIPS_26 >>> >>> against `atomic_clear_8' >>> >>> vm_page.o: In function `vm_page_set_validclean': >>> >>> /sys/vm/vm_page.c:(.text+0x38f0): undefined reference to `atomic_clear_8' >>> >>> /sys/vm/vm_page.c:(.text+0x38f0): relocation truncated to fit: R_MIPS_26 >>> >>> against `atomic_clear_8' >>> >>> Atomic types shorter than int cannot be used in MI code, since they might >>> >>> not exist. Apparently they don't exist on mips. jake@ fixed all their >>> >>> old uses for sparc4 in ~Y2K. >>> >>> I'm sure they do, they exist in support.S though and may not have the >>> >>> _8 form (they may just have the _char version). I may look at the code >>> >>> again to be sure. >>> >>> >>> It appears that while mips/include/atomic.h declares the existence of >>> atomic_clear_8, mips/mips/support.S doesn't implement it. In other words, >>> only support for int's and short's is currently implemented, not char's: >>> >>> # grep atomic_clear mips/mips/support.S >>> * atomic_clear_32(u_int32_t *a, u_int32_t b) >>> LEAF(atomic_clear_32) >>> END(atomic_clear_32) >>> * atomic_clear_16(u_int16_t *a, u_int16_t b) >>> LEAF(atomic_clear_16) >>> END(atomic_clear_16) >>> >>> The current crop of atomic*16 and atomic*8 functions have the restriction >>> that the address must be 32-bit aligned (and it forces this by aligning to >>> 32-bits silently and then operates on the low 8 or 16 bits in that word!) >>> I'm guessing that this is likely just wrong. Comments? >>> Warner >> >> That is wrong, of course, and my personal opinion is that one should >> not implement atomic operations if they cannot be done efficiently >> (example: if you need to disable interrupts or similar expensive >> operation just to assure atomicity of operation, just don't support >> it) as long as not having _8/_char is perfectly fine. > > I think it can be efficient, for some reasonable definition of efficient. The masking and shifting operations aren't that onerous to write and the instructions cycles would be dwarfed by the ll/sc pair, which are required to implement atomic. > > The issue is that the code today is clearly wrong and needs to be fixed. I think it is still good to discourage use of non-bus-width atomic ops by not supporting at the MI level them for any arch. Suppose only bus-width accesses are atomic, as seems to be the case on mips. Then you can still do atomic accesses on bytes by doing an atomic op on the neigbouring 3 or 7 bytes. But this may increase memory contention by a factor of 3 or 7. The contention can be reduced either by putting only rarely-accessed bytes near the bytes that need the atomic ops, or by putting unused padding there, or by not using byte-sized objects to begin with (this gives the same effect and even the same memory accesses as unused padding bytes -- there are unused bits in the objects instead of external unused bytes). The former can be arranged easily enough in the vm_page struct. But MI code shouldn't know about this. Suppose byte accesses are atomic at the instruction level. Then the hardware probably still needs to expand to the bus width. Depending on how well it virtualizes this, there may be the same factor of 3, 7 or more in memory contention. The relevant bus width may be the cache line size which on modern CPUs is much more than 4 bytes and probably more like 32. So the CPU must already being doing good virtualization to make 4-byte atomic accesses not too slow (non-atomic accesses are already virtualized by the caches). For efficiency, we really should pay more attention to packing related int variables into cache lines (either the same line to increase cache hits or separate lines, to reduce contention), but the best way to do this is very MD and unclear. Bruce