Date: Fri, 5 May 2017 19:13:04 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: Konstantin Belousov <kib@freebsd.org>
Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: Re: svn commit: r317809 - head/share/man/man7
Message-ID: <20170505174957.B875@besplex.bde.org>
In-Reply-To: <201705042131.v44LVokb076951@repo.freebsd.org>
References: <201705042131.v44LVokb076951@repo.freebsd.org>
On Thu, 4 May 2017, Konstantin Belousov wrote:

> Log:
> Provide introduction for the arch(7) manpage.
>
> Start with some words about linear address space and its layout, then
> explain pointers models and ABIs, providing explanation to the
> structure of the tables.
>
> Reviewed by: emaste, imp
> 'Future-proof' cheri wording by: brooks
>
> Modified: head/share/man/man7/arch.7
> ==============================================================================
> --- head/share/man/man7/arch.7 Thu May 4 21:30:26 2017 (r317808)
> +++ head/share/man/man7/arch.7 Thu May 4 21:31:50 2017 (r317809)
> ...
> @@ -35,9 +35,92 @@
> .Sh DESCRIPTION
> Differences between CPU architectures and platforms supported by
> .Fx .
> -.Pp
> +.Ss Introduction
> If not explicitly mentioned, sizes are in bytes.
> +.Pp
> +FreeBSD uses flat address space for program execution, where
> +pointers have the same binary representation as

Minor grammar problems. "binary" is redundant.

> +.Vt unsigned long
> +variables, and
> +.Vt uintptr_t
> +and
> +.Vt size_t
> +types are synonyms for
> +.Vt unsigned long .

uintptr_t and size_t are not synonyms for unsigned long on all arches.
They only have the same representation on 32-bit arches. On 32-bit
arches, they are synonyms for unsigned int, and thus have a lower rank
than unsigned long. This mainly causes problems printing them, but might
cause sign extension/overflow problems. For example,
(size_t)0 + (long)-1 is unsigned and large positive on 64-bit arches,
but signed and small negative on 32-bit arches.

> +.Pp
> +In order to maximize compatibility with future pointer integrity mechanisms,

"pointer integrity mechanisms" sounds like management/marketingspeak.
"integrity" isn't a relevant property of integer types. "mechanism"
might mean the details of the representation (more than the size), but
I think you just mean the size. Most manipulations of pointers as
integers assume the same representation.
You stated that the representation is the same [in future] above, and
didn't use the usual caveat "on all supported arches". I don't like
this, but lots of code depends on it.

Translation of the above: "... compatibility with changes in the size
of pointers in future implementations".

> +manipulations of pointers as integers should be performed via
> +.Vt uintptr_t
> +or
> +.Vt intptr_t
> +and no other types.

Except that in the kernel, vm_offset_t should normally be used.

In fact, it is wrong to use [u]intptr_t for anything except what is
guaranteed by the C standard. The only guarantee is that you get back
the same value (not necessarily the same bits) if you start with a
pointer of type void * (possibly also qualified void *) and convert it
to [u]intptr_t and back. You can also look at the bits in the integer
representation, but don't expect these to be useful. Errors generally
start in the cast. To convert a struct pointer to an integer and back
(with the same value), it is necessary to first convert to void *, then
to [u]intptr_t, then back to void *, and finally back to the struct
pointer.

Use vm_offset_t for unportable uses. For flat address spaces, it is
assumed that addition of offsets in the integer corresponds to addition
of byte offsets in the pointer, as if the pointer is a pointer to
unsigned char. Most other properties follow from that.

There is a problem converting to vm_offset_t. We should guarantee that
vm_offset_t has all the properties of uintptr_t and much more -- that it
is not restricted to conversions between void * and back. The second
guarantee requires compiler support in general, but we assume a flat
address space so it just requires the compiler to not be perverse.
Obviously, if [u]intptr_t exists, then the compiler can add the
intermediate casts to and from void * to handle other pointer types.

> +In particular,
> +.Vt long
> +and
> +.Vt ptrdiff_t
> +should be avoided.

ptrdiff_t should never be used in portable code.
Neither should pointer subtraction. Only pointer differences within
PTRDIFF_MIN/MAX are representable; otherwise, pointer subtraction is
undefined. PTRDIFF_MIN/MAX can be as low as +-65535. Perverse and
portability-testing implementations implement the handy type int17_t to
use it perversely for ptrdiff_t, with size_t perhaps also perversely
small (it can be uint15_t), but usually much larger than this ptrdiff_t.
Pointer subtraction is thus undefined in general even within the same
array if the array has 65536 elements.

There is a minor practical problem with non-perverse ptrdiff_t and a
corresponding problem for vm_offset_t. 32-bit vm_offset_t has a range of
4G, but can't handle negative offsets, so you have to be careful not to
subtract a larger pointer from a smaller one, or handle the wrap from
this. 32-bit ptrdiff_t has a range of +-2G, so it can't handle pointers
differing by half of the address space.

> +Compilers define
> +.Dv _LP64
> +symbol when compiling for an
> +.Dv LP64
> +ABI.

Further minor grammar problems here and elsewhere:
- missing "the" before _LP64
- "an" is confusing. First, "a" might be correct depending on how you
  pronounce LP64. I pronounce it as "el ...", so "an" is better than
  "a". But there is only 1 LP64, so "the" is more correct.

"the LP64 ABI" is confusing too. LP64 isn't an ABI or a collection of
ABIs. The collection is of arches, many using a single LP64 sub-ABI with
variations in other parts of their ABI.

> ...
> +Examples are:
> +.Bl -column -offset indent "powerpc64" "Sy ILP32 counterpart"
> +.It Sy LP64 Ta Sy ILP32 counterpart

This has the "Sy" sizing bug in only 1 field in the header.

> @@ -48,6 +131,9 @@ On all supported architectures:
> .It float Ta 4
> .It double Ta 8
> .El
> +Integers are represented as two-complement.
> +Alignment of integer and pointer types is natural, that is,
> +the address of the variable must be congruent to zero modulo type size.

Missing "the" after "modulo". Is it natural for arm?
arm has unnatural struct padding, at least at the end of structs.

Bruce
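
A minimal C sketch of two points from the message above; it is not part
of the commit or of arch(7). It shows the round trip of a struct pointer
through void * and uintptr_t, the unportable flat-address-space
assumption about adding offsets, and how the result type of mixed
arithmetic depends on the rank of the operands. struct item and every
function name here are hypothetical, chosen only for illustration. The
rank example uses fixed-width types instead of size_t and long so that
it behaves the same on 32-bit and 64-bit arches; where long is itself
only 32 bits wide, the usual arithmetic conversions turn
unsigned int + long into unsigned long, so a wider signed operand is
needed to see a signed result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct used only to have a non-void pointer type. */
struct item {
	int id;
	char tag;
};

/*
 * Portable direction: struct pointer -> void * -> uintptr_t.
 * The C standard only covers void * here, hence the inner cast.
 */
static uintptr_t
item_to_int(struct item *p)
{
	return ((uintptr_t)(void *)p);
}

/* And back: uintptr_t -> void * -> struct pointer. */
static struct item *
int_to_item(uintptr_t u)
{
	return ((struct item *)(void *)u);
}

/*
 * Unportable assumption for flat address spaces: adding n to the
 * integer matches adding n bytes to an unsigned char pointer.
 * base + n must stay inside (or one past) the same object.
 */
static int
flat_add_matches(void *base, size_t n)
{
	uintptr_t u;

	assert(base != NULL);
	u = (uintptr_t)base + n;
	return ((void *)u == (void *)((unsigned char *)base + n));
}

/*
 * Lower-rank unsigned plus wider signed: the signed type can hold
 * every uint32_t value, so the sum is signed and equals -1.
 */
static int
narrow_unsigned_plus_wide_signed_is_negative(void)
{
	return ((uint32_t)0 + (int64_t)-1 < 0);
}

/*
 * Same-width unsigned plus signed: the unsigned type wins, so -1
 * wraps to the largest 64-bit unsigned value.
 */
static int
same_width_sum_is_large_positive(void)
{
	return ((uint64_t)0 + (int64_t)-1 == UINT64_MAX);
}
```

int_to_item(item_to_int(&x)) is expected to compare equal to &x wherever
uintptr_t exists, while flat_add_matches() is only expected to return
nonzero on flat-address-space arches and says nothing about what the
standard guarantees.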