From owner-freebsd-amd64@FreeBSD.ORG Sun Jul 29 04:58:41 2012
Return-Path:
Delivered-To: amd64@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 49AED106566B; Sun, 29 Jul 2012 04:58:41 +0000 (UTC) (envelope-from brde@optusnet.com.au)
Received: from mail06.syd.optusnet.com.au (mail06.syd.optusnet.com.au [211.29.132.187]) by mx1.freebsd.org (Postfix) with ESMTP id 8B1B08FC08; Sun, 29 Jul 2012 04:58:40 +0000 (UTC)
Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail06.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q6T4wbpf021806 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sun, 29 Jul 2012 14:58:37 +1000
Date: Sun, 29 Jul 2012 14:58:36 +1000 (EST)
From: Bruce Evans
X-X-Sender: bde@besplex.bde.org
To: Konstantin Belousov
In-Reply-To: <20120728160202.GI2676@deviant.kiev.zoral.com.ua>
Message-ID: <20120729123231.K1193@besplex.bde.org>
References: <20120728160202.GI2676@deviant.kiev.zoral.com.ua>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: amd64@FreeBSD.org, Jung-uk Kim, Andriy Gapon
Subject: Re: Use fences for kernel tsc reads.
X-BeenThere: freebsd-amd64@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Porting FreeBSD to the AMD64 platform
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 29 Jul 2012 04:58:41 -0000

On Sat, 28 Jul 2012, Konstantin Belousov wrote:

> This was discussed on somewhat unwieldy thread on svn-src@ as a followup
> to the commit r238755 which uncovered the problem in the first place.
>
> Below is the commit candidate. It changes the fix in r238755 to use CPUID
> instead of rmb(), since rmb() translates to locked operation on i386,
> and experimentation showed that Nehalem's RDTSC can pass LOCK.

At least remove TSC-low instead of building another layer of
pessimizations and source bloat on top of it. I might not mind even
more source code bloat for TSC-low and fences if it were runtime tests
to avoid using them unless necessary. At least sysctls to avoid using
them. When the kernel selects TSC-low, ordinary TSC becomes
unavailable.

> ...
> Handling of usermode will follow later.

I hesitate to mention that this doesn't pessimize all uses of rdtsc:
- x86/isa/clock.c uses raw rdtsc32() for DELAY()
- x86/x86/mca.c uses raw rdtsc() for mca_check_status()
- x86/x86/tsc.c uses raw rdtsc() for:
  - calibrating the TSC frequency. The calibration is otherwise sloppy
    and would only be slightly affected by additional sloppiness or
    its removal.
  - the CPU ticker. A function pointer is used and the address of the
    static inline rdtsc() is taken. It becomes not so raw -- a normal
    static function. I don't like the CPU ticker. It gives huge
    inaccuracies in some cases. Most uses of it go through calcru(),
    and it ameliorates some of the bugs. It enforces monotonicity of
    rusage times for other reasons. When the inaccuracies or overflow
    cause the time to go backwards, calcru() complains about some
    cases. This is about the only place in the kernel that checks for
    the time going backwards. Most uses of the CPU ticker are for
    setting switchtime on context switches. These uses are far apart,
    and near heavy locking, and on a single CPU, so the TSC as read by
    them can probably never go backwards. (The same magic that allows
    switching to be per-CPU should allow the TSC to be per-CPU.)
- {amd64,i386}/include/cpu.h uses the CPU ticker for get_cyclecount().
  I don't like get_cyclecount(). It is a bad API resulting from
  previous mistakes in this area. It is basically the TSC spelled
  differently so that the TSC could be used when the TSC timecounter
  was not properly exported or the hardware doesn't have it. x86
  hardware doesn't have it, and non-x86 might name it differently even
  if it has it.
  When the hardware doesn't have it, a weak replacement like the
  current time counter is used. But this makes it too slow to use,
  since all places that use it use it because they don't want slowness
  (otherwise they would just use a timecounter directly). The
  selection used to be ifdefed in i386/include/cpu.h. Now, it just
  uses the CPU ticker on 386 (it still uses a raw rdtsc() on amd64).
  It is actually documented in section 9, but the much more important
  CPU ticker isn't. Its comment in at least the i386 machine/cpu.h
  still says that it returns a "bogo-time" for random harvesting
  purposes, but the CPU ticker time is not so bogus (it must be good
  enough for thread accounting, using deltas on the same CPU), and of
  course this mistake is now used for more than random harvesting.

  On other arches, the implementation of get_cyclecount() is
  differently bogus:
  - on amd64, it is an inline function that calls the inline rdtsc()
  - on arm, it is an inline function that always uses binuptime() and
    never goes through the CPU ticker
  - on ia64, it is #defined as a raw counter API spelled without `()'
    so it looks like a variable, but I guess it is a function
  - on i386, it is an inline function that calls the non-inline
    cpu_ticks(), so it is slower than the old ifdefed version.
    cpu_ticks() goes through at least a function pointer to reach the
    non-inline rdtsc().
  - on mips, it is #defined as a raw counter API spelled with `()'
  - on powerpc, it is a dedicated inline function with inline asm
  - on sparc64, it is an inline function that calls a raw counter API
  That is a lot of differently spelled bloat to support an API that
  should never have existed. All arches could have just the same
  slowness as i386, with no MD code, by translating it to cpu_ticks()
  as on i386. Or callers of it could abuse cpu_ticks() directly.

  get_cyclecount() has escaped from random harvesting to the following
  places:
  - dev/de/if_devar.h: used for timestamps under a performance-testing
    option that seems to not be the default (so hardly anyone except
    its developer ever used it). Old (10 or 100 Mbps here?) NIC
    drivers don't need very efficient timestamps, and it would be nice
    if they weren't in unscaled bogotime. I use microtime() in
    not-so-old (1Gbps) NIC drivers. This with a TSC works well for
    timestamping events a few usec apart. It wouldn't work so well
    with an APIC timecounter taking about 1 usec to read. But the APIC
    timecounter should be fast enough for this use at 100 Mbps. The
    i8254 timecounter is 5 times slower again, but fast enough at 10
    Mbps (original de?).
  - dev/random/harvest.c: the original use. A cycle counter seems like
    an especially bad choice for randomness. It just has a bit or two
    of randomness in its low bits. The serialization bug makes it
    better for this.
  - dev/random/randomdev_soft.c: another original use.
  - kern/init_main.c: seeding for randomdev.
  - kern/kern_ktr.c: for timestamps that should at least be monotonic
    and unique even if they are in unscaled bogotime. get_cyclecount()
    is highly unsuitable for this, especially when it uses a
    timecounter (don't put any ktr calls in timecounter code, else you
    might get endless recursion). Apparently, get_cyclecount() was
    used because it was the only portable API to reach the hardware
    name. For this use, get_cyclecount() should not be faked.
    Serializing might be needed here. TSC-low lossage is unwanted here
    -- it might break uniqueness.
  - netinet/sctp_os_bsd.h: for tracing. Like ktr, except it doesn't
    need to worry about recursion. Perhaps can be more like NIC
    drivers should be and just use a timecounter.
  Anyway, the unserialized TSC serves for providing bogo-time for
  get_cyclecount() even better than it serves for providing time for
  the CPU ticker.
- i386/i386/machdep.c uses raw rdtsc() for recalibrating the TSC frequency.
  Similar sloppiness to initial calibration.
- i386/i386/perfmon.c: TSC ticks are just an especially important
  event. Serializing would break performance testing by changing the
  performance significantly. Since this is only for a user API, the
  breakage from the increased overhead would be dominated by existing
  overheads, as for the clock_gettime() syscall.
- include/xen/xen-os.h: home made version of rdtsc() with different
  spelling. To change the spelling, it should use the standard inline
  and not repeat asm.
- isa/prof_machdep.c. Like perfmon, but now rdtsc() is called on every
  function call and return for high resolution profiling, and there is
  no slow syscall to already slow this down. Function call lengths
  down to about 10 nsec can be measured reliably on average. The
  overhead is already much more than this, but is compensated for, so
  lengths of 10 nsec might remain measurable, but the overhead is
  already too much and adding to it wouldn't help, and might disturb
  the performance too much. This use really wants serialization, and
  out-of-order timestamps might show up as negative function call
  times, but they never have. High resolution kernel profiling never
  worked for SMP (except in my version, it works except for excessive
  lock contention) and was broken by gcc-4 changing the default
  compiler options, so this problem can be ignored for now.

Please trim lots of the above if you reply.

> diff --git a/sys/x86/x86/tsc.c b/sys/x86/x86/tsc.c
> index c253a96..101cbb3 100644
> --- a/sys/x86/x86/tsc.c
> +++ b/sys/x86/x86/tsc.c
> ...
> @@ -328,15 +344,26 @@ init_TSC(void)
>
>  #ifdef SMP
>
> -/* rmb is required here because rdtsc is not a serializing instruction. */
> +/*
> + * RDTSC is not a serializing instruction, so we need to drain
> + * instruction stream before executing it. It could be fixed by use of
> + * RDTSCP, except the instruction is not available everywhere.

Sentence breaks are 2 spaces in KNF. There was a recent thread about this.
I used to police style in this file more, and it mostly uses 2 spaces.

> + *
> + * Use CPUID for draining. The timecounters use MFENCE for AMD CPUs,
> + * and LFENCE for others (Intel and VIA) when SSE2 is present, and
> + * nothing on older machines which also do not issue RDTSC
> + * prematurely. There, testing for SSE2 and vendor is too cumbersome,
> + * and we learn about TSC presence from CPUID.
> + */

RDTSC apparently doesn't need full serialization (else fences wouldn't
be enough, and the necessary serialization would be slower). The
necessary serialization is very complex and not fully described above,
but the above is not a good place to describe many details. It already
describes too much. It is obvious that we shouldn't use large code or
ifdefs to be portable in places where we don't care at all about
efficiency. Only delicate timing requirements would prevent us using
the simple CPUID in such code. For example, in code very much like
this, we might need to _not_ slow things down, since we might want
to see small differences between the TSCs on different CPUs.

> @@ -592,23 +628,55 @@ sysctl_machdep_tsc_freq(SYSCTL_HANDLER_ARGS)
>  SYSCTL_PROC(_machdep, OID_AUTO, tsc_freq, CTLTYPE_U64 | CTLFLAG_RW,
>      0, 0, sysctl_machdep_tsc_freq, "QU", "Time Stamp Counter frequency");
>
> -static u_int
> +static inline u_int
>  tsc_get_timecount(struct timecounter *tc __unused)
>  {
>
>  	return (rdtsc32());
>  }

Adding `inline' gives up to 5 (!) style bugs at different levels:
- `inline' is usually spelled `__inline'. The former is now standard,
  but sys/cdefs.h never caught up with this and has better magic for
  the latter. And `inline' doesn't match most existing spellings.
- the function is still forward-declared without `inline'. The
  difference is confusing.
- forward declaration of inline functions is nonsense. It sort of asks
  for early uses to be non-inline and late uses to be inline. But
  gcc now doesn't care much about the order, since -funit-at-a-time is
  the default.
- this function is never actually inline. You added calls to the others
  but not this.
- if this function were inline, then this might happen automatically due
  to -finline-functions-called-once. See below.

> 
> -static u_int
> +static inline u_int
>  tsc_get_timecount_low(struct timecounter *tc)
>  {
>  	uint32_t rv;
>
>  	__asm __volatile("rdtsc; shrd %%cl, %%edx, %0"
> -	: "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx");
> +	    : "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx");
>  	return (rv);
>  }

- as above, except now the function can actually be inline. It is still
  dereferenced, so it is not actually inline in all cases. You added 2
  calls to it, so it is inlined in 2 cases. gcc now inlines functions
  called once, especially when you don't want this. Here this is
  wanted. I don't know if gcc considers a use to be "once" if there is
  a single inlineable use and also a non-inlineable use.

  I miscounted the number of uses at first. There are now 2 inlineable
  cases, and although inlining would be good, it is no longer "once" so
  gcc shouldn't do it without an explicit __inline.

> 
> +static inline u_int
> +tsc_get_timecount_lfence(struct timecounter *tc __unused)
> +{
> +
> +	lfence();
> +	return (rdtsc32());
> +}

Here and for mfence, you call the raw function. That's why
tsc_get_timecount() is never inline. I first thought that the patch was
missing support for plain TSC.

I also don't like the rdtsc32() API :-). Since it is inline, compilers
should be able to ignore the high bits in rdtsc(). But we're micro-
optimizing this to get nice looking code and save a whole cycle or so.
I'm normally happy to sacrifice a cycle for cleaner source code but not
for 4 cycles :-). If the extra code were before the fence, then its
overhead would be lost in the stalling for the fence. Perhaps similarly
without fences. Since it is after, it is more likely to have a full
cost, with a latency of many more than 1 cycle now not hidden by
out-of-order execution.
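
For readers following the asm above, here is a minimal portable sketch of what the two read paths compute. It is only a model under stated assumptions: the 64-bit counter value is passed in as an argument instead of being read with RDTSC, and the names model_rdtsc32() and model_tsc_low() are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of rdtsc32(): keep only the low 32 bits (EAX)
 * of the 64-bit counter and ignore the high half (EDX).
 */
static uint32_t
model_rdtsc32(uint64_t tsc)
{

	return ((uint32_t)tsc);
}

/*
 * Hypothetical model of "rdtsc; shrd %cl, %edx, %eax": shift the
 * 64-bit EDX:EAX pair right by `shift' bits and keep the low 32 bits.
 * The model assumes a shift count below 32, as in the TSC-low setup
 * loop, which stops at 31.
 */
static uint32_t
model_tsc_low(uint64_t tsc, unsigned shift)
{

	assert(shift < 32);
	return ((uint32_t)(tsc >> shift));
}
```

With a shift of 8, the inputs 0x100 and 0x1ff both map to 1, which is the loss of uniqueness that the remark about TSC-low and ktr timestamps refers to.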

> +
> +static inline u_int
> +tsc_get_timecount_low_lfence(struct timecounter *tc)
> +{
> +
> +	lfence();
> +	return (tsc_get_timecount_low(tc));
> +}
> +
> +static inline u_int
> +tsc_get_timecount_mfence(struct timecounter *tc __unused)
> +{
> +
> +	mfence();
> +	return (rdtsc32());
> +}
> +
> +static inline u_int
> +tsc_get_timecount_low_mfence(struct timecounter *tc)
> +{
> +
> +	mfence();
> +	return (tsc_get_timecount_low(tc));
> +}
> +
>  uint32_t
>  cpu_fill_vdso_timehands(struct vdso_timehands *vdso_th)
>  {
>

I don't like the indirections for using these functions, but we already
have them. Linux would modify the instruction stream according to a
runtime probe to reduce it to an inline rdtsc in binuptime() etc. if
possible (otherwise modify to add fences as necessary). Not inlining
rdtsc like we do probably reduces serialization problems.

Bruce

From owner-freebsd-amd64@FreeBSD.ORG Sun Jul 29 17:12:56 2012
Return-Path:
Delivered-To: amd64@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 12D91106566C; Sun, 29 Jul 2012 17:12:56 +0000 (UTC) (envelope-from kostikbel@gmail.com)
Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id 409108FC0A; Sun, 29 Jul 2012 17:12:54 +0000 (UTC)
Received: from skuns.kiev.zoral.com.ua (localhost [127.0.0.1]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id q6THCuo9011589; Sun, 29 Jul 2012 20:12:56 +0300 (EEST) (envelope-from kostikbel@gmail.com)
Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.5/8.14.5) with ESMTP id q6THChCY032847; Sun, 29 Jul 2012 20:12:43 +0300 (EEST) (envelope-from kostikbel@gmail.com)
Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.5/8.14.5/Submit) id q6THChNU032846; Sun, 29 Jul 2012 20:12:43 +0300 (EEST) (envelope-from kostikbel@gmail.com)
X-Authentication-Warning: deviant.kiev.zoral.com.ua:
 kostik set sender to kostikbel@gmail.com using -f
Date: Sun, 29 Jul 2012 20:12:43 +0300
From: Konstantin Belousov
To: Bruce Evans
Message-ID: <20120729171243.GO2676@deviant.kiev.zoral.com.ua>
References: <20120728160202.GI2676@deviant.kiev.zoral.com.ua> <20120729123231.K1193@besplex.bde.org>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="/bcALokiWR3y46GP"
Content-Disposition: inline
In-Reply-To: <20120729123231.K1193@besplex.bde.org>
User-Agent: Mutt/1.4.2.3i
X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua
X-Virus-Status: Clean
X-Spam-Status: No, score=-4.0 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua
Cc: amd64@freebsd.org, Andriy Gapon
Subject: Re: Use fences for kernel tsc reads.
X-BeenThere: freebsd-amd64@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Porting FreeBSD to the AMD64 platform
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 29 Jul 2012 17:12:56 -0000

--/bcALokiWR3y46GP
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Sun, Jul 29, 2012 at 02:58:36PM +1000, Bruce Evans wrote:
> On Sat, 28 Jul 2012, Konstantin Belousov wrote:
> >...
> >Handling of usermode will follow later.
>
> I hesitate to mention that this doesn't pessimize all uses of rdtsc:
> - x86/isa/clock.c uses raw rdtsc32() for DELAY()
There, a fence is not needed because we do not compare counters on
different cores. Note that delay_tc() explicitly performs a pin when
the TSC is going to be used.

> - x86/x86/mca.c uses raw rdtsc() for mca_check_status()
The mca use of rdtsc() seems to be purely informational.

> >diff --git a/sys/x86/x86/tsc.c b/sys/x86/x86/tsc.c
> >index c253a96..101cbb3 100644
> >--- a/sys/x86/x86/tsc.c
> >+++ b/sys/x86/x86/tsc.c
> >...
> >@@ -328,15 +344,26 @@ init_TSC(void)
> >
> >#ifdef SMP
> >
> >-/* rmb is required here because rdtsc is not a serializing instruction. */
> >+/*
> >+ * RDTSC is not a serializing instruction, so we need to drain
> >+ * instruction stream before executing it. It could be fixed by use of
> >+ * RDTSCP, except the instruction is not available everywhere.
>
> Sentence breaks are 2 spaces in KNF. There was a recent thread about this.
Changed.

> RDTSC apparently doesn't need full serialization (else fences wouldn't
> be enough, and the necessary serialization would be slower). The
> necessary serialization is very complex and not fully described above,
> but the above is not a good place to describe many details. It already
> describes too much. It is obvious that we shouldn't use large code or
> ifdefs to be portable in places where we don't care at all about
> efficiency. Only delicate timing requirements would prevent us using
> the simple CPUID in such code. For example, in code very much like
> this, we might need to _not_ slow things down, since we might want
> to see small differences between the TSCs on different CPUs.
This is an argument to stop using do_cpuid(), since it causes a
relatively large amount of unneeded memory writes. I inlined CPUID
instead, with an explicit register clobber list.

>
> >@@ -592,23 +628,55 @@ sysctl_machdep_tsc_freq(SYSCTL_HANDLER_ARGS)
> >SYSCTL_PROC(_machdep, OID_AUTO, tsc_freq, CTLTYPE_U64 | CTLFLAG_RW,
> >    0, 0, sysctl_machdep_tsc_freq, "QU", "Time Stamp Counter frequency");
> >
> >-static u_int
> >+static inline u_int
> >tsc_get_timecount(struct timecounter *tc __unused)
> >{
> >
> >	return (rdtsc32());
> >}
>
> Adding `inline' gives up to 5 (!) style bugs at different levels:
> - `inline' is usually spelled `__inline'. The former is now standard,
>   but sys/cdefs.h never caught up with this and has better magic for
>   the latter. And `inline' doesn't match most existing spellings.
There is no reason to use __inline in kernel code anymore. We are
always in C99 mode when compiling the kernel.

> - the function is still forward-declared without `inline'. The difference
>   is confusing.
> - forward declaration of inline functions is nonsense. It sort of asks
>   for early uses to be non-inline and late uses to be inline. But
>   gcc now doesn't care much about the order, since -funit-at-a-time is
>   the default.
> - this function is never actually inline. You added calls to the others
>   but not this.
> - if this function were inline, then this might happen automatically due
>   to -finline-functions-called-once. See below.
>
> >
> >-static u_int
> >+static inline u_int
> >tsc_get_timecount_low(struct timecounter *tc)
> >{
> >	uint32_t rv;
> >
> >	__asm __volatile("rdtsc; shrd %%cl, %%edx, %0"
> >-	: "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx");
> >+	    : "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx");
> >	return (rv);
> >}
...

Added inline to the declaration of tsc_get_timecount_low, and removed
'inline' from the other functions. The compiler still generates exactly
'*fence; rdtsc' sequences. In truth, for the _low variants, the load of
tc_priv is done between the fence and rdtsc, but I kept it as is for
now.
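
The fence choice that the patch below makes in probe_tsc_freq() can be summarised by a small sketch. This is an illustration under stated assumptions, not code from the patch: the enums and pick_tsc_fence() are hypothetical names, and the vendor and SSE2 information are passed in as plain arguments instead of being read from cpu_vendor_id and cpu_feature.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's CPU_VENDOR_* constants. */
enum cpu_vendor { VENDOR_AMD, VENDOR_INTEL, VENDOR_CENTAUR, VENDOR_OTHER };

/* Which instruction, if any, is issued before RDTSC. */
enum tsc_fence { FENCE_NONE, FENCE_LFENCE, FENCE_MFENCE };

/*
 * Mirrors the switch in probe_tsc_freq(): without SSE2 no fence is
 * used; with SSE2, AMD gets MFENCE and Intel and VIA (Centaur) get
 * LFENCE.  Vendors not listed in that switch keep the unfenced read.
 */
static enum tsc_fence
pick_tsc_fence(enum cpu_vendor vendor, bool has_sse2)
{

	if (!has_sse2)
		return (FENCE_NONE);
	switch (vendor) {
	case VENDOR_AMD:
		return (FENCE_MFENCE);
	case VENDOR_INTEL:
	case VENDOR_CENTAUR:
		return (FENCE_LFENCE);
	default:
		return (FENCE_NONE);
	}
}
```

Note that the TSC-low branch of the patch is slightly different: there, every non-AMD vendor with SSE2 ends up with the LFENCE variant.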

diff --git a/sys/amd64/include/cpufunc.h b/sys/amd64/include/cpufunc.h
index 94d4133..881fcd2 100644
--- a/sys/amd64/include/cpufunc.h
+++ b/sys/amd64/include/cpufunc.h
@@ -290,6 +290,13 @@ popcntq(u_long mask)
 }
 
 static __inline void
+lfence(void)
+{
+
+	__asm __volatile("lfence" : : : "memory");
+}
+
+static __inline void
 mfence(void)
 {
 
diff --git a/sys/i386/include/cpufunc.h b/sys/i386/include/cpufunc.h
index 62d268d..7cd3663 100644
--- a/sys/i386/include/cpufunc.h
+++ b/sys/i386/include/cpufunc.h
@@ -155,6 +155,13 @@ cpu_mwait(u_long extensions, u_int hints)
 }
 
 static __inline void
+lfence(void)
+{
+
+	__asm __volatile("lfence" : : : "memory");
+}
+
+static __inline void
 mfence(void)
 {
 
diff --git a/sys/x86/x86/tsc.c b/sys/x86/x86/tsc.c
index c253a96..3d8bd30 100644
--- a/sys/x86/x86/tsc.c
+++ b/sys/x86/x86/tsc.c
@@ -82,7 +82,11 @@ static void tsc_freq_changed(void *arg, const struct cf_level *level,
 static void tsc_freq_changing(void *arg, const struct cf_level *level,
     int *status);
 static unsigned tsc_get_timecount(struct timecounter *tc);
-static unsigned tsc_get_timecount_low(struct timecounter *tc);
+static inline unsigned tsc_get_timecount_low(struct timecounter *tc);
+static unsigned tsc_get_timecount_lfence(struct timecounter *tc);
+static unsigned tsc_get_timecount_low_lfence(struct timecounter *tc);
+static unsigned tsc_get_timecount_mfence(struct timecounter *tc);
+static unsigned tsc_get_timecount_low_mfence(struct timecounter *tc);
 static void tsc_levels_changed(void *arg, int unit);
 
 static struct timecounter tsc_timecounter = {
@@ -262,6 +266,10 @@ probe_tsc_freq(void)
 		    (vm_guest == VM_GUEST_NO &&
 		    CPUID_TO_FAMILY(cpu_id) >= 0x10))
 			tsc_is_invariant = 1;
+		if (cpu_feature & CPUID_SSE2) {
+			tsc_timecounter.tc_get_timecount =
+			    tsc_get_timecount_mfence;
+		}
 		break;
 	case CPU_VENDOR_INTEL:
 		if ((amd_pminfo & AMDPM_TSC_INVARIANT) != 0 ||
@@ -271,6 +279,10 @@ probe_tsc_freq(void)
 		    (CPUID_TO_FAMILY(cpu_id) == 0xf &&
 		    CPUID_TO_MODEL(cpu_id) >= 0x3))))
 			tsc_is_invariant = 1;
+		if (cpu_feature & CPUID_SSE2) {
+			tsc_timecounter.tc_get_timecount =
+			    tsc_get_timecount_lfence;
+		}
 		break;
 	case CPU_VENDOR_CENTAUR:
 		if (vm_guest == VM_GUEST_NO &&
@@ -278,6 +290,10 @@ probe_tsc_freq(void)
 		    CPUID_TO_MODEL(cpu_id) >= 0xf &&
 		    (rdmsr(0x1203) & 0x100000000ULL) == 0)
 			tsc_is_invariant = 1;
+		if (cpu_feature & CPUID_SSE2) {
+			tsc_timecounter.tc_get_timecount =
+			    tsc_get_timecount_lfence;
+		}
 		break;
 	}
 
@@ -328,16 +344,31 @@ init_TSC(void)
 
 #ifdef SMP
 
-/* rmb is required here because rdtsc is not a serializing instruction. */
-#define	TSC_READ(x)			\
-static void				\
-tsc_read_##x(void *arg)			\
-{					\
-	uint32_t *tsc = arg;		\
-	u_int cpu = PCPU_GET(cpuid);	\
-					\
-	rmb();				\
-	tsc[cpu * 3 + x] = rdtsc32();	\
+/*
+ * RDTSC is not a serializing instruction, and does not drain
+ * instruction stream, so we need to drain the stream before executing
+ * it.  It could be fixed by use of RDTSCP, except the instruction is
+ * not available everywhere.
+ *
+ * Use CPUID for draining in the boot-time SMP consistency test.  The
+ * timecounters use MFENCE for AMD CPUs, and LFENCE for others (Intel
+ * and VIA) when SSE2 is present, and nothing on older machines which
+ * also do not issue RDTSC prematurely.  There, testing for SSE2 and
+ * vendor is too cumbersome, and we learn about TSC presence from
+ * CPUID.
+ *
+ * Do not use do_cpuid(), since we do not need CPUID results, which
+ * have to be written into memory with do_cpuid().
+ */
+#define	TSC_READ(x)							\
+static void								\
+tsc_read_##x(void *arg)							\
+{									\
+	uint32_t *tsc = arg;						\
+	u_int cpu = PCPU_GET(cpuid);					\
+									\
+	__asm __volatile("cpuid" : : : "eax", "ebx", "ecx", "edx");	\
+	tsc[cpu * 3 + x] = rdtsc32();					\
 }
 TSC_READ(0)
 TSC_READ(1)
@@ -487,7 +518,16 @@ init:
 	for (shift = 0; shift < 31 && (tsc_freq >> shift) > max_freq; shift++)
 		;
 	if (shift > 0) {
-		tsc_timecounter.tc_get_timecount = tsc_get_timecount_low;
+		if (cpu_feature & CPUID_SSE2) {
+			if (cpu_vendor_id == CPU_VENDOR_AMD) {
+				tsc_timecounter.tc_get_timecount =
+				    tsc_get_timecount_low_mfence;
+			} else {
+				tsc_timecounter.tc_get_timecount =
+				    tsc_get_timecount_low_lfence;
+			}
+		} else
+			tsc_timecounter.tc_get_timecount = tsc_get_timecount_low;
 		tsc_timecounter.tc_name = "TSC-low";
 		if (bootverbose)
 			printf("TSC timecounter discards lower %d bit(s)\n",
@@ -599,16 +639,48 @@ tsc_get_timecount(struct timecounter *tc __unused)
 	return (rdtsc32());
 }
 
-static u_int
+static inline u_int
 tsc_get_timecount_low(struct timecounter *tc)
 {
 	uint32_t rv;
 
 	__asm __volatile("rdtsc; shrd %%cl, %%edx, %0"
-	: "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx");
+	    : "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx");
 	return (rv);
 }
 
+static u_int
+tsc_get_timecount_lfence(struct timecounter *tc __unused)
+{
+
+	lfence();
+	return (rdtsc32());
+}
+
+static u_int
+tsc_get_timecount_low_lfence(struct timecounter *tc)
+{
+
+	lfence();
+	return (tsc_get_timecount_low(tc));
+}
+
+static u_int
+tsc_get_timecount_mfence(struct timecounter *tc __unused)
+{
+
+	mfence();
+	return (rdtsc32());
+}
+
+static u_int
+tsc_get_timecount_low_mfence(struct timecounter *tc)
+{
+
+	mfence();
+	return (tsc_get_timecount_low(tc));
+}
+
 uint32_t
 cpu_fill_vdso_timehands(struct vdso_timehands *vdso_th)
 {

--/bcALokiWR3y46GP
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (FreeBSD)

iEYEARECAAYFAlAVbwsACgkQC3+MBN1Mb4jAEQCfY3F7EPgEz65p43knh0BvqT+1
nqoAoNqLT0k8T8bX7TXWPSd0V47uOOS2
=Mait
-----END PGP SIGNATURE-----

--/bcALokiWR3y46GP--

From owner-freebsd-amd64@FreeBSD.ORG Mon Jul 30 11:07:09 2012
Return-Path:
Delivered-To: freebsd-amd64@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 204D4106564A for ; Mon, 30 Jul 2012 11:07:09 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 09F9F8FC0C for ; Mon, 30 Jul 2012 11:07:09 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q6UB786v001699 for ; Mon, 30 Jul 2012 11:07:08 GMT (envelope-from owner-bugmaster@FreeBSD.org)
Received: (from gnats@localhost) by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q6UB78s7001697 for freebsd-amd64@FreeBSD.org; Mon, 30 Jul 2012 11:07:08 GMT (envelope-from owner-bugmaster@FreeBSD.org)
Date: Mon, 30 Jul 2012 11:07:08 GMT
Message-Id: <201207301107.q6UB78s7001697@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f
From: FreeBSD bugmaster
To: freebsd-amd64@FreeBSD.org
Cc:
Subject: Current problem reports assigned to freebsd-amd64@FreeBSD.org
X-BeenThere: freebsd-amd64@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Porting FreeBSD to the AMD64 platform
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 30 Jul 2012 11:07:09 -0000

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o amd64/170115 amd64      Serial boot broken in 9.0
o amd64/168659 amd64      [boot] FreeBSD 9 - Crash upon booting off install CD (
o amd64/167582 amd64      Compile of MySQL NDB Cluster Fails 8.2 AMD64
o amd64/167543 amd64      [kernel] Install FreeBSD can show error message with c
o amd64/167393 amd64      [boot] MacBook4,1 hangs on SMP boot
o amd64/166639 amd64      [boot] Syscons issue Intel D2700
o amd64/166229 amd64      [boot] Unable to install FreeBSD 9 on Acer Extensa 522
o amd64/165850 amd64      [build] 8.3-RC1 (amd64): world doesn't build with CPUT
o amd64/165845 amd64      [build] Unable to build kernel on 8.2-STABLE
o amd64/165351 amd64      [boot] Error while installing or booting the freeBSD O
o amd64/164773 amd64      [boot] 9.0 amd64 fails to boot on HP DL145 G3 [regress
o amd64/164707 amd64      FreeBSD 9 installer does not work with IBM uefi
o amd64/164643 amd64      Kernel Panic at 9.0-RELEASE
o amd64/164619 amd64      when logged in as root the user and group applications
o amd64/164457 amd64      [install] Can't install FreeBSD 9.0 (amd64) on HP Blad
o amd64/164301 amd64      [install] 9.0 - Can't install, no DHCP lease
o amd64/164136 amd64      after fresh install 8.1 release or 8.2 release the har
o amd64/164116 amd64      [boot] FreeBSD 9.0-RELEASE installations mediums fails
o amd64/164089 amd64      FreeBSD-9.0-RELEASE-amd64-memstick.img does not boot
o amd64/164073 amd64      /etc/rc warning after booting
o amd64/164036 amd64      [keyboard] Moused fails on 9_0_RELENG
o amd64/163736 amd64      Freebsd 8.2 with MPD5 and about 100 PPPoE clients pani
o amd64/163710 amd64      setjump in userboot.so causes stack corruption
o amd64/163625 amd64      Install problems of RC3 amd64 on ASRock N68 GE3 UCC
o amd64/163568 amd64      hard drive naming
o amd64/163285 amd64      when installing gnome2-lite not all dependent packages
o amd64/163284 amd64      print manager failed to install correctly
o amd64/163114 amd64      no boot on Via Nanao netbook Samsung NC20
o amd64/163092 amd64      FreeBSD 9.0-RC2 fails to boot from raid-z2 if AHCI is
o amd64/163048 amd64      normal user cant mount ntfs-3g
o amd64/162936 amd64      fails boot and destabilizes other OSes on FreeBSD 9 RC
o amd64/162489 amd64      After some time X blanks the screen and does not respo
o amd64/162314 amd64      not able to install FreeBSD-8.2-RELEASE-amd64-dvd1 as
o amd64/162219 amd64      [REGRESSION] In KDE 4.7.2 cant enable OpenGL,in 4.6.5
o amd64/162170 amd64      Unable to install due to freeze at "run_interrupt_driv
o amd64/161974 amd64      FreeBSD 9 new installer installs succesful, renders ma
o kern/160833  amd64      Keyboard USB doesn't work
o amd64/157386 amd64      [powerd] Enabling powerd(8) with default settings on I
o amd64/156106 amd64      [boot] boot0 fails to start
o amd64/155135 amd64      [boot] Does Not Boot On a Very Standard Hardware
o amd64/154957 amd64      [boot] Install boot CD won't boot up - keeps rebooting
o amd64/154629 amd64      [panic] Fatal trap 9: general protection fault while i
o amd64/153935 amd64      [hang] system hangs while trying to do 'shutdown -h no
o amd64/153831 amd64      [boot] CD bootloader won't on Tyan s2912G2nr
o amd64/153496 amd64      [hyper-v] [install] Install on Hyper-V leaves corrupt
o amd64/153372 amd64      [panic] kernel panic
o amd64/153175 amd64      [amd64] Kernel Panic on only FreeBSD 8 amd64
o amd64/152874 amd64      [install] 8.1 install fails where 7.3 works due to lac
o amd64/152430 amd64      [boot] HP ProLiant Microserver n36l cannot boot into i
o amd64/145991 amd64      [NOTES] [patch] Add a requires line to /sys/amd64/conf
o amd64/144405 amd64      [build] [patch] include /usr/obj/lib32 in cleanworld t
s amd64/143173 amd64      [ata] Promise FastTrack TX4 + SATA DVD, installer can'
p amd64/141413 amd64      [hang] Tyan 2881 m3289 SMDC freeze
o amd64/137942 amd64      [pci] 8.0-BETA2 having problems with Asus M2N-SLI-delu
o amd64/127640 amd64      [amd64] gcc(1) will not build shared libraries with -f
o amd64/115194 amd64      LCD screen remains blank after Dell XPS M1210 lid is c

56 problems total.

From owner-freebsd-amd64@FreeBSD.ORG Tue Jul 31 03:26:08 2012 Return-Path: Delivered-To: amd64@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CFA3F106566B; Tue, 31 Jul 2012 03:26:08 +0000 (UTC) (envelope-from tinderbox@freebsd.org) Received: from freebsd-current.sentex.ca (freebsd-current.sentex.ca [64.7.128.98]) by mx1.freebsd.org (Postfix) with ESMTP id 954E28FC1C; Tue, 31 Jul 2012 03:26:08 +0000 (UTC) Received: from freebsd-current.sentex.ca (localhost [127.0.0.1]) by freebsd-current.sentex.ca (8.14.5/8.14.5) with ESMTP id q6V3Q8KM089291; Mon, 30 Jul 2012 23:26:08 -0400 (EDT) (envelope-from tinderbox@freebsd.org) Received: (from tinderbox@localhost) by freebsd-current.sentex.ca (8.14.5/8.14.5/Submit) id q6V3Q8AV089288; Tue, 31 Jul 2012 03:26:08 GMT (envelope-from tinderbox@freebsd.org) Date: Tue, 31 Jul 2012 03:26:08 GMT Message-Id: <201207310326.q6V3Q8AV089288@freebsd-current.sentex.ca> X-Authentication-Warning: freebsd-current.sentex.ca: tinderbox set sender to FreeBSD Tinderbox using -f Sender: FreeBSD Tinderbox From: FreeBSD Tinderbox To: FreeBSD Tinderbox , , Precedence: bulk Cc: Subject: [head tinderbox] failure on amd64/amd64 X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 31 Jul 2012 03:26:09 -0000 TB --- 2012-07-31 00:10:00 - tinderbox 2.9 running on freebsd-current.sentex.ca TB --- 2012-07-31 00:10:00 - FreeBSD freebsd-current.sentex.ca 8.3-PRERELEASE FreeBSD 8.3-PRERELEASE #0: Mon Mar 26 13:54:12 EDT 2012 des@freebsd-current.sentex.ca:/usr/obj/usr/src/sys/GENERIC amd64 TB --- 2012-07-31 00:10:00 - starting HEAD tinderbox run for amd64/amd64 TB --- 2012-07-31 00:10:00 - cleaning the object tree TB --- 2012-07-31 00:10:00 - cvsupping the source tree TB --- 2012-07-31 00:10:00 - /usr/bin/csup -z -r 3 -g -L 1 -h 
cvsup.sentex.ca /tinderbox/HEAD/amd64/amd64/supfile TB --- 2012-07-31 00:12:32 - building world TB --- 2012-07-31 00:12:32 - CROSS_BUILD_TESTING=YES TB --- 2012-07-31 00:12:32 - MAKEOBJDIRPREFIX=/obj TB --- 2012-07-31 00:12:32 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2012-07-31 00:12:32 - SRCCONF=/dev/null TB --- 2012-07-31 00:12:32 - TARGET=amd64 TB --- 2012-07-31 00:12:32 - TARGET_ARCH=amd64 TB --- 2012-07-31 00:12:32 - TZ=UTC TB --- 2012-07-31 00:12:32 - __MAKE_CONF=/dev/null TB --- 2012-07-31 00:12:32 - cd /src TB --- 2012-07-31 00:12:32 - /usr/bin/make -B buildworld >>> World build started on Tue Jul 31 00:12:33 UTC 2012 >>> Rebuilding the temporary build tree >>> stage 1.1: legacy release compatibility shims >>> stage 1.2: bootstrap tools >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3: cross tools >>> stage 4.1: building includes >>> stage 4.2: building libraries >>> stage 4.3: make dependencies >>> stage 4.4: building everything >>> stage 5.1: building 32 bit shim libraries >>> World build completed on Tue Jul 31 03:15:38 UTC 2012 TB --- 2012-07-31 03:15:38 - generating LINT kernel config TB --- 2012-07-31 03:15:38 - cd /src/sys/amd64/conf TB --- 2012-07-31 03:15:38 - /usr/bin/make -B LINT TB --- 2012-07-31 03:15:38 - cd /src/sys/amd64/conf TB --- 2012-07-31 03:15:38 - /usr/sbin/config -m LINT TB --- 2012-07-31 03:15:38 - building LINT kernel TB --- 2012-07-31 03:15:38 - CROSS_BUILD_TESTING=YES TB --- 2012-07-31 03:15:38 - MAKEOBJDIRPREFIX=/obj TB --- 2012-07-31 03:15:38 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2012-07-31 03:15:38 - SRCCONF=/dev/null TB --- 2012-07-31 03:15:38 - TARGET=amd64 TB --- 2012-07-31 03:15:38 - TARGET_ARCH=amd64 TB --- 2012-07-31 03:15:38 - TZ=UTC TB --- 2012-07-31 03:15:38 - __MAKE_CONF=/dev/null TB --- 2012-07-31 03:15:38 - cd /src TB --- 2012-07-31 03:15:38 - /usr/bin/make -B buildkernel KERNCONF=LINT >>> Kernel build for LINT started on Tue Jul 31 
03:15:38 UTC 2012 >>> stage 1: configuring the kernel >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3.1: making dependencies >>> stage 3.2: building everything [...] cc -c -O2 -frename-registers -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -Wmissing-include-dirs -fdiagnostics-show-option -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -fno-omit-frame-pointer -mcmodel=kernel -mno-red-zone -mno-mmx -mno-sse -msoft-float -fno-asynchronous-unwind-tables -ffreestanding -fstack-protector -Werror -pg -mprofiler-epilogue /src/sys/dev/my/if_my.c cc -c -O2 -frename-registers -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -Wmissing-include-dirs -fdiagnostics-show-option -nostdinc -I. 
-I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -fno-omit-frame-pointer -mcmodel=kernel -mno-red-zone -mno-mmx -mno-sse -msoft-float -fno-asynchronous-unwind-tables -ffreestanding -fstack-protector -Werror -pg -mprofiler-epilogue /src/sys/dev/ncv/ncr53c500.c cc -c -O2 -frename-registers -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -Wmissing-include-dirs -fdiagnostics-show-option -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -fno-omit-frame-pointer -mcmodel=kernel -mno-red-zone -mno-mmx -mno-sse -msoft-float -fno-asynchronous-unwind-tables -ffreestanding -fstack-protector -Werror -pg -mprofiler-epilogue /src/sys/dev/ncv/ncr53c500_pccard.c cc -c -O2 -frename-registers -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -Wmissing-include-dirs -fdiagnostics-show-option -nostdinc -I. 
-I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -fno-omit-frame-pointer -mcmodel=kernel -mno-red-zone -mno-mmx -mno-sse -msoft-float -fno-asynchronous-unwind-tables -ffreestanding -fstack-protector -Werror -pg -mprofiler-epilogue /src/sys/dev/netmap/netmap.c cc1: warnings being treated as errors In file included from /src/sys/dev/netmap/netmap.c:99: /src/sys/dev/netmap/netmap_kern.h:84: warning: redundant redeclaration of 'M_NETMAP' [-Wredundant-decls] /src/sys/dev/netmap/netmap.c:95: warning: previous definition of 'M_NETMAP' was here *** Error code 1 Stop in /obj/amd64.amd64/src/sys/LINT. *** Error code 1 Stop in /src. *** Error code 1 Stop in /src. TB --- 2012-07-31 03:26:07 - WARNING: /usr/bin/make returned exit code 1 TB --- 2012-07-31 03:26:07 - ERROR: failed to build LINT kernel TB --- 2012-07-31 03:26:07 - 8390.59 user 1310.90 system 11767.54 real http://tinderbox.freebsd.org/tinderbox-head-HEAD-amd64-amd64.full From owner-freebsd-amd64@FreeBSD.ORG Tue Jul 31 10:01:28 2012 Return-Path: Delivered-To: amd64@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id BC693106564A; Tue, 31 Jul 2012 10:01:28 +0000 (UTC) (envelope-from tinderbox@freebsd.org) Received: from freebsd-current.sentex.ca (freebsd-current.sentex.ca [64.7.128.98]) by mx1.freebsd.org (Postfix) with ESMTP id 8B7E18FC14; Tue, 31 Jul 2012 10:01:28 +0000 (UTC) Received: from freebsd-current.sentex.ca (localhost [127.0.0.1]) by freebsd-current.sentex.ca (8.14.5/8.14.5) with ESMTP id q6VA1S3h073632; Tue, 31 Jul 2012 06:01:28 -0400 (EDT) (envelope-from tinderbox@freebsd.org) Received: (from tinderbox@localhost) by freebsd-current.sentex.ca (8.14.5/8.14.5/Submit) id q6VA1S67073628; Tue, 31 Jul 2012 10:01:28 GMT (envelope-from 
tinderbox@freebsd.org) Date: Tue, 31 Jul 2012 10:01:28 GMT Message-Id: <201207311001.q6VA1S67073628@freebsd-current.sentex.ca> X-Authentication-Warning: freebsd-current.sentex.ca: tinderbox set sender to FreeBSD Tinderbox using -f Sender: FreeBSD Tinderbox From: FreeBSD Tinderbox To: FreeBSD Tinderbox , , Precedence: bulk Cc: Subject: [head tinderbox] failure on amd64/amd64 X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 31 Jul 2012 10:01:28 -0000 TB --- 2012-07-31 06:20:00 - tinderbox 2.9 running on freebsd-current.sentex.ca TB --- 2012-07-31 06:20:00 - FreeBSD freebsd-current.sentex.ca 8.3-PRERELEASE FreeBSD 8.3-PRERELEASE #0: Mon Mar 26 13:54:12 EDT 2012 des@freebsd-current.sentex.ca:/usr/obj/usr/src/sys/GENERIC amd64 TB --- 2012-07-31 06:20:00 - starting HEAD tinderbox run for amd64/amd64 TB --- 2012-07-31 06:20:00 - cleaning the object tree TB --- 2012-07-31 06:29:29 - cvsupping the source tree TB --- 2012-07-31 06:29:29 - /usr/bin/csup -z -r 3 -g -L 1 -h cvsup.sentex.ca /tinderbox/HEAD/amd64/amd64/supfile TB --- 2012-07-31 06:29:58 - building world TB --- 2012-07-31 06:29:58 - CROSS_BUILD_TESTING=YES TB --- 2012-07-31 06:29:58 - MAKEOBJDIRPREFIX=/obj TB --- 2012-07-31 06:29:58 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2012-07-31 06:29:58 - SRCCONF=/dev/null TB --- 2012-07-31 06:29:58 - TARGET=amd64 TB --- 2012-07-31 06:29:58 - TARGET_ARCH=amd64 TB --- 2012-07-31 06:29:58 - TZ=UTC TB --- 2012-07-31 06:29:58 - __MAKE_CONF=/dev/null TB --- 2012-07-31 06:29:58 - cd /src TB --- 2012-07-31 06:29:58 - /usr/bin/make -B buildworld >>> World build started on Tue Jul 31 06:29:59 UTC 2012 >>> Rebuilding the temporary build tree >>> stage 1.1: legacy release compatibility shims >>> stage 1.2: bootstrap tools >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: 
build tools >>> stage 3: cross tools >>> stage 4.1: building includes >>> stage 4.2: building libraries >>> stage 4.3: make dependencies >>> stage 4.4: building everything >>> stage 5.1: building 32 bit shim libraries >>> World build completed on Tue Jul 31 09:51:00 UTC 2012 TB --- 2012-07-31 09:51:00 - generating LINT kernel config TB --- 2012-07-31 09:51:00 - cd /src/sys/amd64/conf TB --- 2012-07-31 09:51:00 - /usr/bin/make -B LINT TB --- 2012-07-31 09:51:00 - cd /src/sys/amd64/conf TB --- 2012-07-31 09:51:00 - /usr/sbin/config -m LINT TB --- 2012-07-31 09:51:00 - building LINT kernel TB --- 2012-07-31 09:51:00 - CROSS_BUILD_TESTING=YES TB --- 2012-07-31 09:51:00 - MAKEOBJDIRPREFIX=/obj TB --- 2012-07-31 09:51:00 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2012-07-31 09:51:00 - SRCCONF=/dev/null TB --- 2012-07-31 09:51:00 - TARGET=amd64 TB --- 2012-07-31 09:51:00 - TARGET_ARCH=amd64 TB --- 2012-07-31 09:51:00 - TZ=UTC TB --- 2012-07-31 09:51:00 - __MAKE_CONF=/dev/null TB --- 2012-07-31 09:51:00 - cd /src TB --- 2012-07-31 09:51:00 - /usr/bin/make -B buildkernel KERNCONF=LINT >>> Kernel build for LINT started on Tue Jul 31 09:51:00 UTC 2012 >>> stage 1: configuring the kernel >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3.1: making dependencies >>> stage 3.2: building everything [...] cc -c -O2 -frename-registers -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -Wmissing-include-dirs -fdiagnostics-show-option -nostdinc -I. 
-I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -fno-omit-frame-pointer -mcmodel=kernel -mno-red-zone -mno-mmx -mno-sse -msoft-float -fno-asynchronous-unwind-tables -ffreestanding -fstack-protector -Werror -pg -mprofiler-epilogue /src/sys/dev/my/if_my.c cc -c -O2 -frename-registers -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -Wmissing-include-dirs -fdiagnostics-show-option -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -fno-omit-frame-pointer -mcmodel=kernel -mno-red-zone -mno-mmx -mno-sse -msoft-float -fno-asynchronous-unwind-tables -ffreestanding -fstack-protector -Werror -pg -mprofiler-epilogue /src/sys/dev/ncv/ncr53c500.c cc -c -O2 -frename-registers -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -Wmissing-include-dirs -fdiagnostics-show-option -nostdinc -I. 
-I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -fno-omit-frame-pointer -mcmodel=kernel -mno-red-zone -mno-mmx -mno-sse -msoft-float -fno-asynchronous-unwind-tables -ffreestanding -fstack-protector -Werror -pg -mprofiler-epilogue /src/sys/dev/ncv/ncr53c500_pccard.c cc -c -O2 -frename-registers -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -Wmissing-include-dirs -fdiagnostics-show-option -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -fno-omit-frame-pointer -mcmodel=kernel -mno-red-zone -mno-mmx -mno-sse -msoft-float -fno-asynchronous-unwind-tables -ffreestanding -fstack-protector -Werror -pg -mprofiler-epilogue /src/sys/dev/netmap/netmap.c cc1: warnings being treated as errors In file included from /src/sys/dev/netmap/netmap.c:99: /src/sys/dev/netmap/netmap_kern.h:84: warning: redundant redeclaration of 'M_NETMAP' [-Wredundant-decls] /src/sys/dev/netmap/netmap.c:95: warning: previous definition of 'M_NETMAP' was here *** Error code 1 Stop in /obj/amd64.amd64/src/sys/LINT. *** Error code 1 Stop in /src. *** Error code 1 Stop in /src. 
TB --- 2012-07-31 10:01:27 - WARNING: /usr/bin/make returned exit code 1 TB --- 2012-07-31 10:01:27 - ERROR: failed to build LINT kernel TB --- 2012-07-31 10:01:27 - 8405.20 user 1320.39 system 13287.38 real http://tinderbox.freebsd.org/tinderbox-head-HEAD-amd64-amd64.full From owner-freebsd-amd64@FreeBSD.ORG Thu Aug 2 01:35:29 2012 Return-Path: Delivered-To: freebsd-amd64@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 1F908106566B for ; Thu, 2 Aug 2012 01:35:29 +0000 (UTC) (envelope-from mikea@mikea.ath.cx) Received: from mikea.ath.cx (mikea.ath.cx [70.164.65.62]) by mx1.freebsd.org (Postfix) with ESMTP id E04818FC15 for ; Thu, 2 Aug 2012 01:35:28 +0000 (UTC) Received: from mikea.ath.cx (localhost [127.0.0.1]) by mikea.ath.cx (8.14.5/8.14.5) with ESMTP id q721ZR99087315 for ; Wed, 1 Aug 2012 20:35:28 -0500 (CDT) (envelope-from mikea@mikea.ath.cx) Received: (from mikea@localhost) by mikea.ath.cx (8.14.5/8.14.5/Submit) id q721ZRcN087314 for freebsd-amd64@freebsd.org; Wed, 1 Aug 2012 20:35:27 -0500 (CDT) (envelope-from mikea) Date: Wed, 1 Aug 2012 20:35:27 -0500 From: mike andrews To: freebsd-amd64@freebsd.org Message-ID: <20120802013527.GA87188@mikea.ath.cx> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.3i Subject: no such instructions: xsave, xsetbv, xrstor X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 02 Aug 2012 01:35:29 -0000 On Sat, Jul 21, 2012 at 07:55:23PM +0200, Arvydas Sidorenko wrote: > This is the output I get when building 10-CURRENT from HEAD: > /usr/src/sys/amd64/amd64/cpu_switch.S: Assembler messages: > /usr/src/sys/amd64/amd64/cpu_switch.S:128: Error: no such instruction: > `xsave (%r8)' > /usr/src/sys/amd64/amd64/cpu_switch.S:504: 
Error: no such instruction: > `xsetbv' > /usr/src/sys/amd64/amd64/cpu_switch.S:505: Error: no such instruction: > `xrstor (%rbx)' Apologies for not replying into the thread, but I just subscribed and don't have that luxury until some listmail arrives for this thread. I'm getting the same problem doing a `make buildkernel KERNCONF=GENERIC` on an amd64 system (Intel core i5 4-way): : $ as --version : GNU assembler 2.17.50 [FreeBSD] 2007-07-03 : Copyright 2007 Free Software Foundation, Inc. : This program is free software; you may redistribute it under the terms of : the GNU General Public License. This program has absolutely no warranty. : This assembler was configured for a target of `x86_64-unknown-freebsd'. : $ gcc --version : gcc (GCC) 4.2.1 20070831 patched [FreeBSD] : Copyright (C) 2007 Free Software Foundation, Inc. : This is free software; see the source for copying conditions. There is NO : warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. : : $ uname -a : FreeBSD mikea.ath.cx 9.0-RELEASE FreeBSD 9.0-RELEASE #0: Tue Jan 3 07:46:30 UTC 2012 root@farrell.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64 Is the assembler somehow configured for the wrong target machine? I did a cvsup this morning to pull me to the most recent source code level, and still have the same problem. Ideas? Questions? This appears a bit more pervasive than I had initially thought, if this Joe User here (well, Joe Sysadmin) is getting caught by it. 
Thanks, and 73, de -- Mike Andrews, W5EGO mikea@mikea.ath.cx Tired old sysadmin From owner-freebsd-amd64@FreeBSD.ORG Thu Aug 2 02:07:51 2012 Return-Path: Delivered-To: freebsd-amd64@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 41DDD106566B for ; Thu, 2 Aug 2012 02:07:51 +0000 (UTC) (envelope-from swhetzel@gmail.com) Received: from mail-ey0-f182.google.com (mail-ey0-f182.google.com [209.85.215.182]) by mx1.freebsd.org (Postfix) with ESMTP id CA3268FC08 for ; Thu, 2 Aug 2012 02:07:50 +0000 (UTC) Received: by eaak11 with SMTP id k11so962526eaa.13 for ; Wed, 01 Aug 2012 19:07:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=PpAaYA5NO3gi3bi81SKkMKZq2pG9Rz+abPSFXKvnNmw=; b=QPihj51tpYDpLScJQlJhDI/GsB1DUUT5OqlgHFNba1xm+WGrjcXgQyFnmJqM8Aeawy 6SQ+1iIv98aN6KTJUQieOXeEgUnN6SP/Us/gxwyhJ+JCW/SR2RUjrOBoo+dJIGaqs3VT fc0UBuOkxIVuCjEEi02EksituzAaah9+f3OyUT5M5nG4VYwY7JM+ihcZWlruvNw63U7L llj9JzUVZDG3R5qtq7/07dFNOth1lESNV4s1RLyG+kLzWtJE2LmbxD1CjrVOJWY6Hz+/ yHyqlvPALGYmTzMDhsYU79gfLZU8/MBUEdmvaXg91/hWcDvS3kPTvJ/1kSUfg3FhOFDV Eh6Q== MIME-Version: 1.0 Received: by 10.14.181.132 with SMTP id l4mr3342925eem.17.1343873264444; Wed, 01 Aug 2012 19:07:44 -0700 (PDT) Received: by 10.14.177.3 with HTTP; Wed, 1 Aug 2012 19:07:44 -0700 (PDT) In-Reply-To: <20120802013527.GA87188@mikea.ath.cx> References: <20120802013527.GA87188@mikea.ath.cx> Date: Wed, 1 Aug 2012 21:07:44 -0500 Message-ID: From: Scot Hetzel To: mike andrews Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-amd64@freebsd.org Subject: Re: no such instructions: xsave, xsetbv, xrstor X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 02 
Aug 2012 02:07:51 -0000 On Wed, Aug 1, 2012 at 8:35 PM, mike andrews wrote: > On Sat, Jul 21, 2012 at 07:55:23PM +0200, Arvydas Sidorenko wrote: >> This is the output I get when building 10-CURRENT from HEAD: >> /usr/src/sys/amd64/amd64/cpu_switch.S: Assembler messages: >> /usr/src/sys/amd64/amd64/cpu_switch.S:128: Error: no such instruction: >> `xsave (%r8)' >> /usr/src/sys/amd64/amd64/cpu_switch.S:504: Error: no such instruction: >> `xsetbv' >> /usr/src/sys/amd64/amd64/cpu_switch.S:505: Error: no such instruction: >> `xrstor (%rbx)' > > Apologies for not replying into the thread, but I just subscribed and > don't have that luxury until some listmail arrives for this thread. > > I'm getting the same problem doing a `make buildkernel KERNCONF=GENERIC` > on an amd64 system (Intel core i5 4-way): > Did you do a `make clean && make buildworld` before your `make buildkernel`? Scot From owner-freebsd-amd64@FreeBSD.ORG Thu Aug 2 04:00:06 2012 Return-Path: Delivered-To: freebsd-amd64@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 0D787106566C for ; Thu, 2 Aug 2012 04:00:05 +0000 (UTC) (envelope-from mikea@mikea.ath.cx) Received: from mikea.ath.cx (mikea.ath.cx [70.164.65.62]) by mx1.freebsd.org (Postfix) with ESMTP id B7A7D8FC08 for ; Thu, 2 Aug 2012 04:00:05 +0000 (UTC) Received: from mikea.ath.cx (localhost [127.0.0.1]) by mikea.ath.cx (8.14.5/8.14.5) with ESMTP id q724032E088079 for ; Wed, 1 Aug 2012 23:00:04 -0500 (CDT) (envelope-from mikea@mikea.ath.cx) Received: (from mikea@localhost) by mikea.ath.cx (8.14.5/8.14.5/Submit) id q72403t4088078 for freebsd-amd64@freebsd.org; Wed, 1 Aug 2012 23:00:03 -0500 (CDT) (envelope-from mikea) Date: Wed, 1 Aug 2012 23:00:03 -0500 From: mike andrews To: freebsd-amd64@freebsd.org Message-ID: <20120802040003.GA88054@mikea.ath.cx> References: <20120802013527.GA87188@mikea.ath.cx> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: 
inline In-Reply-To: User-Agent: Mutt/1.4.2.3i Subject: Re: no such instructions: xsave, xsetbv, xrstor X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 02 Aug 2012 04:00:06 -0000 On Wed, Aug 01, 2012 at 09:07:44PM -0500, Scot Hetzel wrote: > On Wed, Aug 1, 2012 at 8:35 PM, mike andrews wrote: > > On Sat, Jul 21, 2012 at 07:55:23PM +0200, Arvydas Sidorenko wrote: > >> This is the output I get when building 10-CURRENT from HEAD: > >> /usr/src/sys/amd64/amd64/cpu_switch.S: Assembler messages: > >> /usr/src/sys/amd64/amd64/cpu_switch.S:128: Error: no such instruction: > >> `xsave (%r8)' > >> /usr/src/sys/amd64/amd64/cpu_switch.S:504: Error: no such instruction: > >> `xsetbv' > >> /usr/src/sys/amd64/amd64/cpu_switch.S:505: Error: no such instruction: > >> `xrstor (%rbx)' > > > > Apologies for not replying into the thread, but I just subscribed and > > don't have that luxury until some listmail arrives for this thread. > > > > I'm getting the same problem doing a `make buildkernel KERNCONF=GENERIC` > > on an amd64 system (Intel core i5 4-way): > > > Did you do a `make clean && make buildworld` before your `make buildkernel`? No. I will get that started now. Thanks very much for the hint. 
--
Mike Andrews, W5EGO
mikea@mikea.ath.cx
Tired old sysadmin

From owner-freebsd-amd64@FreeBSD.ORG Fri Aug 3 15:40:09 2012
Return-Path:
Delivered-To: freebsd-amd64@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 05AA010656FE
    for ; Fri, 3 Aug 2012 15:40:09 +0000 (UTC)
    (envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28])
    by mx1.freebsd.org (Postfix) with ESMTP id CD53D8FC17
    for ; Fri, 3 Aug 2012 15:40:08 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
    by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q73Fe8FN011798
    for ; Fri, 3 Aug 2012 15:40:08 GMT
    (envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
    by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q73Fe87n011797;
    Fri, 3 Aug 2012 15:40:08 GMT
    (envelope-from gnats)
Resent-Date: Fri, 3 Aug 2012 15:40:08 GMT
Resent-Message-Id: <201208031540.q73Fe87n011797@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-amd64@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Ming Qiao
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 3A04A1065686
    for ; Fri, 3 Aug 2012 15:35:21 +0000 (UTC)
    (envelope-from nobody@FreeBSD.org)
Received: from red.freebsd.org (red.freebsd.org [IPv6:2001:4f8:fff6::22])
    by mx1.freebsd.org (Postfix) with ESMTP id 249F98FC16
    for ; Fri, 3 Aug 2012 15:35:21 +0000 (UTC)
Received: from red.freebsd.org (localhost [127.0.0.1])
    by red.freebsd.org (8.14.4/8.14.4) with ESMTP id q73FZKb9014930
    for ; Fri, 3 Aug 2012 15:35:20 GMT
    (envelope-from nobody@red.freebsd.org)
Received: (from nobody@localhost)
    by red.freebsd.org (8.14.4/8.14.4/Submit) id q73FZKIr014920;
    Fri, 3 Aug 2012 15:35:20 GMT
    (envelope-from nobody)
Message-Id: <201208031535.q73FZKIr014920@red.freebsd.org>
Date: Fri, 3 Aug 2012 15:35:20 GMT
From: Ming Qiao
To: freebsd-gnats-submit@FreeBSD.org
X-Send-Pr-Version: www-3.1
X-Mailman-Approved-At: Fri, 03 Aug 2012 16:20:53 +0000
Cc:
Subject: amd64/170351: [patch] amd64: 64-bit process can't always get unlimited rlimit
X-BeenThere: freebsd-amd64@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Porting FreeBSD to the AMD64 platform
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 03 Aug 2012 15:40:09 -0000

>Number:         170351
>Category:       amd64
>Synopsis:       [patch] amd64: 64-bit process can't always get unlimited rlimit
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-amd64
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Aug 03 15:40:08 UTC 2012
>Closed-Date:
>Last-Modified:
>Originator:     Ming Qiao
>Release:        FreeBSD 9.0-RC2
>Organization:
Juniper Networks
>Environment:
FreeBSD neys 9.0-RC2 FreeBSD 9.0-RC2 #0: Thu Jul 26 01:27:46 UTC 2012
root@neys:/usr/obj/usr/src/sys/GENERIC amd64
>Description:
On the amd64 platform, if a 32-bit process ever manually set its rlimit,
none of its 64-bit child or offspring will be able to get the full 64-bit
rlimit anymore, even if they explicitly set the limit to unlimited.

Note that for the sake of simplicity, only datasize limit is referred
in this report. But the same logic applies to all other memory segment
(i.e. stacksize, etc.).

Take the following scenario as an example:
1) Let's say we have a 32-bit process p1 whose hard limit is set to 500MB by
calling setrlimit().
2) p1 then exec another 32-bit process p2.
3) p2 set its hard limit to unlimited by calling setrlimit().
4) p2 exec a 64-bit process p3.
5) check the hard limit of p3, we can see that it only has 3GB (value of
ia32_maxdsiz) instead of 32GB which is the global kernel limit (value of
maxdsiz) for a 64-bit process.
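
The scenario above can be modelled in user space. What follows is only a rough sketch, not kernel code: every name in it (model_rlimit, model_set_datalimit, the MODEL_* constants) is made up for illustration, the 3GB and 32GB values are simply the ones quoted in this report, and privilege checks on raising a hard limit are ignored. It assumes the stored limit is first clamped to the global maximum and then, for a 32-bit caller, clamped again by the ABI hook, and that the limit is inherited unchanged across exec.

```c
#include <assert.h>
#include <stdint.h>

/*
 * User-space model only; none of these names exist in the kernel.
 * The sizes are the ones quoted in the report (3GB and 32GB).
 */
#define MODEL_RLIM_INFINITY  UINT64_MAX
#define MODEL_MAXDSIZ        (32ULL << 30)   /* stands in for maxdsiz */
#define MODEL_IA32_MAXDSIZ   (3ULL << 30)    /* stands in for ia32_maxdsiz */

struct model_rlimit {
	uint64_t cur;	/* soft limit */
	uint64_t max;	/* hard limit */
};

static uint64_t
model_clamp(uint64_t value, uint64_t cap)
{
	return (value > cap ? cap : value);
}

/*
 * Model of a data-size limit update.
 *   caller_is_32bit: the caller's ABI has a limit fixup hook (ia32).
 *   patched:         apply the proposed rule, i.e. skip the hook when the
 *                    caller asked for an infinite hard limit.
 * Returns the limit that would be stored and later inherited across exec.
 */
static struct model_rlimit
model_set_datalimit(struct model_rlimit req, int caller_is_32bit, int patched)
{
	int asked_for_infinity = (req.max == MODEL_RLIM_INFINITY);

	/* Model precondition: soft limit is not above the hard limit. */
	assert(req.cur <= req.max);

	/* Global clamp, applied for every ABI. */
	req.cur = model_clamp(req.cur, MODEL_MAXDSIZ);
	req.max = model_clamp(req.max, MODEL_MAXDSIZ);

	/* ABI-specific clamp, the step this report identifies as the cause. */
	if (caller_is_32bit && !(patched && asked_for_infinity)) {
		req.cur = model_clamp(req.cur, MODEL_IA32_MAXDSIZ);
		req.max = model_clamp(req.max, MODEL_IA32_MAXDSIZ);
	}
	return (req);
}
```

In this model, a 32-bit caller asking for MODEL_RLIM_INFINITY ends up with MODEL_IA32_MAXDSIZ when patched is 0 and with MODEL_MAXDSIZ when patched is 1; a 64-bit caller gets MODEL_MAXDSIZ either way. The value p3 sees in step 5 is whatever p2 stored in step 3.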
The root cause is that on step 3, p2 didn't actually set its limit to the correct value when calling setrlimit(). Instead the limit is set to ia32_maxdsiz since ia32_fixlimit() is called in kern_proc_setrlimit(). >How-To-Repeat: There are 3 test programs attached in this report: 32_p1.c, 32_p2.c, and 64_p3.c. They can be used to reproduce the problem. 1) Compile 32_p1.c and 32_p2.c into 32-bit binaries. Compile 64_p3.c into 64-bit binary. 2) Put all 3 binaries into the same directory on a machine running FreeBSD amd64 version. 3) Run 32_p1 which will exec 32_p2 and 64_p3. The output of 64_p3 will show its limit is capped at ia32_maxdsiz. >Fix: The proposed fix is to change kern_proc_setrlimit() so that sv_fixlimit() will not be called if the caller wants to set the new limit to RLIM_INFINITY. Please refer to the attached diff file for the proposed fix. Patch attached with submission follows: # This is a shell archive. Save it in a file, remove anything before # this line, and then unpack it by entering "sh file". Note, it may # create directories; files and directories will be owned by you and # have default permissions. 
# # This archive contains: # # fix.diff # 32_p1.c # 32_p2.c # 64_p3.c # echo x - fix.diff sed 's/^X//' >fix.diff << 'bcc47fd7a380cd6506fa66c7fb3122d6' X--- kern_resource.c 2012-08-02 07:41:59.000000000 -0700 X+++ kern_resource.c.modified 2012-08-02 07:40:40.771115000 -0700 X@@ -663,6 +663,7 @@ X register struct rlimit *alimp; X struct rlimit oldssiz; X int error; X+ int is_lim_inf = 0; X X if (which >= RLIM_NLIMITS) X return (EINVAL); X@@ -701,6 +702,8 @@ X p->p_cpulimit = limp->rlim_cur; X break; X case RLIMIT_DATA: X+ if (limp->rlim_max == RLIM_INFINITY) X+ is_lim_inf = 1; X if (limp->rlim_cur > maxdsiz) X limp->rlim_cur = maxdsiz; X if (limp->rlim_max > maxdsiz) X@@ -736,7 +739,8 @@ X limp->rlim_max = 1; X break; X } X- if (p->p_sysent->sv_fixlimit != NULL) X+ if ((p->p_sysent->sv_fixlimit != NULL) && X+ (1 != is_lim_inf)) X p->p_sysent->sv_fixlimit(limp, which); X *alimp = *limp; X p->p_limit = newlim; bcc47fd7a380cd6506fa66c7fb3122d6 echo x - 32_p1.c sed 's/^X//' >32_p1.c << '2732294ba2da13cbcf3153434d6e3482' X/* X * Test program for FreeBSD rlimit issue. X * To be compiled to a 32-bit binary. X */ X X#include X#include X#include X X#include X#include X#include X#include X#include X Xint Xmain(int argc, char **argv) X{ X struct rlimit currlimit, lim_new; X char * argv_exec[] = {"./32_p2", 0}; X X printf( "\n *** Starting 32-b process 1 *** \n"); X/* sleep(15);*/ X X if ( 0 == getrlimit( RLIMIT_DATA, &currlimit ) ) { X printf("\n32_p1: rlim_cur = %lu, rlim_max = %lu\n", currlimit.rlim_cur, currlimit.rlim_max); X } X else { X printf("getrlimit failed!"); X } X X lim_new.rlim_cur = lim_new.rlim_max = 524288000; /* 500M */ X X if (setrlimit(RLIMIT_DATA, &lim_new) < 0) { X printf("setrlimit failed! 
err=%d\n", errno); X } X else { X printf("32_p1: set limits to 500M\n"); X } X X if ( 0 == getrlimit( RLIMIT_DATA, &currlimit ) ) { X printf("\n32_p1: new rlim_cur = %lu, rlim_max = %lu\n", currlimit.rlim_cur, currlimit.rlim_max); X } X else { X printf("getrlimit failed!"); X } X X printf("now exec 32_p2...\n"); X execv( argv_exec[0], argv_exec ); X exit(0); X} X 2732294ba2da13cbcf3153434d6e3482 echo x - 32_p2.c sed 's/^X//' >32_p2.c << '921653888f0311e6f9044b483b332874' X/* X * Test program for FreeBSD rlimit issue. X * To be compiled to a 32-bit binary. X */ X X#include X#include X#include X X#include X#include X#include X#include X#include X Xint Xmain(int argc, char **argv) X{ X struct rlimit currlimit, lim_new; X char * argv_exec[] = {"./64_p3", 0}; X X printf( "\n *** Starting 32-b process 2 *** \n"); X/* sleep(15);*/ X X if ( 0 == getrlimit( RLIMIT_DATA, &currlimit ) ) { X printf("\n32_p2: rlim_cur = %lu, rlim_max = %lu\n", currlimit.rlim_cur, currlimit.rlim_max); X } X else { X printf("getrlimit failed!"); X } X X lim_new.rlim_cur = lim_new.rlim_max = RLIM_INFINITY; X X if (setrlimit(RLIMIT_DATA, &lim_new) < 0) { X printf("setrlimit failed! err=%d\n", errno); X } X else { X printf("32_p2: set limits to RLIM_INFINITY\n"); X } X X if ( 0 == getrlimit( RLIMIT_DATA, &currlimit ) ) { X printf("\n32_p2: new rlim_cur = %lu, rlim_max = %lu\n", currlimit.rlim_cur, currlimit.rlim_max); X } X else { X printf("getrlimit failed!"); X } X X printf("now exec 64_p3...\n"); X X execv( argv_exec[0], argv_exec ); X exit(0); X} X 921653888f0311e6f9044b483b332874 echo x - 64_p3.c sed 's/^X//' >64_p3.c << 'e22e8191882a74b0e1f833ce4465896a' X/* X * Test program for FreeBSD rlimit issue. X * To be compiled to a 64-bit binary. 
X */ X X#include X#include X#include X X#include X#include X#include X#include X#include X Xint Xmain(int argc, char **argv) X{ X void * p = NULL; X unsigned long c; X struct rlimit currlimit; X X printf( "\n *** Starting 64-b process 3 *** \n"); X /* sleep(15); */ X X if ( 0 == getrlimit( RLIMIT_DATA, &currlimit ) ) { X printf("\n64_p3: rlim_cur = %lu, rlim_max = %lu\n", currlimit.rlim_cur, currlimit.rlim_max); X } X else { X printf("getrlimit failed!"); X } X X p = sbrk(0); X X while (brk(p + 1024*1024) == 0) { X c++; X p = sbrk(0); X } X X printf("64_p3: %d 1MB blocks allocated (%m).\n", c); X X exit(0); X} X e22e8191882a74b0e1f833ce4465896a exit >Release-Note: >Audit-Trail: >Unformatted: From owner-freebsd-amd64@FreeBSD.ORG Fri Aug 3 17:40:10 2012 Return-Path: Delivered-To: freebsd-amd64@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 044041065674 for ; Fri, 3 Aug 2012 17:40:10 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id D895C8FC15 for ; Fri, 3 Aug 2012 17:40:09 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q73He9Ul038373 for ; Fri, 3 Aug 2012 17:40:09 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q73He9iQ038372; Fri, 3 Aug 2012 17:40:09 GMT (envelope-from gnats) Date: Fri, 3 Aug 2012 17:40:09 GMT Message-Id: <201208031740.q73He9iQ038372@freefall.freebsd.org> To: freebsd-amd64@FreeBSD.org From: Konstantin Belousov Cc: Subject: Re: amd64/170351: [patch] amd64: 64-bit process can't always get unlimited rlimit X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Konstantin Belousov List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: 
List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 03 Aug 2012 17:40:10 -0000

The following reply was made to PR amd64/170351; it has been noted by GNATS.

From: Konstantin Belousov
To: Ming Qiao
Cc: freebsd-gnats-submit@freebsd.org
Subject: Re: amd64/170351: [patch] amd64: 64-bit process can't always get unlimited rlimit
Date: Fri, 3 Aug 2012 20:39:23 +0300

--4rHvg5NaBspxp64F
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Fri, Aug 03, 2012 at 03:35:20PM +0000, Ming Qiao wrote:
>
> >Number: 170351
> >Category: amd64
> >Synopsis: [patch] amd64: 64-bit process can't always get unlimited rlimit
> >Confidential: no
> >Severity: non-critical
> >Priority: low
> >Responsible: freebsd-amd64
> >State: open
> >Quarter:
> >Keywords:
> >Date-Required:
> >Class: sw-bug
> >Submitter-Id: current-users
> >Arrival-Date: Fri Aug 03 15:40:08 UTC 2012
> >Closed-Date:
> >Last-Modified:
> >Originator: Ming Qiao
> >Release: FreeBSD 9.0-RC2
> >Organization:
> Juniper Networks
> >Environment:
> FreeBSD neys 9.0-RC2 FreeBSD 9.0-RC2 #0: Thu Jul 26 01:27:46 UTC 2012
> root@neys:/usr/obj/usr/src/sys/GENERIC amd64
> >Description:
> On the amd64 platform, if a 32-bit process ever manually set its rlimit,
> none of its 64-bit child or offspring will be able to get the full 64-bit
> rlimit anymore, even if they explicitly set the limit to unlimited.
>
> Note that for the sake of simplicity, only datasize limit is referred
> in this report. But the same logic applies to all other memory segment
> (i.e. stacksize, etc.).
>
> Take the following scenario as an example:
> 1) Let's say we have a 32-bit process p1 whose hard limit is set to 500MB by
> calling setrlimit().
> 2) p1 then exec another 32-bit process p2.
> 3) p2 set its hard limit to unlimited by calling setrlimit().
> 4) p2 exec a 64-bit process p3.
> 5) check the hard limit of p3, we can see that it only has 3GB (value of
> ia32_maxdsiz) instead of 32GB which is the global kernel limit (value of
> maxdsiz) for a 64-bit process.
>
> The root cause is that on step 3, p2 didn't actually set its limit to
> the correct value when calling setrlimit(). Instead the limit is set to
> ia32_maxdsiz since ia32_fixlimit() is called in kern_proc_setrlimit().
> >How-To-Repeat:
> There are 3 test programs attached in this report: 32_p1.c, 32_p2.c, and
> 64_p3.c. They can be used to reproduce the problem.
>
> 1) Compile 32_p1.c and 32_p2.c into 32-bit binaries. Compile 64_p3.c into
> 64-bit binary.
> 2) Put all 3 binaries into the same directory on a machine running FreeBSD
> amd64 version.
> 3) Run 32_p1 which will exec 32_p2 and 64_p3. The output of 64_p3 will show
> its limit is capped at ia32_maxdsiz.
> >Fix:
> The proposed fix is to change kern_proc_setrlimit() so that sv_fixlimit()
> will not be called if the caller wants to set the new limit to RLIM_INFINITY.
> Please refer to the attached diff file for the proposed fix.

The 'fix' is wrong and does not address the issue. Instead, it uses some
arbitrary properties of the scenario you considered and adapts kernel
code to suit your scenario. You deny the correction of the infinity
limit; I do not see how it can be right.

The problem you described is architectural. By design, Unix resource
limits cannot be increased after they were decreased, except by root.
In your scenario, the limits were decreased by the mere fact of running
the 32bit process, which has lower 'infinity' limits than 64bit processes.

That said, I see two possible solutions.

First is to manually set compat.ia32.max* sysctls to 0. Then you get
the desired behaviour for 64bit processes execed from 32bit, it seems.
It does not require code change. Since you are fine with denying the fix
for infinity, this setting gives the same effect as the patch.
Second approach (which is essentially a correction to your approach from fix.diff) is to track the fact that corresponding rlimits are set to 'ABI infinity', in some per-struct rlimit flag. Then, get/setrlimit should first test the 'ABI infinity' flag and behave as if rlimit is set to infinity for current bitness even if the actual value of rlimit is not infinity. Flag is set when rlimit is set to infinity by current ABI. The second approach would provide 'correct' fix, but it is not trivial amount of work for very rare situation (execing 64bit process from 32bit), and current behaviour of inheriting 32bit limits may be argued as right. If you want, feel free to develop such patch, I will review and commit it, but I do not want to spend efforts on developing it myself ATM. --4rHvg5NaBspxp64F Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (FreeBSD) iEYEARECAAYFAlAcDMoACgkQC3+MBN1Mb4gNuQCePkHJVJy34hUB34TjWliF/M53 V3wAn1Xito7num8GNVfJz0gw3Rb0o3Rz =InLX -----END PGP SIGNATURE----- --4rHvg5NaBspxp64F-- From owner-freebsd-amd64@FreeBSD.ORG Sat Aug 4 05:42:56 2012 Return-Path: Delivered-To: freebsd-amd64@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 19EFD106564A for ; Sat, 4 Aug 2012 05:42:56 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail16.syd.optusnet.com.au (mail16.syd.optusnet.com.au [211.29.132.197]) by mx1.freebsd.org (Postfix) with ESMTP id A52F88FC12 for ; Sat, 4 Aug 2012 05:42:55 +0000 (UTC) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail16.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q745gjal024666 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 4 Aug 2012 15:42:46 +1000 Date: Sat, 4 Aug 2012 15:42:44 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Konstantin 
Belousov In-Reply-To: <201208031740.q73He9iQ038372@freefall.freebsd.org> Message-ID: <20120804144130.V791@besplex.bde.org> References: <201208031740.q73He9iQ038372@freefall.freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-amd64@FreeBSD.org Subject: Re: amd64/170351: [patch] amd64: 64-bit process can't always get unlimited rlimit X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 04 Aug 2012 05:42:56 -0000 On Fri, 3 Aug 2012, Konstantin Belousov wrote: > On Fri, Aug 03, 2012 at 03:35:20PM +0000, Ming Qiao wrote: > > >Description: > > On the amd64 platform, if a 32-bit process ever manually set its rlimit, > > none of its 64-bit child or offspring will be able to get the full 64-bit > > rlimit anymore, even if they explicitly set the limit to unlimited. > >=20 > > Note that for the sake of simplicity, only datasize limit is referred > > in this report. But the same logic applies to all other memory segment > > (i.e. stacksize, etc.). > > ... > ... > The problem you described is architectural. By design, Unix resource > limits cannot be increased after they were decreased, except by root. > In your scenario, the limits were decreased by mere fact of running the > 32bit process which have lower 'infinity' limits then 64bit processes. > > That said, I see two possible solutions. > > First is to manually set compat.ia32.max* sysctls to 0. Then you get > desired behaviour for 64bit processes execed from 32bit, it seems. > It does not require code change. Since you are fine with denying fix > for infinity, this setting gives the same effect as the patch. 
>
> Second approach (which is essentially a correction to your approach
> from fix.diff) is to track the fact that corresponding rlimits are set
> to 'ABI infinity', in some per-struct rlimit flag. Then, get/setrlimit
> should first test the 'ABI infinity' flag and behave as if rlimit is set
> to infinity for current bitness even if the actual value of rlimit is
> not infinity. Flag is set when rlimit is set to infinity by current ABI.
>
> The second approach would provide 'correct' fix, but it is not trivial
> amount of work for very rare situation (execing 64bit process from 32bit),
> and current behaviour of inheriting 32bit limits may be argued as right.
> If you want, feel free to develop such patch, I will review and commit it,
> but I do not want to spend efforts on developing it myself ATM.

Third approach: "unlimited" never really means unlimited, so leave the
data size "unlimited" like most other defaults. RLIM_INFINITY is the same
in 32-bit mode as in 64-bit mode, so there is no problem in representing
"unlimited".

Some defaults on a 9.0-STABLE i386 system, according to bash:

% socket buffer size (bytes, -b) unlimited
% core file size (blocks, -c) unlimited
% data seg size (kbytes, -d) 524288
% file size (blocks, -f) unlimited
% max locked memory (kbytes, -l) unlimited
% max memory size (kbytes, -m) unlimited

All the memory and file sizes have finite limits, but the actual
limits are very load-dependent and this won't tell us what these are.
The data size limit is also load-dependent, and this may tell us a
wrong value.

% open files (-n) 3000

Knowing the actual limit for this is more important. I think this limit
is required to be the same as getdtablesize() and sysconf(_SC_OPEN_MAX).

% pipe size (512 bytes, -p) 1

Seems to be a bash bug. There is no rlimit for pipes, and the finite
limit for this is not 512.
(Like most limits, it depends in a complicated way on related and
unrelated system resources, so the actual limit is sometimes 0 and
sometimes closer to the real or virtual memory size. Not so close to the
memory sizes for this, since there is a limit on pipe kva. This limit is
more broken than most, since it is global. Its implementation has many
style bugs.)

% stack size (kbytes, -s) 65536
% cpu time (seconds, -t) unlimited
% max user processes (-u) 5547

This is probably also required to track a sysconf() value.

% virtual memory (kbytes, -v) unlimited
% swap size (kbytes, -w) unlimited

So there are only 4 finite "infinite" rlimits, with 2 probably required.

Bash (4.2.20) is also missing support for the new limit on
pseudo-terminals. This is shown by sh:

% sbsize (bytes, -b) unlimited
% pseudo-terminals (-p) unlimited

The socket buffer limit is shown by both, but sh doesn't describe it
properly (it uses the kernel/API abbreviated name for the description).

On a 10.0-CURRENT amd64 system, according to bash:

% data seg size (kbytes, -d) 33554432
% open files (-n) 11095
% pipe size (512 bytes, -p) 1
% stack size (kbytes, -s) 524288
% max user processes (-u) 5547

Now the finiteness of the data seg size limit is nonsense. The finite
value is 32GB, but the system has only 16GB of RAM and 8GB of swap.
Overcommit may allow more virtual data, but you don't want that unless
you want to have no limit. The correct spelling for no limit on the data
seg size is "unlimited" (RLIM_INFINITY, not 32GB). 64-bit systems
probably all have this limit (but misspelled) without really trying,
since the large virtual address space makes it very easy to exceed
physical resources, and any arbitrary limit is likely to be smaller than
necessary for some loads (especially overcommitted ones), or too large to
always be physically satisfiable.

8GB swap with 16GB RAM is also nonsense.
Might as well have the correct amount of swap (0), or if you want some swap then spare a few bytes of the 16GB for a RAM disk. On my i386 local system: % data seg size (kbytes) 524288 The above i386 system has 2GB of RAM and 4GB of swap. It can actually run up to 11 threads using the data limit without overcommit. But I have only 1GB of RAM and the correct amount of swap (0), so I can't run more than 1 thread using the data limit without overcommit. Bruce
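
A minimal userland sketch of the spelling point above, assuming only the standard getrlimit() interface: a data size limit is reported in kbytes the way the shell listings show it, and only RLIM_INFINITY is spelled "unlimited", so a large finite value such as 33554432 kbytes (32GB) still prints as a number. The helper names format_limit_kb() and print_data_limit() are hypothetical and are not taken from bash, sh or the kernel.

```c
#include <stdio.h>
#include <string.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: format a byte-valued rlimit in kbytes, spelling
 * RLIM_INFINITY as "unlimited".  Any finite value, however large, is
 * printed as a number.
 */
static void
format_limit_kb(rlim_t lim, char *buf, size_t len)
{
	if (lim == RLIM_INFINITY)
		snprintf(buf, len, "unlimited");
	else
		snprintf(buf, len, "%llu", (unsigned long long)(lim / 1024));
}

/*
 * Hypothetical helper: print the soft and hard RLIMIT_DATA values of the
 * calling process.  Returns 0 on success, -1 if getrlimit() fails.
 */
static int
print_data_limit(void)
{
	struct rlimit rl;
	char cur[32], max[32];

	if (getrlimit(RLIMIT_DATA, &rl) != 0) {
		perror("getrlimit");
		return (-1);
	}
	format_limit_kb(rl.rlim_cur, cur, sizeof(cur));
	format_limit_kb(rl.rlim_max, max, sizeof(max));
	printf("data seg size (kbytes): soft %s, hard %s\n", cur, max);
	return (0);
}
```

Calling print_data_limit() from a 32-bit and from a 64-bit binary on the same machine would show whether the inherited limit is a finite per-ABI value or a true RLIM_INFINITY; the sketch only reports the values and does not change them.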