From owner-freebsd-emulation@FreeBSD.ORG Mon Apr 20 11:06:50 2009 Return-Path: Delivered-To: freebsd-emulation@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A5F8E1065670 for ; Mon, 20 Apr 2009 11:06:50 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 934D58FC24 for ; Mon, 20 Apr 2009 11:06:50 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n3KB6oQN032985 for ; Mon, 20 Apr 2009 11:06:50 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n3KB6o3g032981 for freebsd-emulation@FreeBSD.org; Mon, 20 Apr 2009 11:06:50 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 20 Apr 2009 11:06:50 GMT Message-Id: <200904201106.n3KB6o3g032981@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-emulation@FreeBSD.org Cc: Subject: Current problem reports assigned to freebsd-emulation@FreeBSD.org X-BeenThere: freebsd-emulation@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Development of Emulators of other operating systems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 20 Apr 2009 11:06:51 -0000 Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. Description -------------------------------------------------------------------------------- o kern/130724 emulation [linprocfs] [patch] cpuinfo in linprocfs is dated, cau o kern/129169 emulation [linux] [patch] Linux Emulation ENOTCONN error using n f ports/127018 emulation Linuxulator incapable of using FreeBSD's LDAP environm o kern/126232 emulation [linux] Linux ioctl TCGETS (0x5401) always fails o ports/121800 emulation x11-toolkits/linux-openmotif - OpenMotif upgrade to 2. o kern/97326 emulation [linux] file descriptor leakage in linux emulation o ports/91318 emulation [fix] graphics/linux_dri: works on amd64 too o kern/91293 emulation [svr4] [patch] *Experimental* Update to the SVR4 emula o kern/73777 emulation [linux] [patch] linux emulation: root dir special hand a kern/72920 emulation [linux]: path "prefixing" is not done on unix domain s o kern/56451 emulation [linprocfs] /compat/linux/proc/cpuinfo gives wrong CPU o kern/41543 emulation [patch] [request] easier wine/w23 support o kern/39201 emulation [linux] [patch] ptrace(2) and rfork(RFLINUXTHPN) confu o kern/29698 emulation [linux] [patch] linux ipcs doesn'work o kern/21463 emulation [linux] Linux compatability mode should not allow setu o kern/11165 emulation [ibcs2] IBCS2 doesn't work correctly with PID_MAX 9999 16 problems total. From owner-freebsd-emulation@FreeBSD.ORG Thu Apr 23 22:09:59 2009 Return-Path: Delivered-To: freebsd-emulation@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0276E1065670 for ; Thu, 23 Apr 2009 22:09:59 +0000 (UTC) (envelope-from nox@jelal.kn-bremen.de) Received: from smtp.kn-bremen.de (gelbbaer.kn-bremen.de [78.46.108.116]) by mx1.freebsd.org (Postfix) with ESMTP id 84D5C8FC19 for ; Thu, 23 Apr 2009 22:09:57 +0000 (UTC) (envelope-from nox@jelal.kn-bremen.de) Received: by smtp.kn-bremen.de (Postfix, from userid 10) id 510591E000F2; Fri, 24 Apr 2009 00:10:39 +0200 (CEST) Received: from triton.kn-bremen.de (noident@localhost [127.0.0.1]) by triton.kn-bremen.de (8.14.3/8.14.3) with ESMTP id n3NLl2HK083652; Thu, 23 Apr 2009 23:47:02 +0200 (CEST) (envelope-from nox@triton.kn-bremen.de) Received: (from nox@localhost) by triton.kn-bremen.de (8.14.3/8.14.3/Submit) id n3NLl2do083651; Thu, 23 Apr 2009 23:47:02 +0200 (CEST) (envelope-from nox) From: Juergen Lock Date: Thu, 23 Apr 2009 23:47:01 +0200 To: freebsd-emulation@freebsd.org Message-ID: <20090423214701.GA83621@triton.kn-bremen.de> References: <200904032223.n33MNTiq019599@triton.kn-bremen.de> <200904072137.n37LbbdC071227@triton.kn-bremen.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200904072137.n37LbbdC071227@triton.kn-bremen.de> User-Agent: Mutt/1.5.19 (2009-01-05) Cc: kalinoj1@iem.pw.edu.pl Subject: Re: Recent qemu and timers issue X-BeenThere: freebsd-emulation@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Development of Emulators of other operating systems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 23 Apr 2009 22:09:59 -0000 On Tue, Apr 07, 2009 at 11:37:37PM +0200, Juergen Lock wrote: > In article <200904062254.37824.kalinoj1@iem.pw.edu.pl> you write: > >Dnia sobota 04 kwietnia 2009 o 00:23:29 Juergen Lock napisa=C5=82(a): > >> In article you write: > >> >Hi all, > >> > > >> >When running FreeBSD 8-CURRENT system inside a qemu machine (from the > >> >latest port by nox@freebsd.org) I have strange timer problems in the > >> >virtual machine. > >> > >> What is the host running, also current? > > > >Yes, the host is also running current.=20 > > > >But the same host has no problem with older FreeBSD kernels on the guest, e= > >g.=20 > >=46reeSBIe 1.1 or FreeNAS (kernel v.6.4) > > > OK I did some tests in the meantime on my newest box that has 4 cpus and > is running 7-stable: (host) > >> > >> >My logs and console are full of messages like this: > >> >calcru: runtime went backwards from 11248014678717988 usec to 119242 usec > >> >for pid 60 (ps) > >> > > >> >calcru: runtime went backwards from 11248014678921829 usec to 2516605 us= > >ec > >> >for pid 60 (ps) > >> >calcru: runtime went backwards from 7014844 usec to 6270851 usec for pid= > > 0 > >> >(kernel) > >> > > >> >It even happens for some processes to die inside qemu guest. > >> > >> Is this an smp host? You could try forcing qemu onto one cpu by > >> doing like: > >> cpuset -l 0 qemu ... > >> > > > >Yes, it's an smp host (8 cpus). But cpuset -l 0 qemu ... (also -clock=20 > >dynticks) didnt't help. > > > Same here with an 8.0-HEAD-20090403-JPSNAP-i386 guest. (OK I didn't see > guest processes die, but I also didn't really load the guests so far...) > > A 7.1-RELENG_7-20090403-JPSNAP-i386 guest was not affected. > > >> >I tried to use all possible timers using sysctl, where I have: > >> >TSC(800) HPET(900) ACPI-safe(850) i8254(0) dummy(-1000000) > >> >None of these helped. > >> > >> On the host or in the guest? > >> > > > >It's in the guest machine., where I tried these settings=20 > > > >One important thing I forgot to mention in my initial posting: > >When I run 'systat -vm 1' o the guest machine - it's almost fully loaded - = > >and=20 > >80% (or more) of it's cpu time is spent in interrrupt handling! This is als= > >o=20 > >not the case with older kernels. > > > Hmm 'systat -vm 1' doesn't really load the 8.0 guest here. > > >> [snip] > >> >I would greatly appreciate any help or a hint - where should I look for > >> >help on this issue. > >> > >> If none of these help you could also try the qemu list: > >> http://lists.nongnu.org/mailman/listinfo/qemu-devel > >> (I think you can still post without being subscribed, altho in that case > >> it may take a while for the post to get thru.) > >> > >> HTH, > >> Juergen > > > >At this moment it's very hard for me to judge if it's qemu or kernel issue.. > >Any thoughts? > > Yeah, could be one or the other. I just noticed running any FreeBSD guest with bootverbose (boot -v, or menu item 5 in the default beastie menu) gives me lots of these: t_delta 16.0109662fcd2712a0 too long t_delta 16.014a1099ee815ca0 too long t_delta 16.0108d9c2ecd404c0 too long t_delta 16.0108d032cb334b60 too long t_delta 15.f58306afb53419a0 too short t_delta 16.010a762538a38680 too long etc. My guess is this could be related... Juergen From owner-freebsd-emulation@FreeBSD.ORG Fri Apr 24 15:05:03 2009 Return-Path: Delivered-To: freebsd-emulation@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E0D461065670 for ; Fri, 24 Apr 2009 15:05:03 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from fallbackmx10.syd.optusnet.com.au (fallbackmx10.syd.optusnet.com.au [211.29.132.251]) by mx1.freebsd.org (Postfix) with ESMTP id E989A8FC26 for ; Fri, 24 Apr 2009 15:05:01 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail02.syd.optusnet.com.au (mail02.syd.optusnet.com.au [211.29.132.183]) by fallbackmx10.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id n3OCKegU011073 for ; Fri, 24 Apr 2009 22:20:40 +1000 Received: from besplex.bde.org (c122-107-120-227.carlnfd1.nsw.optusnet.com.au [122.107.120.227]) by mail02.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id n3OCKXqL001652 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 24 Apr 2009 22:20:36 +1000 Date: Fri, 24 Apr 2009 22:20:33 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Juergen Lock In-Reply-To: <20090423214701.GA83621@triton.kn-bremen.de> Message-ID: <20090424201623.N887@besplex.bde.org> References: <200904032223.n33MNTiq019599@triton.kn-bremen.de> <200904072137.n37LbbdC071227@triton.kn-bremen.de> <20090423214701.GA83621@triton.kn-bremen.de> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: kalinoj1@iem.pw.edu.pl, freebsd-emulation@FreeBSD.org Subject: Re: Recent qemu and timers issue X-BeenThere: freebsd-emulation@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Development of Emulators of other operating systems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 24 Apr 2009 15:05:04 -0000 On Thu, 23 Apr 2009, Juergen Lock wrote: > On Tue, Apr 07, 2009 at 11:37:37PM +0200, Juergen Lock wrote: >> In article <200904062254.37824.kalinoj1@iem.pw.edu.pl> you write: >>> Dnia sobota 04 kwietnia 2009 o 00:23:29 Juergen Lock napisa=C5=82(a): >>>> In article you write: >>> ... >> OK I did some tests in the meantime on my newest box that has 4 cpus and >> is running 7-stable: (host) >>>> >>>>> My logs and console are full of messages like this: >>>>> calcru: runtime went backwards from 11248014678717988 usec to 119242 usec >>>>> for pid 60 (ps) >>>>> >>>>> calcru: runtime went backwards from 11248014678921829 usec to 2516605 us= >>> ec >>>>> for pid 60 (ps) >>>>> calcru: runtime went backwards from 7014844 usec to 6270851 usec for pid= >>> 0 >> ... >>>>> I tried to use all possible timers using sysctl, where I have: >>>>> TSC(800) HPET(900) ACPI-safe(850) i8254(0) dummy(-1000000) >>>>> None of these helped. None of these are normally used for calculating runtimes. Normally on i386, the TSC is used. The only way to configure this is to edit the source code. Try removing the calls to set_cputicker() in the MD code. Then the MI and timecounter-based cputicker tc_cpu_ticks() will be used. A better implementation would use a user-selectable timecounter-based cputicker in all cases, but usually not the system timecounter since that is likely to be very slow so as to be more accurate. >>> At this moment it's very hard for me to judge if it's qemu or kernel issue.. >>> Any thoughts? >> >> Yeah, could be one or the other. > > I just noticed running any FreeBSD guest with bootverbose (boot -v, > or menu item 5 in the default beastie menu) gives me lots of these: > t_delta 16.0109662fcd2712a0 too long > t_delta 16.014a1099ee815ca0 too long > t_delta 16.0108d9c2ecd404c0 too long > t_delta 16.0108d032cb334b60 too long > t_delta 15.f58306afb53419a0 too short > t_delta 16.010a762538a38680 too long > etc. My guess is this could be related... As you probably know, virtualizing timers is hard to do correctly (see vmware docs), so the problem is most probably partly in qemu. The cputicker code is much more fragile and/or broken than the timecounter code, so imperfect virtualization is likely to cause more problems with it. The above t_delta printfs are a sign of problems. It is not surprising that the recalibration code (cpu_tick_calibrate() (sic)) has timing problems with virtual timers. When it detects a problem, it prints the above; then it continues using the old (possibly wrong) frequency. When it doesn't detect a problem, and the recalibrated frequency would be larger, then it continues using the new (possibly wrong) frequency. When it doesn't detect a problem, and the recalibrated frequency would be smaller, then it is just broken -- it continues using the old (certainly wrong) frequency. When miscalibrations are normal as appears to be the case with qemu, this bug probably makes their effects less obvious: It makes the miscalibrated frequencies monotonically increasing so they will soon converge to a limit and the frequency used will then be invariant until the next full calibration. When the limit is infinity, as is caused messing up the timers completely (e.g., by sitting at a debugger prompt for more than a couple of seconds) then other effects are fairly visible -- all user and sys times are broken (nearly 0). When the limit is only a few percent larger than the average, the user and sys times will be only a few percent lower, provided the difference between the limit and the actual (transient) frequency is not so large that calcru() sees problems and tries to fix them. Fixing there is fundamentally impossible but in some cases the damage is small enough to hide. In all cases, continuing with a wrong frequency will cause problems with the runtime calculations. Continuing with a right (but changed) frequency is little better. calcru() is fundamentally broken if the frequency actually changes, since it applies the current frequency to ticks accumulated at old frequencies. The following patch fixes a couple of the problems in recalibration: % diff -c2 ./kern/kern_tc.c~ ./kern/kern_tc.c % *** ./kern/kern_tc.c~ Thu Mar 20 01:05:27 2008 % --- ./kern/kern_tc.c Thu Mar 20 01:05:29 2008 % *************** % *** 884,888 **** % return; % % ! getbinuptime(&t_this); % c_this = cpu_ticks(); % if (t_last.sec != 0) { % --- 884,888 ---- % return; % % ! binuptime(&t_this); % c_this = cpu_ticks(); % if (t_last.sec != 0) { The existence of getbinuptime() is a bug, and its use here is especially wrong. We should use the current time, and binuptime() gives this as well as possible. The caller has just set the time cached for getbinutime() to the current time, but this is not quite the same. There is no locking here, except the implicit locking given by being in a fast interrupt handler, so the time between the caching and the use of the value is unbounded. We may be interrupted even if we are in a fast interrupt handler, depending on the implementation of fast interrupt handlers. Emulation is quite likely to result in fast interrupt handlers being interrupted -- they are probably as "fast" (really low-latency) as possible on the target, but they may be interrupted on the host and then long delayed on the target in some cases (hopefully not often). So using binuptime() is only slightly better here -- a delay may occur after the timecounter is read, no matter where it is read. Something like the following is needed: %%% binuptime(&start_bt); start_tsc = rdtsc(); do { binuptime(&sample_bt); tsc = rdtsc(); binuptime(&bt); bintime_sub(&bt, &sample_bt); if (bintimecmp(&bt, &best_bt, <)) { best_bt = bt; best_sample_bt = sample_bt; best_tsc = tsc; } bt = sample_bt; bintime_sub(&bt, &start_bt); } while (++nsamples < min_nsamples || bintimecmp(&best_bt, min_error_btp, >)); %%% Reads of the timer being calibrated (TSC here) are sandwiched between reads of the reference timer and ones that can't be measured to within a few parts per billion are discarded. I never saw any effects from the above change on non-virtual systems. Also, the sanity checks (which print the t_delta message) never fired for the extremely messed up timers given by running ddb. When running ddb, normal timing stops working after a few seoncs or millseconds since timecounters require interrupts to work. However, the cputicker keeps working precisely if it is the TSC. So the cputicker : delta-binuptime ratio soon becomes effectively infinite. The sanity checking code seems to always fail to detect this problem. % *************** % *** 922,931 **** % c_delta <<= 20; % c_delta /= divi; % ! if (c_delta > cpu_tick_frequency) { % ! if (0 && bootverbose) % ! printf("cpu_tick increased to %ju Hz\n", % ! c_delta); % ! cpu_tick_frequency = c_delta; % ! } % } % } % --- 922,930 ---- % c_delta <<= 20; % c_delta /= divi; % ! if ((0 && bootverbose)) % ! printf( % ! "cpu_tick_frequency changed from %ju to %ju Hz\n", % ! cpu_tick_frequency, c_delta); % ! cpu_tick_frequency = c_delta; % } % } This fixes the bug which makes cpu_tick_frequency non-decreasing. Another thing you can try here is to edit the source code to change the set_cputicker() calls to say that the frequency is not variable. I used this temporarily to work around the non-decreasing calibration. This should be the default for emulators for most cputickers -- emulators should emulate a constant frequency and not emulate the complexities for pwoer saving. This should probably be the default when the cputicker is the i386 TSC and the TSC is P-state invariant. Currently, P-state invariance only causes turning off of cpufreq's adjustment of the TSC as a timecounter. Bruce