From owner-freebsd-sparc64@freebsd.org Tue Sep 22 22:14:14 2015 Return-Path: Delivered-To: freebsd-sparc64@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 82518A07BCD for ; Tue, 22 Sep 2015 22:14:14 +0000 (UTC) (envelope-from marius@alchemy.franken.de) Received: from alchemy.franken.de (alchemy.franken.de [194.94.249.214]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "alchemy.franken.de", Issuer "alchemy.franken.de" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 0F1011288; Tue, 22 Sep 2015 22:14:13 +0000 (UTC) (envelope-from marius@alchemy.franken.de) Received: from alchemy.franken.de (localhost [127.0.0.1]) by alchemy.franken.de (8.15.2/8.15.2/ALCHEMY.FRANKEN.DE) with ESMTPS id t8MME4oT081340 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Wed, 23 Sep 2015 00:14:04 +0200 (CEST) (envelope-from marius@alchemy.franken.de) Received: (from marius@localhost) by alchemy.franken.de (8.15.2/8.15.2/Submit) id t8MME4G7081339; Wed, 23 Sep 2015 00:14:04 +0200 (CEST) (envelope-from marius) Date: Wed, 23 Sep 2015 00:14:04 +0200 From: Marius Strobl To: Mark Cave-Ayland Cc: Alexey Dokuchaev , "freebsd-sparc64@freebsd.org" Subject: Re: PCI range checking under qemu-system-sparc64 Message-ID: <20150922221404.GA81100@alchemy.franken.de> References: <20150913180126.GC7862@alchemy.franken.de> <55F89861.1030107@ilande.co.uk> <20150916031030.GA6711@FreeBSD.org> <55F9C2B8.7030605@ilande.co.uk> <20150916211914.GD18789@alchemy.franken.de> <20150917082817.GA71811@FreeBSD.org> <55FBB662.4080708@ilande.co.uk> <20150919211420.GK18789@alchemy.franken.de> <55FDEA3C.1010804@ilande.co.uk> <20150920043630.GA36162@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150920043630.GA36162@FreeBSD.org> User-Agent: Mutt/1.5.23 (2014-03-12) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3 (alchemy.franken.de [0.0.0.0]); Wed, 23 Sep 2015 00:14:05 +0200 (CEST) X-BeenThere: freebsd-sparc64@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Porting FreeBSD to the Sparc List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Sep 2015 22:14:14 -0000 On Sun, Sep 20, 2015 at 04:36:31AM +0000, Alexey Dokuchaev wrote: > On Sun, Sep 20, 2015 at 12:05:32AM +0100, Mark Cave-Ayland wrote: > > [...] > > While I don't have any insight on the CPU tick interrupt yet, my initial > > feeling is that the ATA hang could be related to the PCI interrupt > > clearing issue that I started looking into a while back. Although it > > isn't a complete fix, does the attached patch against QEMU help at all? > > Did not help; putting back "device scbus+da+cd" hangs with patched QEMU. > Actually, it does make a bit of a difference; with a QEMU 2.4.0 built with apb-no-clear.patch applied the number of interrupts seen for the emulated CMD646U2 at the point of the hang decreases from reliably 5 to reliably 4. Now whether that is a relevant functional change or what's the appropriate number of interrupts at that point in time I don't know, though. I tried to look into the tick interrupt problem, as that seems more relevant. Apparently, writes to %tick and %tick_cmpr are hooked up to cpu_tick_set_count() and cpu_tick_set_limit() (in hw/sparc64/sun4u.c) respectively in target-sparc/translate.c by a detour via target-sparc/helper.c. Likewise for %{h,}stick and %{h,}stick_cpmr. So at first I just enabled DEBUG_TIMER, which yields: marius@enigma:/home/marius > qemu-system-sparc64 -nographic -cdrom FreeBSD-11.0-CURRENT-sparc64-bootonly.iso -boot d TIMER: tick set_limit limit=0x0000000000000000 (disabled) p=0x803a29810 called with limit=0x8000000000000000 at 0x0000000000004c97 (delta=0x0000000000000000) TIMER: tick set_limit limit=ZERO - not starting timer TIMER: tick get_count count=0x000000000005df82 (disabled) p=0x803a29810 TIMER: tick set_limit limit=0x00000000001521c2 (enabled) p=0x803a29810 called with limit=0x00000000001521c2 at 0x000000000005e0e9 (delta=0x00000000000f40d8) OpenBIOS for Sparc64 <...> The first invocation of cpu_tick_set_count() is here: TIMER: tick set_count count=0x0000000000000000 (enabled) p=0x803a29810 TIMER: tick set_limit limit=0x0000000000000000 (disabled) p=0x803a29810 called with limit=0x8000000000000000 at 0x0000000000000e73 (delta=0x0000000000000000) TIMER: tick set_limit limit=ZERO - not starting timer GDB: no debug ports present KDB: debugger backends: ddb KDB: current backend: ddb Copyright (c) 1992-2015 The FreeBSD Project. I don't know where that write to %tick is coming from but it's not performed by either the FreeBSD loader or kernel. However, the write to %tick_cmpr seems right as we do a `wrpr (1L << 63), 0, %tick_cmpr` very early in the kernel in order to prevent foot-shooting from firmware variants that leave the tick counter enabled. There's no other call of cpu_tick_set_count() but the next one to cpu_tick_set_limit() is when the kernel enables the tick counter: Timecounter "tick" frequency 100000000 Hz quality 1000 TIMER: tick get_count count=0x0000000006ead320 (disabled) p=0x803a29810 TIMER: tick get_count count=0x0000000006ead7da (disabled) p=0x803a29810 Event timer "tick" frequency 100000000 Hz quality 1000 TIMER: tick get_count count=0x0000000006ec9b12 (disabled) p=0x803a29810 TIMER: tick set_limit limit=0x0000000006ed5e61 (disabled) p=0x803a29810 called with limit=0x8000000006ed5e61 at 0x0000000006eca6e5 (delta=0x000000000000b77b) TIMER: (null) disabled Timecounters tick every 1.000 msec^M^M The "called with limit=0x8000000006ed5e61" is a plain lie! We simply do not set (1L << 63) when writing the future tick value to %tick_cmpr for the tick interrupt to go off. So I tried with a hack to cpu_tick_set_limit(): void cpu_tick_set_limit(CPUTimer *timer, uint64_t limit) { int64_t now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL); uint64_t real_limit = limit & ~timer->disabled_mask; timer->disabled = (limit & timer->disabled_mask) ? 1 : 0; timer->disabled = limit == 0; Not that this shouldn't make any difference as it just special- cases the fact that apart from (1L << 63), FreeBSD writes a 0 value to %tick_cmpr for disabling the tick interrupt. Moreover, limit == 0 implies that (1L << 63) isn't set either, so if the kernel really would set (1L << 63) in the case in question, this change shouldn't cause a difference in behavior, but it in fact it does: Timecounter "tick" frequency 100000000 Hz quality 1000 Event timer "tick" frequency 100000000 Hz quality 1000 Timecounters tick every 1.000 msec IPsec: Initialized Security Association Processing. cd0 at ata3 bus 0 scbus1 target 0 lun 0 cd0: Removable CD-ROM SCSI device cd0: Serial Number QM00003 cd0: 13.300MB/s transfers (WDMA1, ATAPI 12bytes, PIO 65534bytes) cd0: cd present [103200 x 2048 byte records] WARNING: WITNESS option enabled, expect reduced performance. Trying to mount root from cd9660:/dev/iso9660/11_0_CURRENT_SPARC64_BO [ro]... [ thread pid 1 tid 100002 ] Stopped at tl1_trap+0x24: stx %o0, [%sp + 0x997] db> So apparently (1L << 63) in %tick_cmpr somehow is sticky within QEMU but I didn't manage to spot a bug in that regard. There actually are several obvious bugs in QEMU when it comes to %{hs,s,}tick and %{hs,s,}tick_cmpr handling, though. First off, both cpu_tick_set_count() and cpu_tick_set_limit() test for the disabled_mask being set in the value written to the register in question. This is wrong as only the %{s,}tick_cmpr registers have (1L << 63) as disable bit, in %{s,}tick (1L << 63) serves as NPT (Non-Privilege Trap) bit. The NPT bit causes an unprivileged read of %{s,}tick to trap when set (with real hardware that is). QEMU seems to re-invent and side-step that logic via CONFIG_USER_ONLY. I don't have documentation for %hstick and %hstick_cmpr at hand but they likely work the same way. Moreover, cpu_devinit() in hw/sparc64/sun4u.c seems to totally confuse NPT and disable bit in the first place: env->tick = cpu_timer_create("tick", cpu, tick_irq, tick_frequency, TICK_NPT_MASK); env->stick = cpu_timer_create("stick", cpu, stick_irq, stick_frequency, TICK_INT_DIS); env->hstick = cpu_timer_create("hstick", cpu, hstick_irq, hstick_frequency, TICK_INT_DIS); Depending on what "disabled" is supposed to mean, that should be either all TICK_INT_DIS or all TICK_NPT_MASK, but certainly not mixed. As for the panic I get with my hack applied, the back- trace is: Trying to mount root from cd9660:/dev/iso9660/11_0_CURRENT_SPARC64_BO [ro]... [ thread pid 1 tid 100002 ] Stopped at tl1_trap+0x24: stx %o0, [%sp + 0x997] db> bt Tracing pid 1 tid 100002 td 0xfffff800015e84d0 KDB: reentering KDB: stack backtrace: kdb_reenter() at kdb_reenter+0x5c trap() at trap+0x2fc -- kernel stack fault %o7=0xc0570d40 -- sched_clock() at sched_clock+0x94 KDB: reentering KDB: stack backtrace: kdb_reenter() at kdb_reenter+0x5c trap() at trap+0x2fc -- kernel stack fault %o7=0xc011a050 -- db_read_bytes() at db_read_bytes+0x44 KDB: reentering KDB: stack backtrace: kdb_reenter() at kdb_reenter+0x5c trap() at trap+0x2fc -- kernel stack fault %o7=0xc011a050 -- db_read_bytes() at db_read_bytes+0x44 KDB: reentering KDB: stack backtrace: kdb_reenter() at kdb_reenter+0x5c trap() at trap+0x2fc -- kernel stack fault %o7=0xc011a050 -- db_read_bytes() at db_read_bytes+0x44 So this is a trap while inside the kernel and a seemingly serious stack corruption. I have no idea how to even debug QEMU in that regard or where to start looking for that bug in QEMU. However, as long as the bug of (1L << 63) being corrupted in %tick_cmpr isn't fixed, i. e. its cause unknown, it likely also doesn't make much sense to dig further into that stack corruption. Err, what operating systems is qemu-system-sparc64 already able to successfully host again? :) Marius