From owner-freebsd-current@FreeBSD.ORG Fri Dec 27 23:06:42 2013
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
    (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
    (No client certificate requested)
    by hub.freebsd.org (Postfix) with ESMTPS id DDAD6F33;
    Fri, 27 Dec 2013 23:06:42 +0000 (UTC)
Received: from hydra.pix.net (hydra.pix.net [IPv6:2001:470:e254::3c])
    (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
    (No client certificate requested)
    by mx1.freebsd.org (Postfix) with ESMTPS id 888BE186A;
    Fri, 27 Dec 2013 23:06:42 +0000 (UTC)
Received: from torb.pix.net (torb.pix.net [IPv6:2001:470:e254:10:12dd:b1ff:febf:eca9])
    (authenticated bits=0)
    by hydra.pix.net (8.14.5/8.14.5) with ESMTP id rBRN6avb018062;
    Fri, 27 Dec 2013 18:06:37 -0500 (EST)
    (envelope-from lidl@pix.net)
X-Virus-Status: Clean
X-Virus-Scanned: clamav-milter 0.98 at mail.pix.net
Message-ID: <52BE07FC.8020104@pix.net>
Date: Fri, 27 Dec 2013 18:06:36 -0500
From: Kurt Lidl
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:24.0) Gecko/20100101 Thunderbird/24.2.0
MIME-Version: 1.0
To: Marius Strobl
Subject: Re: panic on sparc64 running 10-beta4
References: <529F51DA.1040703@pix.net> <20131208135023.GA75625@alchemy.franken.de> <20131227184234.GA1597@alchemy.franken.de>
In-Reply-To: <20131227184234.GA1597@alchemy.franken.de>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: FreeBSD-Current, sparc64@freebsd.org
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
X-List-Received-Date: Fri, 27 Dec 2013 23:06:42 -0000

On 12/27/13 1:42 PM, Marius Strobl wrote:
> On Sun, Dec 08, 2013 at 02:50:23PM +0100, Marius Strobl wrote:
>> On Wed, Dec 04, 2013 at 11:01:30AM -0500, Kurt Lidl wrote:
>>> I installed a sparc V120 (4GB memory, dual 72GB disks) with the 10-beta4
>>> install image today.
>>>
>>> Installation went fine. I rebooted the machine, and then went to get
>>> a fresh ports tree, and the machine panic'd:
>>>
>>> root@host:/usr/ports # portsnap fetch
>>> Looking up portsnap.FreeBSD.org mirrors... 7 mirrors found.
>>> Fetching public key from your-org.portsnap.freebsd.org... done.
>>> Fetching snapshot tag from your-org.portsnap.freebsd.org... done.
>>> Fetching snapshot metadata... done.
>>> Fetching snapshot generated at Tue Dec 3 19:06:18 EST 2013:
>>> 43b6803c6d94efd5b2e2bc9df0b66a84b75417fa3c1728100% of 69 MB 3225 kBps
>>> 00m22s
>>> Extracting snapshot... done.
>>> Verifying snapshot integrity... panic: trap: illegal instruction (kernel)
>>> cpuid = 0
>>> KDB: stack backtrace:
>>> #0 0xc08836d4 at trap+0x554
>>> Uptime: 6m59s
>>> Dumping 4096 MB (4 chunks)
>>> chunk at 0: 1073741824 bytes ... ok
>>> chunk at 0x40000000: 1073741824 bytes ... ok
>>> chunk at 0x80000000: 1073741824 bytes ... ok
>>> chunk at 0xc0000000: 1073741824 bytes ... ok
>>>
>>> Dump complete
>>> Automatic reboot in 15 seconds - press a key on the console to abort
>>> Rebooting...
>>>
>>> And then it panic'd again when attempting to run 'savecore'!
>>> (I typed a after it printed out the line about
>>> writing the core file, that's where the "load: 0.72 ..." line
>>> came from...)
>>
>> Hrm, I don't seem to be able to reproduce this with an installation
>> built from sources and also can't remember a commit between BETA3 and
>> BETA4 which should be able to cause this. I currently can't test the
>> 10-BETA4 install image, though. Was the machine in question running
>> FreeBSD before, i. e. is it known good hardware? Did savecore eventually
>> succeed on writing out a dump?

Yes, this machine was successfully running 9/stable before this.

Yes, I did ultimately get a successful savecore to run. The trick seems
to be not to use ctrl-t to check the status of the machine.
I loaded the RC1 build too, and restrained myself to not check via
ctrl-t during the installation and unpacking of the OS, and again when
doing a "portsnap fetch && portsnap unpack".

I think the problem hinges on ctrl-t corrupting something that causes
the panic soon thereafter.

>
> FYI, I tried again with a machine installed from the 10.0-RC3 binary
> image and couldn't reproduce that problem either.

I just tried it again with a freshly fetched and burned RC3 image, and
was able to get it to panic while verifying the snapshot.

My comments are in [square brackets].

root@dna:~ # portsnap fetch
Looking up portsnap.FreeBSD.org mirrors... 7 mirrors found.
Fetching public key from your-org.portsnap.freebsd.org... done.
Fetching snapshot tag from your-org.portsnap.freebsd.org... done.
Fetching snapshot metadata... done.
Fetching snapshot generated at Thu Dec 26 19:11:40 EST 2013:
[ I did several ctrl-t operations during the fetch, no problem ]
Extracting snapshot... [ctrl-t]
load: 0.55 cmd: bsdtar 1355 [runnable] 6.33r 1.39u 3.78s 37% 5384k
In: 11851934 bytes, compression 23%; Out: 5320 files, 15471104 bytes
Current: snap/3d543fc157d97d1617eeb20832bf2cb37d04aeb2bf068bd0a07533e5b67c02fe.gz (1152 bytes)
[ctrl-t]
load: 0.83 cmd: bsdtar 1355 [runnable] 11.43r 2.36u 6.55s 51% 5384k
In: 19288110 bytes, compression 24%; Out: 9299 files, 25624576 bytes
Current: snap/1856dcdc8799dd2b5a19d2d4720452bc77b4084088dd9ac5bd190da5ac211c4b.gz (101014 bytes)
done.
Verifying snapshot integrity...
[ a bunch of rapid ctrl-t keystrokes ]
load: 2.23 cmd: sha256 1370 [runnable] 0.49r 0.32u 0.00s 3% 2064k
load: 2.21 cmd: sh 1539 [runnable] 0.04r 0.00u 0.00s 2% 0k
load: 1.93 cmd: sha256 5705 [runnable] 0.02r 0.00u 0.00s 15% 1880k
load: 1.93 cmd: sh 5715 [runnable] 0.03r 0.00u 0.00s 15% 3136k
load: 1.93 cmd: gunzip 5728 [runnable] 0.01r 0.00u 0.00s 16% 1200k
load: 1.93 cmd: gunzip 5737 [runnable] 0.02r 0.00u 0.01s 16% 2144k
load: 1.93 cmd: sh 5749 [runnable] 0.00r 0.00u 0.00s 16% 3136k
load: 1.93 cmd: sh 1391 [runnable] 68.71r 0.58u 5.18s 15% 3136k
panic: trap: fast data access mmu miss (kernel)
cpuid = 0
KDB: stack backtrace:
#0 0xc0883954 at trap+0x554
Uptime: 1h1m23s
Dumping 4096 MB (4 chunks)
chunk at 0: 1073741824 bytes ... ok
chunk at 0x40000000: 1073741824 bytes ... ok
chunk at 0x80000000: 1073741824 bytes ... ok
chunk at 0xc0000000: 1073741824 bytes ... ok

Dump complete

Here's the backtrace from the recovered crashdump, 'core.txt.0':

Unread portion of the kernel message buffer:
panic: trap: fast data access mmu miss (kernel)
cpuid = 0
KDB: stack backtrace:
#0 0xc0883954 at trap+0x554
Uptime: 1h1m23s
Dumping 4096 MB (4 chunks)
chunk at 0: 1073741824 bytes

Reading symbols from /boot/kernel/zfs.ko.symbols...done.
Loaded symbols for /boot/kernel/zfs.ko.symbols
Reading symbols from /boot/kernel/opensolaris.ko.symbols...done.
Loaded symbols for /boot/kernel/opensolaris.ko.symbols
Reading symbols from /boot/kernel/geom_mirror.ko.symbols...done.
Loaded symbols for /boot/kernel/geom_mirror.ko.symbols
#0 0x00000000c052f57c in doadump (textdump=)
    at /usr/src/sys/kern/kern_shutdown.c:258
258 savectx(&dumppcb);
(kgdb) #0 0x00000000c052f57c in doadump (textdump=)
    at /usr/src/sys/kern/kern_shutdown.c:258
#1 0x00000000c052ff70 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:447
#2 0x00000000c0530338 in panic (fmt=0xc0af4828 "trap: %s (kernel)")
    at /usr/src/sys/kern/kern_shutdown.c:754
#3 0x00000000c088395c in trap (tf=0xc1665040)
    at /usr/src/sys/sparc64/sparc64/trap.c:410
#4 0x00000000c00a1060 in tl1_trap ()
#5 0x00000000c051b3e8 in __mtx_lock_sleep (c=0xfffff800fca631e0,
    tid=18446735278028046848, opts=-56217240, file=0x0, line=0)
    at /usr/src/sys/kern/kern_mutex.c:432
#6 0x00000000c08108e8 in vm_page_insert_after (m=0xc0c58a98,
    object=0xfffff80002c73240, pindex=0, mpred=0x0)
    at /usr/src/sys/vm/vm_page.c:998
#7 0x00000000c080f780 in vm_page_dequeue (m=0xfffff800f981b368)
    at /usr/src/sys/vm/vm_page.c:2045
#8 0x00000000c07fcd80 in vm_fault_hold (map=0xfffff8000228ea00,
    vaddr=1083088896, fault_type=2 '\002', fault_flags=0, m_hold=0x0)
    at vm_page.h:644
#9 0x00000000c07feb90 in vm_fault (map=0xfffff8000228ea00,
    vaddr=1083088896, fault_type=2 '\002', fault_flags=0)
    at /usr/src/sys/vm/vm_fault.c:224
#10 0x00000000c0882ffc in trap_pfault (td=, tf=0xc1665880)
    at /usr/src/sys/sparc64/sparc64/trap.c:501
#11 0x00000000c0883498 in trap (tf=0xc1665880)
    at /usr/src/sys/sparc64/sparc64/trap.c:289
#12 0x00000000c00a0e40 in tl0_intr ()
#13 0x0000000000000000 in ?? ()
(kgdb)

-Kurt