Date: Thu, 29 Oct 2020 16:48:36 +0100 (CET)
From: Christian Kratzer <ck-lists@cksoft.de>
To: Andriy Gapon <avg@FreeBSD.org>
Cc: freebsd-fs@freebsd.org
Subject: Re: 12.1-RELEASE-p7 panic in zio_free_issue_4_6
Message-ID: <9ca55ded-1f91-5118-917e-3266946020@cksoft.de>
In-Reply-To: <24b9cc11-0681-2f17-b634-d68878bc67ac@FreeBSD.org>
References: <a6a55583-f7b8-ee59-e3c7-4d1fcc5b1de8@cksoft.de> <474d086c-5a36-0db5-974f-ccfa0acbd871@FreeBSD.org> <d66296be-a1b9-b2a1-c2ec-164a7ba178@cksoft.de> <24b9cc11-0681-2f17-b634-d68878bc67ac@FreeBSD.org>

Hi,

On Thu, 29 Oct 2020, Andriy Gapon wrote:

>> I will keep this pool around for a couple of days and will try to get a crash dump
>> from the system. After that I will have delete and recreate the pool and just
>> wait for backups to roll back in.
>
> Okay, let's see if we can get a vmcore.
> Otherwise, this is just a guess-work on my part.
> The problem could be very different from my initial impression.

so I added a swap device, which for some reason was missing, and was about to induce the crash again.

But in order to get consistent versions of the kernel debug symbols I upgraded the system to 12.2-RELEASE.

Now that everything was in place I was able to import the pool in readonly mode using

    zpool import -m -o readonly=on zpfra2

without the system crashing.

It sits there happily now

  pool: zpfra2
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: scrub repaired 0 in 0 days 01:31:43 with 0 errors on Fri Jul 17 18:55:35 2020
config:

        NAME                      STATE     READ WRITE CKSUM
        zpfra2                    DEGRADED     0     0     0
          raidz2-0                ONLINE       0     0     0
            gpt/zfsfra1d02.eli    ONLINE       0     0     0
            gpt/zfsfra1d03.eli    ONLINE       0     0     0
            gpt/zfsfra1d04.eli    ONLINE       0     0     0
            gpt/zfsfra1d05.eli    ONLINE       0     0     0
            gpt/zfsfra1d06.eli    ONLINE       0     0     0
            gpt/zfsfra1d07.eli    ONLINE       0     0     0
            gpt/zfsfra1d08.eli    ONLINE       0     0     0
          raidz2-1                ONLINE       0     0     0
            gpt/zfsfra1d10.eli    ONLINE       0     0     0
            gpt/zfsfra1d11.eli    ONLINE       0     0     0
            gpt/zfsfra1d12.eli    ONLINE       0     0     0
            gpt/zfsfra1d13.eli    ONLINE       0     0     0
            gpt/zfsfra1d14.eli    ONLINE       0     0     0
            gpt/zfsfra1d15.eli    ONLINE       0     0     0
            gpt/zfsfra1d16.eli    ONLINE       0     0     0
        logs
          mirror-2                UNAVAIL      0     0     0
            3980362133776709100   UNAVAIL      0     0     0  was /dev/gpt/log1d0
            6670731949941654186   UNAVAIL      0     0     0  was /dev/gpt/log1d1

The pool is degraded as I had already removed the log devices.

I am pulling data off the pool as we speak and will recreate it.

In case this happens again I will be prepared with a crash dump.

I will also not enable dedup for this 15TB pool as long as the machine has only 128GB of RAM.

Greetings
Christian

-- 
Christian Kratzer                   CK Software GmbH
Email:   ck@cksoft.de               Wildberger Weg 24/2
Phone:   +49 7032 893 997 - 0       D-71126 Gaeufelden
Fax:     +49 7032 893 997 - 9       HRB 245288, Amtsgericht Stuttgart
Mobile:  +49 171 1947 843           Geschaeftsfuehrer: Christian Kratzer
Web:     http://www.cksoft.de/


Date: Thu, 29 Oct 2020 17:44:30 +0100 (CET)
From: Christian Kratzer <ck-lists@cksoft.de>
Reply-To: Christian Kratzer <ck@cksoft.de>
To: Andriy Gapon <avg@FreeBSD.org>
Cc: freebsd-fs@freebsd.org
Subject: Re: 12.1-RELEASE-p7 panic in zio_free_issue_4_6
Message-ID: <d4104c66-7683-1492-47ff-f6eb3f32eb7@cksoft.de>
In-Reply-To: <24b9cc11-0681-2f17-b634-d68878bc67ac@FreeBSD.org>
References: <a6a55583-f7b8-ee59-e3c7-4d1fcc5b1de8@cksoft.de> <474d086c-5a36-0db5-974f-ccfa0acbd871@FreeBSD.org> <d66296be-a1b9-b2a1-c2ec-164a7ba178@cksoft.de> <24b9cc11-0681-2f17-b634-d68878bc67ac@FreeBSD.org>

Hi,

On Thu, 29 Oct 2020, Andriy Gapon wrote:

> Okay, let's see if we can get a vmcore.
> Otherwise, this is just a guess-work on my part.
> The problem could be very different from my initial impression.

I got the data off the pool in read-only mode and mounted it read-write again:

Fatal trap 12: page fault while in kernel mode
cpuid = 7; apic id = 11
fault virtual address   = 0x30
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff824b89e4
stack pointer           = 0x28:0xfffffe012d6c7be0
frame pointer           = 0x28:0xfffffe012d6c7be0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 3185 (zio_free_issue_6_3)
trap number             = 12
panic: page fault
cpuid = 7
time = 1603989616
KDB: stack backtrace:
#0 0xffffffff80c0a8f5 at kdb_backtrace+0x65
#1 0xffffffff80bbeb1b at vpanic+0x17b
#2 0xffffffff80bbe993 at panic+0x43
#3 0xffffffff8108f911 at trap_fatal+0x391
#4 0xffffffff8108f96f at trap_pfault+0x4f
#5 0xffffffff8108efb6 at trap+0x286
#6 0xffffffff81066f38 at calltrap+0x8
#7 0xffffffff8255e672 at zio_ddt_free+0x52
#8 0xffffffff8255ba2c at zio_execute+0xac
#9 0xffffffff80c1cee4 at taskqueue_run_locked+0x144
#10 0xffffffff80c1e2d6 at taskqueue_thread_loop+0xb6
#11 0xffffffff80b8044e at fork_exit+0x7e
#12 0xffffffff81067f6e at fork_trampoline+0xe
Uptime: 1h13m32s
Dumping 8797 out of 131020 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
Dump complete
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...
cpu_reset: Restarting BSP
cpu_reset_proxy: Stopped CPU 7

This is on 12.2-RELEASE now and debug symbols are in place.

I will be archiving the artefacts and will get back with proper tracebacks shortly.

Let me know what you need and how you would like to receive it.

Greetings
Christian
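
As a rough illustration of the dedup remark in the first message (a 15TB pool on a machine with 128GB of RAM), the sketch below estimates how much memory a dedup table (DDT) might occupy. Two inputs are assumptions and do not come from this thread: the figure of about 320 bytes per in-core DDT entry is a commonly quoted rule of thumb, and the average block sizes are picked only for illustration. The helper name `ddt_ram_bytes` is hypothetical, and "15TB" is read here as 15 TiB.

```python
TIB = 1024 ** 4
GIB = 1024 ** 3


def ddt_ram_bytes(pool_bytes, avg_block_bytes, bytes_per_entry=320):
    """Worst-case in-core DDT size: one entry per block, all blocks unique.

    bytes_per_entry=320 is an assumed rule-of-thumb value, not a measurement.
    """
    entries = pool_bytes // avg_block_bytes
    return entries * bytes_per_entry


pool = 15 * TIB  # pool size from the message, interpreted as TiB (assumption)

# Average block sizes below are illustrative assumptions only.
for block_kib in (128, 64, 16):
    ram = ddt_ram_bytes(pool, block_kib * 1024)
    print(f"{block_kib:>3} KiB average block: ~{ram / GIB:.1f} GiB of DDT")
```

Under these assumptions the estimate grows quickly as the average block size shrinks, which is consistent with the caution about enabling dedup on this machine. The real figure depends on the actual block-size distribution and on how many blocks are duplicates, so this is only an order-of-magnitude guide.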