Date: Thu, 29 Oct 2020 16:48:36 +0100 (CET)
From: Christian Kratzer <ck-lists@cksoft.de>
To: Andriy Gapon <avg@FreeBSD.org>
Cc: freebsd-fs@freebsd.org
Subject: Re: 12.1-RELEASE-p7 panic in zio_free_issue_4_6
Message-ID: <9ca55ded-1f91-5118-917e-3266946020@cksoft.de>
In-Reply-To: <24b9cc11-0681-2f17-b634-d68878bc67ac@FreeBSD.org>
References: <a6a55583-f7b8-ee59-e3c7-4d1fcc5b1de8@cksoft.de>
 <474d086c-5a36-0db5-974f-ccfa0acbd871@FreeBSD.org>
 <d66296be-a1b9-b2a1-c2ec-164a7ba178@cksoft.de>
 <24b9cc11-0681-2f17-b634-d68878bc67ac@FreeBSD.org>
Hi,
On Thu, 29 Oct 2020, Andriy Gapon wrote:
>> I will keep this pool around for a couple of days and will try to get a crash dump
>> from the system. After that I will have delete and recreate the pool and just
>> wait for backups to roll back in.
>
>
> Okay, let's see if we can get a vmcore.
> Otherwise, this is just a guess-work on my part.
> The problem could be very different from my initial impression.
So I added a swap device, which for some reason was missing, and was about to induce
the crash again.
But to get kernel debug symbols that match the running kernel, I first upgraded the system
to 12.2-RELEASE.
With everything in place I was able to import the pool in read-only mode using
    zpool import -m -o readonly=on zpfra2
without the system crashing. It sits there happily now:
  pool: zpfra2
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: scrub repaired 0 in 0 days 01:31:43 with 0 errors on Fri Jul 17 18:55:35 2020
config:

        NAME                      STATE      READ WRITE CKSUM
        zpfra2                    DEGRADED      0     0     0
          raidz2-0                ONLINE        0     0     0
            gpt/zfsfra1d02.eli    ONLINE        0     0     0
            gpt/zfsfra1d03.eli    ONLINE        0     0     0
            gpt/zfsfra1d04.eli    ONLINE        0     0     0
            gpt/zfsfra1d05.eli    ONLINE        0     0     0
            gpt/zfsfra1d06.eli    ONLINE        0     0     0
            gpt/zfsfra1d07.eli    ONLINE        0     0     0
            gpt/zfsfra1d08.eli    ONLINE        0     0     0
          raidz2-1                ONLINE        0     0     0
            gpt/zfsfra1d10.eli    ONLINE        0     0     0
            gpt/zfsfra1d11.eli    ONLINE        0     0     0
            gpt/zfsfra1d12.eli    ONLINE        0     0     0
            gpt/zfsfra1d13.eli    ONLINE        0     0     0
            gpt/zfsfra1d14.eli    ONLINE        0     0     0
            gpt/zfsfra1d15.eli    ONLINE        0     0     0
            gpt/zfsfra1d16.eli    ONLINE        0     0     0
        logs
          mirror-2                UNAVAIL       0     0     0
            3980362133776709100   UNAVAIL       0     0     0  was /dev/gpt/log1d0
            6670731949941654186   UNAVAIL       0     0     0  was /dev/gpt/log1d1

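For anyone repeating this: the import only works with the log mirror gone because -m lets zpool import proceed with a missing log device, and -o readonly=on keeps ZFS from writing to the pool. Below is a small sketch for listing which vdevs a status report shows as UNAVAIL. The helper name unavail_devices is made up for this mail, and it assumes the usual "NAME STATE READ WRITE CKSUM" column order.

```shell
#!/bin/sh
# Read-only import that tolerates a missing log device (needs root, so it is
# shown here as a comment only):
#   zpool import -m -o readonly=on zpfra2

# Hypothetical helper: print the name of every vdev whose STATE column is
# UNAVAIL. Assumes the device name is field 1 and the state is field 2.
unavail_devices() {
    awk '$2 == "UNAVAIL" { print $1 }'
}

# Demo on a fragment of the status output above.
unavail_devices <<'EOF'
zpfra2 DEGRADED 0 0 0
raidz2-0 ONLINE 0 0 0
logs
mirror-2 UNAVAIL 0 0 0
3980362133776709100 UNAVAIL 0 0 0 was /dev/gpt/log1d0
6670731949941654186 UNAVAIL 0 0 0 was /dev/gpt/log1d1
EOF
```

On a live system the same filter could be fed with "zpool status zpfra2 | unavail_devices".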
The pool is degraded because I had already removed the log devices.
I am pulling the data off the pool as we speak and will then recreate it.
If this happens again I will be prepared to capture a crash dump.
I will also not enable dedup on this 15 TB pool as long as the machine has only 128 GB of RAM.
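As a rough back-of-the-envelope check on the dedup decision, here is a sketch of how much memory the dedup table (DDT) might need. It rests on assumptions that are not measurements from this pool: the commonly quoted figure of about 320 bytes per in-core DDT entry, a pool filled entirely with unique blocks, and a guessed average block size. The function name ddt_gib is invented for illustration.

```shell
#!/bin/sh
# Hypothetical estimate of in-core dedup table size in GiB.
#   $1 = amount of unique data in TiB
#   $2 = average block size in KiB (a guess; depends on recordsize and data)
#   $3 = bytes per DDT entry (defaults to 320, a rule of thumb, not a measurement)
ddt_gib() {
    data_tib=$1
    block_kib=$2
    entry_bytes=${3:-320}
    # TiB -> KiB is a factor of 1024^3; divide by block size for the block count.
    blocks=$(( data_tib * 1024 * 1024 * 1024 / block_kib ))
    # Bytes -> GiB, using integer arithmetic (rounds down).
    echo $(( blocks * entry_bytes / 1024 / 1024 / 1024 ))
}

ddt_gib 15 64     # prints 75
ddt_gib 15 128    # prints 37
```

Under these assumptions a full 15 TiB of 64 KiB blocks would want roughly 75 GiB for the table alone, which is a large share of 128 GB. This says nothing about the cause of the panic.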
Greetings
Christian
--
Christian Kratzer CK Software GmbH
Email: ck@cksoft.de Wildberger Weg 24/2
Phone: +49 7032 893 997 - 0 D-71126 Gaeufelden
Fax: +49 7032 893 997 - 9 HRB 245288, Amtsgericht Stuttgart
Mobile: +49 171 1947 843 Geschaeftsfuehrer: Christian Kratzer
Web: http://www.cksoft.de/
From owner-freebsd-fs@freebsd.org Thu Oct 29 16:44:35 2020
Date: Thu, 29 Oct 2020 17:44:30 +0100 (CET)
From: Christian Kratzer <ck-lists@cksoft.de>
Reply-To: Christian Kratzer <ck@cksoft.de>
To: Andriy Gapon <avg@FreeBSD.org>
cc: freebsd-fs@freebsd.org
Subject: Re: 12.1-RELEASE-p7 panic in zio_free_issue_4_6
In-Reply-To: <24b9cc11-0681-2f17-b634-d68878bc67ac@FreeBSD.org>
Message-ID: <d4104c66-7683-1492-47ff-f6eb3f32eb7@cksoft.de>
References: <a6a55583-f7b8-ee59-e3c7-4d1fcc5b1de8@cksoft.de>
<474d086c-5a36-0db5-974f-ccfa0acbd871@FreeBSD.org>
<d66296be-a1b9-b2a1-c2ec-164a7ba178@cksoft.de>
<24b9cc11-0681-2f17-b634-d68878bc67ac@FreeBSD.org>
Hi,
On Thu, 29 Oct 2020, Andriy Gapon wrote:
> Okay, let's see if we can get a vmcore.
> Otherwise, this is just a guess-work on my part.
> The problem could be very different from my initial impression.
I got the data off the pool in read-only mode and mounted it read-write again:
Fatal trap 12: page fault while in kernel mode
cpuid = 7; apic id = 11
fault virtual address = 0x30
fault code = supervisor write data, page not present
instruction pointer = 0x20:0xffffffff824b89e4
stack pointer = 0x28:0xfffffe012d6c7be0
frame pointer = 0x28:0xfffffe012d6c7be0
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 3185 (zio_free_issue_6_3)
trap number = 12
panic: page fault
cpuid = 7
time = 1603989616
KDB: stack backtrace:
#0 0xffffffff80c0a8f5 at kdb_backtrace+0x65
#1 0xffffffff80bbeb1b at vpanic+0x17b
#2 0xffffffff80bbe993 at panic+0x43
#3 0xffffffff8108f911 at trap_fatal+0x391
#4 0xffffffff8108f96f at trap_pfault+0x4f
#5 0xffffffff8108efb6 at trap+0x286
#6 0xffffffff81066f38 at calltrap+0x8
#7 0xffffffff8255e672 at zio_ddt_free+0x52
#8 0xffffffff8255ba2c at zio_execute+0xac
#9 0xffffffff80c1cee4 at taskqueue_run_locked+0x144
#10 0xffffffff80c1e2d6 at taskqueue_thread_loop+0xb6
#11 0xffffffff80b8044e at fork_exit+0x7e
#12 0xffffffff81067f6e at fork_trampoline+0xe
Uptime: 1h13m32s
Dumping 8797 out of 131020 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
Dump complete
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...
cpu_reset: Restarting BSP
cpu_reset_proxy: Stopped CPU 7
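To pull the interesting frame out of a console backtrace like the one above without reading it by eye, something along these lines could be used. It is only a sketch: the function names kdb_frames and faulting_frame are made up, and it assumes the "#N address at function+offset" line format shown here, with the frame right after calltrap being the code that took the fault.

```shell
#!/bin/sh
# Hypothetical helper: print "function+offset" for every KDB backtrace line
# read from stdin. Assumes lines of the form "#N 0xADDR at func+0xOFF".
kdb_frames() {
    awk '/^#[0-9]+ 0x[0-9a-f]+ at / { print $4 }'
}

# Hypothetical helper: print the first frame after calltrap, i.e. the frame
# that was executing when the trap was taken (an assumption about the layout).
faulting_frame() {
    kdb_frames | awk 'seen { print; exit } /^calltrap[+]/ { seen = 1 }'
}

# Demo on a few lines of the trace above.
faulting_frame <<'EOF'
#5 0xffffffff8108efb6 at trap+0x286
#6 0xffffffff81066f38 at calltrap+0x8
#7 0xffffffff8255e672 at zio_ddt_free+0x52
#8 0xffffffff8255ba2c at zio_execute+0xac
EOF
```

For this trace the demo prints zio_ddt_free+0x52, matching frame #7.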
This is on 12.2-RELEASE now and debug symbols are in place.
I will be archiving the artefacts and will get back with proper tracebacks shortly.
Let me know what you need and how you would like to receive it.
Greetings
Christian
--
Christian Kratzer CK Software GmbH
Email: ck@cksoft.de Wildberger Weg 24/2
Phone: +49 7032 893 997 - 0 D-71126 Gaeufelden
Fax: +49 7032 893 997 - 9 HRB 245288, Amtsgericht Stuttgart
Mobile: +49 171 1947 843 Geschaeftsfuehrer: Christian Kratzer
Web: http://www.cksoft.de/
