Date:      Wed, 28 Oct 2020 14:41:40 +0100 (CET)
From:      Christian Kratzer <ck-lists@cksoft.de>
To:        freebsd-fs@freebsd.org
Subject:   12.1-RELEASE-p7 panic in zio_free_issue_4_6
Message-ID:  <a6a55583-f7b8-ee59-e3c7-4d1fcc5b1de8@cksoft.de>

Hi,

One of my servers running 12.1-RELEASE-p7 started crashing with the following panic:

Fatal trap 12: page fault while in kernel mode
cpuid = 19; apic id = 31
fault virtual address   = 0x30
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff826877f4
stack pointer           = 0x28:0xfffffe011cefeaa0
frame pointer           = 0x28:0xfffffe011cefeaa0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                         = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (zio_free_issue_2_3)
trap number             = 12
panic: page fault
cpuid = 19
time = 1603797129
KDB: stack backtrace:
#0 0xffffffff80c1d2f7 at kdb_backtrace+0x67
#1 0xffffffff80bd062d at vpanic+0x19d
#2 0xffffffff80bd0483 at panic+0x43
#3 0xffffffff810a8dcc at trap_fatal+0x39c
#4 0xffffffff810a8e19 at trap_pfault+0x49
#5 0xffffffff810a840f at trap+0x29f
#6 0xffffffff81081c9c at calltrap+0x8
#7 0xffffffff8272a903 at zio_ddt_free+0x53
#8 0xffffffff82727b7c at zio_execute+0xac
#9 0xffffffff80c2fad4 at taskqueue_run_locked+0x154
#10 0xffffffff80c30e08 at taskqueue_thread_loop+0x98
#11 0xffffffff80b90c43 at fork_exit+0x83
#12 0xffffffff81082cde at fork_trampoline+0xe
Uptime: 1m12s
Automatic reboot in 15 seconds - press a key on the console to abort
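
In a backtrace like the one above, the frame listed directly after calltrap is the function that was executing when the fault happened. A minimal sketch for pulling that frame out of a saved backtrace, assuming the listing is fed on stdin in the same "#N address at symbol+offset" layout (the function name faulting_frame is hypothetical, not an existing tool):

```shell
#!/bin/sh
# Hypothetical helper: print the symbol+offset of the frame that
# follows calltrap in a "KDB: stack backtrace" listing read from stdin.
faulting_frame() {
    awk '
        # Once calltrap has been seen, the next frame line is the one we want.
        found && /^#[0-9]+ / { print $4; exit }
        # Remember that we have passed the calltrap frame.
        /^#[0-9]+ .* at calltrap\+/ { found = 1 }
    '
}

# Usage with frames #5 to #8 copied from the panic above.
faulting_frame <<'EOF'
#5 0xffffffff810a840f at trap+0x29f
#6 0xffffffff81081c9c at calltrap+0x8
#7 0xffffffff8272a903 at zio_ddt_free+0x53
#8 0xffffffff82727b7c at zio_execute+0xac
EOF
```

For the panic above this prints zio_ddt_free+0x53, which is frame #7.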


I traced things down to importing one of the zpools.

This machine has 3 zpools.

The first two are ok:

 	  pool: zroot
 	 state: ONLINE
 	  scan: scrub repaired 0 in 0 days 00:00:22 with 0 errors on Fri Jul 17 17:24:17 2020
 	config:

 		NAME           STATE     READ WRITE CKSUM
 		zroot          ONLINE       0     0     0
 		  mirror-0     ONLINE       0     0     0
 		    gpt/root0  ONLINE       0     0     0
 		    gpt/root1  ONLINE       0     0     0


 	root@zfsfra1:/var/crash # zpool status -v
 	  pool: zpfra1
 	 state: ONLINE
 	  scan: scrub repaired 0 in 0 days 00:48:16 with 0 errors on Fri Jul 17 18:12:04 2020
 	config:

 		NAME                    STATE     READ WRITE CKSUM
 		zpfra1                  ONLINE       0     0     0
 		  mirror-0              ONLINE       0     0     0
 		    gpt/zfsfra1d01.eli  ONLINE       0     0     0
 		    gpt/zfsfra1d09.eli  ONLINE       0     0     0
 		logs
 		  mirror-1              ONLINE       0     0     0
 		    gpt/log0d0          ONLINE       0     0     0
 		    gpt/log0d1          ONLINE       0     0     0


The last one consists of two raidz2 vdevs of 7 disks each. I have removed the geli keys for the
disks so that it currently cannot be imported:

 	  pool: zpfra2
 	 state: UNAVAIL
 	status: One or more devices could not be opened.  There are insufficient
 		replicas for the pool to continue functioning.
 	action: Attach the missing device and online it using 'zpool online'.
 	   see: http://illumos.org/msg/ZFS-8000-3C
 	  scan: none requested
 	config:

 		NAME                      STATE     READ WRITE CKSUM
 		zpfra2                    UNAVAIL      0     0     0
 		  raidz2-0                UNAVAIL      0     0     0
 		    7179798941412063472   UNAVAIL      0     0     0  was /dev/gpt/zfsfra1d02.eli
 		    17119114611556833764  UNAVAIL      0     0     0  was /dev/gpt/zfsfra1d03.eli
 		    8321725234410067709   UNAVAIL      0     0     0  was /dev/gpt/zfsfra1d04.eli
 		    7897191132634569755   UNAVAIL      0     0     0  was /dev/gpt/zfsfra1d05.eli
 		    16873755985119583929  UNAVAIL      0     0     0  was /dev/gpt/zfsfra1d06.eli
 		    9644713294010671122   UNAVAIL      0     0     0  was /dev/gpt/zfsfra1d07.eli
 		    1480177385910791788   UNAVAIL      0     0     0  was /dev/gpt/zfsfra1d08.eli
 		  raidz2-1                UNAVAIL      0     0     0
 		    1498696212334632055   UNAVAIL      0     0     0  was /dev/gpt/zfsfra1d10.eli
 		    5551216295452602020   UNAVAIL      0     0     0  was /dev/gpt/zfsfra1d11.eli
 		    17197173774607757750  UNAVAIL      0     0     0  was /dev/gpt/zfsfra1d12.eli
 		    12543220242988729823  UNAVAIL      0     0     0  was /dev/gpt/zfsfra1d13.eli
 		    711115555895092704    UNAVAIL      0     0     0  was /dev/gpt/zfsfra1d14.eli
 		    15806058868994893097  UNAVAIL      0     0     0  was /dev/gpt/zfsfra1d15.eli
 		    7273134084268794449   UNAVAIL      0     0     0  was /dev/gpt/zfsfra1d16.eli
 		logs
 		  mirror-2                ONLINE       0     0     0
 		    gpt/log1d0            ONLINE       0     0     0
 		    gpt/log1d1            ONLINE       0     0     0

If I put the keys back, the system crashes with the above panic after importing the pool.

I also tried importing the pool read-only, but it crashed as well.

Any ideas how to get this back into a sane state?

Because the panic occurs in the zio_free_issue_2_3 thread, I suspect something inconsistent in the log devices.

How could I remove those log devices and force-import the pool?
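
As a rough illustration of what that could look like with the stock zpool commands: "zpool import -m" allows an import to proceed when log devices are missing, and "zpool remove" can detach a log vdev from an imported pool. The sketch below only prints the commands unless DRY_RUN=0 is set. The run() helper is hypothetical, the pool and vdev names are taken from the status output above, and whether this sequence avoids the panic is untested. Note that -m only matters if the log devices are actually absent at import time, and that discarding a log can lose recent transactions.

```shell
#!/bin/sh
# Names taken from the zpool status output above.
POOL=zpfra2
LOGVDEV=mirror-2

# Hypothetical helper: print each command, and execute it only when
# DRY_RUN=0 is set explicitly. The default is to print only.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# -m: permit the import even though log devices are missing.
# -N: import without mounting any file systems.
run zpool import -m -N "$POOL"

# Remove the mirrored log vdev by its top-level name.
run zpool remove "$POOL" "$LOGVDEV"
```

Run as-is, this prints the two zpool command lines prefixed with "+ " and touches nothing.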

Greetings
Christian

-- 
Christian Kratzer                   CK Software GmbH
Email:   ck@cksoft.de               Wildberger Weg 24/2
Phone:   +49 7032 893 997 - 0       D-71126 Gaeufelden
Fax:     +49 7032 893 997 - 9       HRB 245288, Amtsgericht Stuttgart
Mobile:  +49 171 1947 843           Geschaeftsfuehrer: Christian Kratzer
Web:     http://www.cksoft.de/


