Date: Mon, 11 Jun 2018 14:48:11 +0200
From: Willem Jan Withagen <wjw@digiware.nl>
To: Stefan Wendler <stefan.wendler@tngtech.com>
Cc: "stable@freebsd.org" <stable@FreeBSD.org>
Subject: Re: Continuous crashing ZFS server
Message-ID: <34c4a21b-9555-3b34-14a3-94cdacc22179@digiware.nl>
In-Reply-To: <25b13f67-76fd-621d-22b8-f1efdcc4ae0a@tngtech.com>
References: <f9ecab27-5201-4b60-ea75-e68dd5ffb44c@digiware.nl> <17446f39-97a1-8603-11a0-32176e8cb833@FreeBSD.org> <d75b7d81-67c8-d473-7652-c212700ef0d1@digiware.nl> <100ea6d0-5cf4-1a00-0e3a-dfad6175df6c@FreeBSD.org> <17ee24dd-93e5-dede-d7aa-90239c72c287@digiware.nl> <25b13f67-76fd-621d-22b8-f1efdcc4ae0a@tngtech.com>

On 11-6-2018 14:35, Stefan Wendler wrote:
> Do you use L2ARC/ZIL disks? I had a similar problem that turned out to
> be a broken caching SSD. Scrubbing didn't help a bit because it reported
> that data was okay. And SMART was fine as well. Fortunately I could
> still send/recv snapshots to a backup disk but wasn't able to replace
> the SSDs without a pool restore. ZFS just wouldn't sync some older ZIL
> data to disk and also wouldn't release the SSDs from the pool. Did you
> also check the logs for entries that look like broken RAM?

That was one of the things I looked for, bad things in log files.
But the server does not seem to have any hardware problems.

I'll dive a bit deeper into my ZIL SSDs.

Thanx,
--WjW

> Cheers,
> Stefan
>
> On 06/11/2018 01:29 PM, Willem Jan Withagen wrote:
>> On 11-6-2018 12:53, Andriy Gapon wrote:
>>> On 11/06/2018 13:26, Willem Jan Withagen wrote:
>>>> On 11/06/2018 12:13, Andriy Gapon wrote:
>>>>> On 08/06/2018 13:02, Willem Jan Withagen wrote:
>>>>>> My file server is crashing about every 15 minutes at the moment.
>>>>>> The panic looks like:
>>>>>>
>>>>>> Jun 8 11:48:43 zfs kernel: panic: Solaris(panic): zfs: allocating
>>>>>> allocated segment(offset=12922221670400 size=24576)
>>>>>> Jun 8 11:48:43 zfs kernel:
>>>>>> Jun 8 11:48:43 zfs kernel: cpuid = 1
>>>>>> Jun 8 11:48:43 zfs kernel: KDB: stack backtrace:
>>>>>> Jun 8 11:48:43 zfs kernel: #0 0xffffffff80aada57 at kdb_backtrace+0x67
>>>>>> Jun 8 11:48:43 zfs kernel: #1 0xffffffff80a6bb36 at vpanic+0x186
>>>>>> Jun 8 11:48:43 zfs kernel: #2 0xffffffff80a6b9a3 at panic+0x43
>>>>>> Jun 8 11:48:43 zfs kernel: #3 0xffffffff82488192 at vcmn_err+0xc2
>>>>>> Jun 8 11:48:43 zfs kernel: #4 0xffffffff821f73ba at zfs_panic_recover+0x5a
>>>>>> Jun 8 11:48:43 zfs kernel: #5 0xffffffff821dff8f at range_tree_add+0x20f
>>>>>> Jun 8 11:48:43 zfs kernel: #6 0xffffffff821deb06 at metaslab_free_dva+0x276
>>>>>> Jun 8 11:48:43 zfs kernel: #7 0xffffffff821debc1 at metaslab_free+0x91
>>>>>> Jun 8 11:48:43 zfs kernel: #8 0xffffffff8222296a at zio_dva_free+0x1a
>>>>>> Jun 8 11:48:43 zfs kernel: #9 0xffffffff8221f6cc at zio_execute+0xac
>>>>>> Jun 8 11:48:43 zfs kernel: #10 0xffffffff80abe827 at
>>>>>> taskqueue_run_locked+0x127
>>>>>> Jun 8 11:48:43 zfs kernel: #11 0xffffffff80abf9c8 at
>>>>>> taskqueue_thread_loop+0xc8
>>>>>> Jun 8 11:48:43 zfs kernel: #12 0xffffffff80a2f7d5 at fork_exit+0x85
>>>>>> Jun 8 11:48:43 zfs kernel: #13 0xffffffff80ec4abe at fork_trampoline+0xe
>>>>>> Jun 8 11:48:43 zfs kernel: Uptime: 9m7s
>>>>>>
>>>>>> Maybe a known bug?
>>>>>> Is there anything I can do about this?
>>>>>> Any debugging needed?
>>>>>
>>>>> Sorry to inform you but your on-disk data got corrupted.
>>>>> The most straightforward thing you can do is try to save data from the pool in
>>>>> readonly mode.
>>>>
>>>> Hi Andriy,
>>>>
>>>> Auch, that is a first in 12 years of using ZFS. "Fortunately" it was of a test
>>>> ZVOL->iSCSI->Win10 disk on which I spool my CAMs.
>>>>
>>>> Removing the ZVOL actually fixed the rebooting, but now the question is:
>>>> Is the remainder of the zpools on the same disks in danger?
>>>
>>> You can try to check with zdb -b on an idle (better exported) pool. And zpool
>>> scrub.
>>
>> If scrub says things are oke, I can start breathing again?
>> exporting the pool is something for the small hours.
>>
>> Thanx,
>> --WjW
>>
>>
>> _______________________________________________
>> freebsd-stable@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-stable
>> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
>>
>
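
For readers following the thread, the checks Andriy suggests (zdb -b on an exported pool, then zpool scrub) and the read-only import he mentions could be scripted roughly as below. This is only a sketch under assumptions: the pool name "tank", the POOL and DRY_RUN variables, and the helper functions run, check_pool and import_readonly are hypothetical and do not come from the thread. By default the script only prints the commands instead of executing them.

```shell
#!/bin/sh
# Hypothetical pool name; replace with the real one.
POOL="${POOL:-tank}"
# Print commands by default; set DRY_RUN=0 to actually run them (as root).
DRY_RUN="${DRY_RUN:-1}"

# Echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Check block accounting on an exported (idle) pool, then scrub it.
check_pool() {
    run zpool export "$POOL"       # make sure nothing is writing to the pool
    run zdb -e -b "$POOL"          # -e: exported pool; -b: traverse blocks and check accounting
    run zpool import "$POOL"
    run zpool scrub "$POOL"        # verify checksums of all data in the pool
    run zpool status -v "$POOL"    # shows scrub progress and any errors found
}

# Import read-only to copy data off a damaged pool without further writes.
import_readonly() {
    run zpool import -o readonly=on "$POOL"
}

check_pool
```

Note that a scrub verifies checksums only; as Stefan's message points out, it reported his data as okay despite a broken SSD, so a clean scrub is not by itself proof that the space maps are consistent.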