From owner-freebsd-scsi@freebsd.org Mon Jul 3 15:27:25 2017
From: Edward Napierala
Date: Mon, 3 Jul 2017 16:27:23 +0100
Subject: Re: I/O to pool appears to be hung, panic !
To: Ben RUBSON
Cc: Freebsd fs, freebsd-scsi

2017-07-03 14:36 GMT+01:00 Ben RUBSON <ben.rubson@gmail.com>:
>
> On 03 Jul 2017, at 13:10, Edward Napierala wrote:
> >
> > 2017-07-03 10:07 GMT+01:00 Ben RUBSON <ben.rubson@gmail.com>:
> >
> > > On 29 Jun 2017, at 15:36, Ben RUBSON <ben.rubson@gmail.com> wrote:
> > >
> > >> On 29 Jun 2017, at 14:43, Fabian Keil wrote:
> > >
> > > Thank you for your feedback Fabian.
> > >
> > >> Ben RUBSON <ben.rubson@gmail.com> wrote:
> > >>
> > >>> One of my servers did a kernel panic last night, giving the following message:
> > >>> panic: I/O to pool 'home' appears to be hung on vdev guid 122... at '/dev/label/G23iscsi'.
> > >> [...]
> > >>> Here are some numbers regarding this disk, taken from the server hosting the pool:
> > >>> (unfortunately not from the iSCSI target server)
> > >>> https://s23.postimg.org/zd8jy9xaj/busydisk.png
> > >>>
> > >>> We clearly see that suddenly the disk became 100% busy, while the CPU was almost idle.
> >
> > We also clearly see that 5 minutes later (02:09) the disk seems to come back, but became 100% busy again,
> > and that 16 minutes later (the default vfs.zfs.deadman_synctime_ms), the panic occurred.
> >
> > >>> No error message at all on both servers.
> > >> [...]
> > >>> The only log I have is the following stacktrace taken from the server console:
> > >>> panic: I/O to pool 'home' appears to be hung on vdev guid 122... at '/dev/label/G23iscsi'.
> > >>> cpuid = 0
> > >>> KDB: stack backtrace:
> > >>> #0 0xffffffff80b240f7 at kdb_backtrace+0x67
> > >>> #1 0xffffffff80ad9462 at vpanic+0x182
> > >>> #2 0xffffffff80ad92d3 at panic+0x43
> > >>> #3 0xffffffff82238fa7 at vdev_deadman+0x127
> > >>> #4 0xffffffff82238ec0 at vdev_deadman+0x40
> > >>> #5 0xffffffff82238ec0 at vdev_deadman+0x40
> > >>> #6 0xffffffff8222d0a6 at spa_deadman+0x86
> > >>> #7 0xffffffff80af32da at softclock_call_cc+0x18a
> > >>> #8 0xffffffff80af3854 at softclock+0x94
> > >>> #9 0xffffffff80a9348f at intr_event_execute_handlers+0x20f
> > >>> #10 0xffffffff80a936f6 at ithread_loop+0xc6
> > >>> #11 0xffffffff80a900d5 at fork_exit+0x85
> > >>> #12 0xffffffff80f846fe at fork_trampoline+0xe
> > >>> Uptime: 92d2h47m6s
> > >>>
> > >>> I would have been pleased to make a dump available.
> > >>> However, despite my (correct?) configuration, the server did not dump:
> > >>> (nevertheless, "sysctl debug.kdb.panic=1" makes it dump)
> > >>> # grep ^dump /boot/loader.conf /etc/rc.conf
> > >>> /boot/loader.conf:dumpdev="/dev/mirror/swap"
> > >>> /etc/rc.conf:dumpdev="AUTO"
> > >>
> > >> You may want to look at the NOTES section in gmirror(8).
> > >
> > > Yes, I should already be OK (prefer algorithm set).
> > >
> > >>> I use the default kernel, with a rebuilt zfs module:
> > >>> # uname -v
> > >>> FreeBSD 11.0-RELEASE-p8 #0: Wed Feb 22 06:12:04 UTC 2017 root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC
> > >>>
> > >>> I use the following iSCSI configuration, which disconnects the disks "as soon as" they are unavailable:
> > >>> kern.iscsi.ping_timeout=5
> > >>> kern.iscsi.fail_on_disconnection=1
> > >>> kern.iscsi.iscsid_timeout=5
> > >>>
> > >>> I then think the disk was at least correctly reachable during these 20 busy minutes.
> > >>>
> > >>> So, any idea why I could have faced this issue?
> > >>
> > >> Is it possible that the system was under memory pressure?
> > >
> > > No, I don't think it was:
> > > https://s1.postimg.org/uvsebpyyn/busydisk2.png
> > > More than 2GB of available memory.
> > > Swap not used (624kB).
> > > ARC behaviour seems correct (anon increases because ZFS can't actually write, I think).
> > > Regarding the pool itself, it was receiving data at 6MB/s, sending around 30kB blocks to disks.
> > > When the disk went busy, throughput fell to some kB, with 128kB blocks.
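
For reference, the initiator-side pieces quoted above amount to only a few lines of configuration. The sketch below merely restates the values from this thread (the gmirror provider name "swap", the dump device and the iSCSI timeouts are taken from the quotes, not a tested recommendation), so adapt it to your own setup:

    # /boot/loader.conf -- dump to the mirrored swap provider
    dumpdev="/dev/mirror/swap"

    # /etc/rc.conf -- have rc.d/dumpon configure the dump device at boot
    dumpdev="AUTO"

    # /etc/sysctl.conf -- drop unresponsive iSCSI sessions quickly
    kern.iscsi.ping_timeout=5
    kern.iscsi.fail_on_disconnection=1
    kern.iscsi.iscsid_timeout=5

    # one-off, as root: the "prefer" balance algorithm lets a crash dump written
    # to the mirrored swap be read back from a single component (gmirror(8) NOTES)
    gmirror configure -b prefer swap
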
> > >
> > >> geli's use of malloc() is known to cause deadlocks under memory pressure:
> > >> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209759
> > >>
> > >> Given that gmirror uses malloc() as well, it probably has the same issue.
> > >
> > > I don't use geli, so I should not face this issue.
> > >
> > >>> I would have thought ZFS would have taken the busy device offline, instead of raising a panic.
> > >>> Perhaps it is already possible to make ZFS behave like this?
> > >>
> > >> There's a tunable for this: vfs.zfs.deadman_enabled.
> > >> If the panic is just a symptom of the deadlock, it's unlikely to help though.
> > >
> > > I think this tunable should have prevented the server from raising a panic:
> > > # sysctl -d vfs.zfs.deadman_enabled
> > > vfs.zfs.deadman_enabled: Kernel panic on stalled ZFS I/O
> > > # sysctl vfs.zfs.deadman_enabled
> > > vfs.zfs.deadman_enabled: 1
> > >
> > > But I'm not sure how it would have behaved then...
> > > (busy disk miraculously back to normal status, memory pressure due to anon increasing...)
> >
> > I then think it would be nice, once vfs.zfs.deadman_synctime_ms has expired,
> > to be able to take the busy device offline instead of raising a panic.
> > Currently, disabling deadman will avoid the panic but will let the device slow down the pool.
> >
> > I still have not found the root cause of this issue, and am not sure I will;
> > it's quite difficult with only a stacktrace and some performance graphs :/
> >
> > What exactly is the disk doing when that happens? What does "gstat" say? If the iSCSI
> > target is also FreeBSD, what does ctlstat say?
>
> As shown on this graph made with gstat numbers from the initiator:
> https://s23.postimg.org/zd8jy9xaj/busydisk.png
> The disk is continuously writing 3 MBps before the issue happens.
> When it occurs, response time increases to around 30 seconds (100% busy),
> and consequently disk throughput drops down to some kBps.
> CPU stays at an almost fully idle level.
>
> As shown here, no memory pressure:
> https://s1.postimg.org/uvsebpyyn/busydisk2.png
>
> At the end of the graphs' lines, the panic is raised.
>
> The iSCSI target is also FreeBSD; unfortunately ctlstat was not running when the issue occurred,
> so the numbers will be averages since system startup (102 days ago).
> I also do not have gstat numbers from this disk on the target side
> (to help find out whether it's a hardware issue, an iSCSI issue or something else).
> I will think about collecting these numbers if ever the issue occurs again.

It's kind of hard to say something definitive at this point, but I suspect
it's a problem at the target side. I got a report about something quite
similar some two years ago, and it turned out to be a problem with a disk
controller on the target.
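
Should the hang occur again, having periodic gstat samples on both sides, and ctlstat samples on the target, would make it much easier to tell a disk or controller problem from an iSCSI or initiator problem. A rough sh sketch of such a collector follows; the log directory, interval and flags are assumptions to check against gstat(8) and ctlstat(8) on the running version:

    #!/bin/sh
    # Periodically record GEOM and CTL statistics so a future hang can be
    # analysed after the fact.  Rough sketch; paths and intervals are arbitrary.
    LOGDIR=/var/log/iostats
    mkdir -p "$LOGDIR"
    while :; do
        ts=$(date +%Y%m%d-%H%M%S)
        # per-provider I/O load and latency (run on initiator and target)
        gstat -abd -I 5s > "$LOGDIR/gstat-$ts.log"
        # per-LUN CTL statistics (target side only; ignore errors elsewhere)
        ctlstat -c 1 > "$LOGDIR/ctlstat-$ts.log" 2>/dev/null
        sleep 55
    done
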