From: Ben RUBSON <ben.rubson@gmail.com>
Subject: Re: I/O to pool appears to be hung, panic !
Date: Mon, 3 Jul 2017 11:07:06 +0200
To: Freebsd fs <freebsd-fs@freebsd.org>, freebsd-scsi@freebsd.org

> On 29 Jun 2017, at 15:36, Ben RUBSON wrote:
> 
>> On 29 Jun 2017, at 14:43, Fabian Keil wrote:
> 
> Thank you for your feedback Fabian.
> 
>> Ben RUBSON wrote:
>> 
>>> One of my servers did a kernel panic last night, giving the following message:
>>> panic: I/O to pool 'home' appears to be hung on vdev guid 122... at '/dev/label/G23iscsi'.
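
(For reference, the delay before this panic fires is governed by vfs.zfs.deadman_synctime_ms; on a stock FreeBSD 11 system it should read as follows, 1000000 ms being roughly the 16 minutes observed below:

# sysctl vfs.zfs.deadman_synctime_ms
vfs.zfs.deadman_synctime_ms: 1000000
)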
>> [...]
>>> Here are some numbers regarding this disk, taken from the server hosting the pool:
>>> (unfortunately not from the iSCSI target server)
>>> https://s23.postimg.org/zd8jy9xaj/busydisk.png
>>> 
>>> We clearly see that the disk suddenly became 100% busy while the CPU was almost idle.
>>> We also clearly see that 5 minutes later (02:09) the disk seemed to come back, but became 100% busy again, and that 16 minutes later (the default vfs.zfs.deadman_synctime_ms) the panic occurred.
>>> No error message at all on either server.
>> [...]
>>> The only log I have is the following stacktrace, taken from the server console:
>>> panic: I/O to pool 'home' appears to be hung on vdev guid 122... at '/dev/label/G23iscsi'.
>>> cpuid = 0
>>> KDB: stack backtrace:
>>> #0 0xffffffff80b240f7 at kdb_backtrace+0x67
>>> #1 0xffffffff80ad9462 at vpanic+0x182
>>> #2 0xffffffff80ad92d3 at panic+0x43
>>> #3 0xffffffff82238fa7 at vdev_deadman+0x127
>>> #4 0xffffffff82238ec0 at vdev_deadman+0x40
>>> #5 0xffffffff82238ec0 at vdev_deadman+0x40
>>> #6 0xffffffff8222d0a6 at spa_deadman+0x86
>>> #7 0xffffffff80af32da at softclock_call_cc+0x18a
>>> #8 0xffffffff80af3854 at softclock+0x94
>>> #9 0xffffffff80a9348f at intr_event_execute_handlers+0x20f
>>> #10 0xffffffff80a936f6 at ithread_loop+0xc6
>>> #11 0xffffffff80a900d5 at fork_exit+0x85
>>> #12 0xffffffff80f846fe at fork_trampoline+0xe
>>> Uptime: 92d2h47m6s
>>> 
>>> I would have been pleased to make a dump available.
>>> However, despite my (correct?) configuration, the server did not dump:
>>> (nevertheless, "sysctl debug.kdb.panic=1" does make it dump)
>>> # grep ^dump /boot/loader.conf /etc/rc.conf
>>> /boot/loader.conf:dumpdev="/dev/mirror/swap"
>>> /etc/rc.conf:dumpdev="AUTO"
>> 
>> You may want to look at the NOTES section in gmirror(8).
> 
> Yes, I should already be OK (prefer balance algorithm set).
> 
>>> I use the default kernel, with a rebuilt ZFS module:
>>> # uname -v
>>> FreeBSD 11.0-RELEASE-p8 #0: Wed Feb 22 06:12:04 UTC 2017 root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC
>>> 
>>> I use the following iSCSI configuration, which disconnects the disks "as soon as" they become unavailable:
>>> kern.iscsi.ping_timeout=5
>>> kern.iscsi.fail_on_disconnection=1
>>> kern.iscsi.iscsid_timeout=5
>>> 
>>> I therefore think the disk was at least correctly reachable during these 20 busy minutes.
>>> 
>>> So, any idea why I could have faced this issue?
>> 
>> Is it possible that the system was under memory pressure?
> 
> No, I don't think it was:
> https://s1.postimg.org/uvsebpyyn/busydisk2.png
> More than 2GB of available memory.
> Swap not used (624kB).
> ARC behaviour seems correct (anon increases because ZFS can't actually write, I think).
> Regarding the pool itself, it was receiving data at 6MB/s, sending around 30kB blocks to the disks.
> When the disk went busy, throughput fell to a few kB, with 128kB blocks.
> 
>> geli's use of malloc() is known to cause deadlocks under memory pressure:
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209759
>> 
>> Given that gmirror uses malloc() as well, it probably has the same issue.
> 
> I don't use geli, so I should not face this issue.
> 
>>> I would have thought ZFS would have taken the busy device offline instead of raising a panic.
>>> Perhaps it is already possible to make ZFS behave like this?
>> 
>> There's a tunable for this: vfs.zfs.deadman_enabled.
>> If the panic is just a symptom of the deadlock it's unlikely
>> to help though.
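
For anyone wanting to try it: a minimal sketch, untested on my side, and assuming the OID is a boot-time tunable as on stock 11.0, would be to add this to /boot/loader.conf and reboot:

vfs.zfs.deadman_enabled=0

The pool would then simply hang on the stalled vdev instead of panicking the machine.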
> 
> I think this tunable, if disabled, would have prevented the server from raising a panic:
> # sysctl -d vfs.zfs.deadman_enabled
> vfs.zfs.deadman_enabled: Kernel panic on stalled ZFS I/O
> # sysctl vfs.zfs.deadman_enabled
> vfs.zfs.deadman_enabled: 1
> 
> But I am not sure how the system would have behaved then...
> (busy disk miraculously back to normal status, memory pressure due to anon increasing...)

I therefore think it would be nice, once vfs.zfs.deadman_synctime_ms has expired, to be able to take the busy device offline instead of raising a panic.
Currently, disabling the deadman will avoid the panic but will leave the device slowing down the pool.

I still have not found the root cause of this issue, and I am not sure I will; it is quite difficult with only a stacktrace and some performance graphs :/

Thank you again,

Ben
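
P.S. Until ZFS can offline a stalled vdev itself, a manual workaround sketch, assuming the pool has enough redundancy left and the slow disk has been identified, would be:

# zpool offline home label/G23iscsi
# (fix or replace the disk / iSCSI target, then)
# zpool online home label/G23iscsi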