From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 197164] Zpool with L2ARC hangs whole system
Date: Thu, 29 Jan 2015 07:38:50 +0000
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=197164

    Bug ID: 197164
   Summary: Zpool with L2ARC hangs whole system
   Product: Base System
   Version: 10.1-RELEASE
  Hardware: Any
        OS: Any
    Status: New
  Severity: Affects Many People
  Priority: ---
 Component: kern
  Assignee: freebsd-bugs@FreeBSD.org
  Reporter: Karli.Sjoberg@slu.se

Created attachment 152328
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=152328&action=edit
Graphite - System Overview

Hi!

At present we have 4 ZFS storage systems that _were_ configured with SSD disks as cache, and after different periods of time, depending on amount of RAM and load, they go unresponsive.

Initially you can ping them and change VTs at the console, but nothing prints when you type, all services are gone, etc. After a while they stop responding to ping as well. After a reboot all is good again for a while until the process repeats itself.

Now I have found out exactly what's causing it: L2ARC! Just removing the cache drive(s), they run rock-solid again, but performance is severely degraded. The caching in ZFS really does wonders to offload the "slow" rotating disks and we'd very much like to be able to re-add them to our pools again.

This is similar to:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=187594

But this might be another issue. And since the OP couldn't experiment with the systems being in production, the case couldn't really come any further, but this one can! We have a virtual machine set up exactly like our "real" storage systems, but miniaturized in performance and capacity. It's upgraded to 10.1-RELEASE with these patches applied:
https://svnweb.freebsd.org/base?view=revision&revision=272875

With a script that loops copying files from my desktop to the VM and then back again, I have been able to reliably hang the system just by re-adding the cache to the pool; take a look at the attached screenshot.
It shows the system overview of this virtual storage server, where I was running my script overnight and added the cache to the pool at around 9 AM. See what happens with the ARC? That's the problem. And then it went unresponsive around 3-4 PM.

Thanks in advance!

Karli Sjöberg