From owner-freebsd-fs@FreeBSD.ORG Sun Mar 29 02:56:20 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 169C3FCF; Sun, 29 Mar 2015 02:56:20 +0000 (UTC) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id 986EBF0E; Sun, 29 Mar 2015 02:56:19 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A2DRDACMaBdV/95baINcgygwXASDD8JKCoUqSQKBawEBAQEBAX1BAoNSAQEEAQEBCxUrFwkLGw4EBgICDRkCKQEJGA4GCAcEARwEiA4NsgyYZAEBAQEBBQEBAQEBAQEbgSGKCIQgAQYBARs0B4ItDC8SgTMFlE2DXYNCOoVSjQ0ihAoiMQd7AQgXIn8BAQE X-IronPort-AV: E=Sophos;i="5.11,486,1422939600"; d="scan'208";a="200302379" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 28 Mar 2015 22:56:17 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id ED295B3F77; Sat, 28 Mar 2015 22:56:17 -0400 (EDT) Date: Sat, 28 Mar 2015 22:56:17 -0400 (EDT) From: Rick Macklem To: Konstantin Belousov Message-ID: <69948517.7558875.1427597777962.JavaMail.root@uoguelph.ca> In-Reply-To: <20150328171315.GU2379@kib.kiev.ua> Subject: Re: MAXBSIZE increase MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.95.10] X-Mailer: Zimbra 7.2.6_GA_2926 (ZimbraWebClient - FF3.0 (Win)/7.2.6_GA_2926) Cc: freebsd-fs@FreeBSD.org, freebsd-hackers@freebsd.org, Alexander Motin X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 29 Mar 2015 02:56:20 -0000 Kostik wrote: > On Fri, Mar 27, 2015 at 10:57:05PM +0200, Alexander Motin wrote: > > Hi. > > > > Experimenting with NFS and ZFS I found an inter-operation issue: > > ZFS by > > default uses block of 128KB, while FreeBSD NFS (both client and > > server) > > is limited to 64KB requests by the value of MAXBSIZE. On file > > rewrite > > that limitation makes ZFS to do slow read-modify-write cycles for > > every > > write operation, instead of just writing the new data. Trivial > > iozone > > test show major difference between initial write and rewrite speeds > > because of this issue. > > > > Looking through the sources I've found and in r280347 fixed number > > of > > improper MAXBSIZE use cases in device drivers. After that I see no > > any > > reason why MAXBSIZE can not be increased to at least 128KB to match > > ZFS > > default (ZFS now supports block up to 1MB, but that is not default > > and > > so far rare). I've made a test build and also successfully created > > UFS > > file system with 128KB block -- not sure it is needed, but seems it > > survives this change well too. > > > > Is there anything I am missing, or it is safe to rise this limit > > now? > > This post is useless after the Bruce explanation, but I still want to > highlidht the most important point from that long story: > > increasing MAXBSIZE without tuning other buffer cache parameters > would dis-balance the buffer cache. Allowing bigger buffers > increases > fragmentation, while limiting the total number of buffers. Also, it > changes the tuning for runtime limits for amount of io in flight, see > hi/lo runningspace initialization. >From an NFS perspective, all it cares about is the maximum size of buffer cache block it can use. Maybe creating a separate constant that is specifically "max buffer cache block size", but does not define the maximum size of any file system's block would help? If the constant only defines maximum buffer cache block size, then it could be tuned based on architecture, so that amd64 could use much larger values for the buffer cache tunables. (As Bruce explained, i386 puts a very low limit on the buffer cache, due to KVM limitations.) Put another way, separate maximum buffer cache block from the maximum block size used by any on-disk file system. Other than KVM limits, I think the problems with increasing MAXBSIZE are because it is used as a maximum block size for file systems like UFS. Btw, since NFS already uses 64K buffers by default, there is already a dis-balanced buffer cache. Unfortunately, increasing BKVASIZE will make it allow even fewer buffers for i386. I don't know if buffer cache fragmentation has been causing anyone problems? Does anyone know if buffer cache fragmentation can cause any failure or will it just impact performance? (All I can see is that allocation of buffers > BKVASIZE can fragment the buffer cache's address space such that there might not be a contiguous area large enough for a buffer's allocation. I don't know what happens then?) I would like to see the NFS client be able to use 128K rsize/wsize. I would also like to see a larger buffer cache for machines like amd64 with a lot of RAM, so that wcommitsize (the size of write that the client can do asynchronously) can be much larger, too. (For i386, we probably have to live with a small buffer cache and maybe a 64K maximum buffer cache block size.) rick > _______________________________________________ > freebsd-hackers@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to > "freebsd-hackers-unsubscribe@freebsd.org" > From owner-freebsd-fs@FreeBSD.ORG Sun Mar 29 19:46:43 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 2BF45FCB for ; Sun, 29 Mar 2015 19:46:43 +0000 (UTC) Received: from chez.mckusick.com (chez.mckusick.com [IPv6:2001:5a8:4:7e72:4a5b:39ff:fe12:452]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id F3443E6E for ; Sun, 29 Mar 2015 19:46:42 +0000 (UTC) Received: from chez.mckusick.com (localhost [127.0.0.1]) by chez.mckusick.com (8.14.3/8.14.3) with ESMTP id t2TJkUMv054849; Sun, 29 Mar 2015 12:46:30 -0700 (PDT) (envelope-from mckusick@chez.mckusick.com) Message-Id: <201503291946.t2TJkUMv054849@chez.mckusick.com> To: Da Rock Subject: Re: Delete a directory, crash the system In-reply-to: <55172A18.70601@herveybayaustralia.com.au> Date: Sun, 29 Mar 2015 12:46:30 -0700 From: Kirk McKusick Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 29 Mar 2015 19:46:43 -0000 > Date: Sun, 29 Mar 2015 08:24:24 +1000 > From: Da Rock > To: Kirk McKusick > CC: Benjamin Kaduk , freebsd-fs@freebsd.org > Subject: Re: Delete a directory, crash the system > > On 03/29/15 08:02, Kirk McKusick wrote: > >> SU without journaling will maintain consistency. It is just that you >> will need to run fsck after a crash. That is the way FFS has been since >> it was written in 1982 and will allow you to recover from media errors >> which it appears your system is suffering from. SU+J is just a faster >> way of restarting but only works when you do not have media errors. > > I guess the point I'm driving at is that on a server this may be > an ok solution, but if you have workstations/desktops with users > who don't know how to do this properly, that is why the journalling > is an important feature. So its not just about faster restarts, but > a simple reboot/boot and everything is basically ok for them. Absent media errors, SU + fsck run at boot will always work without any intervention on the part of the users. When you run with SU, the default is to run fsck at every boot, so neither users nor administrators need to do anything other than hit the power-on button. > If there is any issue a system squawk at the sysadmin will then > allow them to come in at some point to run a proper check. But in > this case, we have a system which effectively crashes if there is > a problem. > > So thats why I mentioned the only other journal type fs' in freebsd, > because in this scenario a journal is required and it appears these > are the only alternative that don't create such a catastrophic effect. No journaling on any system can recover from media errors. Neither type on FreeBSD nor the one on Linux's ext4. The only way to recover from media errors is to have redundant metadata in the filesystem. ZFS has at least double and optionally triplely redundant metadata. If you want a system that will cleanly recover without any system administrator intervention in the face of media errors, that is what you should run. As you note, it is more resource hungry than FFS, but based on your requirement for no intervention in the face of media errors, that is what I would recommend. As long as you run on a 64-bit processor and have at least 4Gb of memory, it should have entirely reasonable performance. > Having made my point, what could be done about it - and what can I > do to help? Would drive details provide data required to pick up > the solution? Short of adding metadata redundancy to FFS, there is no solution. I have actively avoided putting such features into FFS as FreeBSD already has ZFS that does that (and many other things). My goal is to have a highly performant filesystem with minimal resource requirements. It by definition has limits, and administrator intervention in the face of media errors is one of them. Kirk McKusick From owner-freebsd-fs@FreeBSD.ORG Sun Mar 29 21:00:35 2015 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 955A7B9F for ; Sun, 29 Mar 2015 21:00:35 +0000 (UTC) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 6B97297E for ; Sun, 29 Mar 2015 21:00:35 +0000 (UTC) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id t2TL0Z3f005542 for ; Sun, 29 Mar 2015 21:00:35 GMT (envelope-from bugzilla-noreply@FreeBSD.org) Message-Id: <201503292100.t2TL0Z3f005542@kenobi.freebsd.org> From: bugzilla-noreply@FreeBSD.org To: freebsd-fs@FreeBSD.org Subject: Problem reports for freebsd-fs@FreeBSD.org that need special attention X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 Date: Sun, 29 Mar 2015 21:00:35 +0000 Content-Type: text/plain; charset="UTF-8" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 29 Mar 2015 21:00:35 -0000 To view an individual PR, use: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=(Bug Id). The following is a listing of current problems submitted by FreeBSD users, which need special attention. These represent problem reports covering all versions including experimental development code and obsolete releases. Status | Bug Id | Description ------------+-----------+--------------------------------------------------- Open | 136470 | [nfs] Cannot mount / in read-only, over NFS Open | 139651 | [nfs] mount(8): read-only remount of NFS volume d Open | 144447 | [zfs] sharenfs fsunshare() & fsshare_main() non f 3 problems total for which you should take action. From owner-freebsd-fs@FreeBSD.ORG Mon Mar 30 14:42:59 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 0F10C739 for ; Mon, 30 Mar 2015 14:42:59 +0000 (UTC) Received: from new2-smtp.messagingengine.com (new2-smtp.messagingengine.com [66.111.4.224]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id D34DDC73 for ; Mon, 30 Mar 2015 14:42:58 +0000 (UTC) Received: from compute6.internal (compute6.nyi.internal [10.202.2.46]) by mailnew.nyi.internal (Postfix) with ESMTP id 95B751569 for ; Mon, 30 Mar 2015 10:42:47 -0400 (EDT) Received: from web3 ([10.202.2.213]) by compute6.internal (MEProxy); Mon, 30 Mar 2015 10:42:50 -0400 DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d= messagingengine.com; h=content-transfer-encoding:content-type :date:from:in-reply-to:message-id:mime-version:references :subject:to:x-sasl-enc:x-sasl-enc; s=smtpout; bh=NwoZ3ionWpkPbsP +GiRUCBF9bRg=; b=LsumTWODST+L7UNIDsJdCz/QJ2Jn6uCeem8ppbp9o/sDOxo 1iUbs7F/HwpMUFy7gORBY7RIKf6kRSSKz5aUPECNKIzSTCuKYqBcq2HDJPYuncvG Gv4KWasv9P5++l0KCBypwZFE8x8loUlNetxwB089AQ6M+irS+qu962a0Z2Lw= Received: by web3.nyi.internal (Postfix, from userid 99) id 03FCD10921B; Mon, 30 Mar 2015 10:42:49 -0400 (EDT) Message-Id: <1427726569.287253.247062873.1F56BD89@webmail.messagingengine.com> X-Sasl-Enc: kpRJfFQ7P2ukOm/DomX9pJdvsdYL5uR2xZup4MN8g6uM 1427726569 From: Mark Felder To: freebsd-fs@freebsd.org MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="ISO-8859-1" X-Mailer: MessagingEngine.com Webmail Interface - ajax-0b3c2300 Subject: Re: error in https://wiki.freebsd.org/HAST Date: Mon, 30 Mar 2015 09:42:49 -0500 In-Reply-To: References: X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 30 Mar 2015 14:42:59 -0000 On Fri, Mar 27, 2015, at 13:56, Claude Morin wrote: > I note that the last edit to that page occurred more than three years > ago, so I realize the following may not be helpful :-). >=20 > =A7 "Replication modes" states: > Currently only the first replication mode described below is > supported... >=20 > However the second mode is the only implemented mode: > * The ordering of the three modes is 1) memsync, 2) fullsync, 3) > async. > * The first and third entries finish with "...currently not > implemented." > * The second entry finishes with "...is the default." >=20 > May I suggest the following? > Currently only the "fullsync" replication mode described below is > supported... >=20 Aren't all three modes now implemented and supported? From owner-freebsd-fs@FreeBSD.ORG Mon Mar 30 15:05:37 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id C7C83BEF for ; Mon, 30 Mar 2015 15:05:37 +0000 (UTC) Received: from new2-smtp.messagingengine.com (new2-smtp.messagingengine.com [66.111.4.224]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 985C5EBB for ; Mon, 30 Mar 2015 15:05:37 +0000 (UTC) Received: from compute5.internal (compute5.nyi.internal [10.202.2.45]) by mailnew.nyi.internal (Postfix) with ESMTP id 8B71414E0 for ; Mon, 30 Mar 2015 11:05:33 -0400 (EDT) Received: from web3 ([10.202.2.213]) by compute5.internal (MEProxy); Mon, 30 Mar 2015 11:05:36 -0400 DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d= messagingengine.com; h=content-transfer-encoding:content-type :date:from:in-reply-to:message-id:mime-version:references :subject:to:x-sasl-enc:x-sasl-enc; s=smtpout; bh=rPkfvNZHaYl+RiD QUdFsgFTa9l8=; b=iJi8vKdQ5TuxtN7hFwIEsJTgdsR8efLtH1sJkmKgBUJOllE udaA1gvf8TPpGCU4q4e/1+NqyPvnhuSWmnOSOHtm01YaLBkzPo5AH1bu2RlUcXpw 0NcV6W0XxETQAGsCdq82tH7HhTUFu0HsSUFWqiUq7AsKOl+n8J1AALFI1Z9U= Received: by web3.nyi.internal (Postfix, from userid 99) id 64FC910A16B; Mon, 30 Mar 2015 11:05:36 -0400 (EDT) Message-Id: <1427727936.293597.247070269.5CE0D411@webmail.messagingengine.com> X-Sasl-Enc: s5rOi9V2faa09bCunxSCjlJtwsFv6zUDLrvn54VxUv80 1427727936 From: Mark Felder To: freebsd-fs@freebsd.org MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain X-Mailer: MessagingEngine.com Webmail Interface - ajax-0b3c2300 Subject: Re: Little research how rm -rf and tar kill server Date: Mon, 30 Mar 2015 10:05:36 -0500 In-Reply-To: <55170D9C.1070107@artem.ru> References: <55170D9C.1070107@artem.ru> X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 30 Mar 2015 15:05:37 -0000 On Sat, Mar 28, 2015, at 15:22, Artem Kuchin wrote: > > So, questions and thoughts: > 1) Why i had no problem such this in fbsd 9? I think the reason for the > problem is in kernel, not in hardware or mariadb+nginx because server > load did not > increase at all, even decreases a little. This is only anecdotal until you run the exact same OS version and load on the old hardware which had a different motherboard and (probably) disk controller. Can you provide us with any further hardware details between the two systems? > 2) I consider it a sever bug, because even normal used (and i have > plenty of them using ssh) can eventually do rm -rf and kill all sites. > Which means there are > must be some way to limit io usage per user > ZFS has far superior disk IO scheduling. You may have more success with that. Unfortunately I'm not aware of a way to limit user/process disk IO on FreeBSD. From owner-freebsd-fs@FreeBSD.ORG Mon Mar 30 15:45:40 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 8AEC2C72 for ; Mon, 30 Mar 2015 15:45:40 +0000 (UTC) Received: from smtp21.mail.ru (smtp21.mail.ru [94.100.179.250]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 5DA293B6 for ; Mon, 30 Mar 2015 15:45:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mail.ru; s=mail2; h=Content-Type:In-Reply-To:References:Subject:To:MIME-Version:From:Date:Message-ID; bh=EvG6BJ8oY3+arXzPURoJjq9z9OPR8Xr0iJDL0ItbKc0=; b=NBCusfHC503in5in3pUPwQhKB0L3BT6WgCM/nNs5wdqv5h/hTe1Bn4UFhPp8PQigLZUcWmWMMyiqmKgaV7yNDOZxqGN5bhgrC9XWCNETJ3fEZfv0pBHBiWu4F//bGjXPTflst+ao3OBHZZB6/eU5RZEzG23SdOM0fEOaYF+Gl84=; Received: from [109.188.127.13] (port=64864 helo=[192.168.0.12]) by smtp21.mail.ru with esmtpa (envelope-from ) id 1Ycbsd-0002q5-66 for freebsd-fs@freebsd.org; Mon, 30 Mar 2015 18:45:35 +0300 Message-ID: <55196FC7.8090107@artem.ru> Date: Mon, 30 Mar 2015 18:46:15 +0300 From: Artem Kuchin User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: Re: Little research how rm -rf and tar kill server References: <55170D9C.1070107@artem.ru> <1427727936.293597.247070269.5CE0D411@webmail.messagingengine.com> In-Reply-To: <1427727936.293597.247070269.5CE0D411@webmail.messagingengine.com> X-Spam: Not detected X-Mras: Ok Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 30 Mar 2015 15:45:40 -0000 30.03.2015 18:05, Mark Felder пишет: > > > On Sat, Mar 28, 2015, at 15:22, Artem Kuchin wrote: >> >> So, questions and thoughts: 1) Why i had no problem such this in >> fbsd 9? I think the reason for the problem is in kernel, not in >> hardware or mariadb+nginx because server load did not increase at >> all, even decreases a little. > > This is only anecdotal until you run the exact same OS version and > load on the old hardware which had a different motherboard and > (probably) disk controller. Can you provide us with any further > hardware details between the two systems? My point was that the change what causes it was in freebsd kernel or ufs driver or sata driver or somewhere there, not the hardware.I even left the Kernel config file the same. I cannot provide exact hardware and dmesg for the old server (it is down and not available), but it was Intel Xeon E3-1245 /Quad Core/ 16 GB DDR3 ECC RAM 2x 3 TB SATA III Enterprise HDD (seagate ST33000650NS) New server kernel: Copyright (c) 1992-2015 The FreeBSD Project. kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 kernel: The Regents of the University of California. All rights reserved. kernel: FreeBSD is a registered trademark of The FreeBSD Foundation. kernel: FreeBSD 10.1-STABLE #0 r279278: Wed Feb 25 15:17:48 MSK 2015 kernel: FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512 kernel: CPU: Intel(R) Xeon(R) CPU E3-1270 v3 @ 3.50GHz (3500.07-MHz K8-class CPU) kernel: Origin="GenuineIntel" Id=0x306c3 Family=0x6 Model=0x3c Stepping=3 kernel: Features=0xbfebfbff kernel: Features2=0x7ffafbff,FMA,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AE kernel: AMD Features=0x2c100800 kernel: AMD Features2=0x21 kernel: Structured Extended Features=0x2fbb kernel: XSAVE Features=0x1 kernel: VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID kernel: TSC: P-state invariant, performance statistics kernel: real memory = 34359738368 (32768 MB) kernel: avail memory = 33271001088 (31729 MB) kernel: Event timer "LAPIC" quality 600 kernel: ACPI APIC Table: kernel: FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs kernel: FreeBSD/SMP: 1 package(s) x 4 core(s) x 2 SMT threads kernel: cpu0 (BSP): APIC ID: 0 kernel: cpu1 (AP): APIC ID: 1 kernel: cpu2 (AP): APIC ID: 2 kernel: cpu3 (AP): APIC ID: 3 kernel: cpu4 (AP): APIC ID: 4 kernel: cpu5 (AP): APIC ID: 5 kernel: cpu6 (AP): APIC ID: 6 kernel: cpu7 (AP): APIC ID: 7 kernel: ioapic0 irqs 0-23 on motherboard kernel: random: initialized kernel: module_register_init: MOD_LOAD (vesa, 0xffffffff80b6a090, 0) error 19 kernel: kbd1 at kbdmux0 kernel: module_register_init: MOD_LOAD (accf_data, 0xffffffff807e9af0, 0xffffffff8155d0c8) error 17 kernel: acpi0: on motherboard kernel: acpi0: Power Button (fixed) kernel: cpu0: on acpi0 kernel: cpu1: on acpi0 kernel: cpu2: on acpi0 kernel: cpu3: on acpi0 kernel: cpu4: on acpi0 kernel: cpu5: on acpi0 kernel: cpu6: on acpi0 kernel: cpu7: on acpi0 kernel: hpet0: iomem 0xfed00000-0xfed003ff on acpi0 kernel: Timecounter "HPET" frequency 14318180 Hz quality 950 kernel: Event timer "HPET" frequency 14318180 Hz quality 550 kernel: atrtc0: port 0x70-0x77 irq 8 on acpi0 kernel: Event timer "RTC" frequency 32768 Hz quality 0 kernel: attimer0: port 0x40-0x43,0x50-0x53 irq 0 on acpi0 kernel: Timecounter "i8254" frequency 1193182 Hz quality 0 kernel: Event timer "i8254" frequency 1193182 Hz quality 100 kernel: Timecounter "ACPI-fast" frequency 3579545 Hz quality 900 kernel: acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1808-0x180b on acpi0 kernel: pcib0: port 0xcf8-0xcff on acpi0 kernel: pci0: on pcib0 kernel: xhci0: mem 0xf7220000-0xf722ffff irq 16 at device 20.0 on pci0 kernel: xhci0: 32 bytes context size, 64-bit DMA kernel: xhci0: Port routing mask set to 0xffffffff kernel: usbus0 on xhci0 kernel: em0: port 0xf020-0xf03f mem 0xf7200000-0xf721ffff,0xf7235000-0xf7235fff irq 20 at device 25.0 on pci0 kernel: em0: Using an MSI interrupt kernel: em0: Ethernet address: 00:25:90:47:42:57 kernel: ehci0: mem 0xf7234000-0xf72343ff irq 16 at device 26.0 on pci0 kernel: usbus1: EHCI version 1.0 kernel: usbus1 on ehci0 kernel: pcib1: irq 16 at device 28.0 on pci0 kernel: pci1: on pcib1 kernel: pcib2: at device 0.0 on pci1 kernel: pci2: on pcib2 kernel: vgapci0: port 0xe000-0xe07f mem 0xf6000000-0xf6ffffff,0xf7000000-0xf701ffff irq 16 at device 0.0 on pci2 kernel: vgapci0: Boot video device kernel: pcib3: irq 17 at device 28.1 on pci0 kernel: pci3: on pcib3 kernel: igb0: port 0xd000-0xd01f mem 0xf7100000-0xf717ffff,0xf7180000-0xf7183fff irq 17 at device 0.0 on pci3 kernel: igb0: Using MSIX interrupts with 5 vectors kernel: igb0: Ethernet address: 00:25:90:47:42:56 kernel: igb0: Bound queue 0 to cpu 0 kernel: igb0: Bound queue 1 to cpu 1 kernel: igb0: Bound queue 2 to cpu 2 kernel: igb0: Bound queue 3 to cpu 3 kernel: ehci1: mem 0xf7233000-0xf72333ff irq 23 at device 29.0 on pci0 kernel: usbus2: EHCI version 1.0 kernel: usbus2 on ehci1 kernel: isab0: at device 31.0 on pci0 kernel: isa0: on isab0 kernel: ahci0: port 0xf070-0xf077,0xf060-0xf063,0xf050-0xf057,0xf040-0xf043,0xf000-0xf01f mem 0xf7232000-0xf72327ff irq 19 at device 31.2 on pci0 kernel: ahci0: AHCI v1.30 with 6 6Gbps ports, Port Multiplier not supported kernel: ahcich0: at channel 0 on ahci0 kernel: ahcich1: at channel 1 on ahci0 kernel: ahcich2: at channel 2 on ahci0 kernel: ahcich3: at channel 3 on ahci0 kernel: ahcich4: at channel 4 on ahci0 kernel: ahcich5: at channel 5 on ahci0 kernel: ahciem0: on ahci0 kernel: acpi_button0: on acpi0 kernel: acpi_button1: on acpi0 kernel: acpi_tz0: on acpi0 kernel: acpi_tz1: on acpi0 kernel: uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0 kernel: uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0 kernel: orm0: at iomem 0xc0000-0xc7fff,0xc8000-0xc8fff on isa0 kernel: sc0: at flags 0x100 on isa0 kernel: sc0: CGA <16 virtual consoles, flags=0x300> kernel: vga0: at port 0x3d0-0x3db iomem 0xb8000-0xbffff on isa0 kernel: ppc0: cannot reserve I/O port range kernel: est0: on cpu0 kernel: est1: on cpu1 kernel: est2: on cpu2 kernel: est3: on cpu3 kernel: est4: on cpu4 kernel: est5: on cpu5 kernel: est6: on cpu6 kernel: est7: on cpu7 kernel: random: unblocking device. kernel: usbus0: 5.0Gbps Super Speed USB v3.0 kernel: Timecounters tick every 1.000 msec kernel: ipfw2 (+ipv6) initialized, divert enabled, nat loadable, default to accept, logging disabled kernel: DUMMYNET 0 with IPv6 initialized (100409) kernel: load_dn_sched dn_sched FIFO loaded kernel: load_dn_sched dn_sched PRIO loaded kernel: load_dn_sched dn_sched QFQ loaded kernel: load_dn_sched dn_sched RR loaded kernel: load_dn_sched dn_sched WF2Q+ loaded kernel: usbus1: 480Mbps High Speed USB v2.0 kernel: usbus2: 480Mbps High Speed USB v2.0 kernel: ugen1.1: at usbus1 kernel: uhub0: on usbus1 kernel: ugen0.1: <0x8086> at usbus0 kernel: ugen2.1: at usbus2 kernel: uhub1: on usbus2 kernel: uhub2: <0x8086 XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0 kernel: ses0 at ahciem0 bus 0 scbus6 target 0 lun 0 kernel: ses0: SEMB S-E-S 2.00 device kernel: ses0: SEMB SES Device kernel: ada0 at ahcich0 bus 0 scbus0 target 0 lun 0 kernel: ada0: ATA-8 SATA 3.x device kernel: ada0: Serial Number 64O4281GS kernel: ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes) kernel: ada0: Command Queueing enabled kernel: ada0: 2861588MB (5860533168 512 byte sectors: 16H 63S/T 16383C) kernel: ada0: Previously was known as ad4 kernel: ada1 at ahcich1 bus 0 scbus1 target 0 lun 0 kernel: ada1: ATA-8 SATA 3.x device kernel: ada1: Serial Number 64O43WZGS kernel: ada1: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes) kernel: ada1: Command Queueing enabled kernel: ada1: 2861588MB (5860533168 512 byte sectors: 16H 63S/T 16383C) kernel: ada1: Previously was known as ad6 kernel: SMP: AP CPU #1 Launched! kernel: SMP: AP CPU #4 Launched! kernel: SMP: AP CPU #2 Launched! kernel: SMP: AP CPU #3 Launched! kernel: SMP: AP CPU #6 Launched! kernel: SMP: AP CPU #7 Launched! kernel: SMP: AP CPU #5 Launched! kernel: Timecounter "TSC-low" frequency 1750036200 Hz quality 1000 kernel: GEOM_MIRROR: Device mirror/boot launched (2/2). kernel: GEOM_MIRROR: Device mirror/swap launched (2/2). kernel: GEOM_MIRROR: Device mirror/root launched (2/2). kernel: Root mount waiting for: usbus2 usbus1 usbus0 kernel: uhub2: 17 ports with 17 removable, self powered kernel: uhub0: 2 ports with 2 removable, self powered kernel: uhub1: 2 ports with 2 removable, self powered kernel: Root mount waiting for: usbus2 usbus1 usbus0 kernel: ugen0.2: at usbus0 kernel: uhub3: on usbus0 kernel: uhub3: 4 ports with 3 removable, self powered kernel: ugen1.2: at usbus1 kernel: uhub4: on usbus1 kernel: ugen2.2: at usbus2 kernel: uhub5: on usbus2 kernel: uhub4: 4 ports with 4 removable, self powered kernel: uhub5: 6 ports with 6 removable, self powered kernel: ugen0.3: at usbus0 kernel: ukbd0: on usbus0 kernel: kbd0 at ukbd0 kernel: ums0: on usbus0 kernel: ums0: 3 buttons and [Z] coordinates ID=0 kernel: Trying to mount root from ufs:/dev/mirror/root [rw,async,groupquota,noatime]... > > >> 2) I consider it a sever bug, because even normal used (and i have >> plenty of them using ssh) can eventually do rm -rf and kill all >> sites. Which means there are must be some way to limit io usage per >> user >> > > ZFS has far superior disk IO scheduling. You may have more success > with that. Unfortunately I'm not aware of a way to limit user/process > disk IO on FreeBSD. I hesitatated migrating to ZFS because "don't fix it if it ain't broken". And it was not broken.. until now. But now i cannot migrate to ZFS - server is too busy and actually i am afraid something bad will come up with ZFS eventually too. Well, if not limit, then maybe just monitor and kill user processes that create problems. But the problem is how to determine which processes overuse the io system. I have very little clue how to collect such information. Artem From owner-freebsd-fs@FreeBSD.ORG Mon Mar 30 15:49:59 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 219E213E for ; Mon, 30 Mar 2015 15:49:59 +0000 (UTC) Received: from new2-smtp.messagingengine.com (new2-smtp.messagingengine.com [66.111.4.224]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E7178403 for ; Mon, 30 Mar 2015 15:49:58 +0000 (UTC) Received: from compute1.internal (compute1.nyi.internal [10.202.2.41]) by mailnew.nyi.internal (Postfix) with ESMTP id 8A8341169 for ; Mon, 30 Mar 2015 11:49:54 -0400 (EDT) Received: from web3 ([10.202.2.213]) by compute1.internal (MEProxy); Mon, 30 Mar 2015 11:49:57 -0400 DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d= messagingengine.com; h=content-transfer-encoding:content-type :date:from:in-reply-to:message-id:mime-version:references :subject:to:x-sasl-enc:x-sasl-enc; s=smtpout; bh=H8fDdZ1l4O7izVb UzFROD8LfbgE=; b=Yg3wdgxfF5naAtbvWX1vGJyyV3FcGrEAW5rN9Xwsvk1/uCg UO7qIg54QBvHwGRimHZEC2If15T+f+mfYridrrxKgUBoWDiHiA+z42+nwnYxJU/7 Nyw9vxZiX5/uCd0iDICBt8FsnH7qH1nsjt+h0bMVH5PcfeijxvHmo5xizcIo= Received: by web3.nyi.internal (Postfix, from userid 99) id 75A8C10BD71; Mon, 30 Mar 2015 11:49:57 -0400 (EDT) Message-Id: <1427730597.303984.247097389.165D5AAB@webmail.messagingengine.com> X-Sasl-Enc: xtWEx95Zp/MAY3s6NRVDwU3M50ATFhSfnWhb1wzaif1Q 1427730597 From: Mark Felder To: freebsd-fs@freebsd.org MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain X-Mailer: MessagingEngine.com Webmail Interface - ajax-0b3c2300 In-Reply-To: <55196FC7.8090107@artem.ru> References: <55170D9C.1070107@artem.ru> <1427727936.293597.247070269.5CE0D411@webmail.messagingengine.com> <55196FC7.8090107@artem.ru> Subject: Re: Little research how rm -rf and tar kill server Date: Mon, 30 Mar 2015 10:49:57 -0500 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 30 Mar 2015 15:49:59 -0000 Just to feed my own curiosity, can you paste the output of this sysctl? sysctl kern.eventtimer Thanks From owner-freebsd-fs@FreeBSD.ORG Mon Mar 30 15:52:49 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 7690F327 for ; Mon, 30 Mar 2015 15:52:49 +0000 (UTC) Received: from smtp40.i.mail.ru (smtp40.i.mail.ru [94.100.177.100]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id EAB106BA for ; Mon, 30 Mar 2015 15:52:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mail.ru; s=mail2; h=Content-Transfer-Encoding:Content-Type:In-Reply-To:References:Subject:To:MIME-Version:From:Date:Message-ID; bh=f8d9mHHJymo+NNZdpiPjRwfVXs0NJR6tYKefLbVfjKw=; b=ad9H/JkdXyvpkhJrTFT+j8HMO9PgSGwpammlu2jauaiZsIoz1bPoAE+sUuvZfhJa/xz9dZNM5K6vqKt0L+RfsziNxeE3raOXUpzd6xiT04Ho+m7XogZOnRu9ztbZahGpJ3dWjsS76xnaWaAJVaKrZ1zdAaVsK1DskxslC+yHDek=; Received: from [109.188.127.13] (port=33295 helo=[192.168.0.12]) by smtp40.i.mail.ru with esmtpa (envelope-from ) id 1YcbzU-0007zt-MQ for freebsd-fs@freebsd.org; Mon, 30 Mar 2015 18:52:45 +0300 Message-ID: <5519716F.6060007@artem.ru> Date: Mon, 30 Mar 2015 18:53:19 +0300 From: Artem Kuchin User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: Re: Little research how rm -rf and tar kill server References: <55170D9C.1070107@artem.ru> <1427727936.293597.247070269.5CE0D411@webmail.messagingengine.com> <55196FC7.8090107@artem.ru> <1427730597.303984.247097389.165D5AAB@webmail.messagingengine.com> In-Reply-To: <1427730597.303984.247097389.165D5AAB@webmail.messagingengine.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam: Not detected X-Mras: Ok X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 30 Mar 2015 15:52:49 -0000 30.03.2015 18:49, Mark Felder пишет: > Just to feed my own curiosity, can you paste the output of this sysctl? > > > sysctl kern.eventtimer > > > Of course, i will provide any info. # sysctl kern.eventtimer kern.eventtimer.et.LAPIC.flags: 7 kern.eventtimer.et.LAPIC.frequency: 50001037 kern.eventtimer.et.LAPIC.quality: 600 kern.eventtimer.et.HPET.flags: 7 kern.eventtimer.et.HPET.frequency: 14318180 kern.eventtimer.et.HPET.quality: 550 kern.eventtimer.et.RTC.flags: 17 kern.eventtimer.et.RTC.frequency: 32768 kern.eventtimer.et.RTC.quality: 0 kern.eventtimer.et.i8254.flags: 1 kern.eventtimer.et.i8254.frequency: 1193182 kern.eventtimer.et.i8254.quality: 100 kern.eventtimer.choice: LAPIC(600) HPET(550) i8254(100) RTC(0) kern.eventtimer.singlemul: 2 kern.eventtimer.idletick: 0 kern.eventtimer.timer: LAPIC kern.eventtimer.periodic: 0 This is normal state, not under rm -rf Do you need it during rm -rf ? Artem From owner-freebsd-fs@FreeBSD.ORG Mon Mar 30 15:57:42 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id C20E24F4 for ; Mon, 30 Mar 2015 15:57:42 +0000 (UTC) Received: from new2-smtp.messagingengine.com (new2-smtp.messagingengine.com [66.111.4.224]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 90C9275A for ; Mon, 30 Mar 2015 15:57:42 +0000 (UTC) Received: from compute3.internal (compute3.nyi.internal [10.202.2.43]) by mailnew.nyi.internal (Postfix) with ESMTP id 5799F168B for ; Mon, 30 Mar 2015 11:57:38 -0400 (EDT) Received: from web3 ([10.202.2.213]) by compute3.internal (MEProxy); Mon, 30 Mar 2015 11:57:41 -0400 DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d= messagingengine.com; h=content-transfer-encoding:content-type :date:from:in-reply-to:message-id:mime-version:references :subject:to:x-sasl-enc:x-sasl-enc; s=smtpout; bh=gz3gREHf6CEmY3q cdc8XCePuHu8=; b=ieKOv+93EHwzIbv/DzY3J8MZtGEDIMM6cCP+zSS0kdSkGhx +ZNFhqfkiShy3y7oRVvx/zVLQVSIbA3GsMBTVRoQA42afnC/Bu2rcKT6SlG+0DVq kB4wqpUKVXrtzRCC25lgefGFmCw2+IMrPGVgqJz+XhRJPpxFnq3tOPf6qUdU= Received: by web3.nyi.internal (Postfix, from userid 99) id 1E9D510A621; Mon, 30 Mar 2015 11:57:41 -0400 (EDT) Message-Id: <1427731061.306961.247099633.0A421E90@webmail.messagingengine.com> X-Sasl-Enc: V8i8M7cbEVC/K9XMqjSLfpXKa8ZRr6C+6PDzjsDaEU7/ 1427731061 From: Mark Felder To: freebsd-fs@freebsd.org MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain X-Mailer: MessagingEngine.com Webmail Interface - ajax-0b3c2300 In-Reply-To: <5519716F.6060007@artem.ru> References: <55170D9C.1070107@artem.ru> <1427727936.293597.247070269.5CE0D411@webmail.messagingengine.com> <55196FC7.8090107@artem.ru> <1427730597.303984.247097389.165D5AAB@webmail.messagingengine.com> <5519716F.6060007@artem.ru> Subject: Re: Little research how rm -rf and tar kill server Date: Mon, 30 Mar 2015 10:57:41 -0500 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 30 Mar 2015 15:57:42 -0000 On Mon, Mar 30, 2015, at 10:53, Artem Kuchin wrote: > > This is normal state, not under rm -rf > Do you need it during rm -rf ? > No, but I wonder if changing the timer from LAPIC to HPET or possibly one of the other timers makes the system more responsive under that load. Would you mind testing that? You can switch the timer like this: sysctl kern.eventtimer.timer=HPET And then run some of your I/O tests The full list of available timers is under sysctl kern.eventtimer.choice -- you could try any of them, but the higher the number next to the name is the higher perceived "quality" of the timer by the system. Note this doesn't survive a reboot, but could be set in /etc/sysctl.conf or /boot/loader.conf From owner-freebsd-fs@FreeBSD.ORG Mon Mar 30 16:03:50 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 824F979B for ; Mon, 30 Mar 2015 16:03:50 +0000 (UTC) Received: from smtp40.i.mail.ru (smtp40.i.mail.ru [94.100.177.100]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id EC1BE84D for ; Mon, 30 Mar 2015 16:03:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mail.ru; s=mail2; h=Content-Transfer-Encoding:Content-Type:In-Reply-To:References:Subject:To:MIME-Version:From:Date:Message-ID; bh=89NKfQwkV17ifjc+5ihp5RuC0m+Xrcimsb5ot/Dl+Hc=; b=QYLjVxY6fozQF6VfCqDwvCgVYeybDFTNl5POHjtC23PfWoF3K2qbLHuVVJ99NNxrBZ8mIto6cdpb8icE2qSpUBraEFX24SUb/VW/Z9swb+tE8yhHGWIftvD6fOM6hgFWqpebiXLEB1y8GME1lLA4TXvmpYibcfC+6WUCCVE+OLY=; Received: from [109.188.127.13] (port=56181 helo=[192.168.0.12]) by smtp40.i.mail.ru with esmtpa (envelope-from ) id 1YccAE-0007CW-5x for freebsd-fs@freebsd.org; Mon, 30 Mar 2015 19:03:46 +0300 Message-ID: <5519740A.1070902@artem.ru> Date: Mon, 30 Mar 2015 19:04:26 +0300 From: Artem Kuchin User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: Re: Little research how rm -rf and tar kill server References: <55170D9C.1070107@artem.ru> <1427727936.293597.247070269.5CE0D411@webmail.messagingengine.com> <55196FC7.8090107@artem.ru> <1427730597.303984.247097389.165D5AAB@webmail.messagingengine.com> <5519716F.6060007@artem.ru> <1427731061.306961.247099633.0A421E90@webmail.messagingengine.com> In-Reply-To: <1427731061.306961.247099633.0A421E90@webmail.messagingengine.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam: Not detected X-Mras: Ok X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 30 Mar 2015 16:03:50 -0000 30.03.2015 18:57, Mark Felder пишет: > > On Mon, Mar 30, 2015, at 10:53, Artem Kuchin wrote: >> This is normal state, not under rm -rf >> Do you need it during rm -rf ? >> > No, but I wonder if changing the timer from LAPIC to HPET or possibly > one of the other timers makes the system more responsive under that > load. Would you mind testing that? > > You can switch the timer like this: > > sysctl kern.eventtimer.timer=HPET > > And then run some of your I/O tests > I see. I will test at night, when load goes down. I cannot say sure that's a right way to dig, but i will test anything :) Just to remind: untar overloads the system, but untar + sync every 120s does not. That seems very strange to me. I think the problem might be somewhere here. Artem From owner-freebsd-fs@FreeBSD.ORG Mon Mar 30 16:09:20 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id EB1A0B75 for ; Mon, 30 Mar 2015 16:09:20 +0000 (UTC) Received: from new2-smtp.messagingengine.com (new2-smtp.messagingengine.com [66.111.4.224]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id B84D6897 for ; Mon, 30 Mar 2015 16:09:20 +0000 (UTC) Received: from compute5.internal (compute5.nyi.internal [10.202.2.45]) by mailnew.nyi.internal (Postfix) with ESMTP id B40081308 for ; Mon, 30 Mar 2015 12:09:16 -0400 (EDT) Received: from web3 ([10.202.2.213]) by compute5.internal (MEProxy); Mon, 30 Mar 2015 12:09:19 -0400 DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d= messagingengine.com; h=content-transfer-encoding:content-type :date:from:in-reply-to:message-id:mime-version:references :subject:to:x-sasl-enc:x-sasl-enc; s=smtpout; bh=gcEhBRj6SAhYpyW QyyG60I4WRco=; b=GYHsbjiZQOs+TnCfSQeHQacd5QiOpbsF8XFGs1oKTvyYge/ 1ueIwY1UFLvr8yad99S/meWqwtSdvTNhsVet2uCn0Hc13RjIuV/AjeskQlscoIKE lKkiDaI5fPczSKHoQiCS021EDEx7VnSikTIUVNA9qPdRl64m4hXvPfZtIluo= Received: by web3.nyi.internal (Postfix, from userid 99) id 5AD1510C486; Mon, 30 Mar 2015 12:09:19 -0400 (EDT) Message-Id: <1427731759.309823.247107417.308CD298@webmail.messagingengine.com> X-Sasl-Enc: vXJtnszXY9NpZw+QK9YECx2yITMYdLE9WkVdytgYZpZX 1427731759 From: Mark Felder To: freebsd-fs@freebsd.org MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" X-Mailer: MessagingEngine.com Webmail Interface - ajax-0b3c2300 Subject: Re: Little research how rm -rf and tar kill server Date: Mon, 30 Mar 2015 11:09:19 -0500 In-Reply-To: <5519740A.1070902@artem.ru> References: <55170D9C.1070107@artem.ru> <1427727936.293597.247070269.5CE0D411@webmail.messagingengine.com> <55196FC7.8090107@artem.ru> <1427730597.303984.247097389.165D5AAB@webmail.messagingengine.com> <5519716F.6060007@artem.ru> <1427731061.306961.247099633.0A421E90@webmail.messagingengine.com> <5519740A.1070902@artem.ru> X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 30 Mar 2015 16:09:21 -0000 On Mon, Mar 30, 2015, at 11:04, Artem Kuchin wrote: > 30.03.2015 18:57, Mark Felder =D0=BF=D0=B8=D1=88=D0=B5=D1=82: > > > > On Mon, Mar 30, 2015, at 10:53, Artem Kuchin wrote: > >> This is normal state, not under rm -rf > >> Do you need it during rm -rf ? > >> > > No, but I wonder if changing the timer from LAPIC to HPET or possibly > > one of the other timers makes the system more responsive under that > > load. Would you mind testing that? > > > > You can switch the timer like this: > > > > sysctl kern.eventtimer.timer=3DHPET > > > > And then run some of your I/O tests > > >=20 > I see. I will test at night, when load goes down. > I cannot say sure that's a right way to dig, but i will test anything :) >=20 > Just to remind: untar overloads the system, but untar + sync every 120s=20 > does not. > That seems very strange to me. I think the problem might be somewhere > here. >=20 I just heard from mav that there was a bottleneck in gmirror/graid with regards to BIO_DELETE requests https://svnweb.freebsd.org/base?view=3Drevision&revision=3D280757 From owner-freebsd-fs@FreeBSD.ORG Mon Mar 30 16:17:12 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 4B5023C6 for ; Mon, 30 Mar 2015 16:17:12 +0000 (UTC) Received: from smtp21.mail.ru (smtp21.mail.ru [94.100.179.250]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id B7D829F6 for ; Mon, 30 Mar 2015 16:17:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mail.ru; s=mail2; h=Content-Transfer-Encoding:Content-Type:In-Reply-To:References:Subject:To:MIME-Version:From:Date:Message-ID; bh=E3t3fX8LyVqCbIjvSVJ4eSMkEe7IZmlqK2SuY+dzgeo=; b=gqj/QGSXbsmvKRFAveCR28aE9CQAnosQaUgeC01qrLW3SxthrlBgV/zHUvDys363CW2o2qhaku4wjAvEctfkqNeUPdOKrFBIfnmTqyxT0+VRCg6lHHRa319L34agdGZtkGQy9qPOx9+8jbTc7YVEAVg62bi88rkMvFXsyxCna9g=; Received: from [109.188.127.13] (port=6292 helo=[192.168.0.12]) by smtp21.mail.ru with esmtpa (envelope-from ) id 1YccNA-0007LN-9Y for freebsd-fs@freebsd.org; Mon, 30 Mar 2015 19:17:08 +0300 Message-ID: <5519772C.4010106@artem.ru> Date: Mon, 30 Mar 2015 19:17:48 +0300 From: Artem Kuchin User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: Re: Little research how rm -rf and tar kill server References: <55170D9C.1070107@artem.ru> <1427727936.293597.247070269.5CE0D411@webmail.messagingengine.com> <55196FC7.8090107@artem.ru> <1427730597.303984.247097389.165D5AAB@webmail.messagingengine.com> <5519716F.6060007@artem.ru> <1427731061.306961.247099633.0A421E90@webmail.messagingengine.com> <5519740A.1070902@artem.ru> <1427731759.309823.247107417.308CD298@webmail.messagingengine.com> In-Reply-To: <1427731759.309823.247107417.308CD298@webmail.messagingengine.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam: Not detected X-Mras: Ok X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 30 Mar 2015 16:17:12 -0000 30.03.2015 19:09, Mark Felder пишет: > > On Mon, Mar 30, 2015, at 11:04, Artem Kuchin wrote: >> 30.03.2015 18:57, Mark Felder пишет: >>> On Mon, Mar 30, 2015, at 10:53, Artem Kuchin wrote: >>>> This is normal state, not under rm -rf >>>> Do you need it during rm -rf ? >>>> >>> No, but I wonder if changing the timer from LAPIC to HPET or possibly >>> one of the other timers makes the system more responsive under that >>> load. Would you mind testing that? >>> >>> You can switch the timer like this: >>> >>> sysctl kern.eventtimer.timer=HPET >>> >>> And then run some of your I/O tests >>> >> I see. I will test at night, when load goes down. >> I cannot say sure that's a right way to dig, but i will test anything :) >> >> Just to remind: untar overloads the system, but untar + sync every 120s >> does not. >> That seems very strange to me. I think the problem might be somewhere >> here. >> > I just heard from mav that there was a bottleneck in gmirror/graid with > regards to BIO_DELETE requests > Well, i have plenty of CPU power left if you look at TOP screenshots. But i will try this patch too. If i update sources of 10-STABLE and rebuild world will this patch be there? But still, this does not explain why untar need fsync to pay nice. Artem From owner-freebsd-fs@FreeBSD.ORG Mon Mar 30 21:24:07 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 337A555A for ; Mon, 30 Mar 2015 21:24:07 +0000 (UTC) Received: from internet06.ebureau.com (internet06.ebureau.com [65.127.24.25]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "internet06.ebureau.com", Issuer "internet06.ebureau.com" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id D628D94D for ; Mon, 30 Mar 2015 21:24:06 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by internet06.ebureau.com (Postfix) with ESMTP id 89BD337F4B96 for ; Mon, 30 Mar 2015 16:14:35 -0500 (CDT) X-Virus-Scanned: amavisd-new at ebureau.com Received: from internet06.ebureau.com ([127.0.0.1]) by localhost (internet06.ebureau.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id ycsdlVNYRd1W for ; Mon, 30 Mar 2015 16:14:34 -0500 (CDT) Received: from square.office.ebureau.com (unknown [10.10.20.22]) by internet06.ebureau.com (Postfix) with ESMTPSA id 5C98A37F4B8B for ; Mon, 30 Mar 2015 16:14:34 -0500 (CDT) From: Dustin Wenz Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Subject: All available memory used when deleting files from ZFS Message-Id: Date: Mon, 30 Mar 2015 16:14:34 -0500 To: "" Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2070.6\)) X-Mailer: Apple Mail (2.2070.6) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 30 Mar 2015 21:24:07 -0000 I had several systems panic or hang over the weekend while deleting some = data off of their local zfs filesystem. It looks like they ran out of = physical memory (32GB), and hung when paging to swap-on-zfs (which is = not surprising, given that ZFS was likely using the memory). They were = running 10.1-STABLE r277139M, which I built in the middle of January. = The pools were about 35TB in size, and are a concatenation of 3TB = mirrors. They were maybe 95% full. I deleted just over 1000 files, = totaling 25TB on each system. It took roughly 10 minutes to remove that 25TB of data per host using a = remote rsync, and immediately after that everything seemed fine. = However, after several more minutes, every machine that had data removed = became unresponsive. Some had numerous "swap_pager: indefinite wait = buffer" errors followed by a panic, and some just died with no console = messages. The same thing would happen after a reboot, when FreeBSD = attempted to mount the local filesystem again. I was able to boot these systems after exporting the affected pool, but = the problem would recur several minutes after initiating a "zpool = import". Watching zfs statistics didn't seem to reveal where the memory = was going; ARC would only climb to about 4GB, but free memory would = decline rapidly. Eventually, after enough export/reboot/import cycles, = the pool would import successfully and everything would be fine from = then on. Note that there is no L2ARC or compression being used. Has anyone else run into this when deleting files on ZFS? It seems to be = a consistent problem under the versions of 10.1 I'm running. For reference, I've appended a zstat dump below that was taken 5 minutes = after starting a zpool import, and was about three minutes before the = machine became unresponsive. You can see that the ARC is only 4GB, but = free memory was down to 471MB (and continued to drop). - .Dustin ------------------------------------------------------------------------ ZFS Subsystem Report Mon Mar 30 12:35:27 2015 ------------------------------------------------------------------------ System Information: Kernel Version: 1001506 (osreldate) Hardware Platform: amd64 Processor Architecture: amd64 ZFS Storage pool Version: 5000 ZFS Filesystem Version: 5 FreeBSD 10.1-STABLE #11 r277139M: Tue Jan 13 14:59:55 CST 2015 root 12:35PM up 8 mins, 3 users, load averages: 7.23, 8.96, 4.87 ------------------------------------------------------------------------ System Memory: 0.17% 55.40 MiB Active, 0.14% 46.11 MiB Inact 98.34% 30.56 GiB Wired, 0.00% 0 Cache 1.34% 425.46 MiB Free, 0.00% 4.00 KiB Gap Real Installed: 32.00 GiB Real Available: 99.82% 31.94 GiB Real Managed: 97.29% 31.08 GiB Logical Total: 32.00 GiB Logical Used: 98.56% 31.54 GiB Logical Free: 1.44% 471.57 MiB Kernel Memory: 3.17 GiB Data: 99.18% 3.14 GiB Text: 0.82% 26.68 MiB Kernel Memory Map: 31.08 GiB Size: 14.18% 4.41 GiB Free: 85.82% 26.67 GiB ------------------------------------------------------------------------ ARC Summary: (HEALTHY) Memory Throttle Count: 0 ARC Misc: Deleted: 145 Recycle Misses: 0 Mutex Misses: 0 Evict Skips: 0 ARC Size: 14.17% 4.26 GiB Target Size: (Adaptive) 100.00% 30.08 GiB Min Size (Hard Limit): 12.50% 3.76 GiB Max Size (High Water): 8:1 30.08 GiB ARC Size Breakdown: Recently Used Cache Size: 50.00% 15.04 GiB Frequently Used Cache Size: 50.00% 15.04 GiB ARC Hash Breakdown: Elements Max: 270.56k Elements Current: 100.00% 270.56k Collisions: 23.66k Chain Max: 3 Chains: 8.28k ------------------------------------------------------------------------ ARC Efficiency: 2.93m Cache Hit Ratio: 70.44% 2.06m Cache Miss Ratio: 29.56% 866.05k Actual Hit Ratio: 70.40% 2.06m Data Demand Efficiency: 97.47% 24.58k Data Prefetch Efficiency: 1.88% 479 CACHE HITS BY CACHE LIST: Anonymously Used: 0.05% 1.07k Most Recently Used: 71.82% 1.48m Most Frequently Used: 28.13% 580.49k Most Recently Used Ghost: 0.00% 0 Most Frequently Used Ghost: 0.00% 0 CACHE HITS BY DATA TYPE: Demand Data: 1.16% 23.96k Prefetch Data: 0.00% 9 Demand Metadata: 98.79% 2.04m Prefetch Metadata: 0.05% 1.08k CACHE MISSES BY DATA TYPE: Demand Data: 0.07% 621 Prefetch Data: 0.05% 470 Demand Metadata: 99.69% 863.35k Prefetch Metadata: 0.19% 1.61k ------------------------------------------------------------------------ L2ARC is disabled ------------------------------------------------------------------------ File-Level Prefetch: (HEALTHY) DMU Efficiency: 72.95k Hit Ratio: 70.83% 51.66k Miss Ratio: 29.17% 21.28k Colinear: 21.28k Hit Ratio: 0.01% 2 Miss Ratio: 99.99% 21.28k Stride: 50.45k Hit Ratio: 99.98% 50.44k Miss Ratio: 0.02% 9 DMU Misc: Reclaim: 21.28k Successes: 1.73% 368 Failures: 98.27% 20.91k Streams: 1.23k +Resets: 0.16% 2 -Resets: 99.84% 1.23k Bogus: 0 ------------------------------------------------------------------------ VDEV cache is disabled ------------------------------------------------------------------------ ZFS Tunables (sysctl): kern.maxusers 2380 vm.kmem_size 33367830528 vm.kmem_size_scale 1 vm.kmem_size_min 0 vm.kmem_size_max 1319413950874 vfs.zfs.arc_max 32294088704 vfs.zfs.arc_min 4036761088 vfs.zfs.arc_average_blocksize 8192 vfs.zfs.arc_shrink_shift 5 vfs.zfs.arc_free_target 56518 vfs.zfs.arc_meta_used 4534349216 vfs.zfs.arc_meta_limit 8073522176 vfs.zfs.l2arc_write_max 8388608 vfs.zfs.l2arc_write_boost 8388608 vfs.zfs.l2arc_headroom 2 vfs.zfs.l2arc_feed_secs 1 vfs.zfs.l2arc_feed_min_ms 200 vfs.zfs.l2arc_noprefetch 1 vfs.zfs.l2arc_feed_again 1 vfs.zfs.l2arc_norw 1 vfs.zfs.anon_size 1786368 vfs.zfs.anon_metadata_lsize 0 vfs.zfs.anon_data_lsize 0 vfs.zfs.mru_size 504812032 vfs.zfs.mru_metadata_lsize 415273472 vfs.zfs.mru_data_lsize 35227648 vfs.zfs.mru_ghost_size 0 vfs.zfs.mru_ghost_metadata_lsize 0 vfs.zfs.mru_ghost_data_lsize 0 vfs.zfs.mfu_size 3925990912 vfs.zfs.mfu_metadata_lsize 3901947392 vfs.zfs.mfu_data_lsize 7000064 vfs.zfs.mfu_ghost_size 0 vfs.zfs.mfu_ghost_metadata_lsize 0 vfs.zfs.mfu_ghost_data_lsize 0 vfs.zfs.l2c_only_size 0 vfs.zfs.dedup.prefetch 1 vfs.zfs.nopwrite_enabled 1 vfs.zfs.mdcomp_disable 0 vfs.zfs.max_recordsize 1048576 vfs.zfs.dirty_data_max 3429735628 vfs.zfs.dirty_data_max_max 4294967296 vfs.zfs.dirty_data_max_percent 10 vfs.zfs.dirty_data_sync 67108864 vfs.zfs.delay_min_dirty_percent 60 vfs.zfs.delay_scale 500000 vfs.zfs.prefetch_disable 0 vfs.zfs.zfetch.max_streams 8 vfs.zfs.zfetch.min_sec_reap 2 vfs.zfs.zfetch.block_cap 256 vfs.zfs.zfetch.array_rd_sz 1048576 vfs.zfs.top_maxinflight 32 vfs.zfs.resilver_delay 2 vfs.zfs.scrub_delay 4 vfs.zfs.scan_idle 50 vfs.zfs.scan_min_time_ms 1000 vfs.zfs.free_min_time_ms 1000 vfs.zfs.resilver_min_time_ms 3000 vfs.zfs.no_scrub_io 0 vfs.zfs.no_scrub_prefetch 0 vfs.zfs.free_max_blocks -1 vfs.zfs.metaslab.gang_bang 16777217 vfs.zfs.metaslab.fragmentation_threshold70 vfs.zfs.metaslab.debug_load 0 vfs.zfs.metaslab.debug_unload 0 vfs.zfs.metaslab.df_alloc_threshold 131072 vfs.zfs.metaslab.df_free_pct 4 vfs.zfs.metaslab.min_alloc_size 33554432 vfs.zfs.metaslab.load_pct 50 vfs.zfs.metaslab.unload_delay 8 vfs.zfs.metaslab.preload_limit 3 vfs.zfs.metaslab.preload_enabled 1 vfs.zfs.metaslab.fragmentation_factor_enabled1 vfs.zfs.metaslab.lba_weighting_enabled 1 vfs.zfs.metaslab.bias_enabled 1 vfs.zfs.condense_pct 200 vfs.zfs.mg_noalloc_threshold 0 vfs.zfs.mg_fragmentation_threshold 85 vfs.zfs.check_hostid 1 vfs.zfs.spa_load_verify_maxinflight 10000 vfs.zfs.spa_load_verify_metadata 1 vfs.zfs.spa_load_verify_data 1 vfs.zfs.recover 0 vfs.zfs.deadman_synctime_ms 1000000 vfs.zfs.deadman_checktime_ms 5000 vfs.zfs.deadman_enabled 1 vfs.zfs.spa_asize_inflation 24 vfs.zfs.spa_slop_shift 5 vfs.zfs.space_map_blksz 4096 vfs.zfs.txg.timeout 5 vfs.zfs.vdev.metaslabs_per_vdev 200 vfs.zfs.vdev.cache.max 16384 vfs.zfs.vdev.cache.size 0 vfs.zfs.vdev.cache.bshift 16 vfs.zfs.vdev.trim_on_init 1 vfs.zfs.vdev.mirror.rotating_inc 0 vfs.zfs.vdev.mirror.rotating_seek_inc 5 vfs.zfs.vdev.mirror.rotating_seek_offset1048576 vfs.zfs.vdev.mirror.non_rotating_inc 0 vfs.zfs.vdev.mirror.non_rotating_seek_inc1 vfs.zfs.vdev.async_write_active_min_dirty_percent30 vfs.zfs.vdev.async_write_active_max_dirty_percent60 vfs.zfs.vdev.max_active 1000 vfs.zfs.vdev.sync_read_min_active 10 vfs.zfs.vdev.sync_read_max_active 10 vfs.zfs.vdev.sync_write_min_active 10 vfs.zfs.vdev.sync_write_max_active 10 vfs.zfs.vdev.async_read_min_active 1 vfs.zfs.vdev.async_read_max_active 3 vfs.zfs.vdev.async_write_min_active 1 vfs.zfs.vdev.async_write_max_active 10 vfs.zfs.vdev.scrub_min_active 1 vfs.zfs.vdev.scrub_max_active 2 vfs.zfs.vdev.trim_min_active 1 vfs.zfs.vdev.trim_max_active 64 vfs.zfs.vdev.aggregation_limit 131072 vfs.zfs.vdev.read_gap_limit 32768 vfs.zfs.vdev.write_gap_limit 4096 vfs.zfs.vdev.bio_flush_disable 0 vfs.zfs.vdev.bio_delete_disable 0 vfs.zfs.vdev.trim_max_bytes 2147483648 vfs.zfs.vdev.trim_max_pending 64 vfs.zfs.max_auto_ashift 13 vfs.zfs.min_auto_ashift 9 vfs.zfs.zil_replay_disable 0 vfs.zfs.cache_flush_disable 0 vfs.zfs.zio.use_uma 1 vfs.zfs.zio.exclude_metadata 0 vfs.zfs.sync_pass_deferred_free 2 vfs.zfs.sync_pass_dont_compress 5 vfs.zfs.sync_pass_rewrite 2 vfs.zfs.snapshot_list_prefetch 0 vfs.zfs.super_owner 0 vfs.zfs.debug 0 vfs.zfs.version.ioctl 4 vfs.zfs.version.acl 1 vfs.zfs.version.spa 5000 vfs.zfs.version.zpl 5 vfs.zfs.vol.mode 1 vfs.zfs.vol.unmap_enabled 1 vfs.zfs.trim.enabled 1 vfs.zfs.trim.txg_delay 32 vfs.zfs.trim.timeout 30 vfs.zfs.trim.max_interval 1 ------------------------------------------------------------------------ From owner-freebsd-fs@FreeBSD.ORG Mon Mar 30 21:49:36 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id D250325E for ; Mon, 30 Mar 2015 21:49:36 +0000 (UTC) Received: from smtp20.mail.ru (smtp20.mail.ru [94.100.179.251]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4FF27CAD for ; Mon, 30 Mar 2015 21:49:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mail.ru; s=mail2; h=Content-Transfer-Encoding:Content-Type:In-Reply-To:References:Subject:To:MIME-Version:From:Date:Message-ID; bh=6BDSuaB2JUjwRkpio8rERW9MxJJ/2PfZ3P3VdoX6w3o=; b=AcesHOoGmA0JE6sk17y4Rv+yDo/OcI53cdf50PsE25ltWKq6eLoEZYomb8Axdbs0cfyK6ZX0aLGtJOKBJI0fcGp31eZ33NkL+mdETbuBxlAgQDVHQOCURN7c7WXD9iEHAIfxHzlQSxrvD+4ilx+D7pb7Kfz0Wl/Q3A0Pu9prC2A=; Received: from [109.188.127.13] (port=21350 helo=[192.168.0.12]) by smtp20.mail.ru with esmtpa (envelope-from ) id 1YchYk-0006Sq-H0 for freebsd-fs@freebsd.org; Tue, 31 Mar 2015 00:49:26 +0300 Message-ID: <5519C523.7090009@artem.ru> Date: Tue, 31 Mar 2015 00:50:27 +0300 From: Artem Kuchin User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: Re: Little research how rm -rf and tar kill server References: <55170D9C.1070107@artem.ru> <1427727936.293597.247070269.5CE0D411@webmail.messagingengine.com> <55196FC7.8090107@artem.ru> <1427730597.303984.247097389.165D5AAB@webmail.messagingengine.com> <5519716F.6060007@artem.ru> <1427731061.306961.247099633.0A421E90@webmail.messagingengine.com> In-Reply-To: <1427731061.306961.247099633.0A421E90@webmail.messagingengine.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam: Not detected X-Mras: Ok X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 30 Mar 2015 21:49:36 -0000 30.03.2015 18:57, Mark Felder пишет: > > On Mon, Mar 30, 2015, at 10:53, Artem Kuchin wrote: >> This is normal state, not under rm -rf >> Do you need it during rm -rf ? >> > No, but I wonder if changing the timer from LAPIC to HPET or possibly > one of the other timers makes the system more responsive under that > load. Would you mind testing that? > > You can switch the timer like this: > > sysctl kern.eventtimer.timer=HPET > > And then run some of your I/O tests > > The full list of available timers is under sysctl kern.eventtimer.choice > -- you could try any of them, but the higher the number next to the name > is the higher perceived "quality" of the timer by the system. > > Tried them all with rm -rf did not notice any difference at all problems start pretty much after the amount of time and severity is the same ssh terminal is very responsive using all of them until i do something which need to access hdd Artem From owner-freebsd-fs@FreeBSD.ORG Mon Mar 30 21:54:10 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 9E801797 for ; Mon, 30 Mar 2015 21:54:10 +0000 (UTC) Received: from fs.denninger.net (wsip-70-169-168-7.pn.at.cox.net [70.169.168.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "NewFS.denninger.net", Issuer "NewFS.denninger.net" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 33315D92 for ; Mon, 30 Mar 2015 21:54:09 +0000 (UTC) Received: from [192.168.1.6] (localhost [127.0.0.1]) by fs.denninger.net (8.14.9/8.14.8) with ESMTP id t2ULg7Po062164 for ; Mon, 30 Mar 2015 16:42:09 -0500 (CDT) (envelope-from karl@denninger.net) Received: from [192.168.1.6] (TLS/SSL) [173.65.72.120] by Spamblock-sys (LOCAL/AUTH); Mon Mar 30 16:42:09 2015 Message-ID: <5519C329.3090001@denninger.net> Date: Mon, 30 Mar 2015 16:42:01 -0500 From: Karl Denninger User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: Re: All available memory used when deleting files from ZFS References: In-Reply-To: Content-Type: multipart/signed; protocol="application/pkcs7-signature"; micalg=sha1; boundary="------------ms050403070308090107060506" X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 30 Mar 2015 21:54:10 -0000 This is a cryptographically signed message in MIME format. --------------ms050403070308090107060506 Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable What's the UMA memory use look like on that machine when the remove is initiated and progresses? Look with vmstat -z and see what the used and free counts look like for the zio allocations...... On 3/30/2015 4:14 PM, Dustin Wenz wrote: > I had several systems panic or hang over the weekend while deleting som= e data off of their local zfs filesystem. It looks like they ran out of p= hysical memory (32GB), and hung when paging to swap-on-zfs (which is not = surprising, given that ZFS was likely using the memory). They were runnin= g 10.1-STABLE r277139M, which I built in the middle of January. The pools= were about 35TB in size, and are a concatenation of 3TB mirrors. They we= re maybe 95% full. I deleted just over 1000 files, totaling 25TB on each = system. > > It took roughly 10 minutes to remove that 25TB of data per host using a= remote rsync, and immediately after that everything seemed fine. However= , after several more minutes, every machine that had data removed became = unresponsive. Some had numerous "swap_pager: indefinite wait buffer" erro= rs followed by a panic, and some just died with no console messages. The = same thing would happen after a reboot, when FreeBSD attempted to mount t= he local filesystem again. > > I was able to boot these systems after exporting the affected pool, but= the problem would recur several minutes after initiating a "zpool import= ". Watching zfs statistics didn't seem to reveal where the memory was goi= ng; ARC would only climb to about 4GB, but free memory would decline rapi= dly. Eventually, after enough export/reboot/import cycles, the pool would= import successfully and everything would be fine from then on. Note that= there is no L2ARC or compression being used. > > Has anyone else run into this when deleting files on ZFS? It seems to b= e a consistent problem under the versions of 10.1 I'm running. > > For reference, I've appended a zstat dump below that was taken 5 minute= s after starting a zpool import, and was about three minutes before the m= achine became unresponsive. You can see that the ARC is only 4GB, but fre= e memory was down to 471MB (and continued to drop). > > - .Dustin > > > -----------------------------------------------------------------------= - > ZFS Subsystem Report Mon Mar 30 12:35:27 2015 > -----------------------------------------------------------------------= - > > System Information: > > Kernel Version: 1001506 (osreldate) > Hardware Platform: amd64 > Processor Architecture: amd64 > > ZFS Storage pool Version: 5000 > ZFS Filesystem Version: 5 > > FreeBSD 10.1-STABLE #11 r277139M: Tue Jan 13 14:59:55 CST 2015 root > 12:35PM up 8 mins, 3 users, load averages: 7.23, 8.96, 4.87 > > -----------------------------------------------------------------------= - > > System Memory: > > 0.17% 55.40 MiB Active, 0.14% 46.11 MiB Inact > 98.34% 30.56 GiB Wired, 0.00% 0 Cache > 1.34% 425.46 MiB Free, 0.00% 4.00 KiB Gap > > Real Installed: 32.00 GiB > Real Available: 99.82% 31.94 GiB > Real Managed: 97.29% 31.08 GiB > > Logical Total: 32.00 GiB > Logical Used: 98.56% 31.54 GiB > Logical Free: 1.44% 471.57 MiB > > Kernel Memory: 3.17 GiB > Data: 99.18% 3.14 GiB > Text: 0.82% 26.68 MiB > > Kernel Memory Map: 31.08 GiB > Size: 14.18% 4.41 GiB > Free: 85.82% 26.67 GiB > > -----------------------------------------------------------------------= - > > ARC Summary: (HEALTHY) > Memory Throttle Count: 0 > > ARC Misc: > Deleted: 145 > Recycle Misses: 0 > Mutex Misses: 0 > Evict Skips: 0 > > ARC Size: 14.17% 4.26 GiB > Target Size: (Adaptive) 100.00% 30.08 GiB > Min Size (Hard Limit): 12.50% 3.76 GiB > Max Size (High Water): 8:1 30.08 GiB > > ARC Size Breakdown: > Recently Used Cache Size: 50.00% 15.04 GiB > Frequently Used Cache Size: 50.00% 15.04 GiB > > ARC Hash Breakdown: > Elements Max: 270.56k > Elements Current: 100.00% 270.56k > Collisions: 23.66k > Chain Max: 3 > Chains: 8.28k > > -----------------------------------------------------------------------= - > > ARC Efficiency: 2.93m > Cache Hit Ratio: 70.44% 2.06m > Cache Miss Ratio: 29.56% 866.05k > Actual Hit Ratio: 70.40% 2.06m > > Data Demand Efficiency: 97.47% 24.58k > Data Prefetch Efficiency: 1.88% 479 > > CACHE HITS BY CACHE LIST: > Anonymously Used: 0.05% 1.07k > Most Recently Used: 71.82% 1.48m > Most Frequently Used: 28.13% 580.49k > Most Recently Used Ghost: 0.00% 0 > Most Frequently Used Ghost: 0.00% 0 > > CACHE HITS BY DATA TYPE: > Demand Data: 1.16% 23.96k > Prefetch Data: 0.00% 9 > Demand Metadata: 98.79% 2.04m > Prefetch Metadata: 0.05% 1.08k > > CACHE MISSES BY DATA TYPE: > Demand Data: 0.07% 621 > Prefetch Data: 0.05% 470 > Demand Metadata: 99.69% 863.35k > Prefetch Metadata: 0.19% 1.61k > > -----------------------------------------------------------------------= - > > L2ARC is disabled > > -----------------------------------------------------------------------= - > > File-Level Prefetch: (HEALTHY) > > DMU Efficiency: 72.95k > Hit Ratio: 70.83% 51.66k > Miss Ratio: 29.17% 21.28k > > Colinear: 21.28k > Hit Ratio: 0.01% 2 > Miss Ratio: 99.99% 21.28k > > Stride: 50.45k > Hit Ratio: 99.98% 50.44k > Miss Ratio: 0.02% 9 > > DMU Misc: > Reclaim: 21.28k > Successes: 1.73% 368 > Failures: 98.27% 20.91k > > Streams: 1.23k > +Resets: 0.16% 2 > -Resets: 99.84% 1.23k > Bogus: 0 > > -----------------------------------------------------------------------= - > > VDEV cache is disabled > > -----------------------------------------------------------------------= - > > ZFS Tunables (sysctl): > kern.maxusers 2380 > vm.kmem_size 33367830528 > vm.kmem_size_scale 1 > vm.kmem_size_min 0 > vm.kmem_size_max 1319413950874 > vfs.zfs.arc_max 32294088704 > vfs.zfs.arc_min 4036761088 > vfs.zfs.arc_average_blocksize 8192 > vfs.zfs.arc_shrink_shift 5 > vfs.zfs.arc_free_target 56518 > vfs.zfs.arc_meta_used 4534349216 > vfs.zfs.arc_meta_limit 8073522176 > vfs.zfs.l2arc_write_max 8388608 > vfs.zfs.l2arc_write_boost 8388608 > vfs.zfs.l2arc_headroom 2 > vfs.zfs.l2arc_feed_secs 1 > vfs.zfs.l2arc_feed_min_ms 200 > vfs.zfs.l2arc_noprefetch 1 > vfs.zfs.l2arc_feed_again 1 > vfs.zfs.l2arc_norw 1 > vfs.zfs.anon_size 1786368 > vfs.zfs.anon_metadata_lsize 0 > vfs.zfs.anon_data_lsize 0 > vfs.zfs.mru_size 504812032 > vfs.zfs.mru_metadata_lsize 415273472 > vfs.zfs.mru_data_lsize 35227648 > vfs.zfs.mru_ghost_size 0 > vfs.zfs.mru_ghost_metadata_lsize 0 > vfs.zfs.mru_ghost_data_lsize 0 > vfs.zfs.mfu_size 3925990912 > vfs.zfs.mfu_metadata_lsize 3901947392 > vfs.zfs.mfu_data_lsize 7000064 > vfs.zfs.mfu_ghost_size 0 > vfs.zfs.mfu_ghost_metadata_lsize 0 > vfs.zfs.mfu_ghost_data_lsize 0 > vfs.zfs.l2c_only_size 0 > vfs.zfs.dedup.prefetch 1 > vfs.zfs.nopwrite_enabled 1 > vfs.zfs.mdcomp_disable 0 > vfs.zfs.max_recordsize 1048576 > vfs.zfs.dirty_data_max 3429735628 > vfs.zfs.dirty_data_max_max 4294967296 > vfs.zfs.dirty_data_max_percent 10 > vfs.zfs.dirty_data_sync 67108864 > vfs.zfs.delay_min_dirty_percent 60 > vfs.zfs.delay_scale 500000 > vfs.zfs.prefetch_disable 0 > vfs.zfs.zfetch.max_streams 8 > vfs.zfs.zfetch.min_sec_reap 2 > vfs.zfs.zfetch.block_cap 256 > vfs.zfs.zfetch.array_rd_sz 1048576 > vfs.zfs.top_maxinflight 32 > vfs.zfs.resilver_delay 2 > vfs.zfs.scrub_delay 4 > vfs.zfs.scan_idle 50 > vfs.zfs.scan_min_time_ms 1000 > vfs.zfs.free_min_time_ms 1000 > vfs.zfs.resilver_min_time_ms 3000 > vfs.zfs.no_scrub_io 0 > vfs.zfs.no_scrub_prefetch 0 > vfs.zfs.free_max_blocks -1 > vfs.zfs.metaslab.gang_bang 16777217 > vfs.zfs.metaslab.fragmentation_threshold70 > vfs.zfs.metaslab.debug_load 0 > vfs.zfs.metaslab.debug_unload 0 > vfs.zfs.metaslab.df_alloc_threshold 131072 > vfs.zfs.metaslab.df_free_pct 4 > vfs.zfs.metaslab.min_alloc_size 33554432 > vfs.zfs.metaslab.load_pct 50 > vfs.zfs.metaslab.unload_delay 8 > vfs.zfs.metaslab.preload_limit 3 > vfs.zfs.metaslab.preload_enabled 1 > vfs.zfs.metaslab.fragmentation_factor_enabled1 > vfs.zfs.metaslab.lba_weighting_enabled 1 > vfs.zfs.metaslab.bias_enabled 1 > vfs.zfs.condense_pct 200 > vfs.zfs.mg_noalloc_threshold 0 > vfs.zfs.mg_fragmentation_threshold 85 > vfs.zfs.check_hostid 1 > vfs.zfs.spa_load_verify_maxinflight 10000 > vfs.zfs.spa_load_verify_metadata 1 > vfs.zfs.spa_load_verify_data 1 > vfs.zfs.recover 0 > vfs.zfs.deadman_synctime_ms 1000000 > vfs.zfs.deadman_checktime_ms 5000 > vfs.zfs.deadman_enabled 1 > vfs.zfs.spa_asize_inflation 24 > vfs.zfs.spa_slop_shift 5 > vfs.zfs.space_map_blksz 4096 > vfs.zfs.txg.timeout 5 > vfs.zfs.vdev.metaslabs_per_vdev 200 > vfs.zfs.vdev.cache.max 16384 > vfs.zfs.vdev.cache.size 0 > vfs.zfs.vdev.cache.bshift 16 > vfs.zfs.vdev.trim_on_init 1 > vfs.zfs.vdev.mirror.rotating_inc 0 > vfs.zfs.vdev.mirror.rotating_seek_inc 5 > vfs.zfs.vdev.mirror.rotating_seek_offset1048576 > vfs.zfs.vdev.mirror.non_rotating_inc 0 > vfs.zfs.vdev.mirror.non_rotating_seek_inc1 > vfs.zfs.vdev.async_write_active_min_dirty_percent30 > vfs.zfs.vdev.async_write_active_max_dirty_percent60 > vfs.zfs.vdev.max_active 1000 > vfs.zfs.vdev.sync_read_min_active 10 > vfs.zfs.vdev.sync_read_max_active 10 > vfs.zfs.vdev.sync_write_min_active 10 > vfs.zfs.vdev.sync_write_max_active 10 > vfs.zfs.vdev.async_read_min_active 1 > vfs.zfs.vdev.async_read_max_active 3 > vfs.zfs.vdev.async_write_min_active 1 > vfs.zfs.vdev.async_write_max_active 10 > vfs.zfs.vdev.scrub_min_active 1 > vfs.zfs.vdev.scrub_max_active 2 > vfs.zfs.vdev.trim_min_active 1 > vfs.zfs.vdev.trim_max_active 64 > vfs.zfs.vdev.aggregation_limit 131072 > vfs.zfs.vdev.read_gap_limit 32768 > vfs.zfs.vdev.write_gap_limit 4096 > vfs.zfs.vdev.bio_flush_disable 0 > vfs.zfs.vdev.bio_delete_disable 0 > vfs.zfs.vdev.trim_max_bytes 2147483648 > vfs.zfs.vdev.trim_max_pending 64 > vfs.zfs.max_auto_ashift 13 > vfs.zfs.min_auto_ashift 9 > vfs.zfs.zil_replay_disable 0 > vfs.zfs.cache_flush_disable 0 > vfs.zfs.zio.use_uma 1 > vfs.zfs.zio.exclude_metadata 0 > vfs.zfs.sync_pass_deferred_free 2 > vfs.zfs.sync_pass_dont_compress 5 > vfs.zfs.sync_pass_rewrite 2 > vfs.zfs.snapshot_list_prefetch 0 > vfs.zfs.super_owner 0 > vfs.zfs.debug 0 > vfs.zfs.version.ioctl 4 > vfs.zfs.version.acl 1 > vfs.zfs.version.spa 5000 > vfs.zfs.version.zpl 5 > vfs.zfs.vol.mode 1 > vfs.zfs.vol.unmap_enabled 1 > vfs.zfs.trim.enabled 1 > vfs.zfs.trim.txg_delay 32 > vfs.zfs.trim.timeout 30 > vfs.zfs.trim.max_interval 1 > > -----------------------------------------------------------------------= - > > > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > > > %SPAMBLOCK-SYS: Matched [@freebsd.org+], message ok --=20 Karl Denninger karl@denninger.net /The Market Ticker/ --------------ms050403070308090107060506 Content-Type: application/pkcs7-signature; name="smime.p7s" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="smime.p7s" Content-Description: S/MIME Cryptographic Signature MIAGCSqGSIb3DQEHAqCAMIACAQExCzAJBgUrDgMCGgUAMIAGCSqGSIb3DQEHAQAAoIIGWTCC BlUwggQ9oAMCAQICARowDQYJKoZIhvcNAQELBQAwgZAxCzAJBgNVBAYTAlVTMRAwDgYDVQQI EwdGbG9yaWRhMRIwEAYDVQQHEwlOaWNldmlsbGUxGTAXBgNVBAoTEEN1ZGEgU3lzdGVtcyBM TEMxHDAaBgNVBAMTE0N1ZGEgU3lzdGVtcyBMTEMgQ0ExIjAgBgkqhkiG9w0BCQEWE0N1ZGEg U3lzdGVtcyBMTEMgQ0EwHhcNMTUwMzI1MTMxMDIwWhcNMjAwMzIzMTMxMDIwWjBTMQswCQYD VQQGEwJVUzEQMA4GA1UECBMHRmxvcmlkYTEZMBcGA1UEChMQQ3VkYSBTeXN0ZW1zIExMQzEX MBUGA1UEAxMOS2FybCBEZW5uaW5nZXIwggIiMA0GCSqGSIb3DQEBAQUAA4ICDwAwggIKAoIC AQCmk+jIznE3HgHbh4JU2s86dKGDs4f3ZdED6vCQx9+LnJl7GgT2aAUAARqNnH5dDuC4w/4h K1qb8sXu3yYWlXLLs+vw3oLnx284o0kSZZs/FQ9W90gVTeZ1iTybscN7iXkaf83g1jueBNby n4v1bJEwX/xe94NW0IwBPluOzzXVIMskaZWhqGLtSaiSo4PYUnYMXPRNG7NAWQ2VAZXJkIM2 AM0B3LfyTyZw+NDNJMMQLZBDqS5vHuS78UODXpyyliSsBgaa04KVRsrcz6S2aYxk9ZjU3yD2 JJ7ezlKnZ4j/pc+16rv5fPfJWZAmG3v3kMiMzoDMS+d6CsSYxyQYHDGt+2If0cGpFv3D7Xr6 jxHouLKtipMQ/pPd+T+lugdEj3JfRu0nIM38j+dQh1N+wdiCEgFo0XuPIWW9g7VGwk8n29KR LAT10QZH9ADYbQqwXeXe9xWXjMAXHm6NTXyxpYyuNAV5zwsT5N4fZRwxKn828XZAKLLyeDGg 2lSBKdnT3osk668Yi5hclZH3UX8JikOWzixQ7T1/lWYGAGbElwFUC3xKRv/TI8E6ZYYYbQVN 3JXLKNIfQ7I9fpqrQeMVe03zXGGsXcE1krA8M4VP1ipoDfGD0/Pt8k2mTLUc4PY9eKZJfVlY WWJ/RHPp+N+MFD1sKirYrDvCHaeWyyLDx2dcIwIDAQABo4H1MIHyMAkGA1UdEwQCMAAwEQYJ YIZIAYb4QgEBBAQDAgWgMAsGA1UdDwQEAwIF4DAsBglghkgBhvhCAQ0EHxYdT3BlblNTTCBH ZW5lcmF0ZWQgQ2VydGlmaWNhdGUwHQYDVR0OBBYEFCMvNiXRuCcqulq6cUWK8SNwo7vhMB8G A1UdIwQYMBaAFCRxm52Fffzd3b2wypKUA6H60201MB0GA1UdEQQWMBSBEmthcmxAZGVubmlu Z2VyLm5ldDA4BglghkgBhvhCAQMEKxYpaHR0cHM6Ly9jdWRhc3lzdGVtcy5uZXQ6MTE0NDMv cmV2b2tlZC5jcmwwDQYJKoZIhvcNAQELBQADggIBAHAUwvyHIJw3LTLqpF4apSzuIm5sBqyH rYg1mk7vPkgPFSrsr3AmmtR2iifN7fgAG6NzrL7SddhTiIMbW7mL32Tuklx9sUXM6iEyuiL/ /TRZ95ob7BtM58x2R6y2p00OKOfUCmjyqWy/pAjUAk7c5m9uLr6rVQUj0lGuvCMZEo1lnG6S +EUZ1Mi2mz9HrZBR2GhNPb5UgNVsX91So+uEF+1pRg1mQO6KvX4E84MOPe++qM76o+NvlEIw IU9tYHjSgjqWrqUQgEesMjahWEblfT+XPvPwy9WtICESMQGdGzVgDBgwoFrFnS2GyKlve0rj LKBs5ZtMrsASnbSvWX5uYy6Fb0Gv/F2neStmAyxL6Kupu4D28QpbtG1nxl3pN3SyQiUfXvVm LC3JvS6r6eQsG8Q/6fxCvUNQg8AjvVSMYTspAm3O0rihPQWX1GTAWS9fIxYlo5Y3NY8SpTXD 7RU5QzbQnK/mMJkuuysQeFAmK58El/7GcUy4zt2akuTBD0YroH2FHjfNUJej0lwHgxkNU3zG quKn+Llw1/u/+cncRiVPVatbqhUtXk2a0Y6OKrcAmFwzXlAi//hzofp3Sd1sWW0SUKQIizl7 xSxyu0cnYbxiBLDn3bmCTYCowUHLm6vBDc+3l6jxOM5fWOdJwq4hakYmrGonoI0pRTG732P9 Jf5/MYIE4zCCBN8CAQEwgZYwgZAxCzAJBgNVBAYTAlVTMRAwDgYDVQQIEwdGbG9yaWRhMRIw EAYDVQQHEwlOaWNldmlsbGUxGTAXBgNVBAoTEEN1ZGEgU3lzdGVtcyBMTEMxHDAaBgNVBAMT E0N1ZGEgU3lzdGVtcyBMTEMgQ0ExIjAgBgkqhkiG9w0BCQEWE0N1ZGEgU3lzdGVtcyBMTEMg Q0ECARowCQYFKw4DAhoFAKCCAiEwGAYJKoZIhvcNAQkDMQsGCSqGSIb3DQEHATAcBgkqhkiG 9w0BCQUxDxcNMTUwMzMwMjE0MjAxWjAjBgkqhkiG9w0BCQQxFgQUCm+PSlW8t+tBDc4dtCRN Vah+jMswbAYJKoZIhvcNAQkPMV8wXTALBglghkgBZQMEASowCwYJYIZIAWUDBAECMAoGCCqG SIb3DQMHMA4GCCqGSIb3DQMCAgIAgDANBggqhkiG9w0DAgIBQDAHBgUrDgMCBzANBggqhkiG 9w0DAgIBKDCBpwYJKwYBBAGCNxAEMYGZMIGWMIGQMQswCQYDVQQGEwJVUzEQMA4GA1UECBMH RmxvcmlkYTESMBAGA1UEBxMJTmljZXZpbGxlMRkwFwYDVQQKExBDdWRhIFN5c3RlbXMgTExD MRwwGgYDVQQDExNDdWRhIFN5c3RlbXMgTExDIENBMSIwIAYJKoZIhvcNAQkBFhNDdWRhIFN5 c3RlbXMgTExDIENBAgEaMIGpBgsqhkiG9w0BCRACCzGBmaCBljCBkDELMAkGA1UEBhMCVVMx EDAOBgNVBAgTB0Zsb3JpZGExEjAQBgNVBAcTCU5pY2V2aWxsZTEZMBcGA1UEChMQQ3VkYSBT eXN0ZW1zIExMQzEcMBoGA1UEAxMTQ3VkYSBTeXN0ZW1zIExMQyBDQTEiMCAGCSqGSIb3DQEJ ARYTQ3VkYSBTeXN0ZW1zIExMQyBDQQIBGjANBgkqhkiG9w0BAQEFAASCAgAe9x49JLhjZR+a FHJW+ic9KqcBMnQ/5LOk5eJ30QZf6Fs350241F9UahQ5m08pF0INpIHCGJnFu6iHUfPYfVEO h7nz237KJ9DpqtSlG/y2lSkraRzMfbpYXk5cNhc9suRbC6ezOBakbsAAxkPDUYKDgztqDA20 8MALBaGgiJE5Mk51QGNSinPkZcbibVZ6UyTqL4QdphevbIZyl1aFhNujYeSnO0JA3YW3pWdc +L7AtxU5KVLWjDZvklssYX7Hmieepfn/EpQKK7m/jKmut3a5F/szhs6K2Uyda9Yuvl1QOikF xumoEz3syKY2ibiGkmmwaB6+RLdpUXD6Lv7BqIzwrPQiLa4J4cmsbfh17VBuNJFhAaPkFABJ gvibagPw8gGjmiDat4U78XPtkn6Yrsl3WFX/HJjFIQBtgrWZeyh6/D06ROVoWAVr882Ff15z qpGtC4dE0+fiSsNNb4SyRb+Vcr3N0UG+yWCpLgJlSxpO9zL/RxbbJsd3sbe+e8MeqX7duPUy G7Jdulp9vSQdWbNLKQxfL6QEAuD4AgIbwj1fq5EZLZm+GfqjPgFxJS1LCadVAztjtWXPqoGa DhGumTQLPzaSRavs0FU5xj4FktRPomqyEY4H6wkKWo5OnZXhiCDc1yPBf1ptk86/6nthwWo5 KEFAnppdOAVuiOr/jPwd9AAAAAAAAA== --------------ms050403070308090107060506-- From owner-freebsd-fs@FreeBSD.ORG Mon Mar 30 23:31:01 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id C1E166A0 for ; Mon, 30 Mar 2015 23:31:01 +0000 (UTC) Received: from internet06.ebureau.com (internet06.ebureau.com [65.127.24.25]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "internet06.ebureau.com", Issuer "internet06.ebureau.com" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 97775ABF for ; Mon, 30 Mar 2015 23:31:01 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by internet06.ebureau.com (Postfix) with ESMTP id 78C9037F683E; Mon, 30 Mar 2015 18:30:59 -0500 (CDT) X-Virus-Scanned: amavisd-new at ebureau.com Received: from internet06.ebureau.com ([127.0.0.1]) by localhost (internet06.ebureau.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id oPBFpzlTG1W0; Mon, 30 Mar 2015 18:30:58 -0500 (CDT) Received: from square.office.ebureau.com (unknown [10.10.20.22]) by internet06.ebureau.com (Postfix) with ESMTPSA id 1773A37F680D; Mon, 30 Mar 2015 18:30:58 -0500 (CDT) Content-Type: text/plain; charset=windows-1252 Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2070.6\)) Subject: Re: All available memory used when deleting files from ZFS From: Dustin Wenz In-Reply-To: <5519C329.3090001@denninger.net> Date: Mon, 30 Mar 2015 18:30:57 -0500 Content-Transfer-Encoding: quoted-printable Message-Id: <923828D6-503B-4FC3-89E8-1DC6DF0C9B6B@ebureau.com> References: <5519C329.3090001@denninger.net> To: Karl Denninger X-Mailer: Apple Mail (2.2070.6) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 30 Mar 2015 23:31:01 -0000 Unfortunately, I just spent the day recovering from this, so I have no = way to easily get new memory stats now. I'm planning on doing a test = with additional data in an effort to understand more about the issue, = but it will take time to set something up. In the meantime, I'd advise anyone running ZFS on FreeBSD 10.x to be = mindful when freeing up lots of space all at once. - .Dustin > On Mar 30, 2015, at 4:42 PM, Karl Denninger = wrote: >=20 > What's the UMA memory use look like on that machine when the remove is > initiated and progresses? Look with vmstat -z and see what the used = and > free counts look like for the zio allocations...... >=20 > On 3/30/2015 4:14 PM, Dustin Wenz wrote: >> I had several systems panic or hang over the weekend while deleting = some data off of their local zfs filesystem. It looks like they ran out = of physical memory (32GB), and hung when paging to swap-on-zfs (which is = not surprising, given that ZFS was likely using the memory). They were = running 10.1-STABLE r277139M, which I built in the middle of January. = The pools were about 35TB in size, and are a concatenation of 3TB = mirrors. They were maybe 95% full. I deleted just over 1000 files, = totaling 25TB on each system. >>=20 >> It took roughly 10 minutes to remove that 25TB of data per host using = a remote rsync, and immediately after that everything seemed fine. = However, after several more minutes, every machine that had data removed = became unresponsive. Some had numerous "swap_pager: indefinite wait = buffer" errors followed by a panic, and some just died with no console = messages. The same thing would happen after a reboot, when FreeBSD = attempted to mount the local filesystem again. >>=20 >> I was able to boot these systems after exporting the affected pool, = but the problem would recur several minutes after initiating a "zpool = import". Watching zfs statistics didn't seem to reveal where the memory = was going; ARC would only climb to about 4GB, but free memory would = decline rapidly. Eventually, after enough export/reboot/import cycles, = the pool would import successfully and everything would be fine from = then on. Note that there is no L2ARC or compression being used. >>=20 >> Has anyone else run into this when deleting files on ZFS? It seems to = be a consistent problem under the versions of 10.1 I'm running. >>=20 >> For reference, I've appended a zstat dump below that was taken 5 = minutes after starting a zpool import, and was about three minutes = before the machine became unresponsive. You can see that the ARC is only = 4GB, but free memory was down to 471MB (and continued to drop). >>=20 >> - .Dustin >>=20 >>=20 >> = ------------------------------------------------------------------------ >> ZFS Subsystem Report Mon Mar 30 12:35:27 2015 >> = ------------------------------------------------------------------------ >>=20 >> System Information: >>=20 >> Kernel Version: 1001506 (osreldate) >> Hardware Platform: amd64 >> Processor Architecture: amd64 >>=20 >> ZFS Storage pool Version: 5000 >> ZFS Filesystem Version: 5 >>=20 >> FreeBSD 10.1-STABLE #11 r277139M: Tue Jan 13 14:59:55 CST 2015 root >> 12:35PM up 8 mins, 3 users, load averages: 7.23, 8.96, 4.87 >>=20 >> = ------------------------------------------------------------------------ >>=20 >> System Memory: >>=20 >> 0.17% 55.40 MiB Active, 0.14% 46.11 MiB Inact >> 98.34% 30.56 GiB Wired, 0.00% 0 Cache >> 1.34% 425.46 MiB Free, 0.00% 4.00 KiB Gap >>=20 >> Real Installed: 32.00 GiB >> Real Available: 99.82% 31.94 GiB >> Real Managed: 97.29% 31.08 GiB >>=20 >> Logical Total: 32.00 GiB >> Logical Used: 98.56% 31.54 GiB >> Logical Free: 1.44% 471.57 MiB >>=20 >> Kernel Memory: 3.17 GiB >> Data: 99.18% 3.14 GiB >> Text: 0.82% 26.68 MiB >>=20 >> Kernel Memory Map: 31.08 GiB >> Size: 14.18% 4.41 GiB >> Free: 85.82% 26.67 GiB >>=20 >> = ------------------------------------------------------------------------ >>=20 >> ARC Summary: (HEALTHY) >> Memory Throttle Count: 0 >>=20 >> ARC Misc: >> Deleted: 145 >> Recycle Misses: 0 >> Mutex Misses: 0 >> Evict Skips: 0 >>=20 >> ARC Size: 14.17% 4.26 GiB >> Target Size: (Adaptive) 100.00% 30.08 GiB >> Min Size (Hard Limit): 12.50% 3.76 GiB >> Max Size (High Water): 8:1 30.08 GiB >>=20 >> ARC Size Breakdown: >> Recently Used Cache Size: 50.00% 15.04 GiB >> Frequently Used Cache Size: 50.00% 15.04 GiB >>=20 >> ARC Hash Breakdown: >> Elements Max: 270.56k >> Elements Current: 100.00% 270.56k >> Collisions: 23.66k >> Chain Max: 3 >> Chains: 8.28k >>=20 >> = ------------------------------------------------------------------------ >>=20 >> ARC Efficiency: 2.93m >> Cache Hit Ratio: 70.44% 2.06m >> Cache Miss Ratio: 29.56% 866.05k >> Actual Hit Ratio: 70.40% 2.06m >>=20 >> Data Demand Efficiency: 97.47% 24.58k >> Data Prefetch Efficiency: 1.88% 479 >>=20 >> CACHE HITS BY CACHE LIST: >> Anonymously Used: 0.05% 1.07k >> Most Recently Used: 71.82% 1.48m >> Most Frequently Used: 28.13% 580.49k >> Most Recently Used Ghost: 0.00% 0 >> Most Frequently Used Ghost: 0.00% 0 >>=20 >> CACHE HITS BY DATA TYPE: >> Demand Data: 1.16% 23.96k >> Prefetch Data: 0.00% 9 >> Demand Metadata: 98.79% 2.04m >> Prefetch Metadata: 0.05% 1.08k >>=20 >> CACHE MISSES BY DATA TYPE: >> Demand Data: 0.07% 621 >> Prefetch Data: 0.05% 470 >> Demand Metadata: 99.69% 863.35k >> Prefetch Metadata: 0.19% 1.61k >>=20 >> = ------------------------------------------------------------------------ >>=20 >> L2ARC is disabled >>=20 >> = ------------------------------------------------------------------------ >>=20 >> File-Level Prefetch: (HEALTHY) >>=20 >> DMU Efficiency: 72.95k >> Hit Ratio: 70.83% 51.66k >> Miss Ratio: 29.17% 21.28k >>=20 >> Colinear: 21.28k >> Hit Ratio: 0.01% 2 >> Miss Ratio: 99.99% 21.28k >>=20 >> Stride: 50.45k >> Hit Ratio: 99.98% 50.44k >> Miss Ratio: 0.02% 9 >>=20 >> DMU Misc: >> Reclaim: 21.28k >> Successes: 1.73% 368 >> Failures: 98.27% 20.91k >>=20 >> Streams: 1.23k >> +Resets: 0.16% 2 >> -Resets: 99.84% 1.23k >> Bogus: 0 >>=20 >> = ------------------------------------------------------------------------ >>=20 >> VDEV cache is disabled >>=20 >> = ------------------------------------------------------------------------ >>=20 >> ZFS Tunables (sysctl): >> kern.maxusers 2380 >> vm.kmem_size 33367830528 >> vm.kmem_size_scale 1 >> vm.kmem_size_min 0 >> vm.kmem_size_max 1319413950874 >> vfs.zfs.arc_max 32294088704 >> vfs.zfs.arc_min 4036761088 >> vfs.zfs.arc_average_blocksize 8192 >> vfs.zfs.arc_shrink_shift 5 >> vfs.zfs.arc_free_target 56518 >> vfs.zfs.arc_meta_used 4534349216 >> vfs.zfs.arc_meta_limit 8073522176 >> vfs.zfs.l2arc_write_max 8388608 >> vfs.zfs.l2arc_write_boost 8388608 >> vfs.zfs.l2arc_headroom 2 >> vfs.zfs.l2arc_feed_secs 1 >> vfs.zfs.l2arc_feed_min_ms 200 >> vfs.zfs.l2arc_noprefetch 1 >> vfs.zfs.l2arc_feed_again 1 >> vfs.zfs.l2arc_norw 1 >> vfs.zfs.anon_size 1786368 >> vfs.zfs.anon_metadata_lsize 0 >> vfs.zfs.anon_data_lsize 0 >> vfs.zfs.mru_size 504812032 >> vfs.zfs.mru_metadata_lsize 415273472 >> vfs.zfs.mru_data_lsize 35227648 >> vfs.zfs.mru_ghost_size 0 >> vfs.zfs.mru_ghost_metadata_lsize 0 >> vfs.zfs.mru_ghost_data_lsize 0 >> vfs.zfs.mfu_size 3925990912 >> vfs.zfs.mfu_metadata_lsize 3901947392 >> vfs.zfs.mfu_data_lsize 7000064 >> vfs.zfs.mfu_ghost_size 0 >> vfs.zfs.mfu_ghost_metadata_lsize 0 >> vfs.zfs.mfu_ghost_data_lsize 0 >> vfs.zfs.l2c_only_size 0 >> vfs.zfs.dedup.prefetch 1 >> vfs.zfs.nopwrite_enabled 1 >> vfs.zfs.mdcomp_disable 0 >> vfs.zfs.max_recordsize 1048576 >> vfs.zfs.dirty_data_max 3429735628 >> vfs.zfs.dirty_data_max_max 4294967296 >> vfs.zfs.dirty_data_max_percent 10 >> vfs.zfs.dirty_data_sync 67108864 >> vfs.zfs.delay_min_dirty_percent 60 >> vfs.zfs.delay_scale 500000 >> vfs.zfs.prefetch_disable 0 >> vfs.zfs.zfetch.max_streams 8 >> vfs.zfs.zfetch.min_sec_reap 2 >> vfs.zfs.zfetch.block_cap 256 >> vfs.zfs.zfetch.array_rd_sz 1048576 >> vfs.zfs.top_maxinflight 32 >> vfs.zfs.resilver_delay 2 >> vfs.zfs.scrub_delay 4 >> vfs.zfs.scan_idle 50 >> vfs.zfs.scan_min_time_ms 1000 >> vfs.zfs.free_min_time_ms 1000 >> vfs.zfs.resilver_min_time_ms 3000 >> vfs.zfs.no_scrub_io 0 >> vfs.zfs.no_scrub_prefetch 0 >> vfs.zfs.free_max_blocks -1 >> vfs.zfs.metaslab.gang_bang 16777217 >> vfs.zfs.metaslab.fragmentation_threshold70 >> vfs.zfs.metaslab.debug_load 0 >> vfs.zfs.metaslab.debug_unload 0 >> vfs.zfs.metaslab.df_alloc_threshold 131072 >> vfs.zfs.metaslab.df_free_pct 4 >> vfs.zfs.metaslab.min_alloc_size 33554432 >> vfs.zfs.metaslab.load_pct 50 >> vfs.zfs.metaslab.unload_delay 8 >> vfs.zfs.metaslab.preload_limit 3 >> vfs.zfs.metaslab.preload_enabled 1 >> vfs.zfs.metaslab.fragmentation_factor_enabled1 >> vfs.zfs.metaslab.lba_weighting_enabled 1 >> vfs.zfs.metaslab.bias_enabled 1 >> vfs.zfs.condense_pct 200 >> vfs.zfs.mg_noalloc_threshold 0 >> vfs.zfs.mg_fragmentation_threshold 85 >> vfs.zfs.check_hostid 1 >> vfs.zfs.spa_load_verify_maxinflight 10000 >> vfs.zfs.spa_load_verify_metadata 1 >> vfs.zfs.spa_load_verify_data 1 >> vfs.zfs.recover 0 >> vfs.zfs.deadman_synctime_ms 1000000 >> vfs.zfs.deadman_checktime_ms 5000 >> vfs.zfs.deadman_enabled 1 >> vfs.zfs.spa_asize_inflation 24 >> vfs.zfs.spa_slop_shift 5 >> vfs.zfs.space_map_blksz 4096 >> vfs.zfs.txg.timeout 5 >> vfs.zfs.vdev.metaslabs_per_vdev 200 >> vfs.zfs.vdev.cache.max 16384 >> vfs.zfs.vdev.cache.size 0 >> vfs.zfs.vdev.cache.bshift 16 >> vfs.zfs.vdev.trim_on_init 1 >> vfs.zfs.vdev.mirror.rotating_inc 0 >> vfs.zfs.vdev.mirror.rotating_seek_inc 5 >> vfs.zfs.vdev.mirror.rotating_seek_offset1048576 >> vfs.zfs.vdev.mirror.non_rotating_inc 0 >> vfs.zfs.vdev.mirror.non_rotating_seek_inc1 >> vfs.zfs.vdev.async_write_active_min_dirty_percent30 >> vfs.zfs.vdev.async_write_active_max_dirty_percent60 >> vfs.zfs.vdev.max_active 1000 >> vfs.zfs.vdev.sync_read_min_active 10 >> vfs.zfs.vdev.sync_read_max_active 10 >> vfs.zfs.vdev.sync_write_min_active 10 >> vfs.zfs.vdev.sync_write_max_active 10 >> vfs.zfs.vdev.async_read_min_active 1 >> vfs.zfs.vdev.async_read_max_active 3 >> vfs.zfs.vdev.async_write_min_active 1 >> vfs.zfs.vdev.async_write_max_active 10 >> vfs.zfs.vdev.scrub_min_active 1 >> vfs.zfs.vdev.scrub_max_active 2 >> vfs.zfs.vdev.trim_min_active 1 >> vfs.zfs.vdev.trim_max_active 64 >> vfs.zfs.vdev.aggregation_limit 131072 >> vfs.zfs.vdev.read_gap_limit 32768 >> vfs.zfs.vdev.write_gap_limit 4096 >> vfs.zfs.vdev.bio_flush_disable 0 >> vfs.zfs.vdev.bio_delete_disable 0 >> vfs.zfs.vdev.trim_max_bytes 2147483648 >> vfs.zfs.vdev.trim_max_pending 64 >> vfs.zfs.max_auto_ashift 13 >> vfs.zfs.min_auto_ashift 9 >> vfs.zfs.zil_replay_disable 0 >> vfs.zfs.cache_flush_disable 0 >> vfs.zfs.zio.use_uma 1 >> vfs.zfs.zio.exclude_metadata 0 >> vfs.zfs.sync_pass_deferred_free 2 >> vfs.zfs.sync_pass_dont_compress 5 >> vfs.zfs.sync_pass_rewrite 2 >> vfs.zfs.snapshot_list_prefetch 0 >> vfs.zfs.super_owner 0 >> vfs.zfs.debug 0 >> vfs.zfs.version.ioctl 4 >> vfs.zfs.version.acl 1 >> vfs.zfs.version.spa 5000 >> vfs.zfs.version.zpl 5 >> vfs.zfs.vol.mode 1 >> vfs.zfs.vol.unmap_enabled 1 >> vfs.zfs.trim.enabled 1 >> vfs.zfs.trim.txg_delay 32 >> vfs.zfs.trim.timeout 30 >> vfs.zfs.trim.max_interval 1 >>=20 >> = ------------------------------------------------------------------------ >>=20 >>=20 >> _______________________________________________ >> freebsd-fs@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-fs >> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" >>=20 >>=20 >> %SPAMBLOCK-SYS: Matched [@freebsd.org+], message ok >=20 > --=20 > Karl Denninger > karl@denninger.net > /The Market Ticker/ >=20 >=20 From owner-freebsd-fs@FreeBSD.ORG Tue Mar 31 00:07:18 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 4731B25C for ; Tue, 31 Mar 2015 00:07:18 +0000 (UTC) Received: from mail-wi0-f177.google.com (mail-wi0-f177.google.com [209.85.212.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id CA829E45 for ; Tue, 31 Mar 2015 00:07:17 +0000 (UTC) Received: by wicne17 with SMTP id ne17so4322954wic.0 for ; Mon, 30 Mar 2015 17:07:15 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:message-id:date:from:user-agent:mime-version:to :subject:references:in-reply-to:content-type :content-transfer-encoding; bh=S7/V0bf3sYq/CVtP9z8mxZtklPmkLHzIe7J0QGy8gtA=; b=k2mNEwosVfX0EgJLwVdye+4pbsF1GpBvKutBu8q7C5/sE7CpHNy4/Kb4lT4hazsuwl mziMZX8Ir24lUqIkFRnFuGsVhDX4M/r/3aVkmNcKbL+9WVLMqDFo/qA08ZXky0/yd6qi SIqVvkULp4ePEFMbqf4hzGX6/mgwSxhhb5561fgOP4QS6CakkNoMc3kdDXQV5lRhFMIg Dt3Ll1Ie8mvIHehojDXo0TTRMtW/y6PkzZn4ZFk5kRwDHRPpSyUZ0zsHYpvWqVg8se5B WUxkvRecmdycnkmA6phKAlMHcMTlUb7PjSN+nvhh1NcMPTE0DqkIO8scEexqi80AQRol 9FFg== X-Gm-Message-State: ALoCoQkRpQhyn5FHcJ82i8P1YZEOLGs9+raO32Yxm0gfFfyPMHLapRNJTb1hf73odOcJ2IU6Auzj X-Received: by 10.180.24.233 with SMTP id x9mr478748wif.9.1427760435744; Mon, 30 Mar 2015 17:07:15 -0700 (PDT) Received: from [10.10.1.68] (82-69-141-170.dsl.in-addr.zen.co.uk. [82.69.141.170]) by mx.google.com with ESMTPSA id p9sm17874586wje.12.2015.03.30.17.07.14 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 30 Mar 2015 17:07:14 -0700 (PDT) Message-ID: <5519E53C.4060203@multiplay.co.uk> Date: Tue, 31 Mar 2015 01:07:24 +0100 From: Steven Hartland User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: Re: All available memory used when deleting files from ZFS References: In-Reply-To: Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 31 Mar 2015 00:07:18 -0000 Later versions have vfs.zfs.free_max_blocks which is likely to be the fix your looking for. It was added to head by r271532 and stable/10 by: https://svnweb.freebsd.org/base?view=revision&revision=272665 Description being: Add a new tunable/sysctl, vfs.zfs.free_max_blocks, which can be used to limit how many blocks can be free'ed before a new transaction group is created. The default is no limit (infinite), but we should probably have a lower default, e.g. 100,000. With this limit, we can guard against the case where ZFS could run out of memory when destroying large numbers of blocks in a single transaction group, as the entire DDT needs to be brought into memory. Illumos issue: 5138 add tunable for maximum number of blocks freed in one txg On 30/03/2015 22:14, Dustin Wenz wrote: > I had several systems panic or hang over the weekend while deleting some data off of their local zfs filesystem. It looks like they ran out of physical memory (32GB), and hung when paging to swap-on-zfs (which is not surprising, given that ZFS was likely using the memory). They were running 10.1-STABLE r277139M, which I built in the middle of January. The pools were about 35TB in size, and are a concatenation of 3TB mirrors. They were maybe 95% full. I deleted just over 1000 files, totaling 25TB on each system. > > It took roughly 10 minutes to remove that 25TB of data per host using a remote rsync, and immediately after that everything seemed fine. However, after several more minutes, every machine that had data removed became unresponsive. Some had numerous "swap_pager: indefinite wait buffer" errors followed by a panic, and some just died with no console messages. The same thing would happen after a reboot, when FreeBSD attempted to mount the local filesystem again. > > I was able to boot these systems after exporting the affected pool, but the problem would recur several minutes after initiating a "zpool import". Watching zfs statistics didn't seem to reveal where the memory was going; ARC would only climb to about 4GB, but free memory would decline rapidly. Eventually, after enough export/reboot/import cycles, the pool would import successfully and everything would be fine from then on. Note that there is no L2ARC or compression being used. > > Has anyone else run into this when deleting files on ZFS? It seems to be a consistent problem under the versions of 10.1 I'm running. > > For reference, I've appended a zstat dump below that was taken 5 minutes after starting a zpool import, and was about three minutes before the machine became unresponsive. You can see that the ARC is only 4GB, but free memory was down to 471MB (and continued to drop). > > - .Dustin > > > ------------------------------------------------------------------------ > ZFS Subsystem Report Mon Mar 30 12:35:27 2015 > ------------------------------------------------------------------------ > > System Information: > > Kernel Version: 1001506 (osreldate) > Hardware Platform: amd64 > Processor Architecture: amd64 > > ZFS Storage pool Version: 5000 > ZFS Filesystem Version: 5 > > FreeBSD 10.1-STABLE #11 r277139M: Tue Jan 13 14:59:55 CST 2015 root > 12:35PM up 8 mins, 3 users, load averages: 7.23, 8.96, 4.87 > > ------------------------------------------------------------------------ > > System Memory: > > 0.17% 55.40 MiB Active, 0.14% 46.11 MiB Inact > 98.34% 30.56 GiB Wired, 0.00% 0 Cache > 1.34% 425.46 MiB Free, 0.00% 4.00 KiB Gap > > Real Installed: 32.00 GiB > Real Available: 99.82% 31.94 GiB > Real Managed: 97.29% 31.08 GiB > > Logical Total: 32.00 GiB > Logical Used: 98.56% 31.54 GiB > Logical Free: 1.44% 471.57 MiB > > Kernel Memory: 3.17 GiB > Data: 99.18% 3.14 GiB > Text: 0.82% 26.68 MiB > > Kernel Memory Map: 31.08 GiB > Size: 14.18% 4.41 GiB > Free: 85.82% 26.67 GiB > > ------------------------------------------------------------------------ > > ARC Summary: (HEALTHY) > Memory Throttle Count: 0 > > ARC Misc: > Deleted: 145 > Recycle Misses: 0 > Mutex Misses: 0 > Evict Skips: 0 > > ARC Size: 14.17% 4.26 GiB > Target Size: (Adaptive) 100.00% 30.08 GiB > Min Size (Hard Limit): 12.50% 3.76 GiB > Max Size (High Water): 8:1 30.08 GiB > > ARC Size Breakdown: > Recently Used Cache Size: 50.00% 15.04 GiB > Frequently Used Cache Size: 50.00% 15.04 GiB > > ARC Hash Breakdown: > Elements Max: 270.56k > Elements Current: 100.00% 270.56k > Collisions: 23.66k > Chain Max: 3 > Chains: 8.28k > > ------------------------------------------------------------------------ > > ARC Efficiency: 2.93m > Cache Hit Ratio: 70.44% 2.06m > Cache Miss Ratio: 29.56% 866.05k > Actual Hit Ratio: 70.40% 2.06m > > Data Demand Efficiency: 97.47% 24.58k > Data Prefetch Efficiency: 1.88% 479 > > CACHE HITS BY CACHE LIST: > Anonymously Used: 0.05% 1.07k > Most Recently Used: 71.82% 1.48m > Most Frequently Used: 28.13% 580.49k > Most Recently Used Ghost: 0.00% 0 > Most Frequently Used Ghost: 0.00% 0 > > CACHE HITS BY DATA TYPE: > Demand Data: 1.16% 23.96k > Prefetch Data: 0.00% 9 > Demand Metadata: 98.79% 2.04m > Prefetch Metadata: 0.05% 1.08k > > CACHE MISSES BY DATA TYPE: > Demand Data: 0.07% 621 > Prefetch Data: 0.05% 470 > Demand Metadata: 99.69% 863.35k > Prefetch Metadata: 0.19% 1.61k > > ------------------------------------------------------------------------ > > L2ARC is disabled > > ------------------------------------------------------------------------ > > File-Level Prefetch: (HEALTHY) > > DMU Efficiency: 72.95k > Hit Ratio: 70.83% 51.66k > Miss Ratio: 29.17% 21.28k > > Colinear: 21.28k > Hit Ratio: 0.01% 2 > Miss Ratio: 99.99% 21.28k > > Stride: 50.45k > Hit Ratio: 99.98% 50.44k > Miss Ratio: 0.02% 9 > > DMU Misc: > Reclaim: 21.28k > Successes: 1.73% 368 > Failures: 98.27% 20.91k > > Streams: 1.23k > +Resets: 0.16% 2 > -Resets: 99.84% 1.23k > Bogus: 0 > > ------------------------------------------------------------------------ > > VDEV cache is disabled > > ------------------------------------------------------------------------ > > ZFS Tunables (sysctl): > kern.maxusers 2380 > vm.kmem_size 33367830528 > vm.kmem_size_scale 1 > vm.kmem_size_min 0 > vm.kmem_size_max 1319413950874 > vfs.zfs.arc_max 32294088704 > vfs.zfs.arc_min 4036761088 > vfs.zfs.arc_average_blocksize 8192 > vfs.zfs.arc_shrink_shift 5 > vfs.zfs.arc_free_target 56518 > vfs.zfs.arc_meta_used 4534349216 > vfs.zfs.arc_meta_limit 8073522176 > vfs.zfs.l2arc_write_max 8388608 > vfs.zfs.l2arc_write_boost 8388608 > vfs.zfs.l2arc_headroom 2 > vfs.zfs.l2arc_feed_secs 1 > vfs.zfs.l2arc_feed_min_ms 200 > vfs.zfs.l2arc_noprefetch 1 > vfs.zfs.l2arc_feed_again 1 > vfs.zfs.l2arc_norw 1 > vfs.zfs.anon_size 1786368 > vfs.zfs.anon_metadata_lsize 0 > vfs.zfs.anon_data_lsize 0 > vfs.zfs.mru_size 504812032 > vfs.zfs.mru_metadata_lsize 415273472 > vfs.zfs.mru_data_lsize 35227648 > vfs.zfs.mru_ghost_size 0 > vfs.zfs.mru_ghost_metadata_lsize 0 > vfs.zfs.mru_ghost_data_lsize 0 > vfs.zfs.mfu_size 3925990912 > vfs.zfs.mfu_metadata_lsize 3901947392 > vfs.zfs.mfu_data_lsize 7000064 > vfs.zfs.mfu_ghost_size 0 > vfs.zfs.mfu_ghost_metadata_lsize 0 > vfs.zfs.mfu_ghost_data_lsize 0 > vfs.zfs.l2c_only_size 0 > vfs.zfs.dedup.prefetch 1 > vfs.zfs.nopwrite_enabled 1 > vfs.zfs.mdcomp_disable 0 > vfs.zfs.max_recordsize 1048576 > vfs.zfs.dirty_data_max 3429735628 > vfs.zfs.dirty_data_max_max 4294967296 > vfs.zfs.dirty_data_max_percent 10 > vfs.zfs.dirty_data_sync 67108864 > vfs.zfs.delay_min_dirty_percent 60 > vfs.zfs.delay_scale 500000 > vfs.zfs.prefetch_disable 0 > vfs.zfs.zfetch.max_streams 8 > vfs.zfs.zfetch.min_sec_reap 2 > vfs.zfs.zfetch.block_cap 256 > vfs.zfs.zfetch.array_rd_sz 1048576 > vfs.zfs.top_maxinflight 32 > vfs.zfs.resilver_delay 2 > vfs.zfs.scrub_delay 4 > vfs.zfs.scan_idle 50 > vfs.zfs.scan_min_time_ms 1000 > vfs.zfs.free_min_time_ms 1000 > vfs.zfs.resilver_min_time_ms 3000 > vfs.zfs.no_scrub_io 0 > vfs.zfs.no_scrub_prefetch 0 > vfs.zfs.free_max_blocks -1 > vfs.zfs.metaslab.gang_bang 16777217 > vfs.zfs.metaslab.fragmentation_threshold70 > vfs.zfs.metaslab.debug_load 0 > vfs.zfs.metaslab.debug_unload 0 > vfs.zfs.metaslab.df_alloc_threshold 131072 > vfs.zfs.metaslab.df_free_pct 4 > vfs.zfs.metaslab.min_alloc_size 33554432 > vfs.zfs.metaslab.load_pct 50 > vfs.zfs.metaslab.unload_delay 8 > vfs.zfs.metaslab.preload_limit 3 > vfs.zfs.metaslab.preload_enabled 1 > vfs.zfs.metaslab.fragmentation_factor_enabled1 > vfs.zfs.metaslab.lba_weighting_enabled 1 > vfs.zfs.metaslab.bias_enabled 1 > vfs.zfs.condense_pct 200 > vfs.zfs.mg_noalloc_threshold 0 > vfs.zfs.mg_fragmentation_threshold 85 > vfs.zfs.check_hostid 1 > vfs.zfs.spa_load_verify_maxinflight 10000 > vfs.zfs.spa_load_verify_metadata 1 > vfs.zfs.spa_load_verify_data 1 > vfs.zfs.recover 0 > vfs.zfs.deadman_synctime_ms 1000000 > vfs.zfs.deadman_checktime_ms 5000 > vfs.zfs.deadman_enabled 1 > vfs.zfs.spa_asize_inflation 24 > vfs.zfs.spa_slop_shift 5 > vfs.zfs.space_map_blksz 4096 > vfs.zfs.txg.timeout 5 > vfs.zfs.vdev.metaslabs_per_vdev 200 > vfs.zfs.vdev.cache.max 16384 > vfs.zfs.vdev.cache.size 0 > vfs.zfs.vdev.cache.bshift 16 > vfs.zfs.vdev.trim_on_init 1 > vfs.zfs.vdev.mirror.rotating_inc 0 > vfs.zfs.vdev.mirror.rotating_seek_inc 5 > vfs.zfs.vdev.mirror.rotating_seek_offset1048576 > vfs.zfs.vdev.mirror.non_rotating_inc 0 > vfs.zfs.vdev.mirror.non_rotating_seek_inc1 > vfs.zfs.vdev.async_write_active_min_dirty_percent30 > vfs.zfs.vdev.async_write_active_max_dirty_percent60 > vfs.zfs.vdev.max_active 1000 > vfs.zfs.vdev.sync_read_min_active 10 > vfs.zfs.vdev.sync_read_max_active 10 > vfs.zfs.vdev.sync_write_min_active 10 > vfs.zfs.vdev.sync_write_max_active 10 > vfs.zfs.vdev.async_read_min_active 1 > vfs.zfs.vdev.async_read_max_active 3 > vfs.zfs.vdev.async_write_min_active 1 > vfs.zfs.vdev.async_write_max_active 10 > vfs.zfs.vdev.scrub_min_active 1 > vfs.zfs.vdev.scrub_max_active 2 > vfs.zfs.vdev.trim_min_active 1 > vfs.zfs.vdev.trim_max_active 64 > vfs.zfs.vdev.aggregation_limit 131072 > vfs.zfs.vdev.read_gap_limit 32768 > vfs.zfs.vdev.write_gap_limit 4096 > vfs.zfs.vdev.bio_flush_disable 0 > vfs.zfs.vdev.bio_delete_disable 0 > vfs.zfs.vdev.trim_max_bytes 2147483648 > vfs.zfs.vdev.trim_max_pending 64 > vfs.zfs.max_auto_ashift 13 > vfs.zfs.min_auto_ashift 9 > vfs.zfs.zil_replay_disable 0 > vfs.zfs.cache_flush_disable 0 > vfs.zfs.zio.use_uma 1 > vfs.zfs.zio.exclude_metadata 0 > vfs.zfs.sync_pass_deferred_free 2 > vfs.zfs.sync_pass_dont_compress 5 > vfs.zfs.sync_pass_rewrite 2 > vfs.zfs.snapshot_list_prefetch 0 > vfs.zfs.super_owner 0 > vfs.zfs.debug 0 > vfs.zfs.version.ioctl 4 > vfs.zfs.version.acl 1 > vfs.zfs.version.spa 5000 > vfs.zfs.version.zpl 5 > vfs.zfs.vol.mode 1 > vfs.zfs.vol.unmap_enabled 1 > vfs.zfs.trim.enabled 1 > vfs.zfs.trim.txg_delay 32 > vfs.zfs.trim.timeout 30 > vfs.zfs.trim.max_interval 1 > > ------------------------------------------------------------------------ > > > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@FreeBSD.ORG Tue Mar 31 01:24:19 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 90C43CAF for ; Tue, 31 Mar 2015 01:24:19 +0000 (UTC) Received: from smtp27.mail.ru (smtp27.mail.ru [94.100.181.182]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 0B85D8AF for ; Tue, 31 Mar 2015 01:24:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mail.ru; s=mail2; h=Content-Transfer-Encoding:Content-Type:In-Reply-To:References:Subject:To:MIME-Version:From:Date:Message-ID; bh=tni1I9gKS/2WvXXoCbdZTOBaRmSY0gkk4bJikUbZhYA=; b=mfFOxtOttg8vaTcWx4qu+tTGU7olZNp4N1ImP0uLUrs3tvcMkJDQLWi9Rqqm7v+16oagcn4IALmxj2ZHyqWc9VzicBEJ8Tabf7K6WeSBguAJrtsHxtNXu9cRYz7bnpvz+Y1Otg+HrsVonlXIG73lt0iSuLbfHtVnC396bHchUK8=; Received: from [109.188.127.13] (port=55067 helo=[192.168.0.12]) by smtp27.mail.ru with esmtpa (envelope-from ) id 1YckuR-0003N5-4x for freebsd-fs@freebsd.org; Tue, 31 Mar 2015 04:24:08 +0300 Message-ID: <5519F74C.1040308@artem.ru> Date: Tue, 31 Mar 2015 04:24:28 +0300 From: Artem Kuchin User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: Re: Little research how rm -rf and tar kill server References: <55170D9C.1070107@artem.ru> <1427727936.293597.247070269.5CE0D411@webmail.messagingengine.com> <55196FC7.8090107@artem.ru> <1427730597.303984.247097389.165D5AAB@webmail.messagingengine.com> <5519716F.6060007@artem.ru> <1427731061.306961.247099633.0A421E90@webmail.messagingengine.com> <5519740A.1070902@artem.ru> <1427731759.309823.247107417.308CD298@webmail.messagingengine.com> In-Reply-To: <1427731759.309823.247107417.308CD298@webmail.messagingengine.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam: Not detected X-Mras: Ok X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 31 Mar 2015 01:24:19 -0000 30.03.2015 19:09, Mark Felder пишет: > > On Mon, Mar 30, 2015, at 11:04, Artem Kuchin wrote: >> 30.03.2015 18:57, Mark Felder пишет: >>> On Mon, Mar 30, 2015, at 10:53, Artem Kuchin wrote: >>>> This is normal state, not under rm -rf >>>> Do you need it during rm -rf ? >>>> >>> No, but I wonder if changing the timer from LAPIC to HPET or possibly >>> one of the other timers makes the system more responsive under that >>> load. Would you mind testing that? >>> >>> You can switch the timer like this: >>> >>> sysctl kern.eventtimer.timer=HPET >>> >>> And then run some of your I/O tests >>> >> I see. I will test at night, when load goes down. >> I cannot say sure that's a right way to dig, but i will test anything :) >> >> Just to remind: untar overloads the system, but untar + sync every 120s >> does not. >> That seems very strange to me. I think the problem might be somewhere >> here. >> > I just heard from mav that there was a bottleneck in gmirror/graid with > regards to BIO_DELETE requests > > https://svnweb.freebsd.org/base?view=revision&revision=280757 > I applied this patch manually and rebuilt the kernel. Hit this bug https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=195458 on reboot, wasted 1 hour fsck-ing 2 times (was dirty after first fsck) and after boot tried doing rm -rf test1 I coult not test anything, because it complete after 1 minute, instead 15 minutes before. I copier the dir 4 times into subdirs and rm -rf full tree (4x larger) - fast and smooth, mariadb did not tonice this, server were working fine. However, i also noticed another thing: cp -Rp test test1 also work a lot faster now, probably 3-5 times faster Maybe it is because fs is free of tons BIO_DELETE from other processes Then i did the untar test at maximum speed ( no pv to limit bandwidth): i see that mysql request became slower, but mysql sql request queue built up slower now. However, when it reached 70 i stopped untar and mariadb could not recover from condition until i executed sync. However, this time sync took only a second. I see big improvement, but i still don't understand why i need to issue sync manually to push everything to recover from overload. # man 2 sync a sync() system call is issued frequently by the user process syncer(4) (about every 30 seconds). it does not seem to be true I checked syncer sysctl # sysctl kern.filedelay kern.filedelay: 30 # sysctl kern.dirdelay kern.dirdelay: 29 # sysctl kern.metadelay kern.metadelay: 28 # ps ax | grep sync 23 - DL 0:03.82 [syncer] no clue why need manual sync By the way: is there way to make sure that SU+J is really working? Maybe it is disabled for some reason and i don't know it. tunefs just shows stored setting, but, for example, with dirty fs, journaling is not working in reality. Any way to get current status of SU journaling? off topic: suggestion to move to ZFA was not so good, i see a "All available memory used when deleting files from ZFS" topic. I'd rather have slow server when i can login and fix than halted on panic. Just to point that ZFS still have plenty of unpredictable issues. Artem From owner-freebsd-fs@FreeBSD.ORG Tue Mar 31 03:52:42 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 2D53E9DB for ; Tue, 31 Mar 2015 03:52:42 +0000 (UTC) Received: from internet06.ebureau.com (internet06.ebureau.com [65.127.24.25]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "internet06.ebureau.com", Issuer "internet06.ebureau.com" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id EDECDB91 for ; Tue, 31 Mar 2015 03:52:41 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by internet06.ebureau.com (Postfix) with ESMTP id 6E61F37F7C87; Mon, 30 Mar 2015 22:52:39 -0500 (CDT) X-Virus-Scanned: amavisd-new at ebureau.com Received: from internet06.ebureau.com ([127.0.0.1]) by localhost (internet06.ebureau.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id k3PhhBmg_Nnl; Mon, 30 Mar 2015 22:52:37 -0500 (CDT) Received: from [10.238.39.191] (unknown [166.175.191.161]) by internet06.ebureau.com (Postfix) with ESMTPSA id 4288B37F7C7A; Mon, 30 Mar 2015 22:52:37 -0500 (CDT) From: Dustin Wenz Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Mime-Version: 1.0 (1.0) Subject: Re: All available memory used when deleting files from ZFS Message-Id: <3FF6F6A0-A23F-4EDE-98F6-5B8E41EC34A1@ebureau.com> Date: Mon, 30 Mar 2015 22:52:31 -0500 References: <5519E53C.4060203@multiplay.co.uk> In-Reply-To: <5519E53C.4060203@multiplay.co.uk> To: Steven Hartland , "" X-Mailer: iPhone Mail (12B411) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 31 Mar 2015 03:52:42 -0000 Thanks, Steven! However, I have not enabled dedup on any of the affected fil= esystems. Unless it became a default at some point, I'm not sure how that tu= nable would help.=20 - .Dustin > On Mar 30, 2015, at 7:07 PM, Steven Hartland wro= te: >=20 > Later versions have vfs.zfs.free_max_blocks which is likely to be the fix y= our looking for. >=20 > It was added to head by r271532 and stable/10 by: > https://svnweb.freebsd.org/base?view=3Drevision&revision=3D272665 >=20 > Description being: >=20 > Add a new tunable/sysctl, vfs.zfs.free_max_blocks, which can be used to > limit how many blocks can be free'ed before a new transaction group is > created. The default is no limit (infinite), but we should probably have > a lower default, e.g. 100,000. >=20 > With this limit, we can guard against the case where ZFS could run out of > memory when destroying large numbers of blocks in a single transaction > group, as the entire DDT needs to be brought into memory. >=20 > Illumos issue: > 5138 add tunable for maximum number of blocks freed in one txg >=20 >=20 >=20 >> On 30/03/2015 22:14, Dustin Wenz wrote: >> I had several systems panic or hang over the weekend while deleting some d= ata off of their local zfs filesystem. It looks like they ran out of physica= l memory (32GB), and hung when paging to swap-on-zfs (which is not surprisin= g, given that ZFS was likely using the memory). They were running 10.1-STABL= E r277139M, which I built in the middle of January. The pools were about 35T= B in size, and are a concatenation of 3TB mirrors. They were maybe 95% full.= I deleted just over 1000 files, totaling 25TB on each system. >>=20 >> It took roughly 10 minutes to remove that 25TB of data per host using a r= emote rsync, and immediately after that everything seemed fine. However, aft= er several more minutes, every machine that had data removed became unrespon= sive. Some had numerous "swap_pager: indefinite wait buffer" errors followed= by a panic, and some just died with no console messages. The same thing wou= ld happen after a reboot, when FreeBSD attempted to mount the local filesyst= em again. >>=20 >> I was able to boot these systems after exporting the affected pool, but t= he problem would recur several minutes after initiating a "zpool import". Wa= tching zfs statistics didn't seem to reveal where the memory was going; ARC w= ould only climb to about 4GB, but free memory would decline rapidly. Eventua= lly, after enough export/reboot/import cycles, the pool would import success= fully and everything would be fine from then on. Note that there is no L2ARC= or compression being used. >>=20 >> Has anyone else run into this when deleting files on ZFS? It seems to be a= consistent problem under the versions of 10.1 I'm running. >>=20 >> For reference, I've appended a zstat dump below that was taken 5 minutes a= fter starting a zpool import, and was about three minutes before the machine= became unresponsive. You can see that the ARC is only 4GB, but free memory w= as down to 471MB (and continued to drop). >>=20 >> - .Dustin >>=20 >>=20 >> ------------------------------------------------------------------------ >> ZFS Subsystem Report Mon Mar 30 12:35:27 2015 >> ------------------------------------------------------------------------ >>=20 >> System Information: >>=20 >> Kernel Version: 1001506 (osreldate) >> Hardware Platform: amd64 >> Processor Architecture: amd64 >>=20 >> ZFS Storage pool Version: 5000 >> ZFS Filesystem Version: 5 >>=20 >> FreeBSD 10.1-STABLE #11 r277139M: Tue Jan 13 14:59:55 CST 2015 root >> 12:35PM up 8 mins, 3 users, load averages: 7.23, 8.96, 4.87 >>=20 >> ------------------------------------------------------------------------ >>=20 >> System Memory: >>=20 >> 0.17% 55.40 MiB Active, 0.14% 46.11 MiB Inact >> 98.34% 30.56 GiB Wired, 0.00% 0 Cache >> 1.34% 425.46 MiB Free, 0.00% 4.00 KiB Gap >>=20 >> Real Installed: 32.00 GiB >> Real Available: 99.82% 31.94 GiB >> Real Managed: 97.29% 31.08 GiB >>=20 >> Logical Total: 32.00 GiB >> Logical Used: 98.56% 31.54 GiB >> Logical Free: 1.44% 471.57 MiB >>=20 >> Kernel Memory: 3.17 GiB >> Data: 99.18% 3.14 GiB >> Text: 0.82% 26.68 MiB >>=20 >> Kernel Memory Map: 31.08 GiB >> Size: 14.18% 4.41 GiB >> Free: 85.82% 26.67 GiB >>=20 >> ------------------------------------------------------------------------ >>=20 >> ARC Summary: (HEALTHY) >> Memory Throttle Count: 0 >>=20 >> ARC Misc: >> Deleted: 145 >> Recycle Misses: 0 >> Mutex Misses: 0 >> Evict Skips: 0 >>=20 >> ARC Size: 14.17% 4.26 GiB >> Target Size: (Adaptive) 100.00% 30.08 GiB >> Min Size (Hard Limit): 12.50% 3.76 GiB >> Max Size (High Water): 8:1 30.08 GiB >>=20 >> ARC Size Breakdown: >> Recently Used Cache Size: 50.00% 15.04 GiB >> Frequently Used Cache Size: 50.00% 15.04 GiB >>=20 >> ARC Hash Breakdown: >> Elements Max: 270.56k >> Elements Current: 100.00% 270.56k >> Collisions: 23.66k >> Chain Max: 3 >> Chains: 8.28k >>=20 >> ------------------------------------------------------------------------ >>=20 >> ARC Efficiency: 2.93m >> Cache Hit Ratio: 70.44% 2.06m >> Cache Miss Ratio: 29.56% 866.05k >> Actual Hit Ratio: 70.40% 2.06m >>=20 >> Data Demand Efficiency: 97.47% 24.58k >> Data Prefetch Efficiency: 1.88% 479 >>=20 >> CACHE HITS BY CACHE LIST: >> Anonymously Used: 0.05% 1.07k >> Most Recently Used: 71.82% 1.48m >> Most Frequently Used: 28.13% 580.49k >> Most Recently Used Ghost: 0.00% 0 >> Most Frequently Used Ghost: 0.00% 0 >>=20 >> CACHE HITS BY DATA TYPE: >> Demand Data: 1.16% 23.96k >> Prefetch Data: 0.00% 9 >> Demand Metadata: 98.79% 2.04m >> Prefetch Metadata: 0.05% 1.08k >>=20 >> CACHE MISSES BY DATA TYPE: >> Demand Data: 0.07% 621 >> Prefetch Data: 0.05% 470 >> Demand Metadata: 99.69% 863.35k >> Prefetch Metadata: 0.19% 1.61k >>=20 >> ------------------------------------------------------------------------ >>=20 >> L2ARC is disabled >>=20 >> ------------------------------------------------------------------------ >>=20 >> File-Level Prefetch: (HEALTHY) >>=20 >> DMU Efficiency: 72.95k >> Hit Ratio: 70.83% 51.66k >> Miss Ratio: 29.17% 21.28k >>=20 >> Colinear: 21.28k >> Hit Ratio: 0.01% 2 >> Miss Ratio: 99.99% 21.28k >>=20 >> Stride: 50.45k >> Hit Ratio: 99.98% 50.44k >> Miss Ratio: 0.02% 9 >>=20 >> DMU Misc: >> Reclaim: 21.28k >> Successes: 1.73% 368 >> Failures: 98.27% 20.91k >>=20 >> Streams: 1.23k >> +Resets: 0.16% 2 >> -Resets: 99.84% 1.23k >> Bogus: 0 >>=20 >> ------------------------------------------------------------------------ >>=20 >> VDEV cache is disabled >>=20 >> ------------------------------------------------------------------------ >>=20 >> ZFS Tunables (sysctl): >> kern.maxusers 2380 >> vm.kmem_size 33367830528 >> vm.kmem_size_scale 1 >> vm.kmem_size_min 0 >> vm.kmem_size_max 1319413950874 >> vfs.zfs.arc_max 32294088704 >> vfs.zfs.arc_min 4036761088 >> vfs.zfs.arc_average_blocksize 8192 >> vfs.zfs.arc_shrink_shift 5 >> vfs.zfs.arc_free_target 56518 >> vfs.zfs.arc_meta_used 4534349216 >> vfs.zfs.arc_meta_limit 8073522176 >> vfs.zfs.l2arc_write_max 8388608 >> vfs.zfs.l2arc_write_boost 8388608 >> vfs.zfs.l2arc_headroom 2 >> vfs.zfs.l2arc_feed_secs 1 >> vfs.zfs.l2arc_feed_min_ms 200 >> vfs.zfs.l2arc_noprefetch 1 >> vfs.zfs.l2arc_feed_again 1 >> vfs.zfs.l2arc_norw 1 >> vfs.zfs.anon_size 1786368 >> vfs.zfs.anon_metadata_lsize 0 >> vfs.zfs.anon_data_lsize 0 >> vfs.zfs.mru_size 504812032 >> vfs.zfs.mru_metadata_lsize 415273472 >> vfs.zfs.mru_data_lsize 35227648 >> vfs.zfs.mru_ghost_size 0 >> vfs.zfs.mru_ghost_metadata_lsize 0 >> vfs.zfs.mru_ghost_data_lsize 0 >> vfs.zfs.mfu_size 3925990912 >> vfs.zfs.mfu_metadata_lsize 3901947392 >> vfs.zfs.mfu_data_lsize 7000064 >> vfs.zfs.mfu_ghost_size 0 >> vfs.zfs.mfu_ghost_metadata_lsize 0 >> vfs.zfs.mfu_ghost_data_lsize 0 >> vfs.zfs.l2c_only_size 0 >> vfs.zfs.dedup.prefetch 1 >> vfs.zfs.nopwrite_enabled 1 >> vfs.zfs.mdcomp_disable 0 >> vfs.zfs.max_recordsize 1048576 >> vfs.zfs.dirty_data_max 3429735628 >> vfs.zfs.dirty_data_max_max 4294967296 >> vfs.zfs.dirty_data_max_percent 10 >> vfs.zfs.dirty_data_sync 67108864 >> vfs.zfs.delay_min_dirty_percent 60 >> vfs.zfs.delay_scale 500000 >> vfs.zfs.prefetch_disable 0 >> vfs.zfs.zfetch.max_streams 8 >> vfs.zfs.zfetch.min_sec_reap 2 >> vfs.zfs.zfetch.block_cap 256 >> vfs.zfs.zfetch.array_rd_sz 1048576 >> vfs.zfs.top_maxinflight 32 >> vfs.zfs.resilver_delay 2 >> vfs.zfs.scrub_delay 4 >> vfs.zfs.scan_idle 50 >> vfs.zfs.scan_min_time_ms 1000 >> vfs.zfs.free_min_time_ms 1000 >> vfs.zfs.resilver_min_time_ms 3000 >> vfs.zfs.no_scrub_io 0 >> vfs.zfs.no_scrub_prefetch 0 >> vfs.zfs.free_max_blocks -1 >> vfs.zfs.metaslab.gang_bang 16777217 >> vfs.zfs.metaslab.fragmentation_threshold70 >> vfs.zfs.metaslab.debug_load 0 >> vfs.zfs.metaslab.debug_unload 0 >> vfs.zfs.metaslab.df_alloc_threshold 131072 >> vfs.zfs.metaslab.df_free_pct 4 >> vfs.zfs.metaslab.min_alloc_size 33554432 >> vfs.zfs.metaslab.load_pct 50 >> vfs.zfs.metaslab.unload_delay 8 >> vfs.zfs.metaslab.preload_limit 3 >> vfs.zfs.metaslab.preload_enabled 1 >> vfs.zfs.metaslab.fragmentation_factor_enabled1 >> vfs.zfs.metaslab.lba_weighting_enabled 1 >> vfs.zfs.metaslab.bias_enabled 1 >> vfs.zfs.condense_pct 200 >> vfs.zfs.mg_noalloc_threshold 0 >> vfs.zfs.mg_fragmentation_threshold 85 >> vfs.zfs.check_hostid 1 >> vfs.zfs.spa_load_verify_maxinflight 10000 >> vfs.zfs.spa_load_verify_metadata 1 >> vfs.zfs.spa_load_verify_data 1 >> vfs.zfs.recover 0 >> vfs.zfs.deadman_synctime_ms 1000000 >> vfs.zfs.deadman_checktime_ms 5000 >> vfs.zfs.deadman_enabled 1 >> vfs.zfs.spa_asize_inflation 24 >> vfs.zfs.spa_slop_shift 5 >> vfs.zfs.space_map_blksz 4096 >> vfs.zfs.txg.timeout 5 >> vfs.zfs.vdev.metaslabs_per_vdev 200 >> vfs.zfs.vdev.cache.max 16384 >> vfs.zfs.vdev.cache.size 0 >> vfs.zfs.vdev.cache.bshift 16 >> vfs.zfs.vdev.trim_on_init 1 >> vfs.zfs.vdev.mirror.rotating_inc 0 >> vfs.zfs.vdev.mirror.rotating_seek_inc 5 >> vfs.zfs.vdev.mirror.rotating_seek_offset1048576 >> vfs.zfs.vdev.mirror.non_rotating_inc 0 >> vfs.zfs.vdev.mirror.non_rotating_seek_inc1 >> vfs.zfs.vdev.async_write_active_min_dirty_percent30 >> vfs.zfs.vdev.async_write_active_max_dirty_percent60 >> vfs.zfs.vdev.max_active 1000 >> vfs.zfs.vdev.sync_read_min_active 10 >> vfs.zfs.vdev.sync_read_max_active 10 >> vfs.zfs.vdev.sync_write_min_active 10 >> vfs.zfs.vdev.sync_write_max_active 10 >> vfs.zfs.vdev.async_read_min_active 1 >> vfs.zfs.vdev.async_read_max_active 3 >> vfs.zfs.vdev.async_write_min_active 1 >> vfs.zfs.vdev.async_write_max_active 10 >> vfs.zfs.vdev.scrub_min_active 1 >> vfs.zfs.vdev.scrub_max_active 2 >> vfs.zfs.vdev.trim_min_active 1 >> vfs.zfs.vdev.trim_max_active 64 >> vfs.zfs.vdev.aggregation_limit 131072 >> vfs.zfs.vdev.read_gap_limit 32768 >> vfs.zfs.vdev.write_gap_limit 4096 >> vfs.zfs.vdev.bio_flush_disable 0 >> vfs.zfs.vdev.bio_delete_disable 0 >> vfs.zfs.vdev.trim_max_bytes 2147483648 >> vfs.zfs.vdev.trim_max_pending 64 >> vfs.zfs.max_auto_ashift 13 >> vfs.zfs.min_auto_ashift 9 >> vfs.zfs.zil_replay_disable 0 >> vfs.zfs.cache_flush_disable 0 >> vfs.zfs.zio.use_uma 1 >> vfs.zfs.zio.exclude_metadata 0 >> vfs.zfs.sync_pass_deferred_free 2 >> vfs.zfs.sync_pass_dont_compress 5 >> vfs.zfs.sync_pass_rewrite 2 >> vfs.zfs.snapshot_list_prefetch 0 >> vfs.zfs.super_owner 0 >> vfs.zfs.debug 0 >> vfs.zfs.version.ioctl 4 >> vfs.zfs.version.acl 1 >> vfs.zfs.version.spa 5000 >> vfs.zfs.version.zpl 5 >> vfs.zfs.vol.mode 1 >> vfs.zfs.vol.unmap_enabled 1 >> vfs.zfs.trim.enabled 1 >> vfs.zfs.trim.txg_delay 32 >> vfs.zfs.trim.timeout 30 >> vfs.zfs.trim.max_interval 1 >>=20 >> ------------------------------------------------------------------------ >>=20 >>=20 >> _______________________________________________ >> freebsd-fs@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-fs >> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" >=20 > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@FreeBSD.ORG Tue Mar 31 08:11:55 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 9CF1452A for ; Tue, 31 Mar 2015 08:11:55 +0000 (UTC) Received: from mail-wg0-f45.google.com (mail-wg0-f45.google.com [74.125.82.45]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 2D469BCC for ; Tue, 31 Mar 2015 08:11:54 +0000 (UTC) Received: by wgbdm7 with SMTP id dm7so10111321wgb.1 for ; Tue, 31 Mar 2015 01:11:52 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:message-id:date:from:user-agent:mime-version:to :subject:references:in-reply-to:content-type :content-transfer-encoding; bh=DVQDwigpWWiX0kUqG99N6LL+0dc/zrhEpQCrMS2yhjk=; b=K+G3Ies+QHIc75Yaz3ungwGdgO+8hSaD427PhGTxkuTyLKw/4/eoNEkCAtTB83bBX0 J0PlfKcjHYDTzIT3MSuDBDLUdIvxk2RIEpnWWUAW89jsy4jHD+a/OcgVZtiSFUQMmfZn 2ORjN/3yWGpLvQnkeLcHATipi9Umd+UsD06cK/zMI3b4t74BkwoBDh3d4ia0XHP89oa2 24vX8KfSYcb71zqnN/e4f6uXsFZxxDgfQ5o2mL86fFOckwO0D5dgR2uEZN6pekaio7kg d0XK/U7eWZvZfT8QLzZgVdTUNY/zTwuYIunWEGYLIiZ82RspGrhQ6Ukyj9C4Ese1MoyZ e2Ug== X-Gm-Message-State: ALoCoQndH/bW1jAMxcDdASCIAqK7V8NlHReU2gsE43KbfrwW5YPeBEA+qFZTKtwqArl5wJULNM/C X-Received: by 10.180.74.170 with SMTP id u10mr3277816wiv.46.1427789512851; Tue, 31 Mar 2015 01:11:52 -0700 (PDT) Received: from [10.10.1.68] (82-69-141-170.dsl.in-addr.zen.co.uk. [82.69.141.170]) by mx.google.com with ESMTPSA id i5sm19172035wiz.0.2015.03.31.01.11.51 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 31 Mar 2015 01:11:51 -0700 (PDT) Message-ID: <551A56D2.3050006@multiplay.co.uk> Date: Tue, 31 Mar 2015 09:12:02 +0100 From: Steven Hartland User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: Dustin Wenz , "" Subject: Re: All available memory used when deleting files from ZFS References: <5519E53C.4060203@multiplay.co.uk> <3FF6F6A0-A23F-4EDE-98F6-5B8E41EC34A1@ebureau.com> In-Reply-To: <3FF6F6A0-A23F-4EDE-98F6-5B8E41EC34A1@ebureau.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 31 Mar 2015 08:11:55 -0000 Are your pools HDD or SSD based? If the latter then this still may be relevant, due to the TRIM support in FreeBSD and TBH with a large TXG of free's this still may help even if the DDT isn't the main cause, so may be worth testing. Regards Steve On 31/03/2015 04:52, Dustin Wenz wrote: > Thanks, Steven! However, I have not enabled dedup on any of the affected filesystems. Unless it became a default at some point, I'm not sure how that tunable would help. > > - .Dustin > >> On Mar 30, 2015, at 7:07 PM, Steven Hartland wrote: >> >> Later versions have vfs.zfs.free_max_blocks which is likely to be the fix your looking for. >> >> It was added to head by r271532 and stable/10 by: >> https://svnweb.freebsd.org/base?view=revision&revision=272665 >> >> Description being: >> >> Add a new tunable/sysctl, vfs.zfs.free_max_blocks, which can be used to >> limit how many blocks can be free'ed before a new transaction group is >> created. The default is no limit (infinite), but we should probably have >> a lower default, e.g. 100,000. >> >> With this limit, we can guard against the case where ZFS could run out of >> memory when destroying large numbers of blocks in a single transaction >> group, as the entire DDT needs to be brought into memory. >> >> Illumos issue: >> 5138 add tunable for maximum number of blocks freed in one txg >> >> >> >>> On 30/03/2015 22:14, Dustin Wenz wrote: >>> I had several systems panic or hang over the weekend while deleting some data off of their local zfs filesystem. It looks like they ran out of physical memory (32GB), and hung when paging to swap-on-zfs (which is not surprising, given that ZFS was likely using the memory). They were running 10.1-STABLE r277139M, which I built in the middle of January. The pools were about 35TB in size, and are a concatenation of 3TB mirrors. They were maybe 95% full. I deleted just over 1000 files, totaling 25TB on each system. >>> >>> It took roughly 10 minutes to remove that 25TB of data per host using a remote rsync, and immediately after that everything seemed fine. However, after several more minutes, every machine that had data removed became unresponsive. Some had numerous "swap_pager: indefinite wait buffer" errors followed by a panic, and some just died with no console messages. The same thing would happen after a reboot, when FreeBSD attempted to mount the local filesystem again. >>> >>> I was able to boot these systems after exporting the affected pool, but the problem would recur several minutes after initiating a "zpool import". Watching zfs statistics didn't seem to reveal where the memory was going; ARC would only climb to about 4GB, but free memory would decline rapidly. Eventually, after enough export/reboot/import cycles, the pool would import successfully and everything would be fine from then on. Note that there is no L2ARC or compression being used. >>> >>> Has anyone else run into this when deleting files on ZFS? It seems to be a consistent problem under the versions of 10.1 I'm running. >>> >>> For reference, I've appended a zstat dump below that was taken 5 minutes after starting a zpool import, and was about three minutes before the machine became unresponsive. You can see that the ARC is only 4GB, but free memory was down to 471MB (and continued to drop). >>> >>> - .Dustin >>> >>> >>> ------------------------------------------------------------------------ >>> ZFS Subsystem Report Mon Mar 30 12:35:27 2015 >>> ------------------------------------------------------------------------ >>> >>> System Information: >>> >>> Kernel Version: 1001506 (osreldate) >>> Hardware Platform: amd64 >>> Processor Architecture: amd64 >>> >>> ZFS Storage pool Version: 5000 >>> ZFS Filesystem Version: 5 >>> >>> FreeBSD 10.1-STABLE #11 r277139M: Tue Jan 13 14:59:55 CST 2015 root >>> 12:35PM up 8 mins, 3 users, load averages: 7.23, 8.96, 4.87 >>> >>> ------------------------------------------------------------------------ >>> >>> System Memory: >>> >>> 0.17% 55.40 MiB Active, 0.14% 46.11 MiB Inact >>> 98.34% 30.56 GiB Wired, 0.00% 0 Cache >>> 1.34% 425.46 MiB Free, 0.00% 4.00 KiB Gap >>> >>> Real Installed: 32.00 GiB >>> Real Available: 99.82% 31.94 GiB >>> Real Managed: 97.29% 31.08 GiB >>> >>> Logical Total: 32.00 GiB >>> Logical Used: 98.56% 31.54 GiB >>> Logical Free: 1.44% 471.57 MiB >>> >>> Kernel Memory: 3.17 GiB >>> Data: 99.18% 3.14 GiB >>> Text: 0.82% 26.68 MiB >>> >>> Kernel Memory Map: 31.08 GiB >>> Size: 14.18% 4.41 GiB >>> Free: 85.82% 26.67 GiB >>> >>> ------------------------------------------------------------------------ >>> >>> ARC Summary: (HEALTHY) >>> Memory Throttle Count: 0 >>> >>> ARC Misc: >>> Deleted: 145 >>> Recycle Misses: 0 >>> Mutex Misses: 0 >>> Evict Skips: 0 >>> >>> ARC Size: 14.17% 4.26 GiB >>> Target Size: (Adaptive) 100.00% 30.08 GiB >>> Min Size (Hard Limit): 12.50% 3.76 GiB >>> Max Size (High Water): 8:1 30.08 GiB >>> >>> ARC Size Breakdown: >>> Recently Used Cache Size: 50.00% 15.04 GiB >>> Frequently Used Cache Size: 50.00% 15.04 GiB >>> >>> ARC Hash Breakdown: >>> Elements Max: 270.56k >>> Elements Current: 100.00% 270.56k >>> Collisions: 23.66k >>> Chain Max: 3 >>> Chains: 8.28k >>> >>> ------------------------------------------------------------------------ >>> >>> ARC Efficiency: 2.93m >>> Cache Hit Ratio: 70.44% 2.06m >>> Cache Miss Ratio: 29.56% 866.05k >>> Actual Hit Ratio: 70.40% 2.06m >>> >>> Data Demand Efficiency: 97.47% 24.58k >>> Data Prefetch Efficiency: 1.88% 479 >>> >>> CACHE HITS BY CACHE LIST: >>> Anonymously Used: 0.05% 1.07k >>> Most Recently Used: 71.82% 1.48m >>> Most Frequently Used: 28.13% 580.49k >>> Most Recently Used Ghost: 0.00% 0 >>> Most Frequently Used Ghost: 0.00% 0 >>> >>> CACHE HITS BY DATA TYPE: >>> Demand Data: 1.16% 23.96k >>> Prefetch Data: 0.00% 9 >>> Demand Metadata: 98.79% 2.04m >>> Prefetch Metadata: 0.05% 1.08k >>> >>> CACHE MISSES BY DATA TYPE: >>> Demand Data: 0.07% 621 >>> Prefetch Data: 0.05% 470 >>> Demand Metadata: 99.69% 863.35k >>> Prefetch Metadata: 0.19% 1.61k >>> >>> ------------------------------------------------------------------------ >>> >>> L2ARC is disabled >>> >>> ------------------------------------------------------------------------ >>> >>> File-Level Prefetch: (HEALTHY) >>> >>> DMU Efficiency: 72.95k >>> Hit Ratio: 70.83% 51.66k >>> Miss Ratio: 29.17% 21.28k >>> >>> Colinear: 21.28k >>> Hit Ratio: 0.01% 2 >>> Miss Ratio: 99.99% 21.28k >>> >>> Stride: 50.45k >>> Hit Ratio: 99.98% 50.44k >>> Miss Ratio: 0.02% 9 >>> >>> DMU Misc: >>> Reclaim: 21.28k >>> Successes: 1.73% 368 >>> Failures: 98.27% 20.91k >>> >>> Streams: 1.23k >>> +Resets: 0.16% 2 >>> -Resets: 99.84% 1.23k >>> Bogus: 0 >>> >>> ------------------------------------------------------------------------ >>> >>> VDEV cache is disabled >>> >>> ------------------------------------------------------------------------ >>> >>> ZFS Tunables (sysctl): >>> kern.maxusers 2380 >>> vm.kmem_size 33367830528 >>> vm.kmem_size_scale 1 >>> vm.kmem_size_min 0 >>> vm.kmem_size_max 1319413950874 >>> vfs.zfs.arc_max 32294088704 >>> vfs.zfs.arc_min 4036761088 >>> vfs.zfs.arc_average_blocksize 8192 >>> vfs.zfs.arc_shrink_shift 5 >>> vfs.zfs.arc_free_target 56518 >>> vfs.zfs.arc_meta_used 4534349216 >>> vfs.zfs.arc_meta_limit 8073522176 >>> vfs.zfs.l2arc_write_max 8388608 >>> vfs.zfs.l2arc_write_boost 8388608 >>> vfs.zfs.l2arc_headroom 2 >>> vfs.zfs.l2arc_feed_secs 1 >>> vfs.zfs.l2arc_feed_min_ms 200 >>> vfs.zfs.l2arc_noprefetch 1 >>> vfs.zfs.l2arc_feed_again 1 >>> vfs.zfs.l2arc_norw 1 >>> vfs.zfs.anon_size 1786368 >>> vfs.zfs.anon_metadata_lsize 0 >>> vfs.zfs.anon_data_lsize 0 >>> vfs.zfs.mru_size 504812032 >>> vfs.zfs.mru_metadata_lsize 415273472 >>> vfs.zfs.mru_data_lsize 35227648 >>> vfs.zfs.mru_ghost_size 0 >>> vfs.zfs.mru_ghost_metadata_lsize 0 >>> vfs.zfs.mru_ghost_data_lsize 0 >>> vfs.zfs.mfu_size 3925990912 >>> vfs.zfs.mfu_metadata_lsize 3901947392 >>> vfs.zfs.mfu_data_lsize 7000064 >>> vfs.zfs.mfu_ghost_size 0 >>> vfs.zfs.mfu_ghost_metadata_lsize 0 >>> vfs.zfs.mfu_ghost_data_lsize 0 >>> vfs.zfs.l2c_only_size 0 >>> vfs.zfs.dedup.prefetch 1 >>> vfs.zfs.nopwrite_enabled 1 >>> vfs.zfs.mdcomp_disable 0 >>> vfs.zfs.max_recordsize 1048576 >>> vfs.zfs.dirty_data_max 3429735628 >>> vfs.zfs.dirty_data_max_max 4294967296 >>> vfs.zfs.dirty_data_max_percent 10 >>> vfs.zfs.dirty_data_sync 67108864 >>> vfs.zfs.delay_min_dirty_percent 60 >>> vfs.zfs.delay_scale 500000 >>> vfs.zfs.prefetch_disable 0 >>> vfs.zfs.zfetch.max_streams 8 >>> vfs.zfs.zfetch.min_sec_reap 2 >>> vfs.zfs.zfetch.block_cap 256 >>> vfs.zfs.zfetch.array_rd_sz 1048576 >>> vfs.zfs.top_maxinflight 32 >>> vfs.zfs.resilver_delay 2 >>> vfs.zfs.scrub_delay 4 >>> vfs.zfs.scan_idle 50 >>> vfs.zfs.scan_min_time_ms 1000 >>> vfs.zfs.free_min_time_ms 1000 >>> vfs.zfs.resilver_min_time_ms 3000 >>> vfs.zfs.no_scrub_io 0 >>> vfs.zfs.no_scrub_prefetch 0 >>> vfs.zfs.free_max_blocks -1 >>> vfs.zfs.metaslab.gang_bang 16777217 >>> vfs.zfs.metaslab.fragmentation_threshold70 >>> vfs.zfs.metaslab.debug_load 0 >>> vfs.zfs.metaslab.debug_unload 0 >>> vfs.zfs.metaslab.df_alloc_threshold 131072 >>> vfs.zfs.metaslab.df_free_pct 4 >>> vfs.zfs.metaslab.min_alloc_size 33554432 >>> vfs.zfs.metaslab.load_pct 50 >>> vfs.zfs.metaslab.unload_delay 8 >>> vfs.zfs.metaslab.preload_limit 3 >>> vfs.zfs.metaslab.preload_enabled 1 >>> vfs.zfs.metaslab.fragmentation_factor_enabled1 >>> vfs.zfs.metaslab.lba_weighting_enabled 1 >>> vfs.zfs.metaslab.bias_enabled 1 >>> vfs.zfs.condense_pct 200 >>> vfs.zfs.mg_noalloc_threshold 0 >>> vfs.zfs.mg_fragmentation_threshold 85 >>> vfs.zfs.check_hostid 1 >>> vfs.zfs.spa_load_verify_maxinflight 10000 >>> vfs.zfs.spa_load_verify_metadata 1 >>> vfs.zfs.spa_load_verify_data 1 >>> vfs.zfs.recover 0 >>> vfs.zfs.deadman_synctime_ms 1000000 >>> vfs.zfs.deadman_checktime_ms 5000 >>> vfs.zfs.deadman_enabled 1 >>> vfs.zfs.spa_asize_inflation 24 >>> vfs.zfs.spa_slop_shift 5 >>> vfs.zfs.space_map_blksz 4096 >>> vfs.zfs.txg.timeout 5 >>> vfs.zfs.vdev.metaslabs_per_vdev 200 >>> vfs.zfs.vdev.cache.max 16384 >>> vfs.zfs.vdev.cache.size 0 >>> vfs.zfs.vdev.cache.bshift 16 >>> vfs.zfs.vdev.trim_on_init 1 >>> vfs.zfs.vdev.mirror.rotating_inc 0 >>> vfs.zfs.vdev.mirror.rotating_seek_inc 5 >>> vfs.zfs.vdev.mirror.rotating_seek_offset1048576 >>> vfs.zfs.vdev.mirror.non_rotating_inc 0 >>> vfs.zfs.vdev.mirror.non_rotating_seek_inc1 >>> vfs.zfs.vdev.async_write_active_min_dirty_percent30 >>> vfs.zfs.vdev.async_write_active_max_dirty_percent60 >>> vfs.zfs.vdev.max_active 1000 >>> vfs.zfs.vdev.sync_read_min_active 10 >>> vfs.zfs.vdev.sync_read_max_active 10 >>> vfs.zfs.vdev.sync_write_min_active 10 >>> vfs.zfs.vdev.sync_write_max_active 10 >>> vfs.zfs.vdev.async_read_min_active 1 >>> vfs.zfs.vdev.async_read_max_active 3 >>> vfs.zfs.vdev.async_write_min_active 1 >>> vfs.zfs.vdev.async_write_max_active 10 >>> vfs.zfs.vdev.scrub_min_active 1 >>> vfs.zfs.vdev.scrub_max_active 2 >>> vfs.zfs.vdev.trim_min_active 1 >>> vfs.zfs.vdev.trim_max_active 64 >>> vfs.zfs.vdev.aggregation_limit 131072 >>> vfs.zfs.vdev.read_gap_limit 32768 >>> vfs.zfs.vdev.write_gap_limit 4096 >>> vfs.zfs.vdev.bio_flush_disable 0 >>> vfs.zfs.vdev.bio_delete_disable 0 >>> vfs.zfs.vdev.trim_max_bytes 2147483648 >>> vfs.zfs.vdev.trim_max_pending 64 >>> vfs.zfs.max_auto_ashift 13 >>> vfs.zfs.min_auto_ashift 9 >>> vfs.zfs.zil_replay_disable 0 >>> vfs.zfs.cache_flush_disable 0 >>> vfs.zfs.zio.use_uma 1 >>> vfs.zfs.zio.exclude_metadata 0 >>> vfs.zfs.sync_pass_deferred_free 2 >>> vfs.zfs.sync_pass_dont_compress 5 >>> vfs.zfs.sync_pass_rewrite 2 >>> vfs.zfs.snapshot_list_prefetch 0 >>> vfs.zfs.super_owner 0 >>> vfs.zfs.debug 0 >>> vfs.zfs.version.ioctl 4 >>> vfs.zfs.version.acl 1 >>> vfs.zfs.version.spa 5000 >>> vfs.zfs.version.zpl 5 >>> vfs.zfs.vol.mode 1 >>> vfs.zfs.vol.unmap_enabled 1 >>> vfs.zfs.trim.enabled 1 >>> vfs.zfs.trim.txg_delay 32 >>> vfs.zfs.trim.timeout 30 >>> vfs.zfs.trim.max_interval 1 >>> >>> ------------------------------------------------------------------------ >>> >>> >>> _______________________________________________ >>> freebsd-fs@freebsd.org mailing list >>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs >>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" >> _______________________________________________ >> freebsd-fs@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-fs >> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@FreeBSD.ORG Tue Mar 31 13:04:51 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 26996DA2 for ; Tue, 31 Mar 2015 13:04:51 +0000 (UTC) Received: from mail-qc0-x22d.google.com (mail-qc0-x22d.google.com [IPv6:2607:f8b0:400d:c01::22d]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id D1A6CF7F for ; Tue, 31 Mar 2015 13:04:50 +0000 (UTC) Received: by qcgx3 with SMTP id x3so12435529qcg.3 for ; Tue, 31 Mar 2015 06:04:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=PX75WBQNw4pae57qOIu3LFJzmwrJSupIpwvsjsS8C8s=; b=oPu323K5LXyRdmbFc2cUM4lMq8BU0Oco8sx18vYcT/tWmST5gUuYxqU0wdxbI1YK42 uS8v2MgOk+/OGZC6BEdTWBzmuV11QkbIHHImJEQjn1uHQQMnenvL06tSGXJgXbORq8Nd AMxaHuEZodG+QNcmvKDHvknDflssFiXEJW89F7CePMy63tcTYG01aeXO2F3WGCcoj0Pr +qyi/0Y58RKLAt0WkmhE7hPFaQ51Nxe0ki4x2HNTA2aMFj5XBZazeKPXSDLSxKYsp6Wp Wu4ao6RdrXu73r6o4av6A59Vbal0fmHQvqfaIejgYGKVKSAWRaG1kY+bJ7y+ny/QdsID 0WeA== MIME-Version: 1.0 X-Received: by 10.140.31.181 with SMTP id f50mr21818270qgf.23.1427807089950; Tue, 31 Mar 2015 06:04:49 -0700 (PDT) Received: by 10.96.9.35 with HTTP; Tue, 31 Mar 2015 06:04:49 -0700 (PDT) Received: by 10.96.9.35 with HTTP; Tue, 31 Mar 2015 06:04:49 -0700 (PDT) In-Reply-To: <550C4190.40508@kateleyco.com> References: <550C4190.40508@kateleyco.com> Date: Tue, 31 Mar 2015 09:04:49 -0400 Message-ID: Subject: Re: [zfs-discuss] Open-ZFS Office Hours From: Eric To: zfs-discuss@list.zfsonlinux.org, Linda Kateley Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: developer@open-zfs.org, freebsd-fs@freebsd.org, zfs-discuss@zfsonlinux.org, zfs@lists.illumos.org, freenas-devel@lists.freenas.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 31 Mar 2015 13:04:51 -0000 Woah Justin Gibbs That's pretty cool On Mar 30, 2015 10:57 PM, "Linda Kateley via zfs-discuss" < zfs-discuss@list.zfsonlinux.org> wrote: > Hi, > > I am going to try and get office hours going again for the open-zfs > community. > > We have scheduled a meeting on Thursday April 2nd at 9 AM PDT 11AM EDT 4PM > GMT > > This months guest host is Justin Gibbs from freebsd. > > I am a little new at using hangouts, but I am pretty sure this link will > get you there > https://plus.google.com/events/ctt39ds1j8onc2uthm3kkaf6tl0 > This info is also posted on open-zfs.org site. > > If you have ideas for future meetings, let me know. > > Tia > > -- > Linda Kateley > President > Kateley Company > 612-807-6349 > Skype ID-kateleyco > http://kateleyco.com > > _______________________________________________ > zfs-discuss mailing list > zfs-discuss@list.zfsonlinux.org > http://list.zfsonlinux.org/mailman/listinfo/zfs-discuss > From owner-freebsd-fs@FreeBSD.ORG Tue Mar 31 14:55:05 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id D7FDB7C7 for ; Tue, 31 Mar 2015 14:55:05 +0000 (UTC) Received: from new2-smtp.messagingengine.com (new2-smtp.messagingengine.com [66.111.4.224]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id A4394FB0 for ; Tue, 31 Mar 2015 14:55:05 +0000 (UTC) Received: from compute2.internal (compute2.nyi.internal [10.202.2.42]) by mailnew.nyi.internal (Postfix) with ESMTP id 8057DADF for ; Tue, 31 Mar 2015 10:54:55 -0400 (EDT) Received: from web3 ([10.202.2.213]) by compute2.internal (MEProxy); Tue, 31 Mar 2015 10:54:58 -0400 DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:content-type :date:from:in-reply-to:message-id:mime-version:references :subject:to:x-sasl-enc:x-sasl-enc; s=smtpout; bh=eBrAFpU/i9EudS2 iG08R50Q4ZAQ=; b=jJUUxOhwLqaFVAhYcIi1PKQUFZj7HewjL363eHwiAirYtrq jffuy7AzcJwGQr4ehedXpzSXjkB/BW/L93ikyHQlyzTKHVHNtGtdg5YSnfqRJv7b E02AEYSUCWI4rlFgj3oNBm9IYYEoyOrzDghVQ4Q7KToPjVpBtckRH9gGESqc= Received: by web3.nyi.internal (Postfix, from userid 99) id 34137114A48; Tue, 31 Mar 2015 10:54:58 -0400 (EDT) Message-Id: <1427813698.641733.247585797.28816738@webmail.messagingengine.com> X-Sasl-Enc: CLndvw47TnUYuCiOc9jL8M+wOSnZKU5JziU8vAJnXn15 1427813698 From: Mark Felder To: freebsd-fs@freebsd.org MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" X-Mailer: MessagingEngine.com Webmail Interface - ajax-0b3c2300 In-Reply-To: <5519F74C.1040308@artem.ru> References: <55170D9C.1070107@artem.ru> <1427727936.293597.247070269.5CE0D411@webmail.messagingengine.com> <55196FC7.8090107@artem.ru> <1427730597.303984.247097389.165D5AAB@webmail.messagingengine.com> <5519716F.6060007@artem.ru> <1427731061.306961.247099633.0A421E90@webmail.messagingengine.com> <5519740A.1070902@artem.ru> <1427731759.309823.247107417.308CD298@webmail.messagingengine.com> <5519F74C.1040308@artem.ru> Subject: Re: Little research how rm -rf and tar kill server Date: Tue, 31 Mar 2015 09:54:58 -0500 Cc: mav@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 31 Mar 2015 14:55:05 -0000 On Mon, Mar 30, 2015, at 20:24, Artem Kuchin wrote: > 30.03.2015 19:09, Mark Felder =D0=BF=D0=B8=D1=88=D0=B5=D1=82: > > > > On Mon, Mar 30, 2015, at 11:04, Artem Kuchin wrote: > >> 30.03.2015 18:57, Mark Felder =D0=BF=D0=B8=D1=88=D0=B5=D1=82: > >>> On Mon, Mar 30, 2015, at 10:53, Artem Kuchin wrote: > >>>> This is normal state, not under rm -rf > >>>> Do you need it during rm -rf ? > >>>> > >>> No, but I wonder if changing the timer from LAPIC to HPET or possibly > >>> one of the other timers makes the system more responsive under that > >>> load. Would you mind testing that? > >>> > >>> You can switch the timer like this: > >>> > >>> sysctl kern.eventtimer.timer=3DHPET > >>> > >>> And then run some of your I/O tests > >>> > >> I see. I will test at night, when load goes down. > >> I cannot say sure that's a right way to dig, but i will test anything= :) > >> > >> Just to remind: untar overloads the system, but untar + sync every 120s > >> does not. > >> That seems very strange to me. I think the problem might be somewhere > >> here. > >> > > I just heard from mav that there was a bottleneck in gmirror/graid with > > regards to BIO_DELETE requests > > > > https://svnweb.freebsd.org/base?view=3Drevision&revision=3D280757 > > >=20 > I applied this patch manually and rebuilt the kernel. > Hit this bug > https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D195458 > on reboot, wasted 1 hour fsck-ing 2 times (was dirty after first fsck) > and after boot tried doing > rm -rf test1 > I coult not test anything, because it complete after 1 minute, instead=20 > 15 minutes before. > I copier the dir 4 times into subdirs and rm -rf full tree (4x larger) -= =20 > fast and smooth, > mariadb did not tonice this, server were working fine. >=20 > However, i also noticed another thing: > cp -Rp test test1 > also work a lot faster now, probably 3-5 times faster > Maybe it is because fs is free of tons BIO_DELETE from other processes >=20 >=20 > Then i did the untar test at maximum speed ( no pv to limit bandwidth): > i see that mysql request became slower, but mysql sql request queue=20 > built up slower now. > However, when it reached 70 i stopped untar and mariadb could not=20 > recover from condition > until i executed sync. However, this time sync took only a second. > I see big improvement, but i still don't understand why i need to issue=20 > sync manually to push > everything to recover from overload. >=20 > # man 2 sync > a sync() system call is issued frequently by the user process syncer(4)=20 > (about every 30 seconds). >=20 > it does not seem to be true >=20 > I checked syncer sysctl >=20 > # sysctl kern.filedelay > kern.filedelay: 30 > # sysctl kern.dirdelay > kern.dirdelay: 29 > # sysctl kern.metadelay > kern.metadelay: 28 >=20 > # ps ax | grep sync > 23 - DL 0:03.82 [syncer] >=20 > no clue why need manual sync >=20 > By the way: is there way to make sure that SU+J is really working?=20 > Maybe it is disabled for some reason > and i don't know it. tunefs just shows stored setting, but, for example,= =20 > with dirty fs, journaling is not > working in reality. Any way to get current status of SU journaling? >=20 > off topic: suggestion to move to ZFA was not so good, i see a "All=20 > available memory used when deleting files from ZFS" > topic. I'd rather have slow server when i can login and fix than halted=20 > on panic. Just to point that ZFS still have plenty > of unpredictable issues. >=20 This information is very good. Perhaps there is some more additional tweaking that could be done. I will cc mav@ on this. From owner-freebsd-fs@FreeBSD.ORG Tue Mar 31 16:42:09 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id DD511857 for ; Tue, 31 Mar 2015 16:42:09 +0000 (UTC) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 64C4BFBE for ; Tue, 31 Mar 2015 16:42:09 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.9/8.14.9) with ESMTP id t2VGg2BL004747 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Tue, 31 Mar 2015 19:42:02 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.9.2 kib.kiev.ua t2VGg2BL004747 Received: (from kostik@localhost) by tom.home (8.14.9/8.14.9/Submit) id t2VGg233004746; Tue, 31 Mar 2015 19:42:02 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Tue, 31 Mar 2015 19:42:02 +0300 From: Konstantin Belousov To: Artem Kuchin Subject: Re: Little research how rm -rf and tar kill server Message-ID: <20150331164202.GN2379@kib.kiev.ua> References: <55170D9C.1070107@artem.ru> <1427727936.293597.247070269.5CE0D411@webmail.messagingengine.com> <55196FC7.8090107@artem.ru> <1427730597.303984.247097389.165D5AAB@webmail.messagingengine.com> <5519716F.6060007@artem.ru> <1427731061.306961.247099633.0A421E90@webmail.messagingengine.com> <5519740A.1070902@artem.ru> <1427731759.309823.247107417.308CD298@webmail.messagingengine.com> <5519F74C.1040308@artem.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=koi8-r Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <5519F74C.1040308@artem.ru> User-Agent: Mutt/1.5.23 (2014-03-12) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.0 X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on tom.home Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 31 Mar 2015 16:42:09 -0000 On Tue, Mar 31, 2015 at 04:24:28AM +0300, Artem Kuchin wrote: > 30.03.2015 19:09, Mark Felder ÐÉÛÅÔ: > > > > On Mon, Mar 30, 2015, at 11:04, Artem Kuchin wrote: > >> 30.03.2015 18:57, Mark Felder ÐÉÛÅÔ: > >>> On Mon, Mar 30, 2015, at 10:53, Artem Kuchin wrote: > >>>> This is normal state, not under rm -rf > >>>> Do you need it during rm -rf ? > >>>> > >>> No, but I wonder if changing the timer from LAPIC to HPET or possibly > >>> one of the other timers makes the system more responsive under that > >>> load. Would you mind testing that? > >>> > >>> You can switch the timer like this: > >>> > >>> sysctl kern.eventtimer.timer=HPET > >>> > >>> And then run some of your I/O tests > >>> > >> I see. I will test at night, when load goes down. > >> I cannot say sure that's a right way to dig, but i will test anything :) > >> > >> Just to remind: untar overloads the system, but untar + sync every 120s > >> does not. > >> That seems very strange to me. I think the problem might be somewhere > >> here. > >> > > I just heard from mav that there was a bottleneck in gmirror/graid with > > regards to BIO_DELETE requests > > > > https://svnweb.freebsd.org/base?view=revision&revision=280757 > > > > I applied this patch manually and rebuilt the kernel. > Hit this bug > https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=195458 > on reboot, wasted 1 hour fsck-ing 2 times (was dirty after first fsck) > and after boot tried doing > rm -rf test1 > I coult not test anything, because it complete after 1 minute, instead > 15 minutes before. > I copier the dir 4 times into subdirs and rm -rf full tree (4x larger) - > fast and smooth, > mariadb did not tonice this, server were working fine. > > However, i also noticed another thing: > cp -Rp test test1 > also work a lot faster now, probably 3-5 times faster > Maybe it is because fs is free of tons BIO_DELETE from other processes > > > Then i did the untar test at maximum speed ( no pv to limit bandwidth): > i see that mysql request became slower, but mysql sql request queue > built up slower now. > However, when it reached 70 i stopped untar and mariadb could not > recover from condition > until i executed sync. However, this time sync took only a second. > I see big improvement, but i still don't understand why i need to issue > sync manually to push > everything to recover from overload. > > # man 2 sync > a sync() system call is issued frequently by the user process syncer(4) > (about every 30 seconds). > > it does not seem to be true > > I checked syncer sysctl > > # sysctl kern.filedelay > kern.filedelay: 30 > # sysctl kern.dirdelay > kern.dirdelay: 29 > # sysctl kern.metadelay > kern.metadelay: 28 > > # ps ax | grep sync > 23 - DL 0:03.82 [syncer] > > no clue why need manual sync Syncer and sync(2) perform different kind of syncs. Take the snapshot of sysctl debug.softdep before and after the situation occur to have some hints what is going on. Also, it makes sense to test the HEAD kernel after the r280763. Even if I end up with any fix, it probably would require r280760 as prerequisite. > > By the way: is there way to make sure that SU+J is really working? > Maybe it is disabled for some reason > and i don't know it. tunefs just shows stored setting, but, for example, > with dirty fs, journaling is not > working in reality. Any way to get current status of SU journaling? Your statement about 'dirty fs' makes no sense. Journaling state is displayed by mount -v. From owner-freebsd-fs@FreeBSD.ORG Tue Mar 31 21:52:42 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id AFDDAF54 for ; Tue, 31 Mar 2015 21:52:42 +0000 (UTC) Received: from internet06.ebureau.com (internet06.ebureau.com [65.127.24.25]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "internet06.ebureau.com", Issuer "internet06.ebureau.com" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 7C49AD70 for ; Tue, 31 Mar 2015 21:52:41 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by internet06.ebureau.com (Postfix) with ESMTP id 4381338089E0 for ; Tue, 31 Mar 2015 16:52:35 -0500 (CDT) X-Virus-Scanned: amavisd-new at ebureau.com Received: from internet06.ebureau.com ([127.0.0.1]) by localhost (internet06.ebureau.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id reGFSyl9U5Wv for ; Tue, 31 Mar 2015 16:52:33 -0500 (CDT) Received: from square.office.ebureau.com (unknown [10.10.20.22]) by internet06.ebureau.com (Postfix) with ESMTPSA id AEB0338089D3 for ; Tue, 31 Mar 2015 16:52:33 -0500 (CDT) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2070.6\)) Subject: Re: All available memory used when deleting files from ZFS From: Dustin Wenz In-Reply-To: <923828D6-503B-4FC3-89E8-1DC6DF0C9B6B@ebureau.com> Date: Tue, 31 Mar 2015 16:52:33 -0500 Content-Transfer-Encoding: quoted-printable Message-Id: <712A53CA-7A54-420F-9721-592A39D9A717@ebureau.com> References: <5519C329.3090001@denninger.net> <923828D6-503B-4FC3-89E8-1DC6DF0C9B6B@ebureau.com> To: "" X-Mailer: Apple Mail (2.2070.6) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 31 Mar 2015 21:52:42 -0000 I was able to do a little regression testing on this, since I still had = about 10 hosts remaining on FreeBSD 9.2. They are the same hardware and = disk configuration, and had the same data files and zpool configurations = (mirrors of 3TB mechanical disks) as the machines that blew up over the = weekend. The only difference is that they ran FreeBSD 9.2.=20 Using the same rsync procedure as before, I was able to delete the 25 TB = of data on all of the remaining hosts with no issues whatsoever. I never = saw any free memory reduction (if anything, it increased since ARC was = being freed up as well), no paging and no hangs. One other difference = was that it took about twice as long to delete the files on 9.2 as on = 10.1 (20 minutes instead of 10 minutes). So, it would appear that there is some different zfs behavior in FreeBSD = 10.1 that was not present in 9.2, and it's causing problems when freeing = up space. If I knew why it takes twice as long to delete files in 9.2, = that might shed some light on this. There is also the recent background = destroy feature that might be suspect, but I'm not destroying = filesystems here. What other recent zfs changes might apply to deleting = files? - .Dustin > On Mar 30, 2015, at 6:30 PM, Dustin Wenz = wrote: >=20 > Unfortunately, I just spent the day recovering from this, so I have no = way to easily get new memory stats now. I'm planning on doing a test = with additional data in an effort to understand more about the issue, = but it will take time to set something up. >=20 > In the meantime, I'd advise anyone running ZFS on FreeBSD 10.x to be = mindful when freeing up lots of space all at once. >=20 > - .Dustin >=20 >> On Mar 30, 2015, at 4:42 PM, Karl Denninger = wrote: >>=20 >> What's the UMA memory use look like on that machine when the remove = is >> initiated and progresses? Look with vmstat -z and see what the used = and >> free counts look like for the zio allocations...... >>=20 >> On 3/30/2015 4:14 PM, Dustin Wenz wrote: >>> I had several systems panic or hang over the weekend while deleting = some data off of their local zfs filesystem. It looks like they ran out = of physical memory (32GB), and hung when paging to swap-on-zfs (which is = not surprising, given that ZFS was likely using the memory). They were = running 10.1-STABLE r277139M, which I built in the middle of January. = The pools were about 35TB in size, and are a concatenation of 3TB = mirrors. They were maybe 95% full. I deleted just over 1000 files, = totaling 25TB on each system. >>>=20 >>> It took roughly 10 minutes to remove that 25TB of data per host = using a remote rsync, and immediately after that everything seemed fine. = However, after several more minutes, every machine that had data removed = became unresponsive. Some had numerous "swap_pager: indefinite wait = buffer" errors followed by a panic, and some just died with no console = messages. The same thing would happen after a reboot, when FreeBSD = attempted to mount the local filesystem again. >>>=20 >>> I was able to boot these systems after exporting the affected pool, = but the problem would recur several minutes after initiating a "zpool = import". Watching zfs statistics didn't seem to reveal where the memory = was going; ARC would only climb to about 4GB, but free memory would = decline rapidly. Eventually, after enough export/reboot/import cycles, = the pool would import successfully and everything would be fine from = then on. Note that there is no L2ARC or compression being used. >>>=20 >>> Has anyone else run into this when deleting files on ZFS? It seems = to be a consistent problem under the versions of 10.1 I'm running. >>>=20 >>> For reference, I've appended a zstat dump below that was taken 5 = minutes after starting a zpool import, and was about three minutes = before the machine became unresponsive. You can see that the ARC is only = 4GB, but free memory was down to 471MB (and continued to drop). >>>=20 >>> - .Dustin >>>=20 >>>=20 >>> = ------------------------------------------------------------------------ >>> ZFS Subsystem Report Mon Mar 30 = 12:35:27 2015 >>> = ------------------------------------------------------------------------ >>>=20 >>> System Information: >>>=20 >>> Kernel Version: 1001506 (osreldate) >>> Hardware Platform: amd64 >>> Processor Architecture: amd64 >>>=20 >>> ZFS Storage pool Version: 5000 >>> ZFS Filesystem Version: 5 >>>=20 >>> FreeBSD 10.1-STABLE #11 r277139M: Tue Jan 13 14:59:55 CST 2015 root >>> 12:35PM up 8 mins, 3 users, load averages: 7.23, 8.96, 4.87 >>>=20 >>> = ------------------------------------------------------------------------ >>>=20 >>> System Memory: >>>=20 >>> 0.17% 55.40 MiB Active, 0.14% 46.11 MiB Inact >>> 98.34% 30.56 GiB Wired, 0.00% 0 Cache >>> 1.34% 425.46 MiB Free, 0.00% 4.00 KiB Gap >>>=20 >>> Real Installed: 32.00 GiB >>> Real Available: 99.82% 31.94 GiB >>> Real Managed: 97.29% 31.08 GiB >>>=20 >>> Logical Total: 32.00 GiB >>> Logical Used: 98.56% 31.54 GiB >>> Logical Free: 1.44% 471.57 MiB >>>=20 >>> Kernel Memory: 3.17 GiB >>> Data: 99.18% 3.14 GiB >>> Text: 0.82% 26.68 MiB >>>=20 >>> Kernel Memory Map: 31.08 GiB >>> Size: 14.18% 4.41 GiB >>> Free: 85.82% 26.67 GiB >>>=20 >>> = ------------------------------------------------------------------------ >>>=20 >>> ARC Summary: (HEALTHY) >>> Memory Throttle Count: 0 >>>=20 >>> ARC Misc: >>> Deleted: 145 >>> Recycle Misses: 0 >>> Mutex Misses: 0 >>> Evict Skips: 0 >>>=20 >>> ARC Size: 14.17% 4.26 GiB >>> Target Size: (Adaptive) 100.00% 30.08 GiB >>> Min Size (Hard Limit): 12.50% 3.76 GiB >>> Max Size (High Water): 8:1 30.08 GiB >>>=20 >>> ARC Size Breakdown: >>> Recently Used Cache Size: 50.00% 15.04 GiB >>> Frequently Used Cache Size: 50.00% 15.04 GiB >>>=20 >>> ARC Hash Breakdown: >>> Elements Max: 270.56k >>> Elements Current: 100.00% 270.56k >>> Collisions: 23.66k >>> Chain Max: 3 >>> Chains: 8.28k >>>=20 >>> = ------------------------------------------------------------------------ >>>=20 >>> ARC Efficiency: 2.93m >>> Cache Hit Ratio: 70.44% 2.06m >>> Cache Miss Ratio: 29.56% 866.05k >>> Actual Hit Ratio: 70.40% 2.06m >>>=20 >>> Data Demand Efficiency: 97.47% 24.58k >>> Data Prefetch Efficiency: 1.88% 479 >>>=20 >>> CACHE HITS BY CACHE LIST: >>> Anonymously Used: 0.05% 1.07k >>> Most Recently Used: 71.82% 1.48m >>> Most Frequently Used: 28.13% 580.49k >>> Most Recently Used Ghost: 0.00% 0 >>> Most Frequently Used Ghost: 0.00% 0 >>>=20 >>> CACHE HITS BY DATA TYPE: >>> Demand Data: 1.16% 23.96k >>> Prefetch Data: 0.00% 9 >>> Demand Metadata: 98.79% 2.04m >>> Prefetch Metadata: 0.05% 1.08k >>>=20 >>> CACHE MISSES BY DATA TYPE: >>> Demand Data: 0.07% 621 >>> Prefetch Data: 0.05% 470 >>> Demand Metadata: 99.69% 863.35k >>> Prefetch Metadata: 0.19% 1.61k >>>=20 >>> = ------------------------------------------------------------------------ >>>=20 >>> L2ARC is disabled >>>=20 >>> = ------------------------------------------------------------------------ >>>=20 >>> File-Level Prefetch: (HEALTHY) >>>=20 >>> DMU Efficiency: 72.95k >>> Hit Ratio: 70.83% 51.66k >>> Miss Ratio: 29.17% 21.28k >>>=20 >>> Colinear: 21.28k >>> Hit Ratio: 0.01% 2 >>> Miss Ratio: 99.99% 21.28k >>>=20 >>> Stride: 50.45k >>> Hit Ratio: 99.98% 50.44k >>> Miss Ratio: 0.02% 9 >>>=20 >>> DMU Misc: >>> Reclaim: 21.28k >>> Successes: 1.73% 368 >>> Failures: 98.27% 20.91k >>>=20 >>> Streams: 1.23k >>> +Resets: 0.16% 2 >>> -Resets: 99.84% 1.23k >>> Bogus: 0 >>>=20 >>> = ------------------------------------------------------------------------ >>>=20 >>> VDEV cache is disabled >>>=20 >>> = ------------------------------------------------------------------------ >>>=20 >>> ZFS Tunables (sysctl): >>> kern.maxusers 2380 >>> vm.kmem_size 33367830528 >>> vm.kmem_size_scale 1 >>> vm.kmem_size_min 0 >>> vm.kmem_size_max 1319413950874 >>> vfs.zfs.arc_max 32294088704 >>> vfs.zfs.arc_min 4036761088 >>> vfs.zfs.arc_average_blocksize 8192 >>> vfs.zfs.arc_shrink_shift 5 >>> vfs.zfs.arc_free_target 56518 >>> vfs.zfs.arc_meta_used 4534349216 >>> vfs.zfs.arc_meta_limit 8073522176 >>> vfs.zfs.l2arc_write_max 8388608 >>> vfs.zfs.l2arc_write_boost 8388608 >>> vfs.zfs.l2arc_headroom 2 >>> vfs.zfs.l2arc_feed_secs 1 >>> vfs.zfs.l2arc_feed_min_ms 200 >>> vfs.zfs.l2arc_noprefetch 1 >>> vfs.zfs.l2arc_feed_again 1 >>> vfs.zfs.l2arc_norw 1 >>> vfs.zfs.anon_size 1786368 >>> vfs.zfs.anon_metadata_lsize 0 >>> vfs.zfs.anon_data_lsize 0 >>> vfs.zfs.mru_size 504812032 >>> vfs.zfs.mru_metadata_lsize 415273472 >>> vfs.zfs.mru_data_lsize 35227648 >>> vfs.zfs.mru_ghost_size 0 >>> vfs.zfs.mru_ghost_metadata_lsize 0 >>> vfs.zfs.mru_ghost_data_lsize 0 >>> vfs.zfs.mfu_size 3925990912 >>> vfs.zfs.mfu_metadata_lsize 3901947392 >>> vfs.zfs.mfu_data_lsize 7000064 >>> vfs.zfs.mfu_ghost_size 0 >>> vfs.zfs.mfu_ghost_metadata_lsize 0 >>> vfs.zfs.mfu_ghost_data_lsize 0 >>> vfs.zfs.l2c_only_size 0 >>> vfs.zfs.dedup.prefetch 1 >>> vfs.zfs.nopwrite_enabled 1 >>> vfs.zfs.mdcomp_disable 0 >>> vfs.zfs.max_recordsize 1048576 >>> vfs.zfs.dirty_data_max 3429735628 >>> vfs.zfs.dirty_data_max_max 4294967296 >>> vfs.zfs.dirty_data_max_percent 10 >>> vfs.zfs.dirty_data_sync 67108864 >>> vfs.zfs.delay_min_dirty_percent 60 >>> vfs.zfs.delay_scale 500000 >>> vfs.zfs.prefetch_disable 0 >>> vfs.zfs.zfetch.max_streams 8 >>> vfs.zfs.zfetch.min_sec_reap 2 >>> vfs.zfs.zfetch.block_cap 256 >>> vfs.zfs.zfetch.array_rd_sz 1048576 >>> vfs.zfs.top_maxinflight 32 >>> vfs.zfs.resilver_delay 2 >>> vfs.zfs.scrub_delay 4 >>> vfs.zfs.scan_idle 50 >>> vfs.zfs.scan_min_time_ms 1000 >>> vfs.zfs.free_min_time_ms 1000 >>> vfs.zfs.resilver_min_time_ms 3000 >>> vfs.zfs.no_scrub_io 0 >>> vfs.zfs.no_scrub_prefetch 0 >>> vfs.zfs.free_max_blocks -1 >>> vfs.zfs.metaslab.gang_bang 16777217 >>> vfs.zfs.metaslab.fragmentation_threshold70 >>> vfs.zfs.metaslab.debug_load 0 >>> vfs.zfs.metaslab.debug_unload 0 >>> vfs.zfs.metaslab.df_alloc_threshold 131072 >>> vfs.zfs.metaslab.df_free_pct 4 >>> vfs.zfs.metaslab.min_alloc_size 33554432 >>> vfs.zfs.metaslab.load_pct 50 >>> vfs.zfs.metaslab.unload_delay 8 >>> vfs.zfs.metaslab.preload_limit 3 >>> vfs.zfs.metaslab.preload_enabled 1 >>> vfs.zfs.metaslab.fragmentation_factor_enabled1 >>> vfs.zfs.metaslab.lba_weighting_enabled 1 >>> vfs.zfs.metaslab.bias_enabled 1 >>> vfs.zfs.condense_pct 200 >>> vfs.zfs.mg_noalloc_threshold 0 >>> vfs.zfs.mg_fragmentation_threshold 85 >>> vfs.zfs.check_hostid 1 >>> vfs.zfs.spa_load_verify_maxinflight 10000 >>> vfs.zfs.spa_load_verify_metadata 1 >>> vfs.zfs.spa_load_verify_data 1 >>> vfs.zfs.recover 0 >>> vfs.zfs.deadman_synctime_ms 1000000 >>> vfs.zfs.deadman_checktime_ms 5000 >>> vfs.zfs.deadman_enabled 1 >>> vfs.zfs.spa_asize_inflation 24 >>> vfs.zfs.spa_slop_shift 5 >>> vfs.zfs.space_map_blksz 4096 >>> vfs.zfs.txg.timeout 5 >>> vfs.zfs.vdev.metaslabs_per_vdev 200 >>> vfs.zfs.vdev.cache.max 16384 >>> vfs.zfs.vdev.cache.size 0 >>> vfs.zfs.vdev.cache.bshift 16 >>> vfs.zfs.vdev.trim_on_init 1 >>> vfs.zfs.vdev.mirror.rotating_inc 0 >>> vfs.zfs.vdev.mirror.rotating_seek_inc 5 >>> vfs.zfs.vdev.mirror.rotating_seek_offset1048576 >>> vfs.zfs.vdev.mirror.non_rotating_inc 0 >>> vfs.zfs.vdev.mirror.non_rotating_seek_inc1 >>> vfs.zfs.vdev.async_write_active_min_dirty_percent30 >>> vfs.zfs.vdev.async_write_active_max_dirty_percent60 >>> vfs.zfs.vdev.max_active 1000 >>> vfs.zfs.vdev.sync_read_min_active 10 >>> vfs.zfs.vdev.sync_read_max_active 10 >>> vfs.zfs.vdev.sync_write_min_active 10 >>> vfs.zfs.vdev.sync_write_max_active 10 >>> vfs.zfs.vdev.async_read_min_active 1 >>> vfs.zfs.vdev.async_read_max_active 3 >>> vfs.zfs.vdev.async_write_min_active 1 >>> vfs.zfs.vdev.async_write_max_active 10 >>> vfs.zfs.vdev.scrub_min_active 1 >>> vfs.zfs.vdev.scrub_max_active 2 >>> vfs.zfs.vdev.trim_min_active 1 >>> vfs.zfs.vdev.trim_max_active 64 >>> vfs.zfs.vdev.aggregation_limit 131072 >>> vfs.zfs.vdev.read_gap_limit 32768 >>> vfs.zfs.vdev.write_gap_limit 4096 >>> vfs.zfs.vdev.bio_flush_disable 0 >>> vfs.zfs.vdev.bio_delete_disable 0 >>> vfs.zfs.vdev.trim_max_bytes 2147483648 >>> vfs.zfs.vdev.trim_max_pending 64 >>> vfs.zfs.max_auto_ashift 13 >>> vfs.zfs.min_auto_ashift 9 >>> vfs.zfs.zil_replay_disable 0 >>> vfs.zfs.cache_flush_disable 0 >>> vfs.zfs.zio.use_uma 1 >>> vfs.zfs.zio.exclude_metadata 0 >>> vfs.zfs.sync_pass_deferred_free 2 >>> vfs.zfs.sync_pass_dont_compress 5 >>> vfs.zfs.sync_pass_rewrite 2 >>> vfs.zfs.snapshot_list_prefetch 0 >>> vfs.zfs.super_owner 0 >>> vfs.zfs.debug 0 >>> vfs.zfs.version.ioctl 4 >>> vfs.zfs.version.acl 1 >>> vfs.zfs.version.spa 5000 >>> vfs.zfs.version.zpl 5 >>> vfs.zfs.vol.mode 1 >>> vfs.zfs.vol.unmap_enabled 1 >>> vfs.zfs.trim.enabled 1 >>> vfs.zfs.trim.txg_delay 32 >>> vfs.zfs.trim.timeout 30 >>> vfs.zfs.trim.max_interval 1 >>>=20 >>> = ------------------------------------------------------------------------ >>>=20 >>>=20 >>> _______________________________________________ >>> freebsd-fs@freebsd.org mailing list >>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs >>> To unsubscribe, send any mail to = "freebsd-fs-unsubscribe@freebsd.org" >>>=20 >>>=20 >>> %SPAMBLOCK-SYS: Matched [@freebsd.org+], message ok >>=20 >> --=20 >> Karl Denninger >> karl@denninger.net >> /The Market Ticker/ >>=20 >>=20 >=20 > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@FreeBSD.ORG Wed Apr 1 08:16:42 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 8D356BE9 for ; Wed, 1 Apr 2015 08:16:42 +0000 (UTC) Received: from smtp42.i.mail.ru (smtp42.i.mail.ru [94.100.177.102]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 007A936C for ; Wed, 1 Apr 2015 08:16:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mail.ru; s=mail2; h=Content-Transfer-Encoding:Content-Type:In-Reply-To:References:Subject:CC:To:MIME-Version:From:Date:Message-ID; bh=j6F9PQj8ScTl+aIurEAEHjI4a35sVvb2OSARM30fTmU=; b=ojBUjUkhgej4yuNxh6Z2avAMjNMr11jkmZemDRxeXvQzEHdKhBdhlbfqUspw3zEjLrmOi+7Vcmbsj6SidxdovqgRgPso29DgCXYDpVgzs1yCKP77As09uw2aq8x4rq077N60zXnfi/YRtfk/rCx3+ozJ71HqC8sEfhZ+zulnvcs=; Received: from [109.188.127.13] (port=48335 helo=[192.168.0.12]) by smtp42.i.mail.ru with esmtpa (envelope-from ) id 1YdDp3-00005w-Uu; Wed, 01 Apr 2015 11:16:31 +0300 Message-ID: <551BA987.4050708@artem.ru> Date: Wed, 01 Apr 2015 11:17:11 +0300 From: Artem Kuchin User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: Konstantin Belousov Subject: Re: Little research how rm -rf and tar kill server References: <55170D9C.1070107@artem.ru> <1427727936.293597.247070269.5CE0D411@webmail.messagingengine.com> <55196FC7.8090107@artem.ru> <1427730597.303984.247097389.165D5AAB@webmail.messagingengine.com> <5519716F.6060007@artem.ru> <1427731061.306961.247099633.0A421E90@webmail.messagingengine.com> <5519740A.1070902@artem.ru> <1427731759.309823.247107417.308CD298@webmail.messagingengine.com> <5519F74C.1040308@artem.ru> <20150331164202.GN2379@kib.kiev.ua> In-Reply-To: <20150331164202.GN2379@kib.kiev.ua> Content-Type: text/plain; charset=koi8-r; format=flowed Content-Transfer-Encoding: 8bit X-Spam: Not detected X-Mras: Ok Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 01 Apr 2015 08:16:42 -0000 31.03.2015 19:42, Konstantin Belousov ÐÉÛÅÔ: > On Tue, Mar 31, 2015 at 04:24:28AM +0300, Artem Kuchin wrote: >> 30.03.2015 19:09, Mark Felder ÐÉÛÅÔ: >> > Syncer and sync(2) perform different kind of syncs. This is strange. Did you see what man 2 sync said? It specifically refers to syncer, implying that syncer does the same sync as man 2 sync Is man page outdated or incorrect? > Take the snapshot of > sysctl debug.softdep before and after the situation occur to have some > hints what is going on. Will do tonight. Before, during and after the untaring. > > Also, it makes sense to test the HEAD kernel after the r280763. Even if > I end up with any fix, it probably would require r280760 as prerequisite. This is a working loaded server, i am afraid to test HEAD on it. That make not be compatible with the rest of the system. And actually, i don't know how to fetch only kernel from HEAD :) > >> By the way: is there way to make sure that SU+J is really working? >> Maybe it is disabled for some reason >> and i don't know it. tunefs just shows stored setting, but, for example, >> with dirty fs, journaling is not >> working in reality. Any way to get current status of SU journaling? > Your statement about 'dirty fs' makes no sense. Maybe i misunderstood something Here is the situation: I have dirty shutdown, so fs is dirty fsck started at boot but said something about unexpected inconsistency and journal is not used i stopped fsck and made mount -f -a sh /etc/rc to boot the server and make it work anyway (later i shutdown it again and did fsck normally) As i understand journaling was disabled when i did mount -f -a correct? > > Journaling state is displayed by mount -v. Thank you! From owner-freebsd-fs@FreeBSD.ORG Wed Apr 1 08:36:19 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 5048C8CB for ; Wed, 1 Apr 2015 08:36:19 +0000 (UTC) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E4CF2807 for ; Wed, 1 Apr 2015 08:36:18 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.9/8.14.9) with ESMTP id t318a9ok045098 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Wed, 1 Apr 2015 11:36:09 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.9.2 kib.kiev.ua t318a9ok045098 Received: (from kostik@localhost) by tom.home (8.14.9/8.14.9/Submit) id t318a9uF045097; Wed, 1 Apr 2015 11:36:09 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Wed, 1 Apr 2015 11:36:09 +0300 From: Konstantin Belousov To: Artem Kuchin Subject: Re: Little research how rm -rf and tar kill server Message-ID: <20150401083609.GX2379@kib.kiev.ua> References: <1427727936.293597.247070269.5CE0D411@webmail.messagingengine.com> <55196FC7.8090107@artem.ru> <1427730597.303984.247097389.165D5AAB@webmail.messagingengine.com> <5519716F.6060007@artem.ru> <1427731061.306961.247099633.0A421E90@webmail.messagingengine.com> <5519740A.1070902@artem.ru> <1427731759.309823.247107417.308CD298@webmail.messagingengine.com> <5519F74C.1040308@artem.ru> <20150331164202.GN2379@kib.kiev.ua> <551BA987.4050708@artem.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=koi8-r Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <551BA987.4050708@artem.ru> User-Agent: Mutt/1.5.23 (2014-03-12) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.0 X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on tom.home Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 01 Apr 2015 08:36:19 -0000 On Wed, Apr 01, 2015 at 11:17:11AM +0300, Artem Kuchin wrote: > 31.03.2015 19:42, Konstantin Belousov ÐÉÛÅÔ: > > On Tue, Mar 31, 2015 at 04:24:28AM +0300, Artem Kuchin wrote: > >> 30.03.2015 19:09, Mark Felder ÐÉÛÅÔ: > >> > > Syncer and sync(2) perform different kind of syncs. > This is strange. Did you see what > man 2 sync > said? > It specifically refers to syncer, implying that syncer does the same sync as > man 2 sync Code may select whatever strategy is useful in given situation, which implements the specified outcome. Kernel syncer uses the different code path in ffs_sync() for periodic updates to avoid uneccessary work already done by buffer write-outs (which is initiated by the same syncer thread, FWIW). There might be a bug, which I am trying to track. > > Is man page outdated or incorrect? > > Take the snapshot of > > sysctl debug.softdep before and after the situation occur to have some > > hints what is going on. > > > Will do tonight. Before, during and after the untaring. > > > > > Also, it makes sense to test the HEAD kernel after the r280763. Even if > > I end up with any fix, it probably would require r280760 as prerequisite. > > This is a working loaded server, i am afraid to test HEAD on it. That > make not be compatible > with the rest of the system. And actually, i don't know how to fetch > only kernel from HEAD :) Checkout head, do make kernel-toolchain buildkernel installkernel. > > > > > >> By the way: is there way to make sure that SU+J is really working? > >> Maybe it is disabled for some reason > >> and i don't know it. tunefs just shows stored setting, but, for example, > >> with dirty fs, journaling is not > >> working in reality. Any way to get current status of SU journaling? > > Your statement about 'dirty fs' makes no sense. > Maybe i misunderstood something > Here is the situation: > I have dirty shutdown, so fs is dirty > fsck started at boot but said something about unexpected inconsistency > and journal is not used > i stopped fsck > and made > mount -f -a > sh /etc/rc > to boot the server and make it work anyway > (later i shutdown it again and did fsck normally) > > As i understand journaling was disabled when i did mount -f -a > correct? No, you did not read what fsck wrote. The journal was thrown out and not used for fsck. It has no relation to the mount operation. To see if journaling is performed, see mount -v output. What you did is actually dangerous. If due to software bugs or firmware errors the on-disk metadata structure was corrupted, mounting the volume could cause run-time check and panic in the best case. In the ideal world, only inode or block leaks can occur, which is innocent to not be fixed by fsck, but there is difference between ideal and real world. From owner-freebsd-fs@FreeBSD.ORG Wed Apr 1 09:00:40 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 1A7E8118 for ; Wed, 1 Apr 2015 09:00:40 +0000 (UTC) Received: from smtp36.i.mail.ru (smtp36.i.mail.ru [94.100.177.96]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 81404A47 for ; Wed, 1 Apr 2015 09:00:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mail.ru; s=mail2; h=Content-Transfer-Encoding:Content-Type:In-Reply-To:References:Subject:CC:To:MIME-Version:From:Date:Message-ID; bh=EjmQFB84y9zQDnNKu3a7a+OO9ZUqLzyhXBjNoC2DVE4=; b=RAiqFmdmYYHbBrL2RorwTgF0xMVOfkcDKzu966/hFVCab9ZQLpNaPIOwWtkjbngdkjH7Ji+7U+Q1LxFZPZ6Uk8uLPor7lGWnsaPCR96H52c8IUNsaKJfQlOvS+9Ob2n/3mJCX5HsWHnuWWWPVbRKJDu5zs1F4pK2J5zNLgGvE6w=; Received: from [109.188.127.13] (port=64911 helo=[192.168.0.12]) by smtp36.i.mail.ru with esmtpa (envelope-from ) id 1YdERR-0001CM-6g; Wed, 01 Apr 2015 11:56:10 +0300 Message-ID: <551BB2D4.6070505@artem.ru> Date: Wed, 01 Apr 2015 11:56:52 +0300 From: Artem Kuchin User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: Konstantin Belousov Subject: Re: Little research how rm -rf and tar kill server References: <1427727936.293597.247070269.5CE0D411@webmail.messagingengine.com> <55196FC7.8090107@artem.ru> <1427730597.303984.247097389.165D5AAB@webmail.messagingengine.com> <5519716F.6060007@artem.ru> <1427731061.306961.247099633.0A421E90@webmail.messagingengine.com> <5519740A.1070902@artem.ru> <1427731759.309823.247107417.308CD298@webmail.messagingengine.com> <5519F74C.1040308@artem.ru> <20150331164202.GN2379@kib.kiev.ua> <551BA987.4050708@artem.ru> <20150401083609.GX2379@kib.kiev.ua> In-Reply-To: <20150401083609.GX2379@kib.kiev.ua> Content-Type: text/plain; charset=koi8-r; format=flowed Content-Transfer-Encoding: 8bit X-Spam: Not detected X-Mras: Ok Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 01 Apr 2015 09:00:40 -0000 01.04.2015 11:36, Konstantin Belousov ÐÉÛÅÔ: >> >>>> By the way: is there way to make sure that SU+J is really working? >>>> Maybe it is disabled for some reason >>>> and i don't know it. tunefs just shows stored setting, but, for example, >>>> with dirty fs, journaling is not >>>> working in reality. Any way to get current status of SU journaling? >>> Your statement about 'dirty fs' makes no sense. >> Maybe i misunderstood something >> Here is the situation: >> I have dirty shutdown, so fs is dirty >> fsck started at boot but said something about unexpected inconsistency >> and journal is not used >> i stopped fsck >> and made >> mount -f -a >> sh /etc/rc >> to boot the server and make it work anyway >> (later i shutdown it again and did fsck normally) >> >> As i understand journaling was disabled when i did mount -f -a >> correct? > No, you did not read what fsck wrote. The journal was thrown out and > not used for fsck. It has no relation to the mount operation. To see > if journaling is performed, see mount -v output. I assumed what was not :) Did not know about mount -v at that time. > > What you did is actually dangerous. If due to software bugs or firmware > errors the on-disk metadata structure was corrupted, mounting the volume > could cause run-time check and panic in the best case. In the ideal > world, only inode or block leaks can occur, which is innocent to not > be fixed by fsck, but there is difference between ideal and real world. Yes, I know it is a bad way to go, but since fsck was dirty because of the hang bug the buffers were empty before reboot, so, basically sync was complete just clean flag not set. It happened before and fsck only found leaky inodes. From owner-freebsd-fs@FreeBSD.ORG Wed Apr 1 18:41:05 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 921D7D11 for ; Wed, 1 Apr 2015 18:41:05 +0000 (UTC) Received: from elf.torek.net (mail.torek.net [96.90.199.121]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 36FDA810 for ; Wed, 1 Apr 2015 18:41:03 +0000 (UTC) Received: from elf.torek.net (localhost [127.0.0.1]) by elf.torek.net (8.14.5/8.14.5) with ESMTP id t318CxQ7062913; Wed, 1 Apr 2015 02:13:00 -0600 (MDT) (envelope-from torek@torek.net) Message-Id: <201504010813.t318CxQ7062913@elf.torek.net> From: Chris Torek To: Da Rock , freebsd-fs@freebsd.org Subject: Re: Delete a directory, crash the system MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-ID: <62905.1427875979.1@elf.torek.net> Date: Wed, 01 Apr 2015 01:12:59 -0700 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3 (elf.torek.net [127.0.0.1]); Wed, 01 Apr 2015 02:13:00 -0600 (MDT) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 01 Apr 2015 18:41:05 -0000 > ... run fsck after a crash ... will allow you to recover from > media errors which it appears your system is suffering from. > SU+J is just a faster way of restarting but only works when you > do not have media errors. It's worth adding that this is also needed after some non-media memory errors (on systems lacking ECC, which -- alas -- includes my main home system) that corrupt bits in FFS bitmaps and directories, which are then written back to the (perfectly good) medium. (I had a bad DRAM chip that memtest86+ found for me, that until I replaced it caused broken file systems, that after I replaced it in turn caused panics followed by inappropriate journal recovery. There's really nothing to do here other than force the fsck.) Chris From owner-freebsd-fs@FreeBSD.ORG Wed Apr 1 20:06:16 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 8AC49112 for ; Wed, 1 Apr 2015 20:06:16 +0000 (UTC) Received: from mail.tezzaron.com (mail.tezzaron.com [50.206.41.178]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 519C11EB for ; Wed, 1 Apr 2015 20:06:16 +0000 (UTC) Received: from delaware.tezzaron.com ([10.252.50.1]) by mail.tezzaron.com (IceWarp 11.1.2.0 x64) with ASMTP (SSL) id 201504011503410510 for ; Wed, 01 Apr 2015 15:03:41 -0500 Message-ID: <551C4F1D.1000206@tezzaron.com> Date: Wed, 01 Apr 2015 15:03:41 -0500 From: Adam Guimont User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: NFSD high CPU usage Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 01 Apr 2015 20:06:16 -0000 I have an issue where NFSD will max out the CPU (1200% in this case) when a client workstation runs out of memory while trying to write via NFS. What also happens is the TCP Recv-Q fills up and causes connection timeouts for any other client trying to use the NFS server. I can reproduce the issue by running stress on a low-end client workstation. Change into the NFS mounted directory and then use stress to write via NFS and exhaust the memory, example: stress --cpu 2 --io 4 --vm 20 --hdd 4 The client workstation will eventually run out of memory trying to write into the NFS directory, fill the TCP Recv-Q on the NFS server, and then NFSD will max out the CPU. The actual client workstations (~50) are not running stress when this happens, it's a mixture of EDA tools (simulation and verification). For what it's worth, this is how I've been monitoring the TCP buffer queues where "xx.xxx.xx.xxx" is the IP address of the NFS server: cmdwatch -n1 'netstat -an | grep -e "Proto" -e "tcp4" | grep -e "Proto" -e "xx.xxx.xx.xxx.2049"' I have tried several tuning recommendations but it has not solved the problem. Has anyone else experienced this and is anyone else able to reproduce it? --- NFS server specs: OS = FreeBSD 10.0-RELEASE CPU = E5-1650 v3 Memory = 96GB Disks = 24x ST6000NM0034 in 4x raidz2 HBA = LSI SAS 9300-8i NIC = Intel 10Gb X540-T2 --- /boot/loader.conf autoboot_delay="3" geom_mirror_load="YES" mpslsi3_load="YES" cc_htcp_load="YES" --- /etc/rc.conf hostname="***" ifconfig_ix0="inet *** netmask 255.255.248.0 -tso -vlanhwtso" defaultrouter="***" sshd_enable="YES" ntpd_enable="YES" zfs_enable="YES" sendmail_enable="NO" nfs_server_enable="YES" nfs_server_flags="-h *** -t -n 128" nfs_client_enable="YES" rpcbind_enable="YES" rpc_lockd_enable="YES" rpc_statd_enable="YES" samba_enable="YES" atop_enable="YES" atop_interval="5" zabbix_agentd_enable="YES" --- /etc/sysctl.conf vfs.nfsd.server_min_nfsvers=3 vfs.nfsd.cachetcp=0 kern.ipc.maxsockbuf=16777216 net.inet.tcp.sendbuf_max=16777216 net.inet.tcp.recvbuf_max=16777216 net.inet.tcp.sendspace=1048576 net.inet.tcp.recvspace=1048576 net.inet.tcp.sendbuf_inc=32768 net.inet.tcp.recvbuf_inc=65536 net.inet.tcp.keepidle=10000 net.inet.tcp.keepintvl=2500 net.inet.tcp.always_keepalive=1 net.inet.tcp.cc.algorithm=htcp net.inet.tcp.cc.htcp.adaptive_backoff=1 net.inet.tcp.cc.htcp.rtt_scaling=1 net.inet.tcp.sack.enable=0 kern.ipc.soacceptqueue=1024 net.inet.tcp.mssdflt=1460 net.inet.tcp.minmss=1300 net.inet.tcp.tso=0 --- Client workstations: OS = CentOS 6.6 x64 Mount options from `cat /proc/mounts` = rw,nosuid,noatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=***,mountvers=3,mountport=916,mountproto=udp,local_lock=none,addr=*** --- Regards, Adam Guimont From owner-freebsd-fs@FreeBSD.ORG Wed Apr 1 20:49:59 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id C6CA6E5F for ; Wed, 1 Apr 2015 20:49:59 +0000 (UTC) Received: from smtp102-5.vfemail.net (eightfive.vfemail.net [96.30.253.185]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 92936870 for ; Wed, 1 Apr 2015 20:49:59 +0000 (UTC) Received: (qmail 4234 invoked by uid 89); 1 Apr 2015 20:43:14 -0000 Received: by simscan 1.4.0 ppid: 4225, pid: 4230, t: 0.0901s scanners:none Received: from unknown (HELO d3d3MTExQDE0Mjc5MjA5OTQ=) (cmlja0BoYXZva21vbi5jb21AMTQyNzkyMDk5NA==@MTcyLjE2LjEwMC45M0AxNDI3OTIwOTk0) by 172.16.100.62 with ESMTPA; 1 Apr 2015 20:43:14 -0000 Date: Wed, 01 Apr 2015 15:43:14 -0500 Message-ID: <20150401154314.Horde.e_w-9XEJOaa4SwYyNLlttA3@www.vfemail.net> From: Rick Romero To: freebsd-fs@freebsd.org Subject: Re: NFSD high CPU usage In-Reply-To: <551C4F1D.1000206@tezzaron.com> User-Agent: Internet Messaging Program (IMP) H5 (6.2.2) X-VFEmail-Originating-IP: MTIuMzEuMTAwLjE0Ng== X-VFEmail-AntiSpam: Notify admin@vfemail.net of any spam, and include VFEmail headers MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed; DelSp=Yes Content-Transfer-Encoding: 8bit Content-Disposition: inline Content-Description: Plaintext Message X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 01 Apr 2015 20:49:59 -0000 Quoting Adam Guimont : > I have an issue where NFSD will max out the CPU (1200% in this case) > when a client workstation runs out of memory while trying to write via > NFS. What also happens is the TCP Recv-Q fills up and causes connection > timeouts for any other client trying to use the NFS server. > > I can reproduce the issue by running stress on a low-end client > workstation. Change into the NFS mounted directory and then use stress > to write via NFS and exhaust the memory, example: > > stress --cpu 2 --io 4 --vm 20 --hdd 4 > > The client workstation will eventually run out of memory trying to write > into the NFS directory, fill the TCP Recv-Q on the NFS server, and then > NFSD will max out the CPU. > > The actual client workstations (~50) are not running stress when this > happens, it's a mixture of EDA tools (simulation and verification). > > For what it's worth, this is how I've been monitoring the TCP buffer > queues where "xx.xxx.xx.xxx" is the IP address of the NFS server: > > cmdwatch -n1 'netstat -an | grep -e "Proto" -e "tcp4" | grep -e "Proto" > -e "xx.xxx.xx.xxx.2049"' > > I have tried several tuning recommendations but it has not solved the > problem. > > Has anyone else experienced this and is anyone else able to reproduce it? > > --- > NFS server specs: > > OS = FreeBSD 10.0-RELEASE > CPU = E5-1650 v3 > Memory = 96GB > Disks = 24x ST6000NM0034 in 4x raidz2 > HBA = LSI SAS 9300-8i > NIC = Intel 10Gb X540-T2 > --- > /boot/loader.conf > > autoboot_delay="3" > geom_mirror_load="YES" > mpslsi3_load="YES" > cc_htcp_load="YES" > --- > /etc/rc.conf > > hostname="***" > ifconfig_ix0="inet *** netmask 255.255.248.0 -tso -vlanhwtso" > defaultrouter="***" > sshd_enable="YES" > ntpd_enable="YES" > zfs_enable="YES" > sendmail_enable="NO" > nfs_server_enable="YES" > nfs_server_flags="-h *** -t -n 128" > nfs_client_enable="YES" > rpcbind_enable="YES" > rpc_lockd_enable="YES" > rpc_statd_enable="YES" > samba_enable="YES" > atop_enable="YES" > atop_interval="5" > zabbix_agentd_enable="YES" > --- > /etc/sysctl.conf > > vfs.nfsd.server_min_nfsvers=3 > vfs.nfsd.cachetcp=0 > kern.ipc.maxsockbuf=16777216 > net.inet.tcp.sendbuf_max=16777216 > net.inet.tcp.recvbuf_max=16777216 > net.inet.tcp.sendspace=1048576 > net.inet.tcp.recvspace=1048576 > net.inet.tcp.sendbuf_inc=32768 > net.inet.tcp.recvbuf_inc=65536 > net.inet.tcp.keepidle=10000 > net.inet.tcp.keepintvl=2500 > net.inet.tcp.always_keepalive=1 > net.inet.tcp.cc.algorithm=htcp > net.inet.tcp.cc.htcp.adaptive_backoff=1 > net.inet.tcp.cc.htcp.rtt_scaling=1 > net.inet.tcp.sack.enable=0 > kern.ipc.soacceptqueue=1024 > net.inet.tcp.mssdflt=1460 > net.inet.tcp.minmss=1300 > net.inet.tcp.tso=0 Does your ZFS pool have log devices? How does gstat -d   look? If the drives are busy, try adding vfs.nfsd.async: 0 Rick From owner-freebsd-fs@FreeBSD.ORG Wed Apr 1 22:13:14 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 88AA5493 for ; Wed, 1 Apr 2015 22:13:14 +0000 (UTC) Received: from smtp9.mail.ru (smtp9.mail.ru [94.100.181.97]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 9A0EE220 for ; Wed, 1 Apr 2015 22:13:13 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mail.ru; s=mail2; h=Content-Type:In-Reply-To:References:Subject:CC:To:MIME-Version:From:Date:Message-ID; bh=+L7pksYdMGXPIHgMGKsiXlb5Z+VuUp46uS3rO2zk/yw=; b=TNHcQ0ikblNNKz50u7kASGmpkHPONiImpl+4Y066jkJ/zj7dMpxK9xxkCrmNeUwqd+8ob57700TYUgPMsbknXETcMeki4+slR8YN2aXx+5XlghCxAVkXnV4OHVw/B2SNQJQb3a0kk65PvMhIvok39EmSzDMAtv4lTmpprIqS0Ks=; Received: from [109.188.127.13] (port=17663 helo=[192.168.0.12]) by smtp9.mail.ru with esmtpa (envelope-from ) id 1YdQsh-0001eR-Qs; Thu, 02 Apr 2015 01:13:04 +0300 Message-ID: <551C6D9F.8010506@artem.ru> Date: Thu, 02 Apr 2015 01:13:51 +0300 From: Artem Kuchin User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: Konstantin Belousov Subject: Re: Little research how rm -rf and tar kill server References: <55170D9C.1070107@artem.ru> <1427727936.293597.247070269.5CE0D411@webmail.messagingengine.com> <55196FC7.8090107@artem.ru> <1427730597.303984.247097389.165D5AAB@webmail.messagingengine.com> <5519716F.6060007@artem.ru> <1427731061.306961.247099633.0A421E90@webmail.messagingengine.com> <5519740A.1070902@artem.ru> <1427731759.309823.247107417.308CD298@webmail.messagingengine.com> <5519F74C.1040308@artem.ru> <20150331164202.GN2379@kib.kiev.ua> In-Reply-To: <20150331164202.GN2379@kib.kiev.ua> Content-Type: multipart/mixed; boundary="------------050008080403010001010207" X-Spam: Not detected X-Mras: Ok Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 01 Apr 2015 22:13:14 -0000 This is a multi-part message in MIME format. --------------050008080403010001010207 Content-Type: text/plain; charset=koi8-r; format=flowed Content-Transfer-Encoding: 8bit 31.03.2015 19:42, Konstantin Belousov ÐÉÛÅÔ: > Syncer and sync(2) perform different kind of syncs. Take the snapshot of > sysctl debug.softdep before and after the situation occur to have some > hints what is going on. > > Okay. Here is the sysctl data start fro normal operations see 1_normal.txt (sysctl executed momentarily) then start untar and wait for about 3 minutes when sql queries start to build up (from usual 20 idle connection to 70 queries) in 'opening table' state and number of http and perl processes rise see 2_during.txt (sysctl executed 1 minute to show data) now CTRL-C the untar and wait 60 seconds, the number of queries increase more and more , number of processes slowly rise too, situation cannot fix itself and becomes worse with time see 3_after.txt (sysctl executed 1 minute to show data) Now i did sync from shell (takes several seconds) sql query queue immediately becomes empty, number of processes momentarily drops almost to normal see 4_after_sync.txt (sysctl executed momentarily) Hope this clarifies something. Artem --------------050008080403010001010207 Content-Type: text/plain; charset=windows-1251; name="1_normal.txt" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="1_normal.txt" debug.softdep.total.pagedep: 450815 debug.softdep.total.inodedep: 2057986 debug.softdep.total.bmsafemap: 1643440 debug.softdep.total.newblk: 6861020 debug.softdep.total.allocdirect: 2747092 debug.softdep.total.indirdep: 28330 debug.softdep.total.allocindir: 4113928 debug.softdep.total.freefrag: 504393 debug.softdep.total.freeblks: 1300916 debug.softdep.total.freefile: 1244188 debug.softdep.total.diradd: 1704738 debug.softdep.total.mkdir: 679656 debug.softdep.total.dirrem: 1411511 debug.softdep.total.newdirblk: 340829 debug.softdep.total.freework: 1534540 debug.softdep.total.freedep: 4740 debug.softdep.total.jaddref: 2384394 debug.softdep.total.jremref: 1804487 debug.softdep.total.jmvref: 2713 debug.softdep.total.jnewblk: 6861020 debug.softdep.total.jfreeblk: 0 debug.softdep.total.jfreefrag: 504393 debug.softdep.total.jseg: 258774 debug.softdep.total.jsegdep: 11554294 debug.softdep.total.sbdep: 5437 debug.softdep.total.jtrunc: 0 debug.softdep.total.jfsync: 0 debug.softdep.highuse.pagedep: 43744 debug.softdep.highuse.inodedep: 129930 debug.softdep.highuse.bmsafemap: 866 debug.softdep.highuse.newblk: 4 debug.softdep.highuse.allocdirect: 43751 debug.softdep.highuse.indirdep: 544 debug.softdep.highuse.allocindir: 86553 debug.softdep.highuse.freefrag: 501 debug.softdep.highuse.freeblks: 129622 debug.softdep.highuse.freefile: 118234 debug.softdep.highuse.diradd: 110141 debug.softdep.highuse.mkdir: 140620 debug.softdep.highuse.dirrem: 86738 debug.softdep.highuse.newdirblk: 43739 debug.softdep.highuse.freework: 129640 debug.softdep.highuse.freedep: 902 debug.softdep.highuse.jaddref: 28040 debug.softdep.highuse.jremref: 2921 debug.softdep.highuse.jmvref: 393 debug.softdep.highuse.jnewblk: 9358 debug.softdep.highuse.jfreeblk: 0 debug.softdep.highuse.jfreefrag: 14 debug.softdep.highuse.jseg: 13633 debug.softdep.highuse.jsegdep: 149182 debug.softdep.highuse.sbdep: 1 debug.softdep.highuse.jtrunc: 0 debug.softdep.highuse.jfsync: 0 debug.softdep.current.pagedep: 19 debug.softdep.current.inodedep: 731 debug.softdep.current.bmsafemap: 26 debug.softdep.current.newblk: 0 debug.softdep.current.allocdirect: 29 debug.softdep.current.indirdep: 6 debug.softdep.current.allocindir: 28 debug.softdep.current.freefrag: 6 debug.softdep.current.freeblks: 30 debug.softdep.current.freefile: 1 debug.softdep.current.diradd: 15 debug.softdep.current.mkdir: 0 debug.softdep.current.dirrem: 14 debug.softdep.current.newdirblk: 0 debug.softdep.current.freework: 61 debug.softdep.current.freedep: 1 debug.softdep.current.jaddref: 3 debug.softdep.current.jremref: 0 debug.softdep.current.jmvref: 0 debug.softdep.current.jnewblk: 3 debug.softdep.current.jfreeblk: 0 debug.softdep.current.jfreefrag: 0 debug.softdep.current.jseg: 76 debug.softdep.current.jsegdep: 165 debug.softdep.current.sbdep: 0 debug.softdep.current.jtrunc: 0 debug.softdep.current.jfsync: 0 debug.softdep.write.pagedep: 451335 debug.softdep.write.inodedep: 2202365 debug.softdep.write.bmsafemap: 194364 debug.softdep.write.newblk: 0 debug.softdep.write.allocdirect: 957173 debug.softdep.write.indirdep: 36669 debug.softdep.write.allocindir: 4076840 debug.softdep.write.freefrag: 0 debug.softdep.write.freeblks: 400530 debug.softdep.write.freefile: 0 debug.softdep.write.diradd: 0 debug.softdep.write.mkdir: 0 debug.softdep.write.dirrem: 0 debug.softdep.write.newdirblk: 0 debug.softdep.write.freework: 0 debug.softdep.write.freedep: 0 debug.softdep.write.jaddref: 0 debug.softdep.write.jremref: 0 debug.softdep.write.jmvref: 0 debug.softdep.write.jnewblk: 0 debug.softdep.write.jfreeblk: 0 debug.softdep.write.jfreefrag: 0 debug.softdep.write.jseg: 258714 debug.softdep.write.jsegdep: 33764 debug.softdep.write.sbdep: 5438 debug.softdep.write.jtrunc: 0 debug.softdep.write.jfsync: 0 debug.softdep.max_softdeps: 2486412 debug.softdep.tickdelay: 2 debug.softdep.flush_threads: 1 debug.softdep.worklist_push: 0 debug.softdep.blk_limit_push: 0 debug.softdep.ino_limit_push: 0 debug.softdep.blk_limit_hit: 0 debug.softdep.ino_limit_hit: 0 debug.softdep.sync_limit_hit: 1136 debug.softdep.indir_blk_ptrs: 11229 debug.softdep.inode_bitmap: 35308 debug.softdep.direct_blk_ptrs: 46553 debug.softdep.dir_entry: 20915 debug.softdep.jaddref_rollback: 31626 debug.softdep.jnewblk_rollback: 72747 debug.softdep.journal_low: 22546 debug.softdep.journal_min: 0 debug.softdep.journal_wait: 47528 debug.softdep.jwait_filepage: 18646 debug.softdep.jwait_freeblks: 0 debug.softdep.jwait_inode: 24778 debug.softdep.jwait_newblk: 4104 debug.softdep.cleanup_blkrequests: 0 debug.softdep.cleanup_inorequests: 0 debug.softdep.cleanup_high_delay: 0 debug.softdep.cleanup_retries: 0 debug.softdep.cleanup_failures: 0 debug.softdep.flushcache: 0 debug.softdep.emptyjblocks: 0 debug.softdep.print_threads: 0 --------------050008080403010001010207 Content-Type: text/plain; charset=windows-1251; name="2_during.txt" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="2_during.txt" debug.softdep.total.pagedep: 779163 debug.softdep.total.inodedep: 2389227 debug.softdep.total.bmsafemap: 1973594 debug.softdep.total.newblk: 7191570 debug.softdep.total.allocdirect: 3077407 debug.softdep.total.indirdep: 28415 debug.softdep.total.allocindir: 4114163 debug.softdep.total.freefrag: 505771 debug.softdep.total.freeblks: 1303240 debug.softdep.total.freefile: 1247037 debug.softdep.total.diradd: 2034447 debug.softdep.total.mkdir: 1333270 debug.softdep.total.dirrem: 1414343 debug.softdep.total.newdirblk: 668355 debug.softdep.total.freework: 1537006 debug.softdep.total.freedep: 4745 debug.softdep.total.jaddref: 3367732 debug.softdep.total.jremref: 1807338 debug.softdep.total.jmvref: 2717 debug.softdep.total.jnewblk: 7191594 debug.softdep.total.jfreeblk: 0 debug.softdep.total.jfreefrag: 505773 debug.softdep.total.jseg: 262170 debug.softdep.total.jsegdep: 12872452 debug.softdep.total.sbdep: 5453 debug.softdep.total.jtrunc: 0 debug.softdep.total.jfsync: 0 debug.softdep.highuse.pagedep: 70216 debug.softdep.highuse.inodedep: 133633 debug.softdep.highuse.bmsafemap: 866 debug.softdep.highuse.newblk: 4 debug.softdep.highuse.allocdirect: 70222 debug.softdep.highuse.indirdep: 544 debug.softdep.highuse.allocindir: 86553 debug.softdep.highuse.freefrag: 501 debug.softdep.highuse.freeblks: 129622 debug.softdep.highuse.freefile: 118234 debug.softdep.highuse.diradd: 133491 debug.softdep.highuse.mkdir: 182210 debug.softdep.highuse.dirrem: 86738 debug.softdep.highuse.newdirblk: 70210 debug.softdep.highuse.freework: 129640 debug.softdep.highuse.freedep: 902 debug.softdep.highuse.jaddref: 40772 debug.softdep.highuse.jremref: 2921 debug.softdep.highuse.jmvref: 393 debug.softdep.highuse.jnewblk: 13621 debug.softdep.highuse.jfreeblk: 0 debug.softdep.highuse.jfreefrag: 14 debug.softdep.highuse.jseg: 13633 debug.softdep.highuse.jsegdep: 208114 debug.softdep.highuse.sbdep: 1 debug.softdep.highuse.jtrunc: 0 debug.softdep.highuse.jfsync: 0 debug.softdep.current.pagedep: 4 debug.softdep.current.inodedep: 112 debug.softdep.current.bmsafemap: 9 debug.softdep.current.newblk: 0 debug.softdep.current.allocdirect: 10 debug.softdep.current.indirdep: 0 debug.softdep.current.allocindir: 0 debug.softdep.current.freefrag: 3 debug.softdep.current.freeblks: 2 debug.softdep.current.freefile: 1 debug.softdep.current.diradd: 6 debug.softdep.current.mkdir: 1 debug.softdep.current.dirrem: 2 debug.softdep.current.newdirblk: 0 debug.softdep.current.freework: 2 debug.softdep.current.freedep: 0 debug.softdep.current.jaddref: 9 debug.softdep.current.jremref: 0 debug.softdep.current.jmvref: 0 debug.softdep.current.jnewblk: 2 debug.softdep.current.jfreeblk: 0 debug.softdep.current.jfreefrag: 0 debug.softdep.current.jseg: 1769 debug.softdep.current.jsegdep: 33 debug.softdep.current.sbdep: 0 debug.softdep.current.jtrunc: 0 debug.softdep.current.jfsync: 0 debug.softdep.write.pagedep: 783366 debug.softdep.write.inodedep: 2625116 debug.softdep.write.bmsafemap: 204889 debug.softdep.write.newblk: 0 debug.softdep.write.allocdirect: 1284997 debug.softdep.write.indirdep: 36857 debug.softdep.write.allocindir: 4077080 debug.softdep.write.freefrag: 0 debug.softdep.write.freeblks: 401687 debug.softdep.write.freefile: 0 debug.softdep.write.diradd: 0 debug.softdep.write.mkdir: 0 debug.softdep.write.dirrem: 0 debug.softdep.write.newdirblk: 0 debug.softdep.write.freework: 0 debug.softdep.write.freedep: 0 debug.softdep.write.jaddref: 0 debug.softdep.write.jremref: 0 debug.softdep.write.jmvref: 0 debug.softdep.write.jnewblk: 0 debug.softdep.write.jfreeblk: 0 debug.softdep.write.jfreefrag: 0 debug.softdep.write.jseg: 262241 debug.softdep.write.jsegdep: 34358 debug.softdep.write.sbdep: 5454 debug.softdep.write.jtrunc: 0 debug.softdep.write.jfsync: 0 debug.softdep.max_softdeps: 2486412 debug.softdep.tickdelay: 2 debug.softdep.flush_threads: 1 debug.softdep.worklist_push: 0 debug.softdep.blk_limit_push: 0 debug.softdep.ino_limit_push: 0 debug.softdep.blk_limit_hit: 0 debug.softdep.ino_limit_hit: 0 debug.softdep.sync_limit_hit: 2388 debug.softdep.indir_blk_ptrs: 11345 debug.softdep.inode_bitmap: 38918 debug.softdep.direct_blk_ptrs: 55812 debug.softdep.dir_entry: 24407 debug.softdep.jaddref_rollback: 48407 debug.softdep.jnewblk_rollback: 85405 debug.softdep.journal_low: 38453 debug.softdep.journal_min: 0 debug.softdep.journal_wait: 48493 debug.softdep.jwait_filepage: 18935 debug.softdep.jwait_freeblks: 0 debug.softdep.jwait_inode: 25105 debug.softdep.jwait_newblk: 4461 debug.softdep.cleanup_blkrequests: 0 debug.softdep.cleanup_inorequests: 0 debug.softdep.cleanup_high_delay: 0 debug.softdep.cleanup_retries: 0 debug.softdep.cleanup_failures: 0 debug.softdep.flushcache: 0 debug.softdep.emptyjblocks: 0 debug.softdep.print_threads: 0 --------------050008080403010001010207 Content-Type: text/plain; charset=windows-1251; name="3_after.txt" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="3_after.txt" debug.softdep.total.pagedep: 779819 debug.softdep.total.inodedep: 2390351 debug.softdep.total.bmsafemap: 1974305 debug.softdep.total.newblk: 7191979 debug.softdep.total.allocdirect: 3077801 debug.softdep.total.indirdep: 28433 debug.softdep.total.allocindir: 4114181 debug.softdep.total.freefrag: 505830 debug.softdep.total.freeblks: 1303445 debug.softdep.total.freefile: 1247282 debug.softdep.total.diradd: 2034791 debug.softdep.total.mkdir: 1333312 debug.softdep.total.dirrem: 1414636 debug.softdep.total.newdirblk: 668375 debug.softdep.total.freework: 1537216 debug.softdep.total.freedep: 4745 debug.softdep.total.jaddref: 3368107 debug.softdep.total.jremref: 1807626 debug.softdep.total.jmvref: 2717 debug.softdep.total.jnewblk: 7191998 debug.softdep.total.jfreeblk: 0 debug.softdep.total.jfreefrag: 505835 debug.softdep.total.jseg: 262755 debug.softdep.total.jsegdep: 12873590 debug.softdep.total.sbdep: 5453 debug.softdep.total.jtrunc: 0 debug.softdep.total.jfsync: 0 debug.softdep.highuse.pagedep: 70216 debug.softdep.highuse.inodedep: 133633 debug.softdep.highuse.bmsafemap: 866 debug.softdep.highuse.newblk: 4 debug.softdep.highuse.allocdirect: 70222 debug.softdep.highuse.indirdep: 544 debug.softdep.highuse.allocindir: 86553 debug.softdep.highuse.freefrag: 501 debug.softdep.highuse.freeblks: 129622 debug.softdep.highuse.freefile: 118234 debug.softdep.highuse.diradd: 133491 debug.softdep.highuse.mkdir: 182210 debug.softdep.highuse.dirrem: 86738 debug.softdep.highuse.newdirblk: 70210 debug.softdep.highuse.freework: 129640 debug.softdep.highuse.freedep: 902 debug.softdep.highuse.jaddref: 40772 debug.softdep.highuse.jremref: 2921 debug.softdep.highuse.jmvref: 393 debug.softdep.highuse.jnewblk: 13621 debug.softdep.highuse.jfreeblk: 0 debug.softdep.highuse.jfreefrag: 14 debug.softdep.highuse.jseg: 13633 debug.softdep.highuse.jsegdep: 208114 debug.softdep.highuse.sbdep: 1 debug.softdep.highuse.jtrunc: 0 debug.softdep.highuse.jfsync: 0 debug.softdep.current.pagedep: 4 debug.softdep.current.inodedep: 109 debug.softdep.current.bmsafemap: 7 debug.softdep.current.newblk: 0 debug.softdep.current.allocdirect: 11 debug.softdep.current.indirdep: 0 debug.softdep.current.allocindir: 0 debug.softdep.current.freefrag: 1 debug.softdep.current.freeblks: 1 debug.softdep.current.freefile: 1 debug.softdep.current.diradd: 3 debug.softdep.current.mkdir: 0 debug.softdep.current.dirrem: 1 debug.softdep.current.newdirblk: 0 debug.softdep.current.freework: 2 debug.softdep.current.freedep: 0 debug.softdep.current.jaddref: 7 debug.softdep.current.jremref: 0 debug.softdep.current.jmvref: 0 debug.softdep.current.jnewblk: 2 debug.softdep.current.jfreeblk: 0 debug.softdep.current.jfreefrag: 0 debug.softdep.current.jseg: 2369 debug.softdep.current.jsegdep: 22 debug.softdep.current.sbdep: 0 debug.softdep.current.jtrunc: 0 debug.softdep.current.jfsync: 0 debug.softdep.write.pagedep: 784161 debug.softdep.write.inodedep: 2631087 debug.softdep.write.bmsafemap: 208423 debug.softdep.write.newblk: 0 debug.softdep.write.allocdirect: 1285360 debug.softdep.write.indirdep: 36945 debug.softdep.write.allocindir: 4077099 debug.softdep.write.freefrag: 0 debug.softdep.write.freeblks: 401899 debug.softdep.write.freefile: 0 debug.softdep.write.diradd: 0 debug.softdep.write.mkdir: 0 debug.softdep.write.dirrem: 0 debug.softdep.write.newdirblk: 0 debug.softdep.write.freework: 0 debug.softdep.write.freedep: 0 debug.softdep.write.jaddref: 0 debug.softdep.write.jremref: 0 debug.softdep.write.jmvref: 0 debug.softdep.write.jnewblk: 0 debug.softdep.write.jfreeblk: 0 debug.softdep.write.jfreefrag: 0 debug.softdep.write.jseg: 262837 debug.softdep.write.jsegdep: 34611 debug.softdep.write.sbdep: 5456 debug.softdep.write.jtrunc: 0 debug.softdep.write.jfsync: 0 debug.softdep.max_softdeps: 2486412 debug.softdep.tickdelay: 2 debug.softdep.flush_threads: 1 debug.softdep.worklist_push: 0 debug.softdep.blk_limit_push: 0 debug.softdep.ino_limit_push: 0 debug.softdep.blk_limit_hit: 0 debug.softdep.ino_limit_hit: 0 debug.softdep.sync_limit_hit: 2973 debug.softdep.indir_blk_ptrs: 11400 debug.softdep.inode_bitmap: 39849 debug.softdep.direct_blk_ptrs: 58227 debug.softdep.dir_entry: 24529 debug.softdep.jaddref_rollback: 50497 debug.softdep.jnewblk_rollback: 86203 debug.softdep.journal_low: 45662 debug.softdep.journal_min: 0 debug.softdep.journal_wait: 48940 debug.softdep.jwait_filepage: 19079 debug.softdep.jwait_freeblks: 0 debug.softdep.jwait_inode: 25253 debug.softdep.jwait_newblk: 4609 debug.softdep.cleanup_blkrequests: 0 debug.softdep.cleanup_inorequests: 0 debug.softdep.cleanup_high_delay: 0 debug.softdep.cleanup_retries: 0 debug.softdep.cleanup_failures: 0 debug.softdep.flushcache: 0 debug.softdep.emptyjblocks: 0 debug.softdep.print_threads: 0 --------------050008080403010001010207 Content-Type: text/plain; charset=windows-1251; name="4_after_sync.txt" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="4_after_sync.txt" debug.softdep.total.pagedep: 780250 debug.softdep.total.inodedep: 2391614 debug.softdep.total.bmsafemap: 1975152 debug.softdep.total.newblk: 7193175 debug.softdep.total.allocdirect: 3078971 debug.softdep.total.indirdep: 28450 debug.softdep.total.allocindir: 4114204 debug.softdep.total.freefrag: 506112 debug.softdep.total.freeblks: 1304037 debug.softdep.total.freefile: 1247903 debug.softdep.total.diradd: 2035419 debug.softdep.total.mkdir: 1333314 debug.softdep.total.dirrem: 1415336 debug.softdep.total.newdirblk: 668376 debug.softdep.total.freework: 1537886 debug.softdep.total.freedep: 4745 debug.softdep.total.jaddref: 3368733 debug.softdep.total.jremref: 1808324 debug.softdep.total.jmvref: 2717 debug.softdep.total.jnewblk: 7193175 debug.softdep.total.jfreeblk: 0 debug.softdep.total.jfreefrag: 506112 debug.softdep.total.jseg: 263090 debug.softdep.total.jsegdep: 12876344 debug.softdep.total.sbdep: 5457 debug.softdep.total.jtrunc: 0 debug.softdep.total.jfsync: 0 debug.softdep.highuse.pagedep: 70216 debug.softdep.highuse.inodedep: 133633 debug.softdep.highuse.bmsafemap: 866 debug.softdep.highuse.newblk: 4 debug.softdep.highuse.allocdirect: 70222 debug.softdep.highuse.indirdep: 544 debug.softdep.highuse.allocindir: 86553 debug.softdep.highuse.freefrag: 501 debug.softdep.highuse.freeblks: 129622 debug.softdep.highuse.freefile: 118234 debug.softdep.highuse.diradd: 133491 debug.softdep.highuse.mkdir: 182210 debug.softdep.highuse.dirrem: 86738 debug.softdep.highuse.newdirblk: 70210 debug.softdep.highuse.freework: 129640 debug.softdep.highuse.freedep: 902 debug.softdep.highuse.jaddref: 40772 debug.softdep.highuse.jremref: 2921 debug.softdep.highuse.jmvref: 393 debug.softdep.highuse.jnewblk: 13621 debug.softdep.highuse.jfreeblk: 0 debug.softdep.highuse.jfreefrag: 14 debug.softdep.highuse.jseg: 13633 debug.softdep.highuse.jsegdep: 208114 debug.softdep.highuse.sbdep: 1 debug.softdep.highuse.jtrunc: 0 debug.softdep.highuse.jfsync: 0 debug.softdep.current.pagedep: 27 debug.softdep.current.inodedep: 292 debug.softdep.current.bmsafemap: 36 debug.softdep.current.newblk: 0 debug.softdep.current.allocdirect: 87 debug.softdep.current.indirdep: 5 debug.softdep.current.allocindir: 11 debug.softdep.current.freefrag: 10 debug.softdep.current.freeblks: 104 debug.softdep.current.freefile: 95 debug.softdep.current.diradd: 34 debug.softdep.current.mkdir: 0 debug.softdep.current.dirrem: 31 debug.softdep.current.newdirblk: 0 debug.softdep.current.freework: 160 debug.softdep.current.freedep: 0 debug.softdep.current.jaddref: 10 debug.softdep.current.jremref: 1 debug.softdep.current.jmvref: 0 debug.softdep.current.jnewblk: 5 debug.softdep.current.jfreeblk: 0 debug.softdep.current.jfreefrag: 0 debug.softdep.current.jseg: 31 debug.softdep.current.jsegdep: 315 debug.softdep.current.sbdep: 0 debug.softdep.current.jtrunc: 0 debug.softdep.current.jfsync: 0 debug.softdep.write.pagedep: 784384 debug.softdep.write.inodedep: 2632692 debug.softdep.write.bmsafemap: 209270 debug.softdep.write.newblk: 0 debug.softdep.write.allocdirect: 1285495 debug.softdep.write.indirdep: 36966 debug.softdep.write.allocindir: 4077111 debug.softdep.write.freefrag: 0 debug.softdep.write.freeblks: 402072 debug.softdep.write.freefile: 0 debug.softdep.write.diradd: 0 debug.softdep.write.mkdir: 0 debug.softdep.write.dirrem: 0 debug.softdep.write.newdirblk: 0 debug.softdep.write.freework: 0 debug.softdep.write.freedep: 0 debug.softdep.write.jaddref: 0 debug.softdep.write.jremref: 0 debug.softdep.write.jmvref: 0 debug.softdep.write.jnewblk: 0 debug.softdep.write.jfreeblk: 0 debug.softdep.write.jfreefrag: 0 debug.softdep.write.jseg: 263021 debug.softdep.write.jsegdep: 34678 debug.softdep.write.sbdep: 5458 debug.softdep.write.jtrunc: 0 debug.softdep.write.jfsync: 0 debug.softdep.max_softdeps: 2486412 debug.softdep.tickdelay: 2 debug.softdep.flush_threads: 1 debug.softdep.worklist_push: 0 debug.softdep.blk_limit_push: 0 debug.softdep.ino_limit_push: 0 debug.softdep.blk_limit_hit: 0 debug.softdep.ino_limit_hit: 0 debug.softdep.sync_limit_hit: 3076 debug.softdep.indir_blk_ptrs: 11407 debug.softdep.inode_bitmap: 40042 debug.softdep.direct_blk_ptrs: 58465 debug.softdep.dir_entry: 24539 debug.softdep.jaddref_rollback: 50790 debug.softdep.jnewblk_rollback: 86373 debug.softdep.journal_low: 46780 debug.softdep.journal_min: 0 debug.softdep.journal_wait: 49010 debug.softdep.jwait_filepage: 19100 debug.softdep.jwait_freeblks: 0 debug.softdep.jwait_inode: 25281 debug.softdep.jwait_newblk: 4629 debug.softdep.cleanup_blkrequests: 0 debug.softdep.cleanup_inorequests: 0 debug.softdep.cleanup_high_delay: 0 debug.softdep.cleanup_retries: 0 debug.softdep.cleanup_failures: 0 debug.softdep.flushcache: 0 debug.softdep.emptyjblocks: 0 debug.softdep.print_threads: 0 --------------050008080403010001010207-- From owner-freebsd-fs@FreeBSD.ORG Thu Apr 2 02:28:17 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 5E3A61D3 for ; Thu, 2 Apr 2015 02:28:17 +0000 (UTC) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id EF491FCC for ; Thu, 2 Apr 2015 02:28:16 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A2CtBABWqBxV/95baINcg1pcBYMQwkMKhSpJAoITAQEBAQEBfoQfAQEEAQEBICsgCxsOCgICDRkCKQEJJgYIBwQBHASIDg20c5grAQEBAQEBAQMBAQEBAQEBARqBIYoIhBYQAgEFFwEzB4JogUUFlFaDXoN9kmoihAoiMQEGgT1/AQEB X-IronPort-AV: E=Sophos;i="5.11,507,1422939600"; d="scan'208";a="201338561" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 01 Apr 2015 22:28:15 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id D968AB3EB2; Wed, 1 Apr 2015 22:28:15 -0400 (EDT) Date: Wed, 1 Apr 2015 22:28:15 -0400 (EDT) From: Rick Macklem To: Adam Guimont Message-ID: <1199661815.10758124.1427941695874.JavaMail.root@uoguelph.ca> In-Reply-To: <551C4F1D.1000206@tezzaron.com> Subject: Re: NFSD high CPU usage MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.95.10] X-Mailer: Zimbra 7.2.6_GA_2926 (ZimbraWebClient - FF3.0 (Win)/7.2.6_GA_2926) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 02 Apr 2015 02:28:17 -0000 Adam Guimont wrote: > I have an issue where NFSD will max out the CPU (1200% in this case) > when a client workstation runs out of memory while trying to write > via > NFS. What also happens is the TCP Recv-Q fills up and causes > connection > timeouts for any other client trying to use the NFS server. > > I can reproduce the issue by running stress on a low-end client > workstation. Change into the NFS mounted directory and then use > stress > to write via NFS and exhaust the memory, example: > > stress --cpu 2 --io 4 --vm 20 --hdd 4 > > The client workstation will eventually run out of memory trying to > write > into the NFS directory, fill the TCP Recv-Q on the NFS server, and > then > NFSD will max out the CPU. > > The actual client workstations (~50) are not running stress when this > happens, it's a mixture of EDA tools (simulation and verification). > > For what it's worth, this is how I've been monitoring the TCP buffer > queues where "xx.xxx.xx.xxx" is the IP address of the NFS server: > > cmdwatch -n1 'netstat -an | grep -e "Proto" -e "tcp4" | grep -e > "Proto" > -e "xx.xxx.xx.xxx.2049"' > > I have tried several tuning recommendations but it has not solved the > problem. > > Has anyone else experienced this and is anyone else able to reproduce > it? > > --- > NFS server specs: > > OS = FreeBSD 10.0-RELEASE > CPU = E5-1650 v3 > Memory = 96GB > Disks = 24x ST6000NM0034 in 4x raidz2 > HBA = LSI SAS 9300-8i > NIC = Intel 10Gb X540-T2 > --- > /boot/loader.conf > > autoboot_delay="3" > geom_mirror_load="YES" > mpslsi3_load="YES" > cc_htcp_load="YES" > --- > /etc/rc.conf > > hostname="***" > ifconfig_ix0="inet *** netmask 255.255.248.0 -tso -vlanhwtso" > defaultrouter="***" > sshd_enable="YES" > ntpd_enable="YES" > zfs_enable="YES" > sendmail_enable="NO" > nfs_server_enable="YES" > nfs_server_flags="-h *** -t -n 128" > nfs_client_enable="YES" > rpcbind_enable="YES" > rpc_lockd_enable="YES" > rpc_statd_enable="YES" > samba_enable="YES" > atop_enable="YES" > atop_interval="5" > zabbix_agentd_enable="YES" > --- > /etc/sysctl.conf > > vfs.nfsd.server_min_nfsvers=3 > vfs.nfsd.cachetcp=0 > kern.ipc.maxsockbuf=16777216 > net.inet.tcp.sendbuf_max=16777216 > net.inet.tcp.recvbuf_max=16777216 > net.inet.tcp.sendspace=1048576 > net.inet.tcp.recvspace=1048576 > net.inet.tcp.sendbuf_inc=32768 > net.inet.tcp.recvbuf_inc=65536 > net.inet.tcp.keepidle=10000 > net.inet.tcp.keepintvl=2500 > net.inet.tcp.always_keepalive=1 > net.inet.tcp.cc.algorithm=htcp > net.inet.tcp.cc.htcp.adaptive_backoff=1 > net.inet.tcp.cc.htcp.rtt_scaling=1 > net.inet.tcp.sack.enable=0 > kern.ipc.soacceptqueue=1024 > net.inet.tcp.mssdflt=1460 > net.inet.tcp.minmss=1300 > net.inet.tcp.tso=0 > --- > Client workstations: > > OS = CentOS 6.6 x64 > Mount options from `cat /proc/mounts` = > rw,nosuid,noatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=***,mountvers=3,mountport=916,mountproto=udp,local_lock=none,addr=*** I can think of two explanations for this. 1 - The server nfsd threads get confused when the TCP recv Q fills and start looping around. OR 2 - The client is sending massive #s of RPCs (or crap that is incomplete RPCs). To get a better idea w.r.t. what is going on, I'd suggest that you capture packets (for a relatively short period) when the server is 100% CPU busy. # tcpdump -s 0 -w out.pcap host - run on the server should do it. Then look at out.pcap in wireshark and see what the packets look like. (wireshark understands NFS, whereas tcpdump doesn't) If #1, I'd guess very little traffic (maybe TCP layer stuff), if #2, I'd guess you'll see a lot of RPC requests or garbage that isn't a valid request. (This latter case would suggest a CentOS problem.) If you capture the packets but can't look at them in wireshark, you could email me the packet capture as an attachment and I can look at it after Apr. 10, when I get home. rick > --- > > > Regards, > > Adam Guimont > > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > From owner-freebsd-fs@FreeBSD.ORG Thu Apr 2 15:14:06 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id BEFE695E for ; Thu, 2 Apr 2015 15:14:06 +0000 (UTC) Received: from mail.tezzaron.com (mail.tezzaron.com [50.206.41.178]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 684A71F4 for ; Thu, 2 Apr 2015 15:14:05 +0000 (UTC) Received: from delaware.tezzaron.com ([10.252.50.1]) by mail.tezzaron.com (IceWarp 11.1.2.0 x64) with ASMTP (SSL) id 201504021014058602; Thu, 02 Apr 2015 10:14:05 -0500 Message-ID: <551D5CBC.1010009@tezzaron.com> Date: Thu, 02 Apr 2015 10:14:04 -0500 From: Adam Guimont User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0 MIME-Version: 1.0 To: rick@havokmon.com Subject: Re: NFSD high CPU usage References: <20150401154314.Horde.e_w-9XEJOaa4SwYyNLlttA3@www.vfemail.net> In-Reply-To: <20150401154314.Horde.e_w-9XEJOaa4SwYyNLlttA3@www.vfemail.net> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 02 Apr 2015 15:14:06 -0000 Rick Romero wrote: > Does your ZFS pool have log devices? > How does gstat -d look? > > If the drives are busy, try adding > vfs.nfsd.async: 0 No log devices but the disks are not busy when this happens. I have an atop snapshot from the last time it happened: http://pastebin.com/raw.php?i=LQjbKTXR Regards, Adam Guimont From owner-freebsd-fs@FreeBSD.ORG Thu Apr 2 15:51:11 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 3A8B0EE2 for ; Thu, 2 Apr 2015 15:51:11 +0000 (UTC) Received: from smtp102-5.vfemail.net (eightfive.vfemail.net [96.30.253.185]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id EB3598CC for ; Thu, 2 Apr 2015 15:51:10 +0000 (UTC) Received: (qmail 19216 invoked by uid 89); 2 Apr 2015 15:51:06 -0000 Received: by simscan 1.4.0 ppid: 19159, pid: 19212, t: 0.0815s scanners:none Received: from unknown (HELO d3d3MTExQDE0Mjc5ODk4NjY=) (cmlja0BoYXZva21vbi5jb21AMTQyNzk4OTg2Ng==@MTcyLjE2LjEwMC45M0AxNDI3OTg5ODY2) by 172.16.100.62 with ESMTPA; 2 Apr 2015 15:51:06 -0000 Date: Thu, 02 Apr 2015 10:50:40 -0500 Message-ID: <20150402105040.Horde.DpcVnMHXCV_MvaXmGcnU1g8@www.vfemail.net> From: Rick Romero To: Adam Guimont Subject: Re: NFSD high CPU usage References: <20150401154314.Horde.e_w-9XEJOaa4SwYyNLlttA3@www.vfemail.net> <551D5CBC.1010009@tezzaron.com> In-Reply-To: <551D5CBC.1010009@tezzaron.com> User-Agent: Internet Messaging Program (IMP) H5 (6.2.2) X-VFEmail-Originating-IP: MTIuMzEuMTAwLjE0Ng== X-VFEmail-AntiSpam: Notify admin@vfemail.net of any spam, and include VFEmail headers MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed; DelSp=Yes Content-Transfer-Encoding: 8bit Content-Disposition: inline Content-Description: Plaintext Message X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 02 Apr 2015 15:51:11 -0000 Quoting Adam Guimont : > Rick Romero wrote: >> Does your ZFS pool have log devices? >> How does gstat -d   look? >> >> If the drives are busy, try adding >> vfs.nfsd.async: 0 > > No log devices but the disks are not busy when this happens. > > I have an atop snapshot from the last time it happened: > http://pastebin.com/raw.php?i=LQjbKTXR Are the disks busy before it happens?   I'm far from an expert, but when running ZFS with NFS, I've had a lot of issues.  My final resolutions were to turn ASYNC off and have log devices and I even have SSD volumes now. Otherwise under load the NFS server gets hung up. It never seemed to happen on UFS, but due to the number of small files I have, ZFS provides the best backup functionality. I'm now trying to move all functions from NFS (to more TCP client/server). You have different info than I've gathered, and it might be because of usage. I actively use the system that I've seen NFS dump on, so I see the slowness beginning. Once NFS dies, the drive load goes back to normal. I wonder, if maybe you are just managing a system for others, and you don't see it until after the fact?  Just a thought based on my limited experience. Rick From owner-freebsd-fs@FreeBSD.ORG Thu Apr 2 16:05:10 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 7D0D9832 for ; Thu, 2 Apr 2015 16:05:10 +0000 (UTC) Received: from p3plsmtpa06-07.prod.phx3.secureserver.net (p3plsmtpa06-07.prod.phx3.secureserver.net [173.201.192.108]) (using TLSv1.2 with cipher DHE-RSA-AES128-SHA (128/128 bits)) (Client CN "Bizanga Labs SMTP Client Certificate", Issuer "Bizanga Labs CA" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 4DA1CAA0 for ; Thu, 2 Apr 2015 16:05:09 +0000 (UTC) Received: from kateleycoimac.local ([63.231.252.189]) by p3plsmtpa06-07.prod.phx3.secureserver.net with id B43W1q00F45wnoF0143Xl7; Thu, 02 Apr 2015 09:03:33 -0700 Message-ID: <551D6852.2090308@kateleyco.com> Date: Thu, 02 Apr 2015 11:03:30 -0500 From: Linda Kateley User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: zfs-discuss@zfsonlinux.org, freebsd-fs@freebsd.org, zfs-discuss@list.zfsonlinux.org, freenas-devel@lists.freenas.org, zfs@lists.illumos.org, developer@open-zfs.org, omnios-discuss@lists.omniti.com Subject: Re: Open-ZFS Office Hours References: <550C4190.40508@kateleyco.com> In-Reply-To: <550C4190.40508@kateleyco.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 02 Apr 2015 16:05:10 -0000 We had to update the broadcast link... https://plus.google.com/hangouts/_/hoaevent/AP36tYfW4J2Ht9Y1zKtsZ6IzajdVPD74JdriYSbwajrhRa51Wayw2g To participate via google+ hangout, YouTube, or IRC. See details on the webpage: http://www.open-zfs.org/wiki/OpenZFS_Office_Hours linda On 3/20/15 10:49 AM, Linda Kateley wrote: > Hi, > > I am going to try and get office hours going again for the open-zfs > community. > > We have scheduled a meeting on Thursday April 2nd at 9 AM PDT 11AM EDT > 4PM GMT > > This months guest host is Justin Gibbs from freebsd. > > I am a little new at using hangouts, but I am pretty sure this link > will get you there > https://plus.google.com/events/ctt39ds1j8onc2uthm3kkaf6tl0 > This info is also posted on open-zfs.org site. > > If you have ideas for future meetings, let me know. > > Tia > -- Linda Kateley President Kateley Company 612-807-6349 Skype ID-kateleyco http://kateleyco.com From owner-freebsd-fs@FreeBSD.ORG Thu Apr 2 19:25:58 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id AE97CE19 for ; Thu, 2 Apr 2015 19:25:58 +0000 (UTC) Received: from mail.tezzaron.com (mail.tezzaron.com [50.206.41.178]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 727EE7A4 for ; Thu, 2 Apr 2015 19:25:58 +0000 (UTC) Received: from delaware.tezzaron.com ([10.252.50.1]) by mail.tezzaron.com (IceWarp 11.1.2.0 x64) with ASMTP (SSL) id 201504021425572207; Thu, 02 Apr 2015 14:25:57 -0500 Message-ID: <551D97C5.8020405@tezzaron.com> Date: Thu, 02 Apr 2015 14:25:57 -0500 From: Adam Guimont User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0 MIME-Version: 1.0 To: rick@havokmon.com Subject: Re: NFSD high CPU usage References: <20150402105040.Horde.DpcVnMHXCV_MvaXmGcnU1g8@www.vfemail.net> In-Reply-To: <20150402105040.Horde.DpcVnMHXCV_MvaXmGcnU1g8@www.vfemail.net> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 02 Apr 2015 19:25:58 -0000 Rick Romero wrote: > Are the disks busy before it happens? I'm far from an expert, but when > running ZFS with NFS, I've had a lot of issues. My final resolutions were > to turn ASYNC off and have log devices and I even have SSD volumes now. > Otherwise under load the NFS server gets hung up. It never seemed to happen > on UFS, but due to the number of small files I have, ZFS provides the best > backup functionality. I'm now trying to move all functions from NFS (to > more TCP client/server). > > You have different info than I've gathered, and it might be because of > usage. I actively use the system that I've seen NFS dump on, so I see the > slowness beginning. Once NFS dies, the drive load goes back to normal. I > wonder, if maybe you are just managing a system for others, and you don't > see it until after the fact? Just a thought based on my limited > experience. No, the disks are not busy before this happens. I use the server every day and keep a pretty close eye on it. The disks can get busy but it doesn't spike nfsd that much and usually doesn't last more than a few seconds. When this particular issue happens with the nfsd CPU spike, it will last until the job running on the client workstation gets killed or when the client workstation is rebooted. After that it takes a few seconds for the TCP buffers to flush out and allow other clients to connect again. Regards, Adam Guimont From owner-freebsd-fs@FreeBSD.ORG Thu Apr 2 21:02:50 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 694A02BA for ; Thu, 2 Apr 2015 21:02:50 +0000 (UTC) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 0BA275E2 for ; Thu, 2 Apr 2015 21:02:49 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.9/8.14.9) with ESMTP id t32L2fHX035087 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Fri, 3 Apr 2015 00:02:41 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.9.2 kib.kiev.ua t32L2fHX035087 Received: (from kostik@localhost) by tom.home (8.14.9/8.14.9/Submit) id t32L2fZ6035086; Fri, 3 Apr 2015 00:02:41 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Fri, 3 Apr 2015 00:02:41 +0300 From: Konstantin Belousov To: Artem Kuchin Subject: Re: Little research how rm -rf and tar kill server Message-ID: <20150402210241.GD2379@kib.kiev.ua> References: <1427727936.293597.247070269.5CE0D411@webmail.messagingengine.com> <55196FC7.8090107@artem.ru> <1427730597.303984.247097389.165D5AAB@webmail.messagingengine.com> <5519716F.6060007@artem.ru> <1427731061.306961.247099633.0A421E90@webmail.messagingengine.com> <5519740A.1070902@artem.ru> <1427731759.309823.247107417.308CD298@webmail.messagingengine.com> <5519F74C.1040308@artem.ru> <20150331164202.GN2379@kib.kiev.ua> <551C6D9F.8010506@artem.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=koi8-r Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <551C6D9F.8010506@artem.ru> User-Agent: Mutt/1.5.23 (2014-03-12) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.0 X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on tom.home Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 02 Apr 2015 21:02:50 -0000 On Thu, Apr 02, 2015 at 01:13:51AM +0300, Artem Kuchin wrote: > 31.03.2015 19:42, Konstantin Belousov ÐÉÛÅÔ: > > Syncer and sync(2) perform different kind of syncs. Take the snapshot of > > sysctl debug.softdep before and after the situation occur to have some > > hints what is going on. > > > > > > Okay. Here is the sysctl data Try this. It may be not enough, I will provide some update in this case. No need to resend the sysctl data. Just test whether explicit sync(2) is needed in your situation after the patch. diff --git a/sys/ufs/ffs/ffs_extern.h b/sys/ufs/ffs/ffs_extern.h index c29e5d5..8494223 100644 --- a/sys/ufs/ffs/ffs_extern.h +++ b/sys/ufs/ffs/ffs_extern.h @@ -160,7 +160,7 @@ void softdep_journal_fsync(struct inode *); void softdep_buf_append(struct buf *, struct workhead *); void softdep_inode_append(struct inode *, struct ucred *, struct workhead *); void softdep_freework(struct workhead *); - +int softdep_need_sbupdate(struct ufsmount *ump); /* * Things to request flushing in softdep_request_cleanup() diff --git a/sys/ufs/ffs/ffs_softdep.c b/sys/ufs/ffs/ffs_softdep.c index ab2bd41..e6ed696 100644 --- a/sys/ufs/ffs/ffs_softdep.c +++ b/sys/ufs/ffs/ffs_softdep.c @@ -612,6 +612,13 @@ softdep_freework(wkhd) panic("softdep_freework called"); } +int +softdep_need_sbupdate(ump) + struct ufsmount *ump; +{ + + panic("softdep_need_sbupdate called"); +} #else FEATURE(softupdates, "FFS soft-updates support"); @@ -9479,6 +9486,18 @@ first_unlinked_inodedep(ump) return (inodedep); } +int +softdep_need_sbupdate(ump) + struct ufsmount *ump; +{ + struct inodedep *inodedep; + + ACQUIRE_LOCK(ump); + inodedep = first_unlinked_inodedep(ump); + FREE_LOCK(ump); + return (inodedep != NULL); +} + /* * Set the sujfree unlinked head pointer prior to writing a superblock. */ diff --git a/sys/ufs/ffs/ffs_vfsops.c b/sys/ufs/ffs/ffs_vfsops.c index 6e2e556..b2973a2 100644 --- a/sys/ufs/ffs/ffs_vfsops.c +++ b/sys/ufs/ffs/ffs_vfsops.c @@ -1419,6 +1419,7 @@ static int ffs_sync_lazy(mp) struct mount *mp; { + struct ufsmount *ump; struct vnode *mvp, *vp; struct inode *ip; struct thread *td; @@ -1461,9 +1462,13 @@ qupdate: qsync(mp); #endif - if (VFSTOUFS(mp)->um_fs->fs_fmod != 0 && - (error = ffs_sbupdate(VFSTOUFS(mp), MNT_LAZY, 0)) != 0) - allerror = error; + ump = VFSTOUFS(mp); + if (ump->um_fs->fs_fmod != 0 || (MOUNTEDSUJ(mp) && + softdep_need_sbupdate(ump))) { + error = ffs_sbupdate(ump, MNT_LAZY, 0); + if (error != 0) + allerror = error; + } return (allerror); } From owner-freebsd-fs@FreeBSD.ORG Fri Apr 3 15:38:31 2015 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 88C20AAE for ; Fri, 3 Apr 2015 15:38:31 +0000 (UTC) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 6F1239BD for ; Fri, 3 Apr 2015 15:38:31 +0000 (UTC) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id t33FcV0o089754 for ; Fri, 3 Apr 2015 15:38:31 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 194513] zfs recv hangs in state kmem arena Date: Fri, 03 Apr 2015 15:38:30 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 10.0-STABLE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: smh@FreeBSD.org X-Bugzilla-Status: In Progress X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: cc Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 03 Apr 2015 15:38:31 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=194513 Steven Hartland changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |smh@FreeBSD.org --- Comment #10 from Steven Hartland --- The following commit may have a positive effect for this issue: https://svnweb.freebsd.org/base?view=revision&revision=281026 -- You are receiving this mail because: You are the assignee for the bug. From owner-freebsd-fs@FreeBSD.ORG Fri Apr 3 20:27:08 2015 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 0B5C3675 for ; Fri, 3 Apr 2015 20:27:08 +0000 (UTC) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E5FFFE39 for ; Fri, 3 Apr 2015 20:27:07 +0000 (UTC) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id t33KR7wI003401 for ; Fri, 3 Apr 2015 20:27:07 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 194513] zfs recv hangs in state kmem arena Date: Fri, 03 Apr 2015 20:27:07 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 10.0-STABLE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: bra@fsn.hu X-Bugzilla-Status: In Progress X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 03 Apr 2015 20:27:08 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=194513 --- Comment #11 from bra@fsn.hu --- (In reply to Palle Girgensohn from comment #9) BTW, yes, vm.kmem_size. Raising that seemed to help here. -- You are receiving this mail because: You are the assignee for the bug. From owner-freebsd-fs@FreeBSD.ORG Fri Apr 3 21:33:40 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 03DBE21D for ; Fri, 3 Apr 2015 21:33:40 +0000 (UTC) Received: from mail.tezzaron.com (mail.tezzaron.com [50.206.41.178]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id BA7FF6CF for ; Fri, 3 Apr 2015 21:33:39 +0000 (UTC) Received: from delaware.tezzaron.com ([10.252.50.1]) by mail.tezzaron.com (IceWarp 11.1.2.0 x64) with ASMTP (SSL) id 201504031633310251; Fri, 03 Apr 2015 16:33:31 -0500 Message-ID: <551F072C.1000505@tezzaron.com> Date: Fri, 03 Apr 2015 16:33:32 -0500 From: Adam Guimont User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0 MIME-Version: 1.0 To: Rick Macklem Subject: Re: NFSD high CPU usage References: <1199661815.10758124.1427941695874.JavaMail.root@uoguelph.ca> In-Reply-To: <1199661815.10758124.1427941695874.JavaMail.root@uoguelph.ca> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 03 Apr 2015 21:33:40 -0000 Rick Macklem wrote: > I can think of two explanations for this. > 1 - The server nfsd threads get confused when the TCP recv Q fills > and start looping around. > OR > 2 - The client is sending massive #s of RPCs (or crap that is > incomplete RPCs). > > To get a better idea w.r.t. what is going on, I'd suggest that > you capture packets (for a relatively short period) when the > server is 100% CPU busy. > # tcpdump -s 0 -w out.pcap host > - run on the server should do it. > Then look at out.pcap in wireshark and see what the packets > look like. (wireshark understands NFS, whereas tcpdump doesn't) > If #1, I'd guess very little traffic (maybe TCP layer stuff), > if #2, I'd guess you'll see a lot of RPC requests or garbage > that isn't a valid request. (This latter case would suggest a > CentOS problem.) > > If you capture the packets but can't look at them in wireshark, > you could email me the packet capture as an attachment and I > can look at it after Apr. 10, when I get home. > > rick > Thanks Rick, I was able to capture this today while it was happening. The capture is for about 100 seconds. I took a look at it in wireshark and to me it appears like the #2 situation you were describing. If you would like to confirm that I've uploaded the pcap file here: https://www.dropbox.com/s/pdhwj5z5tz7iwou/out.pcap.20150403 I will continue running some tests and trying to gather as much data as I can. Regards, Adam Guimont From owner-freebsd-fs@FreeBSD.ORG Fri Apr 3 21:59:35 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id E3EBF629 for ; Fri, 3 Apr 2015 21:59:35 +0000 (UTC) Received: from smtp21.mail.ru (smtp21.mail.ru [94.100.179.250]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 5B72E91F for ; Fri, 3 Apr 2015 21:59:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mail.ru; s=mail2; h=Content-Transfer-Encoding:Content-Type:In-Reply-To:References:Subject:CC:To:MIME-Version:From:Date:Message-ID; bh=ytW8LL25wCqeNUw7sXG2VZ/5y9s9DwqlYU9HThYYcok=; b=kDpL7O8moO+xxc3GfI3upK56OiK4g+b9s5I9aLGg2QWgaNb6J6MvQfdba9q+wlxAiz5TVj1wPgg1mgsvmD2qFomqgGBv0YUX4ZoaO5XDuzTSQqT8L5xQZgpxX5YFYl0MMlhqMOPEp1BgkVkJ2HnubHEXtizsnjoXH6uiwwnSNz8=; Received: from [109.188.125.8] (port=13271 helo=[192.168.0.12]) by smtp21.mail.ru with esmtpa (envelope-from ) id 1Ye9cb-0003wH-KT; Sat, 04 Apr 2015 00:59:26 +0300 Message-ID: <551F0D4A.5040007@artem.ru> Date: Sat, 04 Apr 2015 00:59:38 +0300 From: Artem Kuchin User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: Konstantin Belousov Subject: Re: Little research how rm -rf and tar kill server References: <1427727936.293597.247070269.5CE0D411@webmail.messagingengine.com> <55196FC7.8090107@artem.ru> <1427730597.303984.247097389.165D5AAB@webmail.messagingengine.com> <5519716F.6060007@artem.ru> <1427731061.306961.247099633.0A421E90@webmail.messagingengine.com> <5519740A.1070902@artem.ru> <1427731759.309823.247107417.308CD298@webmail.messagingengine.com> <5519F74C.1040308@artem.ru> <20150331164202.GN2379@kib.kiev.ua> <551C6D9F.8010506@artem.ru> <20150402210241.GD2379@kib.kiev.ua> In-Reply-To: <20150402210241.GD2379@kib.kiev.ua> Content-Type: text/plain; charset=koi8-r; format=flowed Content-Transfer-Encoding: 8bit X-Spam: Not detected X-Mras: Ok Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 03 Apr 2015 21:59:36 -0000 03.04.2015 0:02, Konstantin Belousov ÐÉÛÅÔ: > On Thu, Apr 02, 2015 at 01:13:51AM +0300, Artem Kuchin wrote: >> 31.03.2015 19:42, Konstantin Belousov ÐÉÛÅÔ: >>> Syncer and sync(2) perform different kind of syncs. Take the snapshot of >>> sysctl debug.softdep before and after the situation occur to have some >>> hints what is going on. >>> >>> >> Okay. Here is the sysctl data > Try this. It may be not enough, I will provide some update in this case. > No need to resend the sysctl data. Just test whether explicit sync(2) is > needed in your situation after the patch. > > Okay, patches, recompiled and installed new kernel. The behaviour changed a bit. Now when i start untar mysql quickly rises to 40 queries in the queue in opening table state. (before the rise was slower) BUT after a while (20-30 seconds) all queries are executed. This cycle repeated 4 times and then situation aggravated quickly. It happened when untar reached big subtree with tons of small files. Queue grew to 70 queries, processes went to 600 (from 450). I stopped untar. Waited 3 minutes. Everything was becoming even worse (700 process, over 100 queries). Issued sync. It executed for 3 seconds and voila! 20 idle connections, 450 processes. So, manual sync is still need. Also it seems like during untar shell was less responsive than before. Also, when the system managed to flush query queue systat -io shows over 1000 tps, but when they got stuck it showed only about 200 tps. Artem From owner-freebsd-fs@FreeBSD.ORG Fri Apr 3 23:15:43 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 838A2D65 for ; Fri, 3 Apr 2015 23:15:43 +0000 (UTC) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 25DC5FA9 for ; Fri, 3 Apr 2015 23:15:42 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.9/8.14.9) with ESMTP id t33NFUBR031021 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Sat, 4 Apr 2015 02:15:30 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.9.2 kib.kiev.ua t33NFUBR031021 Received: (from kostik@localhost) by tom.home (8.14.9/8.14.9/Submit) id t33NFUO5031019; Sat, 4 Apr 2015 02:15:30 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Sat, 4 Apr 2015 02:15:30 +0300 From: Konstantin Belousov To: Artem Kuchin Subject: Re: Little research how rm -rf and tar kill server Message-ID: <20150403231530.GH2379@kib.kiev.ua> References: <1427730597.303984.247097389.165D5AAB@webmail.messagingengine.com> <5519716F.6060007@artem.ru> <1427731061.306961.247099633.0A421E90@webmail.messagingengine.com> <5519740A.1070902@artem.ru> <1427731759.309823.247107417.308CD298@webmail.messagingengine.com> <5519F74C.1040308@artem.ru> <20150331164202.GN2379@kib.kiev.ua> <551C6D9F.8010506@artem.ru> <20150402210241.GD2379@kib.kiev.ua> <551F0D4A.5040007@artem.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=koi8-r Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <551F0D4A.5040007@artem.ru> User-Agent: Mutt/1.5.23 (2014-03-12) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.0 X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on tom.home Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 03 Apr 2015 23:15:43 -0000 On Sat, Apr 04, 2015 at 12:59:38AM +0300, Artem Kuchin wrote: > 03.04.2015 0:02, Konstantin Belousov ÐÉÛÅÔ: > > On Thu, Apr 02, 2015 at 01:13:51AM +0300, Artem Kuchin wrote: > >> 31.03.2015 19:42, Konstantin Belousov ÐÉÛÅÔ: > >>> Syncer and sync(2) perform different kind of syncs. Take the snapshot of > >>> sysctl debug.softdep before and after the situation occur to have some > >>> hints what is going on. > >>> > >>> > >> Okay. Here is the sysctl data > > Try this. It may be not enough, I will provide some update in this case. > > No need to resend the sysctl data. Just test whether explicit sync(2) is > > needed in your situation after the patch. > > > > > > Okay, patches, recompiled and installed new kernel. > > The behaviour changed a bit. > > Now when i start untar mysql quickly rises to 40 queries in the queue in > opening table state. > (before the rise was slower) > BUT after a while (20-30 seconds) all queries are executed. > This cycle repeated 4 times and then situation aggravated quickly. It > happened when untar > reached big subtree with tons of small files. > Queue grew to 70 queries, processes went to 600 (from 450). > I stopped untar. Waited 3 minutes. Everything was becoming even worse > (700 process, over 100 > queries). Issued sync. It executed for 3 seconds and voila! 20 idle > connections, 450 processes. > So, manual sync is still need. > > Also it seems like during untar shell was less responsive than before. > > Also, when the system managed to flush query queue systat -io shows over > 1000 tps, but when > they got stuck it showed only about 200 tps. So there were the i/o ops during the stall period ? I.e., a situation where there is clogged queue and hung processes, but no disk activity, does not occur, even temporary ? In what state the hung processes are blocked ? Look at the wchan name either in top or ps output. Are there processes in "suspfs" state ? Try the following patch. diff --git a/sys/ufs/ffs/ffs_extern.h b/sys/ufs/ffs/ffs_extern.h index c29e5d5..8494223 100644 --- a/sys/ufs/ffs/ffs_extern.h +++ b/sys/ufs/ffs/ffs_extern.h @@ -160,7 +160,7 @@ void softdep_journal_fsync(struct inode *); void softdep_buf_append(struct buf *, struct workhead *); void softdep_inode_append(struct inode *, struct ucred *, struct workhead *); void softdep_freework(struct workhead *); - +int softdep_need_sbupdate(struct ufsmount *ump); /* * Things to request flushing in softdep_request_cleanup() diff --git a/sys/ufs/ffs/ffs_softdep.c b/sys/ufs/ffs/ffs_softdep.c index ab2bd41..da7a34f 100644 --- a/sys/ufs/ffs/ffs_softdep.c +++ b/sys/ufs/ffs/ffs_softdep.c @@ -612,6 +612,13 @@ softdep_freework(wkhd) panic("softdep_freework called"); } +int +softdep_need_sbupdate(ump) + struct ufsmount *ump; +{ + + panic("softdep_need_sbupdate called"); +} #else FEATURE(softupdates, "FFS soft-updates support"); @@ -3560,8 +3567,10 @@ softdep_process_journal(mp, needwk, flags) * unsuspend it if we already have. */ if (flags == 0 && jblocks->jb_suspended) { +#if 0 if (journal_unsuspend(ump)) return; +#endif FREE_LOCK(ump); VFS_SYNC(mp, MNT_NOWAIT); ffs_sbupdate(ump, MNT_WAIT, 0); @@ -9479,6 +9488,18 @@ first_unlinked_inodedep(ump) return (inodedep); } +int +softdep_need_sbupdate(ump) + struct ufsmount *ump; +{ + struct inodedep *inodedep; + + ACQUIRE_LOCK(ump); + inodedep = first_unlinked_inodedep(ump); + FREE_LOCK(ump); + return (inodedep != NULL); +} + /* * Set the sujfree unlinked head pointer prior to writing a superblock. */ diff --git a/sys/ufs/ffs/ffs_vfsops.c b/sys/ufs/ffs/ffs_vfsops.c index 6e2e556..274c0f9 100644 --- a/sys/ufs/ffs/ffs_vfsops.c +++ b/sys/ufs/ffs/ffs_vfsops.c @@ -1419,7 +1419,8 @@ static int ffs_sync_lazy(mp) struct mount *mp; { - struct vnode *mvp, *vp; + struct ufsmount *ump; + struct vnode *devvp, *mvp, *vp; struct inode *ip; struct thread *td; int allerror, error; @@ -1461,9 +1462,21 @@ qupdate: qsync(mp); #endif - if (VFSTOUFS(mp)->um_fs->fs_fmod != 0 && - (error = ffs_sbupdate(VFSTOUFS(mp), MNT_LAZY, 0)) != 0) - allerror = error; + ump = VFSTOUFS(mp); + if (MOUNTEDSUJ(mp)) { + devvp = ump->um_devvp; + vn_lock(devvp, LK_EXCLUSIVE | LK_RETRY); + error = VOP_FSYNC(devvp, MNT_WAIT, td); + VOP_UNLOCK(devvp, 0); + if (error != 0) + allerror = error; + } + if (ump->um_fs->fs_fmod != 0 || (MOUNTEDSUJ(mp) && + softdep_need_sbupdate(ump))) { + error = ffs_sbupdate(ump, MNT_LAZY, 0); + if (error != 0) + allerror = error; + } return (allerror); } From owner-freebsd-fs@FreeBSD.ORG Fri Apr 3 23:23:00 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id A8EE7E3E for ; Fri, 3 Apr 2015 23:23:00 +0000 (UTC) Received: from smtp37.i.mail.ru (smtp37.i.mail.ru [94.100.177.97]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 1E931BD for ; Fri, 3 Apr 2015 23:22:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mail.ru; s=mail2; h=Content-Transfer-Encoding:Content-Type:In-Reply-To:References:Subject:CC:To:MIME-Version:From:Date:Message-ID; bh=WmMHNn51iS4X+MTQBo2d9LP0zZGl15NXZyevzzGxSzg=; b=fRNVK0BNwrmQ4i7rO1x0uB9rZ9rzVyBUPjEkAzDOFw4Kw9hPHZGEHdKKnFiNQJHBsXKiGBEv5Is+3yf5fZyWS4eHSvzIZJKyqL5pREpcLyJVhxU20YQYyO0NbIWRzU7C5B7y8gS3rnXVZwMUlSHYtIip1cIfos2kqwnKo6rp7nM=; Received: from [109.188.125.8] (port=13537 helo=[192.168.0.12]) by smtp37.i.mail.ru with esmtpa (envelope-from ) id 1YeAvG-00047m-5d; Sat, 04 Apr 2015 02:22:51 +0300 Message-ID: <551F20E0.9040103@artem.ru> Date: Sat, 04 Apr 2015 02:23:12 +0300 From: Artem Kuchin User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: Konstantin Belousov Subject: Re: Little research how rm -rf and tar kill server References: <1427730597.303984.247097389.165D5AAB@webmail.messagingengine.com> <5519716F.6060007@artem.ru> <1427731061.306961.247099633.0A421E90@webmail.messagingengine.com> <5519740A.1070902@artem.ru> <1427731759.309823.247107417.308CD298@webmail.messagingengine.com> <5519F74C.1040308@artem.ru> <20150331164202.GN2379@kib.kiev.ua> <551C6D9F.8010506@artem.ru> <20150402210241.GD2379@kib.kiev.ua> <551F0D4A.5040007@artem.ru> <20150403231530.GH2379@kib.kiev.ua> In-Reply-To: <20150403231530.GH2379@kib.kiev.ua> Content-Type: text/plain; charset=koi8-r; format=flowed Content-Transfer-Encoding: 8bit X-Spam: Not detected X-Mras: Ok Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 03 Apr 2015 23:23:00 -0000 04.04.2015 2:15, Konstantin Belousov ÐÉÛÅÔ: > On Sat, Apr 04, 2015 at 12:59:38AM +0300, Artem Kuchin wrote: >> 03.04.2015 0:02, Konstantin Belousov ÐÉÛÅÔ: >>> On Thu, Apr 02, 2015 at 01:13:51AM +0300, Artem Kuchin wrote: >>>> 31.03.2015 19:42, Konstantin Belousov ÐÉÛÅÔ: >>>>> Syncer and sync(2) perform different kind of syncs. Take the snapshot of >>>>> sysctl debug.softdep before and after the situation occur to have some >>>>> hints what is going on. >>>>> >>>>> >>>> Okay. Here is the sysctl data >>> Try this. It may be not enough, I will provide some update in this case. >>> No need to resend the sysctl data. Just test whether explicit sync(2) is >>> needed in your situation after the patch. >>> >>> >> Okay, patches, recompiled and installed new kernel. >> >> The behaviour changed a bit. >> >> Now when i start untar mysql quickly rises to 40 queries in the queue in >> opening table state. >> (before the rise was slower) >> BUT after a while (20-30 seconds) all queries are executed. >> This cycle repeated 4 times and then situation aggravated quickly. It >> happened when untar >> reached big subtree with tons of small files. >> Queue grew to 70 queries, processes went to 600 (from 450). >> I stopped untar. Waited 3 minutes. Everything was becoming even worse >> (700 process, over 100 >> queries). Issued sync. It executed for 3 seconds and voila! 20 idle >> connections, 450 processes. >> So, manual sync is still need. >> >> Also it seems like during untar shell was less responsive than before. >> >> Also, when the system managed to flush query queue systat -io shows over >> 1000 tps, but when >> they got stuck it showed only about 200 tps. > So there were the i/o ops during the stall period ? I.e., a situation > where there is clogged queue and hung processes, but no disk activity, > does not occur, even temporary ? not, such does not happen. untar is always untarring and file bases sites continue to works, just slower, but mysql queries build up, but some are executed > > In what state the hung processes are blocked ? Look at the wchan name > either in top or ps output. Are there processes in "suspfs" state ? no, after the patch all in normal state, only mysql in UFS state and some perl and http (mayb 3 or 5) in ufs state too > > Try the following patch. trying now. From owner-freebsd-fs@FreeBSD.ORG Fri Apr 3 23:29:11 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 1C97511F for ; Fri, 3 Apr 2015 23:29:11 +0000 (UTC) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id B1642D3 for ; Fri, 3 Apr 2015 23:29:10 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.9/8.14.9) with ESMTP id t33NT414034061 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Sat, 4 Apr 2015 02:29:05 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.9.2 kib.kiev.ua t33NT414034061 Received: (from kostik@localhost) by tom.home (8.14.9/8.14.9/Submit) id t33NT4OQ034060; Sat, 4 Apr 2015 02:29:04 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Sat, 4 Apr 2015 02:29:04 +0300 From: Konstantin Belousov To: Artem Kuchin Subject: Re: Little research how rm -rf and tar kill server Message-ID: <20150403232904.GI2379@kib.kiev.ua> References: <1427731061.306961.247099633.0A421E90@webmail.messagingengine.com> <5519740A.1070902@artem.ru> <1427731759.309823.247107417.308CD298@webmail.messagingengine.com> <5519F74C.1040308@artem.ru> <20150331164202.GN2379@kib.kiev.ua> <551C6D9F.8010506@artem.ru> <20150402210241.GD2379@kib.kiev.ua> <551F0D4A.5040007@artem.ru> <20150403231530.GH2379@kib.kiev.ua> <551F20E0.9040103@artem.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=koi8-r Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <551F20E0.9040103@artem.ru> User-Agent: Mutt/1.5.23 (2014-03-12) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.0 X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on tom.home Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 03 Apr 2015 23:29:11 -0000 On Sat, Apr 04, 2015 at 02:23:12AM +0300, Artem Kuchin wrote: > 04.04.2015 2:15, Konstantin Belousov ÐÉÛÅÔ: > > On Sat, Apr 04, 2015 at 12:59:38AM +0300, Artem Kuchin wrote: > >> 03.04.2015 0:02, Konstantin Belousov ÐÉÛÅÔ: > >>> On Thu, Apr 02, 2015 at 01:13:51AM +0300, Artem Kuchin wrote: > >>>> 31.03.2015 19:42, Konstantin Belousov ÐÉÛÅÔ: > >>>>> Syncer and sync(2) perform different kind of syncs. Take the snapshot of > >>>>> sysctl debug.softdep before and after the situation occur to have some > >>>>> hints what is going on. > >>>>> > >>>>> > >>>> Okay. Here is the sysctl data > >>> Try this. It may be not enough, I will provide some update in this case. > >>> No need to resend the sysctl data. Just test whether explicit sync(2) is > >>> needed in your situation after the patch. > >>> > >>> > >> Okay, patches, recompiled and installed new kernel. > >> > >> The behaviour changed a bit. > >> > >> Now when i start untar mysql quickly rises to 40 queries in the queue in > >> opening table state. > >> (before the rise was slower) > >> BUT after a while (20-30 seconds) all queries are executed. > >> This cycle repeated 4 times and then situation aggravated quickly. It > >> happened when untar > >> reached big subtree with tons of small files. > >> Queue grew to 70 queries, processes went to 600 (from 450). > >> I stopped untar. Waited 3 minutes. Everything was becoming even worse > >> (700 process, over 100 > >> queries). Issued sync. It executed for 3 seconds and voila! 20 idle > >> connections, 450 processes. > >> So, manual sync is still need. > >> > >> Also it seems like during untar shell was less responsive than before. > >> > >> Also, when the system managed to flush query queue systat -io shows over > >> 1000 tps, but when > >> they got stuck it showed only about 200 tps. > > So there were the i/o ops during the stall period ? I.e., a situation > > where there is clogged queue and hung processes, but no disk activity, > > does not occur, even temporary ? > not, such does not happen. untar is always untarring and file bases > sites continue > to works, just slower, but mysql queries build up, but some are executed > > > > In what state the hung processes are blocked ? Look at the wchan name > > either in top or ps output. Are there processes in "suspfs" state ? > > no, after the patch all in normal state, only mysql in UFS state and > some perl and http > (mayb 3 or 5) in ufs state too What about unpatched kernel ? Are "suspfs" blocked processes reported by either tool ? > > > > Try the following patch. > > trying now. From owner-freebsd-fs@FreeBSD.ORG Sat Apr 4 02:19:32 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id D7A0BD7F for ; Sat, 4 Apr 2015 02:19:32 +0000 (UTC) Received: from smtp45.i.mail.ru (smtp45.i.mail.ru [94.100.177.105]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 51ABD207 for ; Sat, 4 Apr 2015 02:19:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mail.ru; s=mail2; h=Content-Transfer-Encoding:Content-Type:In-Reply-To:References:Subject:CC:To:MIME-Version:From:Date:Message-ID; bh=jXvIeyemngmg6FDPm9c6qCzf9iaYnblxAPN2F6QhqqI=; b=SPtyD8hlZVjQmduqcJFh809akDwUGaoFffuQvZBT7jt8hAx7DEdoFEYInsf4rWQfgkC3sXv+9fRtJSsgBwyBqQxXsAtdOzDTiQ5eDxu0dryyUILFAPSNmugoHgJeWHvmALUuSqviRtlrcudVkOrVk6++Qjt9AUTrbJNj3cBomOM=; Received: from [109.188.125.8] (port=14838 helo=[192.168.0.12]) by smtp45.i.mail.ru with esmtpa (envelope-from ) id 1YeDg9-0000oA-Df; Sat, 04 Apr 2015 05:19:21 +0300 Message-ID: <551F4A5D.1080008@artem.ru> Date: Sat, 04 Apr 2015 05:20:13 +0300 From: Artem Kuchin User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: Konstantin Belousov Subject: Re: Little research how rm -rf and tar kill server References: <1427731061.306961.247099633.0A421E90@webmail.messagingengine.com> <5519740A.1070902@artem.ru> <1427731759.309823.247107417.308CD298@webmail.messagingengine.com> <5519F74C.1040308@artem.ru> <20150331164202.GN2379@kib.kiev.ua> <551C6D9F.8010506@artem.ru> <20150402210241.GD2379@kib.kiev.ua> <551F0D4A.5040007@artem.ru> <20150403231530.GH2379@kib.kiev.ua> <551F20E0.9040103@artem.ru> <20150403232904.GI2379@kib.kiev.ua> In-Reply-To: <20150403232904.GI2379@kib.kiev.ua> Content-Type: text/plain; charset=koi8-r; format=flowed Content-Transfer-Encoding: 8bit X-Spam: Not detected X-Mras: Ok Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 04 Apr 2015 02:19:33 -0000 04.04.2015 2:29, Konstantin Belousov ÐÉÛÅÔ: > On Sat, Apr 04, 2015 at 02:23:12AM +0300, Artem Kuchin wrote: >> 04.04.2015 2:15, Konstantin Belousov ÐÉÛÅÔ: >>> On Sat, Apr 04, 2015 at 12:59:38AM +0300, Artem Kuchin wrote: >>>> 03.04.2015 0:02, Konstantin Belousov ÐÉÛÅÔ: >>>>> On Thu, Apr 02, 2015 at 01:13:51AM +0300, Artem Kuchin wrote: >>>>>> 31.03.2015 19:42, Konstantin Belousov ÐÉÛÅÔ: >>>>>>> Syncer and sync(2) perform different kind of syncs. Take the snapshot of >>>>>>> sysctl debug.softdep before and after the situation occur to have some >>>>>>> hints what is going on. >>>>>>> >>>>>>> >>>>>> Okay. Here is the sysctl data >>>>> Try this. It may be not enough, I will provide some update in this case. >>>>> No need to resend the sysctl data. Just test whether explicit sync(2) is >>>>> needed in your situation after the patch. >>>>> >>>>> >>>> Okay, patches, recompiled and installed new kernel. >>>> >>>> The behaviour changed a bit. >>>> >>>> Now when i start untar mysql quickly rises to 40 queries in the queue in >>>> opening table state. >>>> (before the rise was slower) >>>> BUT after a while (20-30 seconds) all queries are executed. >>>> This cycle repeated 4 times and then situation aggravated quickly. It >>>> happened when untar >>>> reached big subtree with tons of small files. >>>> Queue grew to 70 queries, processes went to 600 (from 450). >>>> I stopped untar. Waited 3 minutes. Everything was becoming even worse >>>> (700 process, over 100 >>>> queries). Issued sync. It executed for 3 seconds and voila! 20 idle >>>> connections, 450 processes. >>>> So, manual sync is still need. >>>> >>>> Also it seems like during untar shell was less responsive than before. >>>> >>>> Also, when the system managed to flush query queue systat -io shows over >>>> 1000 tps, but when >>>> they got stuck it showed only about 200 tps. >>> So there were the i/o ops during the stall period ? I.e., a situation >>> where there is clogged queue and hung processes, but no disk activity, >>> does not occur, even temporary ? >> not, such does not happen. untar is always untarring and file bases >> sites continue >> to works, just slower, but mysql queries build up, but some are executed >>> In what state the hung processes are blocked ? Look at the wchan name >>> either in top or ps output. Are there processes in "suspfs" state ? >> no, after the patch all in normal state, only mysql in UFS state and >> some perl and http >> (mayb 3 or 5) in ufs state too > What about unpatched kernel ? Are "suspfs" blocked processes reported > by either tool ? no, top says "ufs" state After i applied patch i get many Apr 4 02:44:39 omni kernel: fsync: giving up on dirty Apr 4 02:44:39 omni kernel: 0xfffff80013181b10: tag devfs, type VCHR Apr 4 02:44:39 omni kernel: usecount 1, writecount 0, refcount 571 mountedhere 0xfffff80013030a00 Apr 4 02:44:39 omni kernel: flags (VI_ACTIVE) Apr 4 02:44:39 omni kernel: v_object 0xfffff80013193200 ref 0 pages 4539 cleanbuf 26 dirtybuf 543 Apr 4 02:44:39 omni kernel: lock type devfs: EXCL by thread 0xfffff80010fbd000 (pid 23, syncer, tid 100087) Apr 4 02:44:39 omni kernel: dev mirror/root Is filesystem still okay after this? "giving up on dirty" does not sound good Artem From owner-freebsd-fs@FreeBSD.ORG Sat Apr 4 02:51:52 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id CA852214 for ; Sat, 4 Apr 2015 02:51:52 +0000 (UTC) Received: from smtp12.mail.ru (smtp12.mail.ru [94.100.181.93]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 773BE776 for ; Sat, 4 Apr 2015 02:51:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mail.ru; s=mail2; h=Content-Transfer-Encoding:Content-Type:In-Reply-To:References:Subject:CC:To:MIME-Version:From:Date:Message-ID; bh=9+MX4S4j22KBAngc6MrWgF1aNToX67bDzTd4SYVygRA=; b=ANRIPw54hSPKiEx4NeNlQ2HvQDqL24a4fqvrP+zo1oOLJ/PHGRJwG2pc5TOE6wI4K7mnYqSKCwNvXezEVK4C2XXipNgDl1+S8zXUY4zZLi5uaLsu8N/CeIqJqD3lychy0PIwC2Pz5Y9NkRaAqDw3691XHmv51r1rNhHKRLx1egw=; Received: from [109.188.125.8] (port=15167 helo=[192.168.0.12]) by smtp12.mail.ru with esmtpa (envelope-from ) id 1YeEBN-0006E2-9w; Sat, 04 Apr 2015 05:51:42 +0300 Message-ID: <551F51ED.2060905@artem.ru> Date: Sat, 04 Apr 2015 05:52:29 +0300 From: Artem Kuchin User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: Konstantin Belousov Subject: Re: Little research how rm -rf and tar kill server References: <1427731061.306961.247099633.0A421E90@webmail.messagingengine.com> <5519740A.1070902@artem.ru> <1427731759.309823.247107417.308CD298@webmail.messagingengine.com> <5519F74C.1040308@artem.ru> <20150331164202.GN2379@kib.kiev.ua> <551C6D9F.8010506@artem.ru> <20150402210241.GD2379@kib.kiev.ua> <551F0D4A.5040007@artem.ru> <20150403231530.GH2379@kib.kiev.ua> <551F20E0.9040103@artem.ru> <20150403232904.GI2379@kib.kiev.ua> In-Reply-To: <20150403232904.GI2379@kib.kiev.ua> Content-Type: text/plain; charset=koi8-r; format=flowed Content-Transfer-Encoding: 7bit X-Spam: Not detected X-Mras: Ok Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 04 Apr 2015 02:51:52 -0000 about "giving up on dirty" i hit the bug with unclean shutdown and my mirror is resyncing maybe it is the reason for this message i DO hope it is harmless Artem From owner-freebsd-fs@FreeBSD.ORG Sat Apr 4 17:42:35 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id DCBBB9CB for ; Sat, 4 Apr 2015 17:42:34 +0000 (UTC) Received: from mail-ie0-x22d.google.com (mail-ie0-x22d.google.com [IPv6:2607:f8b0:4001:c03::22d]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id A20662A1 for ; Sat, 4 Apr 2015 17:42:34 +0000 (UTC) Received: by iebrs15 with SMTP id rs15so23387795ieb.3 for ; Sat, 04 Apr 2015 10:42:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:reply-to:date:message-id:subject:from:to:content-type; bh=aBT6IEEOpDSg5KcITbeW8qPmApTHDIifTDhxdssc+3A=; b=jmpra0KneVQeuytu+dc/5TfZQtuI7Y1Va65Qt7mbypA3aZ/bzcbbjYBA55eXj7uovx LOwTvFEiH7u5gFBSGuq05WahwmdC/NWiXyuWwukGAQFuonddYW7xpyuKGtOHRIB+gNV2 dehNkytY5y1yTgpjOyBitZvHbiJKFk0spSSU2HQv2g3bxF9BiyFpvuG9Z1URCvBPpUOV IgksHnAD4wzkX8ALRD8L0pVvtoRgxEhl4cE/XgKQC8LaqRbk504FLSZp8o4Wt5APeuZm 0boXJ1SeXkV7klDApYNWPsSiigKF5lXGGwW++/2GCJzg2AdDl4syCuHocNh9DLSPv+x6 dt5g== MIME-Version: 1.0 X-Received: by 10.42.37.5 with SMTP id w5mr90282icd.39.1428169354060; Sat, 04 Apr 2015 10:42:34 -0700 (PDT) Received: by 10.64.59.138 with HTTP; Sat, 4 Apr 2015 10:42:34 -0700 (PDT) Reply-To: araujo@FreeBSD.org Date: Sun, 5 Apr 2015 01:42:34 +0800 Message-ID: Subject: ndmp server. From: Marcelo Araujo To: "freebsd-fs@freebsd.org" Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 04 Apr 2015 17:42:35 -0000 Hi guys, I saw at wiki[1], there is a project idea to have an NDMP data server. I have something workable based on Illumos implementation[2], it doesn't implement all features that Illumos version has. Also the protocol version is 4 and not the new one 5. I can do a backup/restore between two FreeBSD machines via NDMP with success. Also is necessary port or create another tools such like this one on Illumos[3], I have a very simple python script to setup it now. It needs more polishment on the code and a man page, as well as maybe the ndmpadm or something based on it. I'm wondering, if someone still has interesting on it, because with zfs send/recv I don't see too much utility for NDMP. Even a rsync can do the job. If you guys think, that have NDMP, at least the server on FreeBSD would be a good idea, let me know, and I can put some effort on it and I can provide a patch soon for tests. [1] https://wiki.freebsd.org/IdeasPage#NDMP_data_server [2] https://github.com/joyent/illumos-joyent/tree/master/usr/src/cmd/ndmpd [3] https://github.com/joyent/illumos-joyent/tree/master/usr/src/cmd/ndmpadm Best Regards, -- -- Marcelo Araujo (__)araujo@FreeBSD.org \\\'',)http://www.FreeBSD.org \/ \ ^ Power To Server. .\. /_) From owner-freebsd-fs@FreeBSD.ORG Sat Apr 4 22:10:32 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 0FC64CD9 for ; Sat, 4 Apr 2015 22:10:32 +0000 (UTC) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 84EA31AD for ; Sat, 4 Apr 2015 22:10:31 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.9/8.14.9) with ESMTP id t34MAKMl052845 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Sun, 5 Apr 2015 01:10:20 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.9.2 kib.kiev.ua t34MAKMl052845 Received: (from kostik@localhost) by tom.home (8.14.9/8.14.9/Submit) id t34MAKYQ052843; Sun, 5 Apr 2015 01:10:20 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Sun, 5 Apr 2015 01:10:20 +0300 From: Konstantin Belousov To: Artem Kuchin Subject: Re: Little research how rm -rf and tar kill server Message-ID: <20150404221020.GL2379@kib.kiev.ua> References: <1427731759.309823.247107417.308CD298@webmail.messagingengine.com> <5519F74C.1040308@artem.ru> <20150331164202.GN2379@kib.kiev.ua> <551C6D9F.8010506@artem.ru> <20150402210241.GD2379@kib.kiev.ua> <551F0D4A.5040007@artem.ru> <20150403231530.GH2379@kib.kiev.ua> <551F20E0.9040103@artem.ru> <20150403232904.GI2379@kib.kiev.ua> <551F51ED.2060905@artem.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <551F51ED.2060905@artem.ru> User-Agent: Mutt/1.5.23 (2014-03-12) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.0 X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on tom.home Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 04 Apr 2015 22:10:32 -0000 On Sat, Apr 04, 2015 at 05:52:29AM +0300, Artem Kuchin wrote: > about > "giving up on dirty" > > i hit the bug with unclean shutdown and my mirror is resyncing > maybe it is the reason for this message > i DO hope it is harmless I noted earlier that you must use the HEAD with the patched bug. Still, both of your messages did not indicate whether the latest patch changed the behaviour of the system.