Date: Sat, 4 Apr 2015 02:29:04 +0300
From: Konstantin Belousov <kostikbel@gmail.com>
To: Artem Kuchin <artem@artem.ru>
Cc: freebsd-fs@freebsd.org
Subject: Re: Little research how rm -rf and tar kill server
Message-ID: <20150403232904.GI2379@kib.kiev.ua>
In-Reply-To: <551F20E0.9040103@artem.ru>
References: <1427731061.306961.247099633.0A421E90@webmail.messagingengine.com>
 <5519740A.1070902@artem.ru>
 <1427731759.309823.247107417.308CD298@webmail.messagingengine.com>
 <5519F74C.1040308@artem.ru>
 <20150331164202.GN2379@kib.kiev.ua>
 <551C6D9F.8010506@artem.ru>
 <20150402210241.GD2379@kib.kiev.ua>
 <551F0D4A.5040007@artem.ru>
 <20150403231530.GH2379@kib.kiev.ua>
 <551F20E0.9040103@artem.ru>

On Sat, Apr 04, 2015 at 02:23:12AM +0300, Artem Kuchin wrote:
> 04.04.2015 2:15, Konstantin Belousov wrote:
> > On Sat, Apr 04, 2015 at 12:59:38AM +0300, Artem Kuchin wrote:
> >> 03.04.2015 0:02, Konstantin Belousov wrote:
> >>> On Thu, Apr 02, 2015 at 01:13:51AM +0300, Artem Kuchin wrote:
> >>>> 31.03.2015 19:42, Konstantin Belousov wrote:
> >>>>> Syncer and sync(2) perform different kinds of syncs. Take a snapshot of
> >>>>> sysctl debug.softdep before and after the situation occurs to have some
> >>>>> hints about what is going on.
> >>>>>
> >>>>>
> >>>> Okay. Here is the sysctl data
> >>> Try this. It may not be enough; I will provide an update in that case.
> >>> No need to resend the sysctl data. Just test whether an explicit sync(2)
> >>> is needed in your situation after the patch.
> >>>
> >>>
> >> Okay, patched, recompiled and installed the new kernel.
> >>
> >> The behaviour changed a bit.
> >>
> >> Now when I start untar, mysql quickly rises to 40 queries in the queue
> >> in the opening table state.
> >> (before, the rise was slower)
> >> BUT after a while (20-30 seconds) all queries are executed.
> >> This cycle repeated 4 times and then the situation aggravated quickly.
> >> It happened when untar reached a big subtree with tons of small files.
> >> The queue grew to 70 queries, processes went to 600 (from 450).
> >> I stopped untar. Waited 3 minutes. Everything was becoming even worse
> >> (700 processes, over 100 queries). Issued sync. It executed for
> >> 3 seconds and voila! 20 idle connections, 450 processes.
> >> So, manual sync is still needed.
> >>
> >> Also, it seems that during untar the shell was less responsive than
> >> before.
> >>
> >> Also, when the system managed to flush the query queue, systat -io
> >> showed over 1000 tps, but when they got stuck it showed only about
> >> 200 tps.
> > So there were i/o ops during the stall period? I.e., a situation
> > where there is a clogged queue and hung processes, but no disk activity,
> > does not occur, even temporarily?
>
> No, that does not happen. untar is always untarring and file-based
> sites continue to work, just slower, but mysql queries build up,
> though some are executed.
> >
> > In what state are the hung processes blocked? Look at the wchan name
> > either in top or ps output. Are there processes in "suspfs" state?
>
> No, after the patch all are in a normal state, only mysql in UFS state
> and some perl and http (maybe 3 or 5) in ufs state too.

What about the unpatched kernel? Are "suspfs" blocked processes reported
by either tool?

> >
> > Try the following patch.
>
> trying now.
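
For anyone repeating the two diagnostics asked for in this thread (a before/after snapshot of sysctl debug.softdep, and the wchan names from ps), here is a minimal Python sketch of how the captured text could be compared. It is only an illustration, not something posted in the thread: the function names are made up, the counter names in the comments are hypothetical placeholders rather than real debug.softdep entries, and it assumes the ps text was captured with something like `ps -axo pid,wchan,comm`.

```python
from collections import Counter


def parse_sysctl(text):
    """Parse sysctl output lines of the form 'name: value' into {name: int}.

    Lines whose value is not a plain integer are skipped.
    """
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue
        try:
            counters[name.strip()] = int(value.strip())
        except ValueError:
            continue  # non-numeric value, ignore
    return counters


def diff_counters(before, after):
    """Return {name: after - before} for counters present in both snapshots
    whose value changed between them."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] != before[name]
    }


def count_wchans(ps_text):
    """Tally the second column (wchan) of ps output, skipping the header row.

    Assumes columns in the order pid, wchan, command.
    """
    tally = Counter()
    for line in ps_text.splitlines()[1:]:
        fields = line.split(None, 2)
        if len(fields) >= 2:
            tally[fields[1]] += 1
    return tally
```

Feeding it the saved output of sysctl debug.softdep from before and after a stall gives the counters that moved, and count_wchans(ps_text)["suspfs"] says how many processes were sitting in "suspfs" at the time of the capture.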