From: Michael Osipov
To: freebsd-geom@freebsd.org
Subject: Re: Abysmally slow write to geom class volume over network
Date: Sat, 15 Oct 2016 14:01:59 +0200
Message-ID: <90ca3d9c-6016-8a63-64c5-ce3f829756dd@gmx.de>
In-Reply-To: <20161015132249.607b374c@fabiankeil.de>

On 2016-10-15 at 13:22, Fabian Keil wrote:
> Michael Osipov <1983-01-06@gmx.net> wrote:
>
>> On 2016-10-14 at 11:43, Fabian Keil wrote:
>>> Michael Osipov <1983-01-06@gmx.net> wrote:
>>>
>>>> On 2016-10-13 at 03:41, John-Mark Gurney wrote:
>>>>> Michael Osipov wrote this message on Wed, Oct 12, 2016 at 20:54
>>>>> +0200:
>>>>>> As if there is a bottleneck between socket read and geom write to
>>>>>> FS.
>>>>>>
>>>>>> Is that better?
>>>>>
>>>>> Have you run gstat on the system to see if there is an IO
>>>>> bottleneck? Since you are using graid3, you want to look to see if
>>>>> its %busy is ~100, while the underlying components are not.
>>>>
>>>> This is hardly possible because as soon as I start an SFTP
>>>> transfer, all of my SSH sessions freeze or receive a connection
>>>> timeout/abort.
>>>> Doing an SFTP transfer from FreeBSD to FreeBSD shows both the
>>>> physical disks and the RAID3 volume at zero to one percent busy. In
>>>> other words, the drives are bored.
>>>
>>> Try checking the FAIL and SLEEP columns in the "vmstat -z" output.
>>
>> I assume that you expect a rise in those numbers. I have made several
>> runs. I rebooted the machine and then started the SFTP transfer. After
>> seconds my SSH sessions locked up. I aborted the transfer manually
>> after 10 minutes, which should have saturated the entire connection.
>> After that, I reran vmstat -z: no or minimal rise in FAIL and SLEEP.
>
> IIRC the SLEEP column only shows currently sleeping requests,
> therefore you may want to run "vmstat -z" multiple times while
> the transfer is ongoing. Having said that, a custom DTrace script
> would probably be a better tool to diagnose the issue anyway.

Ah, OK. I need to switch physically to that machine because SSH is not
possible. I don't mind reading the DTrace tutorial, but what exactly
should I trace? sshd?

>> Interesting to say that this happens if it is a UFS volume on a
>> gconcat/graid3/gvinum/gstripe configuration. Regular gpart with GPT has
>> no performance penalty. Additionally, it is not limited to SSH but
>> affects virtually everything with sockets: nc, ggate, smb.
>>
>>> This could be related to:
>>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209680#c2
>>
>> It pretty much sounds like it, though I do not use ipfw, pf or any NAT
>> stuff. I will try your first patch and let you know.
>>
>> Do you want me to add my use case to the issue?
>
> If the patch helps, that could be useful once a committer
> finds the time to look at the PR.

I will switch from 11.0-RELEASE to 11.0-STABLE via Subversion first,
rebuild world and kernel and then apply your patches.

Hang on.

Michael
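
For the suggestion above to run "vmstat -z" repeatedly while the transfer is ongoing, here is a minimal sketch of how the sampling could be automated from the local console. It is not part of the thread. The function names (`parse_vmstat_z`, `changed`, `watch`) are hypothetical, and the parser assumes the usual layout of a header row starting with ITEM followed by "zone: v1, v2, ..." rows. It only reports zones whose FAIL or SLEEP value moved between two samples.

```python
import subprocess
import time


def parse_vmstat_z(text):
    """Return {zone: {"FAIL": int, "SLEEP": int}} parsed from `vmstat -z` text."""
    zones = {}
    columns = None
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] == "ITEM":
            # Header row; everything after ITEM names a numeric column.
            columns = tokens[1:]
            continue
        if columns is None:
            continue
        # Zone names may contain spaces, so split on the last colon.
        name, sep, rest = line.rpartition(":")
        if not sep:
            continue
        row = dict(zip(columns, (f.strip() for f in rest.split(","))))
        try:
            zones[name.strip()] = {"FAIL": int(row["FAIL"]),
                                   "SLEEP": int(row["SLEEP"])}
        except (KeyError, ValueError):
            continue  # Row does not match the assumed layout; skip it.
    return zones


def changed(before, after):
    """List (zone, column, old, new) for every FAIL/SLEEP value that differs."""
    result = []
    for zone, now in after.items():
        prev = before.get(zone, {"FAIL": 0, "SLEEP": 0})
        for key in ("FAIL", "SLEEP"):
            if now[key] != prev[key]:
                result.append((zone, key, prev[key], now[key]))
    return result


def watch(interval, rounds):
    """Sample `vmstat -z` every `interval` seconds and print what changed."""
    previous = None
    for _ in range(rounds):
        out = subprocess.run(["vmstat", "-z"], capture_output=True,
                             text=True, check=True).stdout
        current = parse_vmstat_z(out)
        if previous is not None:
            for zone, key, old, new in changed(previous, current):
                print(f"{zone}: {key} {old} -> {new}")
        previous = current
        time.sleep(interval)
```

Calling `watch(1.0, 600)` right before starting the SFTP transfer would sample once per second for ten minutes. Since SLEEP reportedly shows only currently sleeping requests, a short interval is more likely to catch a transient value than a single run after the transfer.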