Date:      Sat, 15 Oct 2016 14:01:59 +0200
From:      Michael Osipov <mosipov@gmx.de>
To:        freebsd-geom@freebsd.org
Subject:   Re: Abysmally slow write to geom class volume over network
Message-ID:  <90ca3d9c-6016-8a63-64c5-ce3f829756dd@gmx.de>
In-Reply-To: <20161015132249.607b374c@fabiankeil.de>
References:  <33da0f73-48f1-4727-fe76-41343dc4955b@gmx.net> <18314f27-849b-31df-d88d-af64e89c133f@gmx.net> <346c7da7-02b2-ba34-1463-f3f0a5a3cd9a@rlwinm.de> <15e9cb94-7ad8-e547-b06a-699ce2250624@gmx.net> <20161013014143.GA1669@funkthat.com> <636763f0-a732-18ba-262b-c3fc01f4342c@gmx.net> <20161014114330.396fe534@fabiankeil.de> <5a12f70d-3488-e799-b875-9b358ef7aff9@gmx.net> <20161015132249.607b374c@fabiankeil.de>

On 2016-10-15 at 13:22, Fabian Keil wrote:
> Michael Osipov <1983-01-06@gmx.net> wrote:
>
>> On 2016-10-14 at 11:43, Fabian Keil wrote:
>>> Michael Osipov <1983-01-06@gmx.net> wrote:
>>>
>>>> On 2016-10-13 at 03:41, John-Mark Gurney wrote:
>>>>> Michael Osipov wrote this message on Wed, Oct 12, 2016 at 20:54
>>>>> +0200:
>>>>>> As if there is a bottleneck between socket read and geom write to
>>>>>> FS.
>>>>>>
>>>>>> Is that better?
>>>>>
>>>>> Have you run gstat on the system to see if there is an I/O
>>>>> bottleneck?  Since you are using graid3, you want to look to see
>>>>> if its %busy is ~100, while the underlying components are not.
>>>>
>>>> This is hardly possible because as soon as I start an SFTP
>>>> transfer, all of my SSH sessions freeze or receive a connection
>>>> timeout/abort.  Doing an SFTP from FreeBSD to FreeBSD gives me, on
>>>> both physical disks and the RAID3 volume, a busy of zero to one
>>>> percent. In other words, the drives are bored.
>>>
>>> Try checking the FAIL and SLEEP columns in the "vmstat -z" output.
>>
>> I assume that you expect a rise on those numbers. I have made several
>> runs. Rebooted the machine and then started SFTP transfer. After seconds
>> my SSH sessions locked up. The transfer was aborted manually after 10
>> minutes which should have saturated the entire connection. After that, I
>> reran vmstat -z, no or minimal rise in FAIL and SLEEP.
>
> IIRC the SLEEP column only shows currently sleeping requests,
> therefore you may want to run "vmstat -z" multiple times while
> the transfer is ongoing. Having said that, a custom DTrace script
> would probably be a better tool to diagnose the issue anyway.

Ah, OK. I need to switch physically to that machine because SSH is not
possible. I don't mind reading the DTrace tutorial, but what exactly
should I trace? sshd?
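
For the repeated "vmstat -z" runs during the transfer, something like the
following could be left running on the console. This is only a rough sketch:
it assumes the usual "vmstat -z" layout (a header line starting with ITEM,
then "zone: v, v, ..." rows with FAIL and SLEEP columns), and the helper
names parse_vmstat_z, changed and watch are made up for illustration.

```python
import subprocess
import time


def parse_vmstat_z(text):
    """Return {zone: (fail, sleep)} parsed from `vmstat -z` output text.

    Assumes a header line beginning with ITEM that names the columns,
    followed by rows of the form "zone name: v1, v2, ...".
    """
    zones = {}
    columns = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if columns is None:
            # Skip everything until the header line has been seen.
            if line.split()[:1] == ["ITEM"]:
                columns = line.split()[1:]
            continue
        name, sep, rest = line.rpartition(":")
        if not sep:
            continue
        values = [v.strip() for v in rest.split(",")]
        if len(values) != len(columns):
            continue
        row = dict(zip(columns, values))
        try:
            zones[name.strip()] = (int(row["FAIL"]), int(row["SLEEP"]))
        except (KeyError, ValueError):
            continue
    return zones


def changed(before, after):
    """Zones whose FAIL rose between samples or whose SLEEP is non-zero."""
    out = {}
    for zone, (fail, sleep) in after.items():
        prev_fail = before.get(zone, (0, 0))[0]
        if fail > prev_fail or sleep > 0:
            out[zone] = (fail - prev_fail, sleep)
    return out


def watch(interval=1.0, samples=60):
    """Sample `vmstat -z` repeatedly and print only the zones that moved."""
    cmd = ["vmstat", "-z"]
    prev = parse_vmstat_z(subprocess.check_output(cmd, text=True))
    for _ in range(samples):
        time.sleep(interval)
        cur = parse_vmstat_z(subprocess.check_output(cmd, text=True))
        for zone, (dfail, sleep) in sorted(changed(prev, cur).items()):
            print(f"{zone}: FAIL +{dfail}, SLEEP {sleep}")
        prev = cur
```

Calling watch() right before starting the SFTP transfer would print a line
only when a zone's FAIL counter rises or its SLEEP value is non-zero at the
moment of sampling, which should make short-lived spikes easier to catch
than a single run after the transfer.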

>> Interestingly, this happens when it is a UFS volume on a
>> gconcat/graid3/gvinum/gstripe configuration. Regular gpart with GPT has
>> no performance penalty. Additionally, it is not limited to SSH but
>> affects virtually everything that uses sockets: nc, ggate, smb.
>>
>>> This could be related to:
>>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209680#c2
>>
>> It pretty much sounds like it, though I do not use ipfw, pf or any NAT
>> stuff. I will try your first patch and let you know.
>>
>> Do you want me to add my usecase to the issue?
>
> If the patch helps, that could be useful once a committer
> finds the time to look at the PR.

I will switch from 11.0-RELEASE to 11.0-STABLE via Subversion first,
rebuild world and kernel, and then apply your patches. Hang on.

Michael



