Subject: Re: NFS reads vs. writes
To: Bob Friesenhahn <bfriesen@simple.dallas.tx.us>,
 Tom Curry <thomasrcurry@gmail.com>
References: <568880D3.3010402@aldan.algebra.com>
 <alpine.GSO.2.01.1601031006020.28454@freddy.simplesystems.org>
From: "Mikhail T." <mi+thun@aldan.algebra.com>
Cc: freebsd-fs@freebsd.org
Message-id: <56895E2E.8060405@aldan.algebra.com>
Date: Sun, 03 Jan 2016 12:45:18 -0500
In-reply-to: <alpine.GSO.2.01.1601031006020.28454@freddy.simplesystems.org>

On 03.01.2016 10:32, Tom Curry wrote:
> What does disk activity (gstat or iostat) look like when this is going on?
I use systat for such observations. Here is a typical snapshot of
machine a, when it reads its own /a and writes over NFS to b:/b:

     3,6%Sys   0,0%Intr 15,6%User  0,0%Nice 80,8%Idle         ozfod     2 ata0 14
    |    |    |    |    |    |    |    |    |    |           %ozfod    69 uhci0 ehci
    ==>>>>>>>>                                                daefr   205 uhci1 ahci
                                            13 dtbuf          prcfr  2577 hpet0 20
    Namei     Name-cache   Dir-cache    248362 desvn     2374 totfr  1277 em0 256
       Calls    hits   %    hits   %     95980 numvn          react       hdac0 257
        1575    1508  96                 62091 frevn          pdwak    94 hdac1 258
                                                          288 pdpgs       vgapci0
    Disks   md0  ada0  ada1  ada2  ada3   da0   da1           intrn
    KB/t   0,00   119 26,42 19,10 21,72  0,00  0,00   6720528 wire
    tps       0    47    72    64    69     0     0    721552 act
    MB/s   0,00  5,42  1,87  1,19  1,45  0,00  0,00   2331396 inact
    %busy     0     4    19    11    11     0     0        88 cache
                                                       403384 free
                                                      1056992 buf

ada0 is the SSD hosting both the read-cache and ZIL devices; ada{1,2,3}
are the three disks comprising a RAID5 zpool.

Meanwhile on the b-side the following is going on:

     4,2%Sys   0,0%Intr  0,0%User  0,0%Nice 95,8%Idle         ozfod       hdac0 18
    |    |    |    |    |    |    |    |    |    |           %ozfod       fwohci0 19
    ==                                                        daefr   429 hpet0 uhci
                                            22 dtbuf          prcfr    50 uhci0 uhci
    Namei     Name-cache   Dir-cache    282383 desvn          totfr   598 atapci1 23
       Calls    hits   %    hits   %    107825 numvn          react   141 mpt0 257
          18      17  94                 70416 frevn          pdwak  1025 bce1 258
                                                           50 pdpgs
    Disks   md0  ada0   da0   da1   da2   da3   da4           intrn
    KB/t   0,00  6,50  0,00 80,21 16,00 79,59 68,42   4794972 wire
    tps       0   594     0    53     2    55    39    130060 act
    MB/s   0,00  3,77  0,00  4,18  0,03  4,29  2,63   7153984 inact
    %busy     0    95     0    10     1    14     8    131100 cache

Here, too, ada0 hosts the log device and appears to be the bottleneck.
There is no read cache on b, and the zpool consists of da1, da3, and da4
simply striped together (no redundancy).
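
As a rough sanity check of the b-side "Disks" rows, the sketch below
recomputes MB/s from KB/t and tps and estimates the busy time per
transfer. It assumes that MB/s is simply KB/t * tps / 1024 and that
transfers on a device do not overlap; the helper names are made up for
illustration and the numbers are copied from the snapshot above.

```python
# Back-of-the-envelope check of the b-side systat "Disks" rows.
# Assumptions (not from systat documentation): MB/s = KB/t * tps / 1024,
# and transfers on one device do not overlap in time.

def mb_per_s(kb_per_t, tps):
    """Throughput implied by average transfer size (KB) and transfers/sec."""
    return kb_per_t * tps / 1024.0


def busy_ms_per_transfer(pct_busy, tps):
    """Rough busy time per transfer in milliseconds."""
    return (pct_busy / 100.0) * 1000.0 / tps


# device: (KB/t, tps, %busy), copied from the snapshot of b while a pushes
b_push = {
    "ada0": (6.50, 594, 95),   # SSD hosting the log device
    "da1": (80.21, 53, 10),    # pool disks
    "da3": (79.59, 55, 14),
    "da4": (68.42, 39, 8),
}

for name, (kbt, tps, busy) in b_push.items():
    print(f"{name}: {mb_per_s(kbt, tps):.2f} MB/s, "
          f"~{busy_ms_per_transfer(busy, tps):.2f} ms busy per transfer")
```

On these figures ada0 handles many small transfers (6,50 KB on average,
594 per second) and is busy 95% of the time, whereas the pool disks see
far fewer, much larger transfers and stay mostly idle.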

When, instead of /pushing/ data out of a, I begin /pulling/ it (a
different file from the same directory) from b, things change
drastically. a looks like this:

    Disks   md0  ada0  ada1  ada2  ada3   da0   da1           intrn
    KB/t   0,00 83,00 64,00 64,00 64,00  0,00  0,00   6547524 wire
    tps       0    27   469   456   472     0     0    744768 act
    MB/s   0,00  2,16 29,32 28,49 29,50  0,00  0,00   2722100 inact
    %busy     0     1    13    13    13     0     0       108 cache

and b like this:

    Disks   md0  ada0   da0   da1   da2   da3   da4           intrn
    KB/t   0,00 15,46  0,00   114  0,00   116   112   4627944 wire
    tps       0    45     0   189     0   192   160    130376 act
    MB/s   0,00  0,68  0,00 20,98  0,00 21,74 17,45   7308284 inact
    %busy     0    81     0    19     0    37    28    145200 cache

ada0 is no longer the bottleneck and the copy is over almost instantly.
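
To put a number on "drastically": summing the MB/s of the pool disks in
the four snapshots gives the rough aggregate rates below. This is only a
sketch; the snapshots are instantaneous, so the ratio is indicative at
best, and the variable names are mine.

```python
# MB/s per pool disk, copied from the four "Disks" snapshots.
push = {
    "a_pool_reads": [1.87, 1.19, 1.45],     # a: ada1, ada2, ada3
    "b_pool_writes": [4.18, 4.29, 2.63],    # b: da1, da3, da4
}
pull = {
    "b_pool_reads": [20.98, 21.74, 17.45],  # b: da1, da3, da4
    "a_pool_writes": [29.32, 28.49, 29.50], # a: ada1, ada2, ada3 (incl. parity)
}


def total(values):
    """Aggregate MB/s across the disks of one pool."""
    return sum(values)


push_rate = total(push["b_pool_writes"])   # data landing on b's pool
pull_rate = total(pull["b_pool_reads"])    # data leaving b's pool

print(f"push: b pool writes {push_rate:.2f} MB/s")
print(f"pull: b pool reads  {pull_rate:.2f} MB/s")
print(f"pull/push ratio:    {pull_rate / push_rate:.1f}x")
```

Going by these snapshots, b's pool moves roughly five times more data
per second when a pulls than when a pushes.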
> What is the average latency between the two machines?

ping-ing b from a:

    round-trip min/avg/max/stddev = 0.137/0.156/0.178/0.015 ms

ping-ing a from b:

    round-trip min/avg/max/stddev = 0.114/0.169/0.220/0.036 ms

On 03.01.2016 11:09, Bob Friesenhahn wrote:
> The most likely issue is a latency problem with synchronous writes on
> 'b'.  The main pool disks seem to be working ok.  Make sure that the
> SSD you are using for slog is working fine.  Maybe it is abnormally slow. 
Why would the same ZFS pool, with the same slog, be faster when written
to locally than when written to over NFS?

    -mi