Date:      Fri, 3 Oct 2008 10:06:16 +0100 (BST)
From:      Robert Watson <rwatson@FreeBSD.org>
To:        Danny Braniss <danny@cs.huji.ac.il>
Cc:        freebsd-hackers@freebsd.org, Jeremy Chadwick <koitsu@freebsd.org>, freebsd-stable@freebsd.org, Claus Guttesen <kometen@gmail.com>
Subject:   Re: bad NFS/UDP performance 
Message-ID:  <alpine.BSF.1.10.0810031003440.41647@fledge.watson.org>
In-Reply-To: <E1KlgYe-000Es2-8u@cs1.cs.huji.ac.il>
References:  <E1Kj7NA-000FXz-3F@cs1.cs.huji.ac.il> <20080926081806.GA19055@icarus.home.lan> <E1Kj9bR-000H7t-0g@cs1.cs.huji.ac.il> <20080926095230.GA20789@icarus.home.lan> <E1KjEZw-000KkH-GP@cs1.cs.huji.ac.il> <alpine.BSF.1.10.0809271114450.20117@fledge.watson.org> <E1KjY2h-0008GC-PP@cs1.cs.huji.ac.il> <b41c75520809290140i435a5f6dge5219cd03cad55fe@mail.gmail.com> <E1Klfac-000DzZ-Ie@cs1.cs.huji.ac.il> <alpine.BSF.1.10.0810030910351.41647@fledge.watson.org> <E1KlgYe-000Es2-8u@cs1.cs.huji.ac.il>


On Fri, 3 Oct 2008, Danny Braniss wrote:

>> OK, so it looks like this was almost certainly the rwlock change.  What 
>> happens if you pretty much universally substitute the following in 
>> udp_usrreq.c:
>>
>> Currently		Change to
>> ---------		---------
>> INP_RLOCK		INP_WLOCK
>> INP_RUNLOCK		INP_WUNLOCK
>> INP_RLOCK_ASSERT	INP_WLOCK_ASSERT
>
> I guess you were almost certainly correct :-) I did the global subst. on the 
> udp_usrreq.c from 19/08, __FBSDID("$FreeBSD: src/sys/netinet/udp_usrreq.c,v 
> 1.218.2.3 2008/08/18 23:00:41 bz Exp $"); and now udp is fine again!
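
For anyone repeating the experiment, the substitution quoted above can be
scripted. This is only a sketch: the helper name widen_inp_locks is
hypothetical, and it is meant to be run on a private copy of udp_usrreq.c.
It rewrites whole identifiers only, so other macros such as INP_INFO_RLOCK
are left alone.

```python
import re

# Mapping copied from the table quoted above:
# read-lock macros on the left, write-lock replacements on the right.
SUBSTITUTIONS = {
    "INP_RLOCK": "INP_WLOCK",
    "INP_RUNLOCK": "INP_WUNLOCK",
    "INP_RLOCK_ASSERT": "INP_WLOCK_ASSERT",
}

# Match whole identifiers only; longest names are tried first so that
# INP_RLOCK_ASSERT is handled by its own rule.
_PATTERN = re.compile(
    r"\b(" + "|".join(sorted(SUBSTITUTIONS, key=len, reverse=True)) + r")\b"
)


def widen_inp_locks(source: str) -> str:
    """Return source with each read-lock macro replaced by its write-lock form."""
    return _PATTERN.sub(lambda m: SUBSTITUTIONS[m.group(1)], source)
```

Reading the file, passing its contents through widen_inp_locks(), and
writing the result back gives the same effect as the global substitution
described in the quoted text.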

OK.  This is a change I'd rather not back out since it significantly improves 
performance for many other UDP workloads, so we need to figure out why it's 
hurting us so much here so that we know if there are reasonable alternatives.

Would it be possible for you to do a run of the workload with both kernels 
using LOCK_PROFILING around the benchmark, and then we can compare lock 
contention in the two cases?  What we often find is that relieving contention 
at one point causes new contention at another point, and if the primitive used 
at that point handles contention less well for whatever reason, performance 
can be reduced rather than improved.  So maybe we're looking at an issue in 
the dispatched UDP code from so_upcall?  Another less satisfying (and 
fundamentally more difficult) answer might be "something to do with the 
scheduler", but a bit more analysis may shed some light.
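
A rough sketch of how such a profiling run could be wrapped around the
benchmark, to be repeated once per kernel. It assumes a kernel built with
"options LOCK_PROFILING", root privileges, and the debug.lock.prof.*
sysctls described in LOCK_PROFILING(9); the function names and the
injectable run_sysctl parameter are illustrative only.

```python
import subprocess


def sysctl(*args):
    """Run sysctl(8) with the given arguments and return its stdout."""
    result = subprocess.run(
        ["sysctl", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def profile_workload(workload, run_sysctl=sysctl):
    """Call workload() with lock profiling enabled; return the raw stats text.

    Assumption: the kernel was built with 'options LOCK_PROFILING'.
    """
    run_sysctl("debug.lock.prof.reset=1")   # discard samples from earlier runs
    run_sysctl("debug.lock.prof.enable=1")  # start collecting
    try:
        workload()
    finally:
        # Stop collecting even if the benchmark itself fails.
        run_sysctl("debug.lock.prof.enable=0")
    return run_sysctl("-n", "debug.lock.prof.stats")
```

Saving the returned text from each kernel to its own file makes it
possible to compare the most contended locks side by side.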

Robert N M Watson
Computer Laboratory
University of Cambridge

>
> danny
>
>
>> Robert N M Watson
>> Computer Laboratory
>> University of Cambridge
>>
>>>
>>> server is a NetApp:
>>>
>>> kernel from 18/08/08 00:00:00:
>>>                    /----- UDP ----//---- TCP -------/
>>>       1*512  38528 0.19s   83.50MB 0.20s   80.82MB/s
>>>       2*512  19264 0.21s   76.83MB 0.21s   77.57MB/s
>>>       4*512   9632 0.19s   85.51MB 0.22s   73.13MB/s
>>>       8*512   4816 0.19s   83.76MB 0.21s   75.84MB/s
>>>      16*512   2408 0.19s   83.99MB 0.21s   77.18MB/s
>>>      32*512   1204 0.19s   84.45MB 0.22s   71.79MB/s
>>>      64*512    602 0.20s   79.98MB 0.20s   78.44MB/s
>>>     128*512    301 0.18s   86.51MB 0.22s   71.53MB/s
>>>     256*512    150 0.19s   82.83MB 0.20s   78.86MB/s
>>>     512*512     75 0.19s   82.77MB 0.21s   76.39MB/s
>>>    1024*512     37 0.19s   85.62MB 0.21s   76.64MB/s
>>>    2048*512     18 0.21s   77.72MB 0.20s   80.30MB/s
>>>    4096*512      9 0.26s   61.06MB 0.30s   53.79MB/s
>>>    8192*512      4 0.83s   19.20MB 0.41s   39.12MB/s
>>>   16384*512      2 0.84s   19.01MB 0.41s   39.03MB/s
>>>   32768*512      1 0.82s   19.59MB 0.39s   40.89MB/s
>>>
>>> kernel from 19/08/08 00:00:00:
>>>       1*512  38528 0.45s   35.59MB 0.20s   81.43MB/s
>>>       2*512  19264 0.45s   35.56MB 0.20s   79.24MB/s
>>>       4*512   9632 0.49s   32.66MB 0.22s   73.72MB/s
>>>       8*512   4816 0.47s   34.06MB 0.21s   75.52MB/s
>>>      16*512   2408 0.53s   30.16MB 0.22s   72.58MB/s
>>>      32*512   1204 0.31s   51.68MB 0.40s   40.14MB/s
>>>      64*512    602 0.43s   37.23MB 0.25s   63.57MB/s
>>>     128*512    301 0.51s   31.39MB 0.26s   62.70MB/s
>>>     256*512    150 0.47s   34.02MB 0.23s   69.06MB/s
>>>     512*512     75 0.47s   34.01MB 0.23s   70.52MB/s
>>>    1024*512     37 0.53s   30.12MB 0.22s   73.01MB/s
>>>    2048*512     18 0.55s   29.07MB 0.23s   70.64MB/s
>>>    4096*512      9 0.46s   34.69MB 0.21s   75.92MB/s
>>>    8192*512      4 0.81s   19.66MB 0.43s   36.89MB/s
>>>   16384*512      2 0.80s   19.99MB 0.40s   40.29MB/s
>>>   32768*512      1 1.11s   14.41MB 0.38s   42.56MB/s
>>>
>
>
>
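
To put a number on the regression, the UDP throughput figures from the two
quoted tables can be compared row by row. The values below are copied from
those tables in order; the variable and function names are illustrative,
and the UDP column is assumed to use the same unit in both tables.

```python
# Block-size labels and UDP throughput, copied from the quoted tables.
labels = [
    "1*512", "2*512", "4*512", "8*512", "16*512", "32*512", "64*512",
    "128*512", "256*512", "512*512", "1024*512", "2048*512", "4096*512",
    "8192*512", "16384*512", "32768*512",
]
udp_before = [  # kernel from 18/08/08
    83.50, 76.83, 85.51, 83.76, 83.99, 84.45, 79.98, 86.51,
    82.83, 82.77, 85.62, 77.72, 61.06, 19.20, 19.01, 19.59,
]
udp_after = [  # kernel from 19/08/08
    35.59, 35.56, 32.66, 34.06, 30.16, 51.68, 37.23, 31.39,
    34.02, 34.01, 30.12, 29.07, 34.69, 19.66, 19.99, 14.41,
]


def throughput_ratios(before, after):
    """Return after/before for each row; values below 1.0 mean a slowdown."""
    return [a / b for b, a in zip(before, after)]


ratios = throughput_ratios(udp_before, udp_after)
for label, ratio in zip(labels, ratios):
    print(f"{label:>10}  {ratio:.2f}")
```

A ratio well below 1.0 marks a row where the newer kernel is slower for
UDP; a ratio near 1.0 marks a row that is essentially unchanged.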


