Date: Fri, 3 Oct 2008 10:06:16 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
To: Danny Braniss <danny@cs.huji.ac.il>
Cc: freebsd-hackers@freebsd.org, Jeremy Chadwick <koitsu@freebsd.org>, freebsd-stable@freebsd.org, Claus Guttesen <kometen@gmail.com>
Subject: Re: bad NFS/UDP performance
Message-ID: <alpine.BSF.1.10.0810031003440.41647@fledge.watson.org>
In-Reply-To: <E1KlgYe-000Es2-8u@cs1.cs.huji.ac.il>
References: <E1Kj7NA-000FXz-3F@cs1.cs.huji.ac.il> <20080926081806.GA19055@icarus.home.lan> <E1Kj9bR-000H7t-0g@cs1.cs.huji.ac.il> <20080926095230.GA20789@icarus.home.lan> <E1KjEZw-000KkH-GP@cs1.cs.huji.ac.il> <alpine.BSF.1.10.0809271114450.20117@fledge.watson.org> <E1KjY2h-0008GC-PP@cs1.cs.huji.ac.il> <b41c75520809290140i435a5f6dge5219cd03cad55fe@mail.gmail.com> <E1Klfac-000DzZ-Ie@cs1.cs.huji.ac.il> <alpine.BSF.1.10.0810030910351.41647@fledge.watson.org> <E1KlgYe-000Es2-8u@cs1.cs.huji.ac.il>

On Fri, 3 Oct 2008, Danny Braniss wrote:

>> OK, so it looks like this was almost certainly the rwlock change.  What
>> happens if you pretty much universally substitute the following in
>> udp_usrreq.c:
>>
>> Currently           Change to
>> ---------           ---------
>> INP_RLOCK           INP_WLOCK
>> INP_RUNLOCK         INP_WUNLOCK
>> INP_RLOCK_ASSERT    INP_WLOCK_ASSERT
>
> I guess you were almost certainly correct :-) I did the global subst. on the
> udp_usrreq.c from 19/08, __FBSDID("$FreeBSD: src/sys/netinet/udp_usrreq.c,v
> 1.218.2.3 2008/08/18 23:00:41 bz Exp $"); and now udp is fine again!

OK.  This is a change I'd rather not back out since it significantly improves
performance for many other UDP workloads, so we need to figure out why it's
hurting us so much here so that we know if there are reasonable alternatives.

Would it be possible for you to do a run of the workload with both kernels
using LOCK_PROFILING around the benchmark, and then we can compare lock
contention in the two cases?

What we often find is that relieving contention at one point causes new
contention at another point, and if the primitive used at that point handles
contention less well for whatever reason, performance can be reduced rather
than improved.  So maybe we're looking at an issue in the dispatched UDP code
from so_upcall?  Another less satisfying (and fundamentally more difficult)
answer might be "something to do with the scheduler", but a bit more analysis
may shed some light.
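
For anyone wanting to repeat the experiment, the three-macro substitution
quoted above can be sketched roughly as follows.  This is only an
illustration in Python, not the procedure actually used in the thread (which
is described only as a "global subst."); the function name widen_inp_locks is
hypothetical.  The one point it tries to make is that the match should be on
whole identifiers, so that other macros containing similar text (for example
INP_INFO_RLOCK) are left alone.

```python
import re

# The three read-lock macros named in the thread and their write-lock
# counterparts.  Nothing outside this table is touched.
SUBSTITUTIONS = {
    "INP_RLOCK": "INP_WLOCK",
    "INP_RUNLOCK": "INP_WUNLOCK",
    "INP_RLOCK_ASSERT": "INP_WLOCK_ASSERT",
}

# Match whole identifiers only.  Longest names are tried first so that
# INP_RLOCK_ASSERT is handled as one identifier, and the \b anchors keep
# names such as INP_INFO_RLOCK from matching at all.
_PATTERN = re.compile(
    r"\b(" + "|".join(sorted(SUBSTITUTIONS, key=len, reverse=True)) + r")\b"
)


def widen_inp_locks(source: str) -> str:
    """Return source text with the three read-lock macros replaced."""
    return _PATTERN.sub(lambda m: SUBSTITUTIONS[m.group(1)], source)


if __name__ == "__main__":
    # Hypothetical fragment, not copied from udp_usrreq.c.
    sample = "INP_RLOCK(inp);\nINP_RLOCK_ASSERT(inp);\nINP_RUNLOCK(inp);\n"
    print(widen_inp_locks(sample), end="")
```

Running the result through a diff against the original file before building
is a cheap way to confirm that only the intended call sites changed.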

Robert N M Watson
Computer Laboratory
University of Cambridge

>
> danny
>
>
>> Robert N M Watson
>> Computer Laboratory
>> University of Cambridge
>>
>>>
>>> server is a NetApp:
>>>
>>> kernel from 18/08/08 00:00:0 :
>>>                  /----- UDP ----//---- TCP -------/
>>> 1*512      38528  0.19s 83.50MB  0.20s 80.82MB/s
>>> 2*512      19264  0.21s 76.83MB  0.21s 77.57MB/s
>>> 4*512       9632  0.19s 85.51MB  0.22s 73.13MB/s
>>> 8*512       4816  0.19s 83.76MB  0.21s 75.84MB/s
>>> 16*512      2408  0.19s 83.99MB  0.21s 77.18MB/s
>>> 32*512      1204  0.19s 84.45MB  0.22s 71.79MB/s
>>> 64*512       602  0.20s 79.98MB  0.20s 78.44MB/s
>>> 128*512      301  0.18s 86.51MB  0.22s 71.53MB/s
>>> 256*512      150  0.19s 82.83MB  0.20s 78.86MB/s
>>> 512*512       75  0.19s 82.77MB  0.21s 76.39MB/s
>>> 1024*512      37  0.19s 85.62MB  0.21s 76.64MB/s
>>> 2048*512      18  0.21s 77.72MB  0.20s 80.30MB/s
>>> 4096*512       9  0.26s 61.06MB  0.30s 53.79MB/s
>>> 8192*512       4  0.83s 19.20MB  0.41s 39.12MB/s
>>> 16384*512      2  0.84s 19.01MB  0.41s 39.03MB/s
>>> 32768*512      1  0.82s 19.59MB  0.39s 40.89MB/s
>>>
>>> kernel from 19/08/08 00:00:00:
>>> 1*512      38528  0.45s 35.59MB  0.20s 81.43MB/s
>>> 2*512      19264  0.45s 35.56MB  0.20s 79.24MB/s
>>> 4*512       9632  0.49s 32.66MB  0.22s 73.72MB/s
>>> 8*512       4816  0.47s 34.06MB  0.21s 75.52MB/s
>>> 16*512      2408  0.53s 30.16MB  0.22s 72.58MB/s
>>> 32*512      1204  0.31s 51.68MB  0.40s 40.14MB/s
>>> 64*512       602  0.43s 37.23MB  0.25s 63.57MB/s
>>> 128*512      301  0.51s 31.39MB  0.26s 62.70MB/s
>>> 256*512      150  0.47s 34.02MB  0.23s 69.06MB/s
>>> 512*512       75  0.47s 34.01MB  0.23s 70.52MB/s
>>> 1024*512      37  0.53s 30.12MB  0.22s 73.01MB/s
>>> 2048*512      18  0.55s 29.07MB  0.23s 70.64MB/s
>>> 4096*512       9  0.46s 34.69MB  0.21s 75.92MB/s
>>> 8192*512       4  0.81s 19.66MB  0.43s 36.89MB/s
>>> 16384*512      2  0.80s 19.99MB  0.40s 40.29MB/s
>>> 32768*512      1  1.11s 14.41MB  0.38s 42.56MB/s
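
To make the size of the UDP regression in the quoted tables easier to read,
the per-row ratio between the two kernels can be computed with a short
script such as the one below.  It is a sketch under stated assumptions: the
numbers are copied from the UDP throughput column of the two tables, that
column is labeled only "MB" in the original and is assumed here to be MB/s
like the TCP column, and the names OLD_UDP, NEW_UDP and udp_ratios are
hypothetical.

```python
# Request sizes (first column of the quoted tables), in table order.
SIZES = [
    "1*512", "2*512", "4*512", "8*512", "16*512", "32*512", "64*512",
    "128*512", "256*512", "512*512", "1024*512", "2048*512", "4096*512",
    "8192*512", "16384*512", "32768*512",
]

# UDP throughput column, kernel from 18/08/08 (unit assumed to be MB/s).
OLD_UDP = [83.50, 76.83, 85.51, 83.76, 83.99, 84.45, 79.98, 86.51,
           82.83, 82.77, 85.62, 77.72, 61.06, 19.20, 19.01, 19.59]

# UDP throughput column, kernel from 19/08/08 (same assumed unit).
NEW_UDP = [35.59, 35.56, 32.66, 34.06, 30.16, 51.68, 37.23, 31.39,
           34.02, 34.01, 30.12, 29.07, 34.69, 19.66, 19.99, 14.41]


def udp_ratios(old, new):
    """Return new/old throughput for each row; below 1.0 means slower."""
    return [n / o for o, n in zip(old, new)]


if __name__ == "__main__":
    for size, ratio in zip(SIZES, udp_ratios(OLD_UDP, NEW_UDP)):
        print(f"{size:>10}  {ratio:.2f}")
```

A ratio well below 1.0 marks a row where the 19/08 kernel is slower over
UDP; rows near 1.0 were already slow on the 18/08 kernel.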