Date: Wed, 28 Jan 2004 13:17:46 -0500 (EST)
From: Robert Watson <robert@fledge.watson.org>
To: Don Bowman
cc: freebsd-current@freebsd.org
Subject: Re: system call performance 4.x vs 5.x [and UP vs MP]

On Wed, 28 Jan 2004, Don Bowman wrote:

> This is a very simplistic benchmark, so don't get too hung up on the
> accuracy.
>
> If you run this on a given machine on 4.x vs 5.x, you will notice a
> dramatic difference [yes, INVARIANTS et al. are disabled].
>
> For example, on a 2.0GHz P4 Xeon, HTT enabled, MP kernel, I can do
> ~1M socket() calls/s on 4.7, but only ~250K/s on 5.2.

Make sure you are running with sys/proc.h rev. 1.366; this removes two
lock/unlock operations on the process lock from the system call path.
I've been doing some benchmarks of system call performance between 4.9 and
5.2-current on a dual-processor PIII box here, and by making this change, I
saw about a 20% reduction in the cost of a system call from dropping those
operations.  You're running without INVARIANTS, which means you're already
skipping the other "big gratuitous locking in the system call path":
the assertion checks in the trap code related to signalling.

Also, I tend to use clock_gettime() for time measurements, as among other
things it lets you ask for the clock resolution, and it offers
finer-grained timer measurements:

	struct timespec ts_start, ts_end, ts_res;
#if 0
	struct timespec ts_dummy;
#endif

	assert(clock_getres(CLOCK_REALTIME, &ts_res) == 0);
	printf("Clock resolution: %d.%09lu\n", ts_res.tv_sec,
	    ts_res.tv_nsec);

	assert(clock_gettime(CLOCK_REALTIME, &ts_start) == 0);
	for (i = 0; i < NUM; i++) {
#if 1
		/*
		 * Should require no locks; all thread-local data with
		 * an MPSAFE system call.
		 */
		getuid();
#endif
#if 0
		/*
		 * This needs to grab the process lock to follow the
		 * parent process pointer, and should cost more.
		 */
		getppid();
#endif
#if 0
		clock_gettime(CLOCK_REALTIME, &ts_dummy);
#endif
	}
	assert(clock_gettime(CLOCK_REALTIME, &ts_end) == 0);

	timespecsub(&ts_end, &ts_start);
	printf("%d.%09lu for %d iterations\n", ts_end.tv_sec,
	    ts_end.tv_nsec, NUM);
	printf("%d.%09lu per iteration\n", ts_end.tv_sec / NUM,
	    ts_end.tv_nsec / NUM);

I usually do about 100,000 or 200,000 iterations; too many, and you lose
the CPU.  Also, depending on hardware, I've seen performance with
SCHED_ULE plus the recent IPI changes improve with respect to SCHED_4BSD,
or get worse, so you might want to try both.  If you're using ULE, try
switching back to 4BSD and let us know what that changes.

For the socket code, you really want the socket locking changes found in
the netperf_socket branch.  Once we get 5.2.1 out the door, the next
priority will be to begin merging that work, which pushes Giant off the
majority of the network stack.
In the meantime, you might want to try measuring using pipe() instead,
since the pipe code is Giant-free.  There are a number of performance
optimizations "in the works" for things like interrupt scheduling latency,
the cost of kernel context switches, etc., and hopefully we'll see those
patches posted soon.  I've benchmarked some of the early versions and seen
pretty dramatic improvements.

The 5.x branch is based on a lot of long-term investment in
infrastructure, and many of the local optimizations have been deferred in
order to get the architecture right.  The result is, hopefully, an
architecture that offers much more scalability and performance, but in the
short term, poor results on micro-benchmarks.  We've about reached the
point where we're ready to start on local optimizations, and I think
you'll see a pretty rapid payoff.  For example, with the various context
switch/interrupt latency/... changes in the pipeline, I measure a halving
of end-to-end packet delivery latency.  That doesn't mean we don't have
further to go, of course, but it does suggest there's a lot of hope.

BTW, is your table below "4.7 UP vs 5.x MP"?  I was left unclear by the
title.  Generally, the results I see suggest that 5.x UP is currently
slower than 4.x UP (something we should make back up over the next three
or four months), but that 5.x MP is quite a bit faster than 4.x MP in many
interesting cases (i.e., network throughput, builds, etc.).  Especially
with the recent IPI and scheduling changes, I see substantially lower
latency in scheduling various kernel threads on 5.x MP compared to 4.x MP,
which means a lot more work gets done.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Senior Research Scientist, McAfee Research

> syscall          4.7       5.2
> write        1015036    169800
> socket       1078994    223253
> select        430564    155077
> gettimeofday  252762    183620
>
> As a side note, any idea why gettimeofday is so much more
> expensive than socket?
>
> Any suggestion on why such a difference between 4.x and 5.x?
> The code is compiled the same on each ('gcc -O2', no threading
> options chosen).
>
> For interest, you can try the same program on 4.x in UP vs MP,
> and the difference is very dramatic too.
>
> #include <sys/types.h>
> #include <sys/time.h>
> #include <sys/socket.h>
> #include <sys/select.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
>
> #define M(n) measure(#n, (void (*)(int, ...))n)
>
> static void
> measure(char *name, void (*fp)(int, ...))
> {
>     double speed;
>     int j;
>     unsigned long long i = 0;
>     unsigned long long us;
>     struct timeval tp, tp1;
>
>     gettimeofday(&tp, 0);
>     tp1 = tp;
>     while (tp1.tv_sec - tp.tv_sec < 10) {
>         for (j = 0; j < 1000000; j++) {
>             fp(0, 0, 0, 0);
>             i++;
>         }
>         gettimeofday(&tp1, 0);
>     }
>     us = ((tp1.tv_sec - tp.tv_sec) * 1000000) + (tp1.tv_usec - tp.tv_usec);
>     speed = (1000000.0 * i) / us;
>     printf("{%s: %llu %llu %6.2f}\n", name, i, us, speed);
> }
>
> static void
> doGettimeofday(void)
> {
>     double speed;
>     unsigned long long i = 0;
>     unsigned long long us;
>     struct timeval tp, tp1;
>
>     gettimeofday(&tp, 0);
>     tp1 = tp;
>     while (tp1.tv_sec - tp.tv_sec < 10) {
>         gettimeofday(&tp1, 0);
>         i++;
>     }
>     us = ((tp1.tv_sec - tp.tv_sec) * 1000000) + (tp1.tv_usec - tp.tv_usec);
>     speed = (1000000.0 * i) / us;
>     printf("{gettimeofday: %llu %llu %6.2f}\n", i, us, speed);
> }
>
> int
> main(int argc, char **argv)
> {
>     M(write);
>     M(socket);
>     M(select);
>     doGettimeofday();
>     return 0;
> }
>
> _______________________________________________
> freebsd-current@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"