Date: Wed, 13 Jun 2007 13:56:10 -0700
From: "Aaron Kunze" <boilerpdx@gmail.com>
To: freebsd-threads@FreeBSD.org
Subject: Performance issue (bug?) in libpthread
Message-ID: <1d26be380706131356t3fca2f7dk625f1f5c4234b56d@mail.gmail.com>

Hello!

I am seeing a performance issue when I use libpthread to provide POSIX threads, and I think I've narrowed the problem down to a line of code in libpthread. I am interested to know if anyone else has seen this problem, or if anyone disagrees with my analysis.

I am running on a dual Intel Xeon X5355 system, which gives me 8 cores total. I am running an amd64 build of FreeBSD 6.2 with the SMP and ULE options.

The test program I am using just creates 8 threads that do nothing but computation. They do no synchronization or I/O. I have complete control of the system, and all other processes are idle.

For the first minute of its run time, this test app uses fewer than 8 cores. In most of the test runs it uses 7 cores, but I have seen other numbers. The remaining core(s) are idle. Here is a snippet from "top" that shows this:

    55948 RUN     0   3:06 87.21% compute_threads_pt
    55948 CPU5    5   3:06 87.21% compute_threads_pt
    55948 CPU4    4   3:06 87.21% compute_threads_pt
    55948 CPU2    2   3:06 87.21% compute_threads_pt
    55948 CPU1    1   3:06 87.21% compute_threads_pt
    55948 CPU6    6   3:06 87.21% compute_threads_pt
    55948 CPU3    3   3:06 87.21% compute_threads_pt
    55948 kserel  0   3:06  0.00% compute_threads_pt

The first KSE is on the run queue because "top" is running on core 0 at the instant it samples the system state. That's not the problem. It's the last KSE that shows the problem. It is in the kserelease state and stays there for exactly one minute. After a minute passes, the problem resolves itself, and the system becomes fully utilized.
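
For anyone who wants to try this, a minimal sketch of a test program of the kind described above follows. It is not the actual compute_threads_pt source: the names burn_cycles, worker_main and run_compute_threads, the MAX_WORKERS limit, and the arithmetic in the loop are all assumptions made for illustration. The only properties it shares with the description are that each thread does pure computation with no synchronization or I/O, and that the threads are created one after another by the main thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_WORKERS 64          /* hypothetical upper bound for this sketch */

struct worker_arg {
    uint64_t seed;              /* per-thread starting value */
    uint64_t iterations;        /* how long the thread keeps computing */
    uint64_t result;            /* written by the worker, read after join */
};

/* Pure computation: no locks, no I/O, no system calls. */
uint64_t burn_cycles(uint64_t seed, uint64_t iterations)
{
    uint64_t x = seed;
    for (uint64_t i = 0; i < iterations; i++)
        x = x * 6364136223846793005ULL + 1442695040888963407ULL;
    return x;
}

static void *worker_main(void *p)
{
    struct worker_arg *arg = p;
    arg->result = burn_cycles(arg->seed, arg->iterations);
    return NULL;
}

/*
 * Create nthreads compute-only threads, wait for all of them, and copy
 * each thread's result into results[]. Returns 0 on success, -1 for a
 * bad thread count, or the error code returned by pthread_create().
 */
int run_compute_threads(int nthreads, uint64_t iterations, uint64_t *results)
{
    pthread_t tids[MAX_WORKERS];
    struct worker_arg args[MAX_WORKERS];
    int created = 0;
    int err = 0;

    assert(results != NULL);
    if (nthreads < 1 || nthreads > MAX_WORKERS)
        return -1;

    /* Threads are created sequentially, as in the scenario described. */
    for (int i = 0; i < nthreads; i++) {
        args[i].seed = (uint64_t)i;
        args[i].iterations = iterations;
        args[i].result = 0;
        err = pthread_create(&tids[i], NULL, worker_main, &args[i]);
        if (err != 0)
            break;
        created++;
    }

    /* Join whatever was started, even if a later create failed. */
    for (int i = 0; i < created; i++) {
        pthread_join(tids[i], NULL);
        results[i] = args[i].result;
    }
    return err;
}
```

Calling run_compute_threads(8, n, results) from main() with a large n and watching "top" in thread mode should be enough to compare against the output above. How large n needs to be to run for more than a minute depends on the machine.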

Here's the output of "top" after a minute:

    55948 CPU5    5   7:40 92.48% compute_threads_pt
    55948 CPU4    4   7:40 92.48% compute_threads_pt
    55948 CPU1    1   7:40 92.48% compute_threads_pt
    55948 CPU6    6   7:40 92.48% compute_threads_pt
    55948 CPU3    3   7:40 92.48% compute_threads_pt
    55948 CPU2    2   7:40 92.48% compute_threads_pt
    55948 RUN     0   7:40 87.94% compute_threads_pt
    55948 CPU7    7   7:40 36.82% compute_threads_pt

I went looking through the libpthread code to find something that looked like it could cause this, and I found something at lines 1801-1812 of thr_kern.c in the kse_wait function:

    if ((td_wait == NULL) || (td_wait->wakeup_time.tv_sec < 0)) {
        /* Limit sleep to no more than 1 minute. */
        ts_sleep.tv_sec = 60;
        ts_sleep.tv_nsec = 0;
    } else {
        KSE_GET_TOD(kse, &ts);
        TIMESPEC_SUB(&ts_sleep, &td_wait->wakeup_time, &ts);
        if (ts_sleep.tv_sec > 60) {
            ts_sleep.tv_sec = 60;
            ts_sleep.tv_nsec = 0;
        }
    }

I interpret this code to be putting KSEs to sleep for one minute when they find no user-level threads either ready to run or waiting for some timed event. If I change the tv_sec member of ts_sleep to 0 instead of 60 on line 1803, the problem goes away entirely.

Here's what I believe is happening:

- When I create the first worker thread, pthread_create calls _kse_setthreaded.
- _kse_setthreaded calls _thr_setmaxconcurrency, which calls _thr_setconcurrency with "8" as the argument.
- _thr_setconcurrency creates 7 KSEs (one already exists).
- But at this point, the app has only created one user-level thread!
- In parallel, the main thread continues to create threads while the new KSEs wake up and look in the run queue for threads to schedule.
- One or more KSEs get to the run queue before all of the threads have been created and find no work to do.
- Those KSEs call kse_wait and sleep for 1 minute.

So, has anyone seen this before? Did I miss something?

Aaron

BTW, I looked at the HEAD in CVS, and the code in question has not changed. So if this is a bug, it hasn't been fixed in the meantime.
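
To make the timeout selection easier to experiment with outside libpthread, here is a standalone sketch of the same decision. It mirrors only the fragment quoted above and is not the library's code: compute_sleep, timespec_sub and the idle_sleep_sec parameter are hypothetical names, and timespec_sub is my assumption about what a subtraction macro like TIMESPEC_SUB computes. Passing idle_sleep_sec = 60 corresponds to the unmodified behaviour, and idle_sleep_sec = 0 corresponds to the one-line change described above.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define MAX_TIMED_SLEEP_SEC 60  /* cap used in the timed-wait branch */

/* out = a - b, normalized so that 0 <= tv_nsec < 1000000000. */
static void timespec_sub(struct timespec *out, const struct timespec *a,
                         const struct timespec *b)
{
    out->tv_sec = a->tv_sec - b->tv_sec;
    out->tv_nsec = a->tv_nsec - b->tv_nsec;
    if (out->tv_nsec < 0) {
        out->tv_sec--;
        out->tv_nsec += 1000000000L;
    }
}

/*
 * Choose how long an idle scheduler entity should sleep.
 *
 * wakeup:         wakeup time of the earliest waiting thread, or NULL
 *                 if no thread is waiting on a timed event
 * now:            current time
 * idle_sleep_sec: sleep length used when nothing is waiting
 *                 (hypothetical knob; 60 matches the quoted code)
 * ts_sleep:       output, the chosen sleep interval
 */
void compute_sleep(const struct timespec *wakeup, const struct timespec *now,
                   time_t idle_sleep_sec, struct timespec *ts_sleep)
{
    assert(now != NULL && ts_sleep != NULL);

    if (wakeup == NULL || wakeup->tv_sec < 0) {
        /* Nothing to wait for: sleep for the fixed idle interval. */
        ts_sleep->tv_sec = idle_sleep_sec;
        ts_sleep->tv_nsec = 0;
    } else {
        /* Sleep until the earliest wakeup, but not past the cap. */
        timespec_sub(ts_sleep, wakeup, now);
        if (ts_sleep->tv_sec > MAX_TIMED_SLEEP_SEC) {
            ts_sleep->tv_sec = MAX_TIMED_SLEEP_SEC;
            ts_sleep->tv_nsec = 0;
        }
    }
}
```

With wakeup == NULL, which is the case a new KSE would hit if it finds an empty run queue and no timed waiters, the result is the full idle interval regardless of how soon new threads are about to be created. That matches the one-minute stall seen in "top".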