Date: Tue, 1 Nov 2016 20:45:01 -0700
From: Kevin Oberman <rkoberman@gmail.com>
To: Jason Harmening <jason.harmening@gmail.com>
Cc: FreeBSD-STABLE Mailing List <freebsd-stable@freebsd.org>
Subject: Re: huge nanosleep variance on 11-stable
Message-ID: <CAN6yY1vKr_PAHp3bL-iiHndPxq58kz_qFqmjbEcK1CbmhywVZg@mail.gmail.com>
In-Reply-To: <1c3f4599-8aef-471a-3a39-49d913f1a4e5@gmail.com>
References: <c88341e2-4c52-ed3c-a469-6446da4415f4@gmail.com> <6167392c-c37a-6e39-aa22-ca45435d6088@gmail.com> <1c3f4599-8aef-471a-3a39-49d913f1a4e5@gmail.com>
On Tue, Nov 1, 2016 at 2:36 PM, Jason Harmening <jason.harmening@gmail.com> wrote:

> Sorry, that should be ~*30ms* to get 30fps, though the variance is still
> up to 500ms for me either way.
>
> On 11/01/16 14:29, Jason Harmening wrote:
> > repro code is at http://pastebin.com/B68N4AFY if anyone's interested.
> >
> > On 11/01/16 13:58, Jason Harmening wrote:
> >> Hi everyone,
> >>
> >> I recently upgraded my main amd64 server from 10.3-stable (r302011) to
> >> 11.0-stable (r308099). It went smoothly except for one big issue:
> >> certain applications (but not the system as a whole) respond very
> >> sluggishly, and video playback of any kind is extremely choppy.
> >>
> >> The system is under very light load, and I see no evidence of abnormal
> >> interrupt latency or interrupt load. More interestingly, if I place the
> >> system under full load (~0.0% idle) the problem *disappears* and
> >> playback/responsiveness are smooth and quick.
> >>
> >> Running ktrace on some of the affected apps points me at the problem:
> >> huge variance in the amount of time spent in the nanosleep system call.
> >> A sleep of, say, 5ms might take anywhere from 5ms to ~500ms from entry
> >> to return of the syscall. OTOH, anything CPU-bound or that waits on
> >> condvars or I/O interrupts seems to work fine, so this doesn't seem to
> >> be an issue with overall system latency.
> >>
> >> I can repro this with a simple program that just does a 3ms usleep in a
> >> tight loop (i.e. roughly the amount of time a video player would sleep
> >> between frames @ 30fps). At light load ktrace will show the huge
> >> nanosleep variance; under heavy load every nanosleep will complete in
> >> almost exactly 3ms.
> >>
> >> FWIW, I don't see this on -current, although right now all my -current
> >> images are VMs on different HW so that might not mean anything. I'm not
> >> aware of any recent timer- or scheduler- specific changes, so I'm
> >> wondering if perhaps the recent IPI or taskqueue changes might be
> >> somehow to blame.
> >>
> >> I'm not especially familiar w/ the relevant parts of the kernel, so any
> >> guidance on where I should focus my debugging efforts would be much
> >> appreciated.
> >>
> >> Thanks,
> >> Jason
>

This is likely off track, but this is a behavior I have noticed since
moving to 11, though it might have started in 10.3-STABLE before I moved to
head, before 11 went to beta. I can't explain any way nanosleep could be
involved, but I saw annoying lock-ups similar to yours. I also no longer
see them.

I eliminated the annoyance by changing the scheduler from ULE to 4BSD. That
was the only change, and I have not seen the issue since. I'd be very
interested in whether the scheduler is somehow impacting timing functions
or whether it's a different issue. I've felt that there was something off
in ULE for some time, but it was not until these annoying hiccups that I
was convinced to try going back to 4BSD.

Tip o' the hat to Doug B. for his suggestions that ULE may have issues that
impacted interactivity.
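For anyone who wants to check whether a given kernel or scheduler shows the
nanosleep variance described in the quoted message, here is a minimal sketch
of that kind of measurement. It is not the pastebin program referenced above.
It calls nanosleep() directly rather than usleep(), and the function names
(elapsed_us, measure_sleep, run_repro) and the sample count are assumptions
made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L	/* for clock_gettime() and nanosleep() */

#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Elapsed time from *start to *end, in microseconds. */
int64_t
elapsed_us(const struct timespec *start, const struct timespec *end)
{
	int64_t ns;

	ns = (int64_t)(end->tv_sec - start->tv_sec) * 1000000000 +
	    (end->tv_nsec - start->tv_nsec);
	return (ns / 1000);
}

/*
 * Sleep for sleep_us microseconds, `iterations` times in a row, and record
 * the shortest and longest wall-clock duration actually observed.
 * (Hypothetical helper, not taken from the original repro.)
 */
void
measure_sleep(long sleep_us, int iterations, int64_t *min_us, int64_t *max_us)
{
	struct timespec req, before, after;
	int64_t took;
	int i;

	assert(sleep_us > 0 && sleep_us < 1000000 && iterations > 0);
	req.tv_sec = 0;
	req.tv_nsec = sleep_us * 1000;
	*min_us = INT64_MAX;
	*max_us = 0;

	for (i = 0; i < iterations; i++) {
		clock_gettime(CLOCK_MONOTONIC, &before);
		nanosleep(&req, NULL);
		clock_gettime(CLOCK_MONOTONIC, &after);
		took = elapsed_us(&before, &after);
		if (took < *min_us)
			*min_us = took;
		if (took > *max_us)
			*max_us = took;
	}
}

/* Example driver: 1000 sleeps of 3 ms each, as in the quoted description. */
int
run_repro(void)
{
	int64_t min_us, max_us;

	measure_sleep(3000, 1000, &min_us, &max_us);
	printf("requested 3000 us: min %lld us, max %lld us\n",
	    (long long)min_us, (long long)max_us);
	return (0);
}
```

Calling run_repro() from main() gives a standalone test. Running it on an
otherwise idle system and again under load, once per scheduler, would show
whether the maximum stays near the requested interval or stretches toward
the hundreds of milliseconds reported above.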