Date: Thu, 4 Mar 2004 01:38:30 -0800 (PST)
From: Don Lewis <truckman@FreeBSD.org>
To: rwatson@FreeBSD.org
Cc: freebsd-current@FreeBSD.org
Subject: Re: sysctl spinning (was: Re: ps Causes Hard Hang)
Message-ID: <200403040938.i249cU7E003667@gw.catspoiler.org>
In-Reply-To: <Pine.NEB.3.96L.1040303133639.27227D-100000@fledge.watson.org>
On 3 Mar, Robert Watson wrote:
>
> On Wed, 3 Mar 2004, Cy Schubert wrote:
>
>> I'm running 5 -CURRENT systems.  My firewall system, using IPF, hard
>> hangs every time ps is entered -- totally unresponsive, requiring either
>> a power cycle or reset switch to bring it back to life.
>>
>> Before I start digging into this seriously I'd like to possibly get info
>> from anyone who may have experienced this before.
>
> Alan Cox and I have both experienced this -- it's actually only a hard
> hang if you're trying to use the syscons break to debugger, serial break
> to debugger can get into DDB fine.  It looks like the sysctl code is
> spinning in kernel, possibly due to looping waiting for a response other
> than EAGAIN.  I wonder if it was the recent limits on locked memory
> changes in sysctl, although at first we thought it might be the sleepq
> changes (seems less likely now).  Because sysctl holds Giant, the other
> CPUs are locked out of Giant-protected bits of the kernel (many of them),
> including Syscons.

That sounds quite possible, though I would only expect it to happen if
userland passed a large output buffer to the sysctl call.  In the current
implementation, EAGAIN will only be returned when this condition is true:

	if (atop(size) + cnt.v_wire_count > vm_page_max_wired)
		return (EAGAIN);

Hmm, it looks like vm_page_max_wired is dynamically set to one third of
free system memory in vm_pageout().

	/* XXX does not really belong here */
	if (vm_page_max_wired == 0)
		vm_page_max_wired = cnt.v_free_count / 3;

I was under the impression that it was one third of physical memory.

I think there are three problems here:

	vm_page_max_wired is probably the wrong value.

	The sysctl code should not do a tight loop on an EAGAIN error.

	The sysctl handlers that wire memory should actually provide
	estimates of the amount of memory that needs to be wired.

Should the failure to wire the buffer be mapped to a different errno?
There may be cases when it is valid to retry the request.
The code that loops on EAGAIN was added in rev 1.63 of kern_sysctl.c.
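
To make the failure mode concrete, here is a small userland sketch of the
wiring check quoted above, together with one possible way to bound the
retry.  It is not the kernel code: struct vm_counters, the local atop()
macro, the 4 KB page size, init_max_wired(), wire_check() and
wire_with_retry() are all hypothetical stand-ins, and mapping an exhausted
retry to ENOMEM is only an assumption about what a different errno might
look like.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumption: 4 KB pages.  Local stand-in for the kernel's atop(). */
#define SKETCH_PAGE_SHIFT	12
#define atop(x)	((unsigned long)(x) >> SKETCH_PAGE_SHIFT)

/* Hypothetical stand-in for the kernel's page counters (in pages). */
struct vm_counters {
	unsigned long v_wire_count;	/* pages currently wired */
	unsigned long v_free_count;	/* pages currently free */
};

static unsigned long vm_page_max_wired;	/* 0 means "not yet set" */

/* Mirrors the quoted default: one third of *free* pages at that moment. */
static void
init_max_wired(const struct vm_counters *cnt)
{
	if (vm_page_max_wired == 0)
		vm_page_max_wired = cnt->v_free_count / 3;
}

/* Mirrors the quoted condition under which EAGAIN is returned. */
static int
wire_check(const struct vm_counters *cnt, size_t size)
{
	if (atop(size) + cnt->v_wire_count > vm_page_max_wired)
		return (EAGAIN);
	return (0);
}

/*
 * Bounded retry instead of a tight loop.  After max_tries attempts the
 * caller gets a hard error (ENOMEM here, as an assumption) rather than
 * spinning forever on a request that can never fit under the limit.
 */
static int
wire_with_retry(const struct vm_counters *cnt, size_t size, int max_tries)
{
	int error = EAGAIN;

	assert(max_tries > 0);
	for (int i = 0; i < max_tries && error == EAGAIN; i++)
		error = wire_check(cnt, size);
	return (error == EAGAIN ? ENOMEM : error);
}
```

With 3000 free pages when the limit is computed, the limit is 1000 pages.
If 900 pages are already wired, a 50-page buffer passes the check, while a
200-page buffer fails it on every attempt, so an unbounded loop on EAGAIN
would never terminate unless other wired pages were released.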