Date: Thu, 12 Mar 2015 10:39:10 -0400
From: John Baldwin <jhb@freebsd.org>
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: freebsd-stable@freebsd.org, Mark Johnston <markj@freebsd.org>, Nick Frampton <nick.frampton@akips.com>, kib@freebsd.org
Subject: Re: Suspected libkvm infinite loop
Message-ID: <2374792.i316gF0qRo@ralph.baldwin.cx>
In-Reply-To: <20150312104023.GL2379@kib.kiev.ua>
References: <54FE3803.2000307@akips.com> <20150312043407.GA11120@raichu> <20150312104023.GL2379@kib.kiev.ua>

On Thursday, March 12, 2015 12:40:23 PM Konstantin Belousov wrote:
> On Wed, Mar 11, 2015 at 09:34:07PM -0700, Mark Johnston wrote:
> > On Thu, Mar 12, 2015 at 02:05:32PM +1000, Nick Frampton wrote:
> > > On 12/03/15 00:38, John Baldwin wrote:
> > > >>> It sounds like this issue might be the one fixed in r272566: if the
> > > >>> KERN_PROC_ALL sysctl is read with an insufficiently large buffer, an
> > > >>> sbuf error return value could bubble up and be treated as ERESTART,
> > > >>> resulting in a loop.
> > > >>>
> > > >>> This can be confirmed with something like
> > > >>>
> > > >>>   dtrace -n 'syscall:::entry/pid == $target/{@[probefunc] = count();} tick-3s {exit(0);}' -p <pid of looping proc>
> > > >>>
> > > >>> If the output consists solely of __sysctl, this bug is likely the
> > > >>> culprit.
> > > >>
> > > >> Unfortunately, I accidentally killed fstat this morning before I could do any further debug.
> > > >>
> > > >> I ran truss -p on it yesterday and it was spinning solely on __sysctl.
> > > >>
> > > >> I'll try compiling with debug symbols in case it happens again. I haven't been able to reproduce the
> > > >> problem in a reasonable time frame so it could be days or weeks before we see it happen again.
> > > > The truss output is consistent with Mark's suggestion, so I would try
> > > > his suggested fix of 272566.
> > >
> > > I patched the 10.1 kernel with r272566 and it appears to have fixed the issue. Is this patch likely
> > > to be MFCed back to 10-stable?
> >
> > I can't see any reason it shouldn't be, and there was an MFC reminder in
> > the commit log entry for that revision. I've cc'ed kib@, who might have a
> > reason.
>
> The mentioned commit depends on r271976, in fact it depends on the series of
> commits, including r271486 and r271489.
>
> I did not merge r271976 with manual resolution of the conflicts, since it
> means that the work done for HEAD needs to be redone for stable/10 to
> ensure that all cases are covered. Later, when the mentioned series is
> merged, the work should be redone once more.
>
> And to note, r271489 is not trivially mergeable as well, just checked.

You could merge r272566 and just fixup the sbuf_bcat() in export_fd_to_sb() in
kern_descrip.c instead.  I hadn't really considered fo_fill_kinfo to be
something that was mergeable to 10.

-- 
John Baldwin
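
The failure mode discussed in this thread (a raw sbuf error return being read
as ERESTART, so the sysctl restarts forever) can be illustrated with a small
userland sketch. It is not the code from r272566 or from kern_descrip.c. It
assumes that sbuf_bcat() reports failure as -1 and that the kernel's ERESTART
is also -1. The names sketch_buf, sketch_bcat, export_buggy, export_fixed,
dispatch and SKETCH_ERESTART are hypothetical, and the retry loop is capped so
that the example terminates.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Assumption: mirrors the kernel's ERESTART value (-1) under a made-up name. */
#define SKETCH_ERESTART (-1)

/* Hypothetical fixed-size buffer standing in for a kernel sbuf. */
struct sketch_buf {
	char	data[16];
	size_t	len;
};

/* Stand-in for sbuf_bcat(): returns 0 on success, -1 if the data does not fit. */
static int
sketch_bcat(struct sketch_buf *b, const void *src, size_t n)
{
	if (n > sizeof(b->data) - b->len)
		return (-1);
	memcpy(b->data + b->len, src, n);
	b->len += n;
	return (0);
}

/* Buggy pattern: the raw -1 is handed to a caller that expects an errno value. */
static int
export_buggy(struct sketch_buf *b, const void *rec, size_t n)
{
	return (sketch_bcat(b, rec, n));
}

/* Fixed pattern: translate the failure into ENOMEM before returning it. */
static int
export_fixed(struct sketch_buf *b, const void *rec, size_t n)
{
	return (sketch_bcat(b, rec, n) != 0 ? ENOMEM : 0);
}

/*
 * Hypothetical dispatcher: re-runs the handler while it asks for a restart.
 * max_tries exists only so the sketch stops; the reported loop had no cap.
 */
static int
dispatch(int (*handler)(struct sketch_buf *, const void *, size_t),
    size_t n, int max_tries, int *tries)
{
	static const char rec[32];	/* zero-filled dummy record */
	int error;

	*tries = 0;
	do {
		struct sketch_buf b = { .len = 0 };

		error = handler(&b, rec, n);
		(*tries)++;
	} while (error == SKETCH_ERESTART && *tries < max_tries);
	return (error);
}
```

With a 32-byte record and the 16-byte buffer, dispatch(export_buggy, ...) keeps
retrying until the cap is reached, while dispatch(export_fixed, ...) returns
ENOMEM after a single attempt, which is the behaviour a caller sizing its
buffer would expect.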