From: Beat Gätzi
Date: Thu, 20 Jan 2011 01:35:09 +0100
To: Kostik Belousov
Cc: current@freebsd.org
Subject: Re: Running linux ldconfig on tmpfs results in unkillable process
Message-ID: <4D37833D.2050301@chruetertee.ch>
In-Reply-To: <20110119122417.GW2518@deviant.kiev.zoral.com.ua>
References: <4D35A0BB.3010601@chruetertee.ch> <20110118144611.GP2518@deviant.kiev.zoral.com.ua> <4D35B2F2.1000804@chruetertee.ch> <20110118161355.GR2518@deviant.kiev.zoral.com.ua> <4D35C26E.4070108@chruetertee.ch> <20110119122417.GW2518@deviant.kiev.zoral.com.ua>
List-Id: Discussions about the use of FreeBSD-current

On 19.01.2011 13:24, Kostik Belousov wrote:
> On Tue, Jan 18, 2011 at 05:40:14PM +0100, Beat Gätzi wrote:
>> On 18.01.2011 17:13, Kostik Belousov wrote:
>>> On Tue, Jan 18, 2011 at 04:34:10PM +0100, Beat Gätzi wrote:
>>>> On 18.01.2011 15:46, Kostik Belousov wrote:
>>>>> On Tue, Jan 18, 2011 at 03:16:27PM +0100, Beat Gätzi wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I've a tinderbox which uses tmpfs to build ports. Every time I build a
>>>>>> port which executes linux ldconfig it results in an unkillable process
>>>>>> which uses 100% CPU. The problem is reproducible without tinderbox:
>>>>>>
>>>>>> # uname -a
>>>>>> FreeBSD daedalus.network.local 9.0-CURRENT FreeBSD 9.0-CURRENT #3
>>>>>> r216761: Tue Dec 28 15:32:26 CET 2010
>>>>>> root@daedalus.network.local:/usr/obj/usr/src/sys/GENERIC i386
>>>>>> # mkdir /compat/test
>>>>>> # mount -t tmpfs tmpfs /compat/test
>>>>>> # cp -Rp /compat/linux/* /compat/test/
>>>>>> # mount -t linprocfs linprocfs /compat/test/proc
>>>>>> # /compat/linux/sbin/ldconfig -r /compat/test/
>>>>>> # pgrep ldconfig
>>>>>> 3449
>>>>>> # procstat -i 3449 | grep KILL
>>>>>>  3449 ldconfig         KILL     ---
>>>>>> # kill -9 3449
>>>>>> # procstat -i 3449 | grep KILL
>>>>>>  3449 ldconfig         KILL     P--
>>>>>>
>>>>>> From top(1):
>>>>>>   PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
>>>>>>  3449 root        1  44    0   992K   712K CPU1    1  10:06 100.00% ldconfig
>>>>>>
>>>>>> When I reboot the machine it hangs after "All buffers synced.".
>>>>>>
>>>>>> I've uploaded some additional output of procstat and ktrace here:
>>>>>> http://people.freebsd.org/~beat/logs/linux-ldconfig-tmpfs.txt
>>>>>>
>>>>>> Does anyone know how to fix this?
>>>>> kdump output for the trace of a linux binary is garbage. You need to
>>>>> use linux_kdump (from ports).
>>>>>
>>>>> I think that your process is looping in the kernel; you can confirm this
>>>>> by dropping into ddb and doing "bt ".
>>>>
>>>> I've uploaded a screenshot of the output of bt in ddb:
>>>> http://people.freebsd.org/~beat/logs/linux-ldconfig-tmpfs-bt.jpg
>>>
>>> Please try this.
>>>
>>> diff --git a/sys/compat/linux/linux_file.c b/sys/compat/linux/linux_file.c
>>> index 9ff1cf0..44ad193 100644
>>> --- a/sys/compat/linux/linux_file.c
>>> +++ b/sys/compat/linux/linux_file.c
>>> @@ -369,7 +369,6 @@ getdents_common(struct thread *td, struct linux_getdents64_args *args,
>>>  	lbuf = malloc(LINUX_MAXRECLEN, M_TEMP, M_WAITOK | M_ZERO);
>>>  	vn_lock(vp, LK_SHARED | LK_RETRY);
>>>
>>> -again:
>>>  	aiov.iov_base = buf;
>>>  	aiov.iov_len = buflen;
>>>  	auio.uio_iov = &aiov;
>>> @@ -506,8 +505,10 @@ again:
>>>  			break;
>>>  	}
>>>
>>> -	if (outp == (caddr_t)args->dirent)
>>> -		goto again;
>>> +	if (outp == (caddr_t)args->dirent) {
>>> +		nbytes = resid;
>>> +		goto eof;
>>> +	}
>>>
>>>  	fp->f_offset = off;
>>>  	if (justone)
>>> diff --git a/sys/fs/tmpfs/tmpfs_subr.c b/sys/fs/tmpfs/tmpfs_subr.c
>>> index 84a2038..62dd0bf 100644
>>> --- a/sys/fs/tmpfs/tmpfs_subr.c
>>> +++ b/sys/fs/tmpfs/tmpfs_subr.c
>>> @@ -827,9 +827,10 @@ tmpfs_dir_getdents(struct tmpfs_node *node, struct uio *uio, off_t *cntp)
>>>  		/* Copy the new dirent structure into the output buffer and
>>>  		 * advance pointers. */
>>>  		error = uiomove(&d, d.d_reclen, uio);
>>> -
>>> -		(*cntp)++;
>>> -		de = TAILQ_NEXT(de, td_entries);
>>> +		if (error == 0) {
>>> +			(*cntp)++;
>>> +			de = TAILQ_NEXT(de, td_entries);
>>> +		}
>>>  	} while (error == 0 && uio->uio_resid > 0 && de != NULL);
>>>
>>>  	/* Update the offset and cache. */
>>
>> This patch solves the problem.
>>
> Thank you, but apparently this is not the end of the story.
>
> I committed the linuxolator part of the change, but I think that the tmpfs
> change is still incomplete. Strictly following getdirentries(2), tmpfs
> must return EINVAL in the case when no single record can be returned.
> Currently, it indicates EOF instead. I think this could be a complete
> solution, but it might break e.g. Linux ldconfig(8) since it exposed
> the linuxolator situation.
>
> Can you apply the patch below over the latest HEAD with r217578 included
> and retest? Thanks.
>
> diff --git a/sys/fs/tmpfs/tmpfs_subr.c b/sys/fs/tmpfs/tmpfs_subr.c
> index 84a2038..62dd0bf 100644
> --- a/sys/fs/tmpfs/tmpfs_subr.c
> +++ b/sys/fs/tmpfs/tmpfs_subr.c
> @@ -827,9 +827,10 @@ tmpfs_dir_getdents(struct tmpfs_node *node, struct uio *uio, off_t *cntp)
>  		/* Copy the new dirent structure into the output buffer and
>  		 * advance pointers. */
>  		error = uiomove(&d, d.d_reclen, uio);
> -
> -		(*cntp)++;
> -		de = TAILQ_NEXT(de, td_entries);
> +		if (error == 0) {
> +			(*cntp)++;
> +			de = TAILQ_NEXT(de, td_entries);
> +		}
>  	} while (error == 0 && uio->uio_resid > 0 && de != NULL);
>
>  	/* Update the offset and cache. */
> diff --git a/sys/fs/tmpfs/tmpfs_vnops.c b/sys/fs/tmpfs/tmpfs_vnops.c
> index 059a790..a57c1f2 100644
> --- a/sys/fs/tmpfs/tmpfs_vnops.c
> +++ b/sys/fs/tmpfs/tmpfs_vnops.c
> @@ -1349,7 +1349,7 @@ outok:
>  	MPASS(error >= -1);
>
>  	if (error == -1)
> -		error = 0;
> +		error = (cnt != 0) ? 0 : EINVAL;
>
>  	if (eofflag != NULL)
>  		*eofflag =

I've applied the new patch on top of r217615 and was not able to reproduce
the problem.

Thanks again,
Beat
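
For readers following the thread, the behaviour discussed above can be
modelled in user space. The sketch below is only an illustration and is not
the kernel code: toy_dir, toy_readdir and toy_read_all are hypothetical
names, and a directory is reduced to an array of record lengths. It assumes
the semantics described in the quoted mail: a read advances past an entry
only once that entry has been accepted, it returns EINVAL when entries
remain but not even one record fits in the buffer, and the caller treats an
empty result as the end of the read rather than retrying.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical in-memory directory: each entry is just a record length. */
struct toy_dir {
	const size_t	*reclen;	/* record length of each entry, in bytes */
	size_t		 nentries;	/* number of entries */
	size_t		 pos;		/* index of the next entry to return */
};

/*
 * Hand out as many whole records as fit in buflen bytes.
 * Returns 0 on success (a true end of directory gives *cnt == 0) and
 * EINVAL when entries remain but not even one of them fits.
 */
static int
toy_readdir(struct toy_dir *d, size_t buflen, size_t *cnt)
{
	size_t resid = buflen;

	*cnt = 0;
	while (d->pos < d->nentries && d->reclen[d->pos] <= resid) {
		resid -= d->reclen[d->pos];
		d->pos++;	/* advance only after the record was accepted */
		(*cnt)++;
	}
	if (*cnt == 0 && d->pos < d->nentries)
		return (EINVAL);	/* buffer too small for the next record */
	return (0);
}

/*
 * Read the whole directory with a fixed buffer size. An empty, error-free
 * read ends the loop; there is no jump back to retry the same read.
 * Returns the number of entries read, or -1 with *errp set.
 */
static long
toy_read_all(struct toy_dir *d, size_t buflen, int *errp)
{
	long total = 0;
	size_t cnt;

	for (;;) {
		*errp = toy_readdir(d, buflen, &cnt);
		if (*errp != 0)
			return (-1);
		if (cnt == 0)
			return (total);	/* end of directory: stop here */
		total += (long)cnt;
	}
}
```

With record lengths of 16, 24 and 40 bytes, a 64-byte buffer reads all three
entries, while a 32-byte buffer reads the first two and then fails with
EINVAL instead of spinning on the 40-byte record.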