Message-ID: <4FF1311B.1090906@FreeBSD.org>
Date: Mon, 02 Jul 2012 07:26:51 +0200
From: Andreas Tobler
To: Konstantin Belousov
References: <201206210926.q5L9Q6nR002030@svn.freebsd.org>
 <4FF03316.5050609@FreeBSD.org>
 <20120701120408.GM2337@deviant.kiev.zoral.com.ua>
 <4FF0528E.50002@FreeBSD.org>
 <20120701134132.GO2337@deviant.kiev.zoral.com.ua>
 <4FF05724.3050904@FreeBSD.org>
 <20120701170543.GP2337@deviant.kiev.zoral.com.ua>
 <4FF097E5.8030909@FreeBSD.org>
 <20120701214301.GQ2337@deviant.kiev.zoral.com.ua>
 <4FF1263B.20704@FreeBSD.org>
In-Reply-To: <4FF1263B.20704@FreeBSD.org>
Cc: svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org
Subject: Re: svn commit: r237367 - head/sys/fs/nfsclient

On 02.07.12
06:40, Andreas Tobler wrote:
> On 01.07.12 23:43, Konstantin Belousov wrote:
>> On Sun, Jul 01, 2012 at 08:33:09PM +0200, Andreas Tobler wrote:
>>> On 01.07.12 19:05, Konstantin Belousov wrote:
>>>> On Sun, Jul 01, 2012 at 03:56:52PM +0200, Andreas Tobler wrote:
>>>>> On 01.07.12 15:41, Konstantin Belousov wrote:
>>>>>> On Sun, Jul 01, 2012 at 03:37:18PM +0200, Andreas Tobler wrote:
>>>>>>> On 01.07.12 14:04, Konstantin Belousov wrote:
>>>>>>>> On Sun, Jul 01, 2012 at 01:23:02PM +0200, Andreas Tobler wrote:
>>>>>>>>> On 21.06.12 11:26, Konstantin Belousov wrote:
>>>>>>>>>> Author: kib
>>>>>>>>>> Date: Thu Jun 21 09:26:06 2012
>>>>>>>>>> New Revision: 237367
>>>>>>>>>> URL: http://svn.freebsd.org/changeset/base/237367
>>>>>>>>>>
>>>>>>>>>> Log:
>>>>>>>>>> Enable deadlock avoidance code for NFS client.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hm, since this commit I fail with my nfs installworld/kernel.
>>>>>>>>>
>>>>>>>>> I have a builder which installs world/kernel to a nfs mounted directory.
>>>>>>>>> Namely used for cross builds.
>>>>>>>>>
>>>>>>>>> Now since this commit I get the following when I install kernel to the
>>>>>>>>> nfs directory:
>>>>>>>>>
>>>>>>>>> ..
>>>>>>>>> install -o root -g wheel -m 555 zfs.ko.symbols /netboot/sparc64/boot/kernel
>>>>>>>>> install: /netboot/sparc64/boot/kernel/zfs.ko.symbols: No such file or directory
>>>>>>>>> *** [_kmodinstall] Error code 71
>>>>>>>>> ..
>>>>>>>>>
>>>>>>>>> The file is there, a local install of the tree works without problems.
>>>>>>>>> Reverting to r237366 also makes it work again.
>>>>>>>>>
>>>>>>>>> The server is a -CURRENT, r237880, The client, -CURRENT too.
>>>>>>>>>
>>>>>>>>> How can I help to track down the real issue?
>>>>>>>>
>>>>>>>> Is it always the same file in the install procedure which causes the
>>>>>>>> failure ? Even more, is the failure pattern always the same ?
>>>>>>>
>>>>>>> I'd say so yes.
>>>>>>> When installing a kernel onto a nfs mounted fs then
>>>>>>> always (in my cases) the zfs.ko.symbols was the failing pattern.
>>>>>>> I tried ppc64 and sparc64 as target. With both it was the above file.
>>>>>>>
>>>>>>> When doing a installworld, it was, also in both cases, ppc64/sparc64,
>>>>>>> the cc1 in libexec which failed.
>>>>>>>
>>>>>>>> Might be, start with ktrace-ing the whole make invocation, including
>>>>>>>> the children processes.
>>>>>>>
>>>>>>> Some recipes how to start?
>>>>>> ktrace -o -i make installkernel
>>>>>> Then kdump and cut the lines around relevant failure.
>>>>>
>>>>> ktrace -f, right?
>>>> Right, but without -i it is useless.
>>>
>>> Ah, yes, seems clear now after reading the man page.
>>>
>>>>> I placed the whole kdump here:
>>>>>
>>>>> http://people.freebsd.org/~andreast/dumped_installkernel.log
>>>>>
>>>>> It is not clear to me where the failure starts :)
>>>> Because logs do not contain tracepoints from the children.
>>>> See above about -i.
>>>>
>>>> I asked about excerpt because I expect the proper log to have an order
>>>> of magnitude bigger size.
>>>
>>> Ok. The dump is around 100MB, I hope I extracted as much as needed:
>>>
>>> http://people.freebsd.org/~andreast/dumped_installkernel-7.log
>>>
>>>>>>>> I used buildworld on the NFS-mounted obj/ as the test for the changes.
>>>>>>>
>>>>>>> Here the obj is local, only the src and the destination is on the
>>>>>>> nfs/netboot server.
>>>>>>
>>>>>> I just finished build on NFS obj/ and did several rounds of installs
>>>>>> for world and kernel into nfs-mounted destdir. It seems I cannot reproduce
>>>>>> this locally.
>>>>>
>>>>> Ok. I try with an nfs obj too.
>>>
>>> So, I was not able to reproduce the failure with an nfs mounted obj dir.
>>>
>>> But I was able to reproduce the failure with three different machines
>>> which all have the obj local and the destination mounted via nfs.
>>>
>>> Are you able to try with a local obj too?
>> Below are two patches.
>> Please follow my instructions literally to get
>> most of your bug report.
>>
>> First, please apply the usr.bin/xinstall patch only, and retry installkernel
>> (no need to use ktrace). It should show the proper error, short write, with
>> zero-sized result, instead of garbage ENOENT from errno.
>
> Done. No expected output. Iow, no message containing this:
>
> short write to %s: %jd bytes written, %jd bytes asked to write

Sorry for the confusion, I learned that I have to patch install from the
buildworld and not only the host one:

===> zfs (install)
install -o root -g wheel -m 555 zfs.ko /netboot/powerpc64/boot/kernel
install -o root -g wheel -m 555 zfs.ko.symbols /netboot/powerpc64/boot/kernel
install: short write to /netboot/powerpc64/boot/kernel/zfs.ko.symbols: 0 bytes written, 6959816 bytes asked to write
*** [_kmodinstall] Error code 71

>> Next, please apply the sys/fs/nfsclient patch, which should fix the core
>> cause.

And it really helps if I install the fresh built kernel to the right place :(

===> zfs (install)
install -o root -g wheel -m 555 zfs.ko /netboot/powerpc64/boot/kernel
install -o root -g wheel -m 555 zfs.ko.symbols /netboot/powerpc64/boot/kernel
===> zlib (install)
install -o root -g wheel -m 555 zlib.ko /netboot/powerpc64/boot/kernel
install -o root -g wheel -m 555 zlib.ko.symbols /netboot/powerpc64/boot/kernel
kldxref /netboot/powerpc64/boot/kernel

Thank you very much for the patience and the solution!

Andreas
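
For readers following the thread: the xinstall change discussed above
reports a short write by its byte counts instead of printing whatever
errno happens to hold (the misleading ENOENT in the first report). The
actual patch is not shown in this message, so the following is only a
minimal Python sketch of that idea, not the xinstall C code. The names
write_exact, ShortWriteError and the writer parameter are hypothetical;
only the message format is taken from the thread.

```python
import os


class ShortWriteError(Exception):
    """write() returned without error but stored fewer bytes than requested."""


def write_exact(fd, data, name, writer=os.write):
    # A real failure surfaces as OSError raised by the writer itself;
    # only in that case does an errno value describe what went wrong.
    written = writer(fd, data)
    if written != len(data):
        # Short write: report the byte counts rather than a stale errno.
        raise ShortWriteError(
            "short write to %s: %d bytes written, %d bytes asked to write"
            % (name, written, len(data))
        )
    return written
```

The writer parameter exists only so the short-write path can be
exercised without a misbehaving filesystem; by default the function
calls os.write directly.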