From owner-svn-src-all@freebsd.org Fri Nov 18 10:37:40 2016
Delivered-To: svn-src-all@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
	by mailman.ysv.freebsd.org (Postfix) with ESMTP id 73F34C4802A;
	Fri, 18 Nov 2016 10:37:40 +0000 (UTC)
	(envelope-from kostikbel@gmail.com)
Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1])
	(using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits))
	(Client did not present a certificate)
	by mx1.freebsd.org (Postfix) with ESMTPS id 01D2E89F;
	Fri, 18 Nov 2016 10:37:39 +0000 (UTC)
	(envelope-from kostikbel@gmail.com)
Received: from tom.home (kib@localhost [127.0.0.1])
	by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id uAIAbSvw045790
	(version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO);
	Fri, 18 Nov 2016 12:37:28 +0200 (EET)
	(envelope-from kostikbel@gmail.com)
DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua uAIAbSvw045790
Received: (from kostik@localhost)
	by tom.home (8.15.2/8.15.2/Submit) id uAIAbSIe045789;
	Fri, 18 Nov 2016 12:37:28 +0200 (EET)
	(envelope-from kostikbel@gmail.com)
X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f
Date: Fri, 18 Nov 2016 12:37:28 +0200
From: Konstantin Belousov
To: Ruslan Bukin
Cc: Alan Cox, Alan Cox, src-committers@freebsd.org,
	svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: Re: svn commit: r308691 - in head/sys: cddl/compat/opensolaris/sys
	cddl/contrib/opensolaris/uts/common/fs/zfs fs/tmpfs kern vm
Message-ID: <20161118103728.GE54029@kib.kiev.ua>
References: <201611151822.uAFIMoj2092581@repo.freebsd.org>
	<20161116133718.GA10251@bsdpad.com>
	<20161116165343.GX54029@kib.kiev.ua>
	<20161116165939.GA12498@bsdpad.com>
	<20161116175210.GA13203@bsdpad.com>
	<9047aad0-0713-5d7a-f92e-6f931642bb27@rice.edu>
	<20161118102235.GA40554@bsdpad.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20161118102235.GA40554@bsdpad.com>
User-Agent: Mutt/1.7.1 (2016-10-04)
X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00,
	DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no
	autolearn_force=no version=3.4.1
X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home
X-BeenThere: svn-src-all@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: "SVN commit messages for the entire src tree \(except for \"user\" and \"projects\"\)"
X-List-Received-Date: Fri, 18 Nov 2016 10:37:40 -0000

On Fri, Nov 18, 2016 at 10:22:35AM +0000, Ruslan Bukin wrote:
> On Thu, Nov 17, 2016 at 10:51:40AM -0600, Alan Cox wrote:
> > On 11/16/2016 11:52, Ruslan Bukin wrote:
> > > On Wed, Nov 16, 2016 at 04:59:39PM +0000, Ruslan Bukin wrote:
> > >> On Wed, Nov 16, 2016 at 06:53:43PM +0200, Konstantin Belousov wrote:
> > >>> On Wed, Nov 16, 2016 at 01:37:18PM +0000, Ruslan Bukin wrote:
> > >>>> I have a panic with this on RISC-V. Any ideas ?
> > >>> How did you checked that the revision you replied to, makes the problem ?
> > >>> Note that the backtrace below is not reasonable.
> > >> I reverted this commit like that and rebuilt kernel:
> > >> git show 2fa36073055134deb2df39c7ca46264cfc313d77 | patch -p1 -R
> > >>
> > >> So the problem is reproducible on dual-core with 32mb mdroot.
> > >>
> > > I just found another interesting behavior:
> > > depending on amount of physical memory :
> > > 700m - panic
> > > 800m - works fine
> > > 1024m - panic
> >
> > I think that this behavior is not inconsistent with your report of the
> > system crashing if you enabled two cores but not one. Specifically,
> > changing the number of active cores will slightly affect the amount of
> > memory that is allocated during initialization.
> >
> > There is nothing unusual in the sysctl output that you sent out.
> >
> > I have two suggestions. Try these in order.
> >
> > 1. r308691 reduced the size of struct vm_object. Try undoing the one
> > snippet that reduced the vm object size and see if that makes a difference.
> >
> >
> > @@ -118,7 +118,6 @@
> >  	vm_ooffset_t backing_object_offset;/* Offset in backing object */
> >  	TAILQ_ENTRY(vm_object) pager_object_list; /* list of all objects of this pager type */
> >  	LIST_HEAD(, vm_reserv) rvq; /* list of reservations */
> > -	struct vm_radix cache; /* (o + f) root of the cache page radix trie */
> >  	void *handle;
> >  	union {
> >  		/*
> >
> >
> > 2. I'd like to know if vm_page_scan_contig() is being called.
> >
> > Finally, to simply the situation a little, I would suggest that you
> > disable superpage reservations in vmparam.h. You have no need for them.
> >
> >
>
> I made another one merge from svn-head and problem disappeared for 700m,1024m of physical memory, but now I able to reproduce it with 900m of physical memory.
>
> Restoring 'struct vm_radix cache' in struct vm_object gives no behavior changes.
>
> Adding a panic() call to vm_page_scan_contig gives an original panic (so vm_page_scan_contig is not called),
> it looks like size of function is changed and it unhides the original problem.
>
> Disable superpage reservations changes behavior and gives same panic on 1024m boot.
>
> Finally, if I comment ruxagg call in kern_resource then I can't reproduce the problem any more with any amount of memory in any setup:
>
> --- a/sys/kern/kern_resource.c
> +++ b/sys/kern/kern_resource.c
> @@ -1063,7 +1063,7 @@ rufetch(struct proc *p, struct rusage *ru)
>  	*ru = p->p_ru;
>  	if (p->p_numthreads > 0) {
>  		FOREACH_THREAD_IN_PROC(p, td) {
> -			ruxagg(p, td);
> +			//ruxagg(p, td);
>  			rucollect(ru, &td->td_ru);
>  		}
>  	}
>
> I found this patch in my early RISC-V development directory, so it looks the problem persist whole the freebsd/riscv life, but was hidden until now.
>
If you comment out the rufetch() call in proc0_post(), does the problem go away as well ?
I suggest starting by fixing the backtrace anyway, because the backtrace you posted is wrong.