Date: Tue, 13 Aug 2013 17:47:34 -1000 (HST)
From: Jeff Roberson
To: Jim Harris
Cc: Alan Cox, Attilio Rao, FreeBSD current, Konstantin Belousov, Colin Percival
Subject: Re: panic: UMA: Increase vm.boot_pages with 32 CPUs
List-Id: Discussions about the use of FreeBSD-current

On Tue, 13 Aug 2013, Jim Harris wrote:

> On Tue, Aug 13, 2013 at 3:05 PM, Jeff Roberson wrote:
> > On Mon, 12 Aug 2013, Colin Percival wrote:
> > > Hi all,
> > >
> > > A HEAD@254238 kernel fails to boot in EC2 with
> > >     panic: UMA: Increase vm.boot_pages
> > > on 32-CPU instances.  Instances with up to 16 CPUs boot fine.
> > >
> > > I know there has been some mucking about with VM recently -- anyone
> > > want to claim this, or should I start doing a binary search?
> >
> > It's not any one commit really, just creeping demand for more pages
> > before the VM can get started.  I would suggest making boot pages
> > scale with MAXCPU.  Or just raising it as the panic suggests.  We
> > could rewrite the way that the vm gets these early pages but it's a
> > lot of work and typically people just bump it and forget about it.
>
> I ran into this problem today when enabling hyperthreading on my
> dual-socket Xeon E5 system.
>
> It looks like r254025 is actually the culprit.  Specifically, the new
> mallocinit()/kmeminit() now invoke the new vmem_init() before
> uma_startup2(), which allocates 16 zones out of the boot pages if I am
> reading this correctly.  This is all done before uma_startup2() is called,
> triggering the panic.

I just disabled the quantum caches in vmem which allocate those 16 zones.
This may alleviate the problem for now.

Thanks,
Jeff

> Anything less than 28 CPUs, and the zone size (uma_zone + uma_cache *
> (mp_maxid + 1)) is <= PAGE_SIZE and we can successfully boot.  So at 32
> CPUs, we need two boot pages per zone which consumes more than the default
> 64 boot pages.  The size of these structures does not appear to have
> materially changed any time recently.
>
> Scaling with MAXCPU seems to be an OK solution, but should it be based
> directly on the size of (uma_zone + uma_cache * MAXCPU)?  I am not very
> familiar with uma startup, but it seems like these zones are the primary
> consumers of the boot pages, so the UMA_BOOT_PAGES default should be based
> directly on that size.
>
> Regards,
>
> -Jim
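
For anyone following the arithmetic in Jim's message, here is a rough sketch
of the zone-size calculation (uma_zone + uma_cache * (mp_maxid + 1)) and the
resulting boot-page count.  The structure sizes below are NOT the real
sizeof values; they are placeholders picked only so that 27 CPUs fit in one
4 KiB page and 28 do not, matching the threshold Jim reports.  The function
and macro names are hypothetical as well, and the sketch assumes the CPU
count equals mp_maxid + 1.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Placeholder sizes (assumptions, not taken from the kernel): chosen only
 * so that the one-page limit falls between 27 and 28 CPUs.
 */
#define EX_PAGE_SIZE   4096u  /* assumed page size */
#define EX_ZONE_HDR    640u   /* stand-in for the fixed part of uma_zone */
#define EX_CACHE_SIZE  128u   /* stand-in for one per-CPU uma_cache */

/* Bytes for one zone: fixed header plus one cache per CPU slot. */
static size_t
zone_bytes(size_t zone_hdr, size_t cache_size, unsigned ncpus)
{
	return zone_hdr + cache_size * ncpus;
}

/* Whole pages needed to hold one zone, rounding up. */
static size_t
pages_per_zone(size_t zone_hdr, size_t cache_size, unsigned ncpus,
    size_t page_size)
{
	assert(page_size > 0);
	return (zone_bytes(zone_hdr, cache_size, ncpus) + page_size - 1) /
	    page_size;
}

/* Boot pages consumed by nzones zones of identical size. */
static size_t
boot_pages_for_zones(unsigned nzones, size_t zone_hdr, size_t cache_size,
    unsigned ncpus, size_t page_size)
{
	return (size_t)nzones *
	    pages_per_zone(zone_hdr, cache_size, ncpus, page_size);
}
```

With these placeholder sizes, pages_per_zone() gives one page at 27 CPUs and
two pages at 32 CPUs, so the 16 vmem zones alone would take 32 boot pages
instead of 16.  That doubling is the kind of jump that could exhaust the
default of 64 once the other early consumers are counted.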