From: Jim Harris
To: Jeff Roberson
Cc: Alan Cox, Attilio Rao, FreeBSD current, Konstantin Belousov, Colin Percival
Date: Tue, 13 Aug 2013 18:03:59 -0700
Subject: Re: panic: UMA: Increase vm.boot_pages with 32 CPUs
References: <5208A488.2050603@freebsd.org>

On Tue, Aug 13, 2013 at 3:05 PM, Jeff Roberson wrote:

> On Mon, 12 Aug 2013, Colin Percival wrote:
>
>> Hi all,
>>
>> A HEAD@254238 kernel fails to boot in EC2 with
>>
>>> panic: UMA: Increase vm.boot_pages
>>>
>> on 32-CPU instances. Instances with up to 16 CPUs boot fine.
>>
>> I know there has been some mucking about with VM recently -- anyone want
>> to claim this, or should I start doing a binary search?
>>
>
> It's not any one commit really, just creeping demand for more pages before
> the VM can get started. I would suggest making boot pages scale with
> MAXCPU. Or just raising it as the panic suggests. We could rewrite the
> way that the vm gets these early pages but it's a lot of work and typically
> people just bump it and forget about it.
>

I ran into this problem today when enabling hyperthreading on my
dual-socket Xeon E5 system.

It looks like r254025 is actually the culprit. Specifically, the new
mallocinit()/kmeminit() now invoke the new vmem_init(), which, if I am
reading this correctly, allocates 16 zones out of the boot pages. This is
all done before uma_startup2() is called, triggering the panic.

With fewer than 28 CPUs, the zone size
(uma_zone + uma_cache * (mp_maxid + 1)) is <= PAGE_SIZE and we can boot
successfully. At 32 CPUs, we need two boot pages per zone, which consumes
more than the default 64 boot pages. The sizes of these structures do not
appear to have changed materially any time recently.

Scaling with MAXCPU seems to be an OK solution, but should it be based
directly on the size of (uma_zone + uma_cache * MAXCPU)?
I am not very familiar with UMA startup, but it seems like these zones
are the primary consumers of the boot pages, so the UMA_BOOT_PAGES
default should be based directly on that size.

Regards,

-Jim
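
The page arithmetic described in this message can be sketched as a short calculation. This is only an illustration: the structure sizes below are hypothetical values chosen so that the 28-CPU threshold from the message is reproduced, not the real sizeof() results, and the 4096-byte page size and the helper names are assumptions as well.

```python
PAGE_SIZE = 4096      # assumed page size in bytes
UMA_BOOT_PAGES = 64   # default boot page count cited in the message
VMEM_ZONES = 16       # zones created by vmem_init(), per the message

# HYPOTHETICAL structure sizes, picked only so that a zone stops fitting
# in one page at 28 CPUs; they are NOT the real kernel structure sizes.
UMA_ZONE_SIZE = 640
UMA_CACHE_SIZE = 128


def zone_size(ncpus):
    """Bytes for one zone: uma_zone + uma_cache * (mp_maxid + 1)."""
    return UMA_ZONE_SIZE + UMA_CACHE_SIZE * ncpus


def pages_per_zone(ncpus):
    """Whole pages needed to hold one zone (integer round-up)."""
    return (zone_size(ncpus) + PAGE_SIZE - 1) // PAGE_SIZE


def vmem_boot_pages(ncpus):
    """Boot pages taken by the vmem zones alone."""
    return VMEM_ZONES * pages_per_zone(ncpus)


if __name__ == "__main__":
    for ncpus in (16, 27, 28, 32):
        print(ncpus, zone_size(ncpus), pages_per_zone(ncpus),
              vmem_boot_pages(ncpus))
```

Under these assumed sizes, each zone fits in one page up to 27 CPUs and needs two pages from 28 CPUs on, so the 16 vmem zones go from 16 boot pages to 32. The sketch does not model any other consumers of the boot pages, so it does not by itself show where the 64-page default is exceeded.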