From owner-freebsd-arch@FreeBSD.ORG Sat Jul 9 05:58:40 2011 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 426E1106564A; Sat, 9 Jul 2011 05:58:40 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail09.syd.optusnet.com.au (mail09.syd.optusnet.com.au [211.29.132.190]) by mx1.freebsd.org (Postfix) with ESMTP id B695E8FC0C; Sat, 9 Jul 2011 05:58:39 +0000 (UTC) Received: from c122-106-165-191.carlnfd1.nsw.optusnet.com.au (c122-106-165-191.carlnfd1.nsw.optusnet.com.au [122.106.165.191]) by mail09.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id p695wYdx021085 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 9 Jul 2011 15:58:37 +1000 Date: Sat, 9 Jul 2011 15:58:34 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Peter Wemm In-Reply-To: Message-ID: <20110709151800.Q1090@besplex.bde.org> References: <20110708164844.GZ48734@deviant.kiev.zoral.com.ua> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Kostik Belousov , Attilio Rao , Sergey Kandaurov , freebsd-arch@freebsd.org Subject: Re: [PATCH] Add MAXCPU as a kernel config option and quality discussion on this X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 09 Jul 2011 05:58:40 -0000 On Fri, 8 Jul 2011, Peter Wemm wrote: > On Fri, Jul 8, 2011 at 9:48 AM, Kostik Belousov wrote: >> On Fri, Jul 08, 2011 at 05:37:17PM +0200, Attilio Rao wrote: >>> ... >>> What are your ideas on that? Do you think that including opt_maxcpu.h >>> would be acceptable for the time being? >> >> I vote for putting MAXCPU in opt_global.h. > > That isn't right either. Nothing will work :-). > ... > A more correct solution would be something vaguely along these lines: > > - remove the #ifndef MAXCPU / #define MAXCPU 32 / #endif from the include files > - put "options MAXCPU=32" into the machine/conf/DEFAULTS files > - put MAXCPU into opt_maxcpu.h > - replace code constructs in include files that will fail with things > that will safely fail: > > eg: sys/pcpu.h: > #define PCPU_NAME_LEN (sizeof("CPU ") + sizeof(__XSTRING(MAXCPU) + 1)) > > becomes.. > > #ifdef MAXCPU > #define PCPU_NAME_LEN (sizeof("CPU ") + sizeof(__XSTRING(MAXCPU) + 1)) > #endif That won't work at all, since PCPU_NAME_LEN is only used (in sys/pcpu.h) in a critical struct declaration: % #define PCPU_NAME_LEN (sizeof("CPU ") + sizeof(__XSTRING(MAXCPU) + 1)) % % struct pcpu { % struct thread *pc_curthread; /* Current thread */ % ... % #ifdef KTR % char pc_name[PCPU_NAME_LEN]; /* String name for KTR */ % #endif % } __aligned(CACHE_LINE_SIZE); The KTR ifdef is another source of errors. It is defined in opt_global.h. Modules that are actually modular cannot depend on options, so KTR is never defined for modules. Thus all modules are that use pcpu are likely to be broken if the kernel is compiled with KTR. We often avoid this problem by declaring fields in structs without ifdefs. But if MAXCPU is an option, even declaring the fields depends on including the options header. The above ifdef will mainly detect the dependency on the options header and show that it is nearely 100% global. There is massive namespace pollution involving sys/pcpu.h. It is included nested in even more critical headers like mutex.h and proc.h :-(. > and > extern struct pcpu *cpuid_to_pcpu[MAXCPU]; > should have been: > extern struct pcpu *cpuid_to_pcpu[]; As you noticed later, at least CPU_SETSIZE depends on MAXCPU in a more global way than this. Avoiding this might take significant changes to the API. There can be no structs or arrays with fixed sizes, or APIs that pass pointers to such. > or amd64/include/intr_machdep.h: > #define INTRCNT_COUNT (1 + NUM_IO_INTS * 2 + (1 + 7) * MAXCPU) > becomes > #ifdef MAXCPU > #define INTRCNT_COUNT (1 + NUM_IO_INTS * 2 + (1 + 7) * MAXCPU) > #endif Works better because it is localized, and also because the userland API never involve fixed sizes (it uses either kmem pointer to the start and end, with the end pointer constructed from statically configured sizes, or the sysctl interface which essentially involves dynamic arrays). I think recent changes for cpusets involve using dynamic arrays for the userland API. This might be OK for the kernel too. We have some experience expanding the static array limit for fd sets. FD_SETSIZE = 1024 once seemed large enough for anyone, and it still isn't small enough to be efficient as possible when only a few fd's are used, so APIs and implementations got mainly negative benefits from it having a fixed size even when that size was large enough. E.g., select() has an nfds count so that the whole array doesn't always have to be looked at every time, but the whole array still has to be allocated. Bruce