From owner-freebsd-current@FreeBSD.ORG Mon Jan 5 14:08:04 2015
Message-ID: <54AA9AF1.5020807@selasky.org>
Date: Mon, 05 Jan 2015 15:08:49 +0100
From: Hans Petter Selasky
To: Konstantin Belousov
Subject: Re: [RFC] Start SMP subsystem earlier
References: <54AA8F19.9030300@selasky.org> <20150105134316.GE42409@kib.kiev.ua>
In-Reply-To: <20150105134316.GE42409@kib.kiev.ua>
Cc: markb@mellanox.com, FreeBSD Current
List-Id: Discussions about the use of FreeBSD-current

On 01/05/15 14:43, Konstantin Belousov wrote:
> On Mon, Jan 05, 2015 at 02:18:17PM +0100, Hans Petter Selasky wrote:
>> Hi,
>>
>> There is a limitation on the number of interrupt vectors available when
>> only a single processor is running. To have more interrupts available we
>> need to start SMP earlier when building a monolithic kernel and not
>> loading drivers as modules. The driver in question is a network driver
>> and because it cannot be started after SI_SUB_ROOT_CONF due to PXE
>> support I see no other option than to move SI_SUB_SMP earlier.
>>
>> Suggested patch:
>>
>>> Index: sys/kernel.h
>>> ===================================================================
>>> --- sys/kernel.h (revision 276691)
>>> +++ sys/kernel.h (working copy)
>>> @@ -152,6 +152,7 @@
>>>      SI_SUB_KPROF            = 0x9000000, /* kernel profiling*/
>>>      SI_SUB_KICK_SCHEDULER   = 0xa000000, /* start the timeout events*/
>>>      SI_SUB_INT_CONFIG_HOOKS = 0xa800000, /* Interrupts enabled config */
>>> +    SI_SUB_SMP              = 0xa850000, /* start the APs*/
>>>      SI_SUB_ROOT_CONF        = 0xb000000, /* Find root devices */
>>>      SI_SUB_DUMP_CONF        = 0xb200000, /* Find dump devices */
>>>      SI_SUB_RAID             = 0xb380000, /* Configure GEOM classes */
>>> @@ -165,7 +166,6 @@
>>>      SI_SUB_KTHREAD_BUF      = 0xea00000, /* buffer daemon*/
>>>      SI_SUB_KTHREAD_UPDATE   = 0xec00000, /* update daemon*/
>>>      SI_SUB_KTHREAD_IDLE     = 0xee00000, /* idle procs*/
>>> -    SI_SUB_SMP              = 0xf000000, /* start the APs*/
>>>      SI_SUB_RACCTD           = 0xf100000, /* start racctd*/
>>>      SI_SUB_LAST             = 0xfffffff  /* final initialization */
>>>  };
> Did you inspect all reordered sysinit routines and ensure that the
> reordering is safe? At the very least, SUB_SMP starts event timers,
> while KTHREAD_IDLE is about configuring some hardware which might
> be required/not ready for that.

Hi,

I have not yet inspected everything affected by this change myself.
That's why I'm sending this e-mail out. The problem is simply that the
total number of interrupts appears to be limited by "APIC_NUM_IOINTS"
and "NUM_IO_INTS", which are per CPU from what I understand.
Until SMP is activated, the newbus code simply distributes the IRQ
vectors over the available IRQs; once SMP is up, it re-shuffles them
all. I was initially thinking that a hack might be possible, like using
RF_SHARED for the IRQ resource, but then noticed that we are using MSI
interrupts, which are not allocated in the same manner. The other issue
is that the IRQs need to be functional too, so that PXE boot can work.

--HPS

>
>>
>> This fixes a problem for Mellanox drivers in the OFED layer. Possibly we
>> need to move SMP even earlier to not miss the generic FreeBSD PCI
>> device enumeration, or maybe this is not possible. Does anyone know how
>> early we can start SMP?
>
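
To make the reordering concern concrete, here is a minimal user-space
sketch of what the proposed value change does to startup order. It
assumes only that sysinit subsystems run in ascending order of their
ID; the struct, the helper names (build_order, order_of) and the
SMP_OLD/SMP_NEW macros are hypothetical and are not part of the kernel.
The numeric IDs are copied from the patch quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of one subsystem slot; not a kernel structure. */
struct sub_entry {
	const char  *name;
	unsigned int id;
};

#define SUB_COUNT 5
#define SMP_OLD   0xf000000u	/* SI_SUB_SMP before the patch */
#define SMP_NEW   0xa850000u	/* SI_SUB_SMP after the patch */

/* Compare two entries by subsystem ID, ascending. */
int
sub_cmp(const void *a, const void *b)
{
	unsigned int x = ((const struct sub_entry *)a)->id;
	unsigned int y = ((const struct sub_entry *)b)->id;

	return ((x > y) - (x < y));
}

/*
 * Fill tab (room for SUB_COUNT entries) with a few subsystems taken
 * from the patch context, using smp_id for SI_SUB_SMP, then sort the
 * table into the order in which the subsystems would run.
 */
void
build_order(struct sub_entry *tab, unsigned int smp_id)
{
	const struct sub_entry init[SUB_COUNT] = {
		{ "SI_SUB_INT_CONFIG_HOOKS", 0xa800000u },
		{ "SI_SUB_ROOT_CONF",        0xb000000u },
		{ "SI_SUB_KTHREAD_IDLE",     0xee00000u },
		{ "SI_SUB_SMP",              smp_id     },
		{ "SI_SUB_RACCTD",           0xf100000u },
	};

	assert(tab != NULL);
	memcpy(tab, init, sizeof(init));
	qsort(tab, SUB_COUNT, sizeof(tab[0]), sub_cmp);
}

/* Return the position of name in the sorted table, or -1 if absent. */
int
order_of(const struct sub_entry *tab, const char *name)
{
	for (int i = 0; i < SUB_COUNT; i++) {
		if (strcmp(tab[i].name, name) == 0)
			return (i);
	}
	return (-1);
}
```

Calling build_order() once with SMP_OLD and once with SMP_NEW shows
SI_SUB_SMP moving from after SI_SUB_KTHREAD_IDLE to just after
SI_SUB_INT_CONFIG_HOOKS. Every subsystem between those two positions is
one whose sysinit routines would now run after the APs are started,
which is the set that would need the inspection asked about above.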