From owner-freebsd-current@freebsd.org Sun Nov 11 21:14:50 2018
Date: Sun, 11 Nov 2018 23:14:34 +0200
From: Konstantin Belousov
To: Guido Falsi
Cc: freebsd-current@freebsd.org
Subject: Re: 13.0 failing to boot multiuser on one PC due to system utilities crashing during rc scipt
Message-ID: <20181111211434.GS2378@kib.kiev.ua>
References: <62bdb5ff-4d68-cf52-4dd5-f0a3cfa1c788@madpilot.net>
 <791e3488-b838-5cfd-8dca-8db8c74167a0@madpilot.net>
 <20181110230744.GN2378@kib.kiev.ua>
 <5176caee-126f-2709-d09a-0dcf5190e319@madpilot.net>
List-Id: Discussions about the use of FreeBSD-current

On Sun, Nov 11, 2018 at 08:44:24PM +0100, Guido Falsi wrote:
> On 11/11/18 11:10, Guido Falsi wrote:
> > On 11/11/18 00:07, Konstantin Belousov wrote:
> >> On Sat, Nov 10, 2018 at 05:27:09PM +0100, Guido Falsi wrote:
> >>> On 10/11/18 13:08, Guido Falsi wrote:
> >>>> I'll to bisect things, but it will be a slow process.
> >>>
> >>> I narrowed it down to r339895.
> >> I somehow doubt that this is the case.
> >>
> >
> > I did not mean to accuse you. Instead thanks for this reply and the
> > suggestions. Really appreciated.
> >
> > I simply found out that removing that commit from my sources gives me a
> > stable system and reported such finding.
> >
> > I understand that the actual cause could be an interaction with other
> > code and am ready to review my findings.
> >
> >> If you take post-r339895 kernel and start e.g. 11.2-RELEASE userspace
> >> (untar the installation into jail to avoid reinstallation), does it
> >> still demonstrate the behaviour ?
> >>
> >> Also try to run pre-r339895 with the 12.0 userspace from e.g. 12.0-BETA4
> >> builds.
> >
> > I'll perform such tests. Please allow me some time to report back what I
> > get.
>
> I performed these tests. I downloaded the 12.0-BETA4 and 11.2
> installation images and replaced the kernels in there. This was faster
> than working with jails on a crippled system.
>
> r339895 kernel on 11.2-RELEASE causes fsck (launched by rc) to dump core
> and this stops the boot procedure.
>
> r339894 kernel on 12.0-BETA4 works fine.

OK, let's try to find the reason.

- When you build your kernels, you do not use any CPU-specific
  optimization flags, do you? Moreover, you follow the standard build
  procedure and your make.conf and src.conf are empty, right?
- Do you preload a microcode update from the loader?
- Show the output of sysctl vm.pmap.
- Show the verbose dmesg from the boot of the problematic kernel.
  You posted a non-verbose dmesg for 12.0-BETA4.
- Enter ddb when booted into the problematic kernel. Do
	db> x/x cpu_stdext_feature
	db> x/x cpu_stdext_feature+4
- From the same ddb session, disassemble e.g. cpu_set_user_tls().
  You could paste me the whole disassembly, but really I only want the
  single line with the call to set_pcb_flagsXXXX; it should be either
  set_pcb_flags_raw or set_pcb_flags_fsgsbase. To disassemble in ddb, do
	db> x/i cpu_set_user_tls
  and then press more to get next and next instructions.
  (I want the disassembly from ddb and not from gdb/kgdb.)
- Try the following patch.

diff --git a/sys/amd64/amd64/machdep.c b/sys/amd64/amd64/machdep.c
index 6e36ae97523..8dafd4b4756 100644
--- a/sys/amd64/amd64/machdep.c
+++ b/sys/amd64/amd64/machdep.c
@@ -2627,8 +2627,8 @@ set_pcb_flags_raw(struct pcb *pcb, const u_int flags)
  * the PCB_FULL_IRET flag is set.  We disable interrupts to sync with
  * context switches.
  */
-static void
-set_pcb_flags_fsgsbase(struct pcb *pcb, const u_int flags)
+void
+set_pcb_flags(struct pcb *pcb, const u_int flags)
 {
 	register_t r;
 
@@ -2649,13 +2649,6 @@ set_pcb_flags_fsgsbase(struct pcb *pcb, const u_int flags)
 	}
 }
 
-DEFINE_IFUNC(, void, set_pcb_flags, (struct pcb *, const u_int), static)
-{
-
-	return ((cpu_stdext_feature & CPUID_STDEXT_FSGSBASE) != 0 ?
-	    set_pcb_flags_fsgsbase : set_pcb_flags_raw);
-}
-
 void
 clear_pcb_flags(struct pcb *pcb, const u_int flags)
 {
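
For cross-checking the two ddb outputs requested above, here is a minimal userland sketch of the selection rule that the removed resolver applies. It is only an illustration, not kernel code: STDEXT_FSGSBASE_BIT is assumed to mirror CPUID_STDEXT_FSGSBASE (CPUID leaf 7, EBX bit 0), and the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: same value as the kernel's CPUID_STDEXT_FSGSBASE. */
#define STDEXT_FSGSBASE_BIT	0x00000001u

static_assert(STDEXT_FSGSBASE_BIT == 1u, "FSGSBASE is expected in bit 0");

/* Hypothetical labels for the two candidates named in the mail. */
enum pcb_flags_variant {
	VARIANT_RAW,		/* set_pcb_flags_raw */
	VARIANT_FSGSBASE	/* set_pcb_flags_fsgsbase */
};

/*
 * Apply the same bit test as the resolver in the patch, but to a value
 * copied by hand from "x/x cpu_stdext_feature".
 */
enum pcb_flags_variant
expected_variant(uint32_t stdext_feature)
{
	return ((stdext_feature & STDEXT_FSGSBASE_BIT) != 0 ?
	    VARIANT_FSGSBASE : VARIANT_RAW);
}

/* Name of the symbol the call in cpu_set_user_tls() should then target. */
const char *
variant_name(enum pcb_flags_variant v)
{
	return (v == VARIANT_FSGSBASE ?
	    "set_pcb_flags_fsgsbase" : "set_pcb_flags_raw");
}
```

Feed the word printed by x/x cpu_stdext_feature to expected_variant() and compare variant_name() of the result with the call target seen in the x/i cpu_set_user_tls disassembly. A mismatch between the two would suggest that the ifunc was resolved against a different feature word than the one the running kernel reports.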