Date: Thu, 26 Sep 2019 23:33:49 +0300
From: Konstantin Belousov
To: Alan Somers
Cc: FreeBSD CURRENT
Subject: Re: panic: Unregistered use of FPU in kernel
Message-ID: <20190926203349.GK44691@kib.kiev.ua>
References: <20190926170241.GG44691@kib.kiev.ua> <20190926172924.GH44691@kib.kiev.ua>

On Thu, Sep 26, 2019 at 02:12:21PM -0600, Alan Somers wrote:
> On Thu, Sep 26, 2019 at 11:29 AM Konstantin Belousov wrote:
>
> > On Thu, Sep 26, 2019 at 11:20:51AM -0600, Alan Somers wrote:
> > > On Thu, Sep 26, 2019 at 11:02 AM Konstantin Belousov <kostikbel@gmail.com> wrote:
> > >
> > > > On Thu, Sep 26, 2019 at 09:45:43AM -0600, Alan Somers wrote:
> > > > > The latest VM snapshot (FreeBSD-13.0-CURRENT-amd64-20190920-r352544.qcow2)
> > > > > instapanics on boot:
> > > > >
> > > > > panic: Unregistered use of FPU in kernel
> > > > >
> > > > > stack trace:
> > > > > ...
> > > > > sse42_crc32c
> > > > > readsuper
> > > > > ffs_sbget
> > > > > g_label_ufs_taste_common
> > > > > g_label_taste
> > > > > g_new_provider_event
> > > > > g_run_events
> > > > > fork_exit
> > > > > ...
> > > > >
> > > > > Has anybody touched this area recently?  I'll try to narrow down the
> > > > > commit range.
> > > >
> > > > Start with disassembling the faulting instruction.  I suspect that somehow
> > > > vital compiler switches like -mno-sse got omitted in the build.
> > > >
> > >
> > > No problem with compiler switches here.  The C file uses inline assembly to
> > > generate a crc32q instruction, in crc32_sse42.c:257.  But why would that
> > > generate a floating point exception?  The instruction doesn't appear to be
> > > using any floating point registers.  This is on a Kaby Lake CPU.
> > >
> > > crc32q %rsi, %rbx
> >
> > No idea, this instruction does not generate #NP at all.
> >
> > Provide exact script of the panic and backtrace,
> > together with the disassembly of the function which contained the faulted
> > instruction.  Do disassemble from ddb, in case text was corrupted.
> >
>
> Ok, here's the full stack trace:
> #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
> #1  doadump (textdump=0) at /usr/src/sys/kern/kern_shutdown.c:392
> #2  0xffffffff804a1edb in db_dump (dummy=, dummy2=, dummy3=, dummy4=)
>     at /usr/src/sys/ddb/db_command.c:575
> #3  0xffffffff804a1c8f in db_command (last_cmdp=, cmd_table=, dopager=1)
>     at /usr/src/sys/ddb/db_command.c:482
> #4  0xffffffff804a1a04 in db_command_loop ()
>     at /usr/src/sys/ddb/db_command.c:535
> #5  0xffffffff804a4cbf in db_trap (type=, code=<optimized out>)
>     at /usr/src/sys/ddb/db_main.c:252
> #6  0xffffffff80c1e55c in kdb_trap (type=3, code=0, tf=)
>     at /usr/src/sys/kern/subr_kdb.c:692
> #7  0xffffffff811957df in trap (frame=0xfffffe00907e8d20)
>     at /usr/src/sys/amd64/amd64/trap.c:621
> #8

This is not a useful trace.  It only shows the ddb part after the trap.
Please show all console messages around the panic, as was requested.

>
> Your guess about corrupted text was prescient.
> Here is the disassembly according to ddb:
> https://people.freebsd.org/~asomers/Screenshot_fbsd-head_2019-09-26_13%3A51%3A34.png
> And here is the disassembly of the same section according to gdb:
> 0xffffffff8113b2e0 :     mov    %rsi,%r9
> 0xffffffff8113b2e3 :     sub    $0xffffffffffffff80,%r9
> 0xffffffff8113b2e7 :     add    $0x100,%rsi
> 0xffffffff8113b2ee :     mov    %r11,%rbx
> 0xffffffff8113b2f1 :     xor    %eax,%eax
> 0xffffffff8113b2f3 :     xor    %r11d,%r11d
> 0xffffffff8113b2f6 :     nopw   %cs:0x0(%rax,%rax,1)
> 0xffffffff8113b300 :     mov    %rsi,%rdx
> 0xffffffff8113b303 :     mov    -0x100(%rsi),%rsi
> 0xffffffff8113b30a :     mov    -0x80(%rdx),%rdi
> 0xffffffff8113b30e :     crc32q %rsi,%rbx
> 0xffffffff8113b314 :     crc32q %rdi,%rax
> 0xffffffff8113b31a :     mov    (%rdx),%rsi
> 0xffffffff8113b31d :     crc32q %rsi,%r11
> 0xffffffff8113b323 :     lea    0x8(%rdx),%rsi
> 0xffffffff8113b327 :     add    $0xffffffffffffff08,%rdx
> 0xffffffff8113b32e :     cmp    %r9,%rdx
> 0xffffffff8113b331 :     jb     0xffffffff8113b300
> 0xffffffff8113b333 :     movzbl %cl,%r9d
> 0xffffffff8113b337 :     movzbl %ch,%edi
> 0xffffffff8113b33a :     mov    %ecx,%edx
>
> Care to guess what's causing the corruption?

I agree with cem that it is more likely that the ddb disassembler is unable
to handle some aspects, and that the hex bytes of the faulted instruction
are the interesting data.
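
For comparing against the hex bytes, here is a minimal sketch of what the
three crc32q instructions in the gdb listing should look like in memory.
It assumes the standard SSE4.2 encoding of CRC32 r64, r/m64
(F2 REX.W 0F 38 F1 /r).  The helper name encode_crc32q and the LISTING
table are made up for illustration; they are not part of FreeBSD, ddb, or
gdb.  If a hex dump of the faulting address from ddb shows the same bytes,
the text is most likely intact and only the ddb disassembly is misleading.

```python
# Hypothetical helper, not part of FreeBSD or ddb: builds the expected
# machine-code bytes for the register-to-register form of crc32q, so they
# can be compared by hand with a hex dump of the faulting address.

# Hardware register numbers of the amd64 general-purpose registers.
REGS = {name: num for num, name in enumerate(
    ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
     "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"])}


def encode_crc32q(src: str, dst: str) -> bytes:
    """Encode AT&T-syntax `crc32q %src,%dst` (CRC32 r64, r/m64)."""
    s, d = REGS[src], REGS[dst]
    # REX prefix is 0100WRXB: W=1 selects 64-bit operands, R extends
    # ModRM.reg (the destination), B extends ModRM.rm (the source).
    rex = 0x48 | ((d >> 3) << 2) | (s >> 3)
    # ModRM: mod=11 (register direct), reg=destination, rm=source.
    modrm = 0xC0 | ((d & 7) << 3) | (s & 7)
    # F2 is the mandatory prefix and precedes REX; 0F 38 F1 is the opcode.
    return bytes([0xF2, rex, 0x0F, 0x38, 0xF1, modrm])


# The three crc32q instructions from the gdb listing: start address,
# address of the following instruction, source and destination register.
LISTING = [
    (0xffffffff8113b30e, 0xffffffff8113b314, "rsi", "rbx"),
    (0xffffffff8113b314, 0xffffffff8113b31a, "rdi", "rax"),
    (0xffffffff8113b31d, 0xffffffff8113b323, "rsi", "r11"),
]

for start, nxt, src, dst in LISTING:
    code = encode_crc32q(src, dst)
    # Each encoding must be as long as the gap between consecutive
    # addresses in the listing.
    assert len(code) == nxt - start
    print(f"{start:#x}: {code.hex(' ')}   crc32q %{src},%{dst}")
```

Each encoding is six bytes, which matches the address deltas in the gdb
listing.  Only the register-direct form is handled; memory operands would
need SIB and displacement bytes that this sketch does not attempt.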