From owner-freebsd-current@freebsd.org Sun Jun 3 20:53:52 2018
Date: Sun, 3 Jun 2018 23:53:40 +0300
From: Konstantin Belousov
To: Michael Gmelin
Cc: "freebsd-current@freebsd.org", Matthias Apitz, jhb@freebsd.org
Subject: Re: Fatal trap 12: page fault on Acer Chromebook 720 (peppy)
Message-ID: <20180603205340.GS3789@kib.kiev.ua>
References: <20180603144840.44bfea41@bsd64.grem.de> <20180603132110.GP3789@kib.kiev.ua> <20180603165500.361ec894@bsd64.grem.de> <20180603150423.GQ3789@kib.kiev.ua> <20180603215020.452a81d8@bsd64.grem.de>
In-Reply-To: <20180603215020.452a81d8@bsd64.grem.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
User-Agent: Mutt/1.10.0 (2018-05-17)
List-Id: Discussions about the use of FreeBSD-current

On Sun, Jun 03, 2018 at 09:50:20PM +0200, Michael Gmelin wrote:
>
> On Sun, 3 Jun 2018 18:04:23 +0300
> Konstantin Belousov wrote:
>
> > On Sun, Jun 03, 2018 at 04:55:00PM +0200, Michael Gmelin wrote:
> > >
> > > On Sun, 3 Jun 2018 16:21:10 +0300
> > > Konstantin Belousov wrote:
> > >
> > > > On Sun, Jun 03, 2018 at 02:48:40PM +0200, Michael Gmelin wrote:
> > > > > Hi,
> > > > >
> > > > > After upgrading CURRENT to r333992 (from something at least a
> > > > > year old, quite some changes in mp_machdep.c since), this
> > > > > machine crashes on boot:
> > > > >
> > > > > Copyright (c) 1992-2018 The FreeBSD Project.
> > > > > Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992,
> > > > > 1993, 1994 The Regents of the University of California. All
> > > > > rights reserved. FreeBSD is a registered trademark of The
> > > > > FreeBSD Foundation. FreeBSD 12.0-CURRENT #1 r333992: Tue May 22
> > > > > 00:31:04 CEST 2018
> > > > > root@flimsy:/usr/obj/usr/src/amd64.amd64/sys/flimsy amd64
> > > > > FreeBSD clang version 6.0.0 (tags/RELEASE_600/final 326565)
> > > > > (based on LLVM 6.0.0) WARNING: WITNESS option enabled, expect
> > > > > reduced performance.
> > > > > VT(vga): resolution 640x480 CPU: Intel(R)
> > > > > Celeron(R) 2955U @ 1.40GHz (1396.80-MHz K8-class CPU)
> > > > > Origin="GenuineIntel" Id=0x40651 Family=0x6 Model=0x45
> > > > > Stepping=1
> > > > > Features=0xbfebfbff
> > > > > CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
> > > > > Features2=0x4ddaebbf
> > > > > xTPR,PDCM,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,TSCDLT,XSAVE,OSXSAVE,RDRAND>
> > > > > AMD Features=0x2c100800 AMD
> > > > > Features2=0x21 Structured Extended
> > > > > Features=0x2603 XSAVE
> > > > > Features=0x1 VT-x: (disabled in BIOS)
> > > > > PAT,HLT,MTF,PAUSE,EPT,UG,VPID TSC: P-state invariant,
> > > > > performance statistics real memory = 4301258752 (4102 MB)
> > > > > avail memory = 1907572736 (1819 MB) Event timer "LAPIC" quality
> > > > > 600 ACPI APIC Table:
> > > > What does this mean? Did you flash coreboot?
> > >
> > > This machine comes with it by default (my model was delivered with
> > > SeaBIOS 20131018_145217-build121-m2). So I didn't flash anything
> > > (didn't feel like bricking it).
> > >
> > > >
> > > > > kernel trap 12 with interrupts disabled
> > > > >
> > > > > Fatal trap 12: page fault while in kernel mode
> > > > > cpuid = 0; apic id = 00
> > > > > fault virtual address = 0xfffff80001000000
> > > > > fault code = supervisor write data, protection violation
> > > > > instruction pointer = 0x20:0xffffffff8102955f
> > > > > stack pointer = 0x28:0xffffffff82a79be0
> > > > > frame pointer = 0x28:0xffffffff82a79c10
> > > > > code segment = base 0x0, limit 0xfffff, type 0x1b
> > > > > = DPL 0, pres 1, long 1, def32 0, gran 1
> > > > > processor eflags = resume, IOPL = 0
> > > > > current process = 0 ()
> > > > > [ thread pid 0 tid 0 ]
> > > > > Stopped at native_start_all_aps+0x08f: movq %rax,(%rsi)
> > > > Look up the source line number for this address.
> > > >
> > >
> > > I guess that's sys/amd64/amd64/support.S line 854 (in rdmsr),
> > > called by native_start_all_aps.
> > > Any additional hints how I can
> > > track it down?
> > Why did you decide that this is rdmsr_safe()? First,
> > native_start_all_aps() does not call rdmsr; second, the ddb
> > report clearly indicates that the fault occurred accessing DMAP in
> > native_start_all_aps().
> >
> > Just look up the source line by the address
> > native_start_all_aps+0x08f.
>
> Okay, according to kgdb this should be here:
>
> https://svnweb.freebsd.org/base/head/sys/amd64/amd64/mp_machdep.c?revision=333368&view=markup#l369
>
> 364
> 365 	/* Create the initial 1GB replicated page tables */
> 366 	for (i = 0; i < 512; i++) {
> 367 		/* Each slot of the level 4 pages points to the same level 3 page */
> 368 		pt4[i] = (u_int64_t)(uintptr_t)(mptramp_pagetables + PAGE_SIZE);
> 369 		pt4[i] |= PG_V | PG_RW | PG_U;
> 370
> 371 		/* Each slot of the level 3 pages points to the same level 2 page */
> 372 		pt3[i] = (u_int64_t)(uintptr_t)(mptramp_pagetables + (2 * PAGE_SIZE));
> 373 		pt3[i] |= PG_V | PG_RW | PG_U;
> 374
> 375 		/* The level 2 page slots are mapped with 2MB pages for 1GB. */
> 376 		pt2[i] = i * (2 * 1024 * 1024);
> 377 		pt2[i] |= PG_V | PG_RW | PG_PS | PG_U;
> 378 	}
>
> -m

You have a fault on a write because the portion of the direct map that
maps the kernel text is mapped read-only. This is consistent with the
faulting address. It is not clear whether this is something new on your
machine, or whether the kernel text was silently corrupted before, since
the read-only protection is fairly recent.

It seems that mp_bootaddress() selected a bad place for the bootstrap
page tables. Moreover, we do not include the kernel text in the
physmap[] array, so it is not clear how this happened. This code was
also changed recently.

Can you add a print of the physmap[] array somewhere before the panic,
to see what the kernel thinks the available memory is? This should
already be printed if you have a serial console and set the
debug.late_console tunable to 0.

>
> p.s. This machine uses quirks in biosmem.c, see
>
> Type '?'
> for a list of commands, 'help' for more detailed
> help.
> OK biosmem
> bios_basemem: 0x9e400
> bios_extmem: 0x3ff00000
> memtop: 0x3c000000
> high_heap_base: 0x3c000000
> high_heap_size: 0x4000000
> bios_quirks: 0x01 BQ_DISTRUST_820_EXTMEM
> b_bios_probed: 0x0a B_BASEMEM_12 B_EXTMEM_E801
>
> --
> Michael Gmelin
>
> --
> Michael Gmelin