Date: Sun, 16 Jul 2017 12:16:14 +0200
From: Nicolas Embriz <nbari@tequila.io>
To: Warner Losh <imp@bsdimp.com>
Cc: "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Subject: Re: GEOM: ada0: the secondary GPT header is not in the last LBA or random: unblocking device. when using more than 2 cores
Message-ID: <CAGuJ=C=BwF6nz2aAQEmQVsEDHQEF9%2BiuKuWN42M65D57Kx5jqA@mail.gmail.com>
In-Reply-To: <CAGuJ=CkN5DYaObu4YTga=gZNdvo9j2ThfK9zq5pKtYBcZO6SiQ@mail.gmail.com>
References: <CAGuJ=CnBvEViR00j57Af78Sk7MuiEMyR_1A-ZNiBx7qFjCDmxQ@mail.gmail.com> <CANCZdfrGgEBdiiaDX9ND87h%2Bi4RB2FvD5dp2EMoO1nuKakfHPg@mail.gmail.com> <CAGuJ=CkwWey2FCa6zWmAJw-pvJQndZ8JRwHyg=n_-WwC2capww@mail.gmail.com> <CAGuJ=CkN5DYaObu4YTga=gZNdvo9j2ThfK9zq5pKtYBcZO6SiQ@mail.gmail.com>
I changed the zone on AWS (eu-central-1a) and got the same behaviour:
instances with 1 core work, but instances with 2 cores panic. This time
I was able to get more details, using UFS instead of ZFS on root:

Setting up harvesting: [UMA],[FS_ATIME],SWI,INTERRUPT,NET_NG,NET_ETHER,NET_TUN,MOUSE,KEYBOARD,ATTACH,CACHED
Feeding entropy: .
spin lock 0xffffffff80db45c0 (smp rendezvous) held by 0xfffff80004378560 (tid 100074) too long
timeout stopping cpus
panic: spin lock held too long
cpuid = 1
KDB: stack backtrace:
#0 0xffffffff804f69a7 at kdb_backtrace+0x67
#1 0xffffffff804b9666 at vpanic+0x186
#2 0xffffffff804b94d3 at panic+0x43
#3 0xffffffff8049da60 at __mtx_trylock_spin_flags+0
#4 0xffffffff807bd2d6 at smp_targeted_tlb_shootdown+0xd6
#5 0xffffffff807bd62c at smp_masked_invlpg+0x4c
#6 0xffffffff807558b2 at pmap_invalidate_page+0x142
#7 0xffffffff8075f2a9 at pmap_ts_referenced+0x709
#8 0xffffffff8073ac9c at vm_pageout+0xcbc
#9 0xffffffff80482345 at fork_exit+0x75
#10 0xffffffff8074b4be at fork_trampoline+0xe
Uptime: 1m0s
Rebooting...
cpu_reset: Stopping other CPUs
timeout stopping cpus
cpu_reset: Restarting BSP
cpu_reset: Failed to restart BSP

Full output:

/boot/kernel/kernel text=0x6876f0 data=0x76348+0x37d168 syms=[0x8+0xa1e38+0x8+0x9d58c]
Booting [/boot/kernel/kernel]...
Copyright (c) 1992-2017 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 11.1-PRERELEASE #0 r321036: Sat Jul 15 21:52:50 UTC 2017
    devops@fabrik-de1.127.network:/fabrik/aws-nozfs/host/obj/usr/src/sys/FABRIKAWS amd64
FreeBSD clang version 4.0.0 (tags/RELEASE_400/final 297347) (based on LLVM 4.0.0)
VT: init without driver.
XEN: Hypervisor version 4.2 detected.
CPU: Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz (2394.50-MHz K8-class CPU)
  Origin="GenuineIntel" Id=0x306f2 Family=0x6 Model=0x3f Stepping=2
  Features=0x1783fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0xfffa3203<SSE3,PCLMULQDQ,SSSE3,FMA,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND,HV>
  AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
  AMD Features2=0x21<LAHF,ABM>
  Structured Extended Features=0x728<BMI1,AVX2,BMI2,ERMS,INVPCID>
  XSAVE Features=0x1<XSAVEOPT>
Hypervisor: Origin = "XenVMMXenVMM"
real memory = 4294967296 (4096 MB)
avail memory = 4131692544 (3940 MB)
Event timer "LAPIC" quality 100
ACPI APIC Table: <Xen HVM>
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s)
ioapic0: Changing APIC ID to 1
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0 <Version 1.1> irqs 0-47 on motherboard
random: entropy device external interface
random: registering fast source Intel Secure Key RNG
random: fast provider: "Intel Secure Key RNG"
nexus0
aesni0: <AES-CBC,AES-XTS,AES-GCM,AES-ICM> on motherboard
cryptosoft0: <software crypto> on motherboard
acpi0: <Xen> on motherboard
acpi0: Power Button (fixed)
acpi0: Sleep Button (fixed)
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 62500000 Hz quality 950
attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
Event timer "RTC" frequency 32768 Hz quality 0
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <32-bit timer at 3.579545MHz> port 0xb008-0xb00b on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
isab0: <PCI-ISA bridge> at device 1.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel PIIX3 WDMA2 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xc100-0xc10f at device 1.1 on pci0
ata0: <ATA channel> at channel 0 on atapci0
ata1: <ATA channel> at channel 1 on atapci0
pci0: <bridge> at device 1.3 (no driver attached)
vgapci0: <VGA-compatible display> mem 0xf0000000-0xf1ffffff,0xf3000000-0xf3000fff at device 2.0 on pci0
vgapci0: Boot video device
xenpci0: <Xen Platform Device> port 0xc000-0xc0ff mem 0xf2000000-0xf2ffffff irq 28 at device 3.0 on pci0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart0: console (9600,n,8,1)
xenpv0: <Xen PV bus> on motherboard
granttable0: <Xen Grant-table Device> on xenpv0
xen_et0: <Xen PV Clock> on xenpv0
Event timer "XENTIMER" frequency 1000000000 Hz quality 950
Timecounter "XENTIMER" frequency 1000000000 Hz quality 950
xenstore0: <XenStore> on xenpv0
evtchn0: <Xen event channel user-space device> on xenpv0
privcmd0: <Xen privileged interface user-space device> on xenpv0
debug0: <Xen debug handler> on xenpv0
Timecounters tick every 1.000 msec
nvme cam probe device init
xenballoon0: <Xen Balloon Device> on xenstore0
xctrl0: <Xen Control Device> on xenstore0
xs_dev0: <Xenstore user-space device> on xenstore0
xenbusb_front0: <Xen Frontend Devices> on xenstore0
xn0: <Virtual Network Interface> at device/vif/0 on xenbusb_front0
xn0: Ethernet address: 02:0e:56:1c:1c:d3
xenbusb_back0: <Xen Backend Devices> on xenstore0
xn0: backend features: feature-sg feature-gso-tcp4
xbd0: 8192MB <Virtual Block Device> at device/vbd/768 on xenbusb_front0
xbd0: attaching as ada0
SMP: AP CPU #1 Launched!
Trying to mount root from ufs:/dev/gpt/rootfs [rw]...
GEOM: ada0: the secondary GPT header is not in the last LBA.
Growing root partition to fill device
ada0 recovered
ada0p3 resized
super-block backups (for fsck_ffs -b #) at:
 2097600, 2621952, 3146304, 3670656, 4195008, 4719360, 5243712, 5768064,
 6292416, 6816768, 7341120, 7865472, 8389824, 8914176, 9438528, 9962880,
 10487232, 11011584, 11535936, 12060288, 12584640, 13108992, 13633344,
 14157696
Setting hostuuid: ec259844-260d-90a9-bfa5-5157bc239b6b.
Setting hostid: 0x52ee9fe5.
Starting file system checks:
/dev/gpt/rootfs: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/gpt/rootfs: clean, 1648359 free (39 frags, 206040 blocks, 0.0% fragmentation)
Mounting local filesystems:.
ELF ldconfig path: /lib /usr/lib /usr/lib/compat /usr/local/lib
random: unblocking device.
32-bit compatibility ldconfig path:
Setting hostname: fabrik.
Setting up harvesting: [UMA],[FS_ATIME],SWI,INTERRUPT,NET_NG,NET_ETHER,NET_TUN,MOUSE,KEYBOARD,ATTACH,CACHED
Feeding entropy: .
spin lock 0xffffffff80db45c0 (smp rendezvous) held by 0xfffff80004378560 (tid 100074) too long
timeout stopping cpus
panic: spin lock held too long
cpuid = 1
KDB: stack backtrace:
#0 0xffffffff804f69a7 at kdb_backtrace+0x67
#1 0xffffffff804b9666 at vpanic+0x186
#2 0xffffffff804b94d3 at panic+0x43
#3 0xffffffff8049da60 at __mtx_trylock_spin_flags+0
#4 0xffffffff807bd2d6 at smp_targeted_tlb_shootdown+0xd6
#5 0xffffffff807bd62c at smp_masked_invlpg+0x4c
#6 0xffffffff807558b2 at pmap_invalidate_page+0x142
#7 0xffffffff8075f2a9 at pmap_ts_referenced+0x709
#8 0xffffffff8073ac9c at vm_pageout+0xcbc
#9 0xffffffff80482345 at fork_exit+0x75
#10 0xffffffff8074b4be at fork_trampoline+0xe
Uptime: 1m0s
Rebooting...
cpu_reset: Stopping other CPUs
timeout stopping cpus
cpu_reset: Restarting BSP
cpu_reset: Failed to restart BSP

On Sun, Jul 16, 2017 at 10:59 AM, Nicolas Embriz <nbari@tequila.io> wrote:
>
> Hi, this is the only output I have from AWS (system log). It is
> important to mention that this only happens when using 2 or more cores
> (a t2.medium, for example); when using 1 core it works.
>
> Output:
>
> /boot/kernel/kernel text=0x6876f0 data=0x76348+0x37d168 syms=[0x8+0xa1e38+0x8+0x9d58c]
> /boot/kernel/zfs.ko size 0x2e8938 at 0xfba000
> loading required module 'opensolaris'
> /boot/kernel/opensolaris.ko size 0xaab0 at 0x12a3000
>
> Booting [/boot/kernel/kernel]...
> Copyright (c) 1992-2017 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>     The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 11.1-PRERELEASE #0 r321034: Sat Jul 15 20:44:15 UTC 2017
>     devops@fabrik-de1.127.network:/fabrik/aws/host/obj/usr/src/sys/FABRIKAWS amd64
> FreeBSD clang version 4.0.0 (tags/RELEASE_400/final 297347) (based on LLVM 4.0.0)
> VT: init without driver.
> XEN: Hypervisor version 4.2 detected.
> CPU: Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz (2400.05-MHz K8-class CPU)
>   Origin="GenuineIntel" Id=0x306f2 Family=0x6 Model=0x3f Stepping=2
>   Features=0x1783fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE,SSE2,HTT>
>   Features2=0xfffa3203<SSE3,PCLMULQDQ,SSSE3,FMA,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND,HV>
>   AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
>   AMD Features2=0x21<LAHF,ABM>
>   Structured Extended Features=0x728<BMI1,AVX2,BMI2,ERMS,INVPCID>
>   XSAVE Features=0x1<XSAVEOPT>
> Hypervisor: Origin = "XenVMMXenVMM"
> real memory = 4294967296 (4096 MB)
> avail memory = 4128489472 (3937 MB)
> Event timer "LAPIC" quality 100
> ACPI APIC Table: <Xen HVM>
> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
> FreeBSD/SMP: 1 package(s) x 2 core(s)
> ioapic0: Changing APIC ID to 1
> MADT: Forcing active-low polarity and level trigger for SCI
> ioapic0 <Version 1.1> irqs 0-47 on motherboard
> random: entropy device external interface
> random: registering fast source Intel Secure Key RNG
> random: fast provider: "Intel Secure Key RNG"
> nexus0
> aesni0: <AES-CBC,AES-XTS,AES-GCM,AES-ICM> on motherboard
> cryptosoft0: <software crypto> on motherboard
> acpi0: <Xen> on motherboard
> acpi0: Power Button (fixed)
> acpi0: Sleep Button (fixed)
> cpu0: <ACPI CPU> on acpi0
> cpu1: <ACPI CPU> on acpi0
> hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
> Timecounter "HPET" frequency 62500000 Hz quality 950
> attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
> Timecounter "i8254" frequency 1193182 Hz quality 0
> Event timer "i8254" frequency 1193182 Hz quality 100
> atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
> Event timer "RTC" frequency 32768 Hz quality 0
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
> acpi_timer0: <32-bit timer at 3.579545MHz> port 0xb008-0xb00b on acpi0
> pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
> pci0: <ACPI PCI bus> on pcib0
> isab0: <PCI-ISA bridge> at device 1.0 on pci0
> isa0: <ISA bus> on isab0
> atapci0: <Intel PIIX3 WDMA2 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xc100-0xc10f at device 1.1 on pci0
> ata0: <ATA channel> at channel 0 on atapci0
> ata1: <ATA channel> at channel 1 on atapci0
> pci0: <bridge> at device 1.3 (no driver attached)
> vgapci0: <VGA-compatible display> mem 0xf0000000-0xf1ffffff,0xf3000000-0xf3000fff at device 2.0 on pci0
> vgapci0: Boot video device
> xenpci0: <Xen Platform Device> port 0xc000-0xc0ff mem 0xf2000000-0xf2ffffff irq 28 at device 3.0 on pci0
> atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
> atkbd0: <AT Keyboard> irq 1 on atkbdc0
> kbd0 at atkbd0
> atkbd0: [GIANT-LOCKED]
> uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
> uart0: console (9600,n,8,1)
> xenpv0: <Xen PV bus> on motherboard
> granttable0: <Xen Grant-table Device> on xenpv0
> xen_et0: <Xen PV Clock> on xenpv0
> Event timer "XENTIMER" frequency 1000000000 Hz quality 950
> Timecounter "XENTIMER" frequency 1000000000 Hz quality 950
> xenstore0: <XenStore> on xenpv0
> evtchn0: <Xen event channel user-space device> on xenpv0
> privcmd0: <Xen privileged interface user-space device> on xenpv0
> debug0: <Xen debug handler> on xenpv0
> ZFS NOTICE: Prefetch is disabled by default if less than 4GB of RAM is present;
>             to enable, add "vfs.zfs.prefetch_disable=0" to /boot/loader.conf.
> ZFS filesystem version: 5
> ZFS storage pool version: features support (5000)
> Timecounters tick every 1.000 msec
> nvme cam probe device init
> xenballoon0: <Xen Balloon Device> on xenstore0
> xctrl0: <Xen Control Device> on xenstore0
> xs_dev0: <Xenstore user-space device> on xenstore0
> xenbusb_front0: <Xen Frontend Devices> on xenstore0
> xn0: <Virtual Network Interface> at device/vif/0 on xenbusb_front0
> xn0: Ethernet address: 0a:9a:53:0d:ff:6b
> xenbusb_back0: <Xen Backend Devices> on xenstore0
> xn0: backend features: feature-sg feature-gso-tcp4
> xbd0: 18432MB <Virtual Block Device> at device/vbd/768 on xenbusb_front0
> xbd0: attaching as ada0
> SMP: AP CPU #1 Launched!
> Trying to mount root from zfs:zroot/ROOT/default []...
> GEOM: ada0: the secondary GPT header is not in the last LBA.
>
> On Sun, Jul 16, 2017 at 9:49 AM, Nicolas Embriz <nbari@tequila.io> wrote:
>>
>> Hi,
>>
>> I am trying this in Amazon AWS. The problem is that I can't run dmesg
>> or get more logs because when choosing a t2.medium or any instance
>> with 2 or more cores, the image gets stuck in the boot process and
>> therefore I can't log in.
>>
>> I have a working image with FreeBSD 11.0-stable:
>> https://github.com/fabrik-red/images/releases/download/11.0/disk.tar.gz
>>
>> Using this kernel:
>> https://github.com/fabrik-red/images/blob/11.0/fabrik.kernel and made
>> using this script:
>> https://github.com/fabrik-red/images/blob/11.0/fabrik.sh
>>
>> I know this info is probably not useful, but so far the only difference
>> is that I updated the sources, and now while doing the same thing with
>> FreeBSD 11.1-prerelease I am getting this strange behaviour.
>>
>> When using a single CPU core the image works, but when using
>> 2 or more cores the image just doesn't boot.
>>
>> Any ideas of what else I could add to get more info, at least on the
>> boot screen, which is the only thing I get from AWS while booting?
>>
>> Thanks in advance.
>>
>> Regards.