Date:      Sun, 16 Jul 2017 12:16:14 +0200
From:      Nicolas Embriz <nbari@tequila.io>
To:        Warner Losh <imp@bsdimp.com>
Cc:        "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Subject:   Re: GEOM: ada0: the secondary GPT header is not in the last LBA or random: unblocking device. when using more than 2 cores
Message-ID:  <CAGuJ=C=BwF6nz2aAQEmQVsEDHQEF9%2BiuKuWN42M65D57Kx5jqA@mail.gmail.com>
In-Reply-To: <CAGuJ=CkN5DYaObu4YTga=gZNdvo9j2ThfK9zq5pKtYBcZO6SiQ@mail.gmail.com>
References:  <CAGuJ=CnBvEViR00j57Af78Sk7MuiEMyR_1A-ZNiBx7qFjCDmxQ@mail.gmail.com> <CANCZdfrGgEBdiiaDX9ND87h%2Bi4RB2FvD5dp2EMoO1nuKakfHPg@mail.gmail.com> <CAGuJ=CkwWey2FCa6zWmAJw-pvJQndZ8JRwHyg=n_-WwC2capww@mail.gmail.com> <CAGuJ=CkN5DYaObu4YTga=gZNdvo9j2ThfK9zq5pKtYBcZO6SiQ@mail.gmail.com>

I changed the zone on AWS (eu-central-1a) and got the same behaviour:
instances with 1 core work, but with 2 cores they panic. This time I was
able to get more details, and I was using UFS instead of ZFS on root:

Setting up harvesting:
[UMA],[FS_ATIME],SWI,INTERRUPT,NET_NG,NET_ETHER,NET_TUN,MOUSE,KEYBOARD,ATTACH,CACHED
Feeding entropy: .
spin lock 0xffffffff80db45c0 (smp rendezvous) held by 0xfffff80004378560 (tid 100074) too long
timeout stopping cpus
panic: spin lock held too long
cpuid = 1
KDB: stack backtrace:
#0 0xffffffff804f69a7 at kdb_backtrace+0x67
#1 0xffffffff804b9666 at vpanic+0x186
#2 0xffffffff804b94d3 at panic+0x43
#3 0xffffffff8049da60 at __mtx_trylock_spin_flags+0
#4 0xffffffff807bd2d6 at smp_targeted_tlb_shootdown+0xd6
#5 0xffffffff807bd62c at smp_masked_invlpg+0x4c
#6 0xffffffff807558b2 at pmap_invalidate_page+0x142
#7 0xffffffff8075f2a9 at pmap_ts_referenced+0x709
#8 0xffffffff8073ac9c at vm_pageout+0xcbc
#9 0xffffffff80482345 at fork_exit+0x75
#10 0xffffffff8074b4be at fork_trampoline+0xe
Uptime: 1m0s
Rebooting...
cpu_reset: Stopping other CPUs
timeout stopping cpus
cpu_reset: Restarting BSP
cpu_reset: Failed to restart BSP


Full output:

/boot/kernel/kernel text=0x6876f0 data=0x76348+0x37d168 syms=[0x8+0xa1e38+0x8+0x9d58c]

Booting [/boot/kernel/kernel]...
Copyright (c) 1992-2017 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 11.1-PRERELEASE #0 r321036: Sat Jul 15 21:52:50 UTC 2017
    devops@fabrik-de1.127.network:/fabrik/aws-nozfs/host/obj/usr/src/sys/FABRIKAWS amd64
FreeBSD clang version 4.0.0 (tags/RELEASE_400/final 297347) (based on LLVM 4.0.0)
VT: init without driver.
XEN: Hypervisor version 4.2 detected.
CPU: Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz (2394.50-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x306f2  Family=0x6  Model=0x3f  Stepping=2
  Features=0x1783fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0xfffa3203<SSE3,PCLMULQDQ,SSSE3,FMA,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND,HV>
  AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
  AMD Features2=0x21<LAHF,ABM>
  Structured Extended Features=0x728<BMI1,AVX2,BMI2,ERMS,INVPCID>
  XSAVE Features=0x1<XSAVEOPT>
Hypervisor: Origin = "XenVMMXenVMM"
real memory  = 4294967296 (4096 MB)
avail memory = 4131692544 (3940 MB)
Event timer "LAPIC" quality 100
ACPI APIC Table: <Xen HVM>
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s)
ioapic0: Changing APIC ID to 1
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0 <Version 1.1> irqs 0-47 on motherboard
random: entropy device external interface
random: registering fast source Intel Secure Key RNG
random: fast provider: "Intel Secure Key RNG"
nexus0
aesni0: <AES-CBC,AES-XTS,AES-GCM,AES-ICM> on motherboard
cryptosoft0: <software crypto> on motherboard
acpi0: <Xen> on motherboard
acpi0: Power Button (fixed)
acpi0: Sleep Button (fixed)
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 62500000 Hz quality 950
attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
Event timer "RTC" frequency 32768 Hz quality 0
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <32-bit timer at 3.579545MHz> port 0xb008-0xb00b on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
isab0: <PCI-ISA bridge> at device 1.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel PIIX3 WDMA2 controller> port
0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xc100-0xc10f at device 1.1 on
pci0
ata0: <ATA channel> at channel 0 on atapci0
ata1: <ATA channel> at channel 1 on atapci0
pci0: <bridge> at device 1.3 (no driver attached)
vgapci0: <VGA-compatible display> mem
0xf0000000-0xf1ffffff,0xf3000000-0xf3000fff at device 2.0 on pci0
vgapci0: Boot video device
xenpci0: <Xen Platform Device> port 0xc000-0xc0ff mem
0xf2000000-0xf2ffffff irq 28 at device 3.0 on pci0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart0: console (9600,n,8,1)
xenpv0: <Xen PV bus> on motherboard
granttable0: <Xen Grant-table Device> on xenpv0
xen_et0: <Xen PV Clock> on xenpv0
Event timer "XENTIMER" frequency 1000000000 Hz quality 950
Timecounter "XENTIMER" frequency 1000000000 Hz quality 950
xenstore0: <XenStore> on xenpv0
evtchn0: <Xen event channel user-space device> on xenpv0
privcmd0: <Xen privileged interface user-space device> on xenpv0
debug0: <Xen debug handler> on xenpv0
Timecounters tick every 1.000 msec
nvme cam probe device init
xenballoon0: <Xen Balloon Device> on xenstore0
xctrl0: <Xen Control Device> on xenstore0
xs_dev0: <Xenstore user-space device> on xenstore0
xenbusb_front0: <Xen Frontend Devices> on xenstore0
xn0: <Virtual Network Interface> at device/vif/0 on xenbusb_front0
xn0: Ethernet address: 02:0e:56:1c:1c:d3
xenbusb_back0: <Xen Backend Devices> on xenstore0
xn0: backend features: feature-sg feature-gso-tcp4
xbd0: 8192MB <Virtual Block Device> at device/vbd/768 on xenbusb_front0
xbd0: attaching as ada0
SMP: AP CPU #1 Launched!
Trying to mount root from ufs:/dev/gpt/rootfs [rw]...
GEOM: ada0: the secondary GPT header is not in the last LBA.
Growing root partition to fill device
ada0 recovered
ada0p3 resized
super-block backups (for fsck_ffs -b #) at:
 2097600, 2621952, 3146304, 3670656, 4195008, 4719360, 5243712, 5768064,
 6292416, 6816768, 7341120, 7865472, 8389824, 8914176, 9438528, 9962880,
 10487232, 11011584, 11535936, 12060288, 12584640, 13108992, 13633344, 14157696
Setting hostuuid: ec259844-260d-90a9-bfa5-5157bc239b6b.
Setting hostid: 0x52ee9fe5.
Starting file system checks:
/dev/gpt/rootfs: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/gpt/rootfs: clean, 1648359 free (39 frags, 206040 blocks, 0.0%
fragmentation)
Mounting local filesystems:.
ELF ldconfig path: /lib /usr/lib /usr/lib/compat /usr/local/lib
random: unblocking device.
32-bit compatibility ldconfig path:
Setting hostname: fabrik.
Setting up harvesting:
[UMA],[FS_ATIME],SWI,INTERRUPT,NET_NG,NET_ETHER,NET_TUN,MOUSE,KEYBOARD,ATTACH,CACHED
Feeding entropy: .
spin lock 0xffffffff80db45c0 (smp rendezvous) held by 0xfffff80004378560 (tid 100074) too long
timeout stopping cpus
panic: spin lock held too long
cpuid = 1
KDB: stack backtrace:
#0 0xffffffff804f69a7 at kdb_backtrace+0x67
#1 0xffffffff804b9666 at vpanic+0x186
#2 0xffffffff804b94d3 at panic+0x43
#3 0xffffffff8049da60 at __mtx_trylock_spin_flags+0
#4 0xffffffff807bd2d6 at smp_targeted_tlb_shootdown+0xd6
#5 0xffffffff807bd62c at smp_masked_invlpg+0x4c
#6 0xffffffff807558b2 at pmap_invalidate_page+0x142
#7 0xffffffff8075f2a9 at pmap_ts_referenced+0x709
#8 0xffffffff8073ac9c at vm_pageout+0xcbc
#9 0xffffffff80482345 at fork_exit+0x75
#10 0xffffffff8074b4be at fork_trampoline+0xe
Uptime: 1m0s
Rebooting...
cpu_reset: Stopping other CPUs
timeout stopping cpus
cpu_reset: Restarting BSP
cpu_reset: Failed to restart BSP



On Sun, Jul 16, 2017 at 10:59 AM, Nicolas Embriz <nbari@tequila.io> wrote:
>
> Hi, this is the only output I have from AWS (system log). It is important to
> mention that this only happens when using 2 or more cores (a t2.medium, for
> example); with 1 core it works.
>
> Output:
>
> /boot/kernel/kernel text=0x6876f0 data=0x76348+0x37d168 syms=[0x8+0xa1e38+0x8+0x9d58c]
> /boot/kernel/zfs.ko size 0x2e8938 at 0xfba000
> loading required module 'opensolaris'
> /boot/kernel/opensolaris.ko size 0xaab0 at 0x12a3000
>
>
> Booting [/boot/kernel/kernel]...
> Copyright (c) 1992-2017 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 11.1-PRERELEASE #0 r321034: Sat Jul 15 20:44:15 UTC 2017
>     devops@fabrik-de1.127.network:/fabrik/aws/host/obj/usr/src/sys/FABRIKAWS amd64
> FreeBSD clang version 4.0.0 (tags/RELEASE_400/final 297347) (based on LLVM 4.0.0)
> VT: init without driver.
> XEN: Hypervisor version 4.2 detected.
> CPU: Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz (2400.05-MHz K8-class CPU)
>   Origin="GenuineIntel"  Id=0x306f2  Family=0x6  Model=0x3f  Stepping=2
>   Features=0x1783fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE,SSE2,HTT>
>   Features2=0xfffa3203<SSE3,PCLMULQDQ,SSSE3,FMA,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND,HV>
>   AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
>   AMD Features2=0x21<LAHF,ABM>
>   Structured Extended Features=0x728<BMI1,AVX2,BMI2,ERMS,INVPCID>
>   XSAVE Features=0x1<XSAVEOPT>
> Hypervisor: Origin = "XenVMMXenVMM"
> real memory  = 4294967296 (4096 MB)
> avail memory = 4128489472 (3937 MB)
> Event timer "LAPIC" quality 100
> ACPI APIC Table: <Xen HVM>
> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
> FreeBSD/SMP: 1 package(s) x 2 core(s)
> ioapic0: Changing APIC ID to 1
> MADT: Forcing active-low polarity and level trigger for SCI
> ioapic0 <Version 1.1> irqs 0-47 on motherboard
> random: entropy device external interface
> random: registering fast source Intel Secure Key RNG
> random: fast provider: "Intel Secure Key RNG"
> nexus0
> aesni0: <AES-CBC,AES-XTS,AES-GCM,AES-ICM> on motherboard
> cryptosoft0: <software crypto> on motherboard
> acpi0: <Xen> on motherboard
> acpi0: Power Button (fixed)
> acpi0: Sleep Button (fixed)
> cpu0: <ACPI CPU> on acpi0
> cpu1: <ACPI CPU> on acpi0
> hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
> Timecounter "HPET" frequency 62500000 Hz quality 950
> attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
> Timecounter "i8254" frequency 1193182 Hz quality 0
> Event timer "i8254" frequency 1193182 Hz quality 100
> atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
> Event timer "RTC" frequency 32768 Hz quality 0
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
> acpi_timer0: <32-bit timer at 3.579545MHz> port 0xb008-0xb00b on acpi0
> pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
> pci0: <ACPI PCI bus> on pcib0
> isab0: <PCI-ISA bridge> at device 1.0 on pci0
> isa0: <ISA bus> on isab0
> atapci0: <Intel PIIX3 WDMA2 controller> port
0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xc100-0xc10f at device 1.1 on pci0
> ata0: <ATA channel> at channel 0 on atapci0
> ata1: <ATA channel> at channel 1 on atapci0
> pci0: <bridge> at device 1.3 (no driver attached)
> vgapci0: <VGA-compatible display> mem
0xf0000000-0xf1ffffff,0xf3000000-0xf3000fff at device 2.0 on pci0
> vgapci0: Boot video device
> xenpci0: <Xen Platform Device> port 0xc000-0xc0ff mem
0xf2000000-0xf2ffffff irq 28 at device 3.0 on pci0
> atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
> atkbd0: <AT Keyboard> irq 1 on atkbdc0
> kbd0 at atkbd0
> atkbd0: [GIANT-LOCKED]
> uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
> uart0: console (9600,n,8,1)
> xenpv0: <Xen PV bus> on motherboard
> granttable0: <Xen Grant-table Device> on xenpv0
> xen_et0: <Xen PV Clock> on xenpv0
> Event timer "XENTIMER" frequency 1000000000 Hz quality 950
> Timecounter "XENTIMER" frequency 1000000000 Hz quality 950
> xenstore0: <XenStore> on xenpv0
> evtchn0: <Xen event channel user-space device> on xenpv0
> privcmd0: <Xen privileged interface user-space device> on xenpv0
> debug0: <Xen debug handler> on xenpv0
> ZFS NOTICE: Prefetch is disabled by default if less than 4GB of RAM is present;
>             to enable, add "vfs.zfs.prefetch_disable=0" to /boot/loader.conf.
> ZFS filesystem version: 5
> ZFS storage pool version: features support (5000)
> Timecounters tick every 1.000 msec
> nvme cam probe device init
> xenballoon0: <Xen Balloon Device> on xenstore0
> xctrl0: <Xen Control Device> on xenstore0
> xs_dev0: <Xenstore user-space device> on xenstore0
> xenbusb_front0: <Xen Frontend Devices> on xenstore0
> xn0: <Virtual Network Interface> at device/vif/0 on xenbusb_front0
> xn0: Ethernet address: 0a:9a:53:0d:ff:6b
> xenbusb_back0: <Xen Backend Devices> on xenstore0
> xn0: backend features: feature-sg feature-gso-tcp4
> xbd0: 18432MB <Virtual Block Device> at device/vbd/768 on xenbusb_front0
> xbd0: attaching as ada0
> SMP: AP CPU #1 Launched!
> Trying to mount root from zfs:zroot/ROOT/default []...
> GEOM: ada0: the secondary GPT header is not in the last LBA.
>
>
>
>
> On Sun, Jul 16, 2017 at 9:49 AM, Nicolas Embriz <nbari@tequila.io> wrote:
>>
>> Hi,
>>
>> I am trying this on Amazon AWS. The problem is that I can’t run dmesg
>> or get more logs, because when choosing a t2.medium or any instance
>> with 2 or more cores, the image gets stuck during the boot process and
>> therefore I can’t log in.
>>
>> I have a working image with FreeBSD 11.0-stable:
>> https://github.com/fabrik-red/images/releases/download/11.0/disk.tar.gz
>>
>> Using this kernel:
>> https://github.com/fabrik-red/images/blob/11.0/fabrik.kernel and made
>> using this script:
>> https://github.com/fabrik-red/images/blob/11.0/fabrik.sh
>>
>> I know this info is probably not very useful, but so far the only difference
>> is that I updated the sources, and now, doing the same thing with
>> FreeBSD 11.1-PRERELEASE, I am getting this strange behaviour.
>>
>> When using a single CPU core the image works, but when using
>> 2 or more cores the image just doesn’t boot.
>>
>> Any ideas on what else I could add to get more info, at
>> least on the boot screen, which is the only thing I get from AWS while
>> booting?
>>
>> Thanks in advance.
>>
>> Regards.
>
>


