Date: Sun, 18 Apr 2021 10:47:00 +0000
From: "Mark Delany" <n7w@delta.emu.st>
To: freebsd-hackers@freebsd.org
Subject: Various problems with 13.0 amd64 on vultr.com
Message-ID: <0.2.0-final-1618742820.474-0x878fa2@qmda.emu.st>
Hi all. I rarely if ever post here so if there's a better place, LMK.

I've been running 12.2 on vultr.com instances for a long time without any issues. However, I recently attempted an upgrade to 13.0 and the system now exhibits a number of issues.

The most critical issue is that the system randomly wedges after running for a while (anywhere from 10 minutes to a couple of hours), requiring a reboot to recover. No console response or messages and limited network response (see below). No messages logged anywhere as best I can tell.

The second issue is more annoying than critical: the system doesn't reboot with the reboot/shutdown commands. The shutdown sequence seems to complete but the reboot never occurs. I compiled and ran a "reboot(RB_AUTOBOOT | RB_VERBOSE)" but nothing interesting showed up.

I have no idea whether the two issues are related, except that neither occurs with 12.2.

Some details:

- I first upgraded with freebsd-update and then tried with a fresh ISO image and completely overwrote the original file system.
- I've tried both UFS and ZFS root file systems.
- I tried with a fresh VM instance in case there was some sort of per-instance glitch.
- The system is 99% idle with no memory pressure. It normally runs nsd, openntpd and a few other processes installed via pkg, but nothing weird as best I can tell.
- It has no kernel modules manually loaded.
- It's configured with IPv4 and IPv6, and when it gets wedged I get a ping response from the IPv6 address, but not from IPv4. Furthermore, if I try a TCP connection to IPv6 I get a connection setup, but no data.
- The VM is configured as a single-CPU system.
- I haven't raised the issue with vultr yet. Thought I'd see what the hive-mind thinks first.

Not that it will surprise anyone, but I recently spun up 13.0 in VirtualBox on a lab machine as well as on a different VM provider without any problems, so it's probably something relatively unique to vultr.
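The IPv4 versus IPv6 behaviour noted in the details above (connection setup but no data) could be logged over time from another machine with a small probe. What follows is only a minimal sketch in Python, not something taken from the original post: the function name `probe`, the three result labels and the timeout are assumptions made for illustration, and the host and port are whatever TCP service the instance happens to expose. It assumes the service sends something after connect, or after the optional payload is sent.

```python
import socket
import sys


def probe(host, port, family, timeout=5.0, payload=b""):
    """Classify one TCP probe as 'no-connect', 'connect-no-data' or 'data'.

    family is socket.AF_INET or socket.AF_INET6. The labels are
    hypothetical names chosen for this sketch.
    """
    try:
        infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    except socket.gaierror:
        # No address of the requested family could be resolved.
        return "no-connect"
    for fam, socktype, proto, _canon, addr in infos:
        try:
            with socket.socket(fam, socktype, proto) as s:
                s.settimeout(timeout)
                s.connect(addr)
                try:
                    if payload:
                        s.sendall(payload)
                    data = s.recv(1)
                except OSError:
                    # Handshake completed, but nothing usable came back
                    # (timeout or reset while waiting for data).
                    return "connect-no-data"
                # An empty read means the peer closed without sending.
                return "data" if data else "connect-no-data"
        except OSError:
            # Connect failed or timed out; try the next address, if any.
            continue
    return "no-connect"


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical): python probe.py HOST PORT
    target, target_port = sys.argv[1], int(sys.argv[2])
    for label, fam in (("ipv4", socket.AF_INET), ("ipv6", socket.AF_INET6)):
        print(label, probe(target, target_port, fam))
```

Run from cron or a loop on a second host, the two labels per run would give a rough timeline of when each address family stops answering, which may help narrow down when the wedge begins.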
That this is a virtually idle system on a single CPU with no oddball or unusual kernel modules or network configs makes the situation surprising to me. There is no pattern that I'm yet able to discern.

The main thing I have left to try is to boot the system without any networking activated, but apart from that I'm out of ideas in terms of identifying the root cause.

So my questions are:

1. Anyone else having the same issue? Or not having the same issue?
2. Clues on how to diagnose?

This is a non-critical system so I can try anything that anyone suggests, but I'm not particularly familiar with kernel-level debugging, so a bit of hand-holding might be needed if you have suggestions.

For those unfamiliar with vultr's VMs, here's the first part of dmesg:

FreeBSD 13.0-RELEASE #0 releng/13.0-n244733-ea31abc261f: Fri Apr 9 04:24:09 UTC 2021
    root@releng1.nyi.freebsd.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
FreeBSD clang version 11.0.1 (git@github.com:llvm/llvm-project.git llvmorg-11.0.1-0-g43ff75f2c3fe)
VT(vga): text 80x25
CPU: Intel Xeon Processor (Cascadelake) (2993.02-MHz K8-class CPU)
  Origin="GenuineIntel" Id=0x50656 Family=0x6 Model=0x55 Stepping=6
  Features=0x783fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE,SSE2>
  Features2=0xfffa3203<SSE3,PCLMULQDQ,SSSE3,FMA,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND,HV>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x21<LAHF,ABM>
  Structured Extended Features=0xd18307a9<FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,AVX512F,AVX512DQ,CLFLUSHOPT,CLWB,AVX512CD,AVX512BW,AVX512VL>
  Structured Extended Features2=0x808<PKU,AVX512VNNI>
  Structured Extended Features3=0xa4000000<IBPB,ARCH_CAP,SSBD>
  XSAVE Features=0x1<XSAVEOPT>
  IA32_ARCH_CAPS=0x2b<RDCL_NO,IBRS_ALL,SKIP_L1DFL_VME,MDS_NO>
Hypervisor: Origin = "KVMKVMKVM"
real memory = 1073741824 (1024 MB)
avail memory = 997744640 (951 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <BOCHS BXPCAPIC>
random: registering fast source Intel Secure Key RNG
random: fast provider: "Intel Secure Key RNG"
random: unblocking device.
ioapic0 <Version 1.1> irqs 0-23
Timecounter "TSC-low" frequency 1496510010 Hz quality 800

in case it shows up anything odd to those who can decode this sort of stuff.

Mark.