Date: Wed, 29 Nov 2017 20:06:42 -0800 From: Cy Schubert <Cy.Schubert@komquats.com> To: Hans Petter Selasky <hselasky@FreeBSD.org> Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r326376 - head/sys/kern Message-ID: <201711300406.vAU46gCX012711@slippy.cwsent.com> In-Reply-To: Message from Hans Petter Selasky <hselasky@FreeBSD.org> of "Wed, 29 Nov 2017 23:28:40 %2B0000." <201711292328.vATNSeOM046518@repo.freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
In message <201711292328.vATNSeOM046518@repo.freebsd.org>, Hans Petter Selasky writes: > Author: hselasky > Date: Wed Nov 29 23:28:40 2017 > New Revision: 326376 > URL: https://svnweb.freebsd.org/changeset/base/326376 > > Log: > The sched_add() function is not only used when the thread is initially > started, but also by the turnstiles to mark a thread as runnable for > all locks, for instance sleepqueues do: > setrunnable()->sched_wakeup()->sched_add() > > In r326218 code was added to allow booting from non-zero CPU numbers > by setting the ts_cpu field inside the ULE scheduler's sched_add() > function. This had an undesired side-effect that prior sched_pin() and > sched_bind() calls got disregarded. This patch fixes the > initialization of the ts_cpu field for the ULE scheduler to only > happen once when the initial thread is constructed during system > init. Forking will then later on ensure that a valid ts_cpu value gets > copied to all children. > > Reviewed by: jhb, kib > Discussed with: nwhitehorn > MFC after: 1 month > Differential revision: https://reviews.freebsd.org/D13298 > Sponsored by: Mellanox Technologies > > Modified: > head/sys/kern/sched_ule.c > > Modified: head/sys/kern/sched_ule.c > ============================================================================= > = > --- head/sys/kern/sched_ule.c Wed Nov 29 21:16:14 2017 (r326375) > +++ head/sys/kern/sched_ule.c Wed Nov 29 23:28:40 2017 (r326376) > @@ -1405,7 +1405,6 @@ sched_setup(void *dummy) > > /* Add thread0's load since it's running. */ > TDQ_LOCK(tdq); > - td_get_sched(&thread0)->ts_cpu = curcpu; /* Something valid to start */ > thread0.td_lock = TDQ_LOCKPTR(TDQ_SELF()); > tdq_load_add(tdq, &thread0); > tdq->tdq_lowpri = thread0.td_priority; > @@ -1642,6 +1641,7 @@ schedinit(void) > ts0->ts_ltick = ticks; > ts0->ts_ftick = ticks; > ts0->ts_slice = 0; > + ts0->ts_cpu = curcpu; /* set valid CPU number */ > } > > /* > @@ -2453,7 +2453,6 @@ sched_add(struct thread *td, int flags) > * Pick the destination cpu and if it isn't ours transfer to the > * target cpu. > */ > - td_get_sched(td)->ts_cpu = curcpu; /* Pick something valid to start */ > cpu = sched_pickcpu(td, flags); > tdq = sched_setcpu(td, cpu, flags); > tdq_add(tdq, td, flags); > Hi Hans, Sorry I didn't get to this sooner: $JOB (and I have a change window in an hour). I sent you a couple of photos from my laptop however I've been able to reproduce this on my testbed. It boots okay without INVARIANTS and and WITNESS however with them it gets a double fault. Type '?' for a list of commands, 'help' for more detailed help. OK include /boot/testbed/amd64-current-t / testbed/amd64-current-t (12.0-CURRENT) loader file selected / unload complete vfs.zfs.arc_max set to 1024M vm.kmem_size set to 1536M vm.kmem_size_max set to 1536M currdev set to disk1s3f: /boot/kernel/kernel text=0xc3262c data=0xc5288+0x3a5048 syms=[0x8+0x116508+0x8+0x11285a] /boot/kernel/if_sk.ko size 0x17cd0 at 0x16c6000 /boot/kernel/if_nfe.ko size 0x1a710 at 0x16de000 /boot/kernel/opensolaris.ko size 0xd6e0 at 0x16f9000 /boot/kernel/zfs.ko size 0x4ef290 at 0x1707000 /boot/kernel/amdtemp.ko size 0x53d0 at 0x1bf7000 loading required module 'amdsmn' /boot/kernel/amdsmn.ko size 0x2d78 at 0x1bfd000 /boot/kernel/cpufreq.ko size 0x168c0 at 0x1c00000 /boot/kernel/tmpfs.ko size 0x203b0 at 0x1c17000 /boot/kernel/fdescfs.ko size 0x9d98 at 0x1c38000 /boot/kernel/nullfs.ko size 0xd308 at 0x1c42000 - new kernel has been loaded OK boot Booting... GDB: no debug ports present KDB: debugger backends: ddb KDB: current backend: ddb Copyright (c) 1992-2017 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD is a registered trademark of The FreeBSD Foundation. FreeBSD 12.0-CURRENT #0 r326382M: Wed Nov 29 19:44:24 PST 2017 root@bob:/export/obj/opt/src/svn-current/amd64.amd64/sys/BREAK4 amd64 FreeBSD clang version 5.0.0 (tags/RELEASE_500/final 312559) (based on LLVM 5.0.0svn) WARNING: WITNESS option enabled, expect reduced performance. VT(vga): text 80x25 CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 5000+ (2612.09-MHz K8-class CPU) Origin="AuthenticAMD" Id=0x60fb2 Family=0xf Model=0x6b Stepping=2 Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,M CA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT> Features2=0x2001<SSE3,CX16> AMD Features=0xea500800<SYSCALL,NX,MMX+,FFXSR,RDTSCP,LM,3DNow!+,3DNow!> AMD Features2=0x11f<LAHF,CMP,SVM,ExtAPIC,CR8,Prefetch> SVM: NAsids=64 real memory = 5368709120 (5120 MB) avail memory = 5134057472 (4896 MB) Event timer "LAPIC" quality 100 ACPI APIC Table: <Nvidia ASUSACPI> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs FreeBSD/SMP: 1 package(s) x 2 core(s) arc4random: no preloaded entropy cache Firmware Warning (ACPI): 32/64X length mismatch in FADT/Pm1aEventBlock: 32/8 (20171110/tbfadt-748) Firmware Warning (ACPI): 32/64X length mismatch in FADT/Pm1aControlBlock: 16/8 (20171110/tbfadt-748) Firmware Warning (ACPI): 32/64X length mismatch in FADT/PmTimerBlock: 32/8 (20171110/tbfadt-748) Firmware Warning (ACPI): 32/64X length mismatch in FADT/Gpe0Block: 64/8 (20171110/tbfadt-748) Firmware Warning (ACPI): 32/64X length mismatch in FADT/Gpe1Block: 128/8 (20171110/tbfadt-748) Firmware Warning (ACPI): Invalid length for FADT/Pm1aEventBlock: 8, using default 32 (20171110/tbfadt-850) Firmware Warning (ACPI): Invalid length for FADT/Pm1aControlBlock: 8, using default 16 (20171110/tbfadt-850) Firmware Warning (ACPI): Invalid length for FADT/PmTimerBlock: 8, using default 32 (20171110/tbfadt-850) ioapic0: Changing APIC ID to 4 ioapic0 <Version 1.1> irqs 0-23 on motherboard random: entropy device external interface module_register_init: MOD_LOAD (vesa, 0xffffffff80cad0f0, 0) error 19 nexus0 vtvga0: <VT VGA driver> on motherboard cryptosoft0: <software crypto> on motherboard acpi0: <Nvidia ASUSACPI> on motherboard ACPI Error: Could not find/resolve named package element: \_PR_.CPU0 (20171110/dspkginit-513) acpi0: Power Button (fixed) cpu0: <ACPI CPU> on acpi0 cpu1: <ACPI CPU> on acpi0 attimer0: <AT timer> port 0x40-0x43 on acpi0 Timecounter "i8254" frequency 1193182 Hz quality 0 Event timer "i8254" frequency 1193182 Hz quality 100 hpet0: <High Precision Event Timer> iomem 0xfefff000-0xfefff3ff irq 0,8 on acpi0 Timecounter "HPET" frequency 25000000 Hz quality 950 atrtc0: <AT realtime clock> port 0x70-0x73 on acpi0 atrtc0: registered as a time-of-day clock, resolution 1.000000s Event timer "RTC" frequency 32768 Hz quality 0 Timecounter "ACPI-fast" frequency 3579545 Hz quality 900 acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0 acpi_button0: <Power Button> on acpi0 pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0 pci0: <ACPI PCI bus> on pcib0 pci0: <memory, RAM> at device 0.0 (no driver attached) isab0: <PCI-ISA bridge> at device 1.0 on pci0 isa0: <ISA bus> on isab0 ohci0: <nVidia nForce MCP55 USB Controller> mem 0xfe02f000-0xfe02ffff irq 21 at device 2.0 on pci0 usbus0 on ohci0 ehci0: <NVIDIA nForce MCP55 USB 2.0 controller> mem 0xfe02e000-0xfe02e0ff irq 22 at device 2.1 on pci0 usbus1: EHCI version 1.0 usbus1 on ehci0 atapci0: <nVidia nForce MCP55 UDMA133 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xfc00-0xfc0f at device 4.0 on pci0 ata0: <ATA channel> at channel 0 on atapci0 ata1: <ATA channel> at channel 1 on atapci0 atapci1: <nVidia nForce MCP55 SATA300 controller> port 0x9f0-0x9f7,0xbf0-0xbf3,0x970-0x977,0xb70-0xb73,0xf700-0xf70f mem 0xfe02d000-0xfe02dfff irq 23 at device 5.0 on pci0 ata2: <ATA channel> at channel 0 on atapci1 ata3: <ATA channel> at channel 1 on atapci1 atapci2: <nVidia nForce MCP55 SATA300 controller> port 0x9e0-0x9e7,0xbe0-0xbe3,0x960-0x967,0xb60-0xb63,0xf200-0xf20f mem 0xfe02c000-0xfe02cfff irq 20 at device 5.1 on pci0 ata4: <ATA channel> at channel 0 on atapci2 ata5: <ATA channel> at channel 1 on atapci2 atapci3: <nVidia nForce MCP55 SATA300 controller> port 0xf100-0xf107,0xf000-0xf003,0xef00-0xef07,0xee00-0xee03,0xed00-0xed0f mem 0xfe02b000-0xfe02bfff irq 21 at device 5.2 on pci0 ata6: <ATA channel> at channel 0 on atapci3 ata7: <ATA channel> at channel 1 on atapci3 pcib1: <ACPI PCI-PCI bridge> at device 6.0 on pci0 pci1: <ACPI PCI bus> on pcib1 pci1: <network, ethernet> at device 6.0 (no driver attached) skc0: <D-Link DGE-530T Gigabit Ethernet> port 0xd800-0xd8ff mem 0xfdef8000-0xfdefbfff irq 17 at device 7.0 on pci1 skc0: DGE-530T Gigabit Ethernet Adapter rev. (0x1) sk0: <Marvell Semiconductor, Inc. Yukon> on skc0 sk0: Ethernet address: 00:13:46:73:0c:0d miibus0: <MII bus> on sk0 e1000phy0: <Marvell 88E1011 Gigabit PHY> PHY 0 on miibus0 e1000phy0: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto pci0: <multimedia, HDA> at device 6.1 (no driver attached) nfe0: <NVIDIA nForce MCP55 Networking Adapter> port 0xec00-0xec07 mem 0xfe02a000-0xfe02afff,0xfe029000-0xfe0290ff,0xfe028000-0xfe02800f irq 23 at device 8.0 on pci0 miibus1: <MII bus> on nfe0 e1000phy1: <Marvell 88E1116 Gigabit PHY> PHY 1 on miibus1 e1000phy1: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow nfe0: Using defaults for TSO: 65518/35/2048 nfe0: Ethernet address: 00:18:f3:a0:dc:a4 pcib2: <ACPI PCI-PCI bridge> at device 15.0 on pci0 pci2: <ACPI PCI bus> on pcib2 vgapci0: <VGA-compatible display> mem 0xe8000000-0xefffffff,0xfddfe000-0xfdd fffff irq 16 at device 0.0 on pci2 vgapci0: Boot video device amdtemp0: <AMD CPU On-Die Thermal Sensors> on hostb3 acpi_tz0: <Thermal Zone> on acpi0 fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0 fd0: <1440-KB 3.5" drive> on fdc0 drive 0 uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0 uart0: console (9600,n,8,1) orm0: <ISA Option ROM> at iomem 0xc0000-0xcc7ff on isa0 vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0 atkbd0: <AT Keyboard> irq 1 on atkbdc0 kbd0 at atkbd0 atkbd0: [GIANT-LOCKED] powernow0: <PowerNow! K8> on cpu0 powernow1: <PowerNow! K8> on cpu1 ZFS filesystem version: 5 ZFS storage pool version: features support (5000) Timecounters tick every 1.000 msec usbus0: 12Mbps Full Speed USB v1.0 usbus1: 480Mbps High Speed USB v2.0 ugen0.1: <nVidia OHCI root HUB> at usbus0 uhub0: <nVidia OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0 ugen1.1: <nVidia EHCI root HUB> at usbus1 uhub1: <nVidia EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1 uhub0: 10 ports with 10 removable, self powered uhub1: 10 ports with 10 removable, self powered ada0 at ata0 bus 0 scbus0 target 0 lun 0 ada0: <SAMSUNG SP0802N TK100-24> ATA-7 device ada0: Serial Number S00JJ20X889726 ada0: 100.000MB/s transfers (UDMA5, PIO 8192bytes) ada0: 76351MB (156368016 512 byte sectors) ada1 at ata0 bus 0 scbus0 target 1 lun 0 ada1: <Maxtor 6Y120P0 YAR41BW0> ATA-7 device ada1: Serial Number Y327ZJ1E ada1: 133.000MB/s transfers (UDMA6, PIO 8192bytes) ada1: 117246MB (240121728 512 byte sectors) ada2 at ata2 bus 0 scbus2 target 0 lun 0 ada2: <ST3500413AS JC4B> ATA8-ACS SATA 3.x device ada2: Serial Number 6VMRAGM7 ada2: 300.000MB/s transfers (SATA 2.x, UDMA5, PIO 8192bytes) ada2: 476940MB (976773168 512 byte sectors) ada3 at ata3 bus 0 scbus3 target 0 lun 0 ada3: <ST3320620AS 3.ADG> ATA-7 SATA 2.x device ada3: Serial Number 9QFA0729 ada3: 300.000MB/s transfers (SATA 2.x, UDMA5, PIO 8192bytes) ada3: 305245MB (625142448 512 byte sectors) SMP: AP CPU #1 Launched! random: unblocking device. WARNING: WITNESS option enabled, expect reduced performance. arc4random: no preloaded entropy cache arc4random: no preloaded entropy cache Trying to mount root from ufs:/dev/ada0s3f [rw]... sysctl: kern.ipc.nmbclusters=32768 at line 28: Invalid argument Setting hostuuid: 3428e43e-d455-9478-f48e-14a3b4bb94dd. Setting hostid: 0x448c33e0. Fatal double fault rip 0xffffffff80793105 rsp 0xfffffe0036358fe0 rbp 0xfffffe0036359150 rax 0 rdx 0xfffffe00363590f8 rbx 0x2 rcx 0x1 rsi 0x2 rdi 0 r8 0xfffffe00363592f0 r9 0xfffffe0036359350 r10 0 r11 0 r12 0x60 r13 0xfffffe003635b200 r14 0xfffffe0036359d30 r15 0x60 rflags 0x10002 cs 0x20 ss 0x28 ds 0x3b es 0x3b fs 0x13 gs 0x1b fsbase 0x80063f8b0 gsbase 0xffffffff8142be80 kgsbase 0 cpuid = 1; apic id = 01 panic: double fault cpuid = 1 time = 1511985616 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2c/frame 0xfffffe00003fdb40 kdb_backtrace() at kdb_backtrace+0x53/frame 0xfffffe00003fdc10 vpanic() at vpanic+0x268/frame 0xfffffe00003fdce0 kproc_shutdown() at kproc_shutdown/frame 0xfffffe00003fdd40 dblfault_handler() at dblfault_handler+0x387/frame 0xfffffe00003fdf30 Xdblfault() at Xdblfault+0xac/frame 0xfffffe00003fdf30 --- trap 0x17, rip = 0xffffffff80793105, rsp = 0xfffffe0036358fe0, rbp = 0xfffffe0036359150 --- cpu_search_lowest() at cpu_search_lowest+0x1b5/frame 0xfffffe0036359150 cpu_search_lowest() at cpu_search_lowest+0x35e/frame 0xfffffe00363592d0 sched_lowest() at sched_lowest+0x66/frame 0xfffffe0036359340 sched_pickcpu() at sched_pickcpu+0x5af/frame 0xfffffe0036359470 sched_add() at sched_add+0x111/frame 0xfffffe00363594e0 sched_wakeup() at sched_wakeup+0x10a/frame 0xfffffe0036359530 setrunnable() at setrunnable+0x1c8/frame 0xfffffe0036359580 sleepq_resume_thread() at sleepq_resume_thread+0x634/frame 0xfffffe0036359600 sleepq_remove_matching() at sleepq_remove_matching+0xa9/frame 0xfffffe0036359660 sleepq_broadcast() at sleepq_broadcast+0x142/frame 0xfffffe00363596a0 wakeup() at wakeup+0x29/frame 0xfffffe00363596d0 g_io_request() at g_io_request+0x739/frame 0xfffffe0036359750 vdev_geom_io_start() at vdev_geom_io_start+0x582/frame 0xfffffe0036359800 zio_vdev_io_start() at zio_vdev_io_start+0x962/frame 0xfffffe0036359950 zio_execute() at zio_execute+0x28a/frame 0xfffffe00363599b0 zio_nowait() at zio_nowait+0x129/frame 0xfffffe0036359a00 vdev_cache_read() at vdev_cache_read+0x47c/frame 0xfffffe0036359b20 zio_vdev_io_start() at zio_vdev_io_start+0x82a/frame 0xfffffe0036359c70 zio_execute() at zio_execute+0x28a/frame 0xfffffe0036359cd0 zio_nowait() at zio_nowait+0x129/frame 0xfffffe0036359d20 vdev_mirror_io_start() at vdev_mirror_io_start+0x28a/frame 0xfffffe0036359dc0 zio_vdev_io_start() at zio_vdev_io_start+0x11f/frame 0xfffffe0036359f10 zio_execute() at zio_execute+0x28a/frame 0xfffffe0036359f70 zio_nowait() at zio_nowait+0x129/frame 0xfffffe0036359fc0 arc_read() at arc_read+0x1e8f/frame 0xfffffe003635a380 traverse_prefetch_metadata() at traverse_prefetch_metadata+0x21b/frame 0xfffffe003635a3f0 traverse_visitbp() at traverse_visitbp+0x831/frame 0xfffffe003635a600 traverse_visitbp() at traverse_visitbp+0x8d6/frame 0xfffffe003635a810 traverse_visitbp() at traverse_visitbp+0x8d6/frame 0xfffffe003635aa20 traverse_visitbp() at traverse_visitbp+0x8d6/frame 0xfffffe003635ac30 traverse_visitbp() at traverse_visitbp+0x8d6/frame 0xfffffe003635ae40 traverse_dnode() at traverse_dnode+0x176/frame 0xfffffe003635aee0 traverse_visitbp() at traverse_visitbp+0xde8/frame 0xfffffe003635b0f0 traverse_impl() at traverse_impl+0x73a/frame 0xfffffe003635b320 traverse_dataset_resume() at traverse_dataset_resume+0x92/frame 0xfffffe003635b390 traverse_dataset() at traverse_dataset+0x43/frame 0xfffffe003635b3d0 traverse_pool() at traverse_pool+0x1da/frame 0xfffffe003635b4d0 spa_load_verify() at spa_load_verify+0x15e/frame 0xfffffe003635b580 spa_load_impl() at spa_load_impl+0x2241/frame 0xfffffe003635ba30 spa_load() at spa_load+0x301/frame 0xfffffe003635baf0 spa_load_impl() at spa_load_impl+0x1445/frame 0xfffffe003635bfa0 spa_load() at spa_load+0x301/frame 0xfffffe003635c060 spa_load_best() at spa_load_best+0xbd/frame 0xfffffe003635c100 spa_open_common() at spa_open_common+0x228/frame 0xfffffe003635c1f0 spa_open() at spa_open+0x35/frame 0xfffffe003635c220 pool_status_check() at pool_status_check+0x8e/frame 0xfffffe003635c280 zfsdev_ioctl() at zfsdev_ioctl+0xb9d/frame 0xfffffe003635c540 devfs_ioctl() at devfs_ioctl+0x187/frame 0xfffffe003635c5e0 VOP_IOCTL_APV() at VOP_IOCTL_APV+0x227/frame 0xfffffe003635c660 VOP_IOCTL() at VOP_IOCTL+0x6e/frame 0xfffffe003635c6e0 vn_ioctl() at vn_ioctl+0x1cc/frame 0xfffffe003635c820 devfs_ioctl_f() at devfs_ioctl_f+0x62/frame 0xfffffe003635c870 fo_ioctl() at fo_ioctl+0x4c/frame 0xfffffe003635c8b0 kern_ioctl() at kern_ioctl+0x361/frame 0xfffffe003635c980 sys_ioctl() at sys_ioctl+0x2af/frame 0xfffffe003635ca60 syscallenter() at syscallenter+0x4e4/frame 0xfffffe003635cb30 amd64_syscall() at amd64_syscall+0x1b/frame 0xfffffe003635cbf0 Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe003635cbf0 --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x80143b5aa, rsp = 0x7fffffffcf78, rbp = 0x7fffffffcff0 --- KDB: enter: panic [ thread pid 44 tid 100102 ] Stopped at breakpoint+0x5: popq %rbp db> The type of CPU doesn't matter. The testbed (above) is an AMD X2 5000+ The laptop is a core i3. I'll poke around at it after tonight's change window, before bed. -- Cheers, Cy Schubert <Cy.Schubert@cschubert.com> FreeBSD UNIX: <cy@FreeBSD.org> Web: http://www.FreeBSD.org The need of the many outweighs the greed of the few.
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201711300406.vAU46gCX012711>