Date: Tue, 16 Jun 2009 15:01:45 -0800
From: Mel Flynn <mel.flynn+fbsd.hackers@mailing.thruhere.net>
To: freebsd-hackers@freebsd.org
Subject: Re: How best to debug locking/scheduler problems
Message-ID: <200906161501.45411.mel.flynn%2Bfbsd.hackers@mailing.thruhere.net>
In-Reply-To: <200906161502.42741.jhb@freebsd.org>
References: <200906151353.06630.mel.flynn%2Bfbsd.hackers@mailing.thruhere.net> <200906160952.24895.mel.flynn%2Bfbsd.hackers@mailing.thruhere.net> <200906161502.42741.jhb@freebsd.org>

On Tuesday 16 June 2009 11:02:42 John Baldwin wrote:
> On Tuesday 16 June 2009 1:52:23 pm Mel Flynn wrote:
> > Hi John,
> >
> > On Tuesday 16 June 2009 04:19:57 John Baldwin wrote:
> > > On Monday 15 June 2009 5:53:05 pm Mel Flynn wrote:
> > > >   PID    TID COMM             TDNAME           KSTACK
> > > >  4283 100215 kdeinit4         -                mi_switch
> > > > turnstile_wait _mtx_lock_sleep uipc_peeraddr kern_getpeername
> > > > getpeername syscall Xint0x80_syscall
> > > > % ps -ww 4283
> > > >   PID  TT  STAT      TIME COMMAND
> > > >  4283  ??  T      0:00.38 kdeinit4: kdeinit4: kio_http http
> > > > local:/tmp/ksocket-mel/klauncherxJ1635.slave-socket
> > > > local:/tmp/ksocket-mel/plasmayC1653.slave-socket (kdeinit4)
> > > >
> > > > %ls -l /tmp/ksocket-mel/
> > > >
> > > > total 2
> > > > -rw-rw-r--  1 mel  wheel  62 Jun 14 22:55 KSMserver__0
> > > > srw-------  1 mel  wheel   0 Jun 14 22:55 kdeinit4__0
> > > > srwxrwxr-x  1 mel  wheel   0 Jun 14 22:55
> > > > klauncherxJ1635.slave-socket
> > >
> > > You can use kgdb and the scripts at www.freebsd.org/~jhb/gdb. Simply
> > > run 'kgdb' as root and do 'lcd /folder/with/scripts' and 'source gdb6'.
> > > You can then do 'lockchain 4283' to find who holds the lock this thread
> > > is blocked on and what state they are in.
> >
> > Looks like a deadlock:
> >
> > (kgdb) lockchain 4283
> > thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 "unp_mtx"
> > thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 "unp_mtx"
> > thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 "unp_mtx"
> > thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 "unp_mtx"
> > thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 "unp_mtx"
> > thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 "unp_mtx"
> > thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 "unp_mtx"
> > thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 "unp_mtx"
> > thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 "unp_mtx"
> > thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 "unp_mtx"
> > thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 "unp_mtx"
> > thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 "unp_mtx"
> > thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 "unp_mtx"
> > thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 "unp_mtx"
> > thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 "unp_mtx"
> > thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 "unp_mtx"
> > thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 "unp_mtx"
> > thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 "unp_mtx"
> > thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 "unp_mtx"
> > thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 "unp_mtx"
> > DEADLOCK
> >
> > Looking through the scripts now to see how I can get more info on the
> > call chain and hoping I don't panic the machine ;). It is quite random to
> > reproduce.
>
> In kgdb you can simply do 'tid 100122' followed by 'bt' and 'tid 100215'
> followed by 'bt'.

Cool, thanks for helping John.
Of course it pretty much shows me what procstat -k shows and I can't get any
info on the userland part, but I can fully inspect the locks and threads.
Both threads are in TDS_INHIBITED state, and blocked on:

(kgdb) frame 0
#0  sched_switch (td=0xc5971240, newtd=0xc4d39900, flags=259)
    at /usr/src/sys/kern/sched_ule.c:1864
1864            cpuid = PCPU_GET(cpuid);
print newtd->td_name
$9 = "idle: cpu0\000\000\000\000\000\000\000\000\000"

Is there anything you want to see to shed some light on why these threads
might be deadlocked?

This is an 8-current kernel. I have seen this issue for a while (since before
March for sure), but the running one is FreeBSD 8.0-CURRENT #2 r194183: Sun
Jun 14 15:09:27 AKDT 2009. Not a GENERIC, basically stripped I[45]86_CPU,
SCTP, unused hardware, no PRINTF_BUFFER_SIZE, added wpi, ichsmb and smbus,
mmc, mmcsd and sdhci, HWPMC. Config inlined below sig.
-- 
Mel

#
# GENERIC -- Generic kernel configuration file for FreeBSD/i386
#
# For more information on this file, please read the config(5) manual page,
# and/or the handbook section on Kernel Configuration Files:
#
#    http://www.FreeBSD.org/doc/en_US.ISO8859-1/books/handbook/kernelconfig-config.html
#
# The handbook is also available locally in /usr/share/doc/handbook
# if you've installed the doc distribution, otherwise always see the
# FreeBSD World Wide Web server (http://www.FreeBSD.org/) for the
# latest information.
#
# An exhaustive list of options and more detailed explanations of the
# device lines is also present in the ../../conf/NOTES and NOTES files.
# If you are in doubt as to the purpose or necessity of a line, check first
# in NOTES.
#
# Used:
# $FreeBSD: src/sys/i386/conf/GENERIC,v 1.511 2009/03/19 20:33:26 thompsa Exp $
# $FreeBSD: src/sys/conf/NOTES,v 1.1534 2009/04/15 22:38:22 marcel Exp $
#
# This file:
# $Coar: kernels/8.x/i386/SMOOCHIES,v 1.1 2009/04/17 11:50:10 mel Exp $

cpu             I686_CPU
ident           SMOOCHIES

# To statically compile in device wiring instead of /boot/device.hints
#hints          "GENERIC.hints"         # Default places to look for devices.

# Use the following to compile in values accessible to the kernel
# through getenv() (or kenv(1) in userland). The format of the file
# is 'variable=value', see kenv(1)
#
# env           "GENERIC.env"

makeoptions     DEBUG=-g                # Build kernel with gdb(1) debug symbols

#
# MAXPHYS and DFLTPHYS
#
# These are the max and default 'raw' I/O block device access sizes.
# Reads and writes will be split into DFLTPHYS chunks. Some applications
# have better performance with larger raw I/O access sizes. Typically
# MAXPHYS should be twice the size of DFLTPHYS. Note that certain VM
# parameters are derived from these values and making them too large
# can make an unbootable kernel.
#
# The defaults are 64K and 128K respectively.
#options        DFLTPHYS=(128*1024)
#options        MAXPHYS=(256*1024)

options         INCLUDE_CONFIG_FILE     # Include this file in kernel
options         SCHED_ULE               # ULE scheduler
options         PREEMPTION              # Enable kernel thread preemption
options         INET                    # InterNETworking
options         INET6                   # IPv6 communications protocols
options         FFS                     # Berkeley Fast Filesystem
options         SOFTUPDATES             # Enable FFS soft updates support
options         UFS_ACL                 # Support for access control lists
options         UFS_DIRHASH             # Improve performance on big directories
options         UFS_GJOURNAL            # Enable gjournal-based UFS journaling
options         MD_ROOT                 # MD is a potential root device
options         NFSCLIENT               # Network Filesystem Client
options         NFSSERVER               # Network Filesystem Server
options         NFSLOCKD                # Network Lock Manager
options         NFS_ROOT                # NFS usable as /, requires NFSCLIENT
options         MSDOSFS                 # MSDOS Filesystem
options         CD9660                  # ISO 9660 Filesystem
options         PROCFS                  # Process filesystem (requires PSEUDOFS)
options         PSEUDOFS                # Pseudo-filesystem framework
options         GEOM_PART_GPT           # GUID Partition Tables.
options         GEOM_LABEL              # Provides labelization
options         COMPAT_43TTY            # BSD 4.3 TTY compat (sgtty)
options         COMPAT_FREEBSD4         # Compatible with FreeBSD4
options         COMPAT_FREEBSD5         # Compatible with FreeBSD5
options         COMPAT_FREEBSD6         # Compatible with FreeBSD6
options         COMPAT_FREEBSD7         # Compatible with FreeBSD7
options         SCSI_DELAY=5000         # Delay (in ms) before probing SCSI
options         KTRACE                  # ktrace(1) support
options         STACK                   # stack(9) support
options         SYSVSHM                 # SYSV-style shared memory
options         SYSVMSG                 # SYSV-style message queues
options         SYSVSEM                 # SYSV-style semaphores
options         _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
options         KBD_INSTALL_CDEV        # install a CDEV entry in /dev
options         STOP_NMI                # Stop CPUS using NMI instead of IPI
options         HWPMC_HOOKS             # Necessary kernel hooks for hwpmc(4)
options         AUDIT                   # Security event auditing
#options        KDTRACE_HOOKS           # Kernel DTrace hooks

# To make an SMP kernel, the next two lines are needed
options         SMP                     # Symmetric MultiProcessor Kernel
device          apic                    # I/O APIC

# CPU frequency control
device          cpufreq

# Bus support.
device          acpi
device          eisa
device          pci

# ATA and ATAPI devices
device          ata
device          atadisk                 # ATA disk drives
device          ataraid                 # ATA RAID drives
device          atapicd                 # ATAPI CDROM drives
device          atapifd                 # ATAPI floppy drives
device          atapist                 # ATAPI tape drives
options         ATA_STATIC_ID           # Static device numbering

# SCSI peripherals
device          scbus                   # SCSI bus (required for SCSI)
device          ch                      # SCSI media changers
device          da                      # Direct Access (disks)
device          sa                      # Sequential Access (tape etc)
device          cd                      # CD
device          pass                    # Passthrough device (direct SCSI access)
device          ses                     # SCSI Environmental Services (and SAF-TE)

# atkbdc0 controls both the keyboard and the PS/2 mouse
device          atkbdc                  # AT keyboard controller
device          atkbd                   # AT keyboard
device          psm                     # PS/2 mouse

device          kbdmux                  # keyboard multiplexer

device          vga                     # VGA video card driver

device          splash                  # Splash screen and screen saver support

# syscons is the default console driver, resembling an SCO console
device          sc

device          agp                     # support several AGP chipsets

# Power management support (see NOTES for more options)
#device         apm
# Add suspend/resume support for the i8254.
device          pmtimer

# PCCARD (PCMCIA) support
# PCMCIA and cardbus bridge support
device          cbb                     # cardbus (yenta) bridge
device          pccard                  # PC Card (16-bit) bus
device          cardbus                 # CardBus (32-bit) bus

# Serial (COM) ports
device          uart                    # Generic UART driver

# If you've got a "dumb" serial or parallel PCI card that is
# supported by the puc(4) glue driver, uncomment the following
# line to enable it (connects to sio, uart and/or ppc drivers):
#device         puc

# PCI Ethernet NICs.
device          em                      # Intel PRO/1000 Gigabit Ethernet Family
device          igb                     # Intel PRO/1000 PCIE Server Gigabit Family
device          ixgb                    # Intel PRO/10GbE Ethernet Card

# PCI Ethernet NICs that use the common MII bus controller code.
# NOTE: Be sure to keep the 'device miibus' line in order to use these NICs!
device          miibus                  # MII bus support

# Wireless NIC cards
device          wlan                    # 802.11 support
options         IEEE80211_DEBUG         # enable debug msgs
options         IEEE80211_AMPDU_AGE     # age frames in AMPDU reorder q's
device          wlan_wep                # 802.11 WEP support
device          wlan_ccmp               # 802.11 CCMP support
device          wlan_tkip               # 802.11 TKIP support
device          wlan_amrr               # AMRR transmit rate control algorithm
device          wpi                     # Intel 3945ABG

# Pseudo devices.
device          loop                    # Network loopback
device          random                  # Entropy device
device          ether                   # Ethernet support
device          tun                     # Packet tunnel.
device          pty                     # BSD-style compatibility pseudo ttys
device          md                      # Memory "disks"
device          gif                     # IPv6 and IPv4 tunneling
device          faith                   # IPv6-to-IPv4 relaying (translation)
device          firmware                # firmware assist module

# The `bpf' device enables the Berkeley Packet Filter.
# Be aware of the administrative consequences of enabling this!
# Note that 'bpf' is required for DHCP.
device          bpf                     # Berkeley packet filter

# USB support
device          uhci                    # UHCI PCI->USB interface
device          ohci                    # OHCI PCI->USB interface
device          ehci                    # EHCI PCI->USB interface (USB 2.0)
device          usb                     # USB Bus (required)
#device         udbp                    # USB Double Bulk Pipe devices
device          uhid                    # "Human Interface Devices"
device          ukbd                    # Keyboard
device          ulpt                    # Printer
device          umass                   # Disks/Mass storage - Requires scbus and da
device          ums                     # Mouse
device          urio                    # Diamond Rio 500 MP3 player
# USB Serial devices
device          u3g                     # USB-based 3G modems (Option, Huawei, Sierra)
device          uark                    # Technologies ARK3116 based serial adapters
device          ubsa                    # Belkin F5U103 and compatible serial adapters
device          uftdi                   # For FTDI usb serial adapters
device          uipaq                   # Some WinCE based devices
device          uplcom                  # Prolific PL-2303 serial adapters
device          uslcom                  # SI Labs CP2101/CP2102 serial adapters
device          uvisor                  # Visor and Palm devices
device          uvscom                  # USB serial support for DDI pocket's PHS
# USB Ethernet, requires miibus
device          aue                     # ADMtek USB Ethernet
device          axe                     # ASIX Electronics USB Ethernet
device          cdce                    # Generic USB over Ethernet
device          cue                     # CATC USB Ethernet
device          kue                     # Kawasaki LSI USB Ethernet
device          rue                     # RealTek RTL8150 USB Ethernet
device          udav                    # Davicom DM9601E USB

# FireWire support
device          firewire                # FireWire bus code
device          sbp                     # SCSI over FireWire (Requires scbus and da)
device          fwe                     # Ethernet over FireWire (non-standard!)
device          fwip                    # IP over FireWire (RFC 2734,3146)
device          dcons                   # Dumb console driver
device          dcons_crom              # Configuration ROM for dcons

#
# SMB bus
#
# System Management Bus support is provided by the 'smbus' device.
# Access to the SMBus device is via the 'smb' device (/dev/smb*),
# which is a child of the 'smbus' device.
#
# Supported devices:
# smb           standard I/O through /dev/smb*
#
# Supported SMB interfaces:
# iicsmb        I2C to SMB bridge with any iicbus interface
# bktr          brooktree848 I2C hardware interface
# intpm         Intel PIIX4 (82371AB, 82443MX) Power Management Unit
# alpm          Acer Aladdin-IV/V/Pro2 Power Management Unit
# ichsmb        Intel ICH SMBus controller chips (82801AA, 82801AB, 82801BA)
# viapm         VIA VT82C586B/596B/686A and VT8233 Power Management Unit
# amdpm         AMD 756 Power Management Unit
# amdsmb        AMD 8111 SMBus 2.0 Controller
# nfpm          NVIDIA nForce Power Management Unit
# nfsmb         NVIDIA nForce2/3/4 MCP SMBus 2.0 Controller
#
device          smbus                   # Bus support, required for smb below.
device          ichsmb

#
# MMC/SD
#
# mmc           MMC/SD bus
# mmcsd         MMC/SD memory card
# sdhci         Generic PCI SD Host Controller
#
device          mmc
device          mmcsd
device          sdhci
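
For readers following the thread: the 'lockchain' output quoted above alternates between two threads, each blocked on a lock, until the script reports DEADLOCK. The idea behind that kind of check can be sketched in a few lines of Python as a walk over a wait-for chain. This is only an illustration, not the code of the gdb6 scripts. The function name find_lock_cycle and both dictionaries are made up for this sketch, and the lock owners are an inference from the alternating output (the kgdb session quoted here never prints them directly).

```python
def find_lock_cycle(waits_on, owner, start):
    """Follow thread -> lock -> owning thread links from `start`.

    Returns the list of thread ids forming a cycle, or None when the
    chain ends at a thread that is not blocked or a lock with no
    known owner.
    """
    seen = []
    tid = start
    while tid is not None:
        if tid in seen:
            # Back at a thread already on the chain: that is a cycle.
            return seen[seen.index(tid):]
        seen.append(tid)
        lock = waits_on.get(tid)
        if lock is None:
            return None          # this thread is not blocked on a lock
        tid = owner.get(lock)    # None if the owner is unknown
    return None


# Thread ids and lock addresses are taken from the lockchain output.
waits_on = {100215: 0xc64374a0, 100122: 0xc6806348}
# ASSUMPTION: owners inferred from the alternating chain, not printed by kgdb.
owner = {0xc64374a0: 100122, 0xc6806348: 100215}

cycle = find_lock_cycle(waits_on, owner, 100215)
print(cycle)
```

With the two threads from this report the walk returns to thread 100215 after one hop through thread 100122, which matches the two-thread loop that lockchain prints repeatedly.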