From owner-freebsd-questions@FreeBSD.ORG Thu Jan 13 20:38:52 2011 Return-Path: Delivered-To: freebsd-questions@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0853A10656C7; Thu, 13 Jan 2011 20:38:52 +0000 (UTC) (envelope-from dieterbsd@engineer.com) Received: from imr-ma05.mx.aol.com (imr-ma05.mx.aol.com [64.12.100.31]) by mx1.freebsd.org (Postfix) with ESMTP id AE1198FC20; Thu, 13 Jan 2011 20:38:51 +0000 (UTC) Received: from imo-ma02.mx.aol.com (imo-ma02.mx.aol.com [64.12.78.137]) by imr-ma05.mx.aol.com (8.14.1/8.14.1) with ESMTP id p0DKSUQa003800; Thu, 13 Jan 2011 15:28:30 -0500 Received: from dieterbsd@engineer.com by imo-ma02.mx.aol.com (mail_out_v42.9.) id n.1028.3d458c0 (34918); Thu, 13 Jan 2011 15:28:26 -0500 (EST) Received: from smtprly-me03.mx.aol.com (smtprly-me03.mx.aol.com [64.12.95.104]) by cia-da03.mx.aol.com (v129.7) with ESMTP id MAILCIADA033-b2d24d2f605f35; Thu, 13 Jan 2011 15:28:26 -0500 Received: from web-mmc-d03 (web-mmc-d03.sim.aol.com [205.188.103.93]) by smtprly-me03.mx.aol.com (v129.5) with ESMTP id MAILSMTPRLYME031-b2d24d2f605f35; Thu, 13 Jan 2011 15:28:15 -0500 To: freebsd-questions@freebsd.org, freebsd-firewire@freebsd.org, freebsd-usb@freebsd.org Content-Transfer-Encoding: quoted-printable Date: Thu, 13 Jan 2011 15:28:15 -0500 X-AOL-IP: 67.206.162.81 X-MB-Message-Source: WebUI Received: from 67.206.162.81 by web-mmc-d03.sysops.aol.com (205.188.103.93) with HTTP (WebMailUI); Thu, 13 Jan 2011 15:28:15 -0500 MIME-Version: 1.0 From: dieterbsd@engineer.com X-MB-Message-Type: User Content-Type: text/plain; charset="us-ascii"; format=flowed X-Mailer: Mail.com Webmail 33069-STANDARD Message-Id: <8CD81778A4C13E7-17B0-DFA1@web-mmc-d03.sysops.aol.com> X-Spam-Flag: NO X-AOL-SENDER: dieterbsd@engineer.com Cc: freebsd@sopwith.solgatos.com Subject: lock contention problem? X-BeenThere: freebsd-questions@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: User questions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Jan 2011 20:38:52 -0000 I suspect that I have a problem with lock/mutex contention. Reading from a USB disk appears to lock out the firewire driver for too long, causing data transfer (writing to firewire bus) to fail with EAGAIN. Once it fails it does not recover. kernel: fwohci1: IT DMA underrun (0x40308011) (stat &=20 OHCI_CNTL_CYCMATCH_S) last message repeated 63 times This is from the end of the fwohci_itxbuf_enable() function in=20 dev/firewire/fwohci.c I added LOCK_PROFILING to the kernel and gathered some data. The data is quite verbose, so I sorted by "max" and am including the first 40 lines. This is a true real-time task, so I am concerned with the worst case rather than the average. First, data from approx 20 seconds without a problem: debug.lock.prof.enable: 0 debug.lock.prof.reset: 0 debug.lock.prof.stats: max wait_max total wait_total count avg wait_avg=20 cnt_hold cnt_lock name 16990 0 2216486 0 1837 1206 0 0= =20 0 /usr/src/sys/kern/vfs_vnops.c:533 (lockmgr:ufs) 5852 0 11669 0 18 648 0 0= =20 0 /usr/src/sys/kern/vfs_subr.c:1693 (lockmgr:syncer) 1659 0 5522 0 34 162 0 0= =20 0 /usr/src/sys/kern/vfs_mount.c:2231 (sleep mutex:struct mount mtx) 1540 0 30353 0 15283 1 0 0= =20 0 /usr/src/sys/kern/sys_pipe.c:574 (sleep mutex:pipe mutex) 1415 0 1885651 0 53865 35 0 0= =20 0 /usr/src/sys/vm/vm_map.c:3526 (sx:user map) 1385 0 252414 0 53865 4 0 0= =20 0 /usr/src/sys/vm/vm_fault.c:937 (sleep mutex:vm object) 1361 0 84475 0 53865 1 0 0= =20 0 /usr/src/sys/vm/vm_fault.c:938 (sleep mutex:vm page queue mutex) 1044 0 2079 0 2 1039 0 0= =20 0 /usr/src/sys/kern/vfs_mount.c:2194 (sleep mutex:struct mount mtx) 934 0 65031 0 146 445 0 0= =20 0 /usr/src/sys/vm/vm_pager.c:311 (lockmgr:bufwait) 864 0 194008 0 53865 3 0 0= =20 0 /usr/src/sys/amd64/amd64/pmap.c:2958 (sleep mutex:pmap) 864 0 192948 0 53865 3 0 0= =20 0 /usr/src/sys/amd64/amd64/pmap.c:2957 (sleep mutex:vm page queue=20 mutex) 862 0 24690 0 30 823 0 0= =20 0 /usr/src/sys/kern/kern_mutex.c:147 (sleep mutex:nfe0) 857 0 84478 0 53863 1 0 0= =20 0 /usr/src/sys/amd64/amd64/trap.c:740 (sleep mutex:process lock) 697 0 154340 0 3292 46 0 0= =20 0 /usr/src/sys/kern/vfs_bio.c:2559 (lockmgr:bufwait) 557 0 896107 0 13438 66 0 0= =20 0 /usr/src/sys/kern/vfs_bio.c:1835 (lockmgr:bufwait) 519 0 1615 0 11 146 0 0= =20 0 /usr/src/sys/ufs/ffs/ffs_vfsops.c:1321 (sleep mutex:struct mount= =20 mtx) 481 0 24417 0 13438 1 0 0= =20 0 /usr/src/sys/kern/vfs_subr.c:1595 (sleep mutex:bufobj interlock) 367 0 65371 0 9178 7 0 0= =20 0 /usr/src/sys/kern/sys_pipe.c:643 (sleep mutex:pipe mutex) 351 0 2324 0 52 44 0 0= =20 0 /usr/src/sys/kern/vfs_subr.c:2083 (lockmgr:ufs) 333 0 325772 0 53865 6 0 0= =20 0 /usr/src/sys/vm/vm_fault.c:297 (sleep mutex:vm object) 329 0 171709 0 108608 1 0 0= =20 0 /usr/src/sys/amd64/amd64/pmap.c:3989 (sleep mutex:pmap) 329 0 2090 0 534 3 0 0= =20 0 /usr/src/sys/kern/vfs_subr.c:337 (sleep mutex:struct mount mtx) 327 0 99170 0 58917 1 0 0= =20 0 /usr/src/sys/vm/vm_page.c:1052 (sleep mutex:vm page queue free=20 mutex) 305 0 514 0 4 128 0 0= =20 0 /usr/src/sys/vm/vm_object.c:541 (sleep mutex:vm object) 304 0 304 0 1 304 0 0= =20 0 /usr/src/sys/vm/uma_core.c:1565 (sleep mutex:UMA lock) 297 0 359 0 13 27 0 0= =20 0 /usr/src/sys/kern/vfs_bio.c:3611 (sleep mutex:vm object) 296 0 23341 0 983 23 0 0= =20 0 /usr/src/sys/kern/tty_ttydisc.c:467 (sleep mutex:ttymtx) 291 0 448 0 6 74 0 0= =20 0 /usr/src/sys/vm/vm_object.c:1719 (sleep mutex:vm object) 284 0 12090 0 997 12 0 0= =20 0 /usr/src/sys/kern/sys_generic.c:1446 (sleep mutex:select mtxpool) 281 0 8752 0 955 9 0 0= =20 0 /usr/src/sys/kern/sys_generic.c:1455 (sleep mutex:sellck) 265 0 796 0 3 265 0 0= =20 0 /usr/src/sys/vm/vm_glue.c:738 (sx:allproc) 251 74 26825 74 16746 1 0 0= =20 1 /usr/src/sys/kern/vfs_bio.c:1488 (sleep mutex:buf queue lock) 231 0 7470 0 2848 2 0 0= =20 0 /usr/src/sys/geom/geom_io.c:77 (sleep mutex:bio queue) 228 0 228 0 1 228 0 0= =20 0 /usr/src/sys/vm/vm_map.c:2973 (sx:user map) 227 0 543 0 5 108 0 0= =20 0 /usr/src/sys/kern/vfs_subr.c:1693 (lockmgr:devfs) 215 0 840 0 13 64 0 0= =20 0 /usr/src/sys/dev/random/yarrow.c:191 (sleep mutex:random reseed) 202 0 198454 0 13438 14 0 0= =20 0 /usr/src/sys/kern/vfs_bio.c:2963 (sleep mutex:vm object) 199 0 18282 0 3668 4 0 0= =20 0 /usr/src/sys/kern/sys_pipe.c:611 (sleep mutex:pipe mutex) 190 0 1606 0 800 2 0 0= =20 0 /usr/src/sys/kern/vfs_bio.c:3089 (sleep mutex:sleep mtxpool) 189 0 83829 0 53899 1 0 0= =20 0 /usr/src/sys/vm/vm_object.c:482 (sleep mutex:vm object) debug.lock.prof.rejected: 0 debug.lock.prof.skipcount: 0 debug.lock.prof.skipspin: 0 Next, data when the problem occurs: debug.lock.prof.enable: 0 debug.lock.prof.reset: 0 debug.lock.prof.stats: max wait_max total wait_total count avg wait_avg=20 cnt_hold cnt_lock name 1586932 0 10264893 0 2290 4482 0 0= =20 0 /usr/src/sys/kern/vfs_vnops.c:533 (lockmgr:ufs) 35676 0 165709 0 20 8285 0 0= =20 0 /usr/src/sys/ufs/ffs/ffs_vfsops.c:1465 (lockmgr:ufs) 32133 0 80040 0 17541 4 0 0= =20 0 /usr/src/sys/vm/vm_object.c:482 (sleep mutex:vm object) 32129 0 50459 0 276 182 0 0= =20 0 /usr/src/sys/vm/vm_object.c:720 (sleep mutex:vm page queue mutex) 30899 0 47896 0 2 23948 0 0= =20 0 /usr/src/sys/kern/vfs_syscalls.c:3508 (lockmgr:ufs) 30680 0 36901 0 17 2170 0 0= =20 0 /usr/src/sys/kern/imgact_elf.c:774 (lockmgr:ufs) 26779 0 912921 0 9351 97 0 0= =20 0 /usr/src/sys/kern/vfs_bio.c:1835 (lockmgr:bufwait) 25691 0 193858 0 1850 104 0 0= =20 0 /usr/src/sys/kern/vfs_subr.c:2083 (lockmgr:ufs) 25596 0 592159 0 1205 491 0 0= =20 0 /usr/src/sys/vm/vm_pager.c:311 (lockmgr:bufwait) 24791 0 95678 0 24 3986 0 0= =20 0 /usr/src/sys/kern/kern_sysctl.c:1513 (sx:sysctl mem) 24788 0 250975 0 10563 23 0 0= =20 0 /usr/src/sys/kern/kern_sysctl.c:1521 (sx:sysctl lock) 23977 0 68951 0 538 128 0 0= =20 0 /usr/src/sys/geom/geom_event.c:185 (sx:GEOM topology) 20302 0 112442 0 6539 17 0 0= =20 0 /usr/src/sys/kern/kern_sysctl.c:1417 (sleep mutex:Giant) 18552 0 18552 0 1 18552 0 0= =20 0 /usr/src/sys/kern/kern_descrip.c:2586 (sx:allproc) 17688 0 17917 0 150 119 0 0= =20 0 /usr/src/sys/kern/kern_descrip.c:1580 (sleep mutex:fdesc) 14813 0 1285697 0 15257 84 0 0= =20 0 /usr/src/sys/kern/vfs_bio.c:2559 (lockmgr:bufwait) 5904 0 16455 0 34 483 0 0= =20 0 /usr/src/sys/kern/vfs_subr.c:1693 (lockmgr:syncer) 5715 0 6088 0 8 761 0 0= =20 0 /usr/src/sys/kern/vfs_vnops.c:604 (lockmgr:ufs) 5464 0 5464 0 1 5464 0 0= =20 0 /usr/src/sys/ufs/ufs/ufs_dirhash.c:216 (sx:dirhash) 4690 0 9137 0 2 4568 0 0= =20 0 /usr/src/sys/vm/vm_meter.c:113 (sleep mutex:vm object_list) 3304 0 18601 0 6 3100 0 0= =20 0 /usr/src/sys/kern/kern_proc.c:1079 (sx:allproc) 3041 0 5969 0 2 2984 0 0= =20 0 /usr/src/sys/vm/vm_meter.c:130 (sx:allproc) 2964 0 35694 0 127 281 0 0= =20 0 /usr/src/sys/vm/vm_map.c:2766 (sleep mutex:system map) 2474 0 28012 0 127 220 0 0= =20 0 /usr/src/sys/vm/vm_kern.c:289 (sleep mutex:system map) 2362 0 4716 0 2 2358 0 0= =20 0 /usr/src/sys/kern/kern_malloc.c:777 (sleep mutex:malloc) 2343 0 8922 0 64 139 0 0= =20 0 /usr/src/sys/kern/vfs_mount.c:2231 (sleep mutex:struct mount mtx) 1813 0 3021 0 2 1510 0 0= =20 0 /usr/src/sys/kern/kern_linker.c:2122 (sx:kernel linker) 1588 0 622646 0 16428 37 0 0= =20 0 /usr/src/sys/vm/vm_map.c:3526 (sx:user map) 1544 0 4025 0 15 268 0 0= =20 0 /usr/src/sys/vm/vm_map.c:2139 (sx:user map) 1537 0 4331 0 47 92 0 0= =20 0 /usr/src/sys/vm/vm_map.c:2766 (sx:user map) 1529 0 3991 0 437 9 0 0= =20 0 /usr/src/sys/vm/vm_fault.c:1017 (sleep mutex:vm object) 1501 0 16994 0 127 133 0 0= =20 0 /usr/src/sys/vm/vm_kern.c:394 (sleep mutex:vm object) 1138 0 44043 0 53 831 0 0= =20 0 /usr/src/sys/kern/kern_mutex.c:147 (sleep mutex:nfe0) 1061 0 142103 0 9351 15 0 0= =20 0 /usr/src/sys/kern/vfs_bio.c:2963 (sleep mutex:vm object) 1046 0 99060 0 59009 1 0 0= =20 0 /usr/src/sys/vm/vm_page.c:1052 (sleep mutex:vm page queue free=20 mutex) 1032 0 2640 0 18 146 0 0= =20 0 /usr/src/sys/ufs/ffs/ffs_vfsops.c:1321 (sleep mutex:struct mount= =20 mtx) 974 0 57222 0 34916 1 0 0= =20 0 /usr/src/sys/amd64/amd64/pmap.c:3989 (sleep mutex:pmap) 964 0 3167 0 1203 2 0 0= =20 0 /usr/src/sys/kern/tty_pts.c:126 (sleep mutex:ttymtx) 952 0 9113 0 127 71 0 0= =20 0 /usr/src/sys/vm/vm_kern.c:325 (sleep mutex:vm object) 892 0 79906 0 16428 4 0 0= =20 0 /usr/src/sys/vm/vm_fault.c:937 (sleep mutex:vm object) debug.lock.prof.rejected: 0 debug.lock.prof.skipcount: 0 debug.lock.prof.skipspin: 0 If I understand this, something was locked out for 1.58 seconds? YIKES! grep -i giant /var/run/dmesg.boot atkbd0: [GIANT-LOCKED] So the only thing still using the giant lock is the keyboard? I only use this keyboard with the firmware. I rsh in from another=20 machine. FreeBSD 8.0 amd64 nforce4-ultra chipset (USB controller) NEC firewire controller on PCI card fwohci1: mem 0xfdeff000-0xfdefffff irq 19 at device=20 8.0 on pci2 fwohci1: [ITHREAD] fwohci1: OHCI version 1.0 (ROM=3D1) fwohci1: No. of Isochronous channels is 4. fwohci1: Phy 1394a available S400, 2 ports. fwohci1: max_rec 1024 -> 2048 firewire1: on fwohci1 dcons_crom1: on firewire1 dcons_crom1: bus_addr 0x7c31c000 dcons_crom1: dcons_paddr is already set ehci0: mem 0xfebfe000-0xfebfe0ff=20 irq 22 at device 2.1 on pci0 ehci0: [ITHREAD] usbus1: EHCI version 1.0 addr 0x61 on iicbus1 usbus1: on ehci0 usbd_set_config_index:523: could not read device status:=20 USB_ERR_SHORT_XFER ugen1.2: at usbus1 umass0: on usbus1 umass0: SCSI over Bulk-Only; quirks =3D 0x0000 umass0:6:0:-1: Attached to scbus6 da0 at umass-sim0 bus 0 target 0 lun 0 da0: Fixed Direct Access SCSI-2 device da0: 40.000MB/s transfers da0: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)