From owner-freebsd-usb@FreeBSD.ORG Mon Jun 13 09:10:10 2011 Return-Path: Delivered-To: freebsd-usb@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 445CD106564A for ; Mon, 13 Jun 2011 09:10:10 +0000 (UTC) (envelope-from hselasky@c2i.net) Received: from swip.net (mailfe02.c2i.net [212.247.154.34]) by mx1.freebsd.org (Postfix) with ESMTP id AA8D88FC20 for ; Mon, 13 Jun 2011 09:10:09 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mailfe02.swip.net X-Spam-Level: X-Spam-Status: No, score=-1.5 required=5.0 tests=ALL_TRUSTED,BAYES_05, URIBL_RED autolearn=ham version=3.3.1 X-TFF-CGPSA-Version: 1.7 X-T2-CGPSA-Filter: Scanned X-Cloudmark-Score: 0.000000 [] X-Cloudmark-Analysis: v=1.1 cv=6YwywTXDltnW4viG0YW+xGnSOPPSCXZBKiVv2UswsZM= c=1 sm=1 a=SvYTsOw2Z4kA:10 a=M_bvHRtf90sA:10 a=WQU8e4WWZSUA:10 a=8nJEP1OIZ-IA:10 a=CL8lFSKtTFcA:10 a=i9M/sDlu2rpZ9XS819oYzg==:17 a=708nMKq4AAAA:20 a=eN2XhXH0tt_xzXMuM0sA:9 a=TTkS9SePSf382y841yIA:7 a=wPNLvfGTeEIA:10 a=i9M/sDlu2rpZ9XS819oYzg==:117 Received: from [188.126.198.129] (account mc467741@c2i.net HELO laptop002.hselasky.homeunix.org) by mailfe02.swip.net (CommuniGate Pro SMTP 5.2.19) with ESMTPA id 139519459; Mon, 13 Jun 2011 11:10:05 +0200 From: Hans Petter Selasky To: Charles Sprickman Date: Mon, 13 Jun 2011 11:08:42 +0200 User-Agent: KMail/1.13.5 (FreeBSD/8.2-STABLE; KDE/4.4.5; amd64; ; ) References: <201106122245.44386.hselasky@c2i.net> In-Reply-To: X-Face: *nPdTl_}RuAI6^PVpA02T?$%Xa^>@hE0uyUIoiha$pC:9TVgl.Oq, NwSZ4V"|LR.+tj}g5 %V,x^qOs~mnU3]Gn; cQLv&.N>TrxmSFf+p6(30a/{)KUU!s}w\IhQBj}[g}bj0I3^glmC( :AuzV9:.hESm-x4h240C`9=w MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201106131108.42959.hselasky@c2i.net> Cc: freebsd-usb@freebsd.org Subject: Re: ucom/uftdi high interrupt load X-BeenThere: freebsd-usb@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: FreeBSD support for USB List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Jun 2011 09:10:10 -0000 On Sunday 12 June 2011 23:50:24 Charles Sprickman wrote: > On Sun, 12 Jun 2011, Hans Petter Selasky wrote: > > On Saturday 11 June 2011 23:43:11 Charles Sprickman wrote: > >> Hello, > >> > >> We ran into an odd problem last week with our serial consoles after > >> moving the USB to serial adapters from an old 4.11 box to a box running > >> 8.1. We have two boxes that incorporate (I assume) hubs and a bunch of > >> FTDI serial interfaces. One has 16 ports, the other 8. Each is > >> plugged directly into a USB port on the rear of the mainboard. We run > >> conserver[1] to handle access to the serial ports. From what I've > >> observed, this application opens the ports when the daemon starts - it > >> logs any output (handy for panics, or anything else that might spit > >> interesting info to the console) and waits for clients to connect to > >> it. > >> > >> Everything had been working fine for a few weeks. The box was rebooted > >> recently to enable PostgreSQL to start normally (bumped SHM stuff in > >> loader.conf). After six days, we found that the consoles were > >> unresponsive. Restarting conserver brought us this each time we > >> connected to a console for full read/write access: > >> > >> [Thu Jun 9 10:04:59 2011] conserver (50113): ERROR: [h22] > >> open(/dev/ttyU4): Interrupted system call: forcing down > >> [Thu Jun 9 10:04:59 2011] conserver (50112): ERROR: [h21] > >> open(/dev/ttyU11): Interrupted system call: forcing down > >> > >> All devices still appeared in /dev. Stopping conserver and confirming > >> it and all child processes were gone and then using picocom and cu > >> yielded no response on the serial ports. > >> > >> We also found (after the fact) that around the time the consoles became > >> unresponsive, cpu usage went to nearly 90% and was mostly in the kernel > >> process "intr": > >> > >> root 12 70.5 0.0 0 136 ?? WL Fri12AM 120:01.47 [intr] > >> > >> A graph showing cpu usage (red is "system"): > >> http://i.imgur.com/0yO5l.png > >> > >> I should note that we know the cpu spike and devices becoming > >> unresponsive can be correlated because one of the serial ports runs a > >> temperature monitor which is tied into our monitoring. When the data > >> goes stale, we get notified. > >> > >> Issuing a "usbconfig -u 0 reset" caused all devices except for the root > >> hub to disappear and not come back. CPU usage also dipped a bit after > >> that. Rebooting was the only way to resolve the issue - perhaps > >> plugging and unplugging would have worked, but that's a bit too complex > >> for our remote hands. > >> > >> I can supply full dmesg and more, but for now, here's a summary of the > >> usb info from dmesg: > >> > >> FreeBSD 8.1-RELEASE #7: Wed Dec 22 00:49:50 EST 2010 > >> > >> ohci0: mem 0xfe9fc000-0xfe9fcfff irq 10 > >> at device 15.2 on pci0 > >> ohci0: [ITHREAD] > >> ... > >> usbus0: on ohci0 > >> usbus0: 12Mbps Full Speed USB v1.0 > >> ugen0.1: <(0x1166)> at usbus0 > >> ... > >> uhub0: <(0x1166) OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on > >> usbus0 > >> uhub0: 4 ports with 4 removable, self powered > >> ugen0.2: at usbus0 > >> uhub1: on > >> usbus0 uhub1: 7 ports with 7 removable, self powered > >> ugen0.3: at usbus0 > >> uftdi0: on usbus0 > >> uftdi1: on usbus0 > >> ugen0.4: at usbus0 > >> uftdi2: on usbus0 > >> uftdi3: on usbus0 > >> ugen0.5: at usbus0 > >> uftdi4: on usbus0 > >> uftdi5: on usbus0 > >> ugen0.6: at usbus0 > >> uftdi6: on usbus0 > >> uftdi7: on usbus0 > >> ugen0.7: at usbus0 > >> uftdi8: on usbus0 > >> uftdi9: on usbus0 > >> ugen0.8: at usbus0 > >> uftdi10: on usbus0 > >> uftdi11: on usbus0 > >> ugen0.9: at usbus0 > >> uhub2: on > >> usbus0 uhub2: 4 ports with 4 removable, self powered > >> ugen0.10: at usbus0 > >> uftdi12: on usbus0 > >> uftdi13: on usbus0 > >> ugen0.11: at usbus0 > >> ... (mangling below is as it appears in dmesg) > >> da1 at sym0 bus 0 scbus0 target 1 lun 0uftdi14: >> ADAPTER> on usbus0 > >> da1: Fixed Direct Access SCSI-3 device > >> uftdi15: on usbus0 > >> ... > >> Root mount waiting for: usbus0 > >> ugen0.12: at usbus0 > >> uhub3: > >> on usbus0 > >> Root mount waiting for: usbus0 > >> uhub3: 7 ports with 7 removable, self powered > >> ugen0.13: at usbus0 > >> uftdi16: on usbus0 > >> Root mount waiting for: usbus0 > >> ugen0.14: at usbus0 > >> uftdi17: on usbus0 > >> Root mount waiting for: usbus0 > >> ugen0.15: at usbus0 > >> uftdi18: on usbus0 > >> > >> ugen0.16: at usbus0Root mount waiting for: > >> usbus0 > >> > >> uftdi19: on usbus0 > >> ugen0.17: at usbus0 > >> uftdi20: on usbus0 > >> Root mount waiting for: usbus0 > >> ugen0.18: at usbus0 > >> uftdi21: on usbus0 > >> Root mount waiting for: usbus0 > >> ugen0.19: at usbus0 > >> uhub4: > >> on usbus0 > >> uhub4: 4 ports with 4 removable, self powered > >> Root mount waiting for: usbus0 > >> ugen0.20: at usbus0 > >> uftdi22: on usbus0 > >> Root mount waiting for: usbus0 > >> ugen0.21: at usbus0 > >> uftdi23: on usbus0 > >> Trying to mount root from zfs:zroot > >> > >> Thanks, > >> > >> Charles > > > > Hi, > > > > Try to get output from vmstat -i. > > > > Also try to set the: > > > > hw.usb.ehci.iaadbug=1 > > > > and > > > > hw.usb.ehci.lostintrbug=1 > > > > in /boot/loader.conf > > Quick question - this host only has USB 1.1 - so I don't think I have > ehci, do I? To be honest, I'm not totally clear on what's what as far as > uhci, ehci, and ohci. This is what's seen in sysctl output related to > USB: > > [spork@h12 ~]$ sysctl -a|grep usb|more > hw.pci.usb_early_takeover: 1 > hw.usb.no_boot_wait: 0 > hw.usb.debug: 0 > hw.usb.usb_lang_mask: 255 > hw.usb.usb_lang_id: 9 > hw.usb.template: 0 > hw.usb.power_timeout: 30 > hw.usb.ucom.cons_baud: 9600 > hw.usb.ucom.cons_unit: -1 > dev.usbus.0.%desc: OHCI (generic) USB controller > dev.usbus.0.%driver: usbus > dev.usbus.0.%parent: ohci0 > dev.uhub.0.%parent: usbus0 Ok, then those quirks won't help. For OHCI, I guess you should check vmstat -i. --HPS