Date: Mon, 13 Jun 2011 20:58:44 -0400 (EDT) From: Charles Sprickman <spork@bway.net> To: Hans Petter Selasky <hselasky@c2i.net> Cc: freebsd-usb@freebsd.org Subject: Re: ucom/uftdi high interrupt load Message-ID: <alpine.OSX.2.00.1106132045330.12822@freemac> In-Reply-To: <201106131108.42959.hselasky@c2i.net> References: <alpine.OSX.2.00.1106111715580.99056@hotlap.nat.fasttrackmonkey.com> <201106122245.44386.hselasky@c2i.net> <alpine.OSX.2.00.1106121745020.99056@hotlap.nat.fasttrackmonkey.com> <201106131108.42959.hselasky@c2i.net>
index | next in thread | previous in thread | raw e-mail
On Mon, 13 Jun 2011, Hans Petter Selasky wrote: > On Sunday 12 June 2011 23:50:24 Charles Sprickman wrote: >> On Sun, 12 Jun 2011, Hans Petter Selasky wrote: >>> On Saturday 11 June 2011 23:43:11 Charles Sprickman wrote: >>>> Hello, >>>> >>>> We ran into an odd problem last week with our serial consoles after >>>> moving the USB to serial adapters from an old 4.11 box to a box running >>>> 8.1. We have two boxes that incorporate (I assume) hubs and a bunch of >>>> FTDI serial interfaces. One has 16 ports, the other 8. Each is >>>> plugged directly into a USB port on the rear of the mainboard. We run >>>> conserver[1] to handle access to the serial ports. From what I've >>>> observed, this application opens the ports when the daemon starts - it >>>> logs any output (handy for panics, or anything else that might spit >>>> interesting info to the console) and waits for clients to connect to >>>> it. >>>> >>>> Everything had been working fine for a few weeks. The box was rebooted >>>> recently to enable PostgreSQL to start normally (bumped SHM stuff in >>>> loader.conf). After six days, we found that the consoles were >>>> unresponsive. Restarting conserver brought us this each time we >>>> connected to a console for full read/write access: >>>> >>>> [Thu Jun 9 10:04:59 2011] conserver (50113): ERROR: [h22] >>>> open(/dev/ttyU4): Interrupted system call: forcing down >>>> [Thu Jun 9 10:04:59 2011] conserver (50112): ERROR: [h21] >>>> open(/dev/ttyU11): Interrupted system call: forcing down >>>> >>>> All devices still appeared in /dev. Stopping conserver and confirming >>>> it and all child processes were gone and then using picocom and cu >>>> yielded no response on the serial ports. >>>> >>>> We also found (after the fact) that around the time the consoles became >>>> unresponsive, cpu usage went to nearly 90% and was mostly in the kernel >>>> process "intr": >>>> >>>> root 12 70.5 0.0 0 136 ?? WL Fri12AM 120:01.47 [intr] >>>> >>>> A graph showing cpu usage (red is "system"): >>>> http://i.imgur.com/0yO5l.png >>>> >>>> I should note that we know the cpu spike and devices becoming >>>> unresponsive can be correlated because one of the serial ports runs a >>>> temperature monitor which is tied into our monitoring. When the data >>>> goes stale, we get notified. >>>> >>>> Issuing a "usbconfig -u 0 reset" caused all devices except for the root >>>> hub to disappear and not come back. CPU usage also dipped a bit after >>>> that. Rebooting was the only way to resolve the issue - perhaps >>>> plugging and unplugging would have worked, but that's a bit too complex >>>> for our remote hands. >>>> >>>> I can supply full dmesg and more, but for now, here's a summary of the >>>> usb info from dmesg: >>>> >>>> FreeBSD 8.1-RELEASE #7: Wed Dec 22 00:49:50 EST 2010 >>>> >>>> ohci0: <OHCI (generic) USB controller> mem 0xfe9fc000-0xfe9fcfff irq 10 >>>> at device 15.2 on pci0 >>>> ohci0: [ITHREAD] >>>> ... >>>> usbus0: <OHCI (generic) USB controller> on ohci0 >>>> usbus0: 12Mbps Full Speed USB v1.0 >>>> ugen0.1: <(0x1166)> at usbus0 >>>> ... >>>> uhub0: <(0x1166) OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on >>>> usbus0 >>>> uhub0: 4 ports with 4 removable, self powered >>>> ugen0.2: <vendor 0x05e3> at usbus0 >>>> uhub1: <vendor 0x05e3 USB Hub, class 9/0, rev 1.01/0.11, addr 2> on >>>> usbus0 uhub1: 7 ports with 7 removable, self powered >>>> ugen0.3: <FTDI> at usbus0 >>>> uftdi0: <USB FAST SERIAL ADAPTER> on usbus0 >>>> uftdi1: <USB FAST SERIAL ADAPTER> on usbus0 >>>> ugen0.4: <FTDI> at usbus0 >>>> uftdi2: <USB FAST SERIAL ADAPTER> on usbus0 >>>> uftdi3: <USB FAST SERIAL ADAPTER> on usbus0 >>>> ugen0.5: <FTDI> at usbus0 >>>> uftdi4: <USB FAST SERIAL ADAPTER> on usbus0 >>>> uftdi5: <USB FAST SERIAL ADAPTER> on usbus0 >>>> ugen0.6: <FTDI> at usbus0 >>>> uftdi6: <USB FAST SERIAL ADAPTER> on usbus0 >>>> uftdi7: <USB FAST SERIAL ADAPTER> on usbus0 >>>> ugen0.7: <FTDI> at usbus0 >>>> uftdi8: <USB FAST SERIAL ADAPTER> on usbus0 >>>> uftdi9: <USB FAST SERIAL ADAPTER> on usbus0 >>>> ugen0.8: <FTDI> at usbus0 >>>> uftdi10: <USB FAST SERIAL ADAPTER> on usbus0 >>>> uftdi11: <USB FAST SERIAL ADAPTER> on usbus0 >>>> ugen0.9: <vendor 0x05e3> at usbus0 >>>> uhub2: <vendor 0x05e3 USB Hub, class 9/0, rev 1.01/0.12, addr 9> on >>>> usbus0 uhub2: 4 ports with 4 removable, self powered >>>> ugen0.10: <FTDI> at usbus0 >>>> uftdi12: <USB FAST SERIAL ADAPTER> on usbus0 >>>> uftdi13: <USB FAST SERIAL ADAPTER> on usbus0 >>>> ugen0.11: <FTDI> at usbus0 >>>> ... (mangling below is as it appears in dmesg) >>>> da1 at sym0 bus 0 scbus0 target 1 lun 0uftdi14: <USB FAST SERIAL >>>> ADAPTER> on usbus0 >>>> da1: <SEAGATE ST336807LW 0C01> Fixed Direct Access SCSI-3 device >>>> uftdi15: <USB FAST SERIAL ADAPTER> on usbus0 >>>> ... >>>> Root mount waiting for: usbus0 >>>> ugen0.12: <vendor 0x0409> at usbus0 >>>> uhub3: <vendor 0x0409 product 0x0050, class 9/0, rev 2.00/1.00, addr 12> >>>> on usbus0 >>>> Root mount waiting for: usbus0 >>>> uhub3: 7 ports with 7 removable, self powered >>>> ugen0.13: <FTDI> at usbus0 >>>> uftdi16: <FT232R USB UART> on usbus0 >>>> Root mount waiting for: usbus0 >>>> ugen0.14: <FTDI> at usbus0 >>>> uftdi17: <FT232R USB UART> on usbus0 >>>> Root mount waiting for: usbus0 >>>> ugen0.15: <FTDI> at usbus0 >>>> uftdi18: <FT232R USB UART> on usbus0 >>>> >>>> ugen0.16: <FTDI> at usbus0Root mount waiting for: >>>> usbus0 >>>> >>>> uftdi19: <FT232R USB UART> on usbus0 >>>> ugen0.17: <FTDI> at usbus0 >>>> uftdi20: <FT232R USB UART> on usbus0 >>>> Root mount waiting for: usbus0 >>>> ugen0.18: <FTDI> at usbus0 >>>> uftdi21: <FT232R USB UART> on usbus0 >>>> Root mount waiting for: usbus0 >>>> ugen0.19: <vendor 0x0409> at usbus0 >>>> uhub4: <vendor 0x0409 product 0x005a, class 9/0, rev 2.00/1.00, addr 19> >>>> on usbus0 >>>> uhub4: 4 ports with 4 removable, self powered >>>> Root mount waiting for: usbus0 >>>> ugen0.20: <FTDI> at usbus0 >>>> uftdi22: <FT232R USB UART> on usbus0 >>>> Root mount waiting for: usbus0 >>>> ugen0.21: <FTDI> at usbus0 >>>> uftdi23: <FT232R USB UART> on usbus0 >>>> Trying to mount root from zfs:zroot >>>> >>>> Thanks, >>>> >>>> Charles >>> >>> Hi, >>> >>> Try to get output from vmstat -i. >>> >>> Also try to set the: >>> >>> hw.usb.ehci.iaadbug=1 >>> >>> and >>> >>> hw.usb.ehci.lostintrbug=1 >>> >>> in /boot/loader.conf >> >> Quick question - this host only has USB 1.1 - so I don't think I have >> ehci, do I? To be honest, I'm not totally clear on what's what as far as >> uhci, ehci, and ohci. This is what's seen in sysctl output related to >> USB: >> >> [spork@h12 ~]$ sysctl -a|grep usb|more >> hw.pci.usb_early_takeover: 1 >> hw.usb.no_boot_wait: 0 >> hw.usb.debug: 0 >> hw.usb.usb_lang_mask: 255 >> hw.usb.usb_lang_id: 9 >> hw.usb.template: 0 >> hw.usb.power_timeout: 30 >> hw.usb.ucom.cons_baud: 9600 >> hw.usb.ucom.cons_unit: -1 >> dev.usbus.0.%desc: OHCI (generic) USB controller >> dev.usbus.0.%driver: usbus >> dev.usbus.0.%parent: ohci0 >> dev.uhub.0.%parent: usbus0 > > Ok, then those quirks won't help. > > For OHCI, I guess you should check vmstat -i. Oddly enough, the box paniced today, but it appeared to be related to fxp. However in the coredump summary, I have "vmstat -i" output, and ohci seems fairly high in comparison to everything else: vmstat -i interrupt total rate irq4: uart0 106 0 irq10: ohci0 142322001 968176 irq14: ata0 1178 8 irq20: fxp0 3008691 20467 irq21: fxp1 1733357 11791 irq28: sym1 30 0 irq29: sym0 2624749 17855 cpu0: timer 728063100 4952810 cpu1: timer 728044684 4952684 Total 1605797896 10923795 Also, just a brief summary of the panic, since it mentions the interrupt process again: #7 0x8059139b in fxp_new_rfabuf (sc=0x8564c000, rxp=0x8564c1c0) at /usr/src/sys/dev/fxp/if_fxp.c:2611 #8 0x8059285b in fxp_intr (xsc=0x8564c000) at /usr/src/sys/dev/fxp/if_fxp.c:1931 #9 0x8067b1db in intr_event_execute_handlers (p=0x8553d7f8, ie=0x8557d080) at /usr/src/sys/kern/kern_intr.c:1220 #10 0x8067c8eb in ithread_loop (arg=0x856525d0) at /usr/src/sys/kern/kern_intr.c:1233 #11 0x80678f11 in fork_exit (callout=0x8067c880 <ithread_loop>, arg=0x856525d0, frame=0xd80e7d38) at /usr/src/sys/kern/kern_fork.c:844 #12 0x80931de0 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:270 And also unrelated to usb, but fairly bizarre "netstat -m" output: 18446744073709550887/1355/626/25600 mbuf clusters in use (current/cache/total/max) 18014398509480560K/3497K/2073K bytes allocated to network (current/cache/total) Sorry for all the extra noise, but I'm not adept enough at determining whether this panic was usb related or fxp related. Thanks, Charles > --HPS >help
Want to link to this message? Use this
URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?alpine.OSX.2.00.1106132045330.12822>
