From owner-freebsd-usb@FreeBSD.ORG  Tue Jun 14 00:58:46 2011
Return-Path: <owner-freebsd-usb@FreeBSD.ORG>
Delivered-To: freebsd-usb@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id BEAAF106564A
	for <freebsd-usb@freebsd.org>; Tue, 14 Jun 2011 00:58:46 +0000 (UTC)
	(envelope-from spork@bway.net)
Received: from xena.bway.net (xena.bway.net [216.220.96.26])
	by mx1.freebsd.org (Postfix) with ESMTP id 856788FC18
	for <freebsd-usb@freebsd.org>; Tue, 14 Jun 2011 00:58:46 +0000 (UTC)
Received: (qmail 51250 invoked by uid 0); 14 Jun 2011 00:58:44 -0000
Received: from smtp.bway.net (216.220.96.25)
	by xena.bway.net with (DHE-RSA-AES256-SHA encrypted) SMTP;
	14 Jun 2011 00:58:44 -0000
Received: (qmail 51244 invoked by uid 90); 14 Jun 2011 00:58:44 -0000
Received: from unknown (HELO freemac) (spork@bway.net@96.57.144.66)
	by smtp.bway.net with (DHE-RSA-AES256-SHA encrypted) SMTP;
	14 Jun 2011 00:58:44 -0000
Date: Mon, 13 Jun 2011 20:58:44 -0400 (EDT)
From: Charles Sprickman <spork@bway.net>
X-X-Sender: spork@freemac
To: Hans Petter Selasky <hselasky@c2i.net>
In-Reply-To: <201106131108.42959.hselasky@c2i.net>
Message-ID: <alpine.OSX.2.00.1106132045330.12822@freemac>
References: <alpine.OSX.2.00.1106111715580.99056@hotlap.nat.fasttrackmonkey.com>
	<201106122245.44386.hselasky@c2i.net>
	<alpine.OSX.2.00.1106121745020.99056@hotlap.nat.fasttrackmonkey.com>
	<201106131108.42959.hselasky@c2i.net>
User-Agent: Alpine 2.00 (OSX 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-usb@freebsd.org
Subject: Re: ucom/uftdi high interrupt load
X-BeenThere: freebsd-usb@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: FreeBSD support for USB <freebsd-usb.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-usb>,
	<mailto:freebsd-usb-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-usb>
List-Post: <mailto:freebsd-usb@freebsd.org>
List-Help: <mailto:freebsd-usb-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-usb>,
	<mailto:freebsd-usb-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 14 Jun 2011 00:58:46 -0000

On Mon, 13 Jun 2011, Hans Petter Selasky wrote:

> On Sunday 12 June 2011 23:50:24 Charles Sprickman wrote:
>> On Sun, 12 Jun 2011, Hans Petter Selasky wrote:
>>> On Saturday 11 June 2011 23:43:11 Charles Sprickman wrote:
>>>> Hello,
>>>>
>>>> We ran into an odd problem last week with our serial consoles after
>>>> moving the USB to serial adapters from an old 4.11 box to a box running
>>>> 8.1.  We have two boxes that incorporate (I assume) hubs and a bunch of
>>>> FTDI serial interfaces.  One has 16 ports, the other 8.  Each is
>>>> plugged directly into a USB port on the rear of the mainboard.  We run
>>>> conserver[1] to handle access to the serial ports.  From what I've
>>>> observed, this application opens the ports when the daemon starts - it
>>>> logs any output (handy for panics, or anything else that might spit
>>>> interesting info to the console) and waits for clients to connect to
>>>> it.
>>>>
>>>> Everything had been working fine for a few weeks.  The box was rebooted
>>>> recently to enable PostgreSQL to start normally (bumped SHM stuff in
>>>> loader.conf).  After six days, we found that the consoles were
>>>> unresponsive.  Restarting conserver brought us this each time we
>>>> connected to a console for full read/write access:
>>>>
>>>> [Thu Jun  9 10:04:59 2011] conserver (50113): ERROR: [h22]
>>>> open(/dev/ttyU4): Interrupted system call: forcing down
>>>> [Thu Jun  9 10:04:59 2011] conserver (50112): ERROR: [h21]
>>>> open(/dev/ttyU11): Interrupted system call: forcing down
>>>>
>>>> All devices still appeared in /dev.  Stopping conserver and confirming
>>>> it and all child processes were gone and then using picocom and cu
>>>> yielded no response on the serial ports.
>>>>
>>>> We also found (after the fact) that around the time the consoles became
>>>> unresponsive, cpu usage went to nearly 90% and was mostly in the kernel
>>>> process "intr":
>>>>
>>>> root   12 70.5  0.0     0   136  ??  WL   Fri12AM 120:01.47  [intr]
>>>>
>>>> A graph showing cpu usage (red is "system"):
>>>> http://i.imgur.com/0yO5l.png
>>>>
>>>> I should note that we know the cpu spike and devices becoming
>>>> unresponsive can be correlated because one of the serial ports runs a
>>>> temperature monitor which is tied into our monitoring.  When the data
>>>> goes stale, we get notified.
>>>>
>>>> Issuing a "usbconfig -u 0 reset" caused all devices except for the root
>>>> hub to disappear and not come back.  CPU usage also dipped a bit after
>>>> that.  Rebooting was the only way to resolve the issue - perhaps
>>>> plugging and unplugging would have worked, but that's a bit too complex
>>>> for our remote hands.
>>>>
>>>> I can supply full dmesg and more, but for now, here's a summary of the
>>>> usb info from dmesg:
>>>>
>>>> FreeBSD 8.1-RELEASE #7: Wed Dec 22 00:49:50 EST 2010
>>>>
>>>> ohci0: <OHCI (generic) USB controller> mem 0xfe9fc000-0xfe9fcfff irq 10
>>>> at device 15.2 on pci0
>>>> ohci0: [ITHREAD]
>>>> ...
>>>> usbus0: <OHCI (generic) USB controller> on ohci0
>>>> usbus0: 12Mbps Full Speed USB v1.0
>>>> ugen0.1: <(0x1166)> at usbus0
>>>> ...
>>>> uhub0: <(0x1166) OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on
>>>> usbus0
>>>> uhub0: 4 ports with 4 removable, self powered
>>>> ugen0.2: <vendor 0x05e3> at usbus0
>>>> uhub1: <vendor 0x05e3 USB Hub, class 9/0, rev 1.01/0.11, addr 2> on
>>>> usbus0 uhub1: 7 ports with 7 removable, self powered
>>>> ugen0.3: <FTDI> at usbus0
>>>> uftdi0: <USB FAST SERIAL ADAPTER> on usbus0
>>>> uftdi1: <USB FAST SERIAL ADAPTER> on usbus0
>>>> ugen0.4: <FTDI> at usbus0
>>>> uftdi2: <USB FAST SERIAL ADAPTER> on usbus0
>>>> uftdi3: <USB FAST SERIAL ADAPTER> on usbus0
>>>> ugen0.5: <FTDI> at usbus0
>>>> uftdi4: <USB FAST SERIAL ADAPTER> on usbus0
>>>> uftdi5: <USB FAST SERIAL ADAPTER> on usbus0
>>>> ugen0.6: <FTDI> at usbus0
>>>> uftdi6: <USB FAST SERIAL ADAPTER> on usbus0
>>>> uftdi7: <USB FAST SERIAL ADAPTER> on usbus0
>>>> ugen0.7: <FTDI> at usbus0
>>>> uftdi8: <USB FAST SERIAL ADAPTER> on usbus0
>>>> uftdi9: <USB FAST SERIAL ADAPTER> on usbus0
>>>> ugen0.8: <FTDI> at usbus0
>>>> uftdi10: <USB FAST SERIAL ADAPTER> on usbus0
>>>> uftdi11: <USB FAST SERIAL ADAPTER> on usbus0
>>>> ugen0.9: <vendor 0x05e3> at usbus0
>>>> uhub2: <vendor 0x05e3 USB Hub, class 9/0, rev 1.01/0.12, addr 9> on
>>>> usbus0 uhub2: 4 ports with 4 removable, self powered
>>>> ugen0.10: <FTDI> at usbus0
>>>> uftdi12: <USB FAST SERIAL ADAPTER> on usbus0
>>>> uftdi13: <USB FAST SERIAL ADAPTER> on usbus0
>>>> ugen0.11: <FTDI> at usbus0
>>>> ... (mangling below is as it appears in dmesg)
>>>> da1 at sym0 bus 0 scbus0 target 1 lun 0uftdi14: <USB FAST SERIAL
>>>> ADAPTER> on usbus0
>>>> da1: <SEAGATE ST336807LW 0C01> Fixed Direct Access SCSI-3 device
>>>> uftdi15: <USB FAST SERIAL ADAPTER> on usbus0
>>>> ...
>>>> Root mount waiting for: usbus0
>>>> ugen0.12: <vendor 0x0409> at usbus0
>>>> uhub3: <vendor 0x0409 product 0x0050, class 9/0, rev 2.00/1.00, addr 12>
>>>> on usbus0
>>>> Root mount waiting for: usbus0
>>>> uhub3: 7 ports with 7 removable, self powered
>>>> ugen0.13: <FTDI> at usbus0
>>>> uftdi16: <FT232R USB UART> on usbus0
>>>> Root mount waiting for: usbus0
>>>> ugen0.14: <FTDI> at usbus0
>>>> uftdi17: <FT232R USB UART> on usbus0
>>>> Root mount waiting for: usbus0
>>>> ugen0.15: <FTDI> at usbus0
>>>> uftdi18: <FT232R USB UART> on usbus0
>>>>
>>>> ugen0.16: <FTDI> at usbus0Root mount waiting for:
>>>>   usbus0
>>>>
>>>> uftdi19: <FT232R USB UART> on usbus0
>>>> ugen0.17: <FTDI> at usbus0
>>>> uftdi20: <FT232R USB UART> on usbus0
>>>> Root mount waiting for: usbus0
>>>> ugen0.18: <FTDI> at usbus0
>>>> uftdi21: <FT232R USB UART> on usbus0
>>>> Root mount waiting for: usbus0
>>>> ugen0.19: <vendor 0x0409> at usbus0
>>>> uhub4: <vendor 0x0409 product 0x005a, class 9/0, rev 2.00/1.00, addr 19>
>>>> on usbus0
>>>> uhub4: 4 ports with 4 removable, self powered
>>>> Root mount waiting for: usbus0
>>>> ugen0.20: <FTDI> at usbus0
>>>> uftdi22: <FT232R USB UART> on usbus0
>>>> Root mount waiting for: usbus0
>>>> ugen0.21: <FTDI> at usbus0
>>>> uftdi23: <FT232R USB UART> on usbus0
>>>> Trying to mount root from zfs:zroot
>>>>
>>>> Thanks,
>>>>
>>>> Charles
>>>
>>> Hi,
>>>
>>> Try to get output from vmstat -i.
>>>
>>> Also try to set the:
>>>
>>> hw.usb.ehci.iaadbug=1
>>>
>>> and
>>>
>>> hw.usb.ehci.lostintrbug=1
>>>
>>> in /boot/loader.conf
>>
>> Quick question - this host only has USB 1.1 - so I don't think I have
>> ehci, do I?  To be honest, I'm not totally clear on what's what as far as
>> uhci, ehci, and ohci.  This is what's seen in sysctl output related to
>> USB:
>>
>> [spork@h12 ~]$ sysctl -a|grep usb|more
>> hw.pci.usb_early_takeover: 1
>> hw.usb.no_boot_wait: 0
>> hw.usb.debug: 0
>> hw.usb.usb_lang_mask: 255
>> hw.usb.usb_lang_id: 9
>> hw.usb.template: 0
>> hw.usb.power_timeout: 30
>> hw.usb.ucom.cons_baud: 9600
>> hw.usb.ucom.cons_unit: -1
>> dev.usbus.0.%desc: OHCI (generic) USB controller
>> dev.usbus.0.%driver: usbus
>> dev.usbus.0.%parent: ohci0
>> dev.uhub.0.%parent: usbus0
>
> Ok, then those quirks won't help.
>
> For OHCI, I guess you should check vmstat -i.

Oddly enough, the box paniced today, but it appeared to be related to fxp. 
However in the coredump summary, I have "vmstat -i" output, and ohci seems 
fairly high in comparison to everything else:

vmstat -i

interrupt                          total       rate
irq4: uart0                          106          0
irq10: ohci0                   142322001     968176
irq14: ata0                         1178          8
irq20: fxp0                      3008691      20467
irq21: fxp1                      1733357      11791
irq28: sym1                           30          0
irq29: sym0                      2624749      17855
cpu0: timer                    728063100    4952810
cpu1: timer                    728044684    4952684
Total                         1605797896   10923795

Also, just a brief summary of the panic, since it mentions the interrupt 
process again:

#7  0x8059139b in fxp_new_rfabuf (sc=0x8564c000, rxp=0x8564c1c0)
     at /usr/src/sys/dev/fxp/if_fxp.c:2611
#8  0x8059285b in fxp_intr (xsc=0x8564c000)
     at /usr/src/sys/dev/fxp/if_fxp.c:1931
#9  0x8067b1db in intr_event_execute_handlers (p=0x8553d7f8, 
ie=0x8557d080)
     at /usr/src/sys/kern/kern_intr.c:1220
#10 0x8067c8eb in ithread_loop (arg=0x856525d0)
     at /usr/src/sys/kern/kern_intr.c:1233
#11 0x80678f11 in fork_exit (callout=0x8067c880 <ithread_loop>,
     arg=0x856525d0, frame=0xd80e7d38) at /usr/src/sys/kern/kern_fork.c:844
#12 0x80931de0 in fork_trampoline () at 
/usr/src/sys/i386/i386/exception.s:270

And also unrelated to usb, but fairly bizarre "netstat -m" output:

18446744073709550887/1355/626/25600 mbuf clusters in use 
(current/cache/total/max)
18014398509480560K/3497K/2073K bytes allocated to network 
(current/cache/total)

Sorry for all the extra noise, but I'm not adept enough at determining 
whether this panic was usb related or fxp related.

Thanks,

Charles

> --HPS
>