From owner-freebsd-usb@FreeBSD.ORG Tue Feb 17 00:21:05 2009
From: Marcel Moolenaar
To: usb@freebsd.org
Date: Mon, 16 Feb 2009 16:21:01 -0800
Subject: USB2+umass: timing related bug (machine check abort)

Context: MACHINE=ia64, CPU=Montecito

I'm running into a timing related MCA. In short:

...
umass0: on usbus2
umass0: SCSI over Bulk-Only; quirks = 0x0000
umass0:2:0:-1: Attached to scbus2
*** machine check abort ***
***********************************************************
* ROM Version : 01.05
* ROM Date : 11/06/2006
* BMC Version : 05.06
***********************************************************
...

When I enable EHCI debugging (level 99) this does not happen and
between the debug output, I see:

...
(probe0:umass-sim0:0:0:0): TEST UNIT READY.
CDB: 0 0 0 0 0 0
(probe0:umass-sim0:0:0:0): CAM Status: SCSI Status Error
(probe0:umass-sim0:0:0:0): SCSI Status: Check Condition
(probe0:umass-sim0:0:0:0): UNIT ATTENTION asc:29,0
(probe0:umass-sim0:0:0:0): Power on, reset, or bus device reset occurred
(probe0:umass-sim0:0:0:0): Retrying Command (per Sense Data)
...
(probe0:umass-sim0:0:0:0): TEST UNIT READY. CDB: 0 0 0 0 0 0
(probe0:umass-sim0:0:0:0): CAM Status: SCSI Status Error
(probe0:umass-sim0:0:0:0): SCSI Status: Check Condition
(probe0:umass-sim0:0:0:0): NOT READY asc:3a,0
(probe0:umass-sim0:0:0:0): Medium not present
(probe0:umass-sim0:0:0:0): Unretryable error
...
ehcd0 at umass-sim0 bus 0 target 0 lun 0
cd0: Removable CD-ROM SCSI-0 device
cd0: 40.000MB/s transfers
cd0: Attempt to query device size failed: NOT READY, Medium not present
...

The MCA error dumps tell me that it's PCI related. I suspect it's a race
condition caused by the HCD writing/updating operational state at the
same time that the HC is accessing it.

I have 2 instruction pointers. The first is where an interrupt last
occurred:

IP=0xe0000000041cf810

(gdb) l *0xe0000000041cf810
0xe0000000041cf810 is in ehci_root_ctrl_done (/nfs/freebsd/base/head/sys/dev/usb2/controller/ehci2.c:3307).
3302                    std->err = USB_ERR_IOERROR;
3303                    goto done;
3304            }
3305            v = EOREAD4(sc, EHCI_PORTSC(index));
3306            DPRINTFN(9, "port status=0x%04x\n", v);
3307            if (sc->sc_flags & EHCI_SCFLG_FORCESPEED) {
3308                    if ((v & 0xc000000) == 0x8000000)
3309                            i = UPS_HIGH_SPEED;
3310                    else if ((v & 0xc000000) == 0x4000000)
3311                            i = UPS_LOW_SPEED;

The second is when the MCA happened:

IP=0xe00000000420d8b0

(gdb) l *0xe00000000420d8b0
0xe00000000420d8b0 is in usb2_transfer_start (/nfs/freebsd/base/head/sys/dev/usb2/core/usb2_transfer.c:1577).
1572    {
1573            if (xfer == NULL) {
1574                    /* transfer is gone */
1575                    return;
1576            }
1577            USB_XFER_LOCK_ASSERT(xfer, MA_OWNED);
1578
1579            /* mark the USB transfer started */
1580
1581            if (!xfer->flags_int.started) {

The last access to the EHCI registers was through register 0x6c,
which corresponds to PORTSC(3). This matches the first IP.

The MCA is caused by an error on the PCI bus, most likely an
invalid inbound address:

**** MEMORY ERROR STRUCTURE ****
MEM_ERR_STRUCT_VALID            0x0000000000000201
**** PLATFORM_SPECIFIC_ERROR_INFO ****
VALIDATION_BITS                 0x000000000000007b
PLATFORM_ERROR_STATUS           0x0000000000421200
PLATFORM_REQUESTOR_ID           0x0000000000000000
PLATFORM_RESPONDER_ID           0x0000000000000000
PLATFORM_TARGET_ID              0x000000003fde6000
PLATFORM_BUS_SPECIFIC_DATA      0x0000000000107628
PLATFORM_OEM_COMPONENT_ID[0]    0x000000004033103c
PLATFORM_OEM_COMPONENT_ID[1]    0x0000000000000000
PLATFORM_OEM_DEVICE_PATH        0x0000000000000000
.... HP_TITAN_PLATFORM_DATA .....
ERROR_LOG_EN                    0x0000008000003dff
ERROR_SIG_EN                    0x0000200000002117
ERROR_STATUS                    0x0000000000001000
ERROR_OVFL                      0x0000000000001000
ERROR_FIRST                     0x0000000000000000
AP_ADDRa                        0x0000000000000000
AP_ADDRb                        0x0000000000000000
ST_ADDRa                        0x0000000000000000
ST_ADDRb                        0x0000000000000000
RT_ADDRa                        0x0000000000000000
RT_ADDRb                        0x0000000000000000
RP_ADDRa                        0x0000000000000000
RP_ADDRb                        0x0000000000000000
LE_ADDRa                        0x503800003fde6000
LE_ADDRb                        0xc020000000030118
ST_TO                           0x00000000fffffff3
PT_TO                           0x00000000ffffffff
RT_TO                           0x000000009e8c6100

**** PCI BUS REGISTERS ****
PCI_BUS_ERROR_VALID             0x0000000000000001
**** PLATFORM_PCI_BUS_ERROR_INFO ****
VALIDATION_BITS                 0x00000000000007cf
PCI_BUS_ERROR_STATUS            0x0000000000091200
PCI_BUS_ERROR_TYPE              0x0000000000000000
PCI_BUS_ID                      0x0000000000000000
PCI_BUS_ADDRESS                 0x00000000fc2fa5d0
PCI_BUS_DATA                    0x0000000000000000
PCI_BUS_CMD                     0x0000000000000000
PCI_BUS_REQUESTOR_ID            0x0000000000001000
PCI_BUS_COMPLETER_ID            0x00000000fed20000
PCI_BUS_TARGET_ID               0x00000000fc2fa5d0
PCI_BUS_OEM_ID[0]               0x000000000000122e
PCI_BUS_OEM_ID[1]               0x0000000000000000
.... HP_MERCURY_DATA ....
CELL_NUMBER                     0x0000000000000000
SBA_NUMBER                      0x0000000000000000
ROPE_NUMBER                     0x0000000000000000
ERROR_STATUS                    0x000000010000021a
ERROR_MASTER_ID_LOG             0x0000000000000008
INBOUND_ERR_ADDRESS             0x00000000fc2fa5d0
INBOUND_ERR_ATTRIBUTE           0x2000000000000000
COMPLETION_MESSAGE_LOG          0x0000000000000000
OUTBOUND_ERR_ADDRESS            0x0000000000000000
ERROR_CONFIG                    0x0000000000001d50
STATUS_INFO_CONTROL             0x0000000000000048
FUNC_ID                         0x0ab00146122e103c
CAPABILITIES_LIST               0x0f00023700200002
AGP_COMMAND                     0x0000000000000000
PCIX_CAPABILITIES               0x0013ff0000010007
OLR_CONTROL                     0x00023e1b00032403
CLOCK_CONTROL                   0x0000000000000048
BUS_MODE                        0x9da874ae36d58460

Some more background information:

\begin{log}
...
FreeBSD 8.0-CURRENT #28 r188699M: Mon Feb 16 14:51:49 PST 2009
    marcel@hob.lan.xcllnt.net:/usr/obj/nfs/freebsd/base/head/sys/HOB
...
CPU: Montecito (1594.66-Mhz Itanium 2)
...
ohci0: mem 0x88032000-0x88032fff irq 17 at device 2.0 on pci0
ohci0: [ITHREAD]
usbus0: on ohci0
ohci1: mem 0x88031000-0x88031fff irq 18 at device 2.1 on pci0
ohci1: [ITHREAD]
usbus1: on ohci1
ehci0: mem 0x88030000-0x880300ff irq 19 at device 2.2 on pci0
ehci0: [ITHREAD]
usbus2: EHCI version 1.0
usbus2: on ehci0
...
usbus0: 12Mbps Full Speed USB v1.0
usbus1: 12Mbps Full Speed USB v1.0
usbus2: 480Mbps High Speed USB v2.0
ugen0.1: at usbus0
ushub0: on usbus0
ugen1.1: at usbus1
ushub1: on usbus1
ugen2.1: at usbus2
ushub2: on usbus2
ushub1: 2 ports with 2 removable, self powered
ushub0: 3 ports with 3 removable, self powered
...
ushub2: 5 ports with 5 removable, self powered
ugen0.2: at usbus0
uhid0: on usbus0
Symlink: uhid0 -> usb0.2.0.16
ums0: on usbus0
ugen2.2: at usbus2
ums0: 3 buttons and [] coordinates
Symlink: ums0 -> usb0.2.1.17
...
\end{log}

-- 
Marcel Moolenaar
xcllnt@mac.com
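
Addendum on the register arithmetic: the statement that register 0x6c
corresponds to PORTSC(3) can be checked with a few lines of C. This is
only a sketch under stated assumptions. The EHCI specification places
PORTSC for 1-based port n at operational offset 0x44 + 4*(n-1), which is
the same as 0x40 + 4*n. The length of the capability register block
(CAPLENGTH) is not given in the report; 0x20 is assumed here because it
is the value that makes 0x6c land on port 3. The helper names are
hypothetical and are not taken from the driver.

```c
#include <stdint.h>

/*
 * ASSUMPTION: CAPLENGTH (size of the capability register block) is 0x20.
 * The report does not state it; this value is inferred from 0x6c mapping
 * to PORTSC(3).
 */
#define ASSUMED_CAPLENGTH 0x20u

/*
 * Offset of PORTSC for 1-based port number n, relative to the start of
 * the operational registers (EHCI spec: 0x44 + 4*(n-1)).
 */
static uint32_t
portsc_op_offset(unsigned n)
{
	return 0x40u + 4u * (uint32_t)n;
}

/*
 * Offset of PORTSC(n) relative to the start of the memory BAR, i.e. the
 * capability block length plus the operational offset.
 */
static uint32_t
portsc_bar_offset(uint32_t caplength, unsigned n)
{
	return caplength + portsc_op_offset(n);
}
```

With these assumptions, portsc_bar_offset(ASSUMED_CAPLENGTH, 3) evaluates
to 0x6c, which also fits inside the 0x100-byte window that ehci0 maps at
0x88030000-0x880300ff. If CAPLENGTH on this controller is different, the
port number derived from 0x6c changes accordingly.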