From owner-freebsd-firewire Sat Feb 22 11:47:50 2003 Delivered-To: freebsd-firewire@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B87DB37B405 for ; Sat, 22 Feb 2003 11:47:45 -0800 (PST) Received: from ns1.xcllnt.net (209-128-86-226.bayarea.net [209.128.86.226]) by mx1.FreeBSD.org (Postfix) with ESMTP id 906A143FBF for ; Sat, 22 Feb 2003 11:47:44 -0800 (PST) (envelope-from marcel@xcllnt.net) Received: from dhcp01.pn.xcllnt.net (dhcp01.pn.xcllnt.net [192.168.4.201]) by ns1.xcllnt.net (8.12.6/8.12.6) with ESMTP id h1MJlh1o030662; Sat, 22 Feb 2003 11:47:44 -0800 (PST) (envelope-from marcel@piii.pn.xcllnt.net) Received: from dhcp01.pn.xcllnt.net (localhost [127.0.0.1]) by dhcp01.pn.xcllnt.net (8.12.7/8.12.7) with ESMTP id h1MJlhmj000679; Sat, 22 Feb 2003 11:47:43 -0800 (PST) (envelope-from marcel@dhcp01.pn.xcllnt.net) Received: (from marcel@localhost) by dhcp01.pn.xcllnt.net (8.12.7/8.12.7/Submit) id h1MJlfCw000678; Sat, 22 Feb 2003 11:47:41 -0800 (PST) Date: Sat, 22 Feb 2003 11:47:41 -0800 From: Marcel Moolenaar To: Hidetoshi Shimokawa Cc: firewire@FreeBSD.org Subject: Re: Bad news: bus resets not fixed yet Message-ID: <20030222194741.GA579@dhcp01.pn.xcllnt.net> References: <20030222002547.GA1247@dhcp01.pn.xcllnt.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.3i Sender: owner-freebsd-firewire@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Sat, Feb 22, 2003 at 10:25:54PM +0900, Hidetoshi Shimokawa wrote: > > > > I rebooted after Hidetoshi-san's patch has been committed and I > > also made sure to include ken's cd(4) fixes, but unfortunately > > the bus resets have returned. We may have timing problems that > > were hidden by the additional overhead of emitting the debug info. As a quick update: Enabling debug in sbp "fixed" the problem. dmesg shows (partial, it should not be much different (if at all) from the one I gave the URL to before -- now it's archived): : FreeBSD 5.0-CURRENT #10: Sat Feb 22 01:18:29 PST 2003 marcel@dhcp01.pn.xcllnt.net:/usr/src/sys/i386/compile/VAIO : fwohci0: mem 0xe0200000-0xe0203fff,0xe0205000-0xe02057ff at device 2.0 on pci2 fwohci0: PCI bus latency was changing to 250. pcib1: slot 2 INTA is routed to irq 9 fwohci0: OHCI version 1.10 (ROM=1) fwohci0: No. of Isochronous channel is 4. fwohci0: EUI64 08:00:46:03:01:24:cb:1d fwohci0: Phy 1394a available S400, 2 ports. fwohci0: Link S400, max_rec 2048 bytes. firewire0: on fwohci0 sbp_identify sbp_probe sbp0: on firewire0 sbp_attach (cold=1) fwohci0: Initiate bus reset fwohci0: BUS reset fwohci0: node_id = 0xc000ffc1, CYCLEMASTER mode firewire0: 2 nodes, maxhop <= 1, cable IRM = 1 (me) : firewire0: New S400 device ID:08004603011eb709 firewire0: Device SBP-II sbp_post_explore (sbp_cold=2) sbp_post_explore: EUI:08004603011eb709 spec=1 key=1. target 0 lun 0 found sbp0:0:0 LOGIN sbp0:0:0 ordered:0 type:5 EUI:08004603011eb709 node:0 speed:2 maxrec:10 new! sbp0:0:0 'Sony' 'PCGA-DSM5' 'ad1830' fwohci0: BUS reset fw_xfer_done: pending fwohci0: node_id = 0xc000ffc1, CYCLEMASTER mode firewire0: 2 nodes, maxhop <= 1, cable IRM = 1 (me) ad0: 28615MB [58140/16/63] at ata0-master UDMA100 pcm0: measured ac97 link rate at 5994 Hz sbp_post_explore (sbp_cold=1) sbp_post_explore: EUI:08004603011eb709 spec=1 key=1. sbp0:0:0 RECONNECT sbp0:0:0 ordered:0 type:5 EUI:08004603011eb709 node:0 speed:2 maxrec:10 new! sbp0:0:0 'Sony' 'PCGA-DSM5' 'ad1830' fw_attach_dev: 1 pending handlers called sbp0:0:0 ORB status src:1 resp:0 dead:0 len:1 stat:4 orb:00080513c sbp0:0:0 Access denied sbp0:0:0 reconnect failed sbp0:0:0 LOGIN (probe0:umass-sim0:0:0:0): INQUIRY. CDB: 12 0 0 0 24 0 (probe0:umass-sim0:0:0:0): CAM Status: SCSI Status Error (probe0:umass-sim0:0:0:0): SCSI Status: Check Condition (probe0:umass-sim0:0:0:0): UNIT ATTENTION asc:29,0 (probe0:umass-sim0:0:0:0): Power on, reset, or bus device reset occurred (probe0:umass-sim0:0:0:0): Retrying Command (per Sense Data) (probe0:umass-sim0:0:0:0): INQUIRY. CDB: 12 0 0 0 24 0 (probe0:umass-sim0:0:0:0): CAM Status: SCSI Status Error (probe0:umass-sim0:0:0:0): SCSI Status: Check Condition (probe0:umass-sim0:0:0:0): NOT READY asc:3a,0 (probe0:umass-sim0:0:0:0): Medium not present (probe0:umass-sim0:0:0:0): Unretryable error Mounting root from ufs:/dev/ad0s1a sbp0:0:0 login: len 12, ID 0, cmd 0000fffff0010100, recon_hold 0 sbp0:0:0 sbp_busy_timeout sbp0:0:0 sbp_agent_reset sbp0:0:0 sbp_do_attach sbp0:0:0 sbp_cam_scan_lun sbp0:0:0 ORB status src:1 resp:0 dead:0 len:7 stat:c orb:00080560c sbp0:0:0 Request aborted sbp0:0:0 XPT_SCSI_IO: cmd: 12 01 80 00 ff 00 ff af f7 f7, flags: 0x40, 6b cmd/255b data/18b sense sbp0:0:0 SCSI status 2 sfmt 0 valid 0 key 5 code 24 qlfr 0 len 7 sbp0:0:0 ORB status src:1 resp:0 dead:0 len:7 stat:c orb:000805740 sbp0:0:0 Request aborted sbp0:0:0 XPT_SCSI_IO: cmd: 00 00 00 00 00 00 ff af f7 f7, flags: 0xc0, 6b cmd/0b data/32b sense sbp0:0:0 SCSI status 2 sfmt 0 valid 0 key 6 code 29 qlfr 0 len 7 sbp0:0:0 sbp_cam_callback sbp0:0:0 ORB status src:1 resp:0 dead:0 len:7 stat:c orb:000805874 sbp0:0:0 Request aborted sbp0:0:0 XPT_SCSI_IO: cmd: 25 00 00 00 00 00 00 00 00 00, flags: 0x40, 10b cmd/8b data/32b sense sbp0:0:0 SCSI status 2 sfmt 0 valid 0 key 2 code 3a qlfr 0 len 7 cd0 at sbp0 bus 0 target 0 lun 0 cd0: Removable CD-ROM SCSI-0 device cd0: 50.000MB/s transfers cd0: Attempt to query device size failed: NOT READY, Medium not present This looks good. Next step: Rebuild with the exact same sources, but with debug disabled again, just to make sure no other changes have been made (ie cvs update). If the bus reset loop (I have to give it a name) reappears we have something that's reproducable, which means we can try to narrow down on the code... > > o Aren't bus resets expensive operations in general/principle? > > It's expensive, at least, for bus operation. > We cannot do transactions while bus reset phase. Hmmmm... As Katsushi-san explained, bus resets happen for a couple of reasons of which one can be unstability. If we know bus resets are expensive, we may be able to limit the rate of bus resets or simply bring down the bus (can we keep a device off the bus?) if we detect a bus reset loop... just a thought... > > o My logs show 6 resets per second (unoptimized :-). Isn't this > > high no matter if there's a bug or not? > > Too high and unusual. > Please note the driver doesn't initiate bus reset at all as far > as dmesg says. Your CD drive seems initiate bus reset. > (fwcontrol -t shows who initiated bus reset last time) > phy chip of your laptop may initiate bus reset. Ah... interesting. With debug enabled it shows: dhcp01% sudo fwcontrol -t crc_len: 4 generation:2 node_count:2 sid_count:2 id link gap_cnt speed delay cIRM power port0 port1 port2 ini more 00 1 1 S400 0 0 0W - P 1 0 01 1 1 S400 0 1 0W - C 0 0 I'm interested to see what it will show when I disable debug again. > Even with the success case: > http://www.xcllnt.net/~marcel/vaio.txt > Someone initiated bus reset just after login operation. I'd like to > know why this happens... Ok. I'll pay special attention to that. -- Marcel Moolenaar USPA: A-39004 marcel@xcllnt.net To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-firewire" in the body of the message