From owner-freebsd-stable Wed Mar 21 21:52:59 2001
Delivered-To: freebsd-stable@freebsd.org
Received: from mass.dis.org (mass.dis.org [216.240.45.41])
	by hub.freebsd.org (Postfix) with ESMTP id 15FDD37B71F
	for ; Wed, 21 Mar 2001 21:52:43 -0800 (PST)
	(envelope-from msmith@mass.dis.org)
Received: from mass.dis.org (localhost [127.0.0.1])
	by mass.dis.org (8.11.2/8.11.2) with ESMTP id f2M1KwE00867;
	Wed, 21 Mar 2001 17:21:02 -0800 (PST)
	(envelope-from msmith@mass.dis.org)
Message-Id: <200103220121.f2M1KwE00867@mass.dis.org>
X-Mailer: exmh version 2.1.1 10/15/1999
To: rand@meridian-enviro.com
Cc: freebsd-stable@FreeBSD.ORG, Mike Tancsa, bryanh@meridian-enviro.com
Subject: Re: 3ware problems
In-reply-to: Your message of "Wed, 21 Mar 2001 17:50:39 CST."
	<87u24m7kc0.wl@delta.meridian-enviro.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Wed, 21 Mar 2001 17:20:58 -0800
From: Mike Smith
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> Mike> If you can add another function like twe_printstate that invokes
> Mike> twe_print_request on each of the requests on the busy queue and
> Mike> let me know what they look like, that might give me some clues.
>
> OK, I haven't written the twe_printstate function yet, but I think I
> have the request. I got the filesystem wedged first, and then browsing
> the datastructures with DDB, I think I've found the busy queue. Here's
> the request:

Cool, this works just as well. 8)

> db> call twe_print_request(0xc1529800)
> twe0: CMD: request_id 89 opcode size 7 unit 0 host_id 0
> twe0: status 0 flags 0x0 count 16 sgl_offset 3
> twe0: lba 264703
> twe0: 0: 0xce4f000/4096
> twe0: 1: 0x2ab0000/4096
> twe0: tr_command 0xc1529800/0x1749d800 tr_data 0xcb928000/0xce4f000,8192
> twe0: tr_status 2 tr_flags 0x1 tr_complete 0xc011f170 tr_private 0

Er. This is bad; tr_status == 2 means that the command has been
completed; it shouldn't still be on the busy queue.
Can you check to make sure you have the right queue here?

> I'm rebuilding the kernel now with the function twe_printstate, after
> I figured it out with the debugger. (This reminds me of a saying that
> has to do with horses and carriages, hmm.)

Hrm. It *should* be pretty easy; I'm sorry I confused you with the
'printstate' reference. You should be able to fix up twe_report to just
dump the busy queue:

    struct twe_request *tr;

    ...

    /* TAILQ_FOREACH takes a pointer to the queue head */
    TAILQ_FOREACH(tr, &sc->twe_busy, tr_link)
        twe_print_request(tr);

> Oh, btw, it took over 3 million rows to get it stuck this time. Gotta
> love a test cycle of 6 hours or so.

Sigh. This is obviously a really weird case; possibly either an
extremely narrow race, or some very borderline PCI issue.

One question I should have asked, but don't recall whether you answered:
are you using an AMD K7 system by any chance? We've seen some *very*
weird behaviour with these controllers in some K7 systems.

Thanks again for your help here.

Regards,
Mike

-- 
... every activity meets with opposition, everyone who acts has his
rivals and unfortunately opponents also. But not because people want
to be opponents, rather because the tasks and relationships force
people to take different points of view. [Dr. Fritz Todt]
V I C T O R Y   N O T   V E N G E A N C E

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message
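
For anyone who wants to try the queue walk outside the kernel, here is a minimal userland sketch of the same idea: iterate a busy queue with TAILQ_FOREACH and count requests that are marked complete but still queued. It is not the twe driver's code. The demo_* types and functions, the DEMO_STATUS_* names, the value used for "busy" and request id 88 are all made up for illustration; only the field names twe_busy, tr_link and tr_status, request id 89, and "status 2 means completed" come from the message above.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/queue.h>

/* Hypothetical stand-ins for the driver's definitions. */
#define DEMO_STATUS_BUSY     1  /* assumed value, illustration only */
#define DEMO_STATUS_COMPLETE 2  /* per the message: 2 == completed */

struct demo_request {
    int tr_request_id;
    int tr_status;
    TAILQ_ENTRY(demo_request) tr_link;   /* queue linkage */
};

struct demo_softc {
    TAILQ_HEAD(, demo_request) twe_busy; /* requests handed to the controller */
};

/*
 * Walk the busy queue, print each request, and return the number of
 * requests that are marked complete yet are still on the queue.
 */
static int
demo_report_busy(struct demo_softc *sc)
{
    struct demo_request *tr;
    int stale = 0;

    assert(sc != NULL);
    /* The second argument is a pointer to the queue head. */
    TAILQ_FOREACH(tr, &sc->twe_busy, tr_link) {
        printf("request_id %d tr_status %d\n",
            tr->tr_request_id, tr->tr_status);
        if (tr->tr_status == DEMO_STATUS_COMPLETE)
            stale++;
    }
    return stale;
}

/* Build a two-entry queue resembling the reported state and scan it. */
static int
demo_run(void)
{
    struct demo_softc sc;
    struct demo_request a = { .tr_request_id = 88,
        .tr_status = DEMO_STATUS_BUSY };
    struct demo_request b = { .tr_request_id = 89,
        .tr_status = DEMO_STATUS_COMPLETE };

    TAILQ_INIT(&sc.twe_busy);
    TAILQ_INSERT_TAIL(&sc.twe_busy, &a, tr_link);
    TAILQ_INSERT_TAIL(&sc.twe_busy, &b, tr_link);
    return demo_report_busy(&sc);
}
```

Calling demo_run() from a main() prints one line per queued request and returns the count of completed-but-still-queued entries. A nonzero count is the condition described above as "bad".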