From: Scott Long
Date: Sat, 19 Mar 2005 11:46:31 -0700
To: Danny Braniss
cc: Sam Leffler
cc: scsi@freebsd.org
cc: net@freebsd.org
Subject: Re: iSCSI initiator driver beta version, testers wanted
Message-ID: <423C7387.8010804@samsco.org>

Danny Braniss wrote:
>>Scott Long wrote:
>>
>>>Danny Braniss wrote:
>>>
>>>
>>>>>with tags enabled, iSCSI is much faster, but it also causes a
>>>>>deadlock :-(
>>>>>this is what i run:
>>>>>	newfs -U /
>>>>>	cd /
>>>>>	restore rf /home/file.dump
>>>>>
>>>>>on the same motherboard, a dual Xeon, with smp disabled all is OK
>>>>>with smp enabled restore gets stuck, usually waiting on biord.
>>>>>the iscsi driver shows that all requests have been done, the sniffing
>>>>>shows the same (ie all requests have been done).
>>>>>
>>>>>so this leads me to think that there is some race condition that i'm not
>>>>>aware of in a SMP system, where xpt_done(ccb) is called while
>>>>>another process is calling biowait.
>>>>>
>>>>>another lead is that after restore gets stuck, the system slowly gets
>>>>>'stalled'.
>>>>>
>>>>>any insight is most welcome!, i'm also stuck.
>>>>
>>>>
>>>>
>>>>ahh, hate talking to myself :-)
>>>>
>>>>grabbing Giant before calling xpt_done solved it, so the problem is
>>>>most probably in the CAM ...
>>>>
>>>>danny
>>>>
>>>>
>>>>
>>>
>>>No, you need to grab Giant when calling xpt_done().  I even put an
>>>assertion into CAM to make sure of that.  Are you running with WITNESS
>>>and/or INVARIANTS enabled?  Those would have caught this problem.
>>>
>
> they are off :-(, would have saved me some time.
>
>>>Scott
>>
>>Oops, I forgot to mention that I recently addressed this in 6-CURRENT.
>>Now, much of the rest of the CAM API still requires Giant to be held, but
>>xpt_done() does not.  This only applies to 6-CURRENT, and I doubt that
>>it will be backported to 5-STABLE.
>>
>>Scott
>
>
> so what you are saying is that in 5.x it's a must to grab Giant before calling
> xpt_done, and not in 6?
>
> danny
>
>
>

Correct.  Again, turning on WITNESS and INVARIANTS is a very good thing
to do during development.

Scott
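
For reference, the 5.x locking rule discussed in this thread can be sketched
as below.  This is a user-space mock so that it compiles outside the kernel,
not driver code: mock_mtx, mock_giant, mock_ccb, mock_xpt_done() and
iscsi_complete_ccb() are hypothetical stand-ins.  In a real 5.x driver the
corresponding calls would presumably be mtx_lock(&Giant), xpt_done(ccb) and
mtx_unlock(&Giant), and the ownership check would be an assertion inside CAM
rather than a return value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's Giant mutex (assumption: not the real struct mtx). */
struct mock_mtx { bool held; };
static struct mock_mtx mock_giant;

static void mock_mtx_lock(struct mock_mtx *m)   { m->held = true; }
static void mock_mtx_unlock(struct mock_mtx *m) { m->held = false; }

/* Stand-in for a CAM control block; the real type is union ccb. */
struct mock_ccb { bool completed; };

/*
 * Stand-in for xpt_done().  It refuses to complete the CCB unless the
 * mock Giant is held, mimicking the ownership check mentioned in the
 * thread.  Returning false here replaces what would be an assertion
 * failure in a kernel built with the debugging options enabled.
 */
static bool mock_xpt_done(struct mock_ccb *ccb)
{
    if (!mock_giant.held)
        return false;
    ccb->completed = true;
    return true;
}

/*
 * Hypothetical driver completion helper following the 5.x rule:
 * take Giant, hand the CCB back to CAM, then drop Giant again.
 */
static bool iscsi_complete_ccb(struct mock_ccb *ccb)
{
    bool ok;

    assert(ccb != NULL);
    mock_mtx_lock(&mock_giant);
    ok = mock_xpt_done(ccb);
    mock_mtx_unlock(&mock_giant);
    return ok;
}
```

Calling mock_xpt_done() directly, without the lock, leaves the CCB
uncompleted, which is the mock's analogue of the failure that WITNESS and
INVARIANTS would have reported.  Going through iscsi_complete_ccb()
completes it and releases the lock afterwards.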