From owner-freebsd-net@FreeBSD.ORG Sat Mar 19 18:06:23 2005
To: Scott Long
In-reply-to: Your message of Sat, 19 Mar 2005 08:56:30 -0700
Date: Sat, 19 Mar 2005 20:06:20 +0200
From: Danny Braniss
cc: Sam Leffler
cc: scsi@freebsd.org
cc: net@freebsd.org
Subject: Re: iSCSI initiator driver beta version, testers wanted

> Scott Long wrote:
> > Danny Braniss wrote:
> >
> >>> with tags enabled, iSCSI is much faster, but it also causes a
> >>> deadlock :-(
> >>> this is what i run:
> >>>     newfs -U /
> >>>     cd /
> >>>     restore rf /home/file.dump
> >>>
> >>> on the same motherboard, a dual Xeon, with smp disabled all is OK
> >>> with smp enabled restore gets stuck, usually waiting on biord.
> >>> the iscsi driver shows that all requests have been done, the sniffing
> >>> shows the same (i.e. all requests have been done).
> >>>
> >>> so this leads me to think that there is some race condition that i'm not
> >>> aware of in a SMP system, where xpt_done(ccb) is called while
> >>> another process is calling biowait.
> >>>
> >>> another lead is that after restore gets stuck, the system slowly gets
> >>> 'stalled'.
> >>>
> >>> any insight is most welcome!, i'm also stuck.
> >>
> >> ahh, hate talking to myself :-)
> >>
> >> grabbing Giant before calling xpt_done solved it, so the problem is
> >> most probably in the CAM ...
> >>
> >> danny
> >>
> >
> > No, you need to grab Giant when calling xpt_done(). I even put an
> > assertion into CAM to make sure of that. Are you running with WITNESS
> > and/or INVARIANTS enabled? Those would have caught this problem.
> >

they are off :-(, would have saved me some time.

> > Scott
>
> Oops, I forgot to mention that I recently addressed this in 6-CURRENT.
> Now, much of the rest of cam API still requires Giant to be held, but
> xpt_done() does not. This only applies to 6-CURRENT, and I doubt that
> it will be backported to 5-STABLE.
>
> Scott

so what you are saying is that in 5.x it's a must to grab Giant before calling
xpt_done, and not in 6?

danny
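
The locking rule discussed above can be sketched in userspace. This is only
a model, not driver code: a pthread mutex stands in for Giant, and the names
giant_model, ccb_model, xpt_done_model and complete_request_5x are
hypothetical, made up for illustration. In a 5.x kernel driver the
corresponding sequence would presumably be mtx_lock(&Giant), xpt_done(ccb),
mtx_unlock(&Giant).

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for Giant (hypothetical; not the kernel mutex). */
static pthread_mutex_t giant_model = PTHREAD_MUTEX_INITIALIZER;
/* Ownership flag for this single-threaded demo. */
static bool giant_model_held = false;

/* Hypothetical stand-in for a CAM control block. */
struct ccb_model {
    bool done;
};

/*
 * Models the 5.x rule: completion is only valid with Giant held.
 * Returns 0 on success and -1 when the lock is not held, which
 * plays the role of the assertion mentioned in the thread.
 */
static int
xpt_done_model(struct ccb_model *ccb)
{
    if (!giant_model_held)
        return -1;
    ccb->done = true;
    return 0;
}

/* Completion path that takes the lock around the completion call. */
static int
complete_request_5x(struct ccb_model *ccb)
{
    int rc;

    assert(ccb != NULL);
    pthread_mutex_lock(&giant_model);
    giant_model_held = true;
    rc = xpt_done_model(ccb);
    giant_model_held = false;
    pthread_mutex_unlock(&giant_model);
    return rc;
}
```

Calling xpt_done_model() directly fails in this model, while going through
complete_request_5x() succeeds and marks the request done. Under the
6-CURRENT behaviour described above, the wrapper would no longer be needed
for xpt_done() itself.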