From owner-freebsd-hackers@FreeBSD.ORG  Fri Jan 12 19:31:06 2007
X-Mailer: exmh version 2.7.2 01/07/2005 with nmh-1.2
To: Pawel Jakub Dawidek <pjd@FreeBSD.org>
In-reply-to: Your message of Fri, 12 Jan 2007 20:02:49 +0100 .
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Fri, 12 Jan 2007 21:31:04 +0200
From: Danny Braniss <danny@cs.huji.ac.il>
Message-ID: <E1H5S7E-000BS0-RR@cs1.cs.huji.ac.il>
Cc: freebsd-scsi@FreeBSD.org, freebsd-hackers@freebsd.org
Subject: Re: iSCSI disconnects dilema 

> On Tue, Jan 09, 2007 at 09:06:46AM +0200, Danny Braniss wrote:
> > Hi,
> > While I think I have almost solved the problem of network disconnects,
> > a major problem dawned on me:
> > When a 'local' disk crashes, the kernel will probably hang/panic/crash.
> > If I don't try to recover, then nothing changes in the above scenario.
> > If I try to recover, then the client does not know that it should
> > umount/fsck/mount.
> > All this seems familiar from removing a floppy/disk-on-key while it's
> > mounted, where we could always say "you shouldn't have done that!". With
> > a network connection it can happen very often: rebooting the target, a
> > network hiccup, etc.
> >
> > So, any ideas?
> 
> In my opinion it should be done this way:
> 
> You have a queue of I/O requests. You send them to the other end and wait
> for confirmation. Until confirmation is received, you keep the requests
> queued. If the other end dies, you try to reconnect (until some timeout
> expires, the processes which sent those requests will just wait). If you
> reconnect successfully, you resend the unconfirmed requests; if you can't
> reconnect, you just pass the errors up.
> 
> This is what I did in ggate and it seems to work.
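
[A minimal sketch of the scheme described above, written in Python for
brevity. This is not ggate's code: PendingQueue, the send callback and the
tag numbering are hypothetical names used only for illustration.]

```python
from collections import OrderedDict


class PendingQueue:
    """Keeps every request queued until the other end confirms it."""

    def __init__(self, send):
        self.send = send              # hypothetical transport callback
        self.pending = OrderedDict()  # tag -> request, in submission order
        self.next_tag = 0

    def submit(self, request):
        tag = self.next_tag
        self.next_tag += 1
        self.pending[tag] = request   # queue first, then transmit
        self.send(tag, request)
        return tag

    def confirm(self, tag):
        # Confirmation received: the request no longer needs to be kept.
        self.pending.pop(tag, None)

    def on_reconnect(self):
        # Reconnected in time: resend everything still unconfirmed.
        for tag, request in list(self.pending.items()):
            self.send(tag, request)

    def on_timeout(self):
        # Reconnect failed: hand the unconfirmed requests back so the
        # caller can pass the errors up.
        failed = list(self.pending.values())
        self.pending.clear()
        return failed
```

A caller would invoke on_reconnect() after a successful reconnect and
on_timeout() once the reconnect timeout expires.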

That is basically what I'm doing - unacked requests get requeued.
The problem I fear (and maybe I'm paranoid :-):

Assume the following scenario: the client (initiator) sends a write command,
the target acks it, and then the target crashes. If the write was never
completed, the initiator goes on as if nothing ever happened.
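
To make the scenario concrete, here is a toy simulation in Python. It
assumes (purely for illustration) that the target acks a write once it is
in a volatile cache, before it reaches stable storage. Target, run_scenario
and block number 7 are made-up names, not part of any iSCSI implementation.

```python
class Target:
    """Toy target that acks a write before it reaches stable storage."""

    def __init__(self):
        self.cache = {}   # volatile: lost on a crash
        self.disk = {}    # stable: survives a crash

    def write(self, block, data):
        self.cache[block] = data
        return "ack"      # assumption: acked from cache, not from disk

    def flush(self):
        self.disk.update(self.cache)
        self.cache.clear()

    def crash(self):
        self.cache.clear()  # anything not yet flushed is gone


def run_scenario():
    target = Target()
    pending = {7: b"new data"}           # initiator's unacked requests
    if target.write(7, pending[7]) == "ack":
        del pending[7]                   # acked, so the initiator forgets it
    target.crash()                       # crash before flush() ever runs
    return pending, target.disk
```

After run_scenario() the initiator's pending table is empty and block 7 is
not on the target's disk, so requeueing unacked requests leaves nothing to
resend.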

danny