From owner-freebsd-xen@freebsd.org  Fri Sep 22 09:14:12 2017
Date: Fri, 22 Sep 2017 10:13:59 +0100
From: Karl Pielorz <kpielorz_lst@tdx.co.uk>
To: "Rodney W. Grimes" <freebsd-rwg@pdx.rh.CN85.dnsmgr.net>
cc: freebsd-xen@freebsd.org
Subject: Re: Storage 'failover' largely kills FreeBSD 10.x under XenServer?
Message-ID: <29F6204C1998F74F077A94D9@[10.12.30.106]>
In-Reply-To: <D5D409CD045CF518CE957E77@[10.12.30.106]>
References: <201709211423.v8LENKvN094067@pdx.rh.CN85.dnsmgr.net>
 <D5D409CD045CF518CE957E77@[10.12.30.106]>
X-Mailer: Mulberry/4.0.8 (Win32)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
X-BeenThere: freebsd-xen@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Discussion of the freebsd port to xen - implementation and usage
 <freebsd-xen.freebsd.org>


--On 21 September 2017 15:49 +0100 Karl Pielorz <kpielorz_lst@tdx.co.uk> 
wrote:

>> Are these timeouts coming from Dom0 or from a VM in a DomU?
>
> domU - as above, dom0 grumbles, but generally seems OK about things. dom0
> doesn't do anything silly like invalidate the VM's disks or anything.

I've chased this down in the code. Having briefly looked at 
blkfront/blkback, I can see all the mechanisms in place for performing I/O, 
but I cannot see any timeouts set anywhere in that code.

I can see the callback that fires when the I/O fails.

It looks like, for the purposes of xbd, I/O requests are just gathered up, 
processed, and then fired off to XenServer (i.e. upstream). If they fail, 
callbacks are fired and action is taken.

But nowhere can I see any timeouts either specified or specifiable by 
FreeBSD, nor can I see (certainly at that level) any I/O retries in that 
code.

So,

  - Timeouts may be set by Xen (i.e. outside of FreeBSD's scope)
  - I/O may be retried by 'higher levels' than blkfront/blkback - but I 
    can't see where.

It may simply be that I/O from FreeBSD through XenServer is a 'fire and 
forget' process, where FreeBSD has no control over timeouts, and currently 
has no code (at that level) to perform retries.
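
To make that concrete, here is a rough sketch in C of the pattern I think 
I'm seeing. This is NOT the real blkfront/xbd code: every name in it 
(io_req, backend_submit, submit_with_retries, fail_first) is made up for 
illustration, and the "backend" is just a stub that fails a set number of 
times. It only shows a submit path that arms no timer and relies on a 
completion callback, with any retrying left to a hypothetical layer above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model only; none of these names exist in FreeBSD or Xen. */
struct io_req;
typedef void (*io_done_fn)(struct io_req *req, int error);

struct io_req {
    int        error;     /* status reported by the last completion */
    int        attempts;  /* how many times the request was submitted */
    bool       done;      /* set once the completion callback has run */
    io_done_fn on_done;   /* called by the backend when it answers */
};

/* Test knob: the stub backend fails this many submissions, then succeeds. */
static int fail_first = 0;

/*
 * Stand-in for "fire it off upstream". Note that no timer is armed here:
 * if the backend never answered, nothing on this side would ever fire.
 */
static void backend_submit(struct io_req *req)
{
    assert(req->on_done != NULL);
    int error = 0;
    if (fail_first > 0) {
        fail_first--;
        error = 5;        /* EIO-like failure */
    }
    req->attempts++;
    req->on_done(req, error);
}

/* Completion callback: just records what the backend reported. */
static void record_result(struct io_req *req, int error)
{
    req->error = error;
    req->done = true;
}

/*
 * Hypothetical "higher level" that resubmits on failure. The lower layer
 * above has no retry logic of its own, so any retries would live here.
 */
static int submit_with_retries(struct io_req *req, int max_attempts)
{
    req->on_done = record_result;
    req->attempts = 0;
    do {
        req->done = false;
        backend_submit(req);
    } while (req->error != 0 && req->attempts < max_attempts);
    return req->error;
}
```

In this toy version the stub always answers, so the loop always ends. The 
worrying case during a storage failover is the one the sketch can't show: 
the completion simply never arriving, with nothing on the FreeBSD side to 
time it out.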

I'd need to figure out what sits above 'blkfront/blkback' - and whether 
that's likely to do any retries.

It's certainly not CAM running the storage - so those timeout/retry sysctl 
values are completely irrelevant.

More study needed, and maybe a quick post to -hackers to see what lies 
above blkfront/blkback.

-Kp