Date: Fri, 22 Sep 2017 10:13:59 +0100
From: Karl Pielorz
To: "Rodney W. Grimes"
cc: freebsd-xen@freebsd.org
Subject: Re: Storage 'failover' largely kills FreeBSD 10.x under XenServer?

--On 21 September 2017 15:49 +0100 Karl Pielorz wrote:

>> Are these timeouts coming from Dom0 or from a VM in a DomU?
>
> domU - as above, dom0 grumbles, but generally seems OK about things.
> dom0 doesn't do anything silly like invalidate the VM's disks or anything.

I've chased this down in the code. Having briefly looked at blkfront/blkback, I can see all the mechanisms in place for performing I/O, but I cannot see any timeouts set anywhere in that code. I can see the callback that fires when the I/O fails.

It looks like, for the purposes of xbd, I/O requests are just gathered up, processed, and then fired off to XenServer (i.e. upstream). If they fail, callbacks are fired and action is taken. But nowhere can I see any timeouts either specified, or specifiable, by FreeBSD - nor can I see (certainly at that level) any I/O retries in that code.

So,

 - Timeouts may be set by Xen (i.e. outside of FreeBSD's scope)

 - I/O may be retried by 'higher levels' than blkfront/blkback - but I can't see where.

It may simply be that I/O from FreeBSD through XenServer is a 'fire and forget' process, where FreeBSD has no control over timeouts, and currently has no code (at that level) to perform retries.

I'd need to figure out what sits above blkfront/blkback, and whether that's likely to do any retries.

It's certainly not CAM running the storage - so those timeout/retry sysctl values are completely irrelevant.

More study, and maybe a quick post to -hackers to see what lies above blkfront/back etc.

-Kp
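
To make the 'fire and forget' reading above concrete, here is a minimal user-space sketch in C. It is not the blkfront code and makes no claim about how xbd is really structured: every name in it (io_request, submit_fire_and_forget, flaky_backend, submit_with_retries) is hypothetical, the "backend" is a plain function standing in for whatever answers the request upstream, and everything runs synchronously, so timeouts are not modelled at all. It only illustrates the difference between a layer that reports a failure through a callback and a layer above it that would have to add retries itself.

```c
/* Hypothetical user-space model of the behaviour described above.
 * NOT the real blkfront/xbd code; all names are invented for illustration. */
#include <assert.h>
#include <stddef.h>

enum io_status { IO_OK = 0, IO_ERROR = 1 };

struct io_request;
typedef void (*io_done_fn)(struct io_request *req, enum io_status status);

struct io_request {
    unsigned long  sector;     /* where the I/O is aimed */
    io_done_fn     done;       /* completion callback */
    int            completed;  /* set to 1 by the callback */
    enum io_status result;     /* status seen by the callback */
};

/* Stand-in for the upstream side that answers the request. */
typedef enum io_status (*backend_fn)(const struct io_request *req);

/* Fire the request and pass on whatever comes back.
 * There is no timer and no retry at this level. */
static void submit_fire_and_forget(struct io_request *req, backend_fn backend)
{
    assert(req != NULL && req->done != NULL && backend != NULL);
    enum io_status status = backend(req);
    req->done(req, status);
}

/* Completion callback: just records the outcome on the request. */
static void record_result(struct io_request *req, enum io_status status)
{
    req->completed = 1;
    req->result = status;
}

/* Backend that fails a set number of times before succeeding,
 * loosely imitating storage that is briefly unavailable. */
static int failures_left;

static enum io_status flaky_backend(const struct io_request *req)
{
    (void)req;
    if (failures_left > 0) {
        failures_left--;
        return IO_ERROR;
    }
    return IO_OK;
}

/* What a higher layer would have to add on top: a bounded retry loop. */
static enum io_status submit_with_retries(struct io_request *req,
                                          backend_fn backend,
                                          int max_attempts)
{
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        req->completed = 0;
        submit_fire_and_forget(req, backend);
        if (req->result == IO_OK)
            return IO_OK;
    }
    return IO_ERROR;
}
```

In this model a single submit_fire_and_forget() call against a backend that is still failing simply hands IO_ERROR to the callback, whereas submit_with_retries() only succeeds if the backend recovers within max_attempts. Whether anything above blkfront actually behaves like the second function is exactly the open question.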