From: Alexander Motin
Date: Tue, 01 Nov 2011 15:05:37 +0200
To: Pawel Jakub Dawidek
Cc: freebsd-current@freebsd.org, freebsd-geom@freebsd.org
Subject: Re: RFC: GEOM MULTIPATH rewrite
Message-ID: <4EAFEEA1.80500@FreeBSD.org>
In-Reply-To: <20111101123944.GC4567@garage.freebsd.pl>
References: <4EAF00A6.5060903@FreeBSD.org> <20111101123944.GC4567@garage.freebsd.pl>

On 11/01/11 14:39, Pawel Jakub Dawidek wrote:
> On Mon, Oct 31, 2011 at 10:10:14PM +0200, Alexander Motin wrote:
>> Attempt to fix some GEOM MULTIPATH issues made me almost rewrite it. So
>> I would like to present my results and request for testing and feedback.
>>
>> The main changes:
>>  - Improved locking and destruction process to fix crashes in many cases.
>>  - Improved "automatic" configuration method to make it safe by reading
>>    metadata back from all specified paths after writing to one.
>>  - Added provider size check to reduce chance of conflict with other
>>    GEOM classes.
>>  - Added "manual" configuration method without using on-disk metadata.
>>  - Added "add" and "remove" commands to manage paths manually.
>>  - Failed paths no longer dropped from GEOM, but only marked as FAIL and
>>    excluded from I/O operations.
>>  - Automatically restore failed paths when all others paths are marked
>>    as failed, for example, because of device-caused (not transport) errors.
>>  - Added "fail" and "restore" commands to manually control FAIL flag.
>>  - GEOM is now destroyed on last provider disconnection. IMHO it is
>>    right to do if device was completely removed.
>>  - Added optional Active/Active mode support. Unlike Active/Passive
>>    mode, load evenly distributed between all working paths. If supported by
>>    device, it allows to significantly improve performance, utilizing
>>    bandwidth of all paths. It is controlled by -A option during creation.
>>    Disabled by default now.
>>  - Improved `status` and `list` commands output.
>>
>> Latest patch can be found here:
>> http://people.freebsd.org/~mav/gmultipath4.patch
>>
>> Feedbacks are welcome!
>>
>> Sponsored by: iXsystems, Inc.
>
> There are two possible issues that comes to my mind, not sure if you
> address them.
>
> 1. When configuration is based on on-disk metadata, GEOM spoil/taste is
>    not fully helpful - if you have two paths: da0 and da1 and I write
>    to da0, gmultipath won't be informed by GEOM that da1 changed as well.
>    One solution is to basically keep all paths open exclusively all the
>    time, even if gmultipath provider is not open or emulate spoil/taste
>    for other paths if any path was modified.

I now open all underlying providers exclusively on attach to spoil other
consumers. This is configurable via sysctl.

> 2. In active/active mode do you do anything to handle possible
>    reordering? Ie. if you have overlapping writes and send both of them
>    using different paths, you cannot be sure that order will be
>    preserved. Most of the time that's not a problem, as file systems
>    rarely if at all send overlapping writes to device, but this is weak
>    assumption.

No, I don't. I doubt it is sane to send even dependent I/O simultaneously
without waiting for completion, let alone overlapping I/O.
Since most present-day devices support command queuing, and so are
officially allowed to reorder simultaneous commands as they see fit, I am
not sure why the layers above should be stricter, especially in cases
where that is problematic. If somebody has ideas on why and how to
implement it, I am ready to discuss.

-- 
Alexander Motin
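
The reordering question raised in point 2 can be illustrated with a toy model. This is a minimal sketch and not code from the patch: the function name `apply_writes`, the tuple layout, and the latencies are made up for illustration. It applies two overlapping writes to an 8-byte "disk" in completion order, where completion time is issue time plus the latency of the path the write was sent down.

```python
def apply_writes(writes, size=8):
    """Apply writes to a zero-filled buffer in order of completion time.

    Each write is (issue_time, path_latency, offset, data). All values are
    hypothetical and only model "two paths with different latency".
    """
    disk = bytearray(size)
    # sorted() is stable, so writes completing at the same time keep list order.
    for issue_time, latency, offset, data in sorted(
        writes, key=lambda w: w[0] + w[1]
    ):
        disk[offset:offset + len(data)] = data
    return bytes(disk)


first = (0, 1, 0, b"AAAA")   # issued first, covers bytes 0..3
second = (1, 1, 2, b"BBBB")  # issued second, covers bytes 2..5

# Same latency on both paths: the later write wins in the overlapping bytes.
in_order = apply_writes([first, second])

# The first write goes down a slow path (latency 5), completes last, and
# overwrites part of the second write.
reordered = apply_writes([(0, 5, 0, b"AAAA"), second])

print(in_order)   # b'AABBBB\x00\x00'
print(reordered)  # b'AAAABB\x00\x00'
```

The two results differ only in the overlapping bytes 2..3. That is the case where ordering matters, and the same outcome is possible on a single path whenever the device itself reorders queued commands.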
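
The path handling described in the quoted change list (FAIL flag instead of dropping a path, automatic restore once every path is failed, Active/Passive versus Active/Active selection) can be sketched as below. This is an illustrative model under assumptions, not the kernel implementation: the class `Multipath` and its methods are hypothetical names, round-robin is only one possible way to distribute load evenly, and clearing the flag on all paths is one reading of the "automatically restore" item.

```python
class Multipath:
    """Toy model of path selection; every name here is hypothetical."""

    def __init__(self, names, active_active=False):
        # A failed path stays in the table and is only flagged.
        self.failed = {name: False for name in names}
        self.active_active = active_active  # models the -A creation option
        self._turn = 0

    def fail(self, name):
        """Set the FAIL flag; if no working path remains, clear all flags."""
        self.failed[name] = True
        if all(self.failed.values()):
            for other in self.failed:
                self.failed[other] = False

    def restore(self, name):
        """Clear the FAIL flag manually."""
        self.failed[name] = False

    def working(self):
        return [name for name, bad in self.failed.items() if not bad]

    def pick(self):
        """Choose the path for the next I/O request."""
        paths = self.working()
        if not self.active_active:
            return paths[0]  # Active/Passive: stay on the first working path
        name = paths[self._turn % len(paths)]  # Active/Active: round-robin
        self._turn += 1
        return name


mp = Multipath(["da0", "da1"], active_active=True)
print([mp.pick() for _ in range(4)])  # ['da0', 'da1', 'da0', 'da1']
mp.fail("da1")
print(mp.pick())                      # 'da0', the only working path
mp.fail("da0")                        # last working path fails
print(mp.working())                   # ['da0', 'da1'], flags were cleared
```

Keeping failed paths in the table is what makes the automatic restore possible: a device-caused error fails every path in turn, and without the restore step the model would be left with no path to retry on.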