From owner-freebsd-questions@freebsd.org Thu Sep 10 17:53:27 2015
From: "Chad J. Milios"
To: Erich Dollansky
Cc: "freebsd-questions@freebsd.org"
Subject: Re: mdconfig creating file based memory disk
Date: Thu, 10 Sep 2015 13:44:18 -0400
In-Reply-To: <20150910111034.20b97c41@X220.alogt.com>
References: <20150910111034.20b97c41@X220.alogt.com>

> On Sep 9, 2015, at 11:10 PM, Erich Dollansky wrote:
>
> Hi,
>
> I just came across a simple question. What will happen when I create
> two memory disks using the same file?
>
> Example:
>
> mdconfig -f /usr/home/swap/swapfile -u 0
> mdconfig -f /usr/home/swap/swapfile -u 1
>
> and then I do a
>
> swapon /dev/md0
> swapon /dev/md1
>
> It gives me double the size of 'swapfile' as swap space. It is obvious
> to me that this must fail.
>
> Shouldn't there be a note in the documentation?
>
> Erich

Perhaps, but if we documented every way in which FreeBSD allows one to shoot oneself in the foot, the docs would probably more than triple in size. :)

This is an interesting experiment, but I can't imagine anyone inviting the danger while actually expecting to get away with such a configuration, and I don't imagine happening onto it by accident is any more likely than the other infinite potentially dangerous misconfigurations of *nix. I doubt this merits a mention for safety's sake, though as an illustration of how swap actually works internally it has a lot of merit. I'd be curious to see more thorough test results and discussion from those with intimate knowledge of the virtual memory and swapper/pager systems.

Imagine the following analog: a hypothetical database software which mmap()s a file, possibly larger than physical memory, to rely on the VM system for demand paging. Now imagine two or more instances of the database software being started with hard links to the same underlying file, and both/all are allowed to read and write. If the software is SMP-capable and uses locks or data structures WITHIN the mapped region to handle synchronization (and doesn't go out of its way to cache or process the data itself, beyond the help the kernel already provides, outside that region for moments during which the data could become stale), then the multiple instances could all serve data from, AND modify data in, that same single source of truth and will remain stable and in sync even without msync()ing to the underlying file or storage. I'm also positive this holds true through any (or an arbitrary and very large) number/combination of indirections through hardlinks, symlinks, mdconfig, nullfs and/or unionfs (or it intends to, so any failure or race should be considered a kernel bug).
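
The mmap() half of this analogy can be tried out in user space. What follows is only a minimal sketch under stated assumptions, not anything taken from the thread: the function name shared_mapping_demo and the file names "data" and "data-link" are hypothetical, it assumes a Unix-like system where Python's mmap module creates MAP_SHARED mappings by default, and it uses a single process with two mappings rather than two database instances. It says nothing about how the swap pager behaves with two md devices on one file.

```python
import mmap
import os
import tempfile


def shared_mapping_demo() -> bytes:
    """Map one file through two hard links; write via one mapping, read via the other."""
    with tempfile.TemporaryDirectory() as tmp:
        path_a = os.path.join(tmp, "data")        # hypothetical file name
        path_b = os.path.join(tmp, "data-link")   # second name for the same inode

        # Create a one-page backing file, then hard-link it.
        with open(path_a, "wb") as f:
            f.write(b"\0" * mmap.PAGESIZE)
        os.link(path_a, path_b)

        with open(path_a, "r+b") as fa, open(path_b, "r+b") as fb:
            # On Unix the default flags are MAP_SHARED, so both mappings
            # refer to the same underlying pages of the file.
            map_a = mmap.mmap(fa.fileno(), mmap.PAGESIZE)
            map_b = mmap.mmap(fb.fileno(), mmap.PAGESIZE)
            try:
                map_a[0:5] = b"hello"      # write through the first mapping, no flush()
                return bytes(map_b[0:5])   # read back through the second mapping
            finally:
                map_a.close()
                map_b.close()


if __name__ == "__main__":
    print(shared_mapping_demo())
```

The write is never flushed (flush() is Python's wrapper around msync()), yet the second mapping observes it, which is the "single source of truth" behaviour described above. Real concurrent writers would still need their own locking inside the mapped region, as the paragraph notes.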

So without inspecting the relevant kernel source myself, based on the little experiment you've conducted, I can imagine the swap perhaps having been set up in a way that the data structure(s) that map swapped regions are either fully inside or fully outside the swap partition/file, in a way in which any "surprise" data showing up in the "other" swap device (besides the one it was written to) ends up being non-problematic. I am just brainstorming here and would love it if someone with knowledge rather than conjecture chimes in. :)

At the outset of the experiment you describe, my expectation was almost certain spectacular failure. Anything else actually is quite curious, and if such a config doesn't just burst right into flames I consider it quite a testament to sound *nix engineering. I'd be interested to hear someone exercise it with more swapping out and paging in of data and verifying the data and semantics.