Date: Tue, 19 Jun 2018 15:00:00 -0700
From: bob prohaska
To: Warner Losh
Cc: "Rodney W. Grimes", "freebsd-arm@freebsd.org", bob prohaska
Subject: Re: GPT vs MBR for swap devices
Message-ID: <20180619220000.GA89266@www.zefox.net>
References: <20180614175622.GC35161@www.zefox.net> <201806142110.w5ELAL0N046840@pdx.rh.CN85.dnsmgr.net> <20180615035225.GA37370@www.zefox.net>
List-Id: "Porting FreeBSD to ARM processors."

On Tue, Jun 19, 2018 at 01:10:15PM -0600, Warner Losh wrote:
> On Thu, Jun 14, 2018 at 10:00 PM, Warner Losh wrote:
> >
> > On Thu, Jun 14, 2018, 9:52 PM bob prohaska wrote:
> >
> >> On Thu, Jun 14, 2018 at 02:10:21PM -0700, Rodney W. Grimes wrote:
> >> >
> >> > It might be interesting to do in order the swapon
> >> > commands to 1G USB flash, 1G SD flash, 2G SD flash,
> >>
> >> It seems clear that USB flash swap, alone or in any
> >> combination, fails early in buildworld.
> >

My statement is now known to be mistaken. Failures happen when USB flash
swap is placed on the same device as /usr. USB flash swap or USB
mechanical swap works fine when placed on a device remote from /usr.

> >
> > I think that's because USB flash can't swap fast enough to keep up with
> > the page demand. You might be able to confirm this by looking at the write
> > rates to the swap portions for the various other media with gstat. I
> > suspect its FTL is doing more expensive garbage collection under a swap
> > work load leading to long pauses from time to time that the VM system
> > responds to by starting OOM too soon.
> >
>
> Looking at the data posted, I see that we have a 2s latency averaged over
> 10s. This means that the drive is basically unresponsive. So the average
> latency is 2s. That means that the max latency is likely way more than that
> (likely as much as 10-20s if my experience with SSD latency distributions
> can be trusted). The latency bounces around a bit (and there appears to be
> some missing data), but this is what I expected to see.
>
> Now, maybe we have an issue with the USB stack that's causing this (missed
> interrupts leading to polling 'saving' the day after a massively long time,
> for example), or the USB drives are as bad as I've experienced them to be.
> In any event, if we can't retire the dirty pages fast enough, we'll get
> into OOM to try to cope with too many dirty pages. The different control
> loops in the system to moderate these things have some 'hidden' assumptions
> that we can launder pages faster than this, so I think we're falling off
> the rails when we can't, even when we have available swap space.
>

One more test has completed. The Pi3 was configured with 1 GB swap on the
microSD card. It was expected to run out of swap and start the OOM
assassin. Instead, it ran until the swap usage hit a little over 80% and
then traffic with da0d (/usr, likely /usr/obj) got stuck. Traffic to da0d
remained stuck for several sampling cycles, swap usage dropped to a few
percent, and I finally pulled the plug when the debugger would not start.

The console messages are of a sort seen before and might suggest hardware
trouble, but after rebooting and fsck the machine seems just fine and is
running another test. It's completely unclear to me what triggered the
USB stoppage.

The relevant files are at
http://www.zefox.net/~fbsd/rpi3/swaptests/newtests/1gbsdflash_swapinfo/

I hope they're useful.

Thanks for reading,

bob prohaska
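
A note on the averaged latency figure quoted above: a 2s mean over a
sampling window says little about the worst single request. The sketch
below is only an illustration of that arithmetic. The latency values are
made up for the example (they are not taken from the posted gstat logs),
and summarize() is a hypothetical helper name.

```python
from statistics import mean


def summarize(latencies_s):
    """Return (mean, max) of a list of per-request latencies in seconds."""
    return mean(latencies_s), max(latencies_s)


# Hypothetical window: eight quick requests and two long stalls.
# These numbers are assumptions chosen so the mean comes out at 2 s.
window = [0.25] * 8 + [9.0] * 2

avg, worst = summarize(window)
print(f"mean={avg:.2f}s max={worst:.2f}s")
```

With a mix like this, the mean matches the 2s figure while individual
requests stall several times longer, which is the kind of spread the
quoted reasoning anticipates.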