From: Mark Johnston
Date: Sun, 27 Nov 2016 20:18:47 -0800
To: David Cross
Cc: freebsd-hackers@freebsd.org
Subject: Re: FreeBSD 11 i386 disk deadlock (I think) (now with reproduction steps!)
Message-ID: <20161128041847.GA65249@charmander>

On Sun, Nov 27, 2016 at 03:17:13PM -0500, David Cross wrote:
> So, narrowing this down, I think it has something to do with geli swap
> (since I can easily reproduce it with geli swap, but have yet to reproduce
> it without).. and I have a bit of a convoluted way almost anyone can
> reproduce it with bhyve.
> (Note, I haven't been able to get a crashdump,
> since apparently the VM system being locked up prevents that, but with
> watchdogd, I have been able to get into DDB)
>
> Anyway, my reproduction steps, I used the 11.0 Retail DVD, but I fully
> suspect the 11.0-RELEASE image will be fine to install an i386 image into
> bhyve; I install to vtbd disks (even though my 'real' case is to an ada
> device, that this can be repro-ed across such wide "hardware" really
> reduces the likelihood of a device driver issue)
>
> After it's installed, I start my VM with the following (dropping memory to
> the floor, well below my "real" machine, but the emulated machine is much
> faster and I suspect this is a race condition somewhere), note the options
> to the virtio-blk device to pin it to "real" and not hit the host vmcache,
> again speed seems to be key here, and slowing things down makes it more
> likely to happen.
>
> bhyveload -m 64M -d /usr/bhyve/11.0.1-i386.img fbsd11-i386
> bhyve -u -A -c 1 -H -m 64M -C -s 0,hostbridge -s 1,lpc -s 2,virtio-net,tap0
> -s 3,virtio-blk,/usr/bhyve/11.0.1-i386.img,nocache,direct -l
> com1,/dev/nmdm0A fbsd11-i386
>
> At this point:
> Log into the VM
> cd /usr/src
> /usr/bin/make buildkernel
>
>
> For me this has hung 99% of the time at:
> objcopy --strip-debug kernel
>
> Once you've gotten here once, I have been able to just skip the rest of the
> compile, cd /usr/obj/usr/src/sys/GENERIC run that command directly and
> trigger the condition.
>
> What I have at this point is the following DDB ps list:
>
> db> ps
>    pid  ppid  pgrp   uid  state  wmesg    wchan       cmd
> ...
>     50     0     0     0  DL     vmwait   0xc1c4f6d8  [g_eli[0] vtbd0p3]
> ...
> 100043                    D      wswbuf0  0xc1bf30d4  [pagedaemon]
> ...
>
> I note that the swapper and that geli are both in vmwait, and a bunch of
> other processes are in pfault, and the "crypto" drivers are in disk wait??

This is a low-memory deadlock: the pagedaemon is attempting to reclaim
memory by freeing pages from the inactive queue, and here is waiting for
the swap pager to finish writing out a page. However, the GELI thread is
blocked waiting for the pagedaemon to free up some pages.

Some recent work that's gone into HEAD ought to address this scenario.
In particular, with r308474 swapping is performed by a separate thread,
so even if that thread blocks waiting for the GELI thread, the
pagedaemon is able to continue freeing clean pages or at least kill
memory-hogging processes.

Could you try your scenario in a VM running a HEAD kernel?
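
For anyone following along, the circular wait described above can be sketched as a toy wait-for graph. This is only an illustrative model in Python, not kernel code: the thread names "pagedaemon" and "g_eli" come from the DDB output, while "swap_thread", the `find_cycle` helper, and the edge from the swap write to the GELI thread (assumed because swap sits on a geli provider) are hypothetical names and simplifications of mine.

```python
# Toy wait-for graph: each key is a thread, each value is the thread it is
# blocked on. Illustrative only; this does not reflect kernel data structures.

def find_cycle(waits_for):
    """Return the threads forming a wait-for cycle, or None if there is none."""
    for start in waits_for:
        path = [start]
        seen = {start}
        node = waits_for.get(start)
        while node is not None:
            if node == start:
                return path          # walked back to where we began: deadlock
            if node in seen:
                break                # a loop that does not include `start`
            path.append(node)
            seen.add(node)
            node = waits_for.get(node)
    return None

# The reported hang: the pagedaemon waits on the swap write (wswbuf0), which
# is assumed here to need the GELI thread, which waits for free pages (vmwait).
before = {
    "pagedaemon": "g_eli",
    "g_eli": "pagedaemon",
}

# With swap-out moved to a separate thread ("swap_thread" is a made-up name),
# the pagedaemon itself no longer blocks on the swap write.
after = {
    "swap_thread": "g_eli",
    "g_eli": "pagedaemon",
}

print(find_cycle(before))
print(find_cycle(after))
```

In the second graph the pagedaemon has no outgoing edge, which is the point of the change: it can keep freeing clean pages, so the GELI thread eventually gets its memory and the chain unwinds.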