From: Guido Falsi
To: Warner Losh
Cc: Peter Blok, Ed Maste, FreeBSD-STABLE Mailing List, FreeBSD FS,
    Glen Barber, Jonathan Chen
Subject: Re: vnode_pager_generic_getpages_done: I/O read error 5 caused by
    r318394 (was Re: FreeBSD 11.1-BETA1 Now Available)
Date: Thu, 22 Jun 2017 19:06:08 +0200
Message-ID: <0c4c58d6-f34c-a358-3bda-122913110c6d@FreeBSD.org>
References: <20170610123435.GB69235@FreeBSD.org>
    <10A08FC1-C84E-4D06-9360-B7C3848F4680@bsd4all.org>
    <05e6bd02-3582-aea1-5fc3-19caa4073f94@FreeBSD.org>
List-Id: Production branch of FreeBSD source code

On 06/22/17 18:38, Warner Losh wrote:
> On Thu, Jun 22, 2017 at 2:26 AM, Guido Falsi wrote:
>
> > On 06/21/17 16:59, Guido Falsi wrote:
> > > On 06/13/17 13:44, Peter Blok wrote:
> > > > Hi,
> > > >
> > > > For a while now, I’m not able to build a RPI1-B image from
> > > > -stable. I have narrowed it down to fix 318394, which adds a
> > > > refresh option to geom_label. If I undo this fix in today’s
> > > > stable it works ok. If I don’t I’m getting continuously:
> > > >
> > > > vm_fault: pager read error, pid 1 (init)
> > > > vnode_pager_generic_getpages_done: I/O read error 5
> > > >
> > > > I have looked at the fix and I can’t figure out why it breaks
> > > > the code.
> > > >
> > > > And yes I have tried various other SD cards - they all have the
> > > > same issue.
> > > >
> > >
> > > Hi,
> > >
> > > I'm seeing similar symptoms with NanoBSD images on PCEngines ALIX and
> > > APU2 boards, using compactflash and SD card storage respectively. The
> > > problem has appeared as soon as I started testing 11.1-BETA1 from the
> > > stable branch.
> > >
> > > Problem appears when I update the image, using a slightly modified
> > > version of the standard nanobsd update and updatep[12] scripts. My
> > > changes are not in the dd/gpart commands though, which are the same.
> > > gpart seems the most likely candidate though.
> > >
> > > I have just discovered this thread and I will test reverting r318394
> > > soon. Thanks to Peter for narrowing it down!
> > >
> > > Maybe this is related to having the disks mounted read-only?
> > >
> >
> > I noticed that after the problem appears many commands, including
> > shutdown, start failing telling "device not configured" for all mounted
> > FSes. I'm even unable to "ls /dev".
> >
> > Looks like the geom refresh changes devices from below the system in a
> > way which triggers this reaction.
> >
> > I don't know the geom code and have been unable to find an immediate
> > problem in the commit mentioned above.
> > I'd really like some help to know where to look, or what kind of
> > debugging information is needed.
> >
> > This is quite a bad bug for people running NanoBSD and should be fixed
> > before the release.
>
> So can I recreate this with the embedded-type NanoBSD image? If this
> change breaks NanoBSD, it will need to be reverted...
>

You should be able to reproduce it by starting from a NanoBSD image and
then updating it with the standard script, which writes the new image to
the "other" partition and uses gpart to mark that partition as bootable.

I'm using a slightly modified update script which also mounts the new
partition on /mnt and performs some operations there. It then unmounts
the partition and runs the
"gpart set -a active -i ${_to} ${NANO_DRIVE}" command (which I suspect
is exactly where the actual problem happens).

I also tested reverting the change (r318394) and can confirm that it
makes the problem go away.

I'm sure it can be triggered by other gpart operations; I'm trying to
understand exactly which ones. I'll follow up as soon as I have an
easier use case that reproduces it. I first need to revert to an image
affected by the problem.

Thanks for your feedback!

-- 
Guido Falsi
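
For anyone trying to reproduce this, the update sequence described above can be sketched as a dry run. This is only an illustration under assumptions: the function name nano_update_plan, the disk name, the image path, the dd options and the "s${_to}a" partition naming are hypothetical and not taken from the actual scripts; only the final gpart command is quoted from this thread. The function prints the commands rather than running them, so it is safe to try on any machine.

```shell
#!/bin/sh
# Dry-run sketch: prints an update sequence instead of executing it.
# All device names and paths below are placeholders (assumptions).

# Print the commands an updatep1/updatep2-style update would run.
#   $1 = disk name (hypothetical, e.g. ada0)
#   $2 = index of the slice to update (1 or 2)
#   $3 = path to the new image (hypothetical)
nano_update_plan() {
    _drive=$1
    _to=$2
    _image=$3

    case "$_to" in
        1|2) ;;
        *) echo "target slice must be 1 or 2" >&2; return 1 ;;
    esac

    # 1. Write the new image into the currently inactive slice.
    echo "dd if=${_image} of=/dev/${_drive}s${_to} obs=64k"
    # 2. Extra step of the modified script: mount, adjust, unmount.
    echo "mount /dev/${_drive}s${_to}a /mnt"
    echo "umount /mnt"
    # 3. Mark the new slice active; this is the suspected trigger.
    echo "gpart set -a active -i ${_to} ${_drive}"
}
```

Calling `nano_update_plan ada0 2 /tmp/new.img` prints the four steps in order. Running the printed commands by hand, one at a time, on an affected image may help show whether the gpart step alone is enough to trigger the read errors.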