Date:      Tue, 1 Dec 2020 16:22:10 +0100 (CET)
From:      Ronald Klop <ronald-lists@klop.ws>
To:        FreeBSD Current <freebsd-current@freebsd.org>
Subject:   Re: rc.d/zpool runs before ada(4) attaches
Message-ID:  <286917313.21.1606836130991@localhost>
In-Reply-To: <08815f92-742c-2934-e746-fd04ca9b4e16@omnilan.de>
References:  <b55604d6-5c23-a590-859c-a52f36386d44@omnilan.de> <1439301337.11.1606815206810@localhost> <08815f92-742c-2934-e746-fd04ca9b4e16@omnilan.de>

 
From: Harry Schmalzbauer <freebsd@omnilan.de>
Date: Tuesday, 1 December 2020 12:51
To: Ronald Klop <ronald-lists@klop.ws>, FreeBSD Current <freebsd-current@freebsd.org>
Subject: Re: rc.d/zpool runs before ada(4) attaches
> 
> On 01.12.2020 at 10:33, Ronald Klop wrote:
> :
> :
> :
> >> One machine fails importing zpool because the corresponding vdevs (ada0-ada2)
> >> are not available at the time rc.d/zpool runs.
> >>
> >>
> >> Offhand, I'm not aware of any rc(8) awareness of driver state.
> >> Is there any?
> >>
> >> Any suggestions for a fix other than 'sleep 1'?
> >>
> >> Thanks,
> >>
> >> -harry
> >>
> >> P.S.: ahci(4) is compiled into kernel, machine is a HPE U48 (Gen 10 plus MicroServer), zfsloader loads root_MFS kernel module
> >>
> >
> >
> > There have been some changes to etc/rc.d/zpool in September.
> > Do you have the latest version? Compare with:
> > https://github.com/freebsd/freebsd/blob/master/libexec/rc/rc.d/zpool
> > or
> > https://svnweb.freebsd.org/base/head/libexec/rc/rc.d/zpool?revision=365354&view=markup
> >
> > Otherwise it would be helpful for readers if you could post some logs which indicate what is happening.
> > /var/run/dmesg.boot or the output of "dmesg"
> > Part of /var/log/messages
> > Part of /var/log/console.log if it exists.
> >
> 
> Thanks, I'm on -current from a few days ago.
> The problem is that cam(4) is still probing devices when rc.d/zpool runs: mount_root_from succeeded because root is a RAM disk, so it succeeds independently of any real drive/controller probing.
> I can imagine other rare edge cases hitting this issue too.
> 
> So my proposed patch, working for me, looks like this:
> Index: libexec/rc/rc.d/zpool
> ===================================================================
> --- libexec/rc/rc.d/zpool       (revision 368202)
> +++ libexec/rc/rc.d/zpool       (working copy)
> @@ -18,8 +18,16 @@
> 
>   zpool_start()
>   {
> -       local cachefile
> +        local cachefile n=0 camlist=`camcontrol devlist -v`
> 
> +       # Wait for cam(4) devices attaching, 4 times at max by increasing
> +       # 1s each (10s max in total)
> +        while [ X"${camlist#*target*lun*probe}" != X"${camlist}" ]; do
> +               [ $n -lt 4 ] || break
> +               sleep $((n+=1))
> +               camlist=`camcontrol devlist -v`
> +       done
> +
>          for cachefile in /etc/zfs/zpool.cache /boot/zfs/zpool.cache; do
>                  if [ -r $cachefile ]; then
>                          zpool import -c $cachefile -a -N && break
> 
> best,
> -harry
> 
> 
> 

You can define these in /boot/loader.conf:
#kern.cam.boot_delay="10000" # Delay (in ms) of root mount for CAM bus
#kern.cam.scsi_delay="2000" # Delay (in ms) before probing SCSI

Maybe that helps.

Ronald.
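
The wait loop in Harry's patch hinges on one shell idiom: ${camlist#*target*lun*probe} strips the shortest prefix matching the pattern, so the result differs from the original string only while some device line still contains "probe". A minimal sketch that isolates that test and the back-off loop as functions follows. The names cam_probing and wait_for_cam are hypothetical and appear neither in the patch nor in rc.subr, and the optional lister argument is an assumption added only so the loop can be exercised without real hardware.

```sh
#!/bin/sh
# Sketch only: cam_probing and wait_for_cam are hypothetical helper names.

# Return 0 (true) if the given "camcontrol devlist -v" text still shows a
# device in the probe state, using the same prefix-strip test as the patch.
cam_probing()
{
	[ "X${1#*target*lun*probe}" != "X${1}" ]
}

# Poll with a growing sleep (1s, 2s, 3s, 4s: at most 10s in total).
# $1 is the command that prints the device list; it defaults to camcontrol.
wait_for_cam()
{
	local lister n
	lister=${1:-"camcontrol devlist -v"}
	n=0
	while cam_probing "$($lister)"; do
		# Give up after four sleeps and report the timeout to the caller.
		[ "$n" -lt 4 ] || return 1
		n=$((n + 1))
		sleep "$n"
	done
	return 0
}
```

Called as wait_for_cam || echo "cam(4) still probing after 10s" ahead of the zpool import loop, this would behave like the patch, except that a timeout is reported through the return status instead of being silently ignored.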
 
Message-ID: <786faeee90e79aa0175b298ec859265ff57a3129.camel@freebsd.org>
Subject: Re: rc.d/zpool runs before ada(4) attaches
From: Ian Lepore <ian@freebsd.org>
To: Ronald Klop <ronald-lists@klop.ws>, FreeBSD Current
 <freebsd-current@freebsd.org>
Date: Tue, 01 Dec 2020 08:34:33 -0700
In-Reply-To: <286917313.21.1606836130991@localhost>
References: <b55604d6-5c23-a590-859c-a52f36386d44@omnilan.de>
 <1439301337.11.1606815206810@localhost>
 <08815f92-742c-2934-e746-fd04ca9b4e16@omnilan.de>
 <286917313.21.1606836130991@localhost>

On Tue, 2020-12-01 at 16:22 +0100, Ronald Klop wrote:
>  
> You can define these in /boot/loader.conf:
> #kern.cam.boot_delay="10000" # Delay (in ms) of root mount for CAM bus
> #kern.cam.scsi_delay="2000" # Delay (in ms) before probing SCSI
> 
> Maybe that helps.
> 
> Ronald.
> 

Those settings control waiting before mounting root.  Harry's problem
is that root is mounted quickly, before other drives are ready for zfs.
 
The zpool script waits for 'disks'.  It would be nice if the cam
subsystem had something like a sysctl it set to indicate when initial
probing for disks was done, then there could be an rc.d/camprobe script
with 'PROVIDE: disks' which waits for the probing to complete.

-- Ian
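
Ian's idea could look roughly like the sketch below. It is an assumption from top to bottom: the message above says cam(4) has no such indicator, so the sysctl name kern.cam.initial_probe_done, the script name camprobe, and the 10-second timeout are all invented for illustration. Only the rc.subr wiring (name, start_cmd, run_rc_command) and the 'PROVIDE: disks' line follow existing rc.d conventions.

```sh
#!/bin/sh
#
# PROVIDE: disks
# KEYWORD: nojail
#
# Hypothetical rc.d/camprobe sketch. The sysctl below does NOT exist in
# FreeBSD; it stands in for the "initial probing done" flag Ian wishes for.

camprobe_sysctl="kern.cam.initial_probe_done"	# assumed name, not a real OID
camprobe_timeout=10				# seconds; arbitrary choice

# Return 0 once the (hypothetical) sysctl reads 1, or 1 after the timeout.
camprobe_wait()
{
	local waited
	waited=0
	while [ "$(sysctl -n "$camprobe_sysctl" 2>/dev/null)" != "1" ]; do
		[ "$waited" -lt "$camprobe_timeout" ] || return 1
		sleep 1
		waited=$((waited + 1))
	done
	return 0
}

# Usual rc.d wiring; skipped where rc.subr is not installed.
if [ -r /etc/rc.subr ]; then
	. /etc/rc.subr
	name="camprobe"
	start_cmd="camprobe_wait"
	stop_cmd=":"
	run_rc_command "$1"
fi
```

Because rc.d/zpool already waits for 'disks', a script like this would delay the pool import without touching zpool_start() itself, which is the main attraction over patching the zpool script.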




