From: Warner Losh
Date: Thu, 31 Mar 2022 21:24:28 -0600
Subject: Re: CAM timeouts on boot (13.0-p8, Dell R730xd, mrsas controller)
To: George Michaelson
Cc: FreeBSD Stable
List-Id: Production branch of FreeBSD source code
List-Archive: https://lists.freebsd.org/archives/freebsd-stable

On Thu, Mar 31, 2022 at 9:14 PM George Michaelson wrote:

> I upgraded 12.2 to 13.0-p8 and hit a delay with SCSI drive
> initialisation, it loops for a timeout over the "Waiting for CAM"
> message, then proceeds.

OK.

> This interferes with the ZFS initialisation and the non-root zpool are
> not imported.

It does not. The "Waiting for CAM" message appears before mountroot, and
zpool imports happen after that, so it cannot interfere. It's just a
message that says CAM hasn't finished probing its buses and so hasn't
released mountroot yet. Something else must be going on.

> I intruded a /usr/local/etc/rc.d script PROCEED: var REQUIRE: zfs
> which put it right after zfs initialisation, before any daemons, and I
> can manually zpool import the tank once the SCSI stuff has settled
> down.
>
> Adding delay to loader.conf CAM initialisation didn't fix this.
>
> There are references to this in 2018-2020 timeframe, but mostly it's
> people on desktops. I am not used to Dell rackmounts going this bad.

A dmesg would help. Generally, CAM probes all buses to completion before
it releases the hold on mountroot. Something else is afoot.

My spidey sense says the kernel doesn't have all the disk controllers in
it, and some are loaded by devmatch, which happens late enough to explain
the behavior you are seeing. Or maybe there's a USB device which shows up
late, since I've seen umass arrive too late to hold off mountroot. If
adding delay doesn't help, then that tells me that the disk controller
SIM isn't present before mountroot.
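One rough way to test that hypothesis against a saved dmesg is to check whether the controller driver's first line appears before or after the kernel's first "Trying to mount root from" line. The sketch below is only an illustration, not an existing tool: the function name is hypothetical, and it assumes driver output lines begin with the driver name plus a unit number (for example `mrsas0:`).

```python
import re


def attached_before_mountroot(dmesg_text, driver="mrsas"):
    """Return True if a `<driver><unit>:` line precedes the first mountroot line.

    Returns None when either marker is missing, because no ordering can
    be inferred in that case.
    """
    # Assumption: driver messages start with the driver name and a unit number.
    driver_re = re.compile(r"^" + re.escape(driver) + r"\d+:")
    first_driver = None
    first_mount = None
    for index, line in enumerate(dmesg_text.splitlines()):
        if first_driver is None and driver_re.match(line):
            first_driver = index
        if first_mount is None and line.startswith("Trying to mount root from"):
            first_mount = index
    if first_driver is None or first_mount is None:
        return None
    return first_driver < first_mount
```

Feeding it the contents of a saved dmesg file would look like `attached_before_mountroot(open("dmesg.txt").read())`, where the file name is a placeholder. A result of False would be consistent with the controller arriving too late to hold off mountroot; True would point elsewhere.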
Warner