Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 18 Jul 2025 12:31:36 -0600
From:      Warner Losh <imp@bsdimp.com>
To:        John Nielsen <lists@jnielsen.net>
Cc:        scsi@freebsd.org
Subject:   Re: isboot: help me understand what CAM is doing
Message-ID:  <CANCZdfoyo_GvGkUGkM-45a4h48uBXTmeGuWxK9ek%2BBS%2B_2jMcQ@mail.gmail.com>
In-Reply-To: <8D8F60A2-98E2-4ED7-AB59-E1B7B3DFD10A@jnielsen.net>
References:  <F16B5BD7-4D1F-41F2-8091-508923F09553@jnielsen.net> <CANCZdfpzhu0n1EuX4YsbOkxN_qK2A6gSRfQQnqWLub%2BTPBwEWA@mail.gmail.com> <8D8F60A2-98E2-4ED7-AB59-E1B7B3DFD10A@jnielsen.net>

next in thread | previous in thread | raw e-mail | index | archive | help
[[ this got stuck in my inbox, sorry ]]

On Thu, May 15, 2025 at 3:58=E2=80=AFPM John Nielsen <lists@jnielsen.net> w=
rote:
>
>
>
> > On May 15, 2025, at 2:24=E2=80=AFPM, Warner Losh <imp@bsdimp.com> wrote=
:
> >
> > On Thu, May 15, 2025 at 10:30=E2=80=AFAM John Nielsen <lists@jnielsen.n=
et> wrote:
> >>
> >> Hi all-
> >>
> >> I=E2=80=99m working on a cosmetic bug in isboot-kmod. There is a globa=
l string called isboot_boot_device which is printed for informational purpo=
ses and also available via the net.isboot.device sysctl. The string is popu=
lated in this function: https://github.com/jnielsendotnet/isboot/blob/maste=
r/src/iscsi.c#L1870.
> >>
> >> I don=E2=80=99t know when this changed but historically the string com=
parison on https://github.com/jnielsendotnet/isboot/blob/master/src/iscsi.c=
#L1904 would be called once with =E2=80=9Cpass=E2=80=9D and once with =E2=
=80=9Cda=E2=80=9D and the global isboot_boot_device would be correctly popu=
lated with e.g. =E2=80=9Cda0=E2=80=9D.
> >
> > Yes. We create two peripherals for each device found (well, we always
> > create pass when pass is in the kernel, and if another driver like da
> > or cd likes the device, we'll create a periph for that device.
> >
> > I've never used this iscsi before...
> >
> >> That sometimes happens in the current code as well, but on my test mac=
hine (running 14-STABLE) it is ONLY when I have enabled debug output (by se=
tting bootverbose or net.isboot.debug to 1 or higher). Otherwise, the strin=
g comparison is called only once with =E2=80=9Cprobe=E2=80=9D rather than =
=E2=80=9Cpass=E2=80=9D or =E2=80=9Cda=E2=80=9D. Everything still works in t=
his case; the disk is found (at da0 or whatever) and mounted, but the isboo=
t_boot_device is never populated (or populated with the wrong name like =E2=
=80=9Cprobe0=E2=80=9D if I mess with what the string comparison is looking =
for.

"probe" is the driver that is used to evaluate a new device to see
what drivers might attach to it. It's all the same 'device'. You
mentioned 'failing to terminate' which suggests that the probe device
is still around. so these are related.

> > So what we do is that the scsi XPT layer creates a probe device (whose
> > name is "probe") for each device that's either scanned or that the SIM
> > tells XPT exists. This probe device then sends a bunch of SCSI
> > commands to the device to determine what the device is. Once that's
> > done, it offers the device to each of the periph drivers, who either
> > pass on the device, or create a cam_periph for that device.
> >
> >> There is no functional change other than debug messages when debug out=
put is enabled, so I=E2=80=99m guessing this is a race condition. But since=
 I am still a rank amateur when it comes to kernel programming I don=E2=80=
=99t know where else to look. So my questions are twofold:
> >
> > Likely we're not proceeding to create the pass or the da device
> > because the initial commands fail somehow.
> >
> >> 1) What is going on here? When does the =E2=80=9Cprobe=E2=80=9D name s=
how up in ccb.cgdl.periph_name and why doesn=E2=80=99t the loop ever see =
=E2=80=9Cda=E2=80=9D or =E2=80=9Cpass=E2=80=9D when it does? Corollary: doe=
s isboot_cam_set_devices() operate in a safe/sane way for modern CAM?
> >
> > Not sure which loop this is, so I can't say. But 'pass' is there while
> > scsi_xpt is looking at the device, and then the async routines decide
> > whether to add da or pass devices and then the probe device is
> > removed. The last two happen in parallel, so there could be a race
> > there if you are examining the periph lists.
> >
> >> From looking at the code, it looks like you may be doing racey things
> > by rescanning the device and doing things when the rescan is done.

If you are seeing 'probe' but not 'pass' or 'da' that makes sense, if
things aren't finished. The probe driver should always be transient
that exists just long enough to attach other drivers to the device.

> >> 2) What would be a safer or more reliable way to determine the correct=
 device name so it can be written to the isboot_boot_device global variable=
?
> >
> > It should be something like da0. pass isn't going to be a block device
> > (so you can't boot off of it). cd0 you could boot off of, but nobody
> > exports their SCSI cd. And it's rare that the boot media is
> > multi-voliume, so it's unlikely to be da1, etc. and we don't support
> > any other kind of boot (tape, etc).
> >
> > So I'd love to help, but you're currently way too zoomed in on the
> > problem and assuming that we have more context to what you're trying
> > to do than I think we have. This makes it hard to know how to help.
>
> Thanks Warner. Apologies, let me provide some more context. This isboot c=
ode was written by Daisuke Aaoyama in the FreeBSD 8 days (or earlier). He s=
topped updating it several years ago. I have been the port maintainer for n=
et/isboot-kmod for a long time but am not well-versed in the code. In 2021 =
I created the GitHub repository where it lives today since I needed a place=
 to host patches submitted by others. Since then I have made some minor imp=
rovements (mostly to support my own use cases) in addition to bringing in o=
ther patches to keep the code working in newer FreeBSD versions (thanks jhb=
).
>
> Ideally this functionality should exist in the base system, but I don=E2=
=80=99t know enough to merge the isboot code with the kernel initiator code=
 that=E2=80=99s already there. In the mean time, I=E2=80=99m trying to fix =
issues that I and others have encountered and reported on the GitHub projec=
t.

Thanks for the context.

> The links in my previous email are to specific line numbers in the code o=
n the GitHub repository (and possibly off by a few lines due to commits I=
=E2=80=99ve made in the mean time=E2=80=A6). Since I did not author this co=
de and have not otherwise worked much with FreeBSD=E2=80=99s SCSI subsystem=
 I can=E2=80=99t provide much more context than what is in this code itself=
. So if you or someone else would be willing to take a look I=E2=80=99d be =
grateful. I=E2=80=99m also trying to learn as I go so I can be more useful =
in the future.
>
> The other open issue I=E2=80=99m trying to solve is where the CAM rescan =
fails to complete entirely. So I=E2=80=99m open to any type of feedback fro=
m =E2=80=9Ctry getting the device name at this point instead=E2=80=9D to =
=E2=80=9Cthis line looks race-y because=E2=80=A6=E2=80=9D to =E2=80=9Cspend=
 some time reading X or Y=E2=80=9D (but chapter 12 of the developer handboo=
k and your 2015 CAM scheduling BSDCan talk/slides are already on my list). =
Or even =E2=80=9Clook at this function in this driver as a good example of =
doing Z=E2=80=9D.

I think this is the root cause of your problems. Unfortunately, apart
from my talk on CAM and some early work by Justin Gibbs on CAM
generally, there's not a lot out there. Chapter 12 needs to be brought
up to date since it was done before locking.

The question is, is the probe failing to finish due to a change in
code, or a change in the device where something 'hangs' before it can
get through the probe. dtrace might help, but it's kinda a pain to use
before mountroot (it will record, but you can't write the data
anywhere, which I think is the problem). That leaves CAM debug to try
to sort it out.

> Zooming out even farther, getting this functionality into the base system=
 would be fantastic and eliminate the need for a port or this code living s=
eparately entirely. There=E2=80=99s a patch on Phabricator (not mine) purpo=
rting to do that at https://reviews.freebsd.org/D34477 . If that=E2=80=99s =
the way forward I=E2=80=99d love to see progress on it (and help if I can b=
e useful). If not (or in parallel?), I could make a diff to bring isboot in=
 to the tree as-is, but it would need careful review before I would expect =
it to be merged. And removing the bits that are redundant with the existing=
 initiator would still want to be done at some point.

Yea, I can't help on this one, I'm sorry.

Warner

> In any event, thank you for your attention.
>
> JN



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CANCZdfoyo_GvGkUGkM-45a4h48uBXTmeGuWxK9ek%2BBS%2B_2jMcQ>