List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current
Sender: owner-freebsd-current@FreeBSD.org
References: <3FBDFCF4-4427-4653-9EE4-EBC44DCB72ED@FreeBSD.org>
In-Reply-To: <3FBDFCF4-4427-4653-9EE4-EBC44DCB72ED@FreeBSD.org>
From: Rick Macklem
Date: Thu, 12 Dec 2024 08:27:20 -0800
Subject: Re: Module variable initialization
To: Zhenlei Huang
Cc: FreeBSD CURRENT

On Thu, Dec 12, 2024 at 1:46 AM Zhenlei Huang wrote:
>
> On Dec 12, 2024, at 10:44 AM, Rick Macklem wrote:
>
> Hi,
>
> Bugzilla pr#282156
> reports a crash that appears to be caused by
> an NFS client variable (nfscbd_pool) not being initialized when an
> NFS mount is done.
>
> Now, the NFS client module (nfscl.ko) is weird in that it has
> two definitions for the module. There is a VFS_SET() one for
> the file system and a separate DECLARE_MODULE() for nfscl.
> (The latter exists so that the module can refuse to unload and
> define dependencies on other modules.)
>
> The variable (nfscbd_pool) is initialized in the modevent() function
> for nfscl in the MOD_LOAD section.
>
> Does anyone know if this can somehow result in the variable not
> being initialized when an NFS mount occurs?
>
>
> I'm not familiar with NFS. From a quick look at the source code I think
> `nfscbd_pool` is correctly initialized.
>
> I do not know the exact version for pr#282156, so I guessed and tried 14.1-p1:
> ```
> $ addr2line -fip -e /.zfs/snapshot/14.1-p1/usr/lib/debug/boot/kernel/kernel.debug 0xffffffff80e1c558
> svc_run at /usr/src/sys/rpc/svc.c:1414
> ```
>
> https://cgit.freebsd.org/src/tree/sys/rpc/svc.c?h=releng/14.1&id=0892dff104440867956a53e78c12d66090fec36b#n1414
>
> If `nfscbd_pool` is NULL, then I would expect the panic to happen earlier,
> say at line 1405 or even earlier at line 1389.

This is not a panic most people see. I haven't been able to reproduce it
and I think it has been reported by one other person.

However, if you look at 282156, you'll see that the address is 0xd0 (208).
This is almost guaranteed to be an access to a field through a NULL pointer
to a structure. I looked at the structures accessed via the callback stuff
and the only one with a field at offset 208 is SVCPOOL.

The code does use nfscbd_pool in the NFS common code, where it sets
xprt->xp_pool = nfscbd_pool in svc_vc_create_backchannel(), which is called
when a TCP connection for a mount is created.
--> So, one theory is that the code somehow does this before nfscbd_pool
    is set non-NULL.
--> Another would be that some runaway pointer set it back to NULL.
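
To make the offset argument concrete, here is a minimal user-space sketch. It assumes a made-up structure (`struct fake_pool`, not the real SVCPOOL layout) padded so that one field sits at byte offset 208, and a hypothetical helper `field_access_address()` that computes the address a field access would touch for a given base pointer value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a kernel structure; this is NOT the real
 * SVCPOOL layout. The padding is sized so that sp_field lands at byte
 * offset 208 (0xd0), the address discussed above.
 */
struct fake_pool {
	char	sp_pad[208];
	void	*sp_field;
};

/* Compile-time check that the illustrative layout is what we claim. */
static_assert(offsetof(struct fake_pool, sp_field) == 208,
    "sp_field is expected at offset 208 in this sketch");

/*
 * Address that would be touched when code evaluates base->sp_field.
 * With base == 0 (a NULL structure pointer) the result is just the
 * field offset, which is why a small fault address such as 0xd0
 * hints at a NULL pointer to a structure.
 */
uintptr_t
field_access_address(uintptr_t base)
{
	return (base + offsetof(struct fake_pool, sp_field));
}
```

Calling `field_access_address(0)` yields the field offset itself, so matching a small fault address against the field offsets of candidate structures is one way to narrow down which pointer was NULL.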
Anyhow, I will probably get the reporter to add some sanity checks to their
code to try and see what the results of that are.

>
> Maybe `svc_run_internal()` is to be blamed ?

In a sense, yes, in that the crash occurs in there. However, this code
is exercised heavily by the NFS server and doesn't show any issues
that I am aware of.

>
> And, if the above is possible, would doing the initialization in the
> vfs_init function for VFS_SET() be guaranteed to happen before
> a mount is done?
>
>
> The order of modules seems right to me. nfscl module has order SI_ORDER_FIRST
> and VFS_SET(... nfs ... ) has SI_ORDER_MIDDLE.

One additional issue is that nfscbd_pool is actually declared in the
nfscommon module, and the use of it that might "save" the NULL value in
xp_pool before it gets set non-NULL is also in nfscommon.
Maybe there is some memory cache issue where the stale (set to NULL) value
for nfscbd_pool persists in the nfscommon module for long enough to cause this?
(It is a stretch, but I am pretty well running out of thoughts on this.)
--> I could move the initialization into the nfscommon module easily enough.
    I might create such a patch for the reporter to try.

Thanks for your comments, rick

>
> Thanks for any help with this, rick
>
>
> Best regards,
> Zhenlei
>
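
The kind of sanity check mentioned in the reply could look roughly like the user-space sketch below. Everything in it is an assumption made for illustration: `cb_pool` stands in for nfscbd_pool, `module_load()` for the MOD_LOAD initialization, and `backchannel_create()` for the point where the pool pointer is copied into xp_pool. It is not the actual kernel code or a proposed patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct pool {
	int	sp_dummy;
};

struct xprt {
	struct pool	*xp_pool;
};

static struct pool pool_storage;

/* Stands in for nfscbd_pool; NULL until the module load step runs. */
struct pool *cb_pool = NULL;

/* Stands in for the MOD_LOAD case of the modevent() function. */
void
module_load(void)
{
	cb_pool = &pool_storage;
}

/*
 * Stands in for the step that saves the pool pointer in the transport.
 * The sanity check refuses to copy a NULL pool, so the "used before
 * initialized" theory would show up as an error here instead of as a
 * later NULL dereference. Returns 0 on success, -1 if the pool is unset.
 */
int
backchannel_create(struct xprt *xprt)
{
	assert(xprt != NULL);
	if (cb_pool == NULL)
		return (-1);
	xprt->xp_pool = cb_pool;
	return (0);
}
```

If the check ever fired in a real system, it would support the first theory (use before initialization); if it never fired and the crash still occurred, the second theory (the pointer being overwritten later) would look more likely.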