Date: Sun, 23 Feb 2025 20:53:11 -0800
From: Cy Schubert
To: Lars Tunkrans
Cc: Rick Macklem, FreeBSD CURRENT, Gleb Smirnoff
Subject: Re: RFC: mount_nfs failure due to dns not running yet
Message-ID: <20250223205311.7569a16a@slippy>
Organization: KOMQUATS
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

Excuse the top posting.
I am replying to a top post.

Correct. This is not a bug. It's been NFS behaviour ever since I switched
careers from IBM Mainframe to UNIX ~30 years ago. Sun Solaris behaved this
way, as did Tru64, DG/UX and HP-UX. Typically a sysadmin needed to, and
still needs to today, put NFS server IPs in hosts(5).

One person suggested using late for NFS shares. That too is what Red Hat
does. NFS mounts are flagged for systemd (and prior to that upstart
[rc.d]) with _netdev. _netdev causes the NFS mount to take place after
the network is fully up, including DNS resolution. Solaris didn't do this
when I last worked on it and AFAIK it still doesn't.

I think our choices are to document that sysadmins must either use
hosts(5) or ensure NFS shares are mounted late. Or, mount NFS shares
after the network is fully up.

Retrying forever, until DNS finally provides a good answer, can
potentially hang boot. This would be especially troublesome for remote
unattended reboots, where remediation would require calling remote
eyes-and-hands support to "fix" the situation on the console.

BTW, with NFSv3 and v2, uninterruptible mounts, i.e. those without the
intr option, did behave this way. NFSv4 doesn't support intr.

I think the easiest solution would be some documentation. Next would be
mounting NFS shares later, at about the same time late mounts are
processed (actually, immediately prior), like Red Hat Linux does. A
_netdev fstab(5) option could serve the same purpose it does in Linux,
immediately prior to late option handling.

Altering the kernel to wait forever is undesirable. This would result in
boot hangs requiring console access to work around the problem. This
would be a PITA and a POLA violation for unattended remote sites.

-- 
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0

On Thu, 20 Feb 2025 01:25:20 +0100
Lars Tunkrans wrote:

> This situation has existed these past 40 years.
> You have to put your
> IP address : hostname pairs into /etc/hosts if you don't have access to a
> working DNS. This is not a bug. It's the way name resolution works.
>
> On Wed, 19 Feb 2025 at 23:40, Rick Macklem wrote:
>
> > Hi,
> >
> > The subject line basically describes the problem glebius@
> > ran into. When doing an NFS mount in /etc/fstab, it failed
> > since the DNS service was not yet working and, as such,
> > the DNS lookup of the server fqdn failed, causing the mount
> > to fail. Note that this behaviour has existed for decades.
> >
> > He feels this is a bug and that mount_nfs(8) should retry
> > getaddrinfo(3) calls until success, instead of failing the
> > mount when the first attempt fails.
> > The problem with just retrying getaddrinfo(3) is that it
> > could retry forever for simple failures like a typo in the
> > server fqdn.
> > I can see several ways this can be handled and would
> > like feedback from others w.r.t. these alternatives.
> >
> > 1) Simply document this case and encourage use of
> > host names in /etc/hosts for NFS servers along with
> > specifying use of files before dns in nsswitch.conf.
> > Doing this results in the mounts working whether or
> > not DNS is working.
> >
> > 2) Call it a bug and patch mount_nfs(8) to retry getaddrinfo(3)
> > until it succeeds. (I feel this would be a POLA violation,
> > given that the current behaviour has existed for decades
> > and for simple cases where the fqdn will never resolve
> > the behaviour would be to hang at the mount attempt
> > during boot unless "bg" is specified for the /etc/fstab entry.)
> >
> > 3) Add a new NFS mount option "retrydns=", which would enable
> > retries of getaddrinfo(3). This would avoid any POLA violation and
> > would allow for a convenient way to document the behaviour in
> > "man mount_nfs".
> >
> > 4) ???
> >
> > So, what do you think is the preferred change?
> >
> > rick
> > ps: I looked and the return value from getaddrinfo(3) does not
> > appear to be useful to discern the case of "DNS service not
> > running yet". (I think it returns EAI_FAIL for this case.)
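
Editorial sketch of alternative 3 from the quoted message: a bounded retry around the name lookup, so that a resolver that is not up yet gets a few more chances while a mistyped fqdn still fails in finite time. This is only an illustration of the logic in Python. mount_nfs(8) itself is written in C, and nothing here is taken from its source. The names resolve_with_retry, max_attempts and delay are hypothetical stand-ins for whatever a "retrydns=" option would mean, since the thread does not define its value. The resolver and sleep parameters exist only so the logic can be exercised without a network.

```python
import socket
import time


def resolve_with_retry(host, port, max_attempts=5, delay=2.0,
                       resolver=socket.getaddrinfo, sleep=time.sleep):
    """Resolve host, retrying failed lookups a bounded number of times.

    max_attempts and delay are hypothetical knobs standing in for a
    "retrydns=" style option; they are assumptions, not mount_nfs(8) flags.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return resolver(host, port, type=socket.SOCK_STREAM)
        except socket.gaierror as exc:
            # The error code may not distinguish "DNS not running yet"
            # from "this name will never resolve", so every failure is
            # retried and the attempt limit is what prevents a hang.
            last_error = exc
            if attempt < max_attempts:
                sleep(delay)
    # All attempts failed: surface the last resolver error to the caller.
    raise last_error
```

With a limit in place, a typo in the server name costs at most max_attempts lookups plus the delays between them, instead of blocking boot indefinitely. Leaving the option unset (a single attempt) would keep the behaviour that has existed for decades.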