From nobody Mon Nov 15 00:13:21 2021
From: bugzilla-noreply@freebsd.org
To: fs@FreeBSD.org
Subject: [Bug 254282] 13.0-RC2: NFS export from nullfs mount doesn't work as of 13.0
Date: Mon, 15 Nov 2021 00:13:21 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: misc
X-Bugzilla-Version: 13.0-RELEASE
X-Bugzilla-Keywords: regression
X-Bugzilla-Severity: Affects Only Me
X-Bugzilla-Who: rmacklem@FreeBSD.org
X-Bugzilla-Status: Open
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: rmacklem@FreeBSD.org
X-Bugzilla-Changed-Fields: bug_status assigned_to attachments.created
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
List-Id: Filesystems
List-Archive: https://lists.freebsd.org/archives/freebsd-fs

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254282

Rick Macklem changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|New                         |Open
           Assignee|fs@FreeBSD.org              |rmacklem@FreeBSD.org

--- Comment #8 from Rick Macklem ---
Created attachment 229500
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=229500&action=edit
make mountd start after mountlate

I think this patch to the /etc/rc.d scripts (applied when in the
root directory) will fix the problem.
--> The nullfs mount(s) need the "late" option.
--> Without this patch, mountd starts before mountlate happens,
    so the exports get applied to the underlying file system
    and not the nullfs mount.

This patch forces mountd to be started after mountlate.
Unfortunately, I am not sure if it is safe to start lockd before nfsd.
The only obvious reason for that ordering is to make sure that nfscommon.ko
is loaded before lockd starts, so I have added this to the lockd script.

Hopefully the reporter can test this patch.
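
For anyone testing, the start order described in this comment could be checked with a small sketch like the one below. It is not part of the attached patch. It assumes that the output of rcorder(8), one script path per line, is piped in, and `check_order` is a hypothetical helper name. The fstab line in the comment uses made-up paths purely to illustrate the "late" option.

```shell
# Hypothetical /etc/fstab entry for a nullfs mount using the "late" option
# (paths are illustrative only):
#   /data/export   /exports/data   nullfs   rw,late   0   0

# check_order: read a list of rc.d script paths on stdin, in start order,
# and succeed only if mountlate appears before mountd.
check_order() {
    awk '
        { n = split($0, p, "/"); name = p[n] }   # strip the directory part
        name == "mountlate" { late = NR }
        name == "mountd"    { mountd = NR }
        END { exit !(late && mountd && late < mountd) }
    '
}

# On a FreeBSD host (assumption: the stock /etc/rc.d layout):
#   rcorder /etc/rc.d/* 2>/dev/null | check_order \
#       && echo "mountd starts after mountlate"
```

The function exits non-zero if either script is missing from the list, so an empty or truncated listing is not reported as a pass.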
-- 
You are receiving this mail because:
You are the assignee for the bug.

From nobody Mon Nov 15 03:26:13 2021
From: Chris Ross
To: Mark Johnston
Cc: freebsd-fs
Subject: Re: swap_pager: cannot allocate bio
Date: Sun, 14 Nov 2021 22:26:13 -0500
Message-Id: <6DA63618-F0E9-48EC-AB57-3C3C102BC0C0@distal.com>
In-Reply-To: <19A3AAF6-149B-4A3C-8C27-4CFF22382014@distal.com>
References: <9FE99EEF-37C5-43D1-AC9D-17F3EDA19606@distal.com>
 <09989390-FED9-45A6-A866-4605D3766DFE@distal.com>
 <4E5511DF-B163-4928-9CC3-22755683999E@distal.com>
 <19A3AAF6-149B-4A3C-8C27-4CFF22382014@distal.com>
List-Id: Filesystems
List-Archive: https://lists.freebsd.org/archives/freebsd-fs

> On Nov 12, 2021, at 23:15, Chris Ross wrote:
>
> I’ve built a stable/13 as of today, and updated the system.  I’ll see
> if the problem recurs; it usually takes about 24 hours to show.  If
> it does, I’ll see if I can run a procstat -kka and get it off of the system.

Happy Sunday, all.  So, I logged in this evening, 48 hours after starting the
job that uses a lot of CPU and I/O to the ZFS pool.  The system seemed to be
working, and I thought stable/13 had just fixed it.  But after only a few
minutes of fooling around it started to show problems.  An ssh connection
hung, and new ones couldn’t be made.  Then they could, but the shell got
stuck in disk wait once, and others worked.  Very odd.  I logged into the
console and ran a procstat -kka.  Then I tried to ls -f a directory in the
large ZFS fs (/tank), which hung.  Ctrl-T on that shows:

load: 0.04  cmd: ls 87050 [aw.aew_cv] 41.13r 0.00u 0.00s 0% 2632k
mi_switch+0xc1 _cv_wait+0xf2 arc_wait_for_eviction+0x1df arc_get_data_impl+0x85 arc_hdr_alloc_abd+0x7b arc_read+0x6f7 dbuf_read+0xc5b dmu_buf_hold+0x46 zap_cursor_retrieve+0x163 zfs_freebsd_readdir+0x393 VOP_READDIR_APV+0x1f kern_getdirentries+0x1d9 sys_getdirentries+0x29 amd64_syscall+0x10c fast_syscall_common+0xf8

A procstat -kka output is available (208kb of text, 1441 lines) at
https://pastebin.com/SvDcvRvb

An ssh of a top command completed and shows:

last pid: 91551;  load averages:  0.00,  0.02,  0.30    up 2+00:19:33  22:23:15
40 processes:  1 running, 38 sleeping, 1 zombie
CPU:  3.9% user,  0.0% nice,  0.9% system,  0.0% interrupt, 95.2% idle
Mem: 58G Active, 210M Inact, 1989M Laundry, 52G Wired, 1427M Buf, 12G Free
ARC: 48G Total, 10G MFU, 38G MRU, 128K Anon, 106M Header, 23M Other
     46G Compressed, 46G Uncompressed, 1.00:1 Ratio
Swap: 425G Total, 3487M Used, 422G Free

  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
90996 root          1  22    0    21M  9368K select  22   0:00   0.10% sshd
89398 cross        23  52    0    97G    60G uwait    4  94.1H   0.00% python3.
55463 cross        18  20    0   301M    54M kqread  31   4:30   0.00% python3.
54338 cross         4  20    0    82M  9632K kqread  33   1:02   0.00% python3.
84083 ntpd          1  20    0    21M  1712K select  33   0:07   0.00% ntpd

I’d love to hear any thoughts.  Again, this is running 13-stable,
stable/13-n248044-4a36455c417.

Thanks all.

                 - Chris
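
Since the machine becomes hard to reach once the problem starts, one option is to save diagnostics periodically so that a recent copy already exists when logins begin to hang. The following is only a rough sketch: `snapshot` and `SNAP_DIR` are hypothetical names, and the default directory is an assumption that should point somewhere outside the affected pool.

```shell
# snapshot CMD [ARGS...]: run CMD and save its combined output to a
# timestamped file under $SNAP_DIR (hypothetical variable; defaults to
# /var/tmp/snapshots). Prints the path of the file that was written.
# Two calls within the same second write to the same file.
snapshot() {
    dir=${SNAP_DIR:-/var/tmp/snapshots}
    mkdir -p "$dir" || return 1
    out="$dir/$(date -u +%Y%m%dT%H%M%SZ).txt"
    "$@" > "$out" 2>&1
    echo "$out"
}

# On the affected host, e.g. run from cron every few minutes:
#   snapshot procstat -kka
```

Keeping the command as an argument means the same helper can capture other output as well, such as a batch-mode top run.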