From nobody Thu Feb 29 00:04:41 2024
List-Id: Production branch of FreeBSD source code
List-Archive: https://lists.freebsd.org/archives/freebsd-stable
Sender: owner-freebsd-stable@freebsd.org
MIME-Version: 1.0
References: <26078.50375.679881.64018@hergotha.csail.mit.edu>
In-Reply-To: <26078.50375.679881.64018@hergotha.csail.mit.edu>
From: Rick Macklem
Date: Wed, 28 Feb 2024 16:04:41 -0800
Subject: Re: 13-stable NFS server hang
To: Garrett Wollman
Cc: stable@freebsd.org, rmacklem@freebsd.org
Content-Type: multipart/mixed; boundary="000000000000613a6806127a02f4"

--000000000000613a6806127a02f4
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Tue,
Feb 27, 2024 at 9:30 PM Garrett Wollman wrote:
>
> Hi, all,
>
> We've had some complaints of NFS hanging at unpredictable intervals.
> Our NFS servers are running a 13-stable from last December, and
> tonight I sat in front of the monitor watching `nfsstat -dW`. I was
> able to clearly see that there were periods when NFS activity would
> drop *instantly* from 30,000 ops/s to flat zero, which would last
> for about 25 seconds before resuming exactly as it was before.
>
> I wrote a little awk script to watch for this happening and run
> `procstat -k` on the nfsd process, and I saw that all but two of the
> service threads were idle. The three nfsd threads that had non-idle
> kstacks were:
>
>   PID    TID COMM  TDNAME         KSTACK
>   997 108481 nfsd  nfsd: master   mi_switch sleepq_timedwait _sleep nfsv4_lock nfsrvd_dorpc nfssvc_program svc_run_internal svc_run nfsrvd_nfsd nfssvc_nfsd sys_nfssvc amd64_syscall fast_syscall_common
>   997 960918 nfsd  nfsd: service  mi_switch sleepq_timedwait _sleep nfsv4_lock nfsrv_setclient nfsrvd_exchangeid nfsrvd_dorpc nfssvc_program svc_run_internal svc_thread_start fork_exit fork_trampoline
>   997 962232 nfsd  nfsd: service  mi_switch _cv_wait txg_wait_synced_impl txg_wait_synced dmu_offset_next zfs_holey zfs_freebsd_ioctl vn_generic_copy_file_range vop_stdcopy_file_range VOP_COPY_FILE_RANGE vn_copy_file_range nfsrvd_copy_file_range nfsrvd_dorpc nfssvc_program svc_run_internal svc_thread_start fork_exit fork_trampoline
>
> I'm suspicious of two things: first, the copy_file_range RPC; second,
> the "master" nfsd thread is actually servicing an RPC which requires
> obtaining a lock. The "master" getting stuck while performing client
> RPCs is, I believe, the reason NFS service grinds to a halt when a
> client tries to write into a near-full filesystem, so this problem
> would be more evidence that the dispatching function should not be
> mixed with actual operations.
> I don't know what the clients are
> doing, but is it possible that nfsrvd_copy_file_range is holding a
> lock that is needed by one or both of the other two threads?
>
> Near-term I could change nfsrvd_copy_file_range to just
> unconditionally return NFSERR_NOTSUP and force the clients to fall
> back, but I figured I would ask if anyone else has seen this.

I have attached a little patch that should limit the server's Copy size
to vfs.nfsd.maxcopyrange (default of 10 Mbytes).
Hopefully this makes sure that a Copy does not take too long.

You could try this instead of disabling Copy. It would be nice to know
if this is sufficient. (If not, I'll probably add a sysctl to disable Copy.)

rick

>
> -GAWollman
>
>

--000000000000613a6806127a02f4
Content-Type: application/octet-stream; name="copylen.patch"
Content-Disposition: attachment; filename="copylen.patch"
Content-Transfer-Encoding: base64
Content-ID:
X-Attachment-Id: f_lt6gr9sk0

LS0tIHN5cy9mcy9uZnNzZXJ2ZXIvbmZzX25mc2RzZXJ2LmMuY29weWxlbgkyMDI0LTAyLTI4IDE1
OjM1OjQ3LjcwMDUzMTAwMCAtMDgwMAorKysgc3lzL2ZzL25mc3NlcnZlci9uZnNfbmZzZHNlcnYu
YwkyMDI0LTAyLTI4IDE1OjQxOjMzLjcyMzAyMjAwMCAtMDgwMApAQCAtOTksNiArOTksOSBAQCBT
WVNDVExfQk9PTChfdmZzX25mc2QsIE9JRF9BVVRPLCBlbmFibGVfdjQyYWxsb2NhdGUsIEMKIFNZ
U0NUTF9CT09MKF92ZnNfbmZzZCwgT0lEX0FVVE8sIGVuYWJsZV92NDJhbGxvY2F0ZSwgQ1RMRkxB
R19SVywKICAgICAmbmZzcnZfZG9hbGxvY2F0ZSwgMCwKICAgICAiRW5hYmxlIE5GU3Y0LjIgQWxs
b2NhdGUgb3BlcmF0aW9uIik7CitzdGF0aWMgdWludDY0X3QgbmZzcnZfbWF4Y29weXJhbmdlID0g
MTAgKiAxMDI0ICogMTAyNDsKK1NZU0NUTF9VNjQoX3Zmc19uZnNkLCBPSURfQVVUTywgbWF4Y29w
eXJhbmdlLCBDVExGTEFHX1JXLAorICAgICZuZnNydl9tYXhjb3B5cmFuZ2UsIDAsICJNYXggc2l6
ZSBvZiBhIENvcHkgc28gUlBDIHRpbWVzIHJlYXNvbmFibGUiKTsKIAogLyoKICAqIFRoaXMgbGlz
dCBkZWZpbmVzIHRoZSBHU1MgbWVjaGFuaXNtcyBzdXBwb3J0ZWQuCkBAIC01Nzc4LDcgKzU3ODEs
MTUgQEAgbmZzcnZkX2NvcHlfZmlsZV9yYW5nZShzdHJ1Y3QgbmZzcnZfZGVzY3JpcHQgKm5kLCBf
X3VuCiAJCQluZC0+bmRfcmVwc3RhdCA9IGVycm9yOwogCX0KIAotCXhmZXIgPSBsZW47CisJLyoK
KwkgKiBEbyB0aGUgYWN0dWFsIGNvcHkgdG8gYW4gdXBwZXIgbGltaXQgb2YgdmZzLm5mc2QubWF4 Y29weXJhbmdlLgorCSAqIFRoaXMgbGltaXQgaXMgYXBwbGllZCB0byBlbnN1cmUgdGhhdCB0aGUg UlBDIHJlcGxpZXMgaW4gYQorCSAqIHJlYXNvbmFibGUgdGltZS4KKwkgKi8KKwlpZiAobGVuID4g bmZzcnZfbWF4Y29weXJhbmdlKQorCQl4ZmVyID0gbmZzcnZfbWF4Y29weXJhbmdlOworCWVsc2UK KwkJeGZlciA9IGxlbjsKIAlpZiAobmQtPm5kX3JlcHN0YXQgPT0gMCkgewogCQluZC0+bmRfcmVw c3RhdCA9IHZuX2NvcHlfZmlsZV9yYW5nZSh2cCwgJmlub2ZmLCB0b3ZwLCAmb3V0b2ZmLAogCQkg ICAgJnhmZXIsIENPUFlfRklMRV9SQU5HRV9USU1FTzFTRUMsIG5kLT5uZF9jcmVkLCBuZC0+bmRf Y3JlZCwK --000000000000613a6806127a02f4--
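
For anyone reading this without decoding the attachment: the core of
copylen.patch appears to be a clamp of the transfer length to
vfs.nfsd.maxcopyrange before vn_copy_file_range() is called. Below is a
minimal userland sketch of that idea, not the kernel code itself. The
names clamp_copy_len(), count_copy_rpcs() and MAXCOPYRANGE_DEFAULT are
made up for illustration, and the loop assumes the client simply issues
another Copy for whatever the server did not copy.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the default of vfs.nfsd.maxcopyrange in the patch (10 Mbytes). */
#define MAXCOPYRANGE_DEFAULT (10ULL * 1024 * 1024)

/*
 * Same clamp the patch applies to xfer: never hand more than max bytes
 * to a single copy call, so one RPC cannot run for too long.
 */
static uint64_t
clamp_copy_len(uint64_t len, uint64_t max)
{
	return (len > max ? max : len);
}

/*
 * Hypothetical client-side view (an assumption, not taken from the patch):
 * count how many Copy RPCs are needed to move total bytes when every
 * reply is capped at max bytes.
 */
static unsigned
count_copy_rpcs(uint64_t total, uint64_t max)
{
	unsigned rpcs = 0;

	assert(max > 0);	/* a zero cap would never make progress */
	while (total > 0) {
		total -= clamp_copy_len(total, max);
		rpcs++;
	}
	return (rpcs);
}
```

With the default cap, a 25 Mbyte Copy would be served as three shorter
RPCs (10 + 10 + 5 Mbytes) rather than one long one, which is the effect
the patch is hoping for. Whether that is enough to avoid the stall
described above still needs testing.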