From: Rick Macklem
Date: Sun, 3 Mar 2024 16:29:50 -0800
Subject: Re: 13-stable NFS server hang
To: Garrett Wollman
Cc: stable@freebsd.org
List-Id: Production branch of FreeBSD source code
List-Archive: https://lists.freebsd.org/archives/freebsd-stable

On Sun, Mar 3, 2024 at 4:28 PM Rick Macklem wrote:
>
> On Sun, Mar 3, 2024 at 3:27 PM Rick Macklem wrote:
> >
> > On Sun, Mar 3, 2024 at 1:17 PM Rick Macklem wrote:
> > >
> > > On Sat, Mar 2, 2024 at 8:28 PM Garrett Wollman wrote:
> > > >
> > > > I wrote previously:
> > > > > PID TID COMM TDNAME KSTACK
> > > > > 997 108481 nfsd nfsd: master mi_switch sleepq_timedwait _sleep nfsv4_lock nfsrvd_dorpc nfssvc_program svc_run_internal svc_run nfsrvd_nfsd nfssvc_nfsd sys_nfssvc amd64_syscall fast_syscall_common
> > > > > 997 960918 nfsd nfsd: service mi_switch sleepq_timedwait _sleep nfsv4_lock nfsrv_setclient nfsrvd_exchangeid nfsrvd_dorpc nfssvc_program svc_run_internal svc_thread_start fork_exit fork_trampoline
> > > > > 997 962232 nfsd nfsd: service mi_switch _cv_wait txg_wait_synced_impl txg_wait_synced dmu_offset_next zfs_holey zfs_freebsd_ioctl vn_generic_copy_file_range vop_stdcopy_file_range VOP_COPY_FILE_RANGE vn_copy_file_range nfsrvd_copy_file_range nfsrvd_dorpc nfssvc_program svc_run_internal svc_thread_start fork_exit fork_trampoline
> > > >
> > > > I spent some time this evening looking at this last stack trace, and
> > > > stumbled across the following comment in
> > > > sys/contrib/openzfs/module/zfs/dmu.c:
> > > >
> > > > | /*
> > > > |  * Enable/disable forcing txg sync when dirty checking for holes with lseek().
> > > > |  * By default this is enabled to ensure accurate hole reporting, it can result
> > > > |  * in a significant performance penalty for lseek(SEEK_HOLE) heavy workloads.
> > > > |  * Disabling this option will result in holes never being reported in dirty
> > > > |  * files which is always safe.
> > > > |  */
> > > > | int zfs_dmu_offset_next_sync = 1;
> > > >
> > > > I believe this explains why vn_copy_file_range sometimes takes much
> > > > longer than a second: our servers often have lots of data waiting to
> > > > be written to disk, and if the file being copied was recently modified
> > > > (and so is dirty), this might take several seconds. I've set
> > > > vfs.zfs.dmu_offset_next_sync=0 on the server that was hurting the most
> > > > and am watching to see if we have more freezes.
> > > >
> > > > If this does the trick, then I can delay deploying a new kernel until
> > > > April, after my upcoming vacation.
> > > Interesting. Please let us know how it goes.
> > Btw, I just tried this for my trivial test and it worked very well.
> > A 1Gbyte file was copied in two Copy RPCs of 1sec and slightly less than
> > 1sec.
> Oops, I spoke too soon.
> The Copy RPCs worked fine (as above) but the Commit RPCs took
> a long time, so it still looks like you may need the patches.
And I should mention that my test is done on a laptop without
a ZIL, so maybe a ZIL on a separate device might generate different
results.

rick

>
> rick
>
> >
> > So, your vacation may be looking better, rick
> >
> > >
> > > And enjoy your vacation, rick
> > >
> > > >
> > > > -GAWollman
> > > >
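
For anyone who wants to check whether hole detection on a recently written file is slow on their own filesystem, here is a minimal sketch, not anything taken from the NFS server or ZFS code. It assumes Python 3 on a platform that provides os.SEEK_HOLE, and the helper names write_dirty_file and time_seek_hole are made up for illustration. It writes a file without fsync and times a single lseek(SEEK_HOLE) on it, which is roughly the call that the quoted dmu.c comment says may force a txg sync when vfs.zfs.dmu_offset_next_sync is 1.

```python
import os
import tempfile
import time


def write_dirty_file(directory, size=1 << 20):
    """Create a file of `size` bytes in `directory` without calling fsync,
    so its data may still be dirty when we probe it (hypothetical helper)."""
    fd, path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        f.write(b"x" * size)
    return path


def time_seek_hole(path, offset=0):
    """Return (hole_offset, seconds) for one lseek(SEEK_HOLE) call on `path`
    (hypothetical helper)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.monotonic()
        hole = os.lseek(fd, offset, os.SEEK_HOLE)
        elapsed = time.monotonic() - start
    finally:
        os.close(fd)
    return hole, elapsed


if __name__ == "__main__":
    # Point this at a directory on the dataset of interest; the temp dir
    # is only a placeholder.
    target_dir = tempfile.gettempdir()
    path = write_dirty_file(target_dir)
    try:
        hole, elapsed = time_seek_hole(path)
        print(f"first hole at offset {hole}, lseek took {elapsed:.6f}s")
    finally:
        os.remove(path)
```

Running it once with the sysctl at its default and once with it set to 0, on a busy pool, might show whether the txg wait is what dominates. The numbers will depend heavily on how much dirty data is pending, so treat a single run as a hint only.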