From: Rick Macklem
Date: Fri, 15 Aug 2025 17:41:26 -0700
Subject: Re: zfs related panic
To: Konstantin Belousov
Cc: Bakul Shah, FreeBSD Current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

On Fri, Aug 15, 2025 at 5:35 PM Konstantin Belousov wrote:
>
> On Fri, Aug 15, 2025 at 05:26:21PM -0700, Rick Macklem wrote:
> > On Fri, Aug 15, 2025 at 5:07 PM Konstantin Belousov wrote:
> > >
> > > On Fri, Aug 15, 2025 at 04:51:00PM -0700, Bakul Shah wrote:
> > > > On Aug 15, 2025, at 3:51 PM, Konstantin Belousov wrote:
> > > > >
> > > > > On Fri, Aug 15, 2025 at 11:19:55AM -0700, Bakul Shah wrote:
> > > > >> Is this a known bug or may be something specific on my machine?
> > > > >> If the latter, any way to "fsck" it? FYI, the zpool is a mirror
> > > > >> (two files on the host via nvme). built from c992ac621327 commit hash
> > > > >> (which has other issues but they seem to be separate from this).
> > > > >> I saw the same panic when I booted from a day old snapshot.
> > > > >>
> > > > >> Note that "ls /.zfs" panics but "ls /.zfs/snapshot" doesn't!
> > > > >>
> > > > >> This is on a -current VM:
> > > > >>
> > > > >> root@:/ # ls .zfs
> > > > >> VNASSERT failed: oresid == 0 || nresid != oresid || *(a)->a_eofflag == 1 not true at vnode_if.c:1824 (VOP_READDIR_APV)
> > > > >
> > > > > Try this, untested.
> > > >
> > > > Thanks for the quick patch! But I am afraid it didn't help. Let me know if you
> > > > want me to check things via gdb.
> > > > [I have filed
> > > > https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=288889
> > > > so we can continue debugging there]
> > > >
> > > > On the console (single user, RO root):
> > > > # ls /.zfs
> > > > VNASSERT failed: oresid == 0 || nresid != oresid || *(a)->a_eofflag == 1 not true at vnode_if.c:1824 (VOP_READDIR_APV)
> > > > 0xfffff800059546e0: type VDIR state VSTATE_CONSTRUCTED op 0xffffffff8272cfd0
> > > >     usecount 1, writecount 0, refcount 1 seqc users 0 mountedhere 0
> > > >     hold count flags ()
> > > >     flags ()
> > > >     lock type zfs: SHARED (count 1)
> > > >     name = .zfs
> > > >     parent_id = 0
> > > >     id = 1
> > > > panic: VOP_READDIR: eofflag not set
> > > > cpuid = 0
> > > > time = 1755276357
> > > > KDB: stack backtrace:
> > > > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0053f83af0
> > > > vpanic() at vpanic+0x136/frame 0xfffffe0053f83c20
> > > > panic() at panic+0x43/frame 0xfffffe0053f83c80
> > > > VOP_READDIR_APV() at VOP_READDIR_APV+0x205/frame 0xfffffe0053f83cd0
> > > > kern_getdirentries() at kern_getdirentries+0x228/frame 0xfffffe0053f83dd0
> > > > sys_getdirentries() at sys_getdirentries+0x29/frame 0xfffffe0053f83e00
> > > > amd64_syscall() at amd64_syscall+0x169/frame 0xfffffe0053f83f30
> > > > fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0053f83f30
> > > > --- syscall (554, FreeBSD ELF64, getdirentries), rip = 0x331339f976aa, rsp = 0x33133631ade8, rbp = 0x33133631ae20 ---
> > > > KDB: enter: panic
> > > > [ thread pid 23 tid 100211 ]
> > > > Stopped at kdb_enter+0x33: movq $0,0x12313e2(%rip)
> > > > db>
> > > >
> > > > Running gdb on the host (attached to tcp port):
> > > > #16 0xffffffff80b7992b in vpanic (
> > > >     fmt=0xffffffff812ddf30 "VOP_READDIR: eofflag not set",
> > > >     ap=ap@entry=0xfffffe0053f83c60)
> > > >     at /home/FreeBSD/current/sys/kern/kern_shutdown.c:962
> > > > #17 0xffffffff80b79793 in panic (
> > > >     fmt=0xffffffff81d9eab0 "\304\372\032\201\377\377\377\377")
> > > >     at /home/FreeBSD/current/sys/kern/kern_shutdown.c:887
> > > > #18 0xffffffff81195fd5 in VOP_READDIR_APV (vop=,
> > > >     a=a@entry=0xfffffe0053f83d30) at vnode_if.c:1824
> > > > #19 0xffffffff80c95e58 in VOP_READDIR (vp=0xfffff800059546e0,
> > > >     uio=0xfffffe0053f83d00, cred=, eofflag=0xfffffe0053f83d6c,
> > > >     ncookies=0x0, cookies=0x0) at ./vnode_if.h:972
> > > From this frame, do
> > > p *vp
> > > and
> > > p *(vp->v_op)
> > > I am mostly interested what is the .vop_readdir fp points to.
> > I think the problem is that, for this case, ZFS replies with eofflag
> > == -1 instead
> > of 1. (I don't know if you want to change the ASSERT or try to fix ZFS
> > to not do this?)
>
> Where do you see it? I mean the '-1' set to *eofp.

I saw it in a printf() after VOP_READDIR(). However, a subsequent test showed 0.
--> The first time I was printing out for non-ZFS, so there was a fair amount
    of other printf()s being logged. Then I limited it to ZFS.

Anyhow, ZFS seems to get eof wrong when it is already at eof.
I'll take a look at the ZFS code, rick