From nobody Wed Mar 19 19:17:23 2025
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 283747] kernel panic after telegraf service restart
Date: Wed, 19 Mar 2025 19:17:23 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 14.1-RELEASE
X-Bugzilla-Keywords: crash
X-Bugzilla-Severity: Affects Only Me
X-Bugzilla-Who: matthew.l.dailey@dartmouth.edu
X-Bugzilla-Status: Open
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: bugs@FreeBSD.org
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="UTF-8"
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
List-Id: Bug reports
List-Archive: https://lists.freebsd.org/archives/freebsd-bugs
Sender: owner-freebsd-bugs@FreeBSD.org
MIME-Version: 1.0

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=283747

--- Comment #41 from Matthew L. Dailey ---
Thanks for the explanation, Gleb. It sounds like this is definitely worth
fixing, regardless of whether it's the cause of this specific bug. :-)

FWIW, neither of my test servers (uptime ~8 days with telegraf running
constantly) showed any zombies:

# ps aux | grep Z | grep -v grep
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND

I was just now able to panic my hardware system (took just over 8 days) and
will send a core the same way I did before. This is obviously from the
unpatched kernel. The last frame before the panic is in crunusebatch(), so
this looks promising.

Unread portion of the kernel message buffer:
[691684] panic: crunusebatch: ref -4294967294 not >= 0 on cred 0xfffff8011b43cb00
[691684] cpuid = 17
[691684] time = 1742410811
[691684] KDB: stack backtrace:
[691684] db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00d3bfa870
[691684] vpanic() at vpanic+0x131/frame 0xfffffe00d3bfa9a0
[691684] panic() at panic+0x43/frame 0xfffffe00d3bfaa00
[691684] crunusebatch() at crunusebatch+0xfa/frame 0xfffffe00d3bfaa30
[691684] thread_reap_domain() at thread_reap_domain+0x28d/frame 0xfffffe00d3bfaae0
[691684] proc_reap() at proc_reap+0x660/frame 0xfffffe00d3bfab20
[691684] proc_to_reap() at proc_to_reap+0x3c4/frame 0xfffffe00d3bfab70
[691684] kern_wait6() at kern_wait6+0x1a6/frame 0xfffffe00d3bfac10
[691684] sys_wait4() at sys_wait4+0x6b/frame 0xfffffe00d3bfae00
[691684] amd64_syscall() at amd64_syscall+0x158/frame 0xfffffe00d3bfaf30
[691684] fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00d3bfaf30
[691684] --- syscall (7, FreeBSD ELF64, wait4), rip = 0x2893da, rsp = 0x821093d18, rbp = 0x821093d80 ---
[691684] KDB: enter: panic

Restarted both test systems (hardware and VM) with patched DEBUG kernels and
will report back in 10 days or so.
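
A side note on the zombie check quoted in this message: `grep Z` matches a "Z" anywhere on the line (user names, command names), so filtering on the STAT column may be less noisy on a busy host. The following is only a minimal sketch, assuming `ps aux`-style text whose header line carries the same column names as the one shown in the report (USER ... STAT ... COMMAND) and whose columns other than COMMAND contain no spaces. The function name `find_zombies` and the sample rows are hypothetical and are not taken from the bug report.

```python
def find_zombies(ps_output: str) -> list[dict]:
    """Return rows of `ps aux`-style text whose STAT field starts with 'Z'.

    Assumption: the first non-empty line is the header and contains a
    STAT column; only the last column (COMMAND) may contain spaces.
    """
    lines = [ln for ln in ps_output.splitlines() if ln.strip()]
    if not lines:
        return []

    header = lines[0].split()
    stat_idx = header.index("STAT")
    ncols = len(header)

    zombies = []
    for line in lines[1:]:
        # Split at most ncols - 1 times so COMMAND keeps its embedded spaces.
        fields = line.split(None, ncols - 1)
        if len(fields) < ncols:
            continue  # skip malformed or truncated rows
        if fields[stat_idx].startswith("Z"):
            zombies.append(dict(zip(header, fields)))
    return zombies


# Hypothetical sample input, for illustration only.
SAMPLE = """USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
root 1 0.0 0.0 11824 1200 - ILs 11Mar25 0:00.12 /sbin/init
telegraf 4242 0.0 0.0 0 0 - Z 19:17 0:00.00 <defunct>
zed 5000 0.0 0.1 20000 3000 - S 19:18 0:00.01 /usr/local/bin/Zapp --flag
"""

if __name__ == "__main__":
    for row in find_zombies(SAMPLE):
        print(row["PID"], row["STAT"], row["COMMAND"])
```

On the sample above, the row owned by user "zed" is skipped even though its line contains a capital "Z", which is the case a plain `grep Z` would report as a false positive.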