From owner-freebsd-net@freebsd.org Thu May 6 16:02:55 2021
Date: Thu, 6 May 2021 12:02:56 -0400
From: Mark Johnston
To: Michael Schmiedgen
Cc: freebsd-net@freebsd.org
Subject: Re: page fault while in kernel mode - after upgrade from 12.2 to 13.0
References: <51a3abc5-76b9-df09-acbe-895b62ec87b3@gmx.net> <90ed0277-9fcc-28c0-a546-c6a80babfa34@gmx.net>
In-Reply-To: <90ed0277-9fcc-28c0-a546-c6a80babfa34@gmx.net>
List-Id: Networking and TCP/IP with FreeBSD

On Thu, May 06, 2021 at 06:00:05PM +0200, Michael Schmiedgen wrote:
> On 05.05.2021 20:38, Mark Johnston wrote:
> > On Wed, May 05, 2021 at 06:35:32PM +0200, Michael Schmiedgen wrote:
> >> On 04.05.2021 21:02, Mark Johnston wrote:
> >>> This looks like fairly random kernel memory corruption. Are you able to
> >>> build an INVARIANTS kernel and test that? Assuming you're using 13.0,
> >>> you'd grab the 13.0 sources, add "options INVARIANT_SUPPORT" and
> >>> "options INVARIANTS" to the GENERIC kernel configuration in
> >>> sys/amd64/conf, and do a "make buildkernel installkernel".
> >>
> >> Below some info with an INVARIANTS kernel. Please let me know if I can provide
> >> further information. Thank you!
> >
> > Thanks, this helped a lot. I believe https://reviews.freebsd.org/D30129
> > will fix the problem. That patch is against the main branch but applies
> > cleanly to 13.0.
>
> I applied the patch and the server is running fine now for 8 hours with the
> INVARIANTS kernel, including the Samba jail and SIP VM. I just compiled my
> custom kernel and it looks like it is working too. Are there plans to get
> this MFCed or even as Errata?

Great, thanks.
Yes I think we will do an EN for this.

> BTW, we got 2 other systems, also with userland NAT but different workload.
> After an uncertain amount of time, mostly weeks, the natd starts to spin 100%
> CPU on these systems. Quick noobish workaround was restarting natd every night.
> I saw your recent commits that applied some more safety in that area, do you
> plan to MFC these as well? I can imagine that could help with my NAT problems.

I am skeptical that anything I did recently would fix this. Did you try
attaching a debugger to natd to see where it's getting stuck? Is it
also a regression from upgrading to 13.0?

> Anyway, many thanks for your investigation and your fix, much appreciated!
>
> Michael
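
For reference, the INVARIANTS build described in the quoted text above
could be sketched roughly as follows. This is only an illustration, not
the exact procedure used in this thread: instead of editing GENERIC in
place, it writes a separate config that includes GENERIC, and the config
name GENERIC-INVARIANTS and the helper function name are made up for the
example. The source tree location /usr/src is likewise an assumption.

```sh
#!/bin/sh
# Sketch: write a kernel config that layers the two debugging options
# from the thread on top of GENERIC. The file name GENERIC-INVARIANTS
# and this helper are hypothetical, chosen only for illustration.
write_invariants_conf() {
    # $1: kernel config directory, e.g. /usr/src/sys/amd64/conf
    cat > "$1/GENERIC-INVARIANTS" <<'EOF'
include GENERIC
ident GENERIC-INVARIANTS
options INVARIANT_SUPPORT
options INVARIANTS
EOF
}

# Deliberately not run here. After writing the file into the real
# source tree (assumed to be /usr/src), the build would be:
#   cd /usr/src
#   make buildkernel installkernel KERNCONF=GENERIC-INVARIANTS
```

Keeping the options in a separate file leaves GENERIC untouched, so
switching back to the stock kernel only means rebuilding without
KERNCONF set.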