From: Peter Blok
To: Kristof Provost
Cc: FreeBSD Stable
Subject: Re: Commit 367705+367706 causes a panic
Date: Fri, 20 Nov 2020 12:02:16 +0100
Message-Id: <665757BF-DA06-4503-9ACD-8A4630E23FF4@bsd4all.org>
In-Reply-To: <1753B4A3-2FFC-47A5-9D0C-DC0B71BA22E8@FreeBSD.org>

Hi Kristof,

This is 12-stable. With the previous bridge epochification, which was backed out, my config had a panic too.

I don’t have any local modifications. I did a clean rebuild after removing /usr/obj/usr.

My kernel is custom - I only have zfs.ko, opensolaris.ko, vmm.ko and nmdm.ko as modules. Everything else is statically linked. I have removed all drivers not needed for the hardware at hand.

My bridge is between two vlans from the same trunk and the jail epair devices as well as the bhyve tap devices.

The panic happens when the jails are starting.
I can try to narrow it down over the weekend and make the crash dump available for analysis.

Previously I had the following crash with 363492:

kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address = 0xffffffff00000410
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80692326
stack pointer = 0x28:0xfffffe00c06097b0
frame pointer = 0x28:0xfffffe00c06097f0
code segment = base 0x0, limit 0xfffff, type 0x1b
             = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = resume, IOPL = 0
current process = 2030 (ifconfig)
trap number = 12
panic: page fault
cpuid = 2
time = 1595683412
KDB: stack backtrace:
#0 0xffffffff80698165 at kdb_backtrace+0x65
#1 0xffffffff8064d67b at vpanic+0x17b
#2 0xffffffff8064d4f3 at panic+0x43
#3 0xffffffff809cc311 at trap_fatal+0x391
#4 0xffffffff809cc36f at trap_pfault+0x4f
#5 0xffffffff809cb9b6 at trap+0x286
#6 0xffffffff809a5b28 at calltrap+0x8
#7 0xffffffff803677fd at ck_epoch_synchronize_wait+0x8d
#8 0xffffffff8069213a at epoch_wait_preempt+0xaa
#9 0xffffffff807615b7 at ipsec_ioctl+0x3a7
#10 0xffffffff8075274f at ifioctl+0x47f
#11 0xffffffff806b5ea7 at kern_ioctl+0x2b7
#12 0xffffffff806b5b4a at sys_ioctl+0xfa
#13 0xffffffff809ccec7 at amd64_syscall+0x387
#14 0xffffffff809a6450 at fast_syscall_common+0x101

> On 20 Nov 2020, at 11:30, Kristof Provost wrote:
>
> On 20 Nov 2020, at 11:18, peter.blok@bsd4all.org wrote:
>> I’m afraid the last Epoch fix for bridge is not solving the problem ( or perhaps creates a new ).
>>
> We’re talking about the stable/12 branch, right?
>
>> This seems to happen when the jail epair is added to the bridge.
>>
> There must be something more to it than that. I’ve run the bridge tests on stable/12 without issue, and this is a problem we didn’t see when the bridge epochification initially went into stable/12.
>
> Do you have a custom kernel config? Other patches? What exact commands do you run to trigger the panic?
>
>> kernel trap 12 with interrupts disabled
>>
>>
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 6; apic id = 06
>> fault virtual address = 0xc10
>> fault code = supervisor read data, page not present
>> instruction pointer = 0x20:0xffffffff80695e76
>> stack pointer = 0x28:0xfffffe00bf14e6e0
>> frame pointer = 0x28:0xfffffe00bf14e720
>> code segment = base 0x0, limit 0xfffff, type 0x1b
>>              = DPL 0, pres 1, long 1, def32 0, gran 1
>> processor eflags = resume, IOPL = 0
>> current process = 1686 (jail)
>> trap number = 12
>> panic: page fault
>> cpuid = 6
>> time = 1605811310
>> KDB: stack backtrace:
>> #0 0xffffffff8069bb85 at kdb_backtrace+0x65
>> #1 0xffffffff80650a4b at vpanic+0x17b
>> #2 0xffffffff806508c3 at panic+0x43
>> #3 0xffffffff809d0351 at trap_fatal+0x391
>> #4 0xffffffff809d03af at trap_pfault+0x4f
>> #5 0xffffffff809cf9f6 at trap+0x286
>> #6 0xffffffff809a98c8 at calltrap+0x8
>> #7 0xffffffff80368a8d at ck_epoch_synchronize_wait+0x8d
>> #8 0xffffffff80695c8a at epoch_wait_preempt+0xaa
>> #9 0xffffffff80757d40 at vnet_if_init+0x120
>> #10 0xffffffff8078c994 at vnet_alloc+0x114
>> #11 0xffffffff8061e3f7 at kern_jail_set+0x1bb7
>> #12 0xffffffff80620190 at sys_jail_set+0x40
>> #13 0xffffffff809d0f07 at amd64_syscall+0x387
>> #14 0xffffffff809aa1ee at fast_syscall_common+0xf8
>
> This panic is rather odd. This isn’t even the bridge code. This is during initial creation of the vnet. I don’t really see how this could even trigger panics.
> That panic looks as if something corrupted the net_epoch_preempt, by overwriting the epoch->e_epoch. The bridge patches only access this variable through the well-established functions and macros. I see no obvious way that they could corrupt it.
>
> Best regards,
> Kristof
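
The two KDB backtraces in this thread can be compared mechanically to see where they diverge. What follows is only a minimal Python sketch, not a tool used by anyone in the thread. The helper names parse_frames and shared_leading_frames are hypothetical, and the input is assumed to be an excerpt (frames #6 to #9) copied from the two panics above.

```python
import re

# Matches one KDB frame line, e.g.
# "#7 0xffffffff80368a8d at ck_epoch_synchronize_wait+0x8d".
FRAME_RE = re.compile(
    r"#(\d+)\s+(0x[0-9a-f]+)\s+at\s+([A-Za-z_][A-Za-z0-9_]*)\+(0x[0-9a-f]+)"
)


def parse_frames(text):
    """Return the function names of a KDB backtrace in the order listed."""
    return [m.group(3) for m in FRAME_RE.finditer(text)]


def shared_leading_frames(a, b):
    """Return the frames two traces share, starting at the first frame listed."""
    shared = []
    for fa, fb in zip(a, b):
        if fa != fb:
            break
        shared.append(fa)
    return shared


# Excerpt of the earlier panic (reported with 363492).
OLD_PANIC = """
#6 0xffffffff809a5b28 at calltrap+0x8
#7 0xffffffff803677fd at ck_epoch_synchronize_wait+0x8d
#8 0xffffffff8069213a at epoch_wait_preempt+0xaa
#9 0xffffffff807615b7 at ipsec_ioctl+0x3a7
"""

# Excerpt of the new panic (seen while the jails start).
NEW_PANIC = """
#6 0xffffffff809a98c8 at calltrap+0x8
#7 0xffffffff80368a8d at ck_epoch_synchronize_wait+0x8d
#8 0xffffffff80695c8a at epoch_wait_preempt+0xaa
#9 0xffffffff80757d40 at vnet_if_init+0x120
"""

old = parse_frames(OLD_PANIC)
new = parse_frames(NEW_PANIC)
common = shared_leading_frames(old, new)

print("shared frames:", common)
print("first differing caller:", old[len(common)], "vs", new[len(common)])
```

On these excerpts the shared frames are calltrap, ck_epoch_synchronize_wait and epoch_wait_preempt, and the first differing caller is ipsec_ioctl in the earlier panic versus vnet_if_init in the new one. Both faults occur in the epoch wait path, and neither caller is bridge code.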