From: "Kristof Provost"
To: Peter
Cc: freebsd-stable@freebsd.org
Subject: Re: Panic: 12.2 fails to use VIMAGE jails
Date: Tue, 08 Dec 2020 20:02:47 +0100
Message-ID: <1AAE98C9-ADF9-4869-B863-601542CEBB67@FreeBSD.org>
References: <20201207125451.GA11406@gate.oper.dinoex.org> <39DBEA53-960F-4D70-86D7-847E6DFA437D@FreeBSD.org> <20201207233449.GA11025@gate.oper.dinoex.org>
List-Id: Production branch of FreeBSD source code

On 8 Dec 2020, at 19:49, Peter wrote:

> On Tue, Dec 08, 2020 at 04:50:00PM +0100, Kristof Provost wrote:
> ! Yeah, the bug is not exclusive to epair but that’s where it’s most easily
> ! seen.
>
> Ack.
>
> ! Try http://people.freebsd.org/~kp/0001-if-Fix-panic-when-destroying-vnet-and-epair-simultan.patch
>
> Great, thanks a lot.
>
> Now I have bad news: when playing yoyo with the next-best three
> application jails (with all their installed stuff) it took about
> ten up and down's then I got this one:
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; apic id = 02
> fault virtual address   = 0x10
> fault code              = supervisor read data, page not present
> instruction pointer     = 0x20:0xffffffff80aad73c
> stack pointer           = 0x28:0xfffffe003f80e810
> frame pointer           = 0x28:0xfffffe003f80e810
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 15486 (ifconfig)
> trap number             = 12
> panic: page fault
> cpuid = 1
> time = 1607450838
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe003f80e4d0
> vpanic() at vpanic+0x17b/frame 0xfffffe003f80e520
> panic() at panic+0x43/frame 0xfffffe003f80e580
> trap_fatal() at trap_fatal+0x391/frame 0xfffffe003f80e5e0
> trap_pfault() at trap_pfault+0x4f/frame 0xfffffe003f80e630
> trap() at trap+0x4cf/frame 0xfffffe003f80e740
> calltrap() at calltrap+0x8/frame 0xfffffe003f80e740
> --- trap 0xc, rip = 0xffffffff80aad73c, rsp = 0xfffffe003f80e810, rbp = 0xfffffe003f80e810 ---
> ng_eiface_mediastatus() at ng_eiface_mediastatus+0xc/frame 0xfffffe003f80e810
> ifmedia_ioctl() at ifmedia_ioctl+0x174/frame 0xfffffe003f80e850
> ifhwioctl() at ifhwioctl+0x639/frame 0xfffffe003f80e8d0
> ifioctl() at ifioctl+0x448/frame 0xfffffe003f80e990
> kern_ioctl() at kern_ioctl+0x275/frame 0xfffffe003f80e9f0
> sys_ioctl() at sys_ioctl+0x101/frame 0xfffffe003f80eac0
> amd64_syscall() at amd64_syscall+0x380/frame 0xfffffe003f80ebf0
> fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe003f80ebf0
> --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x800475b2a, rsp = 0x7fffffffe358, rbp = 0x7fffffffe450 ---
> Uptime: 9m51s
> Dumping 899 out of 3959 MB:
>
> I decided to give it a second try, and this is what I did:
>
> root@edge:/var/crash # jls
>    JID  IP Address      Hostname                      Path
>      1  1***********    gate.***********.org          /j/gate
>      3  1***********    raix.***********.org          /j/raix
>      4                  oper.***********.org          /j/oper
>      5                  admn.***********.org          /j/admn
>      6                  data.***********.org          /j/data
>      7                  conn.***********.org          /j/conn
>      8                  kerb.***********.org          /j/kerb
>      9                  tele.***********.org          /j/tele
>     10                  rail.***********.org          /j/rail
> root@edge:/var/crash # service jail stop rail
> Stopping jails: rail.
> root@edge:/var/crash # service jail stop tele
> Stopping jails: tele.
> root@edge:/var/crash # service jail stop kerb
> Stopping jails: kerb.
> root@edge:/var/crash # jls
>    JID  IP Address      Hostname                      Path
>      1  1***********    gate.***********.org          /j/gate
>      3  1***********    raix.***********.org          /j/raix
>      4                  oper.***********.org          /j/oper
>      5                  admn.***********.org          /j/admn
>      6                  data.***********.org          /j/data
>      7                  conn.***********.org          /j/conn
> root@edge:/var/crash # jls -d
>    JID  IP Address      Hostname                      Path
>      1  1***********    gate.***********.org          /j/gate
>      3  1***********    raix.***********.org          /j/raix
>      4                  oper.***********.org          /j/oper
>      5                  admn.***********.org          /j/admn
>      6                  data.***********.org          /j/data
>      7                  conn.***********.org          /j/conn
>      9                  tele.***********.org          /j/tele
>     10                  rail.***********.org          /j/rail
> root@edge:/var/crash # service jail start kerb
> Starting jails:Fssh_packet_write_wait: Connection to 1*********** port 22: Broken pipe
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; apic id = 02
> fault virtual address   = 0x0
> fault code              = supervisor read instruction, page not present
> instruction pointer     = 0x20:0x0
> stack pointer           = 0x28:0xfffffe00540ea658
> frame pointer           = 0x28:0xfffffe00540ea670
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 13420 (ifconfig)
> trap number             = 12
> panic: page fault
> cpuid = 1
> time = 1607451910
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00540ea310
> vpanic() at vpanic+0x17b/frame 0xfffffe00540ea360
> panic() at panic+0x43/frame 0xfffffe00540ea3c0
> trap_fatal() at trap_fatal+0x391/frame 0xfffffe00540ea420
> trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00540ea470
> trap() at trap+0x4cf/frame 0xfffffe00540ea580
> calltrap() at calltrap+0x8/frame 0xfffffe00540ea580
> --- trap 0xc, rip = 0, rsp = 0xfffffe00540ea658, rbp = 0xfffffe00540ea670 ---
> ??() at 0/frame 0xfffffe00540ea670
> sysctl_rtsock() at sysctl_rtsock+0x3d5/frame 0xfffffe00540ea8a0
> sysctl_root_handler_locked() at sysctl_root_handler_locked+0x90/frame 0xfffffe00540ea8e0
> sysctl_root() at sysctl_root+0x248/frame 0xfffffe00540ea960
> userland_sysctl() at userland_sysctl+0x178/frame 0xfffffe00540eaa10
> sys___sysctl() at sys___sysctl+0x5f/frame 0xfffffe00540eaac0
> amd64_syscall() at amd64_syscall+0x380/frame 0xfffffe00540eabf0
> fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00540eabf0
> --- syscall (202, FreeBSD ELF64, sys___sysctl), rip = 0x80047646a, rsp = 0x7fffffffe378, rbp = 0x7fffffffe3b0 ---
> Uptime: 16m48s
> Dumping 938 out of 3959 MB:
>
>
> Sorry for the bad news.
>

You appear to be triggering two or three different bugs there.

Can you reduce your netgraph use case to a small test case that can
trigger the problem? I’m not likely to be able to do anything unless I
can reproduce the problem(s).

Best regards,
Kristof