Date: Tue, 4 Jan 2022 22:58:13 +0100
From: Peter
To: Mark Johnston
Cc: freebsd-stable@freebsd.org, jtl@freebsd.org
Subject: Re: dtrace bitfields failure (was: 12.3-RC1 fails ...)
List-Id: Production branch of FreeBSD source code
List-Archive: https://lists.freebsd.org/archives/freebsd-stable

On Tue, Jan 04, 2022 at 01:01:55PM -0500, Mark Johnston wrote:
! On Tue, Jan 04, 2022 at 04:05:53PM +0100, Peter wrote:
! >
! > Hija,
! >
! > sadly, I was too early in agreeing that the two patches
! >   22082f15f9
! >   68396709e7
! > together do solve the issue. They only do on a certain assumption,
! > which does not hold true in all cases.
! >
! >
! > Let's look at https://reviews.freebsd.org/D27213
! >
! > This is the code in question that will trigger the action:
! >
! >     if (dst_type == CTF_ERR && name[0] != '\0' &&
! >         (hep = ctf_hash_lookup(&src_fp->ctf_names, src_fp, name,
! >         strlen(name))) != NULL &&
! >         src_type != (ctf_id_t)hep->h_type) {
! >
! > What happens here: in the case of a bitfield type we need to also
! > copy the corresponding intrinsic type. This condition here checks for
! > the case and also should deliver that respective intrinsic type
! > into the "hep" variable.
! >
! > But this depends on the assumption that the intrinsic type appears
! > first in the "src_fp" container, so that the hash will point to it.
! > And that is not necessarily true; it depends on what options you have
! > in your kernel config.
! >
! >
! > For instance, with my custom kernel, things look like this:
! >
! > $ ctfdump -t kernel.full
! >
! > - Types ----------------------------------------------------------------------
! >
! >   [1] STRUCT (anon) (8 bytes)
! >         sle_next type=262 off=0
! >
! >   [2] STRUCT (anon) (8 bytes)
! >         stqe_next type=262 off=0
! >
! >   [3] UNION (anon) (8 bytes)
! >         m_next type=262 off=0
! >         m_slist type=1 off=0
! >         m_stailq type=2 off=0
! >
! >   [4] UNION (anon) (8 bytes)
! >         m_nextpkt type=262 off=0
! >         m_slistpkt type=1 off=0
! >         m_stailqpkt type=2 off=0
! >
! >   <5> INTEGER char encoding=SIGNED CHAR offset=0 bits=8
! >   <6> POINTER (anon) refers to 5
! >   <7> TYPEDEF caddr_t refers to 6
! >   <8> INTEGER int encoding=SIGNED offset=0 bits=32
! >   <9> TYPEDEF __int32_t refers to 8
! >   <10> TYPEDEF int32_t refers to 9
! >   [11] INTEGER unsigned int encoding=0x0 offset=0 bits=8
! >   [12] INTEGER unsigned int encoding=0x0 offset=0 bits=24
! >   [13] STRUCT (anon) (8 bytes)
! >         cstqe_next type=229 off=0
! >
! >   <14> POINTER (anon) refers to 229
! >   [15] STRUCT (anon) (16 bytes)
! >         le_next type=229 off=0
! >         le_prev type=14 off=64
! >
! >   <16> INTEGER long encoding=SIGNED offset=0 bits=64
! >   <17> ARRAY (anon) content: 5 index: 16 nelems: 16
! >
! >   <18> INTEGER unsigned int encoding=0x0 offset=0 bits=32
! >   <19> TYPEDEF u_int refers to 18
! >   [etc.etc.]
! >
! >
! > As we can see, this one has the bitfield types as #11 and #12, and
! > the intrinsic type as #18. And consequentially things do fail.
! >
! >
! > I currently do not know what is the culprit. Has the linking stage of
! > the kernel a flaw? Or is the patch D27213 based on a wrong assumption?
! >
! > I hope You guys can answer that. For now I changed the patch D27213
! > to cover the case, so that things do work.
! > Further details on request.
!
! I'm not immediately sure where the problem is. Could you please post
! the kernel configuration and src revision that you're using, so that I
! can try and reproduce this?

Oh, I feared that would come...

Src revision is easy now: release/12.3.0 (70cb68e7a00)

Kernel config is difficult. I have compiled into the kernel
 * ipfw (obviously)
 * dtraceall
 * drm2 & friends (that needs objects to be added to conf/files)
 * khelp/h_ertt/etc. (that needs the files and fixing the SI_SUB
   sequence to make it boot)

So the kernel config by itself doesn't help to reproduce.

What I am currently looking for is only an educated statement on
whether that sequence of types (as quoted above) can possibly happen,
or should never happen at all.
If it should not happen, then it's my fault and I might go and look
why it happens.

! How exactly does the bug manifest?

Exactly as expected, with either of these two errors (depending on
the native order of files in /usr/lib/dtrace):

[1]
dtrace: failed to establish error handler: "/usr/lib/dtrace/ipfw.d", line 107: failed to copy type of 'inp': Conflicting type is already defined

[2]
dtrace: failed to establish error handler: "/usr/lib/dtrace/psinfo.d", line 41: failed to copy type of 'pr_gid': Conflicting type is already defined

Then I single-stepped libctf, and it clearly showed the mismatch
between type #11 and type #18 (with patch 68396709e7 in one case
acting where it shouldn't, and in the other case not acting where it
should).

So I am probably on track with understanding what happens;
nevertheless I would greatly appreciate some input from You on how it
*is supposed to* work.

cheerio,
PMc
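
For readers who want the ordering problem in isolation, here is a minimal toy model in C. It is not libctf code: struct toy_type, toy_types[] and both lookup functions are hypothetical names made up for illustration. The table holds only the named integer types from the ctfdump excerpt quoted above. The first function assumes, as the message describes, that a name lookup yields whichever type of that name comes first in the container. The second shows one possible order-independent rule (take the widest type of that name), which is an assumption and not necessarily what the locally changed D27213 patch does.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for one CTF integer type; NOT libctf's real layout. */
struct toy_type {
	int id;			/* type index as printed by ctfdump */
	const char *name;	/* name shared by bitfields and base type */
	unsigned bits;		/* width in bits */
};

/* Named integer types in the order quoted from the custom kernel. */
static const struct toy_type toy_types[] = {
	{ 5,  "char",         8  },
	{ 8,  "int",          32 },
	{ 11, "unsigned int", 8  },	/* bitfield */
	{ 12, "unsigned int", 24 },	/* bitfield */
	{ 16, "long",         64 },
	{ 18, "unsigned int", 32 },	/* intrinsic type */
};
#define	TOY_NTYPES	(sizeof(toy_types) / sizeof(toy_types[0]))

/*
 * Models a name lookup that returns the first type stored under a
 * name. Returns the type id, or -1 if the name is unknown.
 */
int
toy_lookup_first(const char *name)
{
	for (size_t i = 0; i < TOY_NTYPES; i++) {
		if (strcmp(toy_types[i].name, name) == 0)
			return (toy_types[i].id);
	}
	return (-1);
}

/*
 * Hypothetical order-independent rule: among all types with this
 * name, return the widest one. Returns the type id, or -1.
 */
int
toy_lookup_widest(const char *name)
{
	int best_id = -1;
	unsigned best_bits = 0;

	for (size_t i = 0; i < TOY_NTYPES; i++) {
		if (strcmp(toy_types[i].name, name) == 0 &&
		    toy_types[i].bits > best_bits) {
			best_bits = toy_types[i].bits;
			best_id = toy_types[i].id;
		}
	}
	return (best_id);
}
```

With this table, toy_lookup_first("unsigned int") returns 11, the 8-bit bitfield, while toy_lookup_widest("unsigned int") returns 18, the 32-bit intrinsic type. That is the same #11 versus #18 mismatch reported above.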