Date: Mon, 16 Dec 2024 08:01:59 +0100
From: Philipp <satanist+freebsd@bureaucracy.de>
To: Mark Millard <marklmi@yahoo.com>
Cc: FreeBSD Current <freebsd-current@freebsd.org>, FreeBSD-STABLE Mailing List <freebsd-stable@freebsd.org>, freebsd-hackers <freebsd-hackers@freebsd.org>, freebsd-amd64@freebsd.org
Subject: Re: What kind of code might generate amd64 addressses like 0xFFFFF80000000007 or be based on 0xFFFFF80000000000 ?
Message-ID: <63a9fdaa4ac204c319a1c1e273a29c18.philipp.takacs@asta.kit.edu>
In-Reply-To: <65B0673C-287A-47E5-A732-17CC5EEE3350@yahoo.com>
References: <65B0673C-287A-47E5-A732-17CC5EEE3350.ref@yahoo.com> <65B0673C-287A-47E5-A732-17CC5EEE3350@yahoo.com>

Hi Mark

[2024-12-15 16:03] Mark Millard <marklmi@yahoo.com>
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=267028 is for a crash problem
> someone has been having over more than 2 years. There are boot time crashes
> involved.
>
> It appears that 0xFFFFF80000000007 is showing up in use and stored in data
> structures as a pointer value in fields/arguments that are pointers, where such
> a special value would not be expected. Later dereferencing does not go well, at
> least when the dereferenced data is then in-turn put to use.
>
> The small offset from 0xFFFFF80000000000 suggests to me that the special value likely
> is inappropriately left around and somehow picked up and used. 0xFFFFF80000000000 (or
> near it) might be odd enough to have only a few known likely possible usages. Such
> notes in the bugzilla report would be good if such is the case. Thus my question.

With a simple grep through sys/ I found the following comment in
sys/amd64/include/vmparam.h:

> /*
>  * Virtual addresses of things.  Derived from the page directory and
>  * page table indexes from pmap.h for precision.
> [...]
>  * 0xfffff80000000000 - 0xfffffbffffffffff   4TB direct map

The direct map is 4TB of virtual address space that maps the physical
address space 1:1 (offset by the base). So I would guess this is caused
by a NULL pointer converted by PHYS_TO_DMAP.

Philipp

> The context has amdgpu raven support in use normally. Reportedly the problem has
> never been seen with that disabled. (However, I'm not aware of experiments with
> alternate card types, for example.)
>
> Where, when, and if a boot crash occurs is variable, not stable. But use of the
> list found_modules->tqh_first->. . . tends to be involved.
>
>
>
> Some other modern 13.4-RELEASE related context notes
> ( comments #231 and #233 ):
>
> The person with the problem reports . . .
>
> I am not using a stock distribution of the kernel:
>
> diff -u sys/amd64/conf/{GENERIC,M5P}
> --- sys/amd64/conf/GENERIC 2024-07-03 16:23:56.252550000 -0400
> +++ sys/amd64/conf/M5P 2024-07-03 16:25:05.287604000 -0400
> @@ -18,12 +18,13 @@
>  #
>  
>  cpu HAMMER
> -ident GENERIC
> +ident M5P
>  
>  makeoptions DEBUG=-g # Build kernel with gdb(1) debug symbols
>  makeoptions WITH_CTF=1 # Run ctfconvert(1) for DTrace support
>  
> -options SCHED_ULE # ULE scheduler
> +#options SCHED_ULE # ULE scheduler
> +options SCHED_4BSD # 4BSD scheduler
>  options NUMA # Non-Uniform Memory Architecture support
>  options PREEMPTION # Enable kernel thread preemption
>  options VIMAGE # Subsystem virtualization, e.g. VNET
>
>
> I also noted (for modern 13.4-RELEASE times):
>
> Also: the build is based on the -p2 source code (hash 3f40d5821):
>
> # strings boot/kernel/kernel | grep "\-RELEASE"
> @(#)FreeBSD 13.4-RELEASE-p2 3f40d5821 M5P
> FreeBSD 13.4-RELEASE-p2 3f40d5821 M5P
> 13.4-RELEASE-p2
>
> Because it is a rebuild, the kernel ends up with -p2 instead
> of the official -p1 ( from -p2 not updating boot/kernel/kernel
> in the official distributions ).
>
>
>
> ===
> Mark Millard
> marklmi at yahoo.com
>
>
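
The direct-map guess in the reply can be checked with plain arithmetic. Below is a minimal userland sketch, not the kernel's code: the bounds are copied from the vmparam.h comment quoted in the reply, and the names SKETCH_DMAP_MIN, SKETCH_DMAP_MAX, sketch_phys_to_dmap(), sketch_in_dmap() and sketch_dmap_to_phys() are made up for illustration. It assumes the conversion amounts to adding the direct map base to the physical address, which is a simplification of whatever the real PHYS_TO_DMAP macro does.

```c
#include <assert.h>
#include <stdint.h>

/* Bounds taken from the vmparam.h comment quoted in the reply. */
#define SKETCH_DMAP_MIN UINT64_C(0xfffff80000000000)
#define SKETCH_DMAP_MAX UINT64_C(0xfffffbffffffffff)

/*
 * Hypothetical stand-in for PHYS_TO_DMAP: assumed here to be
 * "physical address plus direct map base".
 */
static uint64_t
sketch_phys_to_dmap(uint64_t pa)
{
	return (SKETCH_DMAP_MIN + pa);
}

/* Nonzero if va lies inside the 4TB direct map window. */
static int
sketch_in_dmap(uint64_t va)
{
	return (va >= SKETCH_DMAP_MIN && va <= SKETCH_DMAP_MAX);
}

/* Inverse of the sketch above: recover the physical address. */
static uint64_t
sketch_dmap_to_phys(uint64_t va)
{
	assert(sketch_in_dmap(va));
	return (va - SKETCH_DMAP_MIN);
}
```

Under this model, sketch_phys_to_dmap(0) gives 0xfffff80000000000 and sketch_phys_to_dmap(0x7) gives 0xfffff80000000007, the value from the bug report. Going the other way, sketch_dmap_to_phys(0xfffff80000000007) gives back 0x7, which is consistent with a zero or near-zero physical address having gone through the conversion.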