Date:      Fri, 30 Dec 2022 14:06:38 -0800
From:      Rick Macklem <rick.macklem@gmail.com>
To:        Rob Wing <rob.fx907@gmail.com>
Cc:        John F Carr <jfc@mit.edu>, Hikmat Jafarli <jafarlihi@gmail.com>,  "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject:   Re: Trying to implement BFS, page fault at vfs_domount_first, how to debug?
Message-ID:  <CAM5tNy5p-0vtPyU2aTFCdJ50-sbK4uqezzXp%2Bsg9WqOwGbEumQ@mail.gmail.com>
In-Reply-To: <CAF3%2Bn_eVEUz_Qmf-eU4T-UaLmKLsRf_x8JMv7pHFcmUJC2SZLg@mail.gmail.com>
References:  <CAPWrP-Y3usfDukwhQroJY0NUbZK_C=cuctm%2BXYSjBqDQYejBWw@mail.gmail.com> <23A1E4DF-320A-4BCB-ADB8-83FEFC3D7649@mit.edu> <CAF3%2Bn_eVEUz_Qmf-eU4T-UaLmKLsRf_x8JMv7pHFcmUJC2SZLg@mail.gmail.com>

On Fri, Dec 30, 2022 at 11:48 AM Rob Wing <rob.fx907@gmail.com> wrote:

> you might try `addr2line -e $path_to_kernel 0xffffffff80cf0651`
>
Just a note: I find that addr2line only works when it is pointed at the
kernel file called "kernel.debug", the one that still carries the debug
symbols. It normally lives in the kernel build directory under /usr/obj.

rick


>
> Aside from that, it looks like errors aren't being handled correctly after
> failing to find the BFS superblock in bfs_mountfs(). Since no error is
> returned after failing to find the superblock, I'm guessing that the NULL
> pointer `bfsmp` is being dereferenced in bfs_statfs().
>
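
The error-handling point above can be illustrated with a small user-space
sketch. It is not the code from the freebsd-bfs repository: `ex_mountfs`,
`ex_statfs`, `struct ex_mount` and `EX_MAGIC` are hypothetical stand-ins, and
the magic value is made up. The only point is the pattern: return an errno
value as soon as the superblock check fails, so the caller never goes on to
use a NULL mount structure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define EX_MAGIC 0x12345678u	/* made-up magic value, for illustration only */

/* Hypothetical per-mount data, standing in for the real structure. */
struct ex_mount {
	uint32_t magic;
	uint32_t block_size;
};

/*
 * Stand-in for a mount routine.  On any failure it returns an errno value
 * and leaves *mpp set to NULL; it returns 0 only when *mpp is valid.
 */
static int
ex_mountfs(const uint32_t *sb_magic, struct ex_mount **mpp)
{
	struct ex_mount *mp;

	assert(mpp != NULL);
	*mpp = NULL;
	if (sb_magic == NULL || *sb_magic != EX_MAGIC)
		return (EINVAL);	/* bail out here, do not fall through */
	mp = calloc(1, sizeof(*mp));
	if (mp == NULL)
		return (ENOMEM);
	mp->magic = *sb_magic;
	mp->block_size = 1024;
	*mpp = mp;
	return (0);
}

/* Stand-in for a statfs routine; it should only run after a good mount. */
static int
ex_statfs(const struct ex_mount *mp, uint32_t *bsize)
{
	assert(bsize != NULL);
	if (mp == NULL)
		return (EINVAL);
	*bsize = mp->block_size;
	return (0);
}
```

A caller would check the return value of `ex_mountfs()` and skip `ex_statfs()`
entirely when it is non-zero. In the real module the equivalent would be
propagating the error out of the mount path, assuming the diagnosis above is
right.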
> On Fri, Dec 30, 2022 at 10:35 AM John F Carr <jfc@mit.edu> wrote:
>
>>
>>
>> > On Dec 30, 2022, at 14:13, Hikmat Jafarli <jafarlihi@gmail.com> wrote:
>> >
>> > I'm trying to implement the BeOS filesystem (BFS) for FreeBSD.
>> > The repository is here: https://github.com/jafarlihi/freebsd-bfs
>> > (Please don't mind bad styling and all the copy-paste work,
>> > I'll polish it later, I'm just trying to get to some PoC where it works)
>> >
>> > Now when I try to mount a valid BFS partition (reported as BFS by `fstyp`)
>> > it executes all the way to printf that logs "Either not a BFS volume or
>> > corrupted" and then crashes with "page fault while in kernel mode" in
>> > vfs_domount_first+0x271. Here's the log:
>> > ```
>> > Either not a BFS volume or corrupted
>> >
>> > Fatal trap 12: page fault while in kernel mode
>> > cpuid = 0; apic id = 00
>> > fault virtual address = 0x18
>> > fault code = supervisor read data, page not present
>> > instruction pointer = 0x20:0xffffffff82b2427b
>> > stack pointer        = 0x28:0xfffffe00df399ac0
>> > frame pointer        = 0x28:0xfffffe00df399ac0
>> > code segment = base 0x0, limit 0xfffff, type 0x1b
>> > = DPL 0, pres 1, long 1, def32 0, gran 1
>> > processor eflags = interrupt enabled, resume, IOPL = 0
>> > current process = 1208 (mount)
>> > trap number = 12
>> > panic: page fault
>> > cpuid = 0
>> > time = 1672414952
>> > KDB: stack backtrace:
>> > #0 0xffffffff80c694a5 at kdb_backtrace+0x65
>> > #1 0xffffffff80c1bb5f at vpanic+0x17f
>> > #2 0xffffffff80c1b9d3 at panic+0x43
>> > #3 0xffffffff810afdf5 at trap_fatal+0x385
>> > #4 0xffffffff810afe4f at trap_pfault+0x4f
>> > #5 0xffffffff810875b8 at calltrap+0x8
>> > #6 0xffffffff80cf0651 at vfs_domount_first+0x271
>> > #7 0xffffffff80cece9d at vfs_domount+0x2ad
>> > #8 0xffffffff80cec2d8 at vfs_donmount+0x8f8
>> > #9 0xffffffff80ceb9a9 at sys_nmount+0x69
>> > #10 0xffffffff810b06ec at amd64_syscall+0x10c
>> > #11 0xffffffff81087ecb at fast_syscall_common+0xf8
>> > ```
>> >
>> > Now I'm trying to understand what exactly goes wrong here
>> > and how to map 0x271 to the exact source line.
>> >
>> > I'd appreciate it if someone could tell me how to debug this.
>> >
>> > (Sorry for noob question, I already tried IRC and was directed here)
>>
>> Your BFS module tried to dereference a null pointer to structure.
>>
>> It's a null pointer dereference because of "fault virtual address =
>> 0x18".  That normally means you tried to access the fourth word of a
>> structure but the pointer to structure was null.  It could be something
>> else, but play the odds.
>>
>> It's in your module because the instruction pointer address is far beyond
>> the other kernel functions in the stack trace.  Stack traces in crash
>> reports are misleading: they tend to omit the function that triggered the
>> crash.  The address of vfs_domount_first is 0xffffffff80cf03e0
>> (0xffffffff80cf0651 - 0x271).  That's the function that called your
>> module.  The address of the faulting instruction is 0xffffffff82b2427b.
>> That's in your module.
>>
>>
>>
>>
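
The arithmetic in the quoted explanation above can be checked with a few lines
of C. This is only a sketch: `struct example`, `func_start` and `word_index`
are hypothetical names, and the "fourth word" reading assumes 8-byte pointers,
as on amd64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: four pointer-sized members, not a real kernel struct. */
struct example {
	void *first;
	void *second;
	void *third;
	void *fourth;
};

/* Start address of a function, given a "symbol+offset" backtrace entry. */
static uint64_t
func_start(uint64_t addr, uint64_t offset)
{
	assert(addr >= offset);
	return (addr - offset);
}

/* Zero-based index of the pointer-sized word that a small address falls in. */
static size_t
word_index(uint64_t fault_addr)
{
	return ((size_t)(fault_addr / sizeof(void *)));
}
```

With the values from the log, `func_start(0xffffffff80cf0651, 0x271)` gives
0xffffffff80cf03e0, and with 8-byte pointers `word_index(0x18)` is 3, i.e. the
fourth word, which matches `offsetof(struct example, fourth)`.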
