Date:      Tue, 4 Mar 2025 03:21:01 +0000 (UTC)
From:      Pedro Giffuni <pfg@freebsd.org>
To:        "paige@paige.bio" <paige@paige.bio>
Cc:        "hackers@freebsd.org" <hackers@freebsd.org>
Subject:   LLM and file systems (was Re: Porting BeFS to FreeBSD for GSoC2025)
Message-ID:  <773517072.6945384.1741058461093@mail.yahoo.com>
In-Reply-To: <88C34907-1525-4927-8105-153B389BFA55@paige.bio>
References:  <1150935855.6567926.1741010692744@mail.yahoo.com> <88C34907-1525-4927-8105-153B389BFA55@paige.bio>

Hi,

There are good reasons to avoid LLM-generated code in an OS like FreeBSD. In short, it will take some time to understand the licensing implications of using code based on other code used to train it. My previous employer was concerned about sharing customer code with the owner of the AI provider.

It may be conceptually acceptable to use an LLM to generate test cases though, since the test cases do not end up being part of the end product, but it very much depends on the project. For FreeBSD nothing has been approved AFAICT.

My guess for your project is that you can package it as an external loadable module and add a legal disclaimer (which I wouldn't know how to write since I am not a lawyer ;-) ).
For a GSoC we expect a human programmer.

Pedro.

    On Monday, March 3, 2025 at 04:05:03 PM GMT-5, <paige@paige.bio> wrote:

I’ve been making a native ExFAT driver (it uses VFS instead of FUSE):
https://github.com/paigeadelethompson/exfat

And I’ve been using an LLM to do it. I recommend using something like Claude if you can. I’m not sure when I’ll be done with this, but if you want some advice:

- Start with newfs and use a known-good chkdsk or fsck program on another computer; macOS is a good starting point if you can get befs.fsck there, otherwise plan on having to copy stuff back and forth a bit.

If you use an LLM and can get this converted to text: https://www.nobius.org/dbg/practical-file-system-design.pdf it will help you a lot.

ExFAT is documented extensively on MSDN, and Claude-3.5-sonnet seems to have pretty decent RAG. In any case I recommend having a look through my README and making heavy use of bootverbose. You will also want to enable the various kernel-level options in my README. VFS is a little tricky, but once you get through this initial mount trace:
https://github.com/paigeadelethompson/exfat/commit/187c6694c68554f7961b427501373984a0742366

the rest shouldn’t be as bad. You can see that the snippet of bootverbose messages includes the name of the function each message is printed from (very helpful to have, honestly, especially if you’re using an LLM), but be prepared to drop into DDB and reset/retry a few dozen or a hundred times until you figure out VFS in any case xD

At least with lock debugging enabled in the kernel it’s a little more actionable.

Sent from my iPhone

On Mar 3, 2025, at 6:05 AM, Pedro Giffuni <pfg@freebsd.org> wrote:

Hello Krutarth;

Thank you for the interest!

Yes, the idea is still open. In all honesty FreeBSD does have much better filesystems than openBFS, but we don't have a "true" journalling filesystem, and BFS is rather well documented with an open implementation, so it could still be nice to have.

At one time I spoke with some Haiku guys and Bruno was interested in co-mentoring this project.

As I mentioned in private, you are probably better off checking the ext2fs sources (sys/fs/ext2fs), for a simplified UFS. We don't have any open issues AFAICT, but maybe fedor@ has something pending.

For documentation, "The Design and Implementation of the FreeBSD OS" seems pretty much compulsory.

Pedro.

ps. I am somewhat retired from FreeBSD, if such a thing exists, but if no one else steps in I would co-mentor.

    On Monday, March 3, 2025 at 12:53:00 AM GMT-5, Krutarth Patel <krutarthpatel929@gmail.com> wrote:

Hello,

I am interested in porting BeFS from Haiku. I see that it is listed as one of the GSoC ideas.

I have made some contributions to the PCI subsystem over at Haiku and have some Linux kernel debugging experience.

I am new to FreeBSD (not entirely; I am in the process of porting a driver from FreeBSD to Haiku) and to filesystems in general (I have an idea of the basic terminology, like inode, block, etc., but that's about it). But I am willing to learn.

Here are my questions:

   - Is the idea still open?
   - Are there any smaller issues I can resolve to get myself familiar with the codebase? (Something related to UFS/ZFS would be perfect.)
   - Where are the UFS and ZFS implementations in the source tree?
   - Any recommended resources for learning about filesystems (specifically FreeBSD; I am reading a guide about BeFS)?

Looking forward to hearing from you


