Date: Tue, 4 Mar 2025 03:21:01 +0000 (UTC)
From: Pedro Giffuni <pfg@freebsd.org>
To: "paige@paige.bio" <paige@paige.bio>
Cc: "hackers@freebsd.org" <hackers@freebsd.org>
Subject: LLM and file systems (was Re: Porting BeFS to FreeBSD for GSoC2025)
Message-ID: <773517072.6945384.1741058461093@mail.yahoo.com>
In-Reply-To: <88C34907-1525-4927-8105-153B389BFA55@paige.bio>
References: <1150935855.6567926.1741010692744@mail.yahoo.com> <88C34907-1525-4927-8105-153B389BFA55@paige.bio>

Hi,

There are good reasons to avoid LLM-generated code in an OS like FreeBSD. In short, it will take some time to understand the licensing implications of using code based on other code used to train it. My previous employer was concerned about sharing customer code with the owner of the AI provider.

It may be conceptually acceptable to use an LLM to generate test cases though, since the test cases do not end up being part of the end product, but it very much depends on the project. For FreeBSD nothing has been approved AFAICT.

My guess for your project is that you can package it as an external loadable module and add a legal disclaimer (which I wouldn't know how to write since I am not a lawyer ;-) ).
For a GSoC we expect a human programmer.

Pedro.

On Monday, March 3, 2025 at 04:05:03 PM GMT-5, <paige@paige.bio> wrote:

I’ve collectively been making an ExFAT native driver (uses VFS instead of fuse)

https://github.com/paigeadelethompson/exfat

And I’ve been using an LLM to do it. I recommend using something like Claude if you can, not sure when I’ll be done with this but if you want some advice:

- start with newfs and use a known good chkdsk or fsck program on another computer; macOS is a good starting point if you can get befs.fsck there, otherwise plan on having to copy stuff back and forth a bit.

If you use an LLM and can get this converted to text: https://www.nobius.org/dbg/practical-file-system-design.pdf it will help you a lot.
ExFAT is documented extensively on MSDN and Claude-3.5-sonnet seems to have pretty decent RAG. In any case I recommend having a look through my README and making heavy use of bootverbose.. but you will also want to enable the various kernel level options in my readme. VFS is a little tricky but once you get through this initial mount trace:

https://github.com/paigeadelethompson/exfat/commit/187c6694c68554f7961b427501373984a0742366

The rest shouldn’t be as bad.. you can see the snippet of bootverbose messages have the name of the function they’re called from (very helpful to have honestly, especially if you’re using an LLM) but be prepared to drop into DDB and reset / retry a few dozen or a hundred times until you figure out VFS in any case xD

At least with lock debugging enabled in the kernel it’s a little more actionable.

Sent from my iPhone

On Mar 3, 2025, at 6:05 AM, Pedro Giffuni <pfg@freebsd.org> wrote:

Hello Krutarth;

Thank you for the interest!

Yes, the idea is still open. In all honesty FreeBSD does have much better filesystems than openBFS, but we don't have a "true" journalling filesystem and BFS is rather well documented with an open implementation, so it could still be a nice-to-have.

At one time I spoke with some Haiku guys and Bruno was interested in co-mentoring this project.

As I mentioned in private, you are probably better off checking the ext2fs sources (sys/fs/ext2fs) for a simplified UFS. We don't have any open issues AFAICT, but maybe fedor@ has something pending.

For documentation, "The Design and Implementation of the FreeBSD OS" seems pretty much compulsory.

Pedro.

ps. I am somewhat retired from FreeBSD, if such a thing exists, but if no one else steps in I would co-mentor.

On Monday, March 3, 2025 at 12:53:00 AM GMT-5, Krutarth Patel <krutarthpatel929@gmail.com> wrote:

Hello,

I am interested in porting BeFS from Haiku. I see that it is listed as one of the GSoC ideas.

I have done some contributions in the PCI subsystem over at Haiku and have some Linux kernel debugging experience.

I am new to FreeBSD (not entirely, I am in the process of porting a driver from FreeBSD to Haiku) and filesystems in general (I have an idea of the basic terminologies like inode, block etc. but that's about it). But I am willing to learn.

Here are my questions:

- Is the idea still open?
- Are there any smaller issues I can resolve to get myself familiar with the codebase? (something related to UFS/ZFS would be perfect)
- Where are the UFS and ZFS implementations in the source tree?
- Any recommended resources for learning about filesystems (specifically FreeBSD, I am reading a guide about BeFS)?

Looking forward to hearing from you
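
On the bootverbose advice earlier in the thread: the trace messages described there carry the name of the function that emitted them. A minimal userspace sketch of that idea follows. It is not code from the exfat repository; `verbose` is a local stand-in for the kernel's `bootverbose` flag, and `trace_format`, `TRACE` and `demo_mount` are hypothetical names used only for illustration.

```c
#include <stdarg.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's bootverbose flag (assumption). */
static int verbose = 1;

/*
 * Build "<function>: <message>" in buf when verbose is non-zero.
 * Returns the number of characters stored (excluding the NUL), or 0
 * when tracing is off or the prefix does not fit.
 */
static int
trace_format(char *buf, size_t len, const char *func, const char *fmt, ...)
{
	va_list ap;
	int n, m;

	if (!verbose || len == 0)
		return (0);
	n = snprintf(buf, len, "%s: ", func);
	if (n < 0 || (size_t)n >= len)
		return (0);
	va_start(ap, fmt);
	m = vsnprintf(buf + n, len - (size_t)n, fmt, ap);
	va_end(ap);
	if (m < 0)
		return (0);
	if ((size_t)m >= len - (size_t)n)	/* message was truncated */
		m = (int)(len - (size_t)n - 1);
	return (n + m);
}

/* Tag each message with the calling function's name via __func__. */
#define TRACE(...) do {							\
	char line_[256];						\
	if (trace_format(line_, sizeof(line_), __func__, __VA_ARGS__) > 0) \
		fprintf(stderr, "%s\n", line_);				\
} while (0)

/* Hypothetical mount step, present only to show the macro in use. */
static int
demo_mount(const char *dev)
{
	TRACE("mounting %s", dev);
	return (0);
}
```

In a kernel module the same pattern would presumably be a `printf` guarded by `bootverbose` with `__func__` as the first argument, which makes it easy to match a console line to the VFS entry point that produced it.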
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?773517072.6945384.1741058461093>