Date: Thu, 30 Jan 2025 08:47:32 +0000
From: David Chisnall <theraven@freebsd.org>
To: paige@paige.bio
Cc: hackers@freebsd.org
Subject: Re: Provisions to the contribution guidelines for using LLM generated code
Message-ID: <4922BB4E-1361-4AE9-A40D-D75E4875033D@freebsd.org>
In-Reply-To: <49B92974-E37A-4786-A456-E258D5A1D35E@paige.bio>
References: <49B92974-E37A-4786-A456-E258D5A1D35E@paige.bio>
I am not a lawyer. If you want legal advice, you should talk to a lawyer. As a not-a-lawyer, my opinion is:

Copyright law, in general, does not in any way describe how copying occurs. If you photocopy a book, or if someone reads it to you and you write it down, that’s equally copyright infringement or fair use based on the result: the mechanism does not matter.

If you take a load of existing exFAT implementations, apply a lossy compression algorithm to them (neural network training) and then decompress, is the output a derived work of the input? That will depend on a load of tests that a court can apply to judge similarity and so on. In general, a good legal rule of thumb is that judges are not idiots (ignoring the Texas West District). If you use an obfuscated process to hide your illegal action then they will regard it as both illegal and wilful (and be annoyed with you), which is not a good place to be.

Is your exFAT implementation a new creative work or a derived work of something else? Does it infringe Microsoft’s exFAT patents? I don’t know and going to court is probably the only way of getting a definitive answer. Please don’t expose the FreeBSD project to that legal risk, defending it would cost more than the annual budget of the Foundation.

David

> On 30 Jan 2025, at 02:05, paige@paige.bio wrote:
>
> Hi there,
>
> As y’all have probably heard AI is the new big thing in town and people are at a bit of a loss for what it means. Despite the news about the stock market sell off that came in the wake of the new DeepSeek thing, I’ve actually been playing around with this thing called Claude for the past couple of weeks and I’m still not really sure what to think of it. I think it’s really cool to say the least, but I still have a lot of questions myself.
>
> More specifically, I’m not really sure at what point using something like Claude to create something like a native ExFAT filesystem becomes an issue of attribution;
>
> https://github.com/paigeadelethompson/exfat/tree/main/sys/fs/exfat
>
> it presumably created this based on the parameters in its model (presumably, it is not actually known how Anthropic’s models work because as far as I know that information is proprietary.) I vaguely understand how it is able to do this and to the best of my knowledge, it doesn’t plagiarize code but it does generate code based on facts that it can find in its own model about ideas which are potentially subject to patent restrictions. For what this is worth, I think that people are going to find this to be incredibly valuable regardless of whether or not it produces an exact desired result. What it doesn’t get right the first time is often the subject of something being really damn close.
>
> I’m so dumbfounded by how much it can actually do that I haven’t even tried to compile the code for the filesystem it created; it didn’t take me more than an hour of saying “yes” following the initial “I’d like to make an ExFAT driver for FreeBSD in C can you give me the best starting point possible?” To be honest I kinda had to fact check it a couple of times; it wanted to do things like implementing extattrs, which this filesystem patently doesn’t have. But as soon as I asked it, it seemed to know exactly what I meant:
>
> “No, you’re right - I apologize for adding unnecessary complexity. The ExFAT specification doesn’t include support for extended attributes like other filesystems (e.g., UFS or ext4).
> The only attributes ExFAT supports are the basic DOS/FAT attributes we already have defined”
>
> And then it proceeded to make changes to remove the stubs and so forth (which it may not have done right but I haven’t gotten that far yet.) In fact, I don’t really feel like I can realistically move forward with this (because I’ll have to fork over $20 to get more time out of it) but also I just don’t really know whether or not this is okay. Obviously I want to say yes, but I get the impression that some people might not be okay with this, especially if what it creates is not well understood or violates copyright laws.
>
> “Under U.S. law, you cannot patent an idea, but you may be able to protect your idea by bringing it to life.” As far as I know the licensing for ExFAT is a little bit of a gray area. It’s Microsoft’s patent; there’s a GPL implementation that exists, but aside from that I don’t know if it’s technically okay to make another implementation that is licensed any other way. I assume so, but it’s not unimaginable that even simply ingesting an ExFAT filesystem could come with some kind of stipulation.
>
> And I’m sure some people might even think “why would you, there’s a FUSE implementation for this already” and, you know, because FUSE is FUSE and this is an implementation of ExFAT that uses VFS. Also ExFAT/fuse does have problems but it works (sorta) in a pinch. I’d personally be more interested in improving something that is part of core FreeBSD than I would anything having to do with a port that I have to install in addition to the OS itself in order to use it.
>
> The reason why it matters: I just really like ExFAT.
> Virtually everything now has native support for it out of the box except for UEFI (they should; surprised Microsoft hasn’t pushed the standard to adopt it given that .WIM files can certainly exceed 4.3GB on modern versions of Windows). It just makes good sense to me to use it, even though it’s not a journaled filesystem. Using parchive is not lost on me, but I’ve seldom ever truly needed it even with ExFAT.
>
> Maybe I’m not even really trying to drive this to completion as much as I just needed an example and am just wanting to understand: are people already doing this? Is it possible that people have already done this and nobody is really aware of it? I’d like to think if you can then you certainly should, but where do you draw the line, and should there perhaps be conventions for keeping track of code in FreeBSD that is produced by LLMs? Maybe there already are and I just haven’t found them yet, but it wouldn’t come as any surprise if there weren’t given this is all still kind of novel. Either way I’m sure there are things much more substantial than ExFAT worth trying, but there should probably be something of an understanding about what is and isn’t okay. I wonder if what we don’t know about proprietary LLMs like Claude could potentially be an easily overlooked problem that could have legal consequences later.
>
> In any case I’m sure people will figure it out, but if anybody was looking for a cue to discuss this I mean.. it’d be really useful to me if FreeBSD supported ExFAT out of the box (especially since I can’t get to my offline archive of the ports and its distfiles without it.) The only available implementations at present are GPL, so can we just like… generate an implementation with Claude and license it BSD?
> I honestly wish that my friend hadn’t insisted on showing me this, kinda because I hoped to avoid something that I know is certainly going to have repercussions for the way things are currently done, but I can’t unsee this and I feel like I’ve been “doing it wrong” my whole life.
>
> -Paige