• Dr. Moose@lemmy.worldOP
    link
    fedilink
    English
    arrow-up
    30
    ·
    edit-2
    17 hours ago

    This is the notorious lawsuit from a year ago:

    a group of well-known writers that includes comedian Sarah Silverman and authors Jacqueline Woodson and Ta-Nehisi Coates

    The judge rules that AI training is fair use:

    But the actual process of an AI system distilling from thousands of written works to be able to produce its own passages of text qualified as “fair use” under U.S. copyright law because it was “quintessentially transformative,” Alsup wrote.

    This is the second judgement of this type this week.

    • deathmetal27@lemmy.world
      link
      fedilink
      English
      arrow-up
      21
      ·
      edit-2
      15 hours ago

      Alsup? Is this the same judge who also presided over Oracle v. Google over the use of Java in Android? That guy really does his homework on the cases he presides over; he famously learned how to code to see if APIs are copyrightable.

      As for the ruling, I’m not in favour of AI training on copyrighted material, but I can see where the judgement is coming from. I think it’s a matter of what’s really copyrightable: the actual text or images or the abstract knowledge in the material. In other words, if you were to read a book and then write a summary of a section of it in your own words or orally described what you learned from the book to someone else, does that mean copyright infringement? Or if you watch a movie and then describe your favourite scenes to your friends?

      Perhaps a case could be made that AI training on copyrighted materials is not the same as humans consuming the copyrighted material and therefore it should have a different provision in copyright law. I’m no lawyer, but I’d assume that current copyright law works on the basis that humans do not generally have perfect recall of the copyrighted material they consume. But then again a counter argument could be that neither does the AI due to its tendency to hallucinate sometimes. However, it still has superior recall compared to humans and perhaps could be the grounds for amending copyright law about AI training?

      • Petter1@lemm.ee
        link
        fedilink
        English
        arrow-up
        7
        ·
        14 hours ago

        Agree 100%

        Hope we can refactor this whole copyright/patent concept soon…

        It is more of a pain for artists, creators, labels, etc.

        I see it with EDM; I run a label and sometimes produce a bit myself.

        Most artists work with samples, presets, etc., and keeping track of who worked on what and who owns what percentage of what just takes the joy out of creating…

        Same for game design: you have a vision for your game, make a PoC, and then have to change the whole game because of stupid patent shit not allowing you to, e.g., land on a horse and immediately ride it, or throw stuff at creatures to catch them…

        • AnarchistArtificer@slrpnk.net
          link
          fedilink
          English
          arrow-up
          3
          ·
          14 hours ago

          I’m inclined to agree. I hate AI, and I especially hate artists and other creatives being shafted, but I’m increasingly doubtful that copyright is an effective way to ensure that they get their fair share (whether we’re talking about AI or otherwise).

          • Petter1@lemm.ee
            link
            fedilink
            English
            arrow-up
            5
            ·
            13 hours ago

            In an ideal world there would be something like a universal basic income. It would reduce the pressure on artists to generate enough income with their art, allowing them to make art that is less mainstream and more unique, and thus, in my opinion, would make it possible to weaken copyright laws.

            Well, that would be the way I would try to start change.

      • Dr. Moose@lemmy.worldOP
        link
        fedilink
        English
        arrow-up
        5
        arrow-down
        1
        ·
        15 hours ago

        Your last paragraph would be the ideal solution in an ideal world, but I don't think anything like this could happen within the current political and economic structures.

        First, it's super easy to hide all of this, and enforcement would be very difficult even domestically. Second, because we're in an AI race, no one would ever put themselves at such a disadvantage unless there were real damages, not just economic copyright juggling.

        People need to come to terms with these facts so we can address real problems rather than blow against the wind with all this whining we see on Lemmy. There are actual things we can do.

        • deathmetal27@lemmy.world
          link
          fedilink
          English
          arrow-up
          1
          arrow-down
          1
          ·
          6 hours ago

          One way I could see this being enforced is by mandating that AI models not respond to questions that would involve reproducing a copyrighted work, similar to how mainstream models don't speak about vulgar or controversial topics.

          But yeah, realistically, it’s unlikely that any judge would rule in that favour.

      • squaresinger@lemmy.world
        link
        fedilink
        English
        arrow-up
        1
        ·
        1 hour ago

        Accuracy and hallucination are two ends of a spectrum.

        If you turn hallucination down to a minimum, the LLM will faithfully reproduce what's in the training set, but the result will not fit the query very well.

        The other option is to turn the so-called temperature up, which results in replies that fit the query better, but the hallucinations go up as well.

        In the end it’s a balance between getting responses that are closer to the dataset (factual) or closer to the query (creative).
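        The temperature knob described above can be sketched in a few lines: temperature divides the model's next-token scores before the softmax, so low values sharpen the distribution toward the single most likely token and high values flatten it. This is a minimal illustration with made-up logits, not any particular model's implementation:

        ```python
        import numpy as np

        rng = np.random.default_rng(0)

        def sample_with_temperature(logits, temperature):
            """Sample a token index from logits scaled by temperature.

            Low temperature -> sharper distribution (sticks to the top token);
            high temperature -> flatter distribution (more surprising picks).
            """
            scaled = np.asarray(logits, dtype=float) / temperature
            scaled -= scaled.max()          # subtract max for numerical stability
            probs = np.exp(scaled)
            probs /= probs.sum()            # softmax
            return rng.choice(len(probs), p=probs)

        logits = [4.0, 2.0, 0.5]  # hypothetical next-token scores

        # Near-zero temperature: the top-scoring token wins essentially every time.
        cold = [sample_with_temperature(logits, 0.1) for _ in range(100)]

        # High temperature: the lower-scoring tokens show up regularly.
        hot = [sample_with_temperature(logits, 5.0) for _ in range(100)]
        ```

        At temperature 0.1 the 100 samples are all token 0; at temperature 5.0 the mix is much more varied, which is the "creative but more error-prone" end of the spectrum the comment describes.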

      • tabular@lemmy.world
        link
        fedilink
        English
        arrow-up
        5
        arrow-down
        2
        ·
        12 hours ago

        “hallucination refers to the generation of plausible-sounding but factually incorrect or nonsensical information”

        Is an output a hallucination when the training data involved in the output included factually incorrect data? Suppose my input is “is the world flat” and then an LLM, allegedly accurately, generates a flat-earther’s writings saying it is.