• 4 Posts
  • 576 Comments
Joined 9 months ago
cake
Cake day: December 18th, 2023

help-circle

  • Copyright is utterly corrupted. Besides, I believe it is corrosive and outright dangerous in the age of the internet. Every time you open a website or a stream or anything, that is copied to your device. In the age of the printing press, it was about what happened in a few “factories”/printing houses. Libraries were fine because they didn’t copy, but online libraries do. Now, copyright is about all our communications. Total enforcement would mean total surveillance.

    So this is not a defense of copyright. It is simply an explanation.

    Building products for sale is what US-copyright is all about. Think about the copyright clause: To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.

    Without copyright, everything would be public domain. Everyone would be free to share any book or movie. That makes it hard to make money, to monetize your product, to recoup your investment. Copyright is supposed to be a way to enable that. It’s supposed to create an incentive to entertain you. If you have to pay for your entertainment, then someone will come along and entertain you to get your money. Piracy is an attack on that system.

    If AI companies have to buy licenses, that would not incentivize much of anything. Licensing curated datasets for AI training would be one thing, but paying for individual books or even Reddit posts makes no sense. It would just make development slower and much more expensive. That makes it an unconstitutional use of copyright.


  • Let’s engage in a little fantasy. Someone invents a magic machine that is able to duplicate apartments, condos, houses, … You want to live in New York? You can copy yourself a penthouse overlooking the Central Park for just a few cents. It’s magic. You don’t need space. It’s all in a pocket dimension like the Tardis or whatever. Awesome, right? Of course, not everyone would like that. The owner of that penthouse, for one. Their multi-million dollar investment is suddenly almost worthless. They would certainly demand that you must not copy their property without consent. And so would a lot of people. And what about the poor construction workers, ask the owners of constructions companies? And who will pay to have any new house built?

    So in this fantasy story, the government goes and bans the magic copy machine. Taxes are raised to create a big new police bureau to monitor the country and to make sure that no one use such a machine without a license.

    That’s turned from magical wish fulfillment into a dystopian story. A society that rejects living in a rent-free wonderland but instead chooses to make itself poor. People work to ensure poverty, not to create wealth.

    You get that I’m talking about data, information, knowledge. The first magic machine was the printing press. Now we have computers and the Internet.

    I’m not talking about a utopian vision here. Facts, scientific theories, mathematical theorems, … All such is free for all. Inventors can get patents, but only for 20 years and only if they publish them. They can keep their invention secret and take their chances. But if they want a government enforced monopoly, they must publish their inventions so that others may learn from it.

    In the US, that’s how the Constitution demands it. The copyright clause: [The United States Congress shall have power] To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.

    Cutting down on Fair Use makes everyone poorer and only a very few, very rich people richer. Have you ever thought about where the money goes if AI training requires a license?

    For example, to Reddit, because Reddit has rights to all those posts. So do Facebook and Xitter. Of course, there’s also old money, like the NYT or Getty. The NYT has the rights to all their old issue about a century back. If AI training requires a license, they can sell all their old newspapers again. That’s pure profit. Do you think they will their employees raises out of the pure goodness of their heart if they win their lawsuits? They have no legal or economics reason to do so. The belief that this would happen is trickle-down economics.










  • Not at all. The US conserves something of the enlightenment tradition of freely sharing information; vital for the advancement of science, technology, and culture. Free speech and free press means that you can say and print what you like (press at the time literally meant the printing press, not the media). Limitations in the form of copyrights or patents are only allowed where it helps those goals.

    Continental European copyright preserves a monarchical, aristocratic tradition. It’s rooted in ideas of personal privilege and honor. For example, it’s illegal to deface an artwork even when you own it because it’s an attack on the honor of the artist. The term “royalties” comes from the fact that it was a privilege granted by royalty.

    It’s revealing that Europe has basically the same patent system as the US. You can’t do without technology, even if you are an authoritarian ruler. What would your armies do? But copyright is just about culture, usually. You don’t want that to be a needless source of instability. You want a clique of cronies to be in charge of that. That’s what you see in Europe.




  • “AI has the potential to disrupt many professions, not just individual creators. The response to this disruption (e.g., support for worker retraining through institutions such as community colleges and public libraries) should be developed on an economy-wide basis, and copyright law should not be treated as a means for addressing these broader societal challenges.” Going down a typical copyright path of creating new rights and licensing markets could, for AI, serve to worsen social problems like inequality, surveillance and monopolistic behavior of Big Tech and Big Media.

    Second, any new copyright regulation of AI should not negatively impact the public’s right and ability to access information, knowledge, and culture. A primary purpose of copyright is to expand access to knowledge. See Authors Guild v. Google, 804 F.3d 202, 212 (2d Cir. 2015) (“Thus, while authors are undoubtedly important intended beneficiaries of copyright, the ultimate, primary intended beneficiary is the public, whose access to knowledge copyright seeks to advance . . . .”). Proposals to amend the Copyright Act to address AI should be evaluated by the impact such new regulations would have on the public’s access to information, knowledge, and culture. In cases where proposals would have the effect of reducing public access, they should be rejected or balanced out with appropriate exceptions and limitations.

    Third, universities, libraries, and other publicly-oriented institutions must be able to continue to ensure the public’s access to high quality, verifiable sources of news, scientific research and other information essential to their participation in our democratic society. Strong libraries and educational institutions can help mitigate some of the challenges to our information ecosystem, including those posed by AI. Libraries should be empowered to provide access to educational resources of all sorts– including the powerful Generative AI tools now being developed.

    Perhaps controversial statements.




  • Scaling laws are disputed

    Not in general.

    There is not enough permissively licensed text to train models of any size, and what there is, lacks in diversity. Wikipedia, government documents, stack overflow, century old stuff, … An LLM trained on that is not likely to be called “general purpose”, because scaling laws. Sometimes such small models are trained for research purposes but I don’t have a link ready. They are not something you’d actually use. Perhaps you could look at Microsoft’s Phi series of models. They are trained on synthetic data, though that’s probably not what you are looking for.


  • “data dignity”,

    Apparently, this is about creating a new kind of intellectual property; a generalized and hypercharged version of copyright that applies to all sorts of data.

    Maybe, this is a touchy subject, but to me this seems like an extremely right wing approach. Turn anything into property and the magic market will turn everything into rainbows and unicorns. Maybe you feel different about this?

    Regardless of classification, such a policy is obviously devastating to society. Of course, your argument does not consider society but only the feelings of some individuals. Feelings are valid but one has to consider the effect of such a policy, too. Not every impulse should be given power. This is especially true where such feelings are strongly influenced by culture and circumstance. For example, people in the US and the UK have -on the whole - rather different feelings on being ruled by a king. I don’t feel that I should be able to control what other people do with data, maybe because I’m a bit older and was socialized into that whole information-wants-to-be-free culture. I don’t even remember having a libertarian phase.

    How would you pitch this to me?