Internet Archive Submits Comments on Copyright and Artificial Intelligence

General_Effort@lemmy.world · 1 hour ago

But that’s unethical!

General_Effort@lemmy.world · 4 hours ago

Copyright is utterly corrupted. Besides, I believe it is corrosive and outright dangerous in the age of the internet. Every time you open a website or a stream or anything, that is copied to your device. In the age of the printing press, it was about what happened in a few “factories”/printing houses. Libraries were fine because they didn’t copy, but online libraries do. Now, copyright is about all our communications. Total enforcement would mean total surveillance.

So this is not a defense of copyright. It is simply an explanation.

Building products for sale is what US copyright is all about. Think about the copyright clause: To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.

Without copyright, everything would be public domain. Everyone would be free to share any book or movie. That makes it hard to make money, to monetize your product, to recoup your investment. Copyright is supposed to be a way to enable that. It’s supposed to create an incentive to entertain you. If you have to pay for your entertainment, then someone will come along and entertain you to get your money. Piracy is an attack on that system.

If AI companies have to buy licenses, that would not incentivize much of anything. Licensing curated datasets for AI training would be one thing, but paying for individual books or even Reddit posts makes no sense. It would just make development slower and much more expensive. That makes it an unconstitutional use of copyright.

General_Effort@lemmy.world · 8 hours ago

Let’s engage in a little fantasy. Someone invents a magic machine that is able to duplicate apartments, condos, houses, … You want to live in New York? You can copy yourself a penthouse overlooking Central Park for just a few cents. It’s magic. You don’t need space. It’s all in a pocket dimension like the Tardis or whatever. Awesome, right? Of course, not everyone would like that. The owner of that penthouse, for one. Their multi-million dollar investment is suddenly almost worthless. They would certainly demand that you must not copy their property without consent. And so would a lot of people. And what about the poor construction workers, ask the owners of construction companies? And who will pay to have any new house built?

So in this fantasy story, the government goes and bans the magic copy machine. Taxes are raised to create a big new police bureau to monitor the country and to make sure that no one uses such a machine without a license.

That’s turned from magical wish fulfillment into a dystopian story. A society that rejects living in a rent-free wonderland but instead chooses to make itself poor. People work to ensure poverty, not to create wealth.

You get that I’m talking about data, information, knowledge. The first magic machine was the printing press. Now we have computers and the Internet.

I’m not talking about a utopian vision here. Facts, scientific theories, mathematical theorems, … All of that is free for all. Inventors can get patents, but only for 20 years and only if they publish them. They can keep their invention secret and take their chances. But if they want a government-enforced monopoly, they must publish their inventions so that others may learn from them.

In the US, that’s how the Constitution demands it. The copyright clause: [The United States Congress shall have power] To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.

Cutting down on Fair Use makes everyone poorer and only a very few, very rich people richer. Have you ever thought about where the money goes if AI training requires a license?

For example, to Reddit, because Reddit has rights to all those posts. So do Facebook and Xitter. Of course, there’s also old money, like the NYT or Getty. The NYT has the rights to all their old issues going back about a century. If AI training requires a license, they can sell all their old newspapers again. That’s pure profit. Do you think they will give their employees raises out of the pure goodness of their heart if they win their lawsuits? They have no legal or economic reason to do so. The belief that this would happen is trickle-down economics.

General_Effort@lemmy.world · 9 hours ago

This paperwork is required by EU regulation (Digital Services Act - DSA).

It is theoretically possible to be exempted, but I doubt OP has any chance there.

General_Effort@lemmy.world · 14 hours ago

I was just being sarcastic. The article is explicit that there is a copyright organization behind this.

General_Effort@lemmy.world · 1 day ago

In what country is that?

Under US law, you cannot copyright recipes. You can own a specific text in which you explain the recipe. But anyone can write down the same ingredients and instructions in a different way and own that text.

General_Effort@lemmy.world · 1 day ago

SMITH created thousands of accounts on the Streaming Platforms (the “Bot Accounts”) that he could use to stream songs. He then used software to cause the Bot Accounts to continuously stream songs that he owned. At a certain point in the charged time period, SMITH estimated that he could use the Bot Accounts to generate approximately 661,440 streams per day, yielding annual royalties of $1,207,128.

From the original press release: https://www.justice.gov/usao-sdny/pr/north-carolina-musician-charged-music-streaming-fraud-aided-artificial-intelligence
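
As a quick sanity check on the two quoted figures, here is a minimal sketch of the arithmetic. The 365-day year is my assumption, not something the quote states, and the function name is just for illustration.

```python
# Figures quoted from the press release above.
STREAMS_PER_DAY = 661_440
ANNUAL_ROYALTIES_USD = 1_207_128

# Assumption (not stated in the quote): streaming runs all 365 days of the year.
DAYS_PER_YEAR = 365


def implied_rate_per_stream(streams_per_day, annual_royalties, days=DAYS_PER_YEAR):
    """Average royalty per stream implied by the two quoted figures."""
    return annual_royalties / (streams_per_day * days)


annual_streams = STREAMS_PER_DAY * DAYS_PER_YEAR
rate = implied_rate_per_stream(STREAMS_PER_DAY, ANNUAL_ROYALTIES_USD)

print(f"streams per year: {annual_streams:,}")
print(f"implied royalty per stream: ${rate:.4f}")
```

Under that assumption the two numbers are consistent with an average payout of about half a cent per stream, which suggests the annual figure is simply the daily estimate scaled up.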

Kinda funny how the term “AI” drowns out all rational thought and reading comprehension. Of course, that’s why it’s there in the clickbait headline. I avoid news sources that pull that sort of thing. I don’t appreciate being manipulated.

General_Effort@lemmy.world · 1 day ago

It’s more about copying, really.

That’s why no one gets sued for downloading.

People do get sued in some countries, e.g. Germany. I think they stopped in the US because of the bad publicity.

What these lawsuits against OpenAI are claiming is that OpenAI is making a derivative work of the authors/owners works.

That theory is just crazy. I think it’s already been thrown out of all these suits.

General_Effort@lemmy.world · 1 day ago

Yes, that’s exactly the point. It should belong to humanity, which means that anyone can use it to improve themselves. Or to create something nice for themselves or others. That’s exactly what AI companies are doing. And because it is not stealing, it is all still there for anyone else. Unless, of course, the copyrightists get their way.

General_Effort@lemmy.world · 1 day ago

I didn’t make the point clear. The original scenes themselves, as released by the studio, may qualify as “deepfakes”. A little bit of digital post-processing can be enough to qualify them under the proposed bills. Then sharing them becomes criminal, fair use be damned.

General_Effort@lemmy.world · 1 day ago

That’s easy. The movie studios know what post-production went into the scenes and have the documents to prove it. They can easily prove that such clips fall under deepfake laws.

Y’all need to be more cynical. These lobby groups do not make arguments because they believe in them, but because it gets them what they want.

General_Effort@lemmy.world · 1 day ago

Not at all. The US conserves something of the Enlightenment tradition of freely sharing information, which is vital for the advancement of science, technology, and culture. Free speech and free press mean that you can say and print what you like (press at the time literally meant the printing press, not the media). Limitations in the form of copyrights or patents are only allowed where they help those goals.

Continental European copyright preserves a monarchical, aristocratic tradition. It’s rooted in ideas of personal privilege and honor. For example, it’s illegal to deface an artwork even when you own it because it’s an attack on the honor of the artist. The term “royalties” comes from the fact that it was a privilege granted by royalty.

It’s revealing that Europe has basically the same patent system as the US. You can’t do without technology, even if you are an authoritarian ruler. What would your armies do? But copyright is just about culture, usually. You don’t want that to be a needless source of instability. You want a clique of cronies to be in charge of that. That’s what you see in Europe.

General_Effort@lemmy.world · 1 day ago

Make no mistake. The US is heading in the same direction. Look at the proposed anti-deepfake laws. That guy could be prosecuted extremely harshly under those.

General_Effort@lemmy.world · 1 day ago

Rights Alliance also called on Reddit to take the matter seriously. While many of the problematic clips were already removed at that point, the group urged Reddit to implement upload filters to prevent future trouble.

Nothing sinister here, folks. Just defending helpless women against those evil techbros.

General_Effort@lemmy.world · 2 days ago

“AI has the potential to disrupt many professions, not just individual creators. The response to this disruption (e.g., support for worker retraining through institutions such as community colleges and public libraries) should be developed on an economy-wide basis, and copyright law should not be treated as a means for addressing these broader societal challenges.” Going down a typical copyright path of creating new rights and licensing markets could, for AI, serve to worsen social problems like inequality, surveillance and monopolistic behavior of Big Tech and Big Media.

Second, any new copyright regulation of AI should not negatively impact the public’s right and ability to access information, knowledge, and culture. A primary purpose of copyright is to expand access to knowledge. See Authors Guild v. Google, 804 F.3d 202, 212 (2d Cir. 2015) (“Thus, while authors are undoubtedly important intended beneficiaries of copyright, the ultimate, primary intended beneficiary is the public, whose access to knowledge copyright seeks to advance . . . .”). Proposals to amend the Copyright Act to address AI should be evaluated by the impact such new regulations would have on the public’s access to information, knowledge, and culture. In cases where proposals would have the effect of reducing public access, they should be rejected or balanced out with appropriate exceptions and limitations.

Third, universities, libraries, and other publicly-oriented institutions must be able to continue to ensure the public’s access to high quality, verifiable sources of news, scientific research and other information essential to their participation in our democratic society. Strong libraries and educational institutions can help mitigate some of the challenges to our information ecosystem, including those posed by AI. Libraries should be empowered to provide access to educational resources of all sorts– including the powerful Generative AI tools now being developed.

Perhaps controversial statements.

General_Effort@lemmy.world · 2 days ago

Internet Archive Submits Comments on Copyright and Artificial Intelligence

General_Effort@lemmy.world · 2 days ago

And have you stopped beating your wife yet?

Asking loaded questions isn’t the big brain move you think. It’s just dishonest.

General_Effort@lemmy.world · 2 days ago

Scaling laws are disputed

Not in general.

There is not enough permissively licensed text to train models of any size, and what there is lacks diversity. Wikipedia, government documents, Stack Overflow, century-old stuff, … An LLM trained on that is not likely to be called “general purpose”, because scaling laws. Sometimes such small models are trained for research purposes but I don’t have a link ready. They are not something you’d actually use. Perhaps you could look at Microsoft’s Phi series of models. They are trained on synthetic data, though that’s probably not what you are looking for.

General_Effort@lemmy.world · 2 days ago

“data dignity”,

Apparently, this is about creating a new kind of intellectual property; a generalized and hypercharged version of copyright that applies to all sorts of data.

Maybe this is a touchy subject, but to me this seems like an extremely right-wing approach. Turn anything into property and the magic market will turn everything into rainbows and unicorns. Maybe you feel differently about this?

Regardless of classification, such a policy is obviously devastating to society. Of course, your argument does not consider society but only the feelings of some individuals. Feelings are valid but one has to consider the effect of such a policy, too. Not every impulse should be given power. This is especially true where such feelings are strongly influenced by culture and circumstance. For example, people in the US and the UK have, on the whole, rather different feelings on being ruled by a king. I don’t feel that I should be able to control what other people do with data, maybe because I’m a bit older and was socialized into that whole information-wants-to-be-free culture. I don’t even remember having a libertarian phase.

How would you pitch this to me?

General_Effort@lemmy.world · 2 days ago

This has all been tested and is being continuously retested. Start here, for example: https://en.wikipedia.org/wiki/Neural_scaling_law

I know, on Lemmy you will get the impression that engineers and scientists are all just bumbling fools who are intellectually outclassed by any high schooler with internet access. But how likely is that, really?

General_Effort@lemmy.world · 3 days ago

Meta is defending because they trained on books3 which contained all of Bibliotik. https://en.wikipedia.org/wiki/The_Pile_(dataset)

General_Effort@lemmy.world · 3 months ago

Top EU Court Says There’s No Right To Online Anonymity, Because Copyright Is More Important

General_Effort@lemmy.world · 3 months ago

Mozilla Builders Accelerator 2024 Advancing innovation in open source AI