In part, this is what Microsoft Recall is about: scraping end users’ data at will to sort and feed to its LLMs, without the user ever seeing what is being scraped or having any real, lasting ability to shut Recall off and keep it shut down.
While I am aware that MS insists none of that is true, it is fact that 1) the snapshotted and OCR’d Recall data is now stored in an encrypted database that takes higher than average user skill to get into; 2) even users who turned off Recall saw it turned on again at the next Windows Update; and 3) even after MS said they were backing off Recall, MS continued to partner with hardware makers to create computers bundled with Windows 11 on top of the extra GPU necessary for processing all these Recall snapshots without making that sluggish Windows bloat even more sluggishly bloated than it already was.
So why all that money and effort, even as they claimed to be backing away from it, just to help a hypothetically forgetful user here and there? Data harvesting was always part of the payoff. It’s why they were and are very willing to piss off a huge part of their own consumer base around the world by ending Windows 10 unnecessarily, and why even now they keep ramming Recall shit down the pipe when literally NO ONE wants it.
They get your data. At will. And as much of it as they like, without you ever having the opportunity to oversee what they’re getting, much less curate it. And after feeding it to their LLMs, they get to aggregate and broker it to their “partners” as well. Never forget what MS did in Palestine and the partners they can and will gladly work with, all based on massive collections of quietly gathered user data that either should not legally exist, or is not known outside of MS and its partners to exist at all.
You know Microsoft isn’t about the user experience the moment they removed free games from their distro.
Calling Windows a distro, while technically true, feels offensive
In part, this is what Microsoft Recall is about: scraping end users’ data at will to sort and feed to its LLMs
That’s also what Google has always done. Want a large data set of emails? Look at our new free service, Gmail! Need a lot of images to train machine learning vision models? Check out our newest free backup tool, Google Photos! and so on. When they want one particular data type, they launch a free service that just so happens to collect this exact data type from millions of users.
How is it a crisis? I’m expecting/hoping LLMs will just get increasingly worse as they are fed on their own slop, until they collapse into unusability and the world finally returns to sanity.
The crisis is that if the internet is bad enough that they can’t train LLMs on it, it’s also useless for us humans. And if the LLMs leave because there is no information and we come back, eventually we will generate enough content that it will be worth it again.
This is said as though it isn’t an immensely expensive endeavour to run these things and the only reason they’re this prevalent right now is the overspeculation and starved growth of US tech companies.
I don’t get the point you are trying to make. I never said any of that. My point is that the information crisis caused by LLMs that they are referring to affects everyone. Nothing else. No hidden meanings.
Your original comment sparks fear that failure of LLMs as a mass producer of knowledge would only be temporary until humans repopulated the internet with quality content.
Why would they come back after they fail if it costs billions of dollars to run them in the first place? You literally just agreed with someone else making the same point. Jfc.
I don’t have time to write (and filter out) my opinions right now, but basically I think LLMs will find their place and will settle into a duopoly or triopoly.
I didn’t ask for them.
You really did. But whatever.
“Eventually” is doing some heavy lifting. The costs of these data centers and the power plants to run them have to come from somewhere.
I don’t know if or when the chickens come home to roost, but it could go badly for the US and China to find out they spent trillions to make a million versions of AI shrimp Jesus.
I want to believe that the silver lining is that HOPEFULLY MAYBE (now these words are doing the heavy lifting, for sure) this will mean that when the AI craze goes away they will have spent so much on infrastructure that we will get cheap energy, cloud computing, and GPUs.
That’s possible, very optimistic, but hey, good luck to us.
The crisis is that companies will have to pay fair prices for human labor again, which will lead to less profits for the precious shareholders
Not just precious shareholders, it’s a massive bubble that will impact the economy and people’s retirement when it bursts. They keep dumping more money into AI even though there are no returns.
So at this point we’re gonna have to go back to reading books…
Books made before 2019. Amazon is absolutely filled with AI generated books nowadays.
In fact, this whole “consume media only from 2010 and earlier” idea is getting more appealing by the day. I’d rather watch an anime from the 80’s where each frame was drawn by a human hand and somebody spent a week encoding it to extract all the details from the original analogue source, and the subtitles were made by a person who considered each nuance carefully as if their life depended on it, rather than watch a 2025 sequel to a prequel to a reboot of an existing IP where half the assets are AI, the subtitles are AI, the script is AI, and it’s just the most generic mass appealing thing ever made.
It’s pretty sad, creepy and hopeless, but this is exactly how I have been feeling lately with YouTube videos. If it’s made post GPT, I am not inclined to watch it unless it’s a channel I know has a stated anti-AI position, because at least then I know I’m getting human input. They are building up Plato’s cave around us stone by stone, we can’t even move out of the way, we are getting imprisoned, and the few of us who see it happening and shout are being drowned out by the noise of the billions around us who happily hum along.
The system itself needs to come down, with violence, or we won’t make it, none of us, and honestly even if it does, I am not sure we are going to survive. I feel like I’m playing the fiddle on the Titanic.
Since a lot of channels have started stating they use no AI, I have unsubscribed from quite a few that don’t explicitly say so.
The 80s and 90s have some of my favorite anime, movies, and TV shows anyway. I can’t really think of any recent masterpieces aside from Interstellar and The Martian.
If you are a USian then you probably have access to a public library.
If you go to the website of said library, usually a city.gov or countystate.gov, you can create an account.
After you create said account, you can download apps (they will tell you which, but normally Libby or Hoopla; Hoopla hasn’t been working for me, but that might be a me issue).
Use the ID from the website on the apps and you can check out books to read. You also have access to comics and audiobooks. The Invincible comic is good, you should check it out.
Weird that third party apps, made by corporate entities, are needed for this. They’re public libraries funded with public money, it should be one unified backend with libre applications.
Then the libraries would have to pay for hosting, so they’d have to be the ones selling user data to advertisers and stuff. Hence the extra degree of separation / “plausible deniability”
If the hivemind cared about actual important priorities (instead of just the desire for low oil prices), then the Internet Archive would face more pressure to improve its infrastructure and the authorities would face more pressure to leave the Internet Archive tf alone.
Nostr can fix this someday
Then the libraries would have to pay for hosting, so they’d have to be the ones selling user data to advertisers and stuff. Hence the extra degree of separation / “plausible deniability”
What? Libraries don’t sell data to advertisers to acquire, maintain and lend books. Why would they do that to provide ebooks? You unitedstatians got used to this bizarre mix of private corporations and public services and ended up accepting the premise that it’s somehow mandatory.
I’m not accepting the premise, that’s why I use nostr on the internet and actual library buildings off the internet.
But it is mandatory, whether that’s acceptable or not. Authorities in the US aren’t gonna suddenly change their mind when it’s your local librarian instead of the Internet Archive trying to run things differently from the corporations. The librarian needs better infrastructure to stand up to the authorities and break away from the corporate way
You likely need to physically go to the library and prove your residency but yes. Then you can do the above.
My username isn’t random coincidence.
Most new books are AI generated now.
Human forums. Throw in a tar pit for any scrapers
Explains why my personal blog, wiki, and git repo keep getting hammered by hordes of AI company scrapers. If AI was intelligent, they’d download a single snapshot every month or so and share. But no, eight different scrapers using thousands of different IP addresses (to evade my fail2ban measures) each have to follow every single blame and diff link when a simple git clone operation would get them the hundreds of megabytes of content in one go.
They are getting better, though. More hits are to RecentChanges on my wiki, so there seem to be some optimizations going on. But I refuse to increase my operating costs beyond a few USD/month to serve AI bots when I know barely anyone human visits.
Everything converges to generic sameness
When I hit this sentence in the OP, I realized AI is going to remain very popular with the average joe for a long time.
People who are tech literate, actively curious about the natural world, and do crazy shit like care about humanity (so most people reading this, most likely) will still reject it for the junk it is. But it seems the vast majority of people around me are not like Lemmy users. I mean, they are called normies for a reason, and I don’t mean that in a derogatory way.
Generic sameness seems to be what the rat race pushes people towards. Maybe being burned out and having the economy constantly innovating new ways to bleed you dry makes the pre-packaged commoditized comforts from the advertisements too easy to accept. I look around and I see people anger-driving their pickup trucks and luxury SUVs to their jobs that they hate so that they can afford the cars they drive to get there. Plus they need to be able to afford beer and snacks for the game after they fill up those gas tanks!
These people don’t care if the stories and tiktoks scrolling past their glazed-over eyes are AI generated. It only matters if those things can shake that stubborn drop of dopamine loose that just won’t fall from the faucet. Just get through the day so we can do it again tomorrow.
Could you guys stop dumping your trash in the forest please? It obstructs my garbage trucks which I send to the forest to dump garbage in.
On top of this, the scrapers that feed the AIs are creating more and more traffic, and therefore load, on sites that did not have them before.
I can’t prove it, but I think that for a long time already most articles have been copied and pasted, or are some kind of summary, which might have been produced automatically. Someone might write a real article on something and then hundreds of sites copy it without adding anything.
The Huffington Post Effect.
Unfortunately, there’s another side of this coin. The “original” content sources value freshness to grab attention and hyperbole to generate interest. The end result is Drudge Report / Breitbart / Alex Jones / Joe Rogan journalism that mills out innuendo, conspiracy theory, and quack medicine as Breaking News. And that becomes the “original journalism” all the other copypasta outlets reproduce ad nauseam.
Yeah, I get the main theory you’re going with, and it’s been rampant even without AI bullshit. When sites became traffic-driven to an extreme, they started trying to grab whatever was the most rage-inducing, verified or not. I forget the exact example, but there was a guy who back-traced a rage bait article and it was like 10 articles deep, and at the bottom it was a misquoted tweet that was complete misinformation.
The problem is that the resource they consume to feed the AI (human-generated content) has become a limited resource, completely mined.
They could pay people to write, i.e., news agencies pay writers to write and AI sites are one of their clients.
You should get DMs from Anthropic offering $50 for your week’s posts and comments…
Instead they want to pretend they still have room to grow for free. But they can’t.
(That is just basic economic theory, I want those companies to fuck off already)
There’s no way an AI generated article costs less than a cent.
So humans writing articles cost more than the billions they’re pumping into AI? I very much doubt it.
How could they have ever known about the possibility of the thing that literally everyone everywhere told them almost immediately a couple of years ago when this fad really started charting (or perhaps sharting is more appropriate)?
The incumbents have their training data that can also be used for the next generations of AI. This is just them helping others to pull up the ladder to avoid more competitors.
Well, technically, the AI companies aren’t making any profits so the actual cost is higher, and also the revenues from the AI articles are declining because people aren’t interacting with them.
Mirrors face, fingers point.
Time to start working on the Blackwall