Admin on the slrpnk.net Lemmy instance.

He/Him or whatever you feel like.

XMPP: [email protected]

Avatar is an image of a baby octopus.

  • 32 Posts
  • 421 Comments
Joined 3 years ago
Cake day: September 19th, 2022




  • No one is disputing that in theory (!) Anubis offers very little protection against an adversary that specifically tries to circumvent it, but we are dealing with a bull-in-a-china-shop kind of situation. The AI companies simply don’t care if they kill off small, independently hosted web applications with their scraping, and Anubis is the mouse that is currently sufficient to make them back off.

    And no, forced site reloads are extremely disruptive for web applications and often cause a lot of extra load for re-authentication etc. It is not as easy as you make it sound.





  • If you check for a GPU (not generally a bad idea), you will have the same people that currently complain about JS complaining about this breaking their anti-fingerprinting browser addons.

    But no, you obviously can’t spoof PoW; that’s the entire point of it. Whether you do the calculation in JavaScript or not doesn’t really matter for it to work.

    In its current shape Anubis has zero impact on usability for 99% of site visitors; not so with meta refresh.
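    To illustrate why PoW can’t be spoofed, here is a minimal sketch in Python (this is not Anubis’s actual protocol; the function names and the hex-prefix difficulty scheme are assumptions for illustration). The server hands out a random challenge and only accepts a nonce whose hash meets the difficulty target, so the client has no shortcut other than actually doing the hashing work, while verification costs the server a single hash:

    ```python
    import hashlib
    import itertools

    def solve_pow(challenge: str, difficulty: int) -> int:
        """Client side: brute-force a nonce so that SHA-256(challenge + nonce)
        starts with `difficulty` zero hex digits. Expected cost grows ~16x
        per extra digit, and there is no way around doing the hashes."""
        for nonce in itertools.count():
            digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
            if digest.startswith("0" * difficulty):
                return nonce

    def verify_pow(challenge: str, nonce: int, difficulty: int) -> bool:
        """Server side: a single hash, cheap regardless of how expensive
        solving was for the client."""
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        return digest.startswith("0" * difficulty)

    nonce = solve_pow("example-challenge", 3)
    assert verify_pow("example-challenge", nonce, 3)
    ```

    The asymmetry (thousands of hashes to solve, one hash to verify) is what makes the scheme unspoofable whether the client runs it in JavaScript, native code, or anything else.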



  • You are arguing a strawman. Anubis works because most AI scrapers (currently) don’t want to spend extra on running headless Chromium, and because it slightly incentivises AI scrapers to correctly identify themselves as such.

    Most of the AI scraping is frankly just shoddy code written by careless people who don’t want to DDoS the independent web, but can’t be bothered to actually fix that on their side.


  • AI scraping is a massive issue for specific types of websites, such as git forges, wikis and, to a lesser extent, Lemmy etc., that rely on complex database operations that cannot be easily cached. Unless you massively overprovision your infrastructure, these web applications come to a grinding halt as scrapers constantly max out the available CPU power.

    The vast majority of the critical commenters here seem to talk from a point of total ignorance about this, or assume operators of such web applications have time for the hypervigilance needed to constantly monitor and manually block AI scrapers (which do their best to circumvent more basic blocks). The realistic options for such operators right now are: Anubis (or similar), Cloudflare, or shutting down their servers. Of these, Anubis is clearly the least bad option.










  • This is a misreading of the situation, but I guess it makes for a better headline?

    The GDPR etc. only came to be due to a specific political constellation in which progressive privacy advocates and nationalistic forces formed an uneasy coalition. But now the political balance has shifted and the responsible EU commissioner has also changed.

    Basically, what we see now is that the nationalistic forces have rejoined the neoliberal ones, who promise that if the EU just deregulates the sector hard enough, there will be EU-based big tech to compete with the US-based companies. This is of course total bollocks, and an EU-based big tech wouldn’t be significantly better than a US- or China-based one, but it gets the nationalistic faction’s support.