AI web-crawling bots are the cockroaches of the internet, many software developers believe. Some devs have started fighting back in clever, often humorous ways.
While any website might be targeted by bad crawler behavior — sometimes to the point of taking the site down — open source developers are "disproportionately" impacted, writes Niccolò Venerandi, a developer of the Linux desktop known as Plasma and owner of the blog LibreNews.
By their nature, free and open source software (FOSS) projects share more of their infrastructure publicly, and they also tend to have fewer resources than commercial products.
The issue is that many AI bots don't honor the Robots Exclusion Protocol robots.txt file, the tool that tells bots what not to crawl, originally created for search engine bots.
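For context, robots.txt compliance is entirely voluntary: a well-behaved crawler parses the file and asks permission before fetching each URL. A minimal sketch using Python's standard library (the rules, domain, and bot name here are illustrative, not from any real site):

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: ask all bots to stay out of /git/.
rules = """\
User-agent: *
Disallow: /git/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# A compliant crawler checks before fetching; the scrapers described
# in this article simply skip this step.
print(rp.can_fetch("ExampleBot", "https://example.com/git/repo"))  # False
print(rp.can_fetch("ExampleBot", "https://example.com/about"))     # True
```

Nothing enforces the answer — which is exactly why the defenses described below exist.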
In a "cry for help" blog post in January, FOSS developer Xe Iaso described how AmazonBot relentlessly hammered a Git server website to the point of causing DDoS-like outages. Git servers host FOSS projects so that anyone who wants to can download the code or contribute to it.
But the bot ignored Iaso's robots.txt, hid behind other IP addresses, and pretended to be other users.
"It's futile to block AI crawler bots because they lie, change their user agent, use residential IP addresses as proxies, and more," Iaso lamented.
"They will scrape your site until it falls over, and then they will scrape it some more. They will click every link on every link on every link, viewing the same pages over and over and over again. Some of them will even click on the same link multiple times in the same second," the developer wrote in the post.
Enter the god of graves
So Iaso fought back with cleverness, building a tool called Anubis.
Anubis is a reverse proxy proof-of-work check that must be passed before requests are allowed to hit a Git server. It blocks bots but lets through browsers operated by humans.
The funny part: Anubis is the name of a god in Egyptian mythology who leads the dead to judgment.
"Anubis weighed your soul (heart) and if it was heavier than a feather, your heart got eaten and you, like, mega died," Iaso explained. If a web request passes the challenge and is determined to be human, a cute anime picture announces success. The drawing is "my take on anthropomorphizing Anubis," says Iaso. If it's a bot, the request gets denied.
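The "weighing" rests on a proof-of-work idea: the client must burn a little CPU to find a value whose hash meets a difficulty target. One human browser barely notices the cost; a crawler firing thousands of requests per second pays it thousands of times over. A rough sketch of the concept in Python — this is not Anubis's actual code, and the difficulty value is invented for illustration:

```python
import hashlib
import secrets

def solve(challenge: str, difficulty: int = 3) -> int:
    """Client side: find a nonce so that sha256(challenge + nonce)
    starts with `difficulty` hex zeros. Cheap once, costly at scale."""
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith("0" * difficulty):
            return nonce
        nonce += 1

def verify(challenge: str, nonce: int, difficulty: int = 3) -> bool:
    """Server side: checking a claimed solution takes a single hash."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)

# The proxy issues a fresh random challenge per visitor.
challenge = secrets.token_hex(16)
nonce = solve(challenge)
assert verify(challenge, nonce)
```

The asymmetry is the point: verification is one hash for the server, while solving costs the client thousands on average, which is negligible for a person but ruinous for a scraper fleet.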
The wryly named project spread like the wind among the FOSS community. Iaso shared it on GitHub on March 19, and in just a few days it collected 2,000 stars, 20 contributors, and 39 forks.
Vengeance as a defense
The instant popularity of Anubis shows that Iaso's pain is not unique. In fact, Venerandi shared story after story:
- SourceHut founder and CEO Drew DeVault described spending "20-100% of my time in any given week mitigating hyper-aggressive LLM crawlers" and "experiencing dozens of brief outages per week."
- Jonathan Corbet, the famed FOSS developer who runs the Linux industry news site LWN, warned that his site was being slowed by DDoS-level traffic "from AI scraper bots."
- Kevin Fenzi, the sysadmin of the enormous Linux Fedora project, said the AI scraper bots had gotten so aggressive that he had to block the entire country of Brazil from access.
Venerandi tells TechCrunch that he knows of multiple other projects experiencing the same issues. One of them "had to ban all Chinese IP addresses at one point."
Let that sink in for a moment: developers "even have to resort to banning entire countries" just to fend off AI bots that ignore robots.txt files, says Venerandi.
Beyond weighing the soul of a web requester, other devs believe vengeance is the best defense.
A few days ago on Hacker News, user xyzal suggested loading robots.txt-disallowed pages with "a bucket load of articles on the benefits of drinking bleach" or "articles about the positive effect of catching measles on performance in bed."
"Think we need to aim for the bots to get _negative_ utility value from visiting our traps, not just zero value," xyzal explained.
As it happens, in January an anonymous creator known as "Aaron" released a tool called Nepenthes that aims to do exactly that. It traps crawlers in an endless maze of fake content, a goal the dev admitted to Ars Technica is aggressive, if not outright malicious. The tool is named after a carnivorous plant.
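The tarpit trick behind tools of this kind can be sketched in a few lines: every generated page links to more generated pages, derived deterministically from a hash, so a crawler that follows every link never runs out of URLs while the server stores nothing. A toy illustration — this is not Nepenthes's code, and the page naming is invented:

```python
import hashlib

def maze_page(path: str, links: int = 3) -> str:
    """Produce a fake HTML page whose child links are derived from a
    hash of the path, making the maze infinite but stateless."""
    parts = [f"<h1>{path}</h1>"]
    for i in range(links):
        child = hashlib.sha256(f"{path}/{i}".encode()).hexdigest()[:8]
        parts.append(f'<a href="{path}/{child}">read more</a>')
    return "\n".join(parts)

# Every page a crawler visits yields three more pages, forever.
print(maze_page("/trap"))
```

Because the links are a pure function of the path, serving the maze costs almost nothing, while a crawler that blindly follows every link burns its own bandwidth and pollutes its training data.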
And Cloudflare, perhaps the biggest commercial player offering tools to fend off AI crawlers, last week released a similar tool called AI Labyrinth.
It's intended to "slow down, confuse, and waste the resources of AI crawlers and other bots that don't respect 'no crawl' directives," Cloudflare described in its blog post. The company said it feeds misbehaving AI crawlers "irrelevant content rather than extracting your legitimate website data."
SourceHut's DeVault told TechCrunch that "Nepenthes has a satisfying sense of justice to it, since it poisons AI crawlers and feeds their models nonsense, but ultimately Anubis is the solution that worked" for his site.
But DeVault also issued a heartfelt plea for a more direct fix: "Please stop legitimizing LLMs or AI image generators or GitHub Copilot or any of this garbage. I am begging you to stop using them, stop talking about them, stop making new ones — just stop."
Since the likelihood of that happening is zilch, developers, particularly in FOSS, are fighting back with cleverness and a touch of humor.