A high schooler built a website that lets you challenge AI models to a Minecraft build-off

As conventional AI benchmarking techniques prove inadequate, AI builders are turning to more creative ways to assess the capabilities of AI models. For one group of developers, that means Minecraft, the Microsoft-owned sandbox-building game.

The website Minecraft Benchmark (or MC-Bench) was developed collaboratively to pit AI models against each other in head-to-head challenges, in which they respond to prompts with Minecraft creations. Users can vote on which model did a better job, and only after voting can they see which AI made each Minecraft build.

Image credits: Minecraft Benchmark
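The head-to-head voting described above can be pictured with a small sketch. The article does not say how MC-Bench turns votes into rankings, so the win-rate scoring below is only an assumption, and the function names and model names (`win_rates`, `leaderboard`, `model-x` and so on) are hypothetical.

```python
from collections import defaultdict


def win_rates(votes):
    """Compute each model's share of wins from blind pairwise votes.

    `votes` is an iterable of (model_a, model_b, winner) tuples, where
    `winner` must be one of the two models shown to the voter.
    Assumption: a plain win rate stands in for the real scoring method,
    which the article does not describe.
    """
    wins = defaultdict(int)
    games = defaultdict(int)
    for model_a, model_b, winner in votes:
        if winner not in (model_a, model_b):
            raise ValueError(f"winner {winner!r} was not part of the matchup")
        games[model_a] += 1
        games[model_b] += 1
        wins[winner] += 1
    return {model: wins[model] / games[model] for model in games}


def leaderboard(votes):
    """Rank models by win rate, highest first; ties are broken by name."""
    rates = win_rates(votes)
    return sorted(rates.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Hypothetical votes; model identities would be hidden until after voting.
    sample_votes = [
        ("model-x", "model-y", "model-x"),
        ("model-x", "model-z", "model-z"),
        ("model-y", "model-z", "model-z"),
    ]
    for name, rate in leaderboard(sample_votes):
        print(f"{name}: {rate:.0%}")
```

A plain win rate ignores how strong each opponent was, which is one reason a real leaderboard might prefer a rating system instead.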

For Uday Singh, the 12th grader who started MC-Bench, the value of Minecraft is not so much the game itself as the familiarity people have with it. After all, it is the best-selling video game of all time. Even people who have not played the game can still judge which blocky representation of a pineapple is better realized.

“Minecraft allows people to see the progress [of AI development] much more easily,” Singh said. “People are used to Minecraft, used to the look and the vibe.”

MC-Bench currently lists eight people as volunteer contributors. Anthropic, Google, OpenAI, and Alibaba have subsidized the project’s use of their products to run benchmark prompts, according to MC-Bench’s website, but the companies are not otherwise affiliated.

“Currently we are just doing simple builds to reflect on how far we’ve come from the GPT-3 era, but [we] could see ourselves scaling to these longer-form plans and goal-oriented tasks,” Singh said. “Games might just be a medium to test agentic reasoning that is safer than real life and more controllable for testing purposes, making it more ideal in my eyes.”

Other games like Pokémon Red, Street Fighter, and Pictionary have been used as experimental benchmarks for AI, in part because the art of benchmarking AI is notoriously tricky.

Researchers often test AI models on standardized evaluations, but many of these tests give AI a home-field advantage. Because of the way they are trained, models are naturally gifted at certain narrow kinds of problem-solving, particularly problems that require rote memorization or basic extrapolation.

Put simply, it is hard to glean what it means that OpenAI’s GPT-4 can score in the 88th percentile on the LSAT but cannot discern how many Rs are in the word “strawberry.” Anthropic’s Claude 3.7 Sonnet achieved 62.3% accuracy on a standardized software engineering benchmark, but it is worse at playing Pokémon than most five-year-olds.

MC-Bench is technically a programming benchmark, since the models are asked to write code to create the prompted build, such as “Frosty the Snowman” or “a charming tropical beach hut on a pristine sandy shore.”

But it is easier for most MC-Bench users to evaluate whether a snowman looks better than to dig into code, which gives the project wider appeal, and thus the potential to collect more data about which models consistently score better.
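To give a sense of what “writing code to create a build” can mean, here is a purely illustrative sketch of a script that describes a snowman as a set of block placements. The article does not describe the interface MC-Bench gives the models, so the output format (a mapping from coordinates to block names) and the helper names `sphere` and `snowman` are assumptions made for this example.

```python
def sphere(center, radius, block):
    """Return {(x, y, z): block} for every grid cell inside a sphere."""
    cx, cy, cz = center
    cells = {}
    for x in range(cx - radius, cx + radius + 1):
        for y in range(cy - radius, cy + radius + 1):
            for z in range(cz - radius, cz + radius + 1):
                inside = (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= radius ** 2
                if inside:
                    cells[(x, y, z)] = block
    return cells


def snowman(base=(0, 0, 0)):
    """Stack three shrinking spheres: two of snow, topped by a pumpkin head.

    Block names are illustrative; later spheres overwrite earlier ones
    where they touch, so each coordinate holds exactly one block.
    """
    bx, by, bz = base
    build = {}
    build.update(sphere((bx, by + 3, bz), 3, "snow_block"))       # body, y = 0..6
    build.update(sphere((bx, by + 8, bz), 2, "snow_block"))       # torso, y = 6..10
    build.update(sphere((bx, by + 11, bz), 1, "carved_pumpkin"))  # head, y = 10..12
    return build


if __name__ == "__main__":
    build = snowman()
    print(f"{len(build)} blocks placed")
```

Voters never need to read a script like this. They only compare the rendered results, which is the point the article makes about the project's appeal.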

Whether those scores amount to much in the way of AI usefulness is up for debate, of course. Singh asserts that they are a strong signal, though.

“The current leaderboard reflects quite closely my own experience of using these models, which is unlike a lot of pure text benchmarks,” Singh said. “Maybe [MC-Bench] could be useful to companies to know if they’re heading in the right direction.”
