If Claude Plays Pokémon is supposed to offer a glimpse of the future of artificial intelligence, it is not a very convincing one. For the past month and counting, Claude has been playing Pokémon Red. Over several runs, it has failed to beat the roughly 30-year-old game. Yet for David Hershey, the project's lead developer, the showing has been a success.
"I wanted a place where I could understand how Claude deals with situations in which it needs to work over a very long period of time," Hershey tells me over a video call. As part of his day job at Anthropic, Hershey works on the go-to-market team, where he helps the company's customers build their own agents (more on that in a moment). He first began work on Claude Plays Pokémon as a side project around the release of Claude 3.5 Sonnet last June.
As you can probably guess from the name, the project was partially inspired by Twitch Plays Pokémon, which debuted in 2014 and saw 1.16 million people take part in an attempt to beat Pokémon Red using only inputs typed into the stream's chat box. Hershey was not the first Anthropic employee to try to mold Claude into a Pokémon League champion, but his project took on a life of its own.
In the early days of the project, it was a big deal when Claude managed to leave Red's house and find Professor Oak. "I spent some ungodly hours tinkering to get it to make that kind of progress," Hershey tells me. He would update his colleagues on Claude's progress in an internal Slack channel. At that point, most of the company was not paying attention, and there was no plan to share the project with the world.
However, Hershey made a habit of revisiting the project with each new major model release from Anthropic: first with the Claude 3.5 Sonnet update last fall, and again more recently with 3.7 Sonnet. "It's how I go see, 'What is this new model? How does it work? What can I learn about it?'" Hershey explains. With Claude 3.7 Sonnet, the version of Claude playing the game now, it was the first time, he says, that you could see "signs of life."
Inside Anthropic, the hope was that Claude would get better at trying different strategies and adjusting its approach when things did not go according to plan. With Pokémon Red, the company got to see Claude do those things in real time. "[Claude 3.7 Sonnet] spends less time stuck on assumptions," Hershey says. "You will still see it make a guess, then spend a number of hours believing it's true and making dumb decisions in the meantime, but the previous models would keep doing that forever."
You can literally watch Claude form and run with these assumptions. Each slow step in the game is preceded by a paragraph of text from the AI ("Encountered a wild [Pokémon] while trying to move to (24,24). According to my strategy, I should run from this battle to conserve resources") followed by a single button press. Then it re-evaluates the state of the game and does it all over again.
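To make that rhythm concrete, here is a minimal sketch of a reason-then-press loop on a toy grid. It is not the project's code: `ToyGame`, `decide`, `run` and the button table are all hypothetical stand-ins, and the hand-written `decide` function only takes the place where a model's reasoning would go.

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins: none of these names come from the actual project.
BUTTONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


@dataclass
class ToyGame:
    """A tiny grid world standing in for the emulator."""
    position: tuple = (0, 0)
    walls: set = field(default_factory=set)

    def press(self, button):
        dx, dy = BUTTONS[button]
        target = (self.position[0] + dx, self.position[1] + dy)
        if target not in self.walls:  # a blocked move silently does nothing
            self.position = target


def decide(position, goal):
    """Placeholder for the model: returns (reasoning text, one button)."""
    x, y = position
    gx, gy = goal
    if x != gx:
        button = "right" if gx > x else "left"
    else:
        button = "down" if gy > y else "up"
    note = f"At {position}, heading to {goal}; pressing {button}."
    return note, button


def run(game, goal, max_steps=50):
    """Alternate between one paragraph of reasoning and one button press."""
    log = []
    for _ in range(max_steps):
        if game.position == goal:
            break
        note, button = decide(game.position, goal)  # reason first
        log.append(note)
        game.press(button)  # then exactly one press
        # the next pass starts from the freshly observed position
    return log
```

Put a wall in the way, as in `ToyGame(walls={(1, 0)})` with a goal to the right, and this naive policy presses the same button until the step limit runs out. That is a crude picture of what being stuck on an assumption looks like.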
If you are watching Claude fumble through Pokémon Red as a fan of the game, a model that spends "less time stuck on assumptions" can seem like a minor achievement, especially when the chatbot frequently gets stuck in areas like Viridian Forest, sometimes for days, given the level design. Still, it is a notable sign of the kind of AI system Claude 3.7 represents.
Like many modern frontier AI systems, Claude 3.7 Sonnet is a reasoning model, which means it is designed to tackle problems by breaking them into smaller pieces. "Many of our customers are interested in how effective Claude is as an agent," Hershey explains. For the uninitiated, AI agents, or agentic AI, are systems designed to plan and carry out complex tasks without human supervision. Right now, most people think of AI as an empty chat box waiting for a question, but chatbots are only the consumer face of the industry; agents are an incremental but important step toward the promise of artificial general intelligence.
From that perspective, a few things make Claude Plays Pokémon interesting. First, there is the fun fact that Hershey delegated much of the programming that made the project possible to Anthropic's coding agent, including an overlay that allows Claude to make sense of Pokémon Red's game world.
Second, and more importantly, Claude was not pre-trained to play Pokémon Red. The chatbot knows some basics about the game, such as the name of each gym leader and the order in which the player must beat them, but it does not have hundreds of years of gameplay experience like some specialized AI systems. "You can drop a model into a game with no preparation, no instructions, and it can learn everything itself," he says. "I aim to be as close to that end as possible."
Even so, Hershey had to give Claude some help. I have already mentioned the overlay that lets it interpret Pokémon Red's interface. Pixel art is something all AI systems struggle with, and 3.7 Sonnet is no exception. As humans, our imaginations do a great job of filling in the details that a few pixels merely suggest. What's more, Claude does not "see" the way we do.
If you watch closely, you will notice that each time it moves the player character, Claude makes a few inputs before re-evaluating its position. Between those frames, Claude gets no sensory input. It cannot see Red walking, and it does not "hear" when its inputs run into a tree or another obstacle. Claude's "poor eyesight" is one of the main reasons it struggles with the game; in fact, Hershey had to give the chatbot a way to read the game's memory, so it had something to fall back on if it misread the screen.
If the goal of the project were simply for Claude to beat Pokémon Red, that would have been easy. Hershey could have programmed a walkthrough of the game for the chatbot to follow, but at that point all he would have been testing is how well Claude follows a rigid set of instructions. "Claude is very good at that," Hershey says. "I knew that. We all know that."
Instead, left to its own devices, the new model has shown it is better at planning, coming up with new strategies and eventually trying something different when its assumptions are wrong. One of the more novel solutions Claude came up with, during its third run through the game, was to deliberately make its Pokémon faint so it could escape Mt. Moon.
Still, Claude could be much better at both short- and long-term planning. In the same example I just mentioned, Claude deleted all its notes on Mt. Moon after ending up at the nearby Pokémon Center, mistakenly believing it had successfully navigated the cave. One of its more promising runs ended after Claude failed to recognize that it needed to talk to Bill to advance in the game. It got stuck in an endless loop of bad decisions.
"Going forward, I don't know how useful it will be internally as a benchmark. It's possible that with a small set of skills, Claude gets a bit better and crushes the game, and then this benchmark isn't interesting," Hershey admits. "It could also be that there are things I don't fully understand about what will make our next model good, and then we still learn a lot of additional things along the way."
As for what comes next, Hershey says he has no long-term strategy for Claude Plays Pokémon. "I've spent a lot of time (my wife would say too much time) staring at this thing," he says, laughing. I also get the sense Hershey is not quite ready to close the book on the project. "I imagine whenever a new model comes out, I'll play Pokémon with it, and I might show it to the world as well."
Until then, Anthropic continues, after a recent reset, to stream Claude Plays Pokémon on Twitch. The project has been successful enough to inspire an independent developer to launch a Gemini Plays Pokémon stream, and if I had to guess, we will see more imitators before long.
This article originally appeared on Engadget at https://www.engadget.com/ai/claude-isnt-a-great-pokemon-player-and-that-okay-151522448.html?src=rss