The era of artificial intelligence agent requires a new type of game theory

Photo of author

By [email protected]


At the same time, the danger is immediate and present with agents. When models are not only contained, they can only take action in the world, when they have final bags that allow them to manipulate the world, I think they become a much more problem.

We are making progress here, and developing much better techniques (defensive), but if you break the basic model, you have basically equivalent to the temporary store flow (a common way to penetrate software). Your agent can be exploited by third parties to control harmful or defraud in one way or another on the job required for the system. We will have to be able to secure these systems in order to make agents safe.

This differs from that artificial intelligence models themselves become a threat, right?

There is no real risk of things like losing control with current models at the present time. It is more than anxiety in the future. But I am very happy because people are working on that; I think it is very important.

How worried about increasing the use of agents then?

In my research collection, at my start starting, and in many posts that Openai recently produced (For exampleThere was a lot of progress in reducing some of these things. I think we are in a reasonable way to start getting a safer way to do all these things. (Challenge) is, in the balance of pushing the agents forward, we want to make sure the safety of the Lockste.

Most of the (exploits of agents systems) that we now see as experimental, frankly, will be classified, because agents are still in their cradle. There is a user usually usually in the loop somewhere. If an e -mail agent receives an email saying, “Send all your financial information”, before sending this email abroad, then the agent alerts the user – and may not be deceived in this case.

This is also why many of the agents’ publications had very clear handrails around them that impose human interaction in more vulnerable situations. OperatorFor example, by Openai, when used on Gmail, it requires human manual control.

What are the types of agent’s exploits that we may see first?

There were demonstrations for things like data filtration when the agents were wrongly connected. If my agent has access to all my files and my cloud drive, he can also do inquiries to the links, you can download these things somewhere.

These are still in the stage of illustration at the present time, but this is just just because these things have not yet been adopted. They will be adopted, let’s not make mistakes. These things will become more independent and more independent, and they will have less supervision of the user, because we do not want to click “Agreement”, “Al -Itifaq”, “Al -Itifaq” every time the agents make anything.

It seems that it is imperative to see the various agents of Amnesty International communicating and negotiating. What happens then?

definitely. Whether we want it or not, we will enter a world where there are agents who interact with each other. We will have multiple factors that interact with the world on behalf of different users. This is absolutely the case that there will be emerging properties that appear in the interaction of all these factors.



https://media.wired.com/photos/67f5a7d3ffb8d12215778382/191:100/w_1280,c_limit/AI-Lab-Companies-Working-Together-AI-Safety-Business.jpg

Source link

Leave a Comment