Learning to play Diplomacy is a big deal for several reasons. Not only does the game involve multiple players who all move at the same time, but each turn is preceded by a brief negotiation in which players chat in pairs to form alliances or gang up on rivals. After the negotiation, players decide which pieces to move and whether to honor or break an agreement.
At each stage of the game, Cicero models how other players are likely to act based on the state of the board and its previous conversations with them. It then works out how the players can cooperate for mutual benefit and generates messages designed to achieve those goals.
To build Cicero, Meta marries two different types of AI: a reinforcement learning model that determines which moves to make, and a large language model that negotiates with other players.
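As a rough illustration of that two-part design, the toy Python sketch below pairs a planner that picks a move with a message writer that negotiates around it. It is not Meta's code: the payoff numbers, the `trust` score, and the function names `predict_move_probs`, `plan_move`, and `compose_message` are all invented for this example. A simple expected-value calculation stands in for the reinforcement learning model, and a fixed template stands in for the large language model.

```python
# Toy payoffs for one pair of players: (my_move, their_move) -> my score.
# The numbers are invented for illustration; cooperating pays best only
# if the other player cooperates too.
PAYOFFS = {
    ("support", "support"): 4,
    ("support", "attack"): 0,
    ("attack", "support"): 3,
    ("attack", "attack"): 2,
}
MOVES = ("support", "attack")


def predict_move_probs(trust: float) -> dict[str, float]:
    """Hypothetical model of the other player: more trust, more likely to support."""
    return {"support": trust, "attack": 1.0 - trust}


def plan_move(trust: float) -> str:
    """Stand-in for the strategic model: pick the move with the best expected score."""
    probs = predict_move_probs(trust)

    def expected_score(my_move: str) -> float:
        return sum(p * PAYOFFS[(my_move, theirs)] for theirs, p in probs.items())

    return max(MOVES, key=expected_score)


def compose_message(plan: str, partner: str) -> str:
    """Stand-in for the language model: turn the chosen plan into a message."""
    if plan == "support":
        return f"{partner}, I will support you this turn if you support me."
    return f"{partner}, I need to keep my options open this turn."


if __name__ == "__main__":
    for trust in (0.9, 0.3):
        plan = plan_move(trust)
        print(trust, plan, "|", compose_message(plan, "France"))
```

In this sketch the planner proposes cooperation only when it judges the other player likely enough to reciprocate, and the message is always derived from the plan rather than written independently of it.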
Cicero is not perfect. It still sent messages that contained errors, sometimes contradicting its own plans or making strategic mistakes. But Meta says humans often chose to collaborate with it over other players.
And that’s important because while games like chess or Go end with a winner and a loser, real-world problems don’t usually have such simple resolutions. Finding compromises and workarounds is often more valuable than winning. Meta says Cicero is a step toward AI that can help solve a range of complex problems that require trade-offs, from planning routes around heavy traffic to negotiating contracts.