Court filings reveal that Meta executives were obsessed with beating OpenAI’s GPT-4 internally

The executives and researchers leading Meta’s AI efforts were obsessed with beating OpenAI’s GPT-4 model while developing Llama 3, according to internal messages unsealed by the court on Tuesday in Kadrey v. Meta, one of the company’s ongoing AI copyright cases.

“Frankly…our goal should be GPT-4,” said Meta’s vice president of generative AI, Ahmad Al-Dahle, in an October 2023 message to Meta researcher Hugo Touvron. “We have 64,000 GPUs coming! We need to learn how to build frontier and win this race.”

Although Meta releases open AI models, the company’s AI leaders were more focused on beating competitors such as Anthropic and OpenAI, which typically don’t release their models’ weights and instead gate them behind an API. Meta executives and researchers treated OpenAI’s GPT-4 and Anthropic’s Claude as the bar to measure against.

French AI startup Mistral, one of Meta’s biggest open-model competitors, came up several times in the internal messages, but the tone was dismissive.

“Mistral is peanuts for us,” Al-Dahle said in one message. “We should be able to do better,” he added.

Tech companies are racing to outdo each other with cutting-edge AI models these days, but these court filings show just how competitive Meta’s AI leaders really were, and apparently still are. At several points in the exchanges, Meta’s AI leaders discussed being “very aggressive” about obtaining the right data to train Llama; at one point, an executive told coworkers in a message that “Llama 3 is literally all I care about.”

Plaintiffs in the case allege that Meta executives sometimes cut corners in their race to ship AI models, training on copyrighted books in the process.

Touvron noted in a message that the mix of datasets used for Llama 2 “was poor,” and discussed how Meta could use a better mix of data sources to improve Llama 3. Touvron and Al-Dahle then talked about clearing the way to use the LibGen dataset, which contains copyrighted works from Cengage Learning, Macmillan Learning, McGraw Hill, and Pearson Education.

“Do we have the right datasets in there?” Al-Dahle said. “Is there anything you want to use but can’t for some stupid reason?”

Meta CEO Mark Zuckerberg has previously said he is trying to close the performance gap between Meta’s Llama models and closed models from OpenAI, Google, and others. The internal messages reveal intense pressure within the company to do so.

“This year, Llama 3 is competitive with the most advanced models and leading in some areas,” Zuckerberg said in a letter published in July 2024. “Starting next year, we expect future Llama models to become the most advanced in the industry.”

When Meta finally released Llama 3 in April 2024, the open AI model was competitive with the leading closed models from Google, OpenAI, and Anthropic, and outperformed open options from Mistral. However, the data Meta used to train its models, data that Zuckerberg reportedly greenlit despite its copyright status, is facing scrutiny in several ongoing lawsuits.


