A new framework called MetaScale enables large language models (LLMs) to adapt their reasoning strategy at inference time. The framework addresses one of the shortcomings of LLMs: using the same thinking strategy for every type of problem.
Presented in a paper by researchers at the University of California, Davis, the University of Southern California and Microsoft Research, MetaScale uses "meta-thoughts," adaptive thinking strategies tailored to each task, to improve LLM performance and generalization across different tasks.
The approach could give enterprises a way to improve the accuracy and efficiency of their LLM applications without swapping out models or undertaking costly fine-tuning efforts.
The limits of fixed reasoning strategies
One of the main challenges for LLM applications is their fixed and inflexible reasoning behavior. Unlike humans, who can choose different approaches to solving problems, LLMs often rely on pattern matching from their training data, which may not always align with the sound reasoning principles humans use.
Current methods for steering LLM reasoning, such as chain-of-thought (CoT) prompting, self-verification and reverse thinking, are often designed for specific tasks, which limits their adaptability and effectiveness across different scenarios.
The researchers note that these methods "impose fixed thinking structures instead of enabling LLMs to determine the most effective strategy for each task, limiting their performance."
To address this limitation, the researchers propose the concept of "meta-thinking," a process that allows LLMs to reason about their approach before generating a response. Meta-thoughts guide the reasoning process through two components inspired by human cognition:
Cognitive mindset: The perspective, expertise or role the model adopts to approach the task.
Problem-solving strategy: A structured pattern used to formulate a solution to the task based on the chosen mindset.
Instead of tackling a problem head-on, the LLM first determines how to think, selecting the most suitable cognitive strategy. For example, when faced with a complex software problem, the LLM might first consider what kind of professional would solve it (e.g., a software engineer) and then choose a strategy for approaching it (e.g., using design patterns to break down the problem, or adopting a microservices approach to simplify deployment).
"By incorporating this meta-thinking step, LLMs can dynamically adapt their reasoning process to different tasks, rather than relying on rigid, predetermined reasoning," the researchers write.
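To make the idea concrete, here is a minimal sketch of how a cognitive mindset and a problem-solving strategy might be combined into a single prompt. The `build_prompt` helper and the template wording are illustrative assumptions, not the paper's exact format.

```python
# Illustrative only: one way a meta-thought (mindset + strategy) could be
# prepended to a task prompt before the model generates its answer.
def build_prompt(mindset: str, strategy: str, task: str) -> str:
    return f"{mindset}\nApproach: {strategy}\n\nTask: {task}"

prompt = build_prompt(
    mindset="You are an experienced software engineer.",
    strategy=("Break the problem into components using well-known design patterns, "
              "then simplify deployment with a microservices approach."),
    task="Design a scalable order-processing service.",
)
print(prompt)
```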

Building on meta-thoughts, the researchers present MetaScale, a test-time framework that can be applied to any model through prompt engineering.
"The goal is to enable LLMs to explore different thinking strategies and generate the most effective response for a given input," the researchers write.
MetaScale works in three phases:
Initialization: MetaScale generates a diverse pool of reasoning strategies based on the input prompt. It does this by prompting the LLM to self-compose strategies and by drawing on instruction-tuning datasets that contain reasoning templates for different types of problems. This combination creates a rich initial pool of meta-thoughts.
Selection: A multi-armed bandit (MAB) algorithm selects the most promising meta-thought for each iteration. MAB is a problem framework in which an agent must repeatedly choose between multiple options, or "arms," each with an unknown reward distribution. The core challenge is balancing "exploration" (e.g., trying out different thinking strategies) and "exploitation" (favoring the thinking strategy that has produced the best responses so far). In MetaScale, each meta-thought is treated as an arm, and the goal is to maximize the reward (response quality) conditioned on the selected meta-thought (see the sketch after this list).
Evolution: A genetic algorithm iteratively refines and expands the pool of cognitive strategies. MetaScale uses high-performing meta-thoughts as "parents" to produce new "child" meta-thoughts, prompting the LLM to develop and refine meta-thoughts from the selected parents. To remain efficient, MetaScale operates within a fixed sampling budget when generating meta-thoughts.
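The following is a minimal sketch of the selection and evolution phases, assuming a UCB-style bandit and an LLM-prompted "crossover" step. The reward function, prompt templates and exact bandit and genetic-algorithm variants are illustrative assumptions; the paper's implementation may differ.

```python
# Sketch of MetaScale-style selection (multi-armed bandit) and evolution
# (genetic crossover via the LLM). `llm` and `judge` are placeholder callables:
# llm(prompt) -> completion string, judge(response) -> quality score.
import math
from dataclasses import dataclass

@dataclass
class MetaThought:
    mindset: str                 # e.g., "You are a veteran software engineer."
    strategy: str                # e.g., "Decompose the problem with design patterns."
    pulls: int = 0               # how many times this meta-thought was tried
    total_reward: float = 0.0    # cumulative response-quality score

    def ucb(self, total_pulls: int, c: float = 1.4) -> float:
        # Unvisited arms get priority; otherwise mean reward plus an
        # exploration bonus that shrinks as the arm is sampled more.
        if self.pulls == 0:
            return float("inf")
        return self.total_reward / self.pulls + c * math.sqrt(
            math.log(total_pulls) / self.pulls
        )

def select(pool: list[MetaThought]) -> MetaThought:
    total = sum(m.pulls for m in pool) + 1
    return max(pool, key=lambda m: m.ucb(total))

def evolve(parent_a: MetaThought, parent_b: MetaThought, llm) -> MetaThought:
    # Ask the LLM to merge two high-performing meta-thoughts into a new one.
    reply = llm(
        "Combine these two reasoning setups into one improved setup.\n"
        f"A: {parent_a.mindset} | {parent_a.strategy}\n"
        f"B: {parent_b.mindset} | {parent_b.strategy}\n"
        "Answer as: MINDSET: ... STRATEGY: ..."
    )
    mindset, _, strategy = reply.partition("STRATEGY:")
    return MetaThought(mindset.replace("MINDSET:", "").strip(), strategy.strip())

def run_iteration(pool: list[MetaThought], task: str, llm, judge) -> str:
    m = select(pool)
    response = llm(f"{m.mindset}\n{m.strategy}\n\nTask: {task}")
    m.pulls += 1
    m.total_reward += judge(response)
    return response
```

In this sketch, repeated calls to `run_iteration` would gradually concentrate sampling on the meta-thoughts that earn the highest quality scores, while `evolve` keeps refreshing the pool with new candidates derived from the best performers.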
The researchers evaluated MetaScale on benchmarks for mathematical reasoning (GSM8K), knowledge and language understanding (MMLU-Pro), and Arena-Hard, comparing it against four baseline inference methods: direct responses (single-pass inference), chain-of-thought (CoT), best-of-N (sampling multiple responses and selecting the best one) and best-of-N with CoT. They used GPT-4o and Llama-3.1-8B-Instruct as the backbone models for their experiments.

The results showed that MetaScale significantly enhances LLMs' problem-solving capabilities across diverse tasks, consistently outperforming the baseline methods. MetaScale achieved equal or superior performance compared to all baselines, whether or not they used CoT prompting. Notably, GPT-4o with MetaScale outperformed o1-mini under style control.
"These results demonstrate that integrating meta-thoughts enables LLMs to scale more effectively at test time as the number of samples increases," the researchers say.
As the number of candidate solutions increased, MetaScale showed much larger gains than the other baselines, indicating that it is a more effective scaling strategy.
Implications for the enterprise
As a test-time technique, MetaScale can help enterprises improve the quality of LLM reasoning through smart prompt engineering without having to fine-tune or replace models. It also does not require building complex software scaffolding on top of the models, since the reasoning logic is provided entirely by the LLM itself.
Because it adapts reasoning strategies dynamically, MetaScale is also practical for real-world applications that handle a variety of reasoning tasks. And because it is a black-box method, it can be applied to open-source models running on an enterprise's own cloud as well as to closed models running behind third-party APIs. It demonstrates the promise of test-time scaling techniques for reasoning tasks.