Search-R1 trains LLMs to reason with search engines


By [email protected]




Large language models (LLMs) have seen remarkable progress in reasoning capabilities. However, their ability to reference external data (information they were not trained on) and use it properly in conjunction with reasoning has lagged far behind.

This is a particular problem in dynamic, information-intensive scenarios that require up-to-date data from search engines.

But an improvement has arrived: Search-R1, a technique introduced in a paper by researchers at the University of Illinois at Urbana-Champaign and the University of Massachusetts Amherst, trains LLMs to generate search queries and seamlessly integrate search engine retrieval into their reasoning.

For enterprises looking for ways to integrate these new models into their applications, techniques such as Search-R1 promise to unlock reasoning capabilities that rely on external data sources.

The challenge of combining search with LLMs

Search engines are crucial for providing LLM applications with up-to-date external knowledge. The two main ways to integrate search engines with LLMs are retrieval-augmented generation (RAG) and tool use, implemented through prompt engineering or model fine-tuning.

However, both methods have limitations that make them unsuitable for reasoning models. RAG often struggles with retrieval inaccuracy and lacks the ability to perform multi-turn retrieval, which is essential for reasoning tasks.
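To illustrate why the classic pattern falls short, a standard RAG pipeline performs exactly one retrieval before generation. The sketch below is hypothetical (not from the paper): `retrieve` and `build_rag_prompt` are stand-ins meant only to show that a one-pass pipeline offers no opportunity for a follow-up query once generation has started.

```python
def retrieve(query: str) -> list[str]:
    """Stand-in retriever over a tiny in-memory corpus (hypothetical)."""
    corpus = {"capital of france": ["Paris is the capital of France."]}
    return corpus.get(query.lower(), [])


def build_rag_prompt(question: str) -> str:
    """Classic single-shot RAG: one retrieval up front, results prepended
    to the prompt, then a single generation pass. Nothing in this flow
    lets the model issue a follow-up query mid-reasoning."""
    docs = retrieve(question)
    context = "\n".join(docs) if docs else "(no documents found)"
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
```

If the first retrieval misses, the model has no mechanism to reformulate and search again, which is exactly the gap Search-R1 targets.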

Prompt-based tool use often struggles with generalization, while training-based methods require large-scale annotated datasets of search-and-reasoning interactions, which are difficult to produce at scale.

(In our own experiments with reasoning models, we found that information retrieval remains one of the main challenges.)

Search-R1

Search-R1 enables LLMs to interact with search engines during their reasoning process instead of relying on a separate retrieval stage.

Search-R1 defines the search engine as part of the LLM's environment, allowing the model to integrate token generation with search engine results seamlessly.

The researchers designed Search-R1 to support iterative reasoning and search. The model is trained to generate separate sets of tokens for thinking, search, information and answer segments. During the reasoning process (marked by &lt;think&gt;&lt;/think&gt; tags), if the model determines that it needs external information, it generates a &lt;search&gt;&lt;/search&gt; sequence containing the search query. The query is then passed to a search engine and the results are inserted into the context window inside an &lt;information&gt;&lt;/information&gt; segment. The model continues reasoning over the added context and, when it is ready, generates its result in an &lt;answer&gt;&lt;/answer&gt; segment.

This structure allows the model to call the search engine multiple times as it reasons about the problem and obtains new information (see example below).
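The interleaved generate-search loop can be sketched as follows. This is a minimal, hypothetical harness (not the authors' code): `model_step` and `search_engine` stand in for a real LLM decoding step and a real retriever, and the tag names mirror those described above.

```python
import re

SEARCH_RE = re.compile(r"<search>(.*?)</search>", re.DOTALL)


def run_with_search(model_step, search_engine, question, max_turns=4):
    """Interleave generation with retrieval, in the spirit of Search-R1.

    model_step(context) -> next generated text chunk, which may contain
    a <search>query</search> tag or a final <answer>...</answer>.
    search_engine(query) -> retrieved passages as a string.
    Both callables are hypothetical stand-ins.
    """
    context = question
    for _ in range(max_turns):
        chunk = model_step(context)
        context += chunk
        match = SEARCH_RE.search(chunk)
        if match:
            # Retrieved text is appended inside <information> tags so the
            # model can keep reasoning over it on the next step.
            results = search_engine(match.group(1).strip())
            context += f"<information>{results}</information>"
        elif "<answer>" in chunk:
            break
    return context
```

The key design point is that retrieval results are injected back into the same rolling context, so each search round directly feeds the next reasoning step.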

An example of LLM reasoning with Search-R1 (source: arXiv)

Reinforcement learning

Training LLMs to interleave search queries with their reasoning chain is challenging. To simplify the process, the researchers designed Search-R1 to train the model through pure reinforcement learning (RL), where the model is left to explore the use of reasoning and search tools on its own, without guidance from human-generated data.

Search-R1 uses an "outcome-based reward model," in which the model is evaluated solely on the correctness of its final response. This eliminates the need to create complex reward models that check the model's reasoning process.
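A minimal sketch of such an outcome-based reward, assuming a normalized exact-match check on the final answer span (the paper's exact normalization and scoring details may differ):

```python
import re
import string


def normalize(text: str) -> str:
    """Lowercase, strip punctuation and articles, collapse whitespace
    (a common preparation step for exact-match scoring)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def outcome_reward(response: str, gold_answer: str) -> float:
    """Outcome-based reward: 1.0 only if the final <answer> span matches
    the gold answer after normalization. Intermediate reasoning and
    search steps earn no credit on their own."""
    match = re.search(r"<answer>(.*?)</answer>", response, re.DOTALL)
    if not match:
        return 0.0
    return 1.0 if normalize(match.group(1)) == normalize(gold_answer) else 0.0
```

Because only the final answer is scored, the model is free to discover for itself how many searches to run and how to phrase each query.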

This is the same approach used in DeepSeek-R1-Zero, where the model was given a task and judged only on the result. Using pure RL avoids the need to create large datasets of manually annotated examples (supervised fine-tuning).

"Search-R1 can be viewed as an extension of DeepSeek-R1, which primarily focuses on parametric reasoning, by introducing search-augmented RL training for retrieval-driven decision-making," the researchers write in their paper.

Search-R1 in action

The researchers tested Search-R1 by fine-tuning base and instruct versions of Qwen-2.5 and Llama-3.2 and evaluating them on seven benchmarks encompassing a variety of reasoning tasks that require single-hop and multi-hop search. They compared Search-R1 against several baselines: direct inference with chain-of-thought (CoT) reasoning, inference with RAG, and supervised fine-tuning for tool use.

Search-R1 consistently outperforms the baseline methods by a fair margin. It also outperforms reasoning models trained on RL but without search retrieval. "This aligns with expectations, as incorporating search into LLM reasoning provides access to relevant external knowledge, improving overall performance," the researchers write.

Search-R1 is also effective across different model families, on both base and instruction-tuned variants, suggesting that RL with outcome-based rewards can be useful beyond pure reasoning scenarios. The researchers have released the code for Search-R1 on GitHub.

Search-R1's ability to autonomously generate search queries and integrate real-time information into its reasoning could have significant implications for enterprise applications. It can enhance the accuracy and reliability of LLM systems in areas such as customer support, knowledge management and data analysis. By enabling LLMs to dynamically adapt to changing information, Search-R1 can help enterprises build smarter, more responsive AI solutions. This capability could be especially useful for applications that require access to constantly changing data, or that require multiple steps to find an answer.

It also suggests that we have yet to explore the full potential of the new reinforcement learning paradigm that has emerged since DeepSeek-R1 launched.

