In its latest endeavor to redefine the landscape of artificial intelligence, Google announced Gemini 2.0 Flash Thinking, a multimodal reasoning model capable of addressing complex problems with speed and transparency.
In a post on the social network X, Google CEO Sundar Pichai wrote that it was “our most thoughtful model yet :)”
In its developer documentation, Google explains that “Thinking Mode is capable of stronger reasoning capabilities in its responses than the base Gemini 2.0 Flash model.” That base model, previously Google’s latest and greatest release, shipped just eight days earlier.
The new model supports only 32,000 tokens of input (roughly 50 to 60 pages of text) and can produce 8,000 tokens per output response. In a side panel in Google AI Studio, the company says the model is best suited for “multimodal understanding and reasoning” and “coding.”
Full details of the model’s training process, architecture, licensing, and costs have not yet been published. For now, Google AI Studio lists the cost as zero per token.
More accessible and transparent reasoning
Unlike OpenAI’s competing reasoning models, o1 and o1-mini, Gemini 2.0 Flash Thinking lets users view its step-by-step reasoning through a dropdown menu, providing a clearer and more transparent picture of how the model reached its conclusions.

By letting users see how its conclusions are reached, Gemini 2.0 Flash Thinking addresses long-standing concerns about AI operating as a “black box,” and brings the model (whose licensing terms remain unclear) closer to parity with open-source reasoning models offered by competitors.
In my early, simple tests, the model quickly (within one to three seconds) and correctly answered questions that have stumped other AI models, such as counting the letters in the word “strawberry” (see the photo above).
In another test, comparing two decimal numbers (9.9 and 9.11), the model systematically broke the problem into smaller steps, from comparing the whole-number parts to comparing the decimal places.
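Both of these checks are easy to verify in ordinary code. The sketch below is my own illustration, not anything drawn from Google’s model; it simply mirrors the step-by-step approach described above: count the letters directly, then compare the decimals by whole-number part first and fractional part second.

```python
from decimal import Decimal

def compare_decimals(a: str, b: str) -> str:
    """Return the larger of two decimal strings, reasoning in steps:
    compare the whole-number parts first, then the fractional parts."""
    x, y = Decimal(a), Decimal(b)
    # Step 1: compare the whole-number parts (9 vs. 9 is a tie here).
    if int(x) != int(y):
        return a if int(x) > int(y) else b
    # Step 2: compare the fractional parts (0.9 vs. 0.11; 0.9 is larger).
    return a if x - int(x) > y - int(y) else b

# The letter-counting check is a one-liner.
print(len("strawberry"))                # 10 letters in total
print("strawberry".count("r"))          # 3 of them are "r"
print(compare_decimals("9.9", "9.11"))  # 9.9, since 0.9 > 0.11
```

The point of spelling this out is that the model’s reported reasoning follows the same decomposition a careful human (or program) would use, rather than pattern-matching “9.11” as larger because 11 > 9.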
These findings are supported by independent third-party analysis from the LM Arena leaderboard, which named Gemini 2.0 Flash Thinking the top-performing model across all LLM categories.
Native support for image upload and analysis
In a further improvement over OpenAI’s competing o1 family, Gemini 2.0 Flash Thinking was designed to process images from the outset.
o1 launched as a text-only model but has since expanded to include image and file upload analysis. For now, both model families can return only text.
According to the developer documentation, Gemini 2.0 Flash Thinking does not currently support grounding with Google Search or integration with other Google apps and third-party tools.
The multimodal capability of Gemini 2.0 Flash Thinking expands its potential use cases, enabling it to tackle scenarios that combine different types of data.
For example, in one test, the model solved a puzzle that required analyzing both textual and visual elements, demonstrating its versatility in integrating and reasoning across formats.
Developers can take advantage of these features via Google AI Studio and Vertex AI, where the model is available for testing.
As competition in the AI field intensifies, Gemini 2.0 Flash Thinking could mark the beginning of a new era for problem-solving models. Its ability to handle diverse data types, deliver clear reasoning, and perform at scale positions it as a serious contender in the reasoning AI market, competing with OpenAI’s o1 family and beyond.