2025 is expected to be the year that AI becomes a reality, bringing specific and tangible benefit to organizations.
However, according to a new report on the state of AI development from the AI development platform Vellum, we're not there yet: Only 25% of organizations have deployed AI in production, and only a quarter of those have seen a measurable impact so far.
This seems to indicate that many companies have not yet identified viable use cases for AI, keeping them (at least for now) in a pre-build holding pattern.
“This reinforces that it is still very early days, despite all the hype and discussions that are happening,” Akash Sharma, CEO of Vellum, told VentureBeat. “There is a lot of noise in the industry, new models and model providers emerging, and new RAG technologies; we just wanted to get a clear idea of how companies are actually deploying AI in production.”
Companies must identify specific use cases to succeed
Vellum surveyed more than 1,250 AI developers and builders to get a real sense of what is happening in the AI trenches.
According to the report, the majority of companies that are not yet in production are at various stages of their AI journeys: building and evaluating strategies and proofs of concept (PoC) (53%), beta testing (14%), and, at the lowest level, talking to users and gathering requirements (7.9%).
So far, companies have been focusing on building document parsing and analysis tools and customer service chatbots, according to Vellum. But they are also interested in applications that include analytics with natural language, content creation, recommender systems, code generation and automation, and research automation.
Developers report competitive advantage (31.6%), cost and time savings (27.1%), and higher user adoption rates (12.6%) as the biggest impacts they’ve seen so far. Interestingly, 24.2% have yet to see any tangible impact from their investments.
Sharma stressed the importance of prioritizing use cases from the beginning. “We’ve heard from people that they just want to use AI for the sake of using AI,” he said. “There is a pilot budget associated with that.”
While this makes Wall Street and investors happy, it doesn’t mean AI actually contributes anything, he noted. “The thing that everyone should generally be thinking about is, how do we find the right use cases? Typically, once companies can identify those use cases, get them into production and see a clear ROI, they get more traction, and they get beyond the hype. And that leads to more internal expertise and more investment.”
OpenAI is still at the top, but the future will be a mix of models
When it comes to the models used, OpenAI maintains the lead (no surprise there), especially with GPT-4o and GPT-4o mini. But Sharma noted that 2024 offered more options, either directly from model builders or through platform solutions like Azure or AWS Bedrock. Providers that host open source models like Llama 3.2 70B, such as Groq, Fireworks AI, and Together AI, are also gaining traction.
“Open source models are getting better,” Sharma said. “OpenAI’s closed-source competitors are catching up in quality.”
Ultimately, however, companies will not stick to just one model, but will increasingly rely on multi-model systems, he predicts.
“People will choose the best model for each task at hand,” Sharma said. “While building an agent, you may have multiple prompts, and for each individual prompt, the developer will want the best quality, lowest cost, lowest latency, and that may or may not come from OpenAI.”
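To make the per-prompt selection Sharma describes concrete, here is a minimal Python sketch, not something taken from the Vellum report. The model labels, quality scores, prices, and latencies are all made-up placeholders, and the rule used (the cheapest model that meets a quality floor and a latency ceiling) is only one of many possible policies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str             # hypothetical label, not a real model ID
    quality: float        # 0..1 score from your own evaluations (assumed)
    cost_per_call: float  # dollars per call (made-up figure)
    latency_ms: float     # typical latency in milliseconds (made-up figure)


def pick_model(options, min_quality, max_latency_ms):
    """Return the cheapest option that meets the quality and latency bounds."""
    eligible = [
        o for o in options
        if o.quality >= min_quality and o.latency_ms <= max_latency_ms
    ]
    if not eligible:
        raise ValueError("no model satisfies the constraints")
    # Cheapest first; ties are broken in favor of higher quality.
    return min(eligible, key=lambda o: (o.cost_per_call, -o.quality))


# Illustrative catalog: every number here is invented for the example.
OPTIONS = [
    ModelOption("large-closed", quality=0.92, cost_per_call=0.0300, latency_ms=1200),
    ModelOption("small-closed", quality=0.80, cost_per_call=0.0020, latency_ms=400),
    ModelOption("hosted-open", quality=0.85, cost_per_call=0.0040, latency_ms=300),
]

# Each prompt in an agent carries its own (min_quality, max_latency_ms) needs.
PROMPT_REQUIREMENTS = {
    "classify_ticket": (0.75, 500),
    "draft_reply": (0.90, 2000),
}


def route(prompt_name):
    """Look up a prompt's requirements and return the chosen model's name."""
    min_quality, max_latency_ms = PROMPT_REQUIREMENTS[prompt_name]
    return pick_model(OPTIONS, min_quality, max_latency_ms).name
```

With these placeholder numbers, the fast classification prompt is routed to the cheapest small model, while the reply-drafting prompt, which has a higher quality floor, is routed to the larger one.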
Likewise, the future of AI is undoubtedly multimodal: Vellum has seen a boom in the adoption of tools that can handle a variety of tasks. Text is the undisputed top use case, followed by file creation (PDF or Word files), images, audio, and video.
Retrieval-augmented generation (RAG) is the go-to solution when it comes to retrieving information, and more than half of developers use vector databases to simplify searching. The top open source and proprietary options include Pinecone, MongoDB, Qdrant, Elasticsearch, pgvector, Weaviate, and Chroma.
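For readers unfamiliar with the pattern, the following is a toy Python sketch of the retrieval step in RAG. It is an illustration under loud assumptions: the `embed` function is a word-count stand-in for a real embedding model, and `TinyVectorStore` is a hypothetical in-memory class, not the API of any of the databases named above.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class TinyVectorStore:
    """Hypothetical in-memory store; real vector databases index at scale."""

    def __init__(self):
        self._items = []

    def add(self, text):
        self._items.append((text, embed(text)))

    def search(self, query, k=1):
        q = embed(query)
        ranked = sorted(
            self._items, key=lambda item: cosine(q, item[1]), reverse=True
        )
        return [text for text, _ in ranked[:k]]


def build_prompt(question, store, k=1):
    """Retrieve the top-k passages and place them ahead of the question."""
    context = "\n".join(store.search(question, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

The prompt returned by `build_prompt` would then be sent to whichever model the application uses; that generation step is left out here.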
Everyone participates (not just engineering)
Interestingly, AI is moving beyond just IT and becoming democratized across organizations (similar to the old phrase “it takes a village”). Vellum found that although engineering was most involved in AI projects (82.3%), it was joined by leaders and executives (60.8%), subject matter experts (57.5%), product teams (55.4%), and design departments (38.2%).
Sharma noted that this is largely due to the ease of use of AI (as well as the general excitement surrounding it).
“This is the first time we’ve seen software developed in a cross-functional way, especially since prompts can be written in natural language,” he said. “Traditional software tends to be more deterministic. This is non-deterministic, which brings more people into the development fold.”
However, companies still face significant challenges, particularly around AI hallucinations and prompts; model speed and performance; data access and security; and getting buy-in from important stakeholders.
And while more non-technical users are participating, there is still a lack of purely technical expertise in-house, Sharma noted. “How to connect all the different moving parts is still a skill that a lot of developers today don’t have,” he said. “So this is a common challenge.”
However, Sharma noted that many current challenges can be overcome through tools, platforms, and services that help developers evaluate complex AI systems. Developers can build the tools internally or use third-party platforms or frameworks; however, Vellum found that approximately 18% of developers define prompts and orchestration logic without any tools at all.
“Lack of technical expertise becomes easier when you have the right tools that can guide you through the development journey,” Sharma noted. In addition to Vellum, frameworks and platforms used by survey respondents include LangChain, LlamaIndex, Langfuse, CrewAI, and Voiceflow.
Continuous evaluations and monitoring are crucial
Another way to overcome common problems (including hallucinations) is to perform evaluations, or use specific metrics to test the correctness of a particular response. “But despite this, (developers) are not doing evaluations as consistently as they should,” Sharma said.
When it comes to advanced agentic systems, companies need robust evaluation processes, he said. Sharma noted that AI agents have a high degree of non-determinism, calling on external systems and performing autonomous actions.
“People are trying to build fairly advanced systems, agent systems, and that requires a large number of test cases and some kind of automated testing framework to make sure it works reliably in production,” Sharma said.
While some developers leverage automated evaluation tools, A/B testing, and open source evaluation frameworks, Vellum found that more than three-quarters of developers still do manual testing and reviews.
“Manual testing is time-consuming, right? The sample size in manual testing is usually much smaller than what automated testing can do,” Sharma said. “There may be a challenge in just being aware of the techniques, how to do automated evaluations at scale.”
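As a rough illustration of what automated evaluation at scale can look like, the Python sketch below runs a set of test cases against a model function and reports a pass rate. It is a minimal example under stated assumptions, not a framework named in the report: `EvalCase`, `run_evals`, and the canned `fake_model` are hypothetical, and the substring check stands in for richer metrics.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    prompt: str
    must_contain: List[str]  # substrings a correct answer is expected to include


def run_evals(generate: Callable[[str], str], cases: List[EvalCase]) -> dict:
    """Run every case through `generate` and summarize passes and failures."""
    failures = []
    for case in cases:
        output = generate(case.prompt).lower()
        missing = [s for s in case.must_contain if s.lower() not in output]
        if missing:
            failures.append((case.prompt, missing))
    passed = len(cases) - len(failures)
    return {
        "passed": passed,
        "total": len(cases),
        "pass_rate": passed / len(cases) if cases else 0.0,
        "failures": failures,
    }


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call; returns canned answers for the demo."""
    canned = {
        "What is the refund window?": "Refunds are issued within 14 days.",
        "Which plan includes SSO?": "I'm not sure.",
    }
    return canned.get(prompt, "")


CASES = [
    EvalCase("What is the refund window?", ["14 days"]),
    EvalCase("Which plan includes SSO?", ["enterprise"]),
]

report = run_evals(fake_model, CASES)
```

Because the harness only needs a function from prompt to text, the same cases can be rerun whenever a prompt or model changes, which is what lets the sample size grow beyond what manual review allows.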
Finally, he emphasized the importance of embracing a mix of systems that work symbiotically, from the cloud to application programming interfaces (APIs). “Think of AI as just a tool in the toolkit rather than a silver bullet for everything,” he said.