VentureBeat is a leading source for transformative tech news and events, covering AI, gaming, and enterprise technology.
-
Machine Learning
Recent Posts
-
Ai2 just open-sourced Bolmo, the first fully open byte-level language models (7B and 1B). Instead of tokenizers, these work directly on raw UTF-8 bytes — meaning better handling of typos, rare languages, and messy real-world text. 🔤 Big implications for multilingual deployments and edge cases where traditional tokenizers struggle.

Bolmo’s architecture unlocks efficient byte-level LM training without sacrificing quality
Enterprises that want tokenizer-free multilingual models are increasingly turning to byte-level language models to reduce brittleness in noisy or low-resource text. To tap into that niche — and make it practical at scale — the Allen Institute for AI (Ai2) introduced Bolmo, a new family of models that leverage its Olmo 3 models by “byteifying” them and reusing their backbone and capabilities. The company launched two versions, Bolmo 7B and Bolmo 1B, which are “the first fully open byte-l…
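The byte-level idea can be illustrated in a few lines. This is a minimal sketch, not Bolmo's actual training pipeline: a byte-level model's "vocabulary" is just the 256 possible byte values, so any input string maps to valid token IDs with no out-of-vocabulary gaps.

```python
# Minimal illustration (not Bolmo's actual pipeline): encoding text as raw
# UTF-8 bytes means typos, accents, and rare scripts all become in-vocabulary
# token IDs in the range 0-255 -- there is no tokenizer to break.
def byte_tokens(text: str) -> list[int]:
    """Encode text as a sequence of UTF-8 byte values (0-255)."""
    return list(text.encode("utf-8"))

ascii_ids = byte_tokens("cafe")    # plain ASCII: one byte per character
accent_ids = byte_tokens("café")   # 'é' becomes the two bytes 0xC3, 0xA9
korean_ids = byte_tokens("한국")    # each Hangul syllable is three bytes

print(ascii_ids)   # [99, 97, 102, 101]
print(accent_ids)  # [99, 97, 102, 195, 169]
print(len(korean_ids), max(korean_ids) < 256)
```

The trade-off the article's title alludes to is sequence length: byte sequences are several times longer than subword-token sequences for the same text, which is why efficient byte-level training is the hard part.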
Korean startup Motif just dropped a 12.7B parameter reasoning model that's outperforming GPT-5.1 on benchmarks — but the real value here is their published training recipe. They've shared a reproducible methodology showing exactly where reasoning performance comes from and why most enterprise fine-tuning efforts fall short. 🔬 Essential reading for anyone building models in-house.

Korean AI startup Motif reveals 4 big lessons for training enterprise LLMs
We've heard (and written, here at VentureBeat) lots about the generative AI race between the U.S. and China, as those have been the countries with the groups most active in fielding new models (with a shoutout to Cohere in Canada and Mistral in France). But now a Korean startup is making waves: last week, the firm known as Motif Technologies released Motif-2-12.7B-Reasoning, another small-parameter open-weight model that boasts impressive benchmark scores, quickly becoming the most performa…
-
The "build vs buy" debate is getting a serious shake-up. When non-technical team members can prototype working software in hours using AI coding tools, traditional vendor procurement timelines start looking... outdated. 🤔 This shift is going to force some uncomfortable conversations in a lot of organizations.

Build vs buy is dead — AI just killed it
Picture this: You're sitting in a conference room, halfway through a vendor pitch. The demo looks solid, and pricing fits nicely under budget. The timeline seems reasonable too. Everyone’s nodding along. You’re literally minutes away from saying yes. Then someone from your finance team walks in. They see the deck and frown. A few minutes later, they shoot you a message on Slack: “Actually, I threw together a version of this last week. Took me 2 hours in Cursor. Wanna take a look?” Wait…
-
This VentureBeat piece nails something I've been seeing across the industry: enterprise AI coding tools aren't failing because of model limitations—they're failing because companies haven't built the right environment around them. 🛠️ The real bottleneck is context engineering: giving agents access to code history, architecture decisions, and intent. Curious how many teams are actually investing in this infrastructure vs. just swapping in newer models.

Why most enterprise AI coding pilots underperform (Hint: It's not the model)
Gen AI in software engineering has moved well beyond autocomplete. The emerging frontier is agentic coding: AI systems capable of planning changes, executing them across multiple steps and iterating based on feedback. Yet despite the excitement around “AI agents that code,” most enterprise deployments underperform. The limiting factor is no longer the model. It’s context: The structure, history and intent surrounding the code being changed. In other words, enterprises are now facing a syst…
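The "context engineering" point can be made concrete with a hedged sketch: instead of handing a coding agent only the file to edit, bundle the task, code history, and architecture notes into one structured prompt. Every name below (`ContextPackage`, the section headings, the sample data) is illustrative, not any specific product's API.

```python
# Hypothetical sketch of a context package for a coding agent: the prompt
# carries history and intent, not just the code to be changed.
from dataclasses import dataclass, field

@dataclass
class ContextPackage:
    task: str                                               # what the agent is asked to do
    target_files: list[str]                                 # code in scope
    recent_commits: list[str] = field(default_factory=list) # code history
    design_notes: list[str] = field(default_factory=list)   # architecture decisions

    def to_prompt(self) -> str:
        """Render the package as one structured prompt string."""
        sections = [
            "## Task\n" + self.task,
            "## Files in scope\n" + "\n".join(self.target_files),
            "## Recent history\n" + "\n".join(self.recent_commits),
            "## Architecture decisions\n" + "\n".join(self.design_notes),
        ]
        return "\n\n".join(sections)

pkg = ContextPackage(
    task="Add retry logic to the payment client",
    target_files=["billing/client.py"],
    recent_commits=["a1b2c3 Switch billing to async HTTP"],
    design_notes=["All external calls go through billing/client.py"],
)
print(pkg.to_prompt())
```

The design choice the article argues for is exactly this: the infrastructure that fills these sections (commit mining, decision records, intent capture) matters more than which model consumes the prompt.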
-
Allen Institute for AI just dropped Olmo 3.1 with extended reinforcement learning training - 21 additional days on 224 GPUs to boost reasoning capabilities. What's interesting here is their continued focus on transparency and enterprise control, positioning against the black-box trend we're seeing elsewhere. 🧠

Ai2's new Olmo 3.1 extends reinforcement learning training for stronger reasoning benchmarks
The Allen Institute for AI (Ai2) recently released what it calls its most powerful family of models yet, Olmo 3. But the company kept iterating on the models, expanding its reinforcement learning (RL) runs to create Olmo 3.1. The new Olmo 3.1 models focus on efficiency, transparency, and control for enterprises. Ai2 updated two of the three versions of Olmo 3: Olmo 3.1 Think 32B, the flagship model optimized for advanced research, and Olmo 3.1 Instruct 32B, designed for instruction-following, m…
-
Google's new FACTS benchmark reveals a troubling reality: even our best AI models hit a 70% ceiling on factual accuracy. While we've been obsessing over coding benchmarks and task completion, we've overlooked the fundamental question of whether AI actually gets basic facts right. 🎯 This gap between capability and reliability is exactly what's holding back widespread enterprise adoption.

The 70% factuality ceiling: why Google’s new ‘FACTS’ benchmark is a wake-up call for enterprise AI
There's no shortage of generative AI benchmarks designed to measure the performance and accuracy of a given model on completing various helpful enterprise tasks — from coding to instruction following to agentic web browsing and tool use. But many of these benchmarks have one major shortcoming: they measure the AI's ability to complete specific problems and requests, not how factual the model is in its outputs — how well it generates objectively correct information tied to real-worl…
-
Cohere's Rerank 4 just jumped from 8K to 32K context window - a massive leap that could significantly reduce those frustrating moments when AI agents miss crucial information buried in longer documents. 🎯 This feels like the kind of infrastructure improvement that quietly makes everything else work better, especially for enterprise search where context really matters.

Cohere’s Rerank 4 quadruples the context window over 3.5 to cut agent errors and boost enterprise search accuracy
Almost a year after releasing Rerank 3.5, Cohere launched the latest version of its search model, now with a larger context window to help agents find the information they need to complete their tasks. Cohere said in a blog post that Rerank 4 has a 32K context window, representing a four-fold increase compared to 3.5. “This enables the model to handle longer documents, evaluate multiple passages simultaneously and capture relationships across sections that shorter windows would miss,” acco…
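Why the window size matters can be shown with a toy example. This is a generic sketch, not Cohere's scoring method: any relevance scorer that truncates documents to its window simply cannot see evidence that falls past the cutoff.

```python
# Toy illustration of a reranker's context window (generic sketch, not
# Cohere's algorithm): documents are truncated to the window before scoring,
# so relevant passages beyond the cutoff contribute nothing to the score.
def score(query: str, doc: str, window: int) -> int:
    """Count query-term hits within the first `window` characters of doc."""
    visible = doc[:window].lower()
    return sum(visible.count(term) for term in query.lower().split())

# Evidence buried ~2,200 characters into a long document:
doc = "quarterly overview... " * 100 + "refund policy: 30 days"
query = "refund policy"

small = score(query, doc, window=1_000)  # cutoff hides the evidence -> 0 hits
large = score(query, doc, window=4_000)  # larger window sees both terms -> 2 hits
print(small, large)  # 0 2
```

The same logic is why a jump from an 8K to a 32K window reduces "agent misses" on long documents: it is not smarter scoring, just more of the document being visible to the scorer at once.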
-
Nous Research's open-source Nomos 1 just scored second place on the Putnam math competition - a test so brutal that the median score was 2 out of 120 points. This breakthrough in mathematical reasoning could be a game-changer for open-source AI development. 🧮

Nous Research just released Nomos 1, an open-source AI that ranks second on the notoriously brutal Putnam math exam
Nous Research, the San Francisco-based artificial intelligence startup, released on Tuesday an open-source mathematical reasoning system called Nomos 1 that achieved near-elite human performance on this year's William Lowell Putnam Mathematical Competition, one of the most prestigious and notoriously difficult undergraduate math contests in the world. The Putnam is known for its difficulty: While a perfect score is 120, this year's top score was 90, and the median was just 2. Nomos 1, b…
-
Tax and accounting is one of those industries that's been surprisingly slow to embrace AI, despite being perfect for automation. Marble's $9M raise signals serious momentum in bringing AI agents to tax professionals - an area ripe for disruption given the labor shortage and increasing regulatory complexity. 🤖

Marble enters the race to bring AI to tax work, armed with $9 million and a free research tool
Marble, a startup building artificial intelligence agents for tax professionals, has raised $9 million in seed funding as the accounting industry grapples with a deepening labor shortage and mounting regulatory complexity. The round, led by Susa Ventures with participation from MXV Capital and Konrad Capital, positions Marble to compete in a market where AI adoption has lagged significantly behind other knowledge industries like law and software development. "When we looked at the economy and…
-
OpenAI drops GPT-5.2 amid heated competition with Google's Gemini 3, which recently claimed the top leaderboard spots. The timing feels strategic despite OpenAI's claims of pre-planned release schedules. 🤔 Enterprise teams will want to dig into the performance benchmarks once they're available.

OpenAI's GPT-5.2 is here: what enterprises need to know
The rumors were true: OpenAI on Thursday announced the release of its new frontier large language model (LLM) family, GPT-5.2. It comes at a pivotal moment for the AI pioneer, which has faced intensifying pressure since rival Google’s Gemini 3 LLM seized the top spot on major third-party performance leaderboards and many key benchmarks last month, though OpenAI leaders stressed in a press briefing that the timing of this release had been discussed and worked on well in advance of the release of…
-
GPT-5.2 is showing a clear split in reception - massive gains for complex business workflows and coding tasks, but more incremental improvements for everyday chat. This suggests OpenAI is prioritizing enterprise use cases and autonomous reasoning capabilities over conversational polish. 🤔 The early enterprise access strategy also signals where they see the biggest market opportunity.

GPT-5.2 first impressions: a powerful update, especially for business tasks and workflows
OpenAI has officially released GPT-5.2, and the reactions from early testers — among whom OpenAI seeded the model several days prior to public release, in some cases weeks ago — paint a two-toned picture: it is a monumental leap forward for deep, autonomous reasoning and coding, yet potentially an underwhelming "incremental" update for casual conversationalists. Following early access periods and today's broader rollout, executives, developers, and analysts have taken to X (fo…
More stories