American AI startup Poolside on Monday released Laguna XS.2, a high-performing open source model optimized for agentic coding workflows, as the industry pivots toward inference scaling techniques that dramatically increase compute costs during model reasoning. According to VentureBeat, the San Francisco-based company founded in 2023 also launched a coding agent harness called “pool” and a web-based development environment called “shimmer.”
The release comes as major AI labs shift focus from scaling model parameters during training to scaling compute resources during inference — a trend that’s reshaping how organizations approach AGI development and deployment costs.
Inference Scaling Transforms AGI Economics
The latest generation of reasoning models, including GPT-5.5 and OpenAI’s o1 series, achieve higher performance by spending significantly more compute resources on each response through a process called inference scaling or test-time compute. Towards Data Science reported that these models generate hidden reasoning tokens that never appear in final outputs but represent “a massive surge in billable compute” for organizations.
This approach allows models to check their own logic and iterate until finding optimal answers, but creates new operational challenges. Finance teams face shrinking margins from high token costs, while infrastructure engineers must manage latency spikes that can reach 30 seconds per response.
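To make the billing impact concrete, here is a minimal back-of-envelope cost sketch. The per-token price and token counts are illustrative assumptions, not quotes from any provider; the point is that hidden reasoning tokens are billed even though they never appear in the output.

```python
# Hypothetical cost model for test-time compute. The price and token
# counts are made-up assumptions for illustration only.

def response_cost(visible_tokens: int, reasoning_tokens: int,
                  price_per_1k: float) -> float:
    """The bill covers hidden reasoning tokens as well as visible output."""
    return (visible_tokens + reasoning_tokens) / 1000 * price_per_1k

# A standard completion: 500 visible tokens, no hidden reasoning.
baseline = response_cost(500, 0, price_per_1k=0.01)

# A reasoning-model completion: the same visible output, but with
# 10,000 hidden "thinking" tokens generated and billed along the way.
reasoning = response_cost(500, 10_000, price_per_1k=0.01)

print(f"baseline: ${baseline:.3f}, reasoning: ${reasoning:.3f}, "
      f"multiplier: {reasoning / baseline:.0f}x")
# With these assumed numbers, the reasoning response costs ~21x more.
```

Under these assumptions a single response jumps from half a cent to over ten cents, which is the margin pressure finance teams are reacting to.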
The shift represents a fundamental change from traditional training-time scaling, where intelligence was fixed during the initial training phase. Modern reasoning models now adaptively allocate resources based on task complexity, turning model selection into what researchers call “a high-stakes operations tradeoff.”
Open Source Alternative Emerges from US Startup
Poolside’s Laguna XS.2 offers an alternative approach by providing affordable, locally-deployable intelligence for agentic workflows. The model can write code, use third-party tools, and take autonomous actions without the recurring compute costs associated with cloud-based reasoning models.
According to the company’s announcement, Laguna XS.2 competes with proprietary models from Anthropic, OpenAI, and Google while maintaining open licensing. This positioning addresses growing enterprise concerns about compute bill escalation from inference scaling.
Poolside post-training engineer George Grigorev explained on X that government agencies might prefer Poolside over leading proprietary labs for sovereignty and cost control reasons. The open source approach allows organizations to run models locally without external API dependencies.
The startup joins Chinese companies like DeepSeek and Xiaomi in challenging the proprietary model dominance through open licensing and cost efficiency, though Poolside represents the first major US-based entry in this competitive segment.
New Training Methods Reduce Resource Requirements
Researchers at JD.com and academic institutions recently introduced Reinforcement Learning with Verifiable Rewards and Self-Distillation (RLSD), a training paradigm that reduces the computational barriers to building custom reasoning models. VentureBeat reported that this technique outperforms traditional distillation and reinforcement learning approaches.
RLSD addresses the “signal density problem” in standard reinforcement learning, where multi-thousand-token reasoning traces receive only binary rewards. Co-author Chenxu Yang explained that every token in a reasoning trace typically receives identical credit, regardless of its logical importance.
The new approach combines reliable performance tracking from reinforcement learning with granular feedback from self-distillation. This enables enterprise teams to train custom reasoning models without the massive resource requirements typically associated with frontier model development.
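The "signal density problem" can be sketched in a few lines. This is a conceptual illustration of the contrast the article describes, not the RLSD algorithm itself: standard RL with verifiable rewards spreads one binary outcome over an entire trace, while a distillation-style signal can assign per-token credit. The weighting scheme below is a made-up stand-in.

```python
# Conceptual sketch of sparse vs. dense training signals.
# Not the actual RLSD method; the per-token weighting is hypothetical.

def binary_rl_credit(trace_len: int, passed: bool) -> list[float]:
    """Standard verifiable-reward RL: every token in the trace receives
    identical credit, regardless of its logical importance."""
    reward = 1.0 if passed else 0.0
    return [reward] * trace_len

def distillation_credit(teacher_token_probs: list[float]) -> list[float]:
    """Self-distillation-style signal: each token gets credit scaled by
    how strongly a teacher model endorses it, giving granular feedback."""
    return list(teacher_token_probs)

sparse = binary_rl_credit(5, passed=True)          # [1.0, 1.0, 1.0, 1.0, 1.0]
dense = distillation_credit([0.9, 0.2, 0.99, 0.7, 0.4])
print(sparse, dense)
```

The sparse signal cannot distinguish a crucial deduction from filler text; the dense signal can, which is the intuition behind combining the two.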
For organizations evaluating AGI research directions, RLSD represents a path toward specialized reasoning capabilities without the infrastructure costs of large-scale inference scaling.
Enterprise AGI Strategy Considerations
The convergence of inference scaling costs and open source alternatives is forcing enterprise teams to reconsider their AGI development strategies. Organizations must now balance three competing priorities: cost management, quality requirements, and latency constraints.
Task taxonomy frameworks help route simple operations to efficient models while reserving compute budgets for high-stakes reasoning. This approach acknowledges that not every interaction requires the full reasoning capabilities of frontier models.
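A task-taxonomy router can be sketched very simply. The model names and the keyword heuristic below are placeholders invented for illustration; a production router would use a trained classifier, but the tiering logic is the same: spend the reasoning budget only on high-stakes requests.

```python
# Minimal sketch of task-taxonomy routing. Model tier names and the
# keyword heuristic are hypothetical placeholders.

EXPENSIVE_KEYWORDS = {"prove", "debug", "plan", "refactor", "architecture"}

def route(prompt: str) -> str:
    """Send high-stakes requests to a reasoning model and everything
    else to a cheap, efficient tier."""
    words = set(prompt.lower().split())
    if words & EXPENSIVE_KEYWORDS:
        return "reasoning-model"        # worth the inference-scaling cost
    return "efficient-local-model"      # simple task, cheap tier

print(route("summarize this changelog"))   # efficient-local-model
print(route("debug this stack trace"))     # reasoning-model
```

Even a crude router like this caps the share of traffic that incurs reasoning-model token costs, which is the budgeting lever the taxonomy approach provides.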
Risk management teams also face new challenges ensuring that extended reasoning processes don’t bypass safety guardrails or grounding mechanisms. The additional processing time in reasoning models creates more opportunities for unexpected behaviors or outputs.
Infrastructure planning must account for the variable compute demands of reasoning models, which can spike dramatically based on task complexity. Unlike traditional models with predictable resource usage, reasoning systems require dynamic scaling capabilities.
What This Means
The AGI research landscape is fragmenting into two distinct approaches: proprietary reasoning models that scale inference compute for maximum capability, and open source alternatives that prioritize cost efficiency and local deployment. This bifurcation reflects broader tensions between performance maximization and operational sustainability.
Poolside’s entry demonstrates that US companies can compete in the open source AGI space previously dominated by Chinese firms. The success of locally-deployable reasoning models could accelerate enterprise AGI adoption by removing recurring compute costs and data sovereignty concerns.
The emergence of more efficient training methods like RLSD suggests that the resource barriers to custom AGI development may be lowering faster than expected. Organizations that previously couldn’t afford reasoning model development may soon have viable paths to specialized capabilities.
FAQ
What is inference scaling and why does it increase costs?
Inference scaling allows AI models to use extra compute during each response to improve reasoning quality. Models generate hidden “thinking” tokens that don’t appear in outputs but count toward billing, potentially increasing costs 10-30x per query.
How does Poolside’s approach differ from OpenAI and Anthropic?
Poolside offers open source models that run locally, eliminating recurring API costs and data sharing requirements. Their Laguna XS.2 targets agentic coding workflows specifically, while major labs focus on general-purpose reasoning across all domains.
What is RLSD and how does it reduce training costs?
Reinforcement Learning with Verifiable Rewards and Self-Distillation (RLSD) provides more granular feedback during training compared to traditional methods. This allows smaller teams to build custom reasoning models without the massive computational resources typically required for frontier model development.
Related news
- Elon Musk’s only AI expert witness at the OpenAI trial fears an AGI arms race – TechCrunch