AI Benchmarks Under Scrutiny: Hacks, IQ Scores, and OpenAI’s
A new automated tool called BenchJack found 219 reward-hacking exploits across 10 major AI agent benchmarks,…
How real companies are deploying AI — adoption, ROI, workflow change, and the practical challenges of putting it into production.
Anduril raised $5 billion at a $61 billion valuation on May 13, 2026, while OpenAI launched…
A startup project called AI IQ has mapped 50+ frontier language models onto a human IQ…
A preregistered arXiv study using 365 runs of Claude Sonnet 4.5 found that hidden AI orchestrators…
Anduril raised $5 billion at a $61 billion valuation, SoftBank injected $457 million into Graphcore, and…
Anduril raised $5 billion at a $61 billion valuation, NVIDIA crossed $40 billion in equity investments,…
Anthropic reversed its April 2026 ban on third-party agent use by introducing metered Agent SDK credits,…
AI models trained on different data types are converging toward identical internal representations of reality as…
Google CEO Sundar Pichai revealed that AI now generates 75% of the company's code while enterprise…
AI productivity tools deliver 5.4% efficiency gains for active users but only 1.4% organization-wide impact. Frontier…