ZAYA1-8B Achieves GPT-5 Performance with 760M Active Parameters
Zyphra's ZAYA1-8B achieves GPT-5 performance with just 760M active parameters, while Subquadratic claims 1,000x efficiency gains…
Researchers introduce Train-to-Test scaling laws that optimize AI models by jointly considering training costs and inference…
Researchers introduce Train-to-Test scaling laws that optimize AI model architecture by jointly considering training and inference…
Researchers introduce Train-to-Test scaling laws that optimize AI model efficiency by training smaller models on more…