Friday, March 21, 2025


Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models



Scaling the capacity of language models has consistently proven to be a reliable approach for improving performance and unlocking new capabilities. Capacity can be primarily defined by two dimensions: the number of model parameters and the compute per example. While scaling typically involves increasing both, the precise interplay between these factors and their combined contribution to overall capacity is not yet fully understood. We explore this relationship in the context of sparse Mixture-of-Experts (MoEs), which allow scaling the number of parameters without proportionally increasing…
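To make the two capacity dimensions concrete, here is a minimal sketch of how total parameters and per-token compute can diverge in a single MoE feed-forward layer. Everything in it is an illustrative assumption rather than the paper's setup: the `MoELayer` class, the layer sizes, the top-k routing, the rule of thumb of 2 FLOPs per active weight, and the definition of sparsity as the fraction of inactive experts. Router weights and biases are ignored.

```python
from dataclasses import dataclass


@dataclass
class MoELayer:
    """Hypothetical MoE feed-forward layer, used only for counting."""
    d_model: int      # hidden size
    d_ff: int         # width of each expert's feed-forward block
    num_experts: int  # experts stored in the layer
    top_k: int        # experts each token is routed to

    def expert_params(self) -> int:
        # One up-projection and one down-projection matrix per expert.
        return 2 * self.d_model * self.d_ff

    def total_params(self) -> int:
        # Every expert is stored, whether or not a token uses it.
        return self.num_experts * self.expert_params()

    def active_params(self) -> int:
        # Only the top_k routed experts take part in a token's forward pass.
        return self.top_k * self.expert_params()

    def flops_per_token(self) -> int:
        # Assumed rule of thumb: one multiply and one add per active weight.
        return 2 * self.active_params()

    def sparsity(self) -> float:
        # Assumed definition: fraction of experts that stay inactive per token.
        return 1 - self.top_k / self.num_experts


small = MoELayer(d_model=1024, d_ff=4096, num_experts=8, top_k=2)
large = MoELayer(d_model=1024, d_ff=4096, num_experts=64, top_k=2)

for name, layer in (("small", small), ("large", large)):
    print(name, layer.total_params(), layer.flops_per_token(), layer.sparsity())
```

Under these assumptions, the two layers have the same compute per token while the second holds eight times as many parameters, which is the kind of decoupling the abstract refers to.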




Continue reading

M2R2: Mixture of Multi-Rate Residuals for Efficient Transformer Inference

Residual transformations enhance the representational depth and expressive power of large language models (LLMs). However, applying static residual transformations across all tokens in auto-regressive generation leads to a suboptimal trade-off between inference efficiency and generation fidelity. Existing methods,...

Does Spatial Cognition Emerge in Frontier Models?

Not yet. We present SPACE, a benchmark that systematically evaluates spatial cognition in frontier models. Our benchmark builds on decades of research in cognitive science. It evaluates large-scale mapping abilities that are brought to bear when an organism...

SELMA: A Speech-Enabled Language Model for Virtual Assistant Interactions

In this work, we present and evaluate SELMA, a Speech-Enabled Language Model for virtual Assistant interactions that integrates audio and text as inputs to a Large Language Model (LLM). SELMA is designed to handle three primary and two...