Monday, January 13, 2025


Text is All You Need: Personalizing ASR Models using Controllable Speech Synthesis



Adapting generic speech recognition models to specific individuals is a challenging problem due to the scarcity of personalized data. Recent works have proposed boosting the amount of training data using personalized text-to-speech synthesis. Here, we ask two fundamental questions about this strategy: when is synthetic data effective for personalization, and why is it effective in those cases? To address the first question, we adapt a state-of-the-art automatic speech recognition (ASR) model to target speakers from four benchmark datasets representative of different speaker types. We show that…




Continue reading

Accelerating LLM Inference on NVIDIA GPUs with ReDrafter

Accelerating LLM inference is an important ML research problem, as auto-regressive token generation is computationally expensive and relatively slow, and improving inference efficiency can reduce latency for users. In addition to ongoing efforts to accelerate inference on Apple...

ARMADA: Augmented Reality for Robot Manipulation and Robot-Free Data Acquisition

Teleoperation for robot imitation learning is bottlenecked by hardware availability. Can high-quality robot data be collected without a physical robot? We present a system for augmenting Apple Vision Pro with real-time virtual robot feedback. By providing users with...

BayesCNS: A Unified Bayesian Approach to Address Cold Start and Non-Stationarity in Search Systems at Scale

Information Retrieval (IR) systems used in search and recommendation platforms frequently employ Learning-to-Rank (LTR) models to rank items in response to user queries. These models heavily rely on features derived from user interactions, such as clicks and engagement...