Saturday, February 8, 2025

Artificial Intelligence news


Naturalistic Head Motion Generation From Speech



Synthesizing natural head motion to accompany speech is necessary for an embodied conversational agent to provide a rich interactive experience. Most prior work assesses the quality of generated head motion by comparing it against a single ground truth using an objective metric. Yet many plausible head motion sequences can accompany a given speech utterance. In this work, we study the variation in the perceptual quality of head motions sampled from a generative model. We show that, despite providing more diverse head motions, the generative model produces motions with varying degrees of…



Continue reading

Mitigating Hallucinated Translations in Large Language Models with Hallucination-focused Preference Optimization

Machine translation (MT) is undergoing a paradigm shift, with systems based on fine-tuned large language models (LLMs) becoming increasingly competitive with traditional encoder-decoder models trained specifically for translation tasks. However, LLM-based systems are at a higher risk of...

Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models

Scaling the capacity of language models has consistently proven to be a reliable approach for improving performance and unlocking new capabilities. Capacity can be primarily defined by two dimensions: the number of model parameters and the compute per example. While...

Compact Neural TTS Voices for Accessibility

Contemporary text-to-speech solutions for accessibility applications can typically be classified into two categories: (i) device-based statistical parametric speech synthesis (SPSS) or unit selection (USEL) and (ii) cloud-based neural TTS. SPSS and USEL offer low latency and low disk...