Monday, December 9, 2024

Artificial Intelligence news

The US Department of...

The US Department of Defense has invested $2.4 million over two years...

OpenAI’s new defense contract...

At the start of 2024, OpenAI’s rules for how armed forces might...

Google DeepMind’s new AI...

Google DeepMind has unveiled an AI model that’s better at predicting the...

The startup trying to...

A startup called Exa is pitching a new spin on generative search....
HomeMachine LearningMore Speaking or...

More Speaking or More Speakers?



Self-training (ST) and self-supervised learning (SSL) methods have demonstrated strong improvements in automatic speech recognition (ASR). In spite of these advances, to the best of our knowledge, there is no analysis of how the composition of the labelled and unlabelled datasets used in these methods affects the results. In this work we aim to analyse the effect of number of speakers in the training data on a recent SSL algorithm (wav2vec 2.0), and a recent ST algorithm (slimIPL). We perform a systematic analysis on both labeled and unlabeled data by varying the number of speakers while…



Article Source link and Credit

Continue reading

Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling

Diffusion models have emerged as a powerful tool for generating high-quality images from textual descriptions. Despite their successes, these models often exhibit limited diversity in the sampled images, particularly when sampling with a high classifier-free guidance weight. To...

Towards Time-Series Reasoning with LLMs

Multi-modal large language models (MLLMs) have enabled numerous advances in understanding and reasoning in domains like vision, but we have not yet seen this broad success for time-series. Although prior works on time-series MLLMs have shown promising performance...

Private and Personalized Frequency Estimation in a Federated Setting

*Equal Contributors Motivated by the problem of next word prediction on user devices we introduce and study the problem of personalized frequency histogram estimation in a federated setting. In this problem, over some domain, each user observes a number...