Monday, June 24, 2024


Matching Latent Encoding for Audio-Text based Keyword Spotting



Using audio and text embeddings jointly for keyword spotting (KWS) has shown high-quality results, but the key challenge of how to semantically align the two embeddings for multi-word keywords of different sequence lengths remains largely unsolved. In this paper, we propose an audio-text-based end-to-end model architecture for flexible KWS, which builds upon learned audio and text embeddings. Our architecture uses a novel dynamic programming-based algorithm, Dynamic Sequence Partitioning (DSP), to optimally partition the audio sequence into the same length as the…
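The abstract is truncated before it specifies how DSP works, so the following is only a rough illustration of the general idea of partitioning a longer sequence to match a shorter one with dynamic programming. It is a generic monotonic alignment sketch, not the paper's DSP algorithm. The function name `partition_frames`, the frame-by-token similarity matrix `score`, and the objective (maximizing the summed similarity) are all assumptions made for this example.

```python
from typing import List


def partition_frames(score: List[List[float]]) -> List[int]:
    """Split frames into contiguous, non-empty segments, one per token.

    score[t][n] is a hypothetical similarity between audio frame t and
    text token n. Returns, for each frame, the index of the token whose
    segment it falls in, maximizing the total similarity.
    """
    num_frames = len(score)
    num_tokens = len(score[0])
    if num_frames < num_tokens:
        raise ValueError("need at least one frame per token")

    neg_inf = float("-inf")
    # best[t][n]: best total for frames 0..t with frame t in segment n.
    best = [[neg_inf] * num_tokens for _ in range(num_frames)]
    # advanced[t][n]: True if frame t is the first frame of segment n.
    advanced = [[False] * num_tokens for _ in range(num_frames)]
    best[0][0] = score[0][0]

    for t in range(1, num_frames):
        # Segment n cannot start before frame n, so only n <= t is reachable.
        for n in range(min(t + 1, num_tokens)):
            stay = best[t - 1][n]
            move = best[t - 1][n - 1] if n > 0 else neg_inf
            if move > stay:
                best[t][n] = move + score[t][n]
                advanced[t][n] = True
            else:
                best[t][n] = stay + score[t][n]

    # Backtrack from the last frame, which must lie in the last segment.
    assignment = [0] * num_frames
    n = num_tokens - 1
    for t in range(num_frames - 1, -1, -1):
        assignment[t] = n
        if advanced[t][n]:
            n -= 1
    return assignment


# Four audio frames, two text tokens (made-up similarity values).
example = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.7], [0.2, 0.9]]
print(partition_frames(example))
```

Under these assumptions, the frames in each segment could then be pooled (for example, averaged) so that the audio sequence ends up with one vector per text token.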




Continue reading

Conformer-Based Speech Recognition on Extreme Edge-Computing Devices

This paper was accepted at the Industry Track at NAACL 2024. With increasingly powerful compute capabilities and resources in today’s devices, traditionally compute-intensive automatic speech recognition (ASR) has been moving from the cloud to devices to better protect...

AGRaME: Any Granularity Ranking with Multi-Vector Embeddings

Ranking is a fundamental and popular problem in search. However, existing ranking algorithms usually restrict the granularity of ranking to full passages or require a specific dense index for each desired level of granularity. Such lack of flexibility...

Time Sensitive Knowledge Editing through Efficient Finetuning

Large Language Models (LLMs) have demonstrated impressive capability in different tasks and are bringing transformative changes to many domains. However, keeping the knowledge in LLMs up to date remains a challenge once pretraining is complete. It is thus essential to...