
Robustness in Multimodal Learning under Train-Test Modality Mismatch



Multimodal learning is defined as learning over multiple heterogeneous input modalities such as video, audio, and text. In this work, we are concerned with understanding how models behave as the types of modalities differ between training and deployment, a situation that naturally arises in many applications of multimodal learning to hardware platforms. We present a multimodal robustness framework to provide a systematic analysis of common multimodal representation learning methods. Further, we identify robustness shortcomings of these approaches and propose two intervention techniques leading…
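To make the mismatch setting concrete, here is a minimal sketch (not the paper's code) of how one might probe a late-fusion audio-video classifier when a modality seen in training is missing at test time. The model, feature dimensions, and the convention of zeroing out the absent modality are all illustrative assumptions:

```python
# Hypothetical late-fusion classifier over audio and video features.
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    def __init__(self, audio_dim=128, video_dim=512, hidden=256, num_classes=10):
        super().__init__()
        self.audio_enc = nn.Sequential(nn.Linear(audio_dim, hidden), nn.ReLU())
        self.video_enc = nn.Sequential(nn.Linear(video_dim, hidden), nn.ReLU())
        self.head = nn.Linear(2 * hidden, num_classes)

    def forward(self, audio, video):
        fused = torch.cat([self.audio_enc(audio), self.video_enc(video)], dim=-1)
        return self.head(fused)

@torch.no_grad()
def accuracy(model, audio, video, labels, drop="none"):
    # Simulate a train-test modality mismatch by zeroing one input stream.
    if drop == "audio":
        audio = torch.zeros_like(audio)
    elif drop == "video":
        video = torch.zeros_like(video)
    preds = model(audio, video).argmax(dim=-1)
    return (preds == labels).float().mean().item()

model = LateFusionClassifier()
audio = torch.randn(32, 128)   # stand-in audio features
video = torch.randn(32, 512)   # stand-in video features
labels = torch.randint(0, 10, (32,))
for drop in ("none", "audio", "video"):
    print(drop, accuracy(model, audio, video, labels, drop=drop))
```

The gap between the full-modality accuracy and the dropped-modality accuracies is the kind of robustness shortfall the proposed interventions aim to close.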




Conformer-Based Speech Recognition on Extreme Edge-Computing Devices

This paper was accepted at the Industry Track at NAACL 2024. With increasingly powerful compute capabilities and resources in today’s devices, traditionally compute-intensive automatic speech recognition (ASR) has been moving from the cloud to devices to better protect...
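For a sense of the architecture in the title, here is a minimal sketch that instantiates a small Conformer encoder using torchaudio's off-the-shelf implementation; the layer sizes are merely illustrative of an edge-scale model, not the paper's actual configuration:

```python
# Small Conformer encoder over log-mel features (illustrative sizes).
import torch
from torchaudio.models import Conformer

encoder = Conformer(
    input_dim=80,                   # e.g. 80-dim log-mel filterbank frames
    num_heads=4,
    ffn_dim=256,
    num_layers=4,
    depthwise_conv_kernel_size=31,
)

features = torch.randn(1, 200, 80)  # (batch, frames, mel bins)
lengths = torch.tensor([200])
encoded, encoded_lengths = encoder(features, lengths)
print(encoded.shape)                # (batch, frames, input_dim)
```

The Conformer's mix of self-attention and depthwise convolution is what makes it a common ASR backbone, and shrinking such an encoder to fit extreme edge devices is the setting the paper addresses.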

AGRaME: Any Granularity Ranking with Multi-Vector Embeddings

Ranking is a fundamental and popular problem in search. However, existing ranking algorithms usually restrict the granularity of ranking to full passages or require a specific dense index for each desired level of granularity. Such lack of flexibility...
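The multi-vector mechanism the title refers to can be illustrated with ColBERT-style late interaction ("MaxSim"), where each query token vector is matched against its best counterpart among a candidate's token vectors. This is a minimal sketch with random stand-in embeddings and hypothetical sentence groupings, not AGRaME's actual components:

```python
# ColBERT-style MaxSim scoring over token vectors grouped by sentence.
import torch
import torch.nn.functional as F

def maxsim_score(query_vecs, cand_vecs):
    # query_vecs: (q_tokens, dim); cand_vecs: (c_tokens, dim)
    sim = query_vecs @ cand_vecs.T              # token-to-token similarities
    return sim.max(dim=1).values.sum().item()   # best match per query token

dim = 64
query = F.normalize(torch.randn(8, dim), dim=-1)
# One vector per token, grouped by sentence, so the same index can be
# scored at passage granularity (all tokens) or sentence granularity.
sentences = [F.normalize(torch.randn(n, dim), dim=-1) for n in (12, 9, 15)]

passage_score = maxsim_score(query, torch.cat(sentences))
sentence_scores = [maxsim_score(query, s) for s in sentences]
print(passage_score, sentence_scores)
```

Because the score is a sum of token-level matches, the same token-vector index can be aggregated at whatever span size is needed without building a separate index per granularity, which appears to be the flexibility the abstract points to.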

Time Sensitive Knowledge Editing through Efficient Finetuning

Large Language Models (LLMs) have demonstrated impressive capabilities across different tasks and are bringing transformative changes to many domains. However, keeping the knowledge in LLMs up-to-date remains a challenge once pretraining is complete. It is thus essential to...
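As an illustration of the general approach, here is a minimal sketch of a parameter-efficient knowledge update using the Hugging Face peft library's LoRA; this is a stand-in for whatever finetuning recipe the paper proposes, and the model name and target modules are assumptions:

```python
# LoRA adapters as a cheap, revertible way to finetune in updated facts.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # only the small adapters will train

# One would then finetune on statements carrying the time-sensitive fact,
# e.g. "As of 2024, ...". The base weights stay frozen, so the edit is
# cheap and can be undone by dropping the adapter.
```

Keeping the update confined to a small adapter is what makes this kind of editing efficient relative to full retraining or continued pretraining.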