Tuesday, December 10, 2024

Artificial Intelligence news

How to use Sora,...

MIT Technology Review’s How To series helps you get things done.  Today, OpenAI released its...

The US Department of...

The US Department of Defense has invested $2.4 million over two years...

OpenAI’s new defense contract...

At the start of 2024, OpenAI’s rules for how armed forces might...

Google DeepMind’s new AI...

Google DeepMind has unveiled an AI model that’s better at predicting the...
HomeMachine LearningDo LLMs Internally...

Do LLMs Internally "Know" When They Follow Instructions?



This paper was accepted at the Foundation Model Interventions (MINT) Workshop at NeurIPS 2024.
Instruction-following is crucial for building AI agents with large language models (LLMs), as these models must adhere strictly to user-provided guidelines. However, LLMs often fail to follow even simple instructions. To improve instruction-following behavior and prevent undesirable outputs, we need a deeper understanding of how LLMs’ internal states relate to these outcomes. Our analysis of LLM internal states reveal a dimension in the input embedding space linked to successful…



Article Source link and Credit

Continue reading

Memory-Retaining Finetuning via Distillation

This paper was accepted at the Fine-Tuning in Modern Machine Learning: Principles and Scalability (FITML) Workshop at NeurIPS 2024. Large language models (LLMs) pretrained on large corpora of internet text possess much of the world's knowledge. Following pretraining, one...

Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling

Diffusion models have emerged as a powerful tool for generating high-quality images from textual descriptions. Despite their successes, these models often exhibit limited diversity in the sampled images, particularly when sampling with a high classifier-free guidance weight. To...

Towards Time-Series Reasoning with LLMs

Multi-modal large language models (MLLMs) have enabled numerous advances in understanding and reasoning in domains like vision, but we have not yet seen this broad success for time-series. Although prior works on time-series MLLMs have shown promising performance...