
Meta’s latest AI model is free for all 


Meta is going all in on open-source AI. The company is today unveiling LLaMA 2, its first large language model that’s available for anyone to use—for free. 

Since OpenAI released its hugely popular AI chatbot ChatGPT last November, tech companies have been racing to release models in hopes of overthrowing its supremacy. Meta has been in the slow lane. In February, when competitors Microsoft and Google announced their AI chatbots, Meta rolled out the first, smaller version of LLaMA, restricted to researchers. But it hopes that releasing LLaMA 2, and making it free for anyone to build commercial products on top of, will help it catch up.

The company is in fact releasing a suite of AI models, including versions of LLaMA 2 in different sizes, as well as a version of the model that people can build into a chatbot, similar to ChatGPT. Unlike ChatGPT, which people can access through OpenAI’s website, the model must be downloaded from Meta’s launch partners Microsoft Azure, Amazon Web Services, and Hugging Face.
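In practice, getting started with the downloadable weights looks something like the following. This is a minimal sketch using the Hugging Face transformers library; the checkpoint name meta-llama/Llama-2-7b-chat-hf, the prompt, and the generation settings are illustrative assumptions, and access to the weights requires accepting Meta’s license terms on Hugging Face first.

```python
# Minimal sketch: load a chat-tuned LLaMA 2 checkpoint from Hugging Face
# and generate a short completion. Checkpoint name, prompt, and sampling
# settings are illustrative assumptions, not Meta's recommended setup.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"  # assumed chat variant

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Explain in one sentence why open-source language models matter."
inputs = tokenizer(prompt, return_tensors="pt")

# Generate up to 64 new tokens with nucleus sampling (arbitrary defaults).
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```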

“This benefits the entire AI community and gives people options to go with closed-source approaches or open-source approaches for whatever suits their particular application,” says Ahmad Al-Dahle, a vice president at Meta who is leading the company’s generative AI work. “This is a really, really big moment for us.”

But caveats remain. Meta is not releasing information about the data set it used to train LLaMA 2 and cannot guarantee that the set didn’t include copyrighted works or personal data, according to a company research paper shared exclusively with MIT Technology Review. LLaMA 2 also has the same problems that plague all large language models: a propensity to produce falsehoods and offensive language.

The idea, Al-Dahle says, is that by releasing the model into the wild and letting developers and companies tinker with it, Meta will learn important lessons about how to make its models safer, less biased, and more efficient. 

A powerful open-source model like LLaMA 2 poses a considerable threat to OpenAI, says Percy Liang, director of Stanford’s Center for Research on Foundation Models. Liang was part of the team of researchers who developed Alpaca, an open-source competitor to GPT-3, an earlier version of OpenAI’s language model. 

“LLaMA 2 isn’t GPT-4,” says Liang. And in its research paper, Meta admits there is still a large gap in performance between LLaMA 2 and GPT-4, which is now OpenAI’s state-of-the-art AI language model. “But for many use cases, you don’t need GPT-4,” he adds. 

A more customizable and transparent model, such as LLaMA 2, might help companies create products and services faster than they could with a big, sophisticated proprietary model, he says.

“To have LLaMA 2 become the leading open-source alternative to OpenAI would be a huge win for Meta,” says Steve Weber, a professor at the University of California, Berkeley.   

Under the hood

Getting LLaMA 2 ready to launch required a lot of tweaking to make the model safer than its predecessor and less likely to spew toxic falsehoods, Al-Dahle says.

Meta has plenty of past gaffes to learn from. Its language model for science, Galactica, was taken offline after only three days. And its previous LLaMA model, which was meant only for research purposes, was leaked online, sparking criticism from politicians who questioned whether Meta was taking proper account of the risks associated with AI language models, such as disinformation and harassment.

To mitigate the risk of repeating these mistakes, Meta applied a mix of different machine learning techniques aimed at improving helpfulness and safety. 

Meta’s approach to training LLaMA 2 had more steps than usual for generative AI models, says Sasha Luccioni, a researcher at AI startup Hugging Face. 

The model was trained on 40% more data than its predecessor. Al-Dahle says there were two sources of training data: data that was scraped online, and a data set of feedback from human annotators, which was used to fine-tune the model to behave in a more desirable way. The company says it did not use Meta user data in LLaMA 2, and that it excluded data from sites it knew had lots of personal information.
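Fine-tuning from annotator feedback of this kind typically involves training a reward model on pairs of responses where annotators preferred one over the other, then optimizing the language model against that reward. The sketch below shows only the standard pairwise preference loss such pipelines commonly use; all names and values are generic illustrations, not Meta’s actual training code.

```python
# Generic sketch of the pairwise preference loss used to train a reward
# model on human-annotator comparisons. Illustrative only, not Meta's code.
import torch
import torch.nn.functional as F

def pairwise_ranking_loss(reward_chosen: torch.Tensor,
                          reward_rejected: torch.Tensor) -> torch.Tensor:
    """Push the reward of the response the annotator chose above the
    reward of the response they rejected."""
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Toy example: scalar rewards assigned to a batch of (chosen, rejected) pairs.
chosen = torch.tensor([1.2, 0.4, 0.9])
rejected = torch.tensor([0.3, 0.5, -0.1])
print(pairwise_ranking_loss(chosen, rejected))  # lower loss = better ranking
```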

Despite that, LLaMA 2 still spews offensive, harmful, and otherwise problematic language, just like rival models. Meta says it did not remove toxic data from the data set, because leaving it in might help LLaMA 2 detect hate speech better, and removing it could risk accidentally filtering out some demographic groups.  

Nevertheless, Meta’s commitment to openness is exciting, says Luccioni, because it allows researchers like herself to study AI models’ biases, ethics, and efficiency properly. 

The fact that LLaMA 2 is an open-source model will also allow external researchers and developers to probe it for security flaws, which will make it safer than proprietary models, Al-Dahle says. 

Liang agrees. “I’m very excited to try things out and I think it will be beneficial for the community,” he says. 


