New Qwen2 AI Model from Alibaba to Challenge Meta, OpenAI

Alibaba, the Chinese e-commerce giant, is a major player in China's AI sphere. Today, it announced the release of its latest AI model, Qwen2—and by some measures, it's the best open source option of the moment.

Developed by Alibaba Cloud, Qwen2 is the next generation of the Tongyi Qianwen (Qwen) model series, which includes the Tongyi Qianwen LLM (also known as Qwen), the visual AI model Qwen-VL, and Qwen-Audio.

The Qwen model family is pre-trained on multilingual data covering various industries and domains, and Qwen-72B is the most powerful model in the series, trained on a staggering 3 trillion tokens. In comparison, Meta's most powerful Llama 2 variant was trained on 2 trillion tokens, while Llama 3 was trained on 15 trillion tokens.

According to a recent blog post by the Qwen team, Qwen2 can handle a 128K-token context window, comparable to OpenAI's GPT-4o. Qwen2 also outperforms Meta's Llama 3 in essentially all of the most important synthetic benchmarks, the team asserts, making it the best open-source model currently available.


However, it is worth noting that on the independent Elo arena, Qwen2-72B-Instruct scores slightly better than GPT-4-0314 but below Llama 3 70B and GPT-4-0125-preview, making it the second most preferred open-source LLM among human testers to date.

Qwen2 outperforms Llama 3, Mixtral, and Qwen1.5 in synthetic benchmarks. Image: Alibaba Cloud

Qwen2 is available in five different sizes, ranging from 0.5 billion to 72 billion parameters, and the release offers significant improvements across a variety of areas of expertise. The models were also trained on data in 27 more languages than the previous version, including German, French, Spanish, Italian, and Russian, in addition to English and Chinese.

“Compared to state-of-the-art open-source language models, including the previously released Qwen1.5, Qwen2 has generally surpassed most open-source models and demonstrated competitiveness against proprietary models across a series of benchmarks targeting language understanding, language generation, multilingual capability, coding, mathematics, and reasoning,” the Qwen team said on the model's official HuggingFace page.

Qwen2 models show a remarkable understanding of long contexts. Qwen2-72B-Instruct can handle information-extraction tasks placed anywhere in its context without error, and it passes the “Needle in a Haystack” test almost perfectly. This is important because a model's performance typically begins to degrade the more data we feed it in a single session.

Qwen2 performs admirably in the “Needle in a Haystack” test. Image: Alibaba Cloud
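
For readers curious what that test actually checks, below is a minimal, hypothetical sketch of a “Needle in a Haystack” style probe. The filler sentence, needle, and passcode are invented for illustration; in practice the resulting prompt would be sent to Qwen2 (for instance via the HuggingFace Space or the local setup shown further down) and the reply checked for the buried fact.

# Minimal sketch of a "Needle in a Haystack" check: bury one fact in a long
# distractor context and ask the model to retrieve it. All strings here are
# hypothetical examples, not part of the official benchmark.
filler_sentence = "The marketplace was quiet and nothing notable happened that day. "
needle = "The secret passcode is 417-alpha. "

def build_haystack_prompt(depth: float, n_sentences: int = 4000) -> str:
    """Place the needle at a relative depth (0.0 = start, 1.0 = end) of the haystack."""
    haystack = [filler_sentence] * n_sentences
    haystack.insert(int(depth * n_sentences), needle)
    context = "".join(haystack)
    return (
        "Read the text below and answer the question at the end.\n\n"
        + context
        + "\n\nQuestion: What is the secret passcode?"
    )

prompt = build_haystack_prompt(depth=0.5)  # needle buried mid-context
print(f"Prompt length: {len(prompt)} characters")
# A model with reliable long-context recall should answer "417-alpha"
# no matter which depth value is used.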

With this release, the Qwen team has also changed the licensing of the models. While Qwen2-72B and its instruction-tuned variants continue to use the original Qianwen License, all other models have adopted Apache 2.0, the standard in the open-source software world.

“In the near future, we will continue to open source new models to accelerate open source AI,” Alibaba Cloud said in an official blog post.

Decrypt tested the model and found it capable of understanding tasks in multiple languages. The model is also censored, especially on themes considered sensitive in China. This appears to be in line with Alibaba's claim that Qwen2 is less likely to provide unsafe results (illegal activity, fraud, pornography, and privacy violations), no matter what language it is prompted in.

Qwen2's response to: “Is Taiwan a country?”
ChatGPT's response to: “Is Taiwan a country?”

It also has a good grasp of system prompts, meaning that conditions applied through them have a stronger effect on its answers. For example, when asked to act as a helpful assistant with legal knowledge versus an expert lawyer who always responds based on the law, the responses showed major differences. It gave advice similar to that given by GPT-4o, but was more concise.

Qwen2's response to: “My neighbor insulted me.”
ChatGPT's response to: “My neighbor insulted me.”
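
The system-prompt behavior described above can be reproduced locally with HuggingFace's transformers library. The sketch below assumes the Qwen/Qwen2-7B-Instruct checkpoint (a smaller sibling of the 72B model) and enough GPU memory to load it; the system and user messages are illustrative, not the exact prompts Decrypt used.

# Hypothetical sketch: steering Qwen2 with a system prompt via transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2-7B-Instruct"  # assumed checkpoint; any Qwen2 Instruct size works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

messages = [
    # Swapping this system message (e.g., for "a helpful assistant with legal knowledge")
    # noticeably changes the tone and framing of the answer.
    {"role": "system", "content": "You are an expert lawyer who always responds based on the law."},
    {"role": "user", "content": "My neighbor insulted me. What can I do?"},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256)
reply = tokenizer.decode(output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(reply)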

The next model update will bring multimodality to the Qwen2 LLM, potentially merging the entire family into one powerful model, the team said. “We will also extend the Qwen2 language models to multimodal, capable of understanding both visual and audio data,” the team added.

Qwen2 is available to try online through HuggingFace Spaces. People with enough compute to run it locally can download the weights for free, also via HuggingFace.
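
As a rough sketch of the local route, the weights can be fetched with the huggingface_hub client; the repo id below assumes the 7B instruct checkpoint from the Qwen organization on HuggingFace.

# Hypothetical sketch: downloading Qwen2 weights for local use.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="Qwen/Qwen2-7B-Instruct")
print("Weights downloaded to:", local_dir)
# The returned directory can then be passed to transformers' from_pretrained()
# calls in place of the hub repo id.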

The Qwen2 model could be a good option for those willing to bet on open-source AI. It has a larger token context window than most other models, making it more capable than Meta's Llama 3. Also, because of its license, fine-tuned versions shared by the community can be built on top of it, further improving its output and overcoming bias.

Edited by Ryan Ozawa.
