Mistral AI has chosen a ‘mixture of experts’ model to challenge GPT-3.5

Paris-based startup Mistral AI, recently valued at $2 billion, has released Mixtral, an open large language model (LLM) that it says is extremely efficient and outperforms OpenAI's GPT-3.5 on many benchmarks.

Mistral received a substantial Series A investment from Andreessen Horowitz (a16z), a venture capital firm known for its strategic investments in transformative technology sectors, particularly AI. Other tech giants like Nvidia and Salesforce also participated in the funding round.

“Mistral is at the center of a small but passionate developer community that thrives around open source AI,” Andreessen Horowitz said in the funding announcement. “Well-tuned models in the community now routinely dominate open-source leaderboards (and even beat closed-source models on some tasks).”

Mixtral uses a technique called sparse mixture of experts (MoE), which makes the model more powerful and efficient than its predecessor, Mistral 7B, and even than larger competitors.

A mixture of experts (MoE) is a machine learning approach in which developers train or deploy multiple smaller "expert" models to handle a complex task. Each expert model is trained on a specific topic or domain. When a problem comes in, a gating mechanism selects the most relevant experts from the pool, and those experts apply their specialized training to produce the output best suited to their domain.
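
To make the routing idea concrete, here is a minimal sketch in Python. The expert count, hidden size, and top-2 selection below are illustrative choices for this toy example rather than Mixtral's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 8   # illustrative number of experts for this toy example
DIM = 16          # illustrative hidden size
TOP_K = 2         # route each token to its 2 highest-scoring experts

# Each "expert" here is just a random linear layer standing in for a
# specialized feed-forward network.
experts = [rng.normal(size=(DIM, DIM)) for _ in range(NUM_EXPERTS)]
gate = rng.normal(size=(DIM, NUM_EXPERTS))   # gating network weights

def moe_forward(token):
    """Route one token vector through its top-k experts and mix their outputs."""
    scores = token @ gate                       # one routing score per expert
    top = np.argsort(scores)[-TOP_K:]           # indices of the best-scoring experts
    weights = np.exp(scores[top] - scores[top].max())
    weights /= weights.sum()                    # softmax over the chosen experts only
    # Only the selected experts actually run for this token.
    return sum(w * (token @ experts[i]) for w, i in zip(weights, top))

print(moe_forward(rng.normal(size=DIM)).shape)  # -> (16,)
```

The key point is that only the selected experts run for a given input, so most of the layer's parameters sit idle on any single token.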

MoE can improve capacity, efficiency, and accuracy in deep learning models. The secret sauce that sets Mixtral apart is that it can compete with a model trained on 70 billion parameters while activating only a fraction of that many parameters for each token.

“Mixtral has 46.7B total parameters but only uses 12.9B parameters per token,” Mistral AI said, so “it processes input and generates output at the same speed and for the same cost as a 12.9B model.”
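
As a rough back-of-envelope check on those figures, the arithmetic of a sparse MoE looks like the sketch below; the split between shared and per-expert parameters is an assumption chosen only so the totals land near the published numbers, not Mixtral's real layout.

```python
# Back-of-envelope arithmetic for a sparse MoE's stored vs. active parameters.
NUM_EXPERTS = 8          # Mixtral ships eight experts per MoE layer
ACTIVE_EXPERTS = 2       # and routes each token to two of them
expert_params = 5.63e9   # assumed parameters per expert, all layers combined
shared_params = 1.64e9   # assumed parameters every token passes through

total_params = shared_params + NUM_EXPERTS * expert_params
active_params = shared_params + ACTIVE_EXPERTS * expert_params

print(f"total stored:     {total_params / 1e9:.1f}B")   # ~46.7B
print(f"active per token: {active_params / 1e9:.1f}B")  # ~12.9B
```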

“Mixtral outperforms Llama 2 70B on most benchmarks with 6x faster inference, and matches or outperforms GPT-3.5 on most standard benchmarks,” Mistral AI said on its official blog.

Image: Mistral AI

Mixtral is also released under the permissive Apache 2.0 license, which allows developers to freely test, run, improve, and even build custom solutions on top of the model.

But there is debate over whether Mixtral is truly open source: Mistral AI says only “open weights” have been released, and the license on the core model restricts its use to compete against Mistral AI. The startup also did not release the training dataset or the code used to create the model, which a fully open-source project would include.

The company claims that Mixtral has also been fine-tuned to work well in languages other than English. “Mixtral 8x7B masters French, German, Spanish, Italian and English,” Mistral AI said, citing high scores on standardized multilingual benchmarks.

A fine-tuned variant called Mixtral 8x7B Instruct was also released for careful instruction following, scoring 8.3 on the MT-Bench benchmark. That makes it the best open-source model on the benchmark at present.

Mistral's new model promises a groundbreaking mixture-of-experts architecture, strong multilingual capabilities, and broad accessibility. Considering the startup was founded only months ago, the open-source community is in for an exciting time.

Mixtral is available for download on Hugging Face, and users can also interact with the instruction-tuned version online.
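
For developers who want to try it locally, loading the base model with Hugging Face's transformers library might look roughly like this sketch, which assumes the mistralai/Mixtral-8x7B-v0.1 repository name and hardware with enough memory to hold the full weights.

```python
# Minimal sketch of loading Mixtral from Hugging Face with the transformers library.
# The repository name is assumed; device_map="auto" relies on the accelerate package.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "The mistral is a strong wind that"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```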

Edited by Ryan Ozawa.
