Microsoft and NVIDIA Revolutionise AI with New Advanced Language Models

Published by
Samuel Bolaji

Microsoft and NVIDIA have each introduced pioneering language models, leveraging innovative techniques to enhance performance and efficiency. Microsoft’s Phi-3.5 series and NVIDIA’s Mistral-NeMo-Minitron 8B stand out as leading examples of the latest advancements in AI technology.

Microsoft’s Phi-3.5: Innovating with Mixture of Experts

Microsoft has unveiled its new Phi-3.5 family of language models, marking a notable advancement in AI capabilities. The series includes three variants: Phi-3.5-Vision, Phi-3.5-MoE (Mixture of Experts), and Phi-3.5-Mini. Notably, Phi-3.5-MoE is Microsoft’s first foray into Mixture of Experts technology, a method that lets the model selectively engage different parts of its neural network for each input, enhancing efficiency and output quality.

The Mixture of Experts approach enables the Phi-3.5-MoE model to operate with only 6.6 billion parameters active at a time, despite comprising sixteen underlying sub-networks, or “experts.” This selective activation allows the model to perform at a level comparable to more complex systems, such as GPT-4o-mini, while remaining leaner and more computationally efficient. The innovation not only reduces the computational power required for training but also offers significant cost savings: the Phi-3.5-MoE was trained on 4.9 trillion tokens using 512 H100 GPUs, demonstrating its capability to handle extensive datasets with relative ease.
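To make the routing idea concrete, here is a minimal, purely illustrative sketch of top-k expert selection in Python using NumPy. The expert count (sixteen) follows the figure above; the top-2 routing, tiny hidden size, random weights, and function names are assumptions for the example and bear no relation to Phi-3.5-MoE’s actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

N_EXPERTS = 16   # sixteen underlying experts, as described above
TOP_K = 2        # assumption: only a couple of experts fire per token
D_MODEL = 8      # toy hidden size for illustration

# Each "expert" is a tiny feed-forward block (here: a single weight matrix).
experts = [rng.standard_normal((D_MODEL, D_MODEL)) for _ in range(N_EXPERTS)]
router_w = rng.standard_normal((D_MODEL, N_EXPERTS))

def moe_forward(x):
    """Route a token vector x to its top-k experts and mix their outputs."""
    scores = x @ router_w                  # router score for each expert
    active = np.argsort(scores)[-TOP_K:]   # indices of the k best experts
    gate = np.exp(scores[active])
    gate /= gate.sum()                     # softmax over the chosen experts only
    # Only TOP_K of the N_EXPERTS expert networks do any work for this token.
    out = sum(g * (x @ experts[i]) for g, i in zip(gate, active))
    return out, active

x = rng.standard_normal(D_MODEL)
y, active = moe_forward(x)
print(f"{len(active)} of {N_EXPERTS} experts were active for this token")
```

Because only two of the sixteen expert matrices are ever multiplied per token, compute scales with the active parameters rather than the full parameter count, which is the efficiency gain the article describes.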

NVIDIA’s Mistral-NeMo-Minitron 8B: Efficiency Through Pruning and Distillation

Meanwhile, NVIDIA has introduced the Mistral-NeMo-Minitron 8B, a streamlined version of its earlier Mistral NeMo 12B model. The Mistral-NeMo-Minitron 8B employs a sophisticated method of model optimisation known as width pruning, combined with knowledge distillation. This technique refines the model by reducing its complexity without sacrificing performance.

Width pruning shrinks the neural network along specific dimensions, keeping the most important components while eliminating redundancies. NVIDIA achieved this by pruning both the embedding and MLP intermediate dimensions of the Mistral NeMo 12B model. Subsequent knowledge distillation allowed NVIDIA to train the smaller Minitron 8B model to retain much of the predictive accuracy of its larger predecessor. This method reduces the required training data by a factor of more than 40, making the process both cost-effective and environmentally friendly.
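As a rough illustration of the two steps, the toy NumPy sketch below prunes the intermediate dimension of a small feed-forward layer using a magnitude-based importance score, then distils the pruned “student” toward the “teacher’s” output distribution by gradient descent on their KL divergence. All sizes, weights, and the importance heuristic are invented for the example; NVIDIA’s actual pruning criteria and distillation recipe are considerably more sophisticated.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "teacher": a wide two-matrix feed-forward layer standing in for the 12B model.
D_IN, D_WIDE, D_NARROW = 6, 12, 8
teacher_w1 = rng.standard_normal((D_IN, D_WIDE))
teacher_w2 = rng.standard_normal((D_WIDE, D_IN))

# -- Width pruning: keep the intermediate channels with the largest weight mass. --
importance = np.abs(teacher_w1).sum(axis=0) + np.abs(teacher_w2).sum(axis=1)
keep = np.sort(np.argsort(importance)[-D_NARROW:])
student_w1 = teacher_w1[:, keep].copy()
student_w2 = teacher_w2[keep, :].copy()

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def kl(p, q):
    """KL divergence between two probability vectors."""
    return float(np.sum(p * np.log(p / q)))

# -- Distillation: nudge the student toward the teacher's output distribution. --
x = rng.standard_normal(D_IN)
teacher_probs = softmax(x @ teacher_w1 @ teacher_w2)
init_loss = kl(teacher_probs, softmax(x @ student_w1 @ student_w2))

lr = 0.02
for _ in range(300):
    h = x @ student_w1
    p = softmax(h @ student_w2)
    # Gradient of KL(teacher || student) w.r.t. the student logits is p - target.
    student_w2 -= lr * np.outer(h, p - teacher_probs)

loss = kl(teacher_probs, softmax(x @ student_w1 @ student_w2))
print(f"distillation loss: {init_loss:.4f} -> {loss:.4f}")
```

The student keeps the teacher’s most useful channels and then learns to imitate the teacher’s outputs directly, which is why far less fresh training data is needed than when training a small model from scratch.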

Implications for the Future of AI

Both Microsoft and NVIDIA’s approaches represent significant progress in AI model development, focusing on creating more powerful, efficient, and adaptable systems. Microsoft’s Phi-3.5-MoE showcases the potential of Mixture of Experts technology in improving model performance while keeping resource demands manageable. Similarly, NVIDIA’s Mistral-NeMo-Minitron 8B highlights the effectiveness of pruning and distillation in developing scalable, high-performance AI models.

These advancements indicate a promising future where AI systems can be tailored to specific tasks, improving accuracy and efficiency while minimising resource consumption. The widespread availability of these models on platforms like Hugging Face also suggests a democratisation of AI technology, enabling developers to build and customise applications more readily.

As the AI landscape continues to evolve, the innovations introduced by Microsoft and NVIDIA set the stage for even more advanced models, pushing the boundaries of what artificial intelligence can achieve. With these developments, AI is poised to become even more integral to various industries, driving innovation and transforming how we interact with technology.

Samuel Bolaji

Samuel Bolaji, an alumnus of the Commonwealth Scholarship Commission, holds a Master of Letters in Publishing Studies from the University of Stirling, Scotland, United Kingdom, and a Bachelor of Arts in English from the University of Lagos, Nigeria. He is an experienced researcher, multimedia journalist, writer, and editor. A former Chief Correspondent, Acting Op-Ed Editor, and Acting Metro Editor at The PUNCH Newspaper, Samuel is currently the Editor at Arbiterz.
