Microsoft Launches Phi-3: A New Family of Compact Language Models
Ryan Daws is a senior editor at TechForge Media with a background spanning over a decade in tech journalism. His expertise lies in identifying the latest technological trends, dissecting complex topics, and weaving compelling narratives around the most cutting-edge developments. His articles and interviews with leading industry figures have gained him recognition as a key influencer by organisations such as Onalytica. Publications under his stewardship have since gained recognition from leading analyst houses like Forrester for their performance. Find him on X (@gadget_ry) or Mastodon (@gadgetry@techhub.social)
Microsoft has announced the Phi-3 family of open small language models (SLMs), touting them as the most capable and cost-effective of their size available. The innovative training approach developed by Microsoft researchers has allowed the Phi-3 models to outperform larger models on language, coding, and math benchmarks.
“What we’re going to start to see is not a shift from large to small, but a shift from a singular category of models to a portfolio of models where customers get the ability to make a decision on what is the best model for their scenario,” said Sonali Yadav, Principal Product Manager for Generative AI at Microsoft.
The first Phi-3 model, the 3.8-billion-parameter Phi-3-mini, is now publicly available in the Azure AI Model Catalog, on Hugging Face and Ollama, and as an NVIDIA NIM microservice. Despite its compact size, Phi-3-mini outperforms models twice its size. Additional Phi-3 models, Phi-3-small (7B parameters) and Phi-3-medium (14B parameters), will follow soon.
phi-3-mini: a 3.8B model matching Mixtral 8x7B and GPT-3.5; plus a 7B model that matches Llama 3 8B on many benchmarks; plus a 14B model. https://t.co/2h0xahzUUS
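For readers who want to try the model, the listings above make local experimentation straightforward. The snippet below is a minimal sketch of local inference with the Hugging Face Transformers library; the model repository name is an assumption based on the announcement, so check the Azure AI Model Catalog or Hugging Face for the official identifier.

# Minimal sketch: running Phi-3-mini locally with Hugging Face Transformers.
# The model ID below is assumed from the announcement; verify the official name.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"  # assumed repository name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Summarise why small language models matter."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))

Ollama and the NVIDIA NIM microservice, both mentioned above, offer alternative routes to the same model.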
“Some customers may only need small models, some will need big models and many are going to want to combine both in a variety of ways,” said Luis Vargas, Microsoft VP of AI.
The key advantage of SLMs is their smaller size, which enables on-device deployment for low-latency AI experiences without network connectivity. Potential use cases include smart sensors, cameras, farming equipment, and more. Keeping data on the device also brings privacy benefits.
Large language models (LLMs) excel at complex reasoning over vast datasets—strengths suited to applications like drug discovery by understanding interactions across scientific literature. However, SLMs offer a compelling alternative for simpler query answering, summarisation, content generation, and the like.
“Rather than chasing ever-larger models, Microsoft is developing tools with more carefully curated data and specialised training,” commented Victor Botev, CTO and Co-Founder of Iris.ai.
“This allows for improved performance and reasoning abilities without the massive computational costs of models with trillions of parameters. Fulfilling this promise would mean tearing down a huge adoption barrier for businesses looking for AI solutions.”
What enabled Microsoft’s SLM quality leap was an innovative data filtering and generation approach inspired by bedtime story books.
“Instead of training on just raw web data, why don’t you look for data which is of extremely high quality?” asked Sebastien Bubeck, Microsoft VP leading SLM research.
Ronen Eldan’s habit of reading bedtime stories to his daughter every night inspired him to create the ‘TinyStories’ dataset: millions of simple narratives generated by prompting a large model with combinations of words a four-year-old would understand. Remarkably, a 10-million-parameter model trained on TinyStories could produce fluent stories with impeccable grammar.
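The generation recipe described above lends itself to a simple loop: sample a few child-friendly words, ask a large model to write a short story using them, and collect the results. The sketch below illustrates the idea with an OpenAI-style chat API; the word list, prompt wording, and model name are illustrative assumptions rather than the actual TinyStories pipeline.

# Illustrative sketch of TinyStories-style data generation: prompt a large model
# with a few simple words and collect the resulting short stories.
# The vocabulary, prompt, and model name are assumptions, not Microsoft's recipe.
import random
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
SIMPLE_WORDS = ["dog", "ball", "rain", "happy", "jump", "tree", "cake", "friend"]

def generate_tiny_story() -> str:
    words = random.sample(SIMPLE_WORDS, 3)
    prompt = (
        "Write a very short story that a four-year-old could understand. "
        f"Use only simple words and include the words: {', '.join(words)}."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder generator model
        messages=[{"role": "user", "content": prompt}],
        max_tokens=200,
    )
    return response.choices[0].message.content

stories = [generate_tiny_story() for _ in range(5)]  # scaled to millions in practice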
Building on that initial success, the team curated high-quality web data, vetted for educational value, to form the ‘CodeTextbook’ dataset. Creating it involved multiple rounds of prompting, generating, and filtering by both humans and large AI models.
“Not all of what we produce is utilised,” says Bubeck. “We put a lot of thought and effort into creating synthetic data.”
The first-rate training data had a transformative effect. According to Bubeck, “The language model finds reading and understanding material that resembles textbook content much simpler.”
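To make the filtering stage more concrete, one common pattern is to have a strong model act as a judge of educational value and keep only the documents it rates highly. The sketch below is a simplified illustration of that idea; the scoring prompt, scale, threshold, and model name are assumptions, not the CodeTextbook recipe itself.

# Illustrative sketch of model-based filtering for educational value.
# The rubric, threshold, and judge model are assumptions, not Microsoft's pipeline.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def educational_score(document: str) -> int:
    prompt = (
        "Rate the educational value of the following text for teaching coding "
        "and reasoning, on a scale of 1 (none) to 5 (textbook quality). "
        "Reply with a single digit.\n\n" + document[:4000]
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder judge model
        messages=[{"role": "user", "content": prompt}],
        max_tokens=1,
    )
    return int(response.choices[0].message.content.strip())

def filter_corpus(documents: list[str]) -> list[str]:
    # Keep only documents the judge rates 4 or higher.
    return [doc for doc in documents if educational_score(doc) >= 4]

In practice, such an automated pass would be combined with human review and repeated over several rounds, as the article describes.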
Despite the thoughtful data curation, Microsoft stresses the application of extra safety measures to the Phi-3 release – mirroring its typical protocols for all generative AI models.
“Just as with each release of generative AI models, the product and responsible AI teams at Microsoft harnessed a multi-layered strategy to manage and mitigate risks associated with the development of Phi-3 models,” Microsoft explained in a blog post.
This included additional training examples to reinforce expected behaviours, red-teaming assessments to identify vulnerabilities, and Azure AI tooling that customers can use to build trustworthy applications on top of Phi-3.
See also: Microsoft to forge AI partnerships with South Korean tech leaders
Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.
Explore other upcoming enterprise technology events and webinars powered by TechForge here.
Tags: ai, artificial intelligence, language models, microsoft, open source, phi-3, small language models