The Indian Express (Delhi Edition)

MICROSOFT LAUNCHES PHI-3: WHAT ARE SLMS, AND WHAT CAN THEY DO?

- BIJIN JOSE

MICROSOFT last week unveiled the latest version of its lightweight AI model, the Phi-3-mini, reportedly the first among three small language models (SLMs) that the company plans to release.

SLMs are compact versions of large language models (LLMs), which can comprehend and generate human language text. The new model expands the selection of high-quality models for customers, offering more practical choices as they build generative AI applications.

Phi-3-mini basics

Phi-3-mini is an SLM that has 3.8 billion parameters (a measure of the size and complexity of an AI model), and is trained on a smaller data set than LLMs such as OpenAI's GPT-4.

The AI model is instruction-tuned, which means it can follow a range of instructions given by the user, reflecting how people normally communicate.

It is the first model in its class to support a context window — which helps an AI model recall information during a session — of up to 128,000 tokens, with little impact on quality. A token is the fundamental unit of data used by a language model to process and generate text.

Microsoft has described Phi-3 as a family of open AI models that are the most capable and cost-effective SLMs available. In the coming weeks, the company will add more models, including Phi-3-small and Phi-3-medium, to the Phi-3 family.

Different from LLMs

SLMs like Phi-3-mini are more streamlined versions of LLMs. When compared to LLMs, smaller AI models are cost-effective to operate and perform better on smaller devices like laptops and smartphones.

While LLMs are trained on massive general data, SLMs stand out with their specialisation. Through fine-tuning, SLMs can be customised for specific tasks — achieving accuracy and efficiency in the process. Most SLMs undergo targeted training, which demands considerably less computing power and energy compared to LLMs.

SLMs also differ from LLMs when it comes to inference latency, which is the time taken for a model to make predictions or decisions after receiving input. The compact size of SLMs allows for quicker processing, making them more responsive and apt for real-time applications such as virtual assistants and chatbots. As developing and deploying SLMs is often cheaper than doing so with LLMs, organisations and research groups with limited budgets prefer using them.

In a blog post, Microsoft mentioned use cases from India: “ITC, a leading business conglomerate based in India, is leveraging Phi-3 as part of their continued collaboration with Microsoft on the copilot for Krishi Mitra, a farmer-facing app that reaches over a million farmers.”

Phi-3-mini’s capabilities

Phi-3-mini has outperformed models of the same size and the next size up across a variety of benchmarks in areas like language, reasoning, coding, and maths, according to Microsoft.

It requires less computational power and has much better latency. With longer context windows, it is capable of taking in and reasoning over large text content such as documents, web pages, and code. Microsoft claims that Phi-3-mini demonstrates strong reasoning and logic capabilities, making it ideal for analytical tasks.
