The Sunday Mail (Zimbabwe)

Understanding large language models

THIS past week, I listened to Bill Gates respond to CNN international correspondent Larry Madowo’s question about whether he genuinely thought artificial intelligence (AI) was a game changer.

- ◆ John Tseriwa is a tech entrepreneur and a digital transformation advocate focusing on delivering business solutions powered by Fourth Industrial Revolution technologies. He can be contacted at: info@johntseriwa.com or +263773289802. (4IR Simplified)

HIS response inspired me to write this article.

“Well, it’s very early days in AI,” Gates said. “And that’s why it’s so impressive that already we see these 50 innovators here in Africa after the first LLM breakthrough, uh, ChatGPT.”

Gates used the abbreviation LLM as if it were a common word, but how many people know what it stands for or what it means?

So, this week, I will talk about large language models (LLMs). In simple terms, an LLM is a type of AI that can generate and understand text. It is like a super-smart computer that can read and write like a human.

LLMs are trained on massive amounts of text data. This data can include books, articles, websites and even code. The more data an LLM is trained on, the better it will be at understanding and generating text.

LLMs are deep learning algorithms that can recognise, summarise, translate, predict and generate content using very large datasets.

LLMs are built on neural networks (NNs), computing systems inspired by the human brain. These neural networks are made up of layers of connected nodes, much like neurons.

LLMs are based on transformer models and are trained on vast amounts of data, which is what makes them large. This allows them to understand, translate, predict or create text or other content.

Another key term is the transformer model, the most common architecture for an LLM. It has an encoder and a decoder.

A transformer model processes data by breaking the input into tokens and then performing mathematical calculations to work out how the tokens relate to one another. This lets the computer see the patterns a human would notice when reading the same text.
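For curious readers, the idea of tokens can be sketched in a few lines of Python. This is a toy illustration only: real LLMs use subword schemes such as byte-pair encoding, and the `tokenise` and `build_vocab` helpers below are invented for this example.

```python
# Toy illustration of tokenisation. Real LLMs use subword schemes
# (e.g. byte-pair encoding), but the idea is the same: text is broken
# into tokens, and each token is mapped to a number the model can do
# mathematics on.

def tokenise(text):
    """Split text into lowercase word tokens (a toy stand-in for BPE)."""
    return text.lower().replace("?", "").split()

def build_vocab(tokens):
    """Assign each distinct token a numeric ID, in order of appearance."""
    return {tok: i for i, tok in enumerate(dict.fromkeys(tokens))}

tokens = tokenise("Tell me about the Zimbabwean economy")
vocab = build_vocab(tokens)
ids = [vocab[t] for t in tokens]
print(tokens)  # ['tell', 'me', 'about', 'the', 'zimbabwean', 'economy']
print(ids)     # [0, 1, 2, 3, 4, 5]
```

The model never sees letters at all; it works entirely on these numeric IDs, which is why tokenisation is the very first step.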

Before an LLM can process text input and generate output content, it must undergo training and fine-tuning.

Training

LLMs are pre-trained on vast amounts of text data from sources like Wikipedia and GitHub.

This data contains trillions of words, and its quality affects the model’s performance. At this stage, the LLM learns without specific instructions. It learns the meanings of words, the relationships between them and how to use them in different contexts. For example, it learns to tell whether “leave” means “a day off” or “to depart”.

Fine-tuning

To make an LLM do a specific task, such as translation, it must be fine-tuned. Fine-tuning improves the model’s performance for specific tasks. Prompt-tuning is like fine-tuning, but it uses fewer or no examples to train the model for a specific task. It uses natural language prompts to guide the model’s output.
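A simple way to picture prompting is as assembling a piece of text that shows the model a task and a few worked examples. The sketch below builds such a “few-shot” prompt for an English-to-Shona translation task; `build_prompt` is a hypothetical helper written for this article, not a real library function, and the example translations are for illustration.

```python
# Hedged sketch: building a few-shot prompt that steers an LLM toward
# a task (translation) without retraining it. build_prompt is a
# hypothetical helper invented for this illustration.

def build_prompt(task, examples, query):
    """Assemble a natural-language prompt: task, worked examples, query."""
    lines = [task]
    for english, shona in examples:
        lines.append(f"English: {english}\nShona: {shona}")
    lines.append(f"English: {query}\nShona:")
    return "\n\n".join(lines)

prompt = build_prompt(
    "Translate English to Shona.",
    [("Hello", "Mhoro"), ("Thank you", "Ndatenda")],
    "Good morning",
)
print(prompt)
```

The model receives only this text; the examples inside the prompt do the “teaching”, which is why prompt-tuning needs few or no training examples compared with full fine-tuning.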

LLMs are the power behind generative AI tools like ChatGPT, which generate text based on inputs. Given a prompt, for example, “Tell me about the Zimbabwean economy”, they can produce a relevant response.

Large language models empower customer service chatbots or conversational AI to interact with customers, understand the intent and meaning of their questions or feedback, and provide relevant and helpful answers in return.
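To make the idea of “understanding intent” concrete, here is a deliberately crude keyword matcher standing in for what an LLM-powered chatbot does far more flexibly. The intent names and keyword lists are invented for this sketch.

```python
# Toy intent detector: a crude stand-in for the intent understanding
# an LLM-powered chatbot performs. Intents and keywords are invented
# for illustration only.

INTENTS = {
    "billing": {"invoice", "bill", "billing", "charge", "payment"},
    "support": {"broken", "error", "help", "fix"},
    "greeting": {"hello", "hi", "morning"},
}

def detect_intent(message):
    """Return the intent whose keywords overlap the message the most."""
    words = set(message.lower().replace("?", "").replace(",", "").split())
    best = max(INTENTS, key=lambda name: len(INTENTS[name] & words))
    return best if INTENTS[best] & words else "unknown"

print(detect_intent("Hello, I need help with a billing error"))  # support
```

A real LLM is not limited to fixed keyword lists; it can read a message it has never seen and still infer what the customer wants, which is exactly what makes it so useful for customer service.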

Challenges of LLMs

LLMs are not perfect, and they face a variety of challenges, including:

1. Hallucinations: LLMs can sometimes generate false outputs that do not match the user’s intent. This is because LLMs are trained to predict the next word or phrase that is grammatically correct, but sometimes they cannot fully understand human meaning.

2. Bias: LLMs are trained on massive datasets of text, which can reflect the biases of the real world. This means LLMs can sometimes generate outputs that are biased or offensive.

3. Safety: LLMs can generate harmful content, such as disinformation, hate speech and spam. Developing safeguards to prevent LLMs from being used for malicious purposes is essential.
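The hallucination problem follows directly from next-word prediction. The toy “language model” below only counts which word most often follows another in its tiny training text; it is thousands of times simpler than a real LLM, but it shows how a system can pick a fluent-looking next word with no notion of whether the result is true.

```python
# Toy bigram "language model": for each word, count which word follows
# it in the training text, then predict the most frequent follower.
# Real LLMs are vastly larger, but the principle is the same, and it
# shows why fluent output is not the same as factual output.
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count, for each word, how often each other word follows it."""
    counts = defaultdict(Counter)
    words = corpus.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts, word):
    """Predict the most frequent follower of the given word."""
    return counts[word.lower()].most_common(1)[0][0]

bigrams = train_bigram(
    "the model predicts the next word the model predicts text"
)
print(predict_next(bigrams, "the"))  # model
```

The predictor has no idea what a “model” is; it simply emits whatever was statistically most common, which is the seed of both an LLM’s fluency and its hallucinations.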

Despite these challenges, LLMs are a powerful new technology with the potential to revolutionise the way we interact with computers.

Researchers are working on ways of mitigating the challenges of LLMs. As LLMs continue to improve, they are likely to become increasingly helpful and widespread.

[Image caption: An LLM is a type of AI that can generate and understand text]
