The Manila Times

On the next level of generative AI

- BY GENIE YUAN

ORGANIZATIONS, industries and even governments across the world are in a race to embrace artificial intelligence (AI) and unlock its wide array of benefits. The Philippine government is no different, with AI embedded in its plans to transform the country.

Within the multitude of different AI models, generative AI and large language models (LLMs) have significant relevance to businesses today due to their advanced natural language processing capabilities. For example, LLMs empower customers to interact easily with chatbots and virtual assistants for a smoother customer experience. This could be extended to other applications, such as data analytics platforms, to make complex business tasks more accessible.

Generating high-quality written content for marketing teams, at a fraction of the usual time, is another use of LLMs that is currently being explored. In the same way, these models could also write code faster, allowing business users within an organization to perform low-code development and freeing development teams for other critical tasks.

The way generative AI and LLMs perform such tasks is impressive, but for now, they are far from perfect.

Challenges in utilizing LLMs

While LLMs seem to have a human-like understanding of natural language, these models deliver fluent and coherent responses through probabilities. They recognize patterns within their training data, such as when the use of a word or a series of words in a specific order is usually followed by another specific word or set of words.
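The pattern-based prediction described above can be illustrated with a toy bigram model: count which word tends to follow which, then predict the most probable next word. Real LLMs use neural networks trained on vast corpora, but the probabilistic principle is similar. The tiny corpus below is an invented example.

```python
from collections import Counter, defaultdict

# Invented toy corpus; a real model would train on billions of words.
corpus = "the customer asked the chatbot and the chatbot answered".split()

# Count, for each word, which words follow it and how often.
follows = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current][nxt] += 1

def predict_next(word):
    """Return the most frequent follower of `word` in the training data."""
    counts = follows[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # "chatbot" follows "the" most often in this corpus
```

The model answers fluently for words it has seen, but a word outside its training data yields nothing sensible at all, which mirrors, in miniature, why LLMs falter on facts absent from their training data.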

As AI models, LLMs are designed to finish the task given to them, which in their case is answering a prompt or question. The problem is that because they do not understand natural language the way humans do, they answer based only on the patterns they learned from their training data. As a result, an LLM's responses can be coherent and fluent, guided by the patterns mentioned above, yet factually wrong or irrelevant to the question or prompt.

For example, a recent experiment by the Government Technology Agency (GovTech) involved posing a question about their headquarters to ChatGPT. Unfortunately, the tool provided the address of the GovTech Hive building instead of the actual headquarters, highlighting that the AI model does not guarantee correct and up-to-date information in its responses.

LLMs’ hallucinations could arise due to shortcomings in the dataset and training procedures. Two main factors contribute to these hallucinations: overfitting and data quality.

Overfitting occurs when a model is too complex or when it is trained on noisy data. As a result, the model learns subpar pattern recognition, makes errors in classification and prediction, and generates output that is inaccurate and factually incorrect. Low-quality data, characterized by a low signal-to-noise ratio, also contributes to poor generalization and inaccurate classifications and predictions, leading to hallucinations.

Addressing AI hallucinations

In addressing hallucinations in LLMs, various techniques could be employed, such as fine-tuning, prompt engineering and retrieval-augmented generation (RAG).

Fine-tuning retrains a model using domain-specific datasets to enhance the relevance of its responses to that particular domain, but is considered time-consuming and costly. Prompt engineering, which relies on producing better results through more descriptive features and clarifying information within the input prompt, is also time-consuming, particularly for users of LLMs.

Instead of expending huge amounts of resources through fine-tuning or having LLM users go through a time-consuming process of writing better prompts, organizations that are looking to leverage generative AI could turn to RAG. This framework focuses on grounding LLMs with the most accurate and up-to-date information by retrieving facts from an external knowledge repository, thereby improving the LLM's responses.
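The grounding step can be sketched in a few lines: retrieve the most relevant fact from an external knowledge store, then prepend it to the prompt before it reaches the model. The facts, addresses and the word-overlap scoring below are invented placeholders; production RAG systems use vector search over embeddings and an actual LLM call.

```python
import re

STOPWORDS = {"the", "is", "a", "at", "of", "where", "in"}

# Placeholder knowledge store with invented addresses, not real facts.
knowledge_base = [
    "GovTech headquarters is located at 123 Example Road.",
    "The GovTech Hive is a development office at 456 Sample Street.",
]

def content_words(text):
    """Lowercase, tokenize and drop stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}

def retrieve(query, documents):
    """Return the document sharing the most content words with the query."""
    q = content_words(query)
    return max(documents, key=lambda d: len(q & content_words(d)))

def grounded_prompt(query):
    """Prepend the retrieved fact so the model answers from it, not from memory."""
    return f"Context: {retrieve(query, knowledge_base)}\nQuestion: {query}"

print(grounded_prompt("Where is the GovTech headquarters?"))
```

Because the model is told to answer from the retrieved context rather than its frozen training data, updating the knowledge store immediately updates the answers, with no retraining required.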

Powering RAG with real-time data

The combination of RAG and real-time data has proven to be highly effective in reducing hallucinations by leveraging updated and contextual data. RAG also enriches language models by incorporating context-specific information, leading to more accurate and relevant responses.

To optimize the effectiveness of the RAG model, it is essential to integrate it with an operational data store capable of storing data in LLMs' native language, i.e., high-dimensional mathematical vectors known as embeddings. When a user query is received, the database transforms it into a numerical vector, allowing it to retrieve relevant papers or passages even if they do not contain the exact terms of the query.
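The vector-matching idea can be shown with cosine similarity: queries and documents become numerical vectors, and similarity is measured geometrically, so a query can match a passage with which it shares no exact terms. The three-dimensional vectors below are hand-made toys; real embeddings have hundreds or thousands of dimensions produced by an embedding model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Pretend embeddings; dimensions roughly mean (finance, weather, sports).
documents = {
    "quarterly revenue report": [0.90, 0.10, 0.00],
    "rainfall forecast":        [0.00, 0.95, 0.10],
    "football match results":   [0.05, 0.10, 0.90],
}
# Invented query vector, e.g. an embedding of "company earnings summary".
query_vector = [0.85, 0.15, 0.05]

best = max(documents, key=lambda d: cosine_similarity(query_vector, documents[d]))
print(best)  # the finance document, despite sharing no words with the query
```

Semantic search in a vector database works on this principle at scale, using approximate nearest-neighbor indexes instead of comparing the query against every stored vector.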

To ensure successful implementation, it is crucial to have a highly available and performant database that can handle substantial amounts of unstructured data through semantic search. This database forms a critical component of the RAG process.

Unlocking the full potential of genAI

As more businesses and industries leverage generative AI for an increasing variety of use cases, it is crucial to address the issue of model hallucinations. Implementing RAG, coupled with real-time contextual data, could significantly reduce these hallucinations and improve the accuracy and value of AI models.

To ensure the effectiveness and relevance of generative AI, organizations must adopt a data layer that supports both transactional and real-time analytics. By harnessing real-time data, businesses could power dynamic and adaptive AI solutions, making timely decisions and responding instantly to market dynamics.

Genie Yuan is the regional vice president for Asia-Pacific at Couchbase, a source-available, distributed multi-model NoSQL document-oriented database software package optimized for interactive applications.
