On the next level of generative AI
Organizations, industries and even governments across the world are in a race to embrace artificial intelligence (AI) and unlock its wide array of benefits. The Philippine government is no different, with AI embedded in its plans to transform the country.
Within the multitude of different AI models, generative AI and large language models (LLMs) have significant relevance to businesses today due to their advanced natural language processing capabilities. For example, LLMs empower customers to interact easily with chatbots and virtual assistants for a smoother customer experience. This could be extended to other applications, such as data analytics platforms, to make complex business tasks more accessible.
Generating high-quality written content for marketing teams, at a fraction of the usual time, is another use of LLMs that is currently being explored. In the same way, these models could also write code faster, allowing business users within an organization to perform low-code development and freeing development teams for other critical tasks.
The way generative AI and LLMs perform such tasks is impressive, but for now, their output is also not perfect.
Challenges in utilizing LLMs
While LLMs seem to have a human-like understanding of natural language, these models deliver fluent and coherent responses through probabilities. They recognize patterns within their training data, such as when the use of a word or a series of words in a specific order is usually followed by another specific word or set of words.
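This pattern-driven prediction can be illustrated with a deliberately tiny sketch: a toy model that counts, from a handful of training words, which word most often follows another, and then "predicts" by picking the most probable continuation. Real LLMs use neural networks trained on vast corpora, but the underlying idea of probability-guided continuation is the same.

```python
from collections import Counter, defaultdict

# Toy illustration (not a real LLM): count which word follows
# which in a tiny training corpus, then predict the next word
# by choosing the most probable continuation.
corpus = "the cat sat on the mat the cat ate the fish".split()

follow_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follow_counts[prev][nxt] += 1

def predict_next(word):
    counts = follow_counts[word]
    total = sum(counts.values())
    # probability of each continuation, most likely first
    return [(w, c / total) for w, c in counts.most_common()]

print(predict_next("the"))  # "cat" is the most probable continuation
```

Note that the model has no idea what a cat is; it simply knows that "cat" usually follows "the" in the data it has seen, which is exactly why fluent output can still be wrong.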
As AI models, LLMs are designed to complete the task given to them, which in their case is answering a prompt or question. The problem is that because they do not understand natural language the way humans do, they answer based only on the patterns learned from their training data. As a result, they can produce responses that are coherent and fluent, guided by the patterns mentioned above, yet are not factual or do not make sense in relation to the prompt, a phenomenon known as hallucination.
For example, a recent experiment by the Government Technology Agency (GovTech) involved posing a question about their headquarters to ChatGPT. Unfortunately, the tool provided the address of the GovTech Hive building instead of the actual headquarters, highlighting that the AI model does not guarantee correct and up-to-date information in its responses.
LLMs’ hallucinations could arise due to shortcomings in the dataset and training procedures. Two main factors contribute to these hallucinations: overfitting and data quality.
Overfitting occurs when a model is too complex or is trained on noisy data. As a result, the model learns subpar pattern recognition, makes errors in classification and prediction, and generates output that is inaccurate and factually incorrect. Low-quality data, characterized by a low signal-to-noise ratio, likewise leads to poor generalization and inaccurate classifications and predictions, and ultimately to hallucinations.
Addressing AI hallucinations
In addressing hallucinations in LLMs, various techniques could be employed, such as fine-tuning, prompt engineering and retrieval-augmented generation (RAG).
Fine-tuning retrains a model using domain-specific datasets to enhance the relevance of its responses to that particular domain, but it is considered time-consuming and costly. Prompt engineering, which coaxes better results by adding more descriptive features and clarifying information to the input prompt, is also time-consuming, particularly for users of LLMs.
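As a simple illustration of prompt engineering, compare a vague prompt with one enriched with descriptive detail and clarifying constraints. The wording below is invented purely for illustration:

```python
# Prompt engineering in miniature: the same request, first vague,
# then with a role, length, audience, tone and constraints spelled
# out. The engineered version narrows the model's search space and
# tends to yield more relevant, on-brief output.
vague_prompt = "Write about our product."

engineered_prompt = (
    "You are a marketing copywriter. Write a 50-word product blurb "
    "for a cloud database aimed at CTOs. Use a confident tone, "
    "mention low latency, and avoid unverifiable claims."
)

print(engineered_prompt)
```

Writing prompts at this level of detail for every query is precisely the per-user effort the technique demands.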
Instead of expending huge amounts of resources through fine-tuning or having LLM users go through a time-consuming process of writing better prompts, organizations that are looking to leverage generative AI could turn to RAG. This framework focuses on grounding LLMs with the most accurate and up-to-date information by retrieving facts from an external knowledge repository, thereby improving the LLM’s responses.
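In outline, the RAG flow can be sketched in a few lines. The knowledge base, retrieval method and prompt wording below are purely illustrative: retrieval here is naive keyword overlap, whereas a production system would use embeddings and a vector database.

```python
# Minimal sketch of the RAG flow: retrieve a relevant fact from an
# external knowledge store and ground the prompt with it before it
# reaches the LLM.
knowledge_base = [
    "Our headquarters is at 10 Example Street.",
    "Support hours are 9am to 6pm on weekdays.",
]

def retrieve(query):
    # score each document by how many words it shares with the query
    q = set(query.lower().split())
    return max(knowledge_base,
               key=lambda doc: len(q & set(doc.lower().split())))

def build_grounded_prompt(query):
    context = retrieve(query)
    return f"Answer using only this context: {context}\nQuestion: {query}"

print(build_grounded_prompt("Where is your headquarters located?"))
```

Because the retrieved fact is injected into the prompt at query time, the model answers from current, authoritative data rather than from whatever its training set happened to contain.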
Powering RAG with real-time data
The combination of RAG and real-time data has proven to be highly effective in reducing hallucinations by leveraging updated and contextual data. RAG also enriches language models by incorporating context-specific information, leading to more accurate and relevant responses.
To optimize the effectiveness of the RAG model, it is essential to integrate it with an operational data store capable of storing data in the LLM's native representation: high-dimensional mathematical vectors known as embeddings. When a user query is received, the database transforms it into a numerical vector, allowing it to find relevant documents or passages even if they do not contain the exact query terms.
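A minimal sketch of this vector-based matching follows, with made-up three-number vectors standing in for real embeddings, which have hundreds or thousands of dimensions and are produced by a model:

```python
import math

# Toy semantic search: documents and the query live as vectors
# (embeddings); similarity is the cosine of the angle between them,
# so a match does not require exact shared terms. The vectors here
# are invented for illustration.
doc_vectors = {
    "office location": [0.9, 0.1, 0.0],
    "refund policy":   [0.0, 0.2, 0.9],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

query_vector = [0.8, 0.2, 0.1]  # e.g. "where is the headquarters?"
best = max(doc_vectors, key=lambda d: cosine(query_vector, doc_vectors[d]))
print(best)  # the location document is the nearest neighbor
```

A vector database performs this nearest-neighbor search at scale, typically with approximate indexes rather than the exhaustive comparison shown here.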
To ensure successful implementation, it is crucial to have a highly available and performant database that can handle substantial amounts of unstructured data through semantic search. This database forms a critical component of the RAG process.
Unlocking the full potential of genAI
As more businesses and industries leverage generative AI for an increasing variety of use cases, it is crucial to address the issue of model hallucinations. Implementing RAG, coupled with real-time contextual data, could significantly reduce these hallucinations and improve the accuracy and value of AI models.
To ensure the effectiveness and relevance of generative AI, organizations must adopt a data layer that supports both transactional and real-time analytics. By harnessing real-time data, businesses could power dynamic and adaptive AI solutions, making timely decisions and responding instantly to market dynamics.
Genie Yuan is the regional vice president for Asia-Pacific at Couchbase, a source-available, distributed multi-model NoSQL document-oriented database software package optimized for interactive applications.