Deep Learning Architecture: Naive Retrieval-Augmented Generation (RAG)

Lionel Owono
FAUN — Developer Community 🐾
3 min read · Sep 26, 2024

Introduction

Since late 2022, the world has been swept by a huge wave called ChatGPT. Suddenly, terms like Large Language Models (LLMs), Artificial Intelligence, and Machine Learning became familiar to almost everyone. People quickly realized how powerful LLMs can be.

However, despite their strength, LLMs have limitations, especially when asked about recent information, like today’s news. Their knowledge is frozen at a training cutoff date, so they simply haven’t seen the latest updates. To solve this, Retrieval-Augmented Generation (RAG) systems were created.

RAG systems became popular alongside the rise of Transformer models, a breakthrough in deep learning built entirely around the “attention” mechanism and parallel processing. This allowed models to understand sequences of information much more effectively, fixing problems seen in older architectures like RNNs and LSTMs. As LLM-based products like ChatGPT became widespread, research into RAG systems focused on how to improve the information they retrieve and generate.

There are three main types of RAG architectures:

  • Naive RAG
  • Advanced RAG
  • Modular RAG

In this article, we’ll explore Naive RAG, the simplest form of this architecture.

How Naive RAG Works

Naive RAG operates straightforwardly, following three main steps: indexing, retrieving, and generating.

Indexing

  • The system gathers information from various sources, such as internal documents and web pages, in different formats like PDFs, Word documents, or text files.
  • This information is split into smaller chunks, which are then encoded into vectors and stored in a vector database. This process prepares the system to efficiently retrieve data later.
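
Here is what that might look like in code: a minimal sketch assuming the sentence-transformers and faiss-cpu Python packages, where the model name, chunk size, and placeholder corpus are all illustrative choices, not part of any particular framework.

```python
# A minimal indexing sketch: chunk documents, embed the chunks, and store
# the vectors in a FAISS index. Model name and chunk size are illustrative.
import faiss
from sentence_transformers import SentenceTransformer

def chunk_text(text: str, chunk_size: int = 500) -> list[str]:
    """Naively split a document into fixed-size character chunks."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

documents = ["...text extracted from a PDF...", "...text from a web page..."]
chunks = [c for doc in documents for c in chunk_text(doc)]

# Encode every chunk into a dense vector.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = encoder.encode(chunks)  # shape: (num_chunks, embedding_dim)

# Store the vectors in a flat (exact nearest-neighbour) index.
index = faiss.IndexFlatL2(embeddings.shape[1])
index.add(embeddings)
```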

Retrieving

  • When a user submits a query, the system converts the question into a vector, similar to how it encoded the indexed chunks.
  • It then searches the vector database for the most relevant chunks, selecting the top matches that closely align with the query.
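
Continuing the sketch above (reusing the encoder, index, and chunks from the indexing step), retrieval boils down to one embedding call and one nearest-neighbour search; the value of k is an illustrative choice.

```python
# A minimal retrieval sketch, reusing encoder, index, and chunks from the
# indexing step above.
query = "What happened in today's news?"
query_vector = encoder.encode([query])  # encode the query like the chunks

k = 3  # number of top matches to pass to the LLM
distances, ids = index.search(query_vector, k)  # nearest-neighbour search
retrieved_chunks = [chunks[i] for i in ids[0]]
```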

Generating

  • The selected chunks, combined with the user’s query, are sent to an LLM to generate a coherent response.
  • The model may draw on the retrieved chunks, its own knowledge acquired during training, or a combination of both to craft the final answer.
  • In a chat scenario, the system also keeps track of conversation history to ensure continuity.
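
A minimal sketch of this last step, using the OpenAI client as one possible LLM backend; the model name and prompt template are illustrative, and query and retrieved_chunks come from the retrieval sketch above.

```python
# A minimal generation sketch: put the retrieved chunks into the prompt
# and let the LLM answer. Any chat-capable LLM would work here.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
context = "\n\n".join(retrieved_chunks)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[
        {"role": "system",
         "content": "Answer the question using the context below when "
                    f"it is relevant.\n\nContext:\n{context}"},
        # In a chat scenario, earlier turns of the conversation would be
        # appended here to preserve continuity.
        {"role": "user", "content": query},
    ],
)
print(response.choices[0].message.content)
```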

While this setup seems simple, it comes with certain limitations.

Limitations of Naive RAG

Naive RAG’s strength is its simplicity, but it faces key challenges in its retrieval, generation, and augmentation steps:

Retrieval Challenges

  • Precision and recall can both suffer: the system may retrieve chunks that are irrelevant or only loosely aligned with the query, or miss relevant ones entirely, leading to incomplete or incorrect answers.
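
To make these terms concrete, here is a toy calculation with hypothetical chunk IDs: precision is the fraction of retrieved chunks that are actually relevant, while recall is the fraction of all relevant chunks that were retrieved.

```python
# Toy illustration with made-up chunk IDs.
retrieved = {"chunk_3", "chunk_7", "chunk_9"}  # what the system returned
relevant = {"chunk_3", "chunk_12"}             # what it should have returned

precision = len(retrieved & relevant) / len(retrieved)  # 1/3 ≈ 0.33
recall = len(retrieved & relevant) / len(relevant)      # 1/2 = 0.50
```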

Generation Challenges

  • The system may generate hallucinations, where the output includes information not present in the retrieved documents.
  • There can also be issues with irrelevance, bias, or toxicity in the generated responses, especially if the indexed data contains such elements.

Augmentation Challenges

  • Combining retrieved information from multiple sources can be tricky. The system may produce redundant or incoherent outputs when it pulls from different places.
  • It can also struggle with complex queries that require more nuanced context, since Naive RAG uses a single retrieval step, which may not always provide the full picture.

Conclusion

Naive RAG is built on a promising idea: augmenting the knowledge of language models by pulling in fresh information from external sources, rather than relying solely on training data. This approach has the potential to produce more relevant responses, but in practice, it faces challenges with retrieval accuracy and generation quality.

In the next article, we’ll explore Advanced RAG, a more refined architecture designed to overcome some of these issues.

👋 If you find this helpful, please click the clap 👏 button below a few times to show your support for the author 👇

🚀 Join FAUN Developer Community & Get Similar Stories in your Inbox Each Week
