Why Retrieval-Augmented Generation Is Still Relevant in the Era of Long-Context Language Models


In this article we will explore why models with 128K-token (and larger) context windows can't fully replace RAG.

We’ll start with a brief reminder of the problems RAG solves, before looking at recent improvements in LLMs and their impact on the need for RAG.

Illustration by the author.

RAG isn’t really new

The idea of injecting context so that a language model can access up-to-date data is quite “old” (by LLM standards). It was first introduced by Facebook AI/Meta researchers in the 2020 paper “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks”. In comparison, the first version of ChatGPT was only released in November 2022.

In this paper, they distinguish two kinds of memory:

  • the parametric one, which is inherent to the LLM: what it learned from being fed lots and lots of text during training,
  • the non-parametric one, which is the memory you provide by injecting retrieved context into the prompt (see the sketch after this list).
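
To make the non-parametric idea concrete, here is a minimal sketch of the retrieve-then-inject pattern. The keyword-overlap retriever is a toy stand-in for a real embedding-based search, and `call_llm` is a hypothetical placeholder for whatever model API you use; nothing here comes from the original paper's implementation.

```python
# Non-parametric memory in miniature: retrieve relevant passages,
# then inject them into the prompt before calling the model.
# The ranker below is a toy word-overlap score; a production system
# would use vector embeddings and a proper similarity search.

KNOWLEDGE_BASE = [
    "The first version of ChatGPT was released in November 2022.",
    "RAG was introduced by Facebook AI researchers in a 2020 paper.",
    "Context windows of 128K tokens are now common in frontier models.",
]

def retrieve(question: str, k: int = 2) -> list[str]:
    """Rank documents by word overlap with the question (toy retriever)."""
    q_words = set(question.lower().split())
    scored = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(question: str) -> str:
    """Assemble a prompt that carries the retrieved passages as context."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = build_prompt("When was RAG introduced?")
print(prompt)
# The prompt would then be sent to the model,
# e.g. answer = call_llm(prompt)  # hypothetical LLM call
```

The key point the sketch illustrates: the model's weights (parametric memory) are untouched; fresh knowledge arrives purely through the prompt at inference time.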
