Retrieval Augmented Generation (RAG) 

Contributed by:

Will Freeman
EM PGY-3
Washington University in St. Louis/BJC

Retrieval Augmented Generation (RAG) is an extension of large language models (LLMs) aimed at addressing some of the limitations of large language models, namely their difficulties with hallucinations and with citing sources from which they have drawn generated material. By its name, RAG models supply a reference data set from which an underlying LLM retrieves information in order to provide factual, interpretable results in response to a user query. Another key benefit is that updating reference material does not rely on completely retraining the LLM but instead the retrieval dataset can be updated when new sources of information become available. In doing so, RAG models can be an effective method of utilizing LLMs when answers you need from the LLM must be factual, transparent in their source, and up to date with respect to a changing body of knowledge.

One well-recognized example of this technology in practice is OpenEvidence, which is gaining popularity with physicians and other healthcare workers quickly. While the exact technical specifications and data arrangements behind the technology are proprietary, it works by using RAG technology to reference a database of high quality medical sources including the New England Journal of Medicine, JAMA Network, Cochrane systematic reviews, and other reputable guidelines for medical care. Another available product that allows you to customize your own RAG is Google’s NotebookLM. It provides an easy interface for uploading multiple documents, webpages, or other sources to a repository which can subsequently be queried by the user, with the LLM providing answers based on the uploaded sources. In this environment, a custom RAG could be set up for any number of scenarios. While the number of notebooks that can be generated and sources that can be uploaded are limited based on the plan used, the free tier provides a respectable limit to test various ideas in. Open source tools, such as LangFlow (a “low-code” AI building tool offered by IBM), allow for more custom solutions to be implemented with less restrictions, but doing so would require more technical skills and a description of a custom implementation is beyond the scope of this resource at this time. However, there are several use cases for RAG implemented through NotebookLM which may be of use to emergency clinicians.

Obviously the use of medical knowledge repositories such as Open Evidence could be used to assist clinical judgement on shift for complex cases. While NotebookLM currently reports that it does not use notebooks to train models unless the user provides thumbs up/down feedback, physicians and other health professionals should avoid utilizing the tool for clinical work in order to ensure HIPAA compliance. However the utility of these models is not only limited to medical knowledge and clinical medicine. Below are instructions on how to start a notebook and a few examples on how researchers and educators could utilize RAG:

How to set up custom Notebook in NotebookLM:
  1. Navigate to notebooklm.google.com and login with a valid google account.
  2. Click Create new to Start a new Notebook

NotebookLM Dashboard

  1. Upload sources from files, websites, and other sources (Note: using google drive to upload will allow you to easily open the original file, while uploading pdfs and word documents directly will not).
NotebookLM interface displaying a dialog for adding sources from the web, files, Google Drive, websites, or pasted text to create AI-generated audio and video overviews.
  1. See your first notebook created with an AI generated summary of your uploaded resources.
RAG Model Prompt 1
  1. For educators, a course assistant or “TA” RAG model might be set up so students could reference course materials easily. Students can easily ask varied questions of the model and get summaries with citations linking to the sources from where that information originated.
RAG Model Prompt 2RAG Model Prompt 3

While uploading all information sources without safe guards could lead to students querying for the direct answers to homework, essays and take home exams, there are examples of professors in other disciplines coaching the LLM to only give hints and point students in the right direction, rather than answering questions outright. In NotebookLM, chats can be custom configured so prompts could be directed to not directly answer questions but direct students toward the best references to answer their questions. By clicking the “Configure Notebook” sliders button at the top of the chat window, we can customize the LLM output.

RAG Model Chat header

For example, we can prompt the LLM to avoid directly answering student questions and instead ask questions to probe understanding and encourage students to refer to the most useful sources in order to answer their questions.

RAG Model Prompt 4

You can see how to the output from the model changes with these new instructions.

RAG Model Prompt 5

Additionally, using the “Studio,” educators can generate resources for students to use such as study materials like flashcards and quizzes, or Mindmaps that simplify content to the most pertinent topics.

RAG Model Mindmap

While these materials may be somewhat derivative and often test rote fact memorization, they may be a useful starting place for time-strapped educators, where often overcoming the hurdle of a first draft is the biggest barrier to delivering new content.

  1. For more administrative educational materials, these RAG models could help students or residents reference rotation instructions, policies and guidelines. This might be especially helpful when onboarding new residents or off-service residents.
RAG Model Prompt 6

Similarly, for clinical guidelines that are uncommonly referenced, this could be a manner by which clinical workers could quickly reference the policy in a natural way when finding the pdf in an email a long time ago might be impossible.

RAG Model Prompt 7
  1. Researchers could develop a knowledge base for their research area based on papers they have read, and might effectively use a RAG system in multiple ways. Models can easily be used to find the sources whose content you remembered, but cannot easily attribute to a specific source.
RAG Model Prompt 8

At a higher level, these models might help researchers to identify notable gaps in current research that may lead to more persuasive grant applications of novel areas of investigation.

RAG Model Prompt 9RAG Model Prompt 9 ContinuedRAG Model Prompt 9 Continued