All About RAG: What It Is and How to Keep It Secure

Table of Contents

AI is growing in power and scope and many organizations have moved on from “simply” training models. In this blog, we will cover a common system of LLM use called Retrieval-Augmented Generation (RAG).

RAG adds some extra steps to typical use of a large language model (LLM) so that instead of working off just the prompt and its training data, the LLM has additional, usually more up-to-date, data “fresh in mind”.

It’s easy to see how huge this can be for business; being able to reference current company data without having to actually train an AI model on it has many, many useful applications. 

How does RAG work?

RAG requires orchestration of two models, an embedder and a generator. A typical RAG system starts with a user query and a corpus of data such as company PDFs or Word documents.

Here’s how a typical architecture works:

During a pre-processing stage, the corpus is processed by an AI model called an embedder which transforms the documents into vectors of semantic meaning instead of plain words. Technically speaking, this stage is optional, but it makes things a lot faster if the documents are pre-processed and accessed from a vector database, rather than processed at runtime.

When a user query comes in, the prompt is also fed to the embedder, for the same reason.

Next, the embedded user query is used by a retrieval system to pull relevant pieces of text from the pre-embedded corpus. The retrieval system returns with a ranked set of relevant vectors.

The embedded user query and relevant documents are fed into a generative AI model, specifically a pre-trained large language model (LLM), which then combines the user query and retrieved documents to form a relevant and coherent output.

Security risks with RAG

The two biggest risks associated with RAG systems are poisoned databases and the leakage of sensitive data or personally identifiable information (PII). We’ve already seen instances where malicious actors manipulate databases by inserting harmful data. Attackers can skew the system’s outputs by making their data disproportionately influential, effectively controlling the AI’s responses, which poses a serious security threat.

When implementing RAG, it’s essential to ask key questions: What models are you using for embedding and generation, and where are you storing your data? Choosing the right models is crucial because different models handle security, accuracy, and privacy differently. Ensuring that these models are fine-tuned for security and privacy concerns or that services are blocking malicious behavior is key, as poorly selected models and third-party services can introduce vulnerabilities.

If you’re using a vector database like Pinecone or LlamaIndex, you must ensure that your data storage complies with security and privacy regulations, especially if you’re working with sensitive data. These databases store the map between the embedding and text, and ensuring that they are properly encrypted and access-controlled is vital to prevent unauthorized manipulation. Developers often choose platforms like OpenSearch, a low-code vector database solution, because it offers easier management of these security aspects, with built-in monitoring, access control, and logging to help avoid data poisoning and leakage.

In addition to model selection and secure data storage, all AI systems operate with a system prompt—a hidden instruction set that initializes every task or conversation. Adjusting this system prompt can help mitigate security issues, such as preventing the model from generating harmful or sensitive content. However, while strengthening the system prompt can help reduce certain risks, it’s not a comprehensive solution. A strong system prompt serves as the first line of defense, but addressing AI vulnerabilities requires a broader approach, including fine-tuning the models for safety, ensuring data compliance, and implementing real-time monitoring, code sanitizers, and guardrails.

In summary, securing a RAG system involves more than just selecting the right models and storage solutions. It requires robust encryption, data governance policies, and continuous oversight to protect against data poisoning, information leakage, and other evolving security threats.

How to protect RAG systems

Protecting AI systems, including RAG systems, requires a multi-layered approach that combines proactive testing, security mechanisms, and safeguards to prevent vulnerabilities from being exploited.

One effective strategy is to red-team your model. Red-teaming RAG systems involves simulated attacks to identify weaknesses in your AI system, such as prompt injection or data poisoning, before they can be exploited in real-world scenarios.

To protect RAG systems, there are several key approaches to consider:

1. Firewalls

In AI, firewalls act as monitoring layers that evaluate both input and output. They can use heuristic techniques to detect suspicious activity, such as attempts to inject harmful prompts or commands. For example, if a user tries to manipulate the AI to ignore its initial instructions (via prompt injection) and generate unintended or harmful output, the firewall can flag this as a potential attack. While firewalls provide an extra layer of security, they aren’t foolproof and may miss more sophisticated attacks that don’t match known patterns.

2. Guardrails

Guardrails are predefined rules or constraints that limit the behavior and output of AI systems. These can be customized based on the use case, ensuring the AI follows certain safety and ethical standards.

NVIDIA NeMo Guardrails offers several types of guardrails:

  • Input rails filter and control what kinds of inputs are acceptable, ensuring sensitive data (like names or email addresses) is not processed.
  • Dialog rails shape conversational flows to ensure AI responds appropriately, based on predefined conversation structures.
  • Retrieval rails ensure the AI retrieves only trusted and relevant documents, minimizing the risk of poisoned data entering the system.
  • Execution rails limit the types of code or commands the AI can execute, preventing improper actions.
  • Output rails restrict the types of outputs the model can produce, protecting against hallucinations or inappropriate content.

NVIDIA Garak, another tool from NVIDIA, is an open-source red-teaming tool for testing vulnerabilities in large language models (LLMs). Garak helps identify common vulnerabilities, such as prompt injection or toxic content generation. It learns and adapts over time, improving its detection abilities with each use. Promptfoo is another tool that might be used.

3. Fact-checking and hallucination prevention

RAG systems can also incorporate self-checking mechanisms to verify the accuracy of generated content and prevent hallucinations—instances where the AI produces false information. Integrating fact-checking features can reduce the risk of presenting incorrect or harmful responses to users.

4. Shift-left security

A shift-left approach focuses on integrating security practices early in the development process. For RAG systems, this means ensuring that the data used for training and fine-tuning is free of bias, sensitive information, or inaccuracies from the start. Additionally, many RAG vulnerabilities may be in the code itself, so it’s worth scanning the code and organizing for fixes to take place before the production stage. By addressing these issues early, you minimize the risk of the system inadvertently sharing PII or being manipulated by malicious input.

Conclusion

As AI systems like RAG become more advanced, it’s critical to implement these protective measures to guard against an increasing array of security threats. Combining firewalls, guardrails, fact-checking, early security practices, and robust monitoring tools creates a comprehensive defense against potential vulnerabilities.

Increase visibility and control over AI models used in your applications

Recent resources

Cybersecurity Awareness Month: AI Safety for Friends and Family

This blog is for your friends and family working outside of the security and technical industries.

Read more

Don’t Treat DAST Like Dessert

DAST is an essential part of a nutritious application security diet—not just a once-a-quarter treat.

Read more

The Power of Platform-Native Consolidation in Application Security

Streamline workflows, consolidate data, boost security posture, and empower developers to focus on innovation.

Read more