7926 Reviews (4.5 out of 5 stars)

Building RAG Agents with LLMs

$500 USD
Course Code NV-RAG-A-LLM
Duration 1 day
Available Formats Classroom

Agents powered by large language models (LLMs) are quickly gaining popularity with both individuals and companies as people discover new capabilities and opportunities to greatly improve their productivity. An especially powerful recent development has been the popularization of retrieval-based LLM systems that can hold informed conversations by using tools, looking at documents, and planning their approaches. These systems are fun to experiment with and offer unprecedented opportunities to make life easier, but they also require many queries to large deep learning models and need to be implemented efficiently.

You will be designing retrieval-augmented generation systems and bundling them into deliverable formats. Along the way, you will learn advanced LLM composition techniques for internal reasoning, dialog management, and tooling.

Skills Gained

By participating in this workshop, you'll learn how to:

  • Compose an LLM system that can interact predictably with a user by leveraging internal and external reasoning components.
  • Design a dialog management and document reasoning system that maintains state and coerces information into structured formats.
  • Leverage embedding models for efficient similarity queries for content retrieval and dialog guardrailing.
  • Implement, modularize, and evaluate a retrieval-augmented generation agent that can answer questions about the research papers in its dataset without any fine-tuning.


Prerequisites

  • Introductory deep learning experience; comfort with PyTorch and transfer learning is preferred. Content covered by the "Getting Started with Deep Learning" or "Fundamentals of Deep Learning" courses, or similar experience, is sufficient.
  • Intermediate Python experience, including object-oriented programming and libraries. Content covered by Python Tutorial (w3schools.com), or similar experience, is sufficient.

Course Details

Workshop Outline


LLM Inference Interfaces

  • Get comfortable with the course environment and learn about microservices for software compartmentalization and resource delivery.
  • Discuss LLM service options for inference use-cases, including local and scalable deployment strategies and value propositions.
  • Get comfortable with remotely accessible endpoints such as GPT-4 and NGC-hosted NVIDIA AI Foundation Model endpoints.

Pipeline Design with LangChain, Gradio, and LangServe

  • Learn how to use LangChain to chain multiple LLM-enabled modules using the functional LangChain Expression Language (LCEL) syntax.
  • Formalize internal/external reasoning and modularize them into runnables.
  • Use LangServe to interact with a Gradio frontend by sending an LLM chain over a port interface.
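As a rough illustration of the chaining idea in this module, the sketch below imitates pipe-style composition in plain Python. It is not LangChain itself: the `Step` class, `build_prompt`, `fake_llm`, and `parse_output` are hypothetical stand-ins, and `fake_llm` only echoes its input instead of calling a model.

```python
from typing import Any, Callable


class Step:
    """Toy stand-in for a runnable: wraps a function and supports `|` chaining."""

    def __init__(self, fn: Callable[[Any], Any]) -> None:
        self.fn = fn

    def __or__(self, other: "Step") -> "Step":
        # `a | b` builds a new step that runs a, then feeds its output to b.
        return Step(lambda x: other.fn(self.fn(x)))

    def invoke(self, x: Any) -> Any:
        return self.fn(x)


# Hypothetical modules; fake_llm stands in for a real model call.
build_prompt = Step(lambda d: f"Answer briefly: {d['question']}")
fake_llm = Step(lambda prompt: f"[model reply to: {prompt}]")
parse_output = Step(lambda text: text.strip("[]"))

# Compose the three modules left to right, then run the whole chain once.
chain = build_prompt | fake_llm | parse_output
result = chain.invoke({"question": "What is RAG?"})
print(result)
```

Because every module exposes the same `invoke` interface, any step can be swapped out (for example, replacing `fake_llm` with a real endpoint call) without touching the rest of the chain.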

Dialog Management with Running States

  • Learn about running state logic to retain state as your chain runs.
  • Leverage knowledge extraction via slot filling to keep a smart knowledge base.
  • Integrate a dialog-managing chatbot that prompts the user for credentials, retrieves info from a database interface, and maintains dialog state.
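The running-state and slot-filling ideas above can be sketched without any LLM at all. In the minimal example below, regular expressions stand in for the model-driven extraction step, and the slot names (`first_name`, `confirmation`) and the helpers `fill_slots` and `missing_slots` are assumptions made up for illustration.

```python
import re
from typing import Dict, List, Optional

State = Dict[str, Optional[str]]


def fill_slots(state: State, message: str) -> State:
    """Return a copy of the state with any slots found in the message filled in."""
    updated = dict(state)
    name = re.search(r"my name is (\w+)", message, re.IGNORECASE)
    code = re.search(r"\b(\d{6})\b", message)
    if name:
        updated["first_name"] = name.group(1)
    if code:
        updated["confirmation"] = code.group(1)
    return updated


def missing_slots(state: State) -> List[str]:
    """List the slots the chatbot still needs to ask the user about."""
    return [key for key, value in state.items() if value is None]


# The state is carried forward and updated after every user turn.
state: State = {"first_name": None, "confirmation": None}
history = []
for message in ["Hi, my name is Jane", "Sure, my code is 123456"]:
    state = fill_slots(state, message)
    history.append(missing_slots(state))

print(state)
print(history)
```

Once `missing_slots` returns an empty list, a dialog manager could move on from collecting credentials to the database lookup.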

Working with Documents

  • Learn about document chunking, reduction, and refinement strategies.
  • Use the same LLM chaining skills to build systems that summarize research papers by exporting a while-loop-enabled runnable.
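A hedged sketch of chunking followed by iterative refinement: the document is split into overlapping character windows, and a running summary is updated one chunk at a time inside a while loop. The function names, the character-based splitting, and the `toy_update` function (which stands in for an LLM summarization call) are all assumptions for illustration.

```python
from typing import Callable, List


def chunk_text(text: str, chunk_size: int, overlap: int) -> List[str]:
    """Split text into fixed-size character windows that overlap by `overlap`."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks


def refine_summary(chunks: List[str], update: Callable[[str, str], str]) -> str:
    """Fold chunks into a running summary, one update call per chunk."""
    summary = ""
    pending = list(chunks)
    while pending:
        summary = update(summary, pending.pop(0))
    return summary


def toy_update(summary: str, chunk: str) -> str:
    # Stand-in for an LLM call: keep only the first two characters of each chunk.
    return (summary + " " + chunk[:2]).strip()


chunks = chunk_text("0123456789", chunk_size=4, overlap=2)
summary = refine_summary(chunks, toy_update)
print(chunks)
print(summary)
```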

Embeddings for Semantic Similarity and Guardrailing

  • Formalize encoder-vs-decoder benefits and understand how embedding logic works.
  • Use vector representations to reason about passage meanings and similarity.
  • Design a guardrailing system that leverages a custom-built input rail to answer a question or kindly refuse.
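To make the similarity and guardrailing ideas concrete, here is a minimal sketch using hand-written toy vectors in place of real embedding-model outputs. The `input_rail` and `respond` helpers and the 0.8 threshold are illustrative assumptions, not values from the course.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def input_rail(query_vec: Sequence[float],
               on_topic_vecs: List[Sequence[float]],
               threshold: float = 0.8) -> bool:
    """Allow the query only if it is close enough to some on-topic example."""
    best = max(cosine_similarity(query_vec, vec) for vec in on_topic_vecs)
    return best >= threshold


def respond(query_vec: Sequence[float], on_topic_vecs: List[Sequence[float]]) -> str:
    if input_rail(query_vec, on_topic_vecs):
        return "answer"  # hand the query to the main chain
    return "Sorry, I can only help with questions about this topic."


# Toy 3-dimensional "embeddings"; a real system would get these from an embedding model.
on_topic = [[1.0, 0.0, 0.0], [0.8, 0.6, 0.0]]
allowed = respond([0.9, 0.1, 0.0], on_topic)
refused = respond([0.0, 0.0, 1.0], on_topic)
print(allowed)
print(refused)
```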

Vector Stores for RAG Agents

  • Formalize vector stores as structures that help automate vector reasoning logic.
  • Incorporate vector stores into retrieval-augmented generation pipelines that reason about conversation history and preprocessed document pools.
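The sketch below shows, under simplifying assumptions, what a vector store automates: storing (text, vector) pairs and returning the nearest texts for a query vector, which are then placed into a prompt. `TinyVectorStore`, the two-dimensional toy vectors, and the prompt wording are hypothetical; production stores add indexing and persistence that are omitted here.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TinyVectorStore:
    """In-memory store with brute-force nearest-neighbor search."""

    def __init__(self) -> None:
        self._items: List[Tuple[str, Sequence[float]]] = []

    def add(self, text: str, vector: Sequence[float]) -> None:
        self._items.append((text, vector))

    def search(self, query: Sequence[float], k: int = 2) -> List[str]:
        # Rank every stored item by similarity to the query, highest first.
        ranked = sorted(self._items,
                        key=lambda item: cosine_similarity(query, item[1]),
                        reverse=True)
        return [text for text, _ in ranked[:k]]


store = TinyVectorStore()
store.add("Paper A discusses retrieval.", [1.0, 0.0])
store.add("Paper B discusses diffusion.", [0.0, 1.0])
store.add("Paper C surveys RAG.", [0.9, 0.3])

# Retrieve context for a toy query vector and assemble a prompt around it.
retrieved = store.search([1.0, 0.1], k=2)
prompt = "Context:\n" + "\n".join(retrieved) + "\n\nQuestion: What is RAG?"
print(prompt)
```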

Evaluation, Assessment, and Q&A