
An LLM is a very powerful tool, but it often generates more than was asked for (hallucinations) and tends to produce output in whatever pattern it finds best. RAG lets us harness the power of an LLM in a controlled manner. In this work we implement a simple RAG system with Codegemma and an in-memory vector database.


Simple RAG with CODEGEMMA:7B

This work implements a simple RAG system with Google's Codegemma:7b, Qdrant as the vector database, and the CoNaLa dataset. We develop a simple app with Gradio and create a Docker image for the container runtime.

Gradio App UI

File Structure

  • rag_localLLM.ipynb: This notebook contains the code and explanation for implementing RAG with Codegemma; a condensed sketch of these steps follows this list.
    • Connecting to Local LLM via Ollama
    • Import Data
    • Import Embedding Model
    • Embed the documents and Create Vector Database
    • Take user query
    • Perform query embedding and Retrieval
    • Perform RAG with Codegemma
  • vector_database.py: The code for creating the vector database
  • app.py: The code for the Gradio application (a minimal sketch appears after the Architecture list)
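
The retrieval side of the notebook and vector_database.py can be condensed into a short sketch. This is an illustrative summary, not the exact repository code: the collection name, the stand-in documents, and the retrieve helper are assumptions, while the all-MiniLM-L6-v2 embedder and the in-memory Qdrant client match the components listed under Architecture.

```python
# Minimal RAG retrieval sketch: embed documents, index them in an in-memory
# Qdrant collection, and fetch the closest matches for a user query.
from sentence_transformers import SentenceTransformer
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

# Stand-in documents; in the repo these come from the CoNaLa dataset.
documents = [
    "how to sort a list of dictionaries by a key in python",
    "how to read a csv file into a pandas dataframe",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dimensional embeddings
client = QdrantClient(":memory:")                   # in-memory vector database

client.create_collection(
    collection_name="conala",                        # illustrative collection name
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
)
client.upsert(
    collection_name="conala",
    points=[
        PointStruct(id=i, vector=embedder.encode(doc).tolist(), payload={"text": doc})
        for i, doc in enumerate(documents)
    ],
)

def retrieve(query: str, k: int = 3) -> list[str]:
    """Embed the query and return the k most similar documents."""
    hits = client.search(
        collection_name="conala",
        query_vector=embedder.encode(query).tolist(),
        limit=k,
    )
    return [hit.payload["text"] for hit in hits]
```

With the index built, retrieve("sort a list of dicts by a key") returns the closest stored documents, which are then passed to Codegemma as context for the final answer.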

Architecture

  • Vector Database: Qdrant
  • Application framework: Gradio (wiring sketched below)
  • LLM: Google's Codegemma
  • LLM server: Ollama
  • Embedding: all-MiniLM-L6-v2
  • Dataset: CoNaLa
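
app.py ties these pieces into the Gradio UI shown above. Below is a minimal sketch, assuming the retrieve helper from the previous snippet and the codegemma:7b model served by a local Ollama instance; the prompt wording and UI labels are illustrative, not the exact repository code.

```python
# Minimal Gradio + Ollama sketch: retrieve context, then ask Codegemma.
import gradio as gr
import ollama

def answer(question: str) -> str:
    """Retrieve context from the vector database and ask Codegemma to answer."""
    context = "\n".join(retrieve(question, k=3))   # helper from the previous sketch
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    response = ollama.chat(
        model="codegemma:7b",                      # served locally by Ollama
        messages=[{"role": "user", "content": prompt}],
    )
    return response["message"]["content"]

demo = gr.Interface(
    fn=answer,
    inputs=gr.Textbox(label="Question"),
    outputs=gr.Textbox(label="Answer"),
)
demo.launch()
```

Because the prompt contains only the retrieved context, the model's answer stays grounded in the indexed documents rather than whatever pattern it finds best.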

Docker

The Docker image of the app is pushed to Docker Hub. Running the application requires a Linux system with a minimum of 8 GB of memory. To run the app, simply execute the command

docker compose up

If you find the repo helpful, please drop a ⭐
