This work implements a simple RAG system with Google CodeGemma (codegemma:7b), Qdrant as the vector database, and the CoNaLa dataset. We develop a simple app with Gradio and create a Docker image for the container runtime.
- rag_localLLM.ipynb: This notebook contains the code and explanation for implementing RAG with Codegemma:
  - Connecting to the local LLM via Ollama
  - Importing the data
  - Importing the embedding model
  - Embedding the documents and creating the vector database
  - Taking a user query
  - Embedding the query and performing retrieval
  - Performing RAG with Codegemma
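The steps above can be sketched end to end. This is a toy, self-contained version of the flow: a deterministic hashed bag-of-words embedding stands in for all-MiniLM-L6-v2 and a plain Python list stands in for Qdrant, so the embed → store → retrieve → prompt sequence is visible without a model download or a running server.

```python
import math
import zlib
from collections import Counter

def embed(text, dim=64):
    """Toy embedding: deterministic hashed bag-of-words
    (stand-in for all-MiniLM-L6-v2), L2-normalized."""
    vec = [0.0] * dim
    for word, count in Counter(text.lower().split()).items():
        vec[zlib.crc32(word.encode()) % dim] += count
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    # Vectors are already normalized, so the dot product is cosine similarity
    return sum(x * y for x, y in zip(a, b))

# "Vector database": a list of (vector, document) pairs (stand-in for Qdrant)
docs = [
    "how to reverse a list in python",
    "open a file and read all lines",
    "sort a dictionary by value",
]
store = [(embed(d), d) for d in docs]

# Take a user query, embed it, and retrieve the closest document
query = "reverse a python list"
qvec = embed(query)
best = max(store, key=lambda pair: cosine(qvec, pair[0]))[1]

# In the notebook, the retrieved context is then sent to Codegemma via Ollama
prompt = f"Context:\n{best}\n\nQuestion: {query}"
print(best)
```

In the real notebook the retrieval hit and the query are combined into a prompt like the one above and passed to codegemma:7b through Ollama.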
- vector_database.py: The code for creating the vector database
- app.py: The code for the Gradio application
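A minimal sketch of what app.py might look like. The function body is a placeholder; in the real app it would embed the query, search Qdrant, and generate an answer with Codegemma via Ollama. Labels and titles here are assumptions.

```python
import gradio as gr

def answer(query):
    # Placeholder for: embed query -> retrieve from Qdrant -> RAG with Codegemma
    return f"(retrieved context + Codegemma answer for: {query})"

demo = gr.Interface(
    fn=answer,
    inputs=gr.Textbox(label="Ask a Python coding question"),
    outputs=gr.Textbox(label="Answer"),
    title="CoNaLa RAG with Codegemma",
)

if __name__ == "__main__":
    # Bind to all interfaces so the app is reachable inside a container
    demo.launch(server_name="0.0.0.0")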
- Vector database: Qdrant
- Application framework: Gradio
- LLM: Google CodeGemma
- LLM server: Ollama
- Embedding model: all-MiniLM-L6-v2
- Dataset: CoNaLa
The Docker image of the app is pushed to Docker Hub. Running the application requires a Linux system with at least 8 GB of memory. To run the app, simply execute:

`docker compose up`
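The compose file itself is not shown in this README. A minimal sketch of what it might contain, assuming the app, Qdrant, and Ollama run as separate services (service names, ports, and the image tag are all assumptions, and the placeholder image name must be replaced with the actual Docker Hub tag):

```yaml
services:
  qdrant:
    image: qdrant/qdrant
    ports:
      - "6333:6333"
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"
  app:
    image: <dockerhub-user>/conala-rag   # placeholder; use the pushed image tag
    ports:
      - "7860:7860"                      # Gradio's default port
    depends_on:
      - qdrant
      - ollama
```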