# Knowledge Base

GPT-Code-Learner supports using a knowledge base to answer questions. By default, it will use the codebase as the knowledge base.

The knowledge base is powered by a vector database. GPT-Code-Learner supports two types of vector database: local and cloud. By default, it uses the local version.

The local version uses FAISS, while the cloud version uses Supabase.
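
As a rough illustration of the local path, the sketch below builds a FAISS index over a few text chunks with LangChain. The file name, query string, and import paths are illustrative only and may differ from the actual code in this repository; an OpenAI API key and the `faiss-cpu` package are assumed to be available.

```python
# Minimal sketch of a local FAISS-backed knowledge base (not the repo's exact code).
# Import paths vary slightly across LangChain versions.
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS

# Split source text into chunks before embedding them.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
docs = splitter.create_documents([open("README.md").read()])

# Build the FAISS index locally; no external service is required.
store = FAISS.from_documents(docs, OpenAIEmbeddings())
print(store.similarity_search("What does GPT-Code-Learner do?", k=2))
```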

## Supabase Setup

For the Supabase version, create a Supabase account and project at https://app.supabase.com/sign-in. Next, add your Supabase URL and key to the .env file. You can find them in the portal under Project/API.

```
SUPABASE_URL=https://xxxxxx.supabase.co
SUPABASE_KEY=xxxxxx
```
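
To check that these values are picked up correctly, a minimal sketch using python-dotenv and the supabase-py client (both assumed to be installed) might look like:

```python
# Sketch: load Supabase credentials from .env and create a client.
import os

from dotenv import load_dotenv
from supabase import create_client

load_dotenv()  # reads SUPABASE_URL and SUPABASE_KEY from the .env file
supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])
```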

Create the default `documents` table using the following SQL, which follows the format of the LangChain example.

```sql
-- Enable the pgvector extension to work with embedding vectors
create extension vector;

-- Create a table to store your documents
create table documents (
    id bigserial primary key,
    content text, -- corresponds to Document.pageContent
    metadata jsonb, -- corresponds to Document.metadata
    embedding vector(1536) -- 1536 works for OpenAI embeddings, change if needed
);
```

```sql
CREATE FUNCTION match_documents(query_embedding vector(1536), match_count int)
    RETURNS TABLE(
        id bigint,
        content text,
        metadata jsonb,
        -- we return matched vectors to enable maximal marginal relevance searches
        embedding vector(1536),
        similarity float)
    LANGUAGE plpgsql
    AS $$
    # variable_conflict use_column
BEGIN
    RETURN QUERY
    SELECT
        id,
        content,
        metadata,
        embedding,
        1 - (documents.embedding <=> query_embedding) AS similarity
    FROM
        documents
    ORDER BY
        documents.embedding <=> query_embedding
    LIMIT match_count;
END;
$$;
```
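
Once the table and the `match_documents` function exist, LangChain's `SupabaseVectorStore` can be pointed at them. The sketch below is a minimal example, assuming the credentials from the `.env` file above and the table and query names from the SQL; exact import paths depend on your LangChain version.

```python
# Sketch: query the Supabase-backed knowledge base through LangChain.
import os

from dotenv import load_dotenv
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import SupabaseVectorStore
from supabase import create_client

load_dotenv()
client = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])

# table_name and query_name match the `documents` table and `match_documents`
# function created by the SQL above.
store = SupabaseVectorStore(
    client=client,
    embedding=OpenAIEmbeddings(),
    table_name="documents",
    query_name="match_documents",
)
print(store.similarity_search("Where is the vector database configured?", k=2))
```

New documents can be inserted through the same store (for example with `store.add_documents(...)`), which writes rows into the `documents` table above.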

The `knowledge_base.py` file provides examples of how to use the knowledge base.