Mirror of https://github.com/we-promise/sure.git (synced 2026-04-18 11:34:13 +00:00)
feat(vector-store): Implement pgvector adapter for self-hosted RAG (#1211)
* Add conditional migration for vector_store_chunks table

  Creates the pgvector-backed chunks table when VECTOR_STORE_PROVIDER=pgvector. Enables the vector extension, adds store_id/file_id indexes, and uses a vector(1024) column type for embeddings (sketched below).

* Add VectorStore::Embeddable concern for text extraction and embedding

  Shared concern providing extract_text (PDF via pdf-reader, plain text as-is), paragraph-boundary chunking (~2000 chars, ~200 overlap), and embed/embed_batch against an OpenAI-compatible /v1/embeddings endpoint using Faraday (sketched below). Configurable via EMBEDDING_MODEL and EMBEDDING_URI_BASE, with fallback to the OPENAI_* env vars.

* Implement VectorStore::Pgvector adapter with raw SQL

  Replaces the stub with a full implementation using ActiveRecord::Base.connection with parameterized binds (sketched below). Supports create_store, delete_store, upload_file (extract + chunk + embed + insert), remove_file, and cosine-similarity search via the <=> operator.

* Add registry test for pgvector adapter selection (sketched below)

* Configure pgvector in compose.example.ai.yml

  Switch the db image to pgvector/pgvector:pg16, add the VECTOR_STORE_PROVIDER, EMBEDDING_MODEL, and EMBEDDING_DIMENSIONS env vars, and include nomic-embed-text in Ollama's pre-loaded models.

* Update pgvector docs from scaffolded to ready

  Document the env vars, embedding model setup, the pgvector Docker image requirement, and Ollama pull instructions.

* Address PR review feedback

  - Migration: remove the env guard and use a pgvector_available? check instead, so the migration runs on plain Postgres (CI) but creates the table on pgvector-capable servers. Add NOT NULL constraints on content/embedding/metadata and a unique index on (store_id, file_id, chunk_index).
  - Pgvector adapter: wrap chunk inserts in a DB transaction to prevent partially written files. Override supported_extensions to match the formats extract_text can actually parse.
  - Embeddable: add a hard_split fallback for paragraphs exceeding CHUNK_SIZE, to avoid overflowing embedding model token limits.

* Bump schema version to include the vector_store_chunks migration

  CI uses db:schema:load, which checks the schema version — without this bump, the migration is detected as pending and tests fail to start.

* Update 20260316120000_create_vector_store_chunks.rb

---------

Co-authored-by: sokiee <sokysrm@gmail.com>
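A minimal sketch of what the conditional migration could look like, assuming the pgvector_available? helper named in the review notes probes pg_available_extensions. Column types, the Rails version, and the exact DDL are assumptions, not the PR's actual code:

```ruby
# db/migrate/20260316120000_create_vector_store_chunks.rb — hedged sketch.
# The pgvector_available? guard and constraints come from the commit message;
# the Rails version tag and column types are assumptions.
class CreateVectorStoreChunks < ActiveRecord::Migration[7.2]
  def up
    # Runs cleanly on plain Postgres (CI); only creates the table when the
    # server actually ships the pgvector extension.
    return unless pgvector_available?

    enable_extension "vector"

    create_table :vector_store_chunks do |t|
      t.string :store_id, null: false
      t.string :file_id, null: false
      t.integer :chunk_index, null: false
      t.text :content, null: false
      t.jsonb :metadata, null: false, default: {}
      t.timestamps
    end

    # The vector type isn't known to the Rails DSL, so add it with raw SQL.
    execute "ALTER TABLE vector_store_chunks ADD COLUMN embedding vector(1024) NOT NULL"

    add_index :vector_store_chunks, :store_id
    add_index :vector_store_chunks, :file_id
    add_index :vector_store_chunks, [:store_id, :file_id, :chunk_index], unique: true
  end

  def down
    drop_table :vector_store_chunks, if_exists: true
  end

  private

  # Assumed implementation: check whether the server can install pgvector.
  def pgvector_available?
    select_value(
      "SELECT COUNT(*) FROM pg_available_extensions WHERE name = 'vector'"
    ).to_i.positive?
  end
end
```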
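The Embeddable concern's pipeline could be sketched as follows. The chunk sizes, env var names, PDF handling, and hard_split fallback come from the commit message; the method bodies, the greedy paragraph packing, and the request/response handling (standard OpenAI embeddings shape) are assumptions:

```ruby
# app/models/concerns/vector_store/embeddable.rb — hedged sketch.
require "pdf-reader"
require "faraday"
require "json"

module VectorStore
  module Embeddable
    CHUNK_SIZE    = 2_000 # target characters per chunk
    CHUNK_OVERLAP = 200   # characters repeated between adjacent chunks

    # PDF via pdf-reader; anything else is treated as plain text.
    def extract_text(io, filename)
      if File.extname(filename).casecmp?(".pdf")
        PDF::Reader.new(io).pages.map(&:text).join("\n\n")
      else
        io.read
      end
    end

    # Greedily pack whole paragraphs up to CHUNK_SIZE, carrying a short tail
    # of the previous chunk forward as overlap.
    def chunk(text)
      paragraphs = text.split(/\n{2,}/).flat_map { |p| hard_split(p) }
      chunks, current = [], ""
      paragraphs.each do |para|
        if !current.empty? && current.length + para.length > CHUNK_SIZE
          chunks << current
          current = current[-CHUNK_OVERLAP..] || current
        end
        current = [current, para].reject(&:empty?).join("\n\n")
      end
      chunks << current unless current.empty?
      chunks
    end

    def embed(text) = embed_batch([text]).first

    # POST to an OpenAI-compatible /v1/embeddings endpoint via Faraday,
    # preferring EMBEDDING_* env vars and falling back to OPENAI_*.
    def embed_batch(texts)
      base  = ENV.fetch("EMBEDDING_URI_BASE") { ENV.fetch("OPENAI_URI_BASE", "https://api.openai.com/v1") }
      model = ENV.fetch("EMBEDDING_MODEL") { ENV.fetch("OPENAI_MODEL") }
      response = Faraday.post("#{base}/embeddings") do |req|
        req.headers["Authorization"] = "Bearer #{ENV["OPENAI_ACCESS_TOKEN"]}"
        req.headers["Content-Type"]  = "application/json"
        req.body = { model: model, input: texts }.to_json
      end
      JSON.parse(response.body).fetch("data").map { |d| d.fetch("embedding") }
    end

    private

    # Fallback for paragraphs longer than CHUNK_SIZE (added per PR review),
    # so no single chunk can overflow the embedding model's token limit.
    def hard_split(paragraph)
      paragraph.chars.each_slice(CHUNK_SIZE).map(&:join)
    end
  end
end
```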
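And a sketch of the adapter's transactional upload and cosine-similarity search, assuming the schema above. The <=> operator, parameterized binds, and the transaction around chunk inserts are per the commit message; the method signatures and the QueryAttribute bind style are assumptions:

```ruby
# Hedged sketch of VectorStore::Pgvector's write and search paths.
module VectorStore
  class Pgvector
    include Embeddable

    # One transaction per file, so a mid-file failure leaves no partial writes.
    def upload_file(store_id, file_id, io, filename)
      chunks     = chunk(extract_text(io, filename))
      embeddings = embed_batch(chunks)
      ActiveRecord::Base.transaction do
        chunks.each_with_index do |content, i|
          insert_chunk(store_id, file_id, i, content, embeddings[i])
        end
      end
    end

    # Cosine-similarity search via pgvector's <=> (cosine distance) operator.
    def search(store_id, query, limit: 5)
      vec = "[#{embed(query).join(',')}]" # pgvector's text literal format
      sql = <<~SQL
        SELECT content, metadata, 1 - (embedding <=> $1::vector) AS similarity
        FROM vector_store_chunks
        WHERE store_id = $2
        ORDER BY embedding <=> $1::vector
        LIMIT $3
      SQL
      ActiveRecord::Base.connection
        .exec_query(sql, "vector_search", [bind("embedding", vec), bind("store_id", store_id), bind("limit", limit.to_i)])
        .to_a
    end

    private

    def insert_chunk(store_id, file_id, index, content, embedding)
      sql = <<~SQL
        INSERT INTO vector_store_chunks
          (store_id, file_id, chunk_index, content, embedding, metadata, created_at, updated_at)
        VALUES ($1, $2, $3, $4, $5::vector, '{}', NOW(), NOW())
      SQL
      ActiveRecord::Base.connection.exec_query(sql, "chunk_insert", [
        bind("store_id", store_id),
        bind("file_id", file_id),
        bind("chunk_index", index),
        bind("content", content),
        bind("embedding", "[#{embedding.join(',')}]")
      ])
    end

    # Wrap a value for exec_query's binds argument.
    def bind(name, value)
      ActiveRecord::Relation::QueryAttribute.new(name, value, ActiveRecord::Type.default_value)
    end
  end
end
```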
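The registry test might look roughly like this; the registry's entry point isn't shown in the commit, so VectorStore.adapter is a stand-in name for whatever it actually exposes:

```ruby
# test/models/vector_store/registry_test.rb — hedged sketch.
require "test_helper"

class VectorStore::RegistryTest < ActiveSupport::TestCase
  test "selects the pgvector adapter when VECTOR_STORE_PROVIDER=pgvector" do
    original = ENV["VECTOR_STORE_PROVIDER"]
    ENV["VECTOR_STORE_PROVIDER"] = "pgvector"

    assert_instance_of VectorStore::Pgvector, VectorStore.adapter # assumed API
  ensure
    ENV["VECTOR_STORE_PROVIDER"] = original
  end
end
```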
```diff
@@ -69,6 +69,10 @@ x-rails-env: &rails_env
   OPENAI_ACCESS_TOKEN: token-can-be-any-value-for-ollama
   OPENAI_MODEL: llama3.1:8b # Note: Use tool-enabled model
   OPENAI_URI_BASE: http://ollama:11434/v1
+  # Vector store — pgvector keeps all data local (requires pgvector/pgvector Docker image for db)
+  VECTOR_STORE_PROVIDER: pgvector
+  EMBEDDING_MODEL: nomic-embed-text
+  EMBEDDING_DIMENSIONS: "1024"
   # NOTE: enabling OpenAI will incur costs when you use AI-related features in the app (chat, rules). Make sure you have set appropriate spend limits on your account before adding this.
   # OPENAI_ACCESS_TOKEN: ${OPENAI_ACCESS_TOKEN}
   # External AI Assistant — delegates chat to a remote AI agent (e.g., OpenClaw).
@@ -128,7 +132,7 @@ services:
       - "11434:11434"
     environment:
       - OLLAMA_KEEP_ALIVE=1h
-      - OLLAMA_MODELS=deepseek-r1:8b,llama3.1:8b # Pre-load model on startup, you can change this to your preferred model
+      - OLLAMA_MODELS=deepseek-r1:8b,llama3.1:8b,nomic-embed-text # Pre-load model on startup, you can change this to your preferred model
     networks:
       - sure_net
     # Recommended: Enable GPU support
@@ -213,7 +217,7 @@ services:
       - sure_net
 
   db:
-    image: postgres:16
+    image: pgvector/pgvector:pg16
     restart: unless-stopped
     volumes:
       - postgres-data:/var/lib/postgresql/data
```
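Since the docs now cover Ollama pull instructions and the pgvector image requirement, a quick way to verify the setup after bringing the stack up might be the following (service names follow the compose example above; the postgres user name is an assumption, so match your own compose settings):

```sh
# Pull the embedding model by hand (the compose file also pre-loads it via OLLAMA_MODELS)
docker compose exec ollama ollama pull nomic-embed-text

# Sanity-check that the swapped-in db image actually ships pgvector
docker compose exec db psql -U postgres -c \
  "SELECT name, default_version FROM pg_available_extensions WHERE name = 'vector';"
```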