QT/sure - sure

QT/sure

Fork 0

mirror of https://github.com/we-promise/sure.git synced 2026-05-07 12:54:04 +00:00

Commit Graph

Author	SHA1	Message	Date
Juan José Mata	30ef253536	Fix pgvector migration to only run when explicitly configured (#1239 ) The migration previously checked only if the pgvector extension was available on the PostgreSQL server. In production Docker environments (e.g., managed Postgres services), the extension may be present but the database user lacks superuser privileges to enable it, causing the migration to fail. Now the migration requires VECTOR_STORE_PROVIDER=pgvector to be set before attempting to enable the extension, matching the application's own configuration pattern in VectorStore::Registry. https://claude.ai/code/session_017YYBFXGwamXpGwwDZxe8Pv Co-authored-by: Claude <noreply@anthropic.com>	2026-03-21 09:24:11 +01:00
Dream	6d22514c01	feat(vector-store): Implement pgvector adapter for self-hosted RAG (#1211 ) * Add conditional migration for vector_store_chunks table Creates the pgvector-backed chunks table when VECTOR_STORE_PROVIDER=pgvector. Enables the vector extension, adds store_id/file_id indexes, and uses vector(1024) column type for embeddings. * Add VectorStore::Embeddable concern for text extraction and embedding Shared concern providing extract_text (PDF via pdf-reader, plain-text as-is), paragraph-boundary chunking (~2000 chars, ~200 overlap), and embed/embed_batch via OpenAI-compatible /v1/embeddings endpoint using Faraday. Configurable via EMBEDDING_MODEL, EMBEDDING_URI_BASE, with fallback to OPENAI_* env vars. * Implement VectorStore::Pgvector adapter with raw SQL Replaces the stub with a full implementation using ActiveRecord::Base.connection with parameterized binds. Supports create_store, delete_store, upload_file (extract+chunk+embed+insert), remove_file, and cosine-similarity search via the <=> operator. * Add registry test for pgvector adapter selection * Configure pgvector in compose.example.ai.yml Switch db image to pgvector/pgvector:pg16, add VECTOR_STORE_PROVIDER, EMBEDDING_MODEL, and EMBEDDING_DIMENSIONS env vars, and include nomic-embed-text in Ollama's pre-loaded models. * Update pgvector docs from scaffolded to ready Document env vars, embedding model setup, pgvector Docker image requirement, and Ollama pull instructions. * Address PR review feedback - Migration: remove env guard, use pgvector_available? check so it runs on plain Postgres (CI) but creates the table on pgvector-capable servers. Add NOT NULL constraints on content/embedding/metadata, unique index on (store_id, file_id, chunk_index). - Pgvector adapter: wrap chunk inserts in a DB transaction to prevent partial file writes. Override supported_extensions to match formats that extract_text can actually parse. - Embeddable: add hard_split fallback for paragraphs exceeding CHUNK_SIZE to avoid overflowing embedding model token limits. * Bump schema version to include vector_store_chunks migration CI uses db:schema:load which checks the version — without this bump, the migration is detected as pending and tests fail to start. * Update 20260316120000_create_vector_store_chunks.rb --------- Co-authored-by: sokiee <sokysrm@gmail.com>	2026-03-20 17:01:31 +01:00

Author

SHA1

Message

Date

Juan José Mata

30ef253536

Fix pgvector migration to only run when explicitly configured (#1239 )

The migration previously checked only if the pgvector extension was
available on the PostgreSQL server. In production Docker environments
(e.g., managed Postgres services), the extension may be present but the
database user lacks superuser privileges to enable it, causing the
migration to fail.

Now the migration requires VECTOR_STORE_PROVIDER=pgvector to be set
before attempting to enable the extension, matching the application's
own configuration pattern in VectorStore::Registry.

https://claude.ai/code/session_017YYBFXGwamXpGwwDZxe8Pv

Co-authored-by: Claude <noreply@anthropic.com>

2026-03-21 09:24:11 +01:00

Dream

6d22514c01

feat(vector-store): Implement pgvector adapter for self-hosted RAG (#1211 )

* Add conditional migration for vector_store_chunks table

Creates the pgvector-backed chunks table when VECTOR_STORE_PROVIDER=pgvector.
Enables the vector extension, adds store_id/file_id indexes, and uses
vector(1024) column type for embeddings.

* Add VectorStore::Embeddable concern for text extraction and embedding

Shared concern providing extract_text (PDF via pdf-reader, plain-text as-is),
paragraph-boundary chunking (~2000 chars, ~200 overlap), and embed/embed_batch
via OpenAI-compatible /v1/embeddings endpoint using Faraday. Configurable via
EMBEDDING_MODEL, EMBEDDING_URI_BASE, with fallback to OPENAI_* env vars.

* Implement VectorStore::Pgvector adapter with raw SQL

Replaces the stub with a full implementation using
ActiveRecord::Base.connection with parameterized binds. Supports
create_store, delete_store, upload_file (extract+chunk+embed+insert),
remove_file, and cosine-similarity search via the <=> operator.

* Add registry test for pgvector adapter selection

* Configure pgvector in compose.example.ai.yml

Switch db image to pgvector/pgvector:pg16, add VECTOR_STORE_PROVIDER,
EMBEDDING_MODEL, and EMBEDDING_DIMENSIONS env vars, and include
nomic-embed-text in Ollama's pre-loaded models.

* Update pgvector docs from scaffolded to ready

Document env vars, embedding model setup, pgvector Docker image
requirement, and Ollama pull instructions.

* Address PR review feedback

- Migration: remove env guard, use pgvector_available? check so it runs
  on plain Postgres (CI) but creates the table on pgvector-capable servers.
  Add NOT NULL constraints on content/embedding/metadata, unique index on
  (store_id, file_id, chunk_index).
- Pgvector adapter: wrap chunk inserts in a DB transaction to prevent
  partial file writes. Override supported_extensions to match formats
  that extract_text can actually parse.
- Embeddable: add hard_split fallback for paragraphs exceeding CHUNK_SIZE
  to avoid overflowing embedding model token limits.

* Bump schema version to include vector_store_chunks migration

CI uses db:schema:load which checks the version — without this bump,
the migration is detected as pending and tests fail to start.

* Update 20260316120000_create_vector_store_chunks.rb

---------

Co-authored-by: sokiee <sokysrm@gmail.com>

2026-03-20 17:01:31 +01:00

2 Commits