* Add conditional migration for vector_store_chunks table
Creates the pgvector-backed chunks table when VECTOR_STORE_PROVIDER=pgvector.
Enables the vector extension, adds store_id/file_id indexes, and uses
vector(1024) column type for embeddings.
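
For illustration only, the DDL such a migration might issue can be sketched as a plain Ruby helper. The helper name, the `id` column, and every column type other than vector(1024) are assumptions, not taken from the actual migration file.

```ruby
# Hypothetical helper: returns the SQL a conditional migration could run,
# or nil when the pgvector provider is not selected.
# Column types other than vector(N) are assumptions for this sketch.
def vector_store_chunks_ddl(env: ENV, dimensions: 1024)
  return nil unless env["VECTOR_STORE_PROVIDER"] == "pgvector"

  dims = Integer(dimensions) # guard against interpolating non-numeric input

  <<~SQL
    CREATE EXTENSION IF NOT EXISTS vector;
    CREATE TABLE vector_store_chunks (
      id          bigserial PRIMARY KEY,
      store_id    text,
      file_id     text,
      chunk_index integer,
      content     text,
      embedding   vector(#{dims}),
      metadata    jsonb
    );
    CREATE INDEX ON vector_store_chunks (store_id);
    CREATE INDEX ON vector_store_chunks (file_id);
  SQL
end
```

In a Rails migration the same statements would normally go through enable_extension, create_table, and add_index instead of raw SQL.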
* Add VectorStore::Embeddable concern for text extraction and embedding
Shared concern providing extract_text (PDF via pdf-reader, plain-text as-is),
paragraph-boundary chunking (~2000 chars, ~200 overlap), and embed/embed_batch
via an OpenAI-compatible /v1/embeddings endpoint using Faraday. Configurable via
EMBEDDING_MODEL, EMBEDDING_URI_BASE, with fallback to OPENAI_* env vars.
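
A minimal sketch of paragraph-boundary chunking with overlap, assuming paragraphs are separated by blank lines and that the overlap is taken as the trailing characters of the previous chunk. The method name chunk_paragraphs and its keyword arguments are hypothetical; the concern's real implementation may differ.

```ruby
# Hypothetical sketch: pack whole paragraphs into chunks of roughly `size`
# characters, carrying the last `overlap` characters into the next chunk.
def chunk_paragraphs(text, size: 2000, overlap: 200)
  paragraphs = text.split(/\n\s*\n/).map(&:strip).reject(&:empty?)
  chunks = []
  current = ""

  paragraphs.each do |para|
    if current.empty?
      current = para
    elsif current.length + 2 + para.length <= size
      # Paragraph still fits (2 accounts for the "\n\n" separator).
      current = "#{current}\n\n#{para}"
    else
      # Close the current chunk and seed the next one with its tail.
      chunks << current
      tail = current.length > overlap ? current[-overlap, overlap] : current
      current = tail.empty? ? para : "#{tail}\n\n#{para}"
    end
  end

  chunks << current unless current.empty?
  chunks
end
```

Note that in this sketch a single paragraph longer than `size` is passed through unsplit, so a chunk can exceed the target length.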
* Implement VectorStore::Pgvector adapter with raw SQL
Replaces the stub with a full implementation using
ActiveRecord::Base.connection with parameterized binds. Supports
create_store, delete_store, upload_file (extract+chunk+embed+insert),
remove_file, and cosine-similarity search via the <=> operator.
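
As a rough sketch of the search path: pgvector's <=> operator returns cosine distance (1 minus cosine similarity), so ordering ascending puts the closest chunks first. The helper names, the selected columns, and the $1-style placeholders below are assumptions for illustration, not the adapter's actual code.

```ruby
# Format a Ruby array as pgvector's text representation, e.g. "[1.0,0.5]".
def vector_literal(embedding)
  "[#{embedding.map { |x| Float(x) }.join(',')}]"
end

# Pure-Ruby equivalent of what <=> computes: 1 - cosine similarity.
def cosine_distance(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  1.0 - dot / (norm_a * norm_b)
end

# Hypothetical helper: build a parameterized query plus its bind values.
def search_query(store_id, query_embedding, limit: 5)
  sql = <<~SQL
    SELECT file_id, chunk_index, content,
           embedding <=> $1::vector AS distance
    FROM vector_store_chunks
    WHERE store_id = $2
    ORDER BY distance
    LIMIT $3
  SQL
  [sql, [vector_literal(query_embedding), store_id, Integer(limit)]]
end
```

Passing the embedding as a bind value rather than interpolating it keeps the statement parameterized end to end.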
* Add registry test for pgvector adapter selection
* Configure pgvector in compose.example.ai.yml
Switch db image to pgvector/pgvector:pg16, add VECTOR_STORE_PROVIDER,
EMBEDDING_MODEL, and EMBEDDING_DIMENSIONS env vars, and include
nomic-embed-text in Ollama's pre-loaded models.
* Update pgvector docs from scaffolded to ready
Document env vars, embedding model setup, pgvector Docker image
requirement, and Ollama pull instructions.
* Address PR review feedback
- Migration: remove the env guard and use a pgvector_available? check, so the
migration still runs on plain Postgres (CI) but only creates the table on
pgvector-capable servers.
Add NOT NULL constraints on content/embedding/metadata, unique index on
(store_id, file_id, chunk_index).
- Pgvector adapter: wrap chunk inserts in a DB transaction to prevent
partial file writes. Override supported_extensions to match formats
that extract_text can actually parse.
- Embeddable: add hard_split fallback for paragraphs exceeding CHUNK_SIZE
to avoid overflowing embedding model token limits.
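
The hard_split fallback mentioned above could look roughly like this, assuming a fixed-width character split and a CHUNK_SIZE of 2000. The signature and the choice to cut at arbitrary character offsets are assumptions; the real method may prefer sentence or whitespace boundaries.

```ruby
# Assumed value, matching the ~2000 character target described earlier.
CHUNK_SIZE = 2000

# Hypothetical sketch: break an oversized paragraph into pieces of at most
# `size` characters; paragraphs that already fit are returned unchanged.
def hard_split(paragraph, size = CHUNK_SIZE)
  return [paragraph] if paragraph.length <= size

  paragraph.chars.each_slice(size).map(&:join)
end
```

Slicing by characters rather than bytes keeps multi-byte UTF-8 text intact.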
* Bump schema version to include vector_store_chunks migration
CI uses db:schema:load, which checks the schema version. Without this bump,
the migration is detected as pending and tests fail to start.
* Update 20260316120000_create_vector_store_chunks.rb
---------
Co-authored-by: sokiee <sokysrm@gmail.com>