Add Family vector search function call / support for document vault (#961)

* Add SearchFamilyImportedFiles assistant function with vector store support

Implement per-Family document search using OpenAI vector stores, allowing
the AI assistant to search through uploaded financial documents (tax returns,
statements, contracts, etc.). The architecture is modular with a provider-
agnostic VectorStoreConcept interface so other RAG backends can be added.

Key components:
- Assistant::Function::SearchFamilyImportedFiles - tool callable from any LLM
- Provider::VectorStoreConcept - abstract vector store interface
- Provider::Openai vector store methods (create, upload, search, delete)
- Family::VectorSearchable concern with document management
- FamilyDocument model for tracking uploaded files
- Migration adding vector_store_id to families and family_documents table

https://claude.ai/code/session_01TSkKc7a9Yu2ugm1RvSf4dh

* Extract VectorStore adapter layer for swappable backends

Replace the Provider::VectorStoreConcept mixin with a standalone adapter
architecture under VectorStore::. This cleanly separates vector store
concerns from the LLM provider and makes it trivial to swap backends.

Components:
- VectorStore::Base — abstract interface (create/delete/upload/remove/search)
- VectorStore::Openai — uses ruby-openai gem's native vector_stores.search
- VectorStore::Pgvector — skeleton for local pgvector + embedding model
- VectorStore::Qdrant — skeleton for Qdrant vector DB
- VectorStore::Registry — resolves adapter from VECTOR_STORE_PROVIDER env
- VectorStore::Response — success/failure wrapper (like Provider::Response)

Consumers updated to go through VectorStore.adapter:
- Family::VectorSearchable
- Assistant::Function::SearchFamilyImportedFiles
- FamilyDocument

Removed: Provider::VectorStoreConcept, vector store methods from Provider::Openai
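
The adapter split described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the `VectorStoreSketch` namespace, the `InMemory` backend, its `add_chunk` method, and the private `with_response` helper are all hypothetical, and only a subset of the interface is shown.

```ruby
# Hypothetical sketch of the adapter layer; the real classes live under
# VectorStore:: in the app and may differ in detail.
module VectorStoreSketch
  # Success/failure wrapper, loosely modeled on the success?/data/error
  # shape that the tests in this commit rely on.
  Response = Struct.new(:success, :data, :error, keyword_init: true) do
    def success?
      success
    end
  end

  # Abstract interface: every operation raises until a backend overrides it.
  class Base
    def create_store(name:)
      raise NotImplementedError, "#{self.class}#create_store"
    end

    def search(store_id:, query:, max_results: 10)
      raise NotImplementedError, "#{self.class}#search"
    end

    private

    # Run the block and wrap its value, or its exception, in a Response.
    def with_response
      Response.new(success: true, data: yield, error: nil)
    rescue StandardError => e
      Response.new(success: false, data: nil, error: e)
    end
  end

  # Toy in-memory backend (an assumption for illustration only).
  class InMemory < Base
    def initialize
      @stores = {}
    end

    def create_store(name:)
      with_response do
        id = "vs_#{@stores.size + 1}"
        @stores[id] = { name: name, chunks: [] }
        { id: id }
      end
    end

    def add_chunk(store_id:, filename:, content:)
      with_response do
        @stores.fetch(store_id)[:chunks] << { filename: filename, content: content }
        { filename: filename }
      end
    end

    # Naive substring match standing in for real vector similarity.
    def search(store_id:, query:, max_results: 10)
      with_response do
        @stores.fetch(store_id)[:chunks]
          .select { |chunk| chunk[:content].downcase.include?(query.downcase) }
          .first(max_results)
      end
    end
  end
end
```

Because callers only see `Response` objects, a consumer such as the assistant function can branch on `success?` without knowing which backend produced the result.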

https://claude.ai/code/session_01TSkKc7a9Yu2ugm1RvSf4dh

* Add Vector Store configuration docs to ai.md

Documents how to configure the document search feature, covering all
three supported backends (OpenAI, pgvector, Qdrant), environment
variables, Docker Compose examples, supported file types, and privacy
considerations.

https://claude.ai/code/session_01TSkKc7a9Yu2ugm1RvSf4dh

* No need to specify `imported` in code

* Missed a couple more places

* Tiny reordering for the human OCD

* Update app/models/assistant/function/search_family_files.rb

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Signed-off-by: Juan José Mata <jjmata@jjmata.com>

* PR comments

* More PR comments

---------

Signed-off-by: Juan José Mata <jjmata@jjmata.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Juan José Mata
2026-02-11 15:22:56 +01:00
committed by GitHub
parent 1ebbd5bbc5
commit 9e57954a99
20 changed files with 1212 additions and 6 deletions

test/fixtures/family_documents.yml

@@ -0,0 +1,26 @@
tax_return:
  family: dylan_family
  filename: 2024_tax_return.pdf
  content_type: application/pdf
  file_size: 102400
  provider_file_id: file-abc123
  status: ready
  metadata: {}

bank_statement:
  family: dylan_family
  filename: jan_2025_statement.pdf
  content_type: application/pdf
  file_size: 51200
  provider_file_id: file-def456
  status: ready
  metadata: {}

pending_doc:
  family: dylan_family
  filename: pending_upload.docx
  content_type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
  file_size: 25600
  provider_file_id:
  status: pending
  metadata: {}


@@ -0,0 +1,129 @@
require "test_helper"

class Assistant::Function::SearchFamilyFilesTest < ActiveSupport::TestCase
  setup do
    @user = users(:family_admin)
    @function = Assistant::Function::SearchFamilyFiles.new(@user)
  end

  test "has correct name" do
    assert_equal "search_family_files", @function.name
  end

  test "has a description" do
    assert_not_empty @function.description
  end

  test "is not in strict mode" do
    assert_not @function.strict_mode?
  end

  test "params_schema requires query" do
    schema = @function.params_schema

    assert_includes schema[:required], "query"
    assert schema[:properties].key?(:query)
  end

  test "generates valid tool definition" do
    definition = @function.to_definition

    assert_equal "search_family_files", definition[:name]
    assert_not_nil definition[:description]
    assert_not_nil definition[:params_schema]
    assert_equal false, definition[:strict]
  end

  test "returns no_documents error when family has no vector store" do
    @user.family.update!(vector_store_id: nil)

    result = @function.call("query" => "tax return")

    assert_equal false, result[:success]
    assert_equal "no_documents", result[:error]
  end

  test "returns provider_not_configured when no adapter is available" do
    @user.family.update!(vector_store_id: "vs_test123")
    VectorStore::Registry.stubs(:adapter).returns(nil)

    result = @function.call("query" => "tax return")

    assert_equal false, result[:success]
    assert_equal "provider_not_configured", result[:error]
  end

  test "returns search results on success" do
    @user.family.update!(vector_store_id: "vs_test123")

    mock_adapter = mock("vector_store_adapter")
    mock_adapter.stubs(:search).returns(
      VectorStore::Response.new(
        success?: true,
        data: [
          { content: "Total income: $85,000", filename: "2024_tax_return.pdf", score: 0.95, file_id: "file-abc" },
          { content: "W-2 wages: $80,000", filename: "2024_tax_return.pdf", score: 0.87, file_id: "file-abc" }
        ],
        error: nil
      )
    )
    VectorStore::Registry.stubs(:adapter).returns(mock_adapter)

    result = @function.call("query" => "What was my total income?")

    assert_equal true, result[:success]
    assert_equal 2, result[:result_count]
    assert_equal "Total income: $85,000", result[:results].first[:content]
    assert_equal "2024_tax_return.pdf", result[:results].first[:filename]
  end

  test "returns empty results message when no matches found" do
    @user.family.update!(vector_store_id: "vs_test123")

    mock_adapter = mock("vector_store_adapter")
    mock_adapter.stubs(:search).returns(
      VectorStore::Response.new(success?: true, data: [], error: nil)
    )
    VectorStore::Registry.stubs(:adapter).returns(mock_adapter)

    result = @function.call("query" => "nonexistent document")

    assert_equal true, result[:success]
    assert_empty result[:results]
  end

  test "handles search failure gracefully" do
    @user.family.update!(vector_store_id: "vs_test123")

    mock_adapter = mock("vector_store_adapter")
    mock_adapter.stubs(:search).returns(
      VectorStore::Response.new(
        success?: false,
        data: nil,
        error: VectorStore::Error.new("API rate limit exceeded")
      )
    )
    VectorStore::Registry.stubs(:adapter).returns(mock_adapter)

    result = @function.call("query" => "tax return")

    assert_equal false, result[:success]
    assert_equal "search_failed", result[:error]
  end

  test "caps max_results at 20" do
    @user.family.update!(vector_store_id: "vs_test123")

    mock_adapter = mock("vector_store_adapter")
    mock_adapter.expects(:search).with(
      store_id: "vs_test123",
      query: "test",
      max_results: 20
    ).returns(VectorStore::Response.new(success?: true, data: [], error: nil))
    VectorStore::Registry.stubs(:adapter).returns(mock_adapter)

    @function.call("query" => "test", "max_results" => 50)
  end
end
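
The control flow these tests pin down might look roughly like the sketch below. It is inferred from the assertions only, not taken from the committed `Assistant::Function::SearchFamilyFiles`; the free function `search_family_files`, the stub classes, the default of 10 results, the lower bound of 1, and the order of the two guard clauses are all assumptions.

```ruby
# Hypothetical sketch of the tool's control flow, inferred from the tests.
MAX_RESULTS_CAP = 20      # the tests assert that requests are capped at 20
DEFAULT_MAX_RESULTS = 10  # assumed default; not confirmed by the source

# Minimal stand-ins so the sketch runs on its own.
StubResponse = Struct.new(:ok, :data) do
  def success?
    ok
  end
end

class StubAdapter
  attr_reader :last_max_results

  def search(store_id:, query:, max_results:)
    @last_max_results = max_results
    StubResponse.new(true, [ { content: "Total income: $85,000", filename: "2024_tax_return.pdf" } ])
  end
end

def search_family_files(params, store_id:, adapter:)
  # Guard clauses map to the error codes asserted in the tests.
  return { success: false, error: "no_documents" } if store_id.nil?
  return { success: false, error: "provider_not_configured" } if adapter.nil?

  # Clamp the caller-supplied limit so an LLM cannot request huge result sets.
  requested = (params["max_results"] || DEFAULT_MAX_RESULTS).to_i
  max_results = requested.clamp(1, MAX_RESULTS_CAP)

  response = adapter.search(store_id: store_id, query: params["query"], max_results: max_results)
  return { success: false, error: "search_failed" } unless response.success?

  { success: true, result_count: response.data.size, results: response.data }
end

adapter = StubAdapter.new
result = search_family_files({ "query" => "income", "max_results" => 50 }, store_id: "vs_test123", adapter: adapter)
puts result[:result_count]
puts adapter.last_max_results
```

Returning string error codes rather than raising keeps failures inside the tool result, so the calling model can read the code and respond to it.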


@@ -0,0 +1,54 @@
require "test_helper"

class FamilyDocumentTest < ActiveSupport::TestCase
  setup do
    @family = families(:dylan_family)
    @document = family_documents(:tax_return)
  end

  test "belongs to a family" do
    assert_equal @family, @document.family
  end

  test "validates filename presence" do
    doc = FamilyDocument.new(family: @family, status: "pending")
    assert_not doc.valid?
    assert_includes doc.errors[:filename], "can't be blank"
  end

  test "validates status inclusion" do
    doc = FamilyDocument.new(family: @family, filename: "test.pdf", status: "invalid")
    assert_not doc.valid?
    assert_includes doc.errors[:status], "is not included in the list"
  end

  test "ready scope returns only ready documents" do
    ready_docs = @family.family_documents.ready
    assert ready_docs.all? { |d| d.status == "ready" }
    assert_not_includes ready_docs, family_documents(:pending_doc)
  end

  test "mark_ready! updates status" do
    doc = family_documents(:pending_doc)
    doc.mark_ready!
    assert_equal "ready", doc.reload.status
  end

  test "mark_error! updates status and metadata" do
    doc = family_documents(:pending_doc)
    doc.mark_error!("Upload failed")
    doc.reload
    assert_equal "error", doc.status
    assert_equal "Upload failed", doc.metadata["error"]
  end

  test "supported_extension? returns true for supported types" do
    doc = FamilyDocument.new(filename: "report.pdf")
    assert doc.supported_extension?
  end

  test "supported_extension? returns false for unsupported types" do
    doc = FamilyDocument.new(filename: "video.mp4")
    assert_not doc.supported_extension?
  end
end
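
As a rough illustration of the extension check exercised by the last two tests, a standalone version might look like this. The free function and the case-insensitive comparison are assumptions, and the list contains only the seven extensions that the tests in this commit confirm; the real list may be longer.

```ruby
# Assumed subset: the tests only confirm that these seven are included.
SUPPORTED_EXTENSIONS = %w[.pdf .docx .xlsx .csv .json .txt .md].freeze

# Hypothetical standalone version of FamilyDocument#supported_extension?.
def supported_extension?(filename)
  return false if filename.nil?

  # File.extname returns the last extension including the dot, e.g. ".pdf".
  SUPPORTED_EXTENSIONS.include?(File.extname(filename).downcase)
end

puts supported_extension?("report.pdf")
puts supported_extension?("video.mp4")
```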


@@ -0,0 +1,42 @@
require "test_helper"

class VectorStore::BaseTest < ActiveSupport::TestCase
  setup do
    @adapter = VectorStore::Base.new
  end

  test "create_store raises NotImplementedError" do
    assert_raises(NotImplementedError) { @adapter.create_store(name: "test") }
  end

  test "delete_store raises NotImplementedError" do
    assert_raises(NotImplementedError) { @adapter.delete_store(store_id: "test") }
  end

  test "upload_file raises NotImplementedError" do
    assert_raises(NotImplementedError) { @adapter.upload_file(store_id: "s", file_content: "c", filename: "f") }
  end

  test "remove_file raises NotImplementedError" do
    assert_raises(NotImplementedError) { @adapter.remove_file(store_id: "s", file_id: "f") }
  end

  test "search raises NotImplementedError" do
    assert_raises(NotImplementedError) { @adapter.search(store_id: "s", query: "q") }
  end

  test "supported_extensions includes common file types" do
    exts = @adapter.supported_extensions
    assert_includes exts, ".pdf"
    assert_includes exts, ".docx"
    assert_includes exts, ".xlsx"
    assert_includes exts, ".csv"
    assert_includes exts, ".json"
    assert_includes exts, ".txt"
    assert_includes exts, ".md"
  end

  test "SUPPORTED_EXTENSIONS is frozen" do
    assert VectorStore::Base::SUPPORTED_EXTENSIONS.frozen?
  end
end


@@ -0,0 +1,132 @@
require "test_helper"

class VectorStore::OpenaiTest < ActiveSupport::TestCase
  setup do
    @adapter = VectorStore::Openai.new(access_token: "sk-test-key")
  end

  test "create_store wraps response" do
    mock_client = mock("openai_client")
    mock_vs = mock("vector_stores")
    mock_vs.expects(:create).with(parameters: { name: "Test Store" }).returns({ "id" => "vs_abc123" })
    mock_client.stubs(:vector_stores).returns(mock_vs)
    @adapter.instance_variable_set(:@client, mock_client)

    response = @adapter.create_store(name: "Test Store")

    assert response.success?
    assert_equal "vs_abc123", response.data[:id]
  end

  test "delete_store wraps response" do
    mock_client = mock("openai_client")
    mock_vs = mock("vector_stores")
    mock_vs.expects(:delete).with(id: "vs_abc123").returns(true)
    mock_client.stubs(:vector_stores).returns(mock_vs)
    @adapter.instance_variable_set(:@client, mock_client)

    response = @adapter.delete_store(store_id: "vs_abc123")

    assert response.success?
  end

  test "upload_file uploads and attaches to store" do
    mock_client = mock("openai_client")
    mock_files = mock("files")
    mock_files.expects(:upload).returns({ "id" => "file-xyz" })
    mock_vs_files = mock("vector_store_files")
    mock_vs_files.expects(:create).with(
      vector_store_id: "vs_abc123",
      parameters: { file_id: "file-xyz" }
    ).returns(true)
    mock_client.stubs(:files).returns(mock_files)
    mock_client.stubs(:vector_store_files).returns(mock_vs_files)
    @adapter.instance_variable_set(:@client, mock_client)

    response = @adapter.upload_file(
      store_id: "vs_abc123",
      file_content: "Hello world",
      filename: "test.txt"
    )

    assert response.success?
    assert_equal "file-xyz", response.data[:file_id]
  end

  test "remove_file deletes from store" do
    mock_client = mock("openai_client")
    mock_vs_files = mock("vector_store_files")
    mock_vs_files.expects(:delete).with(
      vector_store_id: "vs_abc123",
      id: "file-xyz"
    ).returns(true)
    mock_client.stubs(:vector_store_files).returns(mock_vs_files)
    @adapter.instance_variable_set(:@client, mock_client)

    response = @adapter.remove_file(store_id: "vs_abc123", file_id: "file-xyz")

    assert response.success?
  end

  test "search uses gem client and parses results" do
    mock_client = mock("openai_client")
    mock_vs = mock("vector_stores")
    mock_vs.expects(:search).with(
      id: "vs_abc123",
      parameters: { query: "income", max_num_results: 5 }
    ).returns({
      "data" => [
        {
          "file_id" => "file-xyz",
          "filename" => "tax_return.pdf",
          "score" => 0.95,
          "content" => [ { "type" => "text", "text" => "Total income: $85,000" } ]
        }
      ]
    })
    mock_client.stubs(:vector_stores).returns(mock_vs)
    @adapter.instance_variable_set(:@client, mock_client)

    response = @adapter.search(store_id: "vs_abc123", query: "income", max_results: 5)

    assert response.success?
    assert_equal 1, response.data.size
    assert_equal "Total income: $85,000", response.data.first[:content]
    assert_equal "tax_return.pdf", response.data.first[:filename]
    assert_equal 0.95, response.data.first[:score]
  end

  test "search returns empty array when no results" do
    mock_client = mock("openai_client")
    mock_vs = mock("vector_stores")
    mock_vs.expects(:search).returns({ "data" => [] })
    mock_client.stubs(:vector_stores).returns(mock_vs)
    @adapter.instance_variable_set(:@client, mock_client)

    response = @adapter.search(store_id: "vs_abc123", query: "nothing")

    assert response.success?
    assert_empty response.data
  end

  test "wraps errors in failure response" do
    mock_client = mock("openai_client")
    mock_vs = mock("vector_stores")
    mock_vs.expects(:create).raises(StandardError, "API error")
    mock_client.stubs(:vector_stores).returns(mock_vs)
    @adapter.instance_variable_set(:@client, mock_client)

    response = @adapter.create_store(name: "Broken Store")

    assert_not response.success?
    assert_equal "API error", response.error.message
  end

  test "supported_extensions returns the default list" do
    assert_includes @adapter.supported_extensions, ".pdf"
    assert_includes @adapter.supported_extensions, ".docx"
    assert_includes @adapter.supported_extensions, ".csv"
  end
end
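
The normalization that the search test expects (a raw payload with string keys and a `content` array becoming flat hashes with symbol keys) could be sketched as below. The helper name `parse_search_results` is hypothetical, and joining multiple text parts with a newline is an assumption, since the test only covers a single part.

```ruby
# Hypothetical helper: flatten a raw search payload into simple result hashes.
def parse_search_results(raw)
  Array(raw["data"]).map do |hit|
    # Keep only text parts; joining with "\n" is an assumption.
    text = Array(hit["content"])
      .select { |part| part["type"] == "text" }
      .map { |part| part["text"] }
      .join("\n")

    { content: text, filename: hit["filename"], score: hit["score"], file_id: hit["file_id"] }
  end
end

raw = {
  "data" => [
    {
      "file_id" => "file-xyz",
      "filename" => "tax_return.pdf",
      "score" => 0.95,
      "content" => [ { "type" => "text", "text" => "Total income: $85,000" } ]
    }
  ]
}

puts parse_search_results(raw).first[:content]
```

Normalizing at the adapter boundary means consumers never see a backend-specific payload shape, which is what lets the pgvector and Qdrant skeletons return the same structure later.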


@@ -0,0 +1,53 @@
require "test_helper"

class VectorStore::RegistryTest < ActiveSupport::TestCase
  test "adapter_name defaults to openai when access token present" do
    VectorStore::Registry.stubs(:openai_access_token).returns("sk-test")
    ClimateControl.modify(VECTOR_STORE_PROVIDER: nil) do
      assert_equal :openai, VectorStore::Registry.adapter_name
    end
  end

  test "adapter_name returns nil when no credentials configured" do
    VectorStore::Registry.stubs(:openai_access_token).returns(nil)
    ClimateControl.modify(VECTOR_STORE_PROVIDER: nil) do
      assert_nil VectorStore::Registry.adapter_name
    end
  end

  test "adapter_name respects explicit VECTOR_STORE_PROVIDER" do
    ClimateControl.modify(VECTOR_STORE_PROVIDER: "qdrant") do
      assert_equal :qdrant, VectorStore::Registry.adapter_name
    end
  end

  test "adapter_name falls back to openai for unknown provider" do
    VectorStore::Registry.stubs(:openai_access_token).returns("sk-test")
    ClimateControl.modify(VECTOR_STORE_PROVIDER: "unknown_store") do
      assert_equal :openai, VectorStore::Registry.adapter_name
    end
  end

  test "adapter returns VectorStore::Openai instance when openai configured" do
    VectorStore::Registry.stubs(:openai_access_token).returns("sk-test")
    ClimateControl.modify(VECTOR_STORE_PROVIDER: nil) do
      adapter = VectorStore::Registry.adapter
      assert_instance_of VectorStore::Openai, adapter
    end
  end

  test "adapter returns nil when nothing configured" do
    VectorStore::Registry.stubs(:openai_access_token).returns(nil)
    ClimateControl.modify(VECTOR_STORE_PROVIDER: nil) do
      assert_nil VectorStore::Registry.adapter
    end
  end

  test "configured? delegates to adapter presence" do
    VectorStore::Registry.stubs(:adapter).returns(nil)
    assert_not VectorStore.configured?

    VectorStore::Registry.stubs(:adapter).returns(VectorStore::Openai.new(access_token: "sk-test"))
    assert VectorStore.configured?
  end
end
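
The resolution rules these tests describe can be approximated with the sketch below. It is not the committed `VectorStore::Registry`: the `RegistrySketch` module, the injected `env` and `openai_access_token` arguments, and the whitespace and case normalization are assumptions made so the example runs without Rails. Returning `nil` for an unknown provider with no OpenAI token is also a guess, because the tests do not cover that case.

```ruby
# Hypothetical sketch of adapter-name resolution, inferred from the tests.
module RegistrySketch
  KNOWN_PROVIDERS = %w[openai pgvector qdrant].freeze

  # env is any Hash-like object; ENV works as the default.
  def self.adapter_name(env: ENV, openai_access_token: nil)
    explicit = env["VECTOR_STORE_PROVIDER"].to_s.strip.downcase

    # An explicit, recognized provider wins.
    return explicit.to_sym if KNOWN_PROVIDERS.include?(explicit)

    # Otherwise fall back to OpenAI only when credentials are available.
    openai_access_token.to_s.empty? ? nil : :openai
  end
end

p RegistrySketch.adapter_name(env: { "VECTOR_STORE_PROVIDER" => "qdrant" })
p RegistrySketch.adapter_name(env: {}, openai_access_token: "sk-test")
p RegistrySketch.adapter_name(env: {}, openai_access_token: nil)
```

Passing the environment in as an argument is a testing convenience in this sketch; the tests above achieve the same isolation with `ClimateControl.modify` and by stubbing `openai_access_token`.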