From 834686cffd3e35120c151b7f062c2ad3f9343dba Mon Sep 17 00:00:00 2001 From: plind <59729252+plind-junior@users.noreply.github.com> Date: Wed, 13 May 2026 09:17:10 -0700 Subject: [PATCH] fix(simplefin): treat Vanguard/Fidelity cost_basis as total when needed (#1772) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * fix(simplefin): treat Vanguard/Fidelity cost_basis as total when needed PR #1692 normalized SimpleFIN holdings cost_basis under the assumption that the `cost_basis` / `basis` keys carry a per-share value (per the SimpleFIN spec) and only `total_cost` / `value` carry a total position cost. Vanguard and Fidelity violate the spec — they populate `cost_basis` with the *total* (see the payload in #1182). After PR #1692 those holdings get stored with cost_basis = total, and Holding#calculate_trend then computes previous = qty × avg_cost, so the "previous" value is inflated by a factor of qty and an entire investment account renders a phantom return of roughly -(1 − 1/qty), i.e. -97% to -99%. Fix: sanity-check raw cost_basis against the holding's market share price. Let share_price = market_value / qty; the geometric midpoint between "raw is per-share" (raw ≈ share_price) and "raw is total" (raw ≈ qty × share_price) is share_price × √qty. If raw is above the midpoint it is divided by qty; otherwise it is kept as per-share. Falls back to the pre-fix behaviour (trust the spec) when market_value or qty is unavailable, so confidently-correct readings are never made worse. Verified against the reported Vanguard payload (qty=139, cost_basis= 22004.40, market_value=22626.42): normalize_cost_basis now returns $158.31/share, matching 22004.40 / 139, and the phantom -99% return collapses to a realistic ~+2.8%. Per-share readings ($45 cost on a $50 share price) remain untouched. Closes #1718. Refs #1182, #1692. * fixup: replace cost_basis heuristic with institution allowlist Codex and @EdeAbreu23 flagged a real false-positive in the previous geometric-midpoint heuristic: a legitimate per-share `cost_basis` on a holding with a large unrealized loss (e.g. 100 shares with $100/share basis now worth $5/share) trips `share_price × √qty` and gets divided to $1/share — corrupting any standards-compliant brokerage with a big loss. Adopt @EdeAbreu23's safer shape: - total_cost / value: always divide by qty (unchanged from #1692). - cost_basis / basis: keep as-is by default. - Only divide cost_basis / basis when the holding's SimpleFIN account is connected to a known-misbehaving institution. Allowlist starts with `vanguard` and `fidelity`, matched case-insensitively against the account's stored org name and domain. Easy to extend as more brokerages turn up. Trades a small maintenance cost (curated list) for zero risk of corrupting compliant providers. Verified against five scenarios (all expected): Vanguard total in cost_basis (allowlist) → +2.83% Fidelity total in basis (allowlist) → +33.33% Big-loss per-share (Codex case) → -95.0% (preserved) Honest per-share, small loss → +11.11% (unchanged) total_cost on any institution → +11.11% (unchanged) --------- Co-authored-by: plind-junior --- .../investments/holdings_processor.rb | 51 ++++++++++++--- .../investments/holdings_processor_test.rb | 63 +++++++++++++++++++ 2 files changed, 104 insertions(+), 10 deletions(-) diff --git a/app/models/simplefin_account/investments/holdings_processor.rb b/app/models/simplefin_account/investments/holdings_processor.rb index 8cb4882ef..29bf62155 100644 --- a/app/models/simplefin_account/investments/holdings_processor.rb +++ b/app/models/simplefin_account/investments/holdings_processor.rb @@ -48,7 +48,7 @@ class SimplefinAccount::Investments::HoldingsProcessor qty = parse_decimal(any_of(simplefin_holding, %w[shares quantity qty units])) market_value = parse_decimal(any_of(simplefin_holding, %w[market_value current_value])) raw_cost_basis, cost_basis_source_key = cost_basis_from(simplefin_holding) - cost_basis = normalize_cost_basis(raw_cost_basis, qty, cost_basis_source_key) + cost_basis = normalize_cost_basis(raw_cost_basis, qty, cost_basis_source_key, institution_reports_total_basis?) # Derive price from market_value when possible; otherwise fall back to any price field fallback_price = parse_decimal(any_of(simplefin_holding, %w[purchase_price price unit_price average_cost avg_cost])) @@ -124,19 +124,50 @@ class SimplefinAccount::Investments::HoldingsProcessor [ nil, nil ] end - # Sure stores holding cost_basis as per-share average cost. Some SimpleFIN - # providers expose total position basis via total_cost/value, so normalize only - # when the selected provider field is known to represent total position basis. - def normalize_cost_basis(raw_cost_basis, qty, source_key) + # Sure stores holding cost_basis as per-share average cost. SimpleFIN + # brokerages are inconsistent about which field carries which shape: + # + # - total_cost / value: always a total position cost per the SimpleFIN + # spec and observed payloads; divide by qty unconditionally. + # - cost_basis / basis: the spec calls this per-share, and most + # brokerages comply. Keep these values unchanged by default. + # + # Exception: a small allowlist of brokerages (Vanguard, Fidelity) is + # known to populate cost_basis with the total position cost in violation + # of the spec (#1718, #1182). For those connections only, divide by qty. + # + # An earlier revision of this fix used a magnitude heuristic + # (share_price × √qty midpoint). It was withdrawn because a legitimate + # per-share basis on a holding with a large unrealized loss + # (e.g. 100 shares with basis $100 now worth $5) trips the midpoint and + # gets mis-divided to $1/share — corrupting compliant providers. The + # allowlist trades some manual maintenance for that safety. + def normalize_cost_basis(raw_cost_basis, qty, source_key, total_basis_institution = false) return nil if raw_cost_basis.nil? - if %w[total_cost value].include?(source_key) + if %w[total_cost value].include?(source_key) || + (total_basis_institution && %w[cost_basis basis].include?(source_key)) return nil unless qty.to_d.positive? - - raw_cost_basis / qty - else - raw_cost_basis + return raw_cost_basis / qty end + + raw_cost_basis + end + + # Institutions known to populate the SimpleFIN `cost_basis` / `basis` + # field with the total position cost rather than the per-share value the + # spec requires. Matched as case-insensitive substrings against the + # account's stored org name and domain. + TOTAL_BASIS_INSTITUTIONS = %w[vanguard fidelity].freeze + + def institution_reports_total_basis? + org = simplefin_account.respond_to?(:org_data) ? simplefin_account.org_data : nil + return false if org.blank? + + candidates = [ org["name"], org[:name], org["domain"], org[:domain] ].compact.map(&:to_s).map(&:downcase) + return false if candidates.empty? + + TOTAL_BASIS_INSTITUTIONS.any? { |needle| candidates.any? { |c| c.include?(needle) } } end def resolve_security(symbol, description) diff --git a/test/models/simplefin_account/investments/holdings_processor_test.rb b/test/models/simplefin_account/investments/holdings_processor_test.rb index 122d908c0..a3842ac6a 100644 --- a/test/models/simplefin_account/investments/holdings_processor_test.rb +++ b/test/models/simplefin_account/investments/holdings_processor_test.rb @@ -80,6 +80,69 @@ class SimplefinAccount::Investments::HoldingsProcessorTest < ActiveSupport::Test assert_equal "total_cost", source_key end + test "cost_basis from a known total-basis institution is divided by qty" do + # Issue #1718 / #1182: Vanguard populates cost_basis with the total + # position cost. When the institution is on the allowlist we divide. + cost_basis = @processor.send( + :normalize_cost_basis, + BigDecimal("22004.40"), + BigDecimal("139.00"), + "cost_basis", + true # institution_reports_total_basis? + ) + + assert_in_delta 158.30, cost_basis.to_f, 0.01 + end + + test "basis from a known total-basis institution is divided by qty" do + cost_basis = @processor.send( + :normalize_cost_basis, + BigDecimal("9000.00"), + BigDecimal("200"), + "basis", + true + ) + + assert_equal BigDecimal("45.00"), cost_basis + end + + test "cost_basis from a compliant institution is kept untouched (no false divide)" do + # Codex regression: a legitimate per-share basis on a holding with a + # large unrealized loss (e.g. $100/share basis now worth $5/share) must + # NOT be divided by qty. Per the SimpleFIN spec, cost_basis is per-share + # — only the institution allowlist should override that. + cost_basis = @processor.send( + :normalize_cost_basis, + BigDecimal("100.00"), + BigDecimal("100"), + "cost_basis", + false + ) + + assert_equal BigDecimal("100.00"), cost_basis + end + + test "institution_reports_total_basis? matches Vanguard and Fidelity org metadata" do + cases = { + { "name" => "Vanguard" } => true, + { "name" => "VANGUARD BROKERAGE" } => true, + { "name" => "Fidelity Investments" } => true, + { "domain" => "vanguard.com" } => true, + { "domain" => "401k.fidelity.com" } => true, + { "name" => "Charles Schwab", "domain" => "schwab.com" } => false, + { "name" => "Chase" } => false, + {} => false + } + + cases.each do |org, expected| + account = Struct.new(:org_data).new(org) + processor = SimplefinAccount::Investments::HoldingsProcessor.new(account) + assert_equal expected, + processor.send(:institution_reports_total_basis?), + "org_data #{org.inspect} expected #{expected}" + end + end + test "missing cost basis fields return nil" do payload = { "market_value" => "10108.16"