10 Commits

Author SHA1 Message Date
wps260
c294cbf54b Performance improvements in holding calculation pipeline (#1579)
* Performance improvements in holding calculation pipeline

Investment accounts with large histories were pegging CPU at 100% during
sync. Root cause was a cluster of quadratic and superlinear algorithms in
the inner holding calculation loop. Most are replaced with O(1) hash lookups
built from single-pass indexes over the already-loaded data; cost-basis
lookups use a binary search over sparse snapshots instead.

Holding::PortfolioCache - load_prices:

  Three O(SxN) patterns inside the per-security loop:

  1. DB prices: `security.prices.where(...)` fired one SQL query per
     security (N+1). Replaced with a single bulk query before the loop:

       Security::Price.where(security_id: ..., date: ...).group_by(&:security_id)

     70 securities -> 70 queries becomes 1.

  2. Trade prices: `trades.select { |t| t.entryable.security_id == id }`
     scanned the full trades array for every security - O(SxT). Replaced
     with trades_by_security_id, pre-indexed once from the loaded array.

  3. Holding prices: `holdings.select { |h| h.security_id == id }` - same
     O(SxH) pattern. Replaced with holdings_by_security_id.

  Prices are now indexed into prices_by_date and prices_by_date_and_source
  hashes during load_prices, making get_price O(1) instead of scanning the
  flat prices array on every lookup.
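
The pre-indexing idea above can be sketched in plain Ruby. This is only an illustration, not the PortfolioCache code: the `Price` and `Trade` structs and the sample data are hypothetical stand-ins for the loaded records, and `group_by` on in-memory arrays stands in for the bulk query result.

```ruby
require "date"

# Hypothetical stand-ins for the loaded records; only the fields used here.
Price = Struct.new(:security_id, :date, :price, :source)
Trade = Struct.new(:security_id, :date, :price)

prices = [
  Price.new(1, Date.new(2024, 1, 2), 10.0, "db"),
  Price.new(2, Date.new(2024, 1, 2), 20.0, "db"),
  Price.new(1, Date.new(2024, 1, 3), 11.0, "db")
]
trades = [
  Trade.new(1, Date.new(2024, 1, 2), 10.5),
  Trade.new(2, Date.new(2024, 1, 3), 19.5)
]

# One pass per collection builds the indexes ...
prices_by_security_id = prices.group_by(&:security_id)
trades_by_security_id = trades.group_by(&:security_id)

# ... so the per-security loop does hash lookups instead of rescanning
# the full arrays once per security.
summary = [1, 2].to_h do |id|
  [id, {
    prices: prices_by_security_id.fetch(id, []).size,
    trades: trades_by_security_id.fetch(id, []).size
  }]
end
```

Total work becomes one pass over each array plus one lookup per security, rather than one full scan per security.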

Holding::PortfolioCache - get_trades / get_price:

  - get_trades(date:): `trades.select { |t| t.date == date }` (O(T) scan)
    replaced with trades_by_date hash (O(1)).

  - get_price: two `prices.select { |p| p.date == date ... }.min_by` linear
    scans replaced with direct hash lookups into prices_by_date and
    prices_by_date_and_source.
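
A rough sketch of how the two price indexes might be built and queried. The `PriceEntry` struct, the numeric `priority` field (lower wins, mirroring a `min_by`), and this `get_price` signature are assumptions made for illustration; the real method may differ.

```ruby
require "date"

# Hypothetical record; `priority` is an assumed tie-breaker (lower wins).
PriceEntry = Struct.new(:date, :price, :source, :priority)

entries = [
  PriceEntry.new(Date.new(2024, 1, 2), 10.0, "holding", 3),
  PriceEntry.new(Date.new(2024, 1, 2), 10.2, "db", 1),
  PriceEntry.new(Date.new(2024, 1, 3), 11.0, "trade", 2)
]

prices_by_date = {}
prices_by_date_and_source = {}

# Single pass: keep only the best-priority entry per key, so a lookup
# never has to scan or re-run min_by.
entries.each do |entry|
  best = prices_by_date[entry.date]
  prices_by_date[entry.date] = entry if best.nil? || entry.priority < best.priority

  key = [entry.date, entry.source]
  best = prices_by_date_and_source[key]
  prices_by_date_and_source[key] = entry if best.nil? || entry.priority < best.priority
end

# O(1) lookup; returns nil when no price is known for the key.
def get_price(by_date, by_date_and_source, date, source: nil)
  source ? by_date_and_source[[date, source]] : by_date[date]
end
```

Resolving the priority at index time is what lets each lookup stay a single hash access.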

Holding::PortfolioCache - collect_unique_securities:

  `holdings.map(&:security)` traversed the security association on every
  holding record (N+1 if not preloaded). Replaced with a pluck of
  security_ids followed by a single Security.where(id: ...) batch load.

Holding::ForwardCalculator / ReverseCalculator:

  `holdings += build_holdings(...)` allocated a new array copy on every
  iteration - O(N) copying per day x thousands of days = O(D^2) total work.
  Replaced with holdings.concat(...), which appends in place (amortized
  O(1) per appended element).
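
The difference between the two forms can be seen with plain arrays. A minimal sketch; the symbols are placeholder data, not real holdings.

```ruby
holdings = []
original = holdings

# `+=` is `holdings = holdings + other`: it builds a brand-new array and
# copies every existing element into it before rebinding the variable.
holdings += [:a, :b]
reassigned = !holdings.equal?(original)

holdings = []
original = holdings

# `concat` mutates the receiver, so the existing elements are not copied.
holdings.concat([:a, :b])
same_object = holdings.equal?(original)
```

One consequence of mutating in place is that any other reference to the same array also sees the appended elements.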

Holding::ReverseCalculator - precompute_cost_basis:

  Old: walked every date from account.start_date to Date.current (O(D)),
  writing a cost_basis entry for every security on every date. For an
  account with 2 trades over 9,250 days this wrote ~18,500 hash entries
  and consumed the full date range in the outer loop regardless of trade
  density.

  New: walks only buy trades (O(T)), appending one [date, avg_cost]
  snapshot per trade. cost_basis_for binary-searches the sparse snapshot
  array - O(log T) per lookup. Memory drops from O(DxS) to O(T).
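
The sparse-snapshot approach can be sketched as follows. This assumes a simple running weighted-average cost and a hypothetical `BuyTrade` struct; the method names echo the ones above, but the bodies are illustrative rather than the actual implementation.

```ruby
require "date"

# Hypothetical buy trade; only the fields needed for an average cost.
BuyTrade = Struct.new(:date, :qty, :price)

# One [date, avg_cost] snapshot per buy trade, in date order: O(T).
def precompute_cost_basis(buys)
  total_qty = 0.0
  total_cost = 0.0
  buys.sort_by(&:date).map do |t|
    total_qty += t.qty
    total_cost += t.qty * t.price
    [t.date, total_cost / total_qty]
  end
end

# Latest snapshot on or before `date`, found by binary search: O(log T).
def cost_basis_for(snapshots, date)
  # Find-minimum mode: index of the first snapshot dated after `date`.
  idx = snapshots.bsearch_index { |(d, _)| d > date }
  idx = snapshots.size if idx.nil?
  idx.zero? ? nil : snapshots[idx - 1][1]
end

buys = [
  BuyTrade.new(Date.new(2020, 1, 10), 10, 100.0),
  BuyTrade.new(Date.new(2021, 6, 1), 10, 200.0)
]
snapshots = precompute_cost_basis(buys)
```

Dates between trades resolve to the most recent earlier snapshot, so no per-day entries need to be stored.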

Holding::Gapfillable:

  `security_holdings.find { |h| h.date == date }` was called on every
  date in the gapfill range - O(H) per date, O(HxD) total. Replaced with
  security_holdings.index_by(&:date) built once before the loop, making
  each date lookup O(1).
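
A plain-Ruby sketch of the indexed gapfill loop. `HoldingRow` is a hypothetical struct, `to_h` stands in for ActiveSupport's `index_by` so the snippet runs without Rails, and the carry-forward rule is an assumed simplification of what Gapfillable does.

```ruby
require "date"

# Hypothetical holding row; the real model has more fields.
HoldingRow = Struct.new(:date, :qty)

security_holdings = [
  HoldingRow.new(Date.new(2024, 1, 1), 5),
  HoldingRow.new(Date.new(2024, 1, 4), 7)
]

# Built once before the loop; plain-Ruby stand-in for index_by.
holdings_by_date = security_holdings.to_h { |h| [h.date, h] }

filled = []
previous = nil
(Date.new(2024, 1, 1)..Date.new(2024, 1, 5)).each do |date|
  # O(1) lookup per date; fall back to carrying the last known quantity.
  current = holdings_by_date[date] || (previous && HoldingRow.new(date, previous.qty))
  next unless current

  filled << current
  previous = current
end
```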

Holding::Materializer - purge_stale_holdings:

  `account.entries.trades.map { |entry| entry.entryable.security_id }.uniq`
  loaded all trade entry records into Ruby then traversed the entryable
  association on each (N+1). Replaced with account.trades.pluck(:security_id).uniq
  (single SQL query returning only the IDs).

In testing, these changes reduced sync time for an account with 25 years
of history and 70 securities from about 90 minutes to under 3 minutes.

* Lint fix

* Lint fix

* addressing the open review nits I agreed with:

* return dup'd arrays from PortfolioCache#get_trades so callers can't mutate memoized cache state
* use the precomputed security-id indexes in collect_unique_securities
* keep security-id dedupe in SQL via distinct.pluck(:security_id)
* tighten the DB price preload to select only needed columns
* harden cost-basis assertions with assert_in_delta

* Back out unnecessary AI slop

* Add back dup to trades array returned from memoized hash

trades_by_date[date] returns a live reference into the memoized hash.
Any caller that mutates the result would silently corrupt the cache for
subsequent calls on the same date within the same sync run. Add .dup to
return a shallow copy, matching the safety of the original select path.
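
The aliasing hazard described above is easy to reproduce with a plain hash. A minimal sketch; the string keys, symbols, and the `get_trades` lambda are illustrative stand-ins, not the real cache.

```ruby
trades_by_date = { "2024-01-02" => [:buy_aapl, :sell_msft] }

# Without dup: the caller receives the cached array itself.
live = trades_by_date["2024-01-02"]
live << :oops
leaked = trades_by_date["2024-01-02"].include?(:oops)

# With dup: the caller receives a shallow copy, so appends stay local.
trades_by_date = { "2024-01-02" => [:buy_aapl, :sell_msft] }
get_trades = ->(date) { (trades_by_date[date] || []).dup }

copy = get_trades.call("2024-01-02")
copy << :oops
cache_intact = !trades_by_date["2024-01-02"].include?(:oops)
```

Because the copy is shallow, the elements themselves are still shared; only adding, removing, or reordering entries in the returned array is isolated from the cache.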
2026-05-05 01:24:33 +02:00
Anas Limouri
a90f9b7317 Add CoinStats exchange portfolio sync and normalize linked investment charts (#1308)
* [FEATURE] Add CoinStats exchange portfolios and normalize linked investment charts

* [BUGFIX] Fix CoinStats PR regressions

* [BUGFIX] Fix CoinStats PR review findings

* [BUGFIX] Address follow-up CoinStats PR feedback

* [REFACTO] Extract CoinStats exchange account helpers

* [BUGFIX] Batch linked CoinStats chart normalization

* [BUGFIX] Fix CoinStats processor lint

---------

Signed-off-by: Juan José Mata <juanjo.mata@gmail.com>
Co-authored-by: Juan José Mata <juanjo.mata@gmail.com>
2026-04-01 20:25:06 +02:00
Ang Wei Feng (Ted)
b88734fb5e fix: allow refreshes from the same source for cost basis updates (#917)
* fix: allow refreshes from the same source for cost basis updates

* test: update cost basis priority expectations
2026-02-06 18:30:50 +01:00
LPW
bbaf7a06cc Add cost basis source tracking with manual override and lock protection (#623)
* Add cost basis tracking and management to holdings

- Added migration to introduce `cost_basis_source` and `cost_basis_locked` fields to `holdings`.
- Implemented backfill for existing holdings to set `cost_basis_source` based on heuristics.
- Introduced `Holding::CostBasisReconciler` to manage cost basis resolution logic.
- Added user interface components for editing and locking cost basis in holdings.
- Updated `materializer` to integrate reconciliation logic and respect locked holdings.
- Extended tests for cost basis-related workflows to ensure accuracy and reliability.

* Fix cost basis calculation in holdings controller

- Ensure `cost_basis` is converted to decimal for accurate arithmetic.
- Fix conditional check to properly validate positive `cost_basis`.

* Improve cost basis validation and error handling in holdings controller

- Allow zero as a valid cost basis for gifted/inherited shares.
- Add error handling with user feedback for invalid cost basis values.

---------

Co-authored-by: Josh Waldrep <joshua.waldrep5+github@gmail.com>
2026-01-12 14:05:46 +01:00
LPW
fa78e1d292 Improve handling of cost_basis during holding materialization and display (#619)
- Refactored `persist_holdings` to separate and conditionally upsert holdings with and without cost_basis.
- Updated `avg_cost` logic to treat 0 cost_basis as unknown and return nil when cost_basis cannot be determined.
- Modified trend and investment calculation to exclude holdings with unknown cost_basis.
- Adjusted `average_cost` formatting to handle nil values in API responses and views.
- Added comprehensive tests to ensure cost_basis preservation and fallback behavior.
- Localized `unknown` label for display when cost_basis is unavailable.

Co-authored-by: Josh Waldrep <joshua.waldrep5+github@gmail.com>
2026-01-11 23:58:51 +01:00
Zach Gollwitzer
8db95623cf Handle holding quantity generation for reverse syncs correctly when not all holdings are generated for current day (#2417)
* Handle reverse calculator starting portfolio generation correctly

* Fix current_holdings to handle different dates and hide zero quantities

- Use DISTINCT ON to get most recent holding per security instead of assuming same date
- Filter out zero quantity holdings from UI display
- Maintain cash display regardless of zero balance
- Use single efficient query with proper Rails syntax

* Continue to process holdings even if one is not resolvable

* Lint fixes
2025-06-26 16:57:17 -04:00
Zach Gollwitzer
10f255a9a9 Clarify backend data pipeline naming concepts (importers, processors, materializers, calculators, and syncers) (#2255)
* Rename MarketDataSyncer to MarketDataImporter

* Materializers

* Importers

* More reference replacements
2025-05-17 16:37:16 -04:00
Zach Gollwitzer
10dd9e061a Improve account sync performance, handle concurrent market data syncing (#2236)
* PlaidConnectable concern

* Remove bad abstraction

* Put sync implementations in own concerns

* Sync strategies

* Move sync orchestration to Sync class

* Clean up sync class, add state machine

* Basic market data sync cron

* Fix price sync

* Improve sync window column names, add timestamps

* 30 day syncs by default

* Clean up market data methods

* Report high duplicate sync counts to Sentry

* Add sync states throughout app

* account tab session

* Persistent account tab selections

* Remove manual sleep

* Add migration to clear stale syncs on self hosted apps

* Tweak sync states

* Sync completion event broadcasts

* Fix timezones in tests

* Cleanup

* More cleanup

* Plaid item UI broadcasts for sync

* Fix account ID namespace conflict

* Sync broadcasters

* Smoother account sync refreshes

* Remove test sync delay
2025-05-15 10:19:56 -04:00
Zach Gollwitzer
2000f05453 Match Plaid holding values on current day (#2212)
* Match Plaid holding values on current day

* Fix chart timezone issue

* Add timezone tests for syncs

* Hide sidebars on trades test
2025-05-06 09:25:49 -04:00
Zach Gollwitzer
e657c40d19 Account:: namespace simplifications and cleanup (#2110)
* Flatten Holding model

* Flatten balance model

* Entries domain renames

* Fix valuations reference

* Fix trades stream

* Fix brakeman warnings

* Fix tests

* Replace existing entryable type references in DB
2025-04-14 11:40:34 -04:00