Fix/issue 954 enable banking duplicate transactions (#988)

* fix: deduplicate Enable Banking API transactions with different entry_reference IDs (#954)

Enable Banking API sometimes returns the same logical transaction multiple
times with different entry_reference values in a single response. This causes
duplicate entries because the existing ID-based deduplication treats them as
distinct transactions.

Add content-based deduplication that compares (date, amount, currency,
creditor, debtor, remittance_information, status) to detect and remove these
API-level duplicates before storing them. The first occurrence is kept.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
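
The content-based pass described in this first commit can be sketched in plain Ruby as follows. This is a minimal illustration under stated assumptions, not the importer's code: `content_key` and `dedupe_by_content` are hypothetical names, and the field list simply follows the tuple named in the message above.

```ruby
# Hypothetical helper: builds a key from the fields listed in the commit message.
def content_key(tx)
  [
    tx[:booking_date],
    tx.dig(:transaction_amount, :amount),
    tx.dig(:transaction_amount, :currency),
    tx.dig(:creditor, :name),
    tx.dig(:debtor, :name),
    Array(tx[:remittance_information]).join("|"),
    tx[:status]
  ].map(&:to_s).join("\x1F")
end

# Hypothetical helper: Array#uniq with a block keeps the first element
# seen for each distinct key, which matches "the first occurrence is kept".
def dedupe_by_content(transactions)
  transactions.uniq { |tx| content_key(tx) }
end

sample = [
  { entry_reference: "ref_aaa", booking_date: "2026-02-07",
    transaction_amount: { amount: "11.65", currency: "EUR" },
    creditor: { name: "Spar" }, status: "BOOK" },
  { entry_reference: "ref_bbb", booking_date: "2026-02-07",
    transaction_amount: { amount: "11.65", currency: "EUR" },
    creditor: { name: "Spar" }, status: "BOOK" }
]

puts dedupe_by_content(sample).map { |tx| tx[:entry_reference] }.inspect
```

Note that `entry_reference` is deliberately left out of the key, since it is the field that differs between the API-level duplicates.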

* test: add Enable Banking processor and importer deduplication tests (#954)

Add tests for:
- EnableBankingEntry::Processor: verifies entry_reference fallback for
  external_id, idempotent re-processing, string key handling
- EnableBankingItem::Importer: verifies content-based deduplication removes
  API-level duplicates while preserving legitimate distinct transactions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: handle nil values in remittance_information array for dedup key (#954)

Call compact and map(&:to_s) before sort.join when remittance_information
is an array, preventing ArgumentError when it contains nil elements.
Also document the known limitation of content-based deduplication
collapsing genuinely distinct identical transactions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
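
To illustrate why the `compact` step matters: sorting an array that mixes nil and strings raises `ArgumentError` in Ruby, because nil and String cannot be compared. The sketch below uses a hypothetical helper name, `remittance_key`, and applies the same normalisation steps the message above names.

```ruby
# Hypothetical helper mirroring the normalisation described above:
# drop nils, stringify, sort, then join with a separator.
def remittance_key(remittance)
  return remittance.to_s unless remittance.is_a?(Array)

  remittance.compact.map(&:to_s).sort.join("|")
end

begin
  [ nil, "Payment ref 123", nil ].sort
rescue ArgumentError => e
  puts "sort without compact fails: #{e.class}"
end

# Arrays that differ only in nil positions normalise to the same key.
puts remittance_key([ nil, "Payment ref 123", nil ]) == remittance_key([ "Payment ref 123", nil ])
```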

* test: add coverage for nil values in remittance_information array (#954)

Verify that deduplication handles remittance_information arrays containing
nil elements without raising ArgumentError, and correctly treats arrays
with different nil positions but same non-nil content as duplicates.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: prefer transaction_id over content-based dedup to preserve legit duplicates (#954)

When transaction_id is present, use it as the dedup key instead of falling
back to content-based deduplication. This preserves legitimately distinct
transactions with identical content (e.g. two laundromat payments of the
same amount on the same day) while still deduplicating the Enable Banking
duplicate entry_reference issue when transaction_id is nil.

Addresses review feedback from @jjmata about legitimate duplicate
transactions being incorrectly collapsed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use composite key for dedup instead of transaction_id alone (#954)

Per the Enable Banking API docs, transaction_id is not guaranteed to be
unique. Include it as one component of the composite content key rather
than using it as the sole dedup criterion. This preserves transactions
with non-unique transaction_ids but different content, while still
deduplicating true API-level duplicates.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
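
A small sketch of the composite-key idea, assuming a flattened transaction hash for brevity (the real payload nests amount and creditor). `composite_key` is a hypothetical name; the point is only that `transaction_id` is one component among several rather than the whole key.

```ruby
# Hypothetical, simplified key: transaction_id plus a few content fields.
def composite_key(tx)
  [ tx[:transaction_id], tx[:booking_date], tx[:amount], tx[:creditor_name] ]
    .map(&:to_s)
    .join("\x1F")
end

a = { transaction_id: "shared_tid", booking_date: "2026-02-09", amount: "25.00", creditor_name: "Amazon" }
b = a.merge(amount: "42.00")            # same id, different content
c = a.merge(transaction_id: "other_id") # same content, different id
d = a.dup                               # same id, same content

puts composite_key(a) == composite_key(b) # false: both kept
puts composite_key(a) == composite_key(c) # false: both kept
puts composite_key(a) == composite_key(d) # true: treated as a duplicate
```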

* test: add value_date fallback coverage for dedup key (#954)

build_transaction_content_key falls back to value_date when booking_date
is absent. This test exercises that path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
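
The fallback can be pictured with the plain-Ruby sketch below. The importer itself relies on ActiveSupport's `presence`; `blank_to_nil` and `dedup_date` here are hypothetical stand-ins so the example runs without Rails.

```ruby
# Hypothetical stand-in for ActiveSupport's `presence`:
# returns nil for nil or whitespace-only values, otherwise the value itself.
def blank_to_nil(value)
  value.nil? || value.to_s.strip.empty? ? nil : value
end

# Hypothetical helper: prefer booking_date, fall back to value_date.
def dedup_date(tx)
  blank_to_nil(tx[:booking_date]) || tx[:value_date]
end

puts dedup_date({ booking_date: "2026-02-09", value_date: "2026-02-10" }) # 2026-02-09
puts dedup_date({ value_date: "2026-02-10" })                             # 2026-02-10
puts dedup_date({ booking_date: "", value_date: "2026-02-10" })           # 2026-02-10
```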

* docs: document known limitation of content-based dedup (#954)

When transaction_id is nil for both transactions, pure content comparison
applies, which could theoretically collapse two genuinely distinct
transactions with identical fields. Document this trade-off inline for
future maintainers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add credit_debit_indicator to dedup composite key (#954)

transaction_amount.amount is always positive in the Enable Banking API,
with direction encoded separately in credit_debit_indicator (CRDT/DBIT).
Without it in the composite key, a payment and a same-day refund of the
same amount to the same merchant would produce identical keys, silently
dropping one transaction.

- Add credit_debit_indicator to build_transaction_content_key
- Add test for payment + same-day refund scenario
- Update docstring to document the rationale

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
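
The payment-plus-refund collision can be reproduced with a minimal sketch. Both helpers are hypothetical and use a flattened transaction hash; they only contrast a key without the direction against one that includes `credit_debit_indicator`.

```ruby
# Hypothetical key that ignores direction (the pre-fix behaviour, simplified).
def key_without_direction(tx)
  [ tx[:booking_date], tx[:amount], tx[:currency], tx[:creditor_name] ].join("\x1F")
end

# Hypothetical key that appends the CRDT/DBIT indicator.
def key_with_direction(tx)
  [ key_without_direction(tx), tx[:credit_debit_indicator] ].join("\x1F")
end

payment = { booking_date: "2026-02-09", amount: "25.00", currency: "EUR",
            creditor_name: "Amazon", credit_debit_indicator: "DBIT" }
refund  = payment.merge(credit_debit_indicator: "CRDT")

puts key_without_direction(payment) == key_without_direction(refund) # true: one would be dropped
puts key_with_direction(payment) == key_with_direction(refund)       # false: both kept
```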

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
0xRozier
2026-03-28 16:53:30 +01:00
committed by GitHub
parent e7af0ad99b
commit 005d2fac20
3 changed files with 536 additions and 0 deletions

@@ -230,6 +230,11 @@ class EnableBankingItem::Importer
break if continuation_key.blank?
end
# Deduplicate API response: Enable Banking sometimes returns the same logical
# transaction with different entry_reference IDs in the same response.
# Remove content-level duplicates before storing. (Issue #954)
all_transactions = deduplicate_api_transactions(all_transactions)
transactions_count = all_transactions.count
if all_transactions.any?
@@ -259,6 +264,71 @@ class EnableBankingItem::Importer
{ success: false, transactions_count: 0, error: e.message }
end
  # Deduplicate transactions from the Enable Banking API response.
  # Some banks return the same logical transaction multiple times with different
  # entry_reference IDs. We build a composite content key that includes
  # transaction_id (when present) alongside date, amount, currency, creditor,
  # debtor, remittance_information, status, and credit_debit_indicator. Per the
  # Enable Banking API docs, transaction_id is not guaranteed to be unique, so
  # it cannot be used as the sole dedup criterion. Including it in the composite
  # key preserves legitimately distinct transactions with identical content but
  # different transaction_ids (e.g. two laundromat payments on the same day).
  # (Issue #954)
  def deduplicate_api_transactions(transactions)
    seen = {}
    duplicates_removed = 0
    result = transactions.select do |tx|
      tx = tx.with_indifferent_access
      key = build_transaction_content_key(tx)
      if seen[key]
        duplicates_removed += 1
        false
      else
        seen[key] = true
        true
      end
    end
    if duplicates_removed > 0
      Rails.logger.info(
        "EnableBankingItem::Importer - Removed #{duplicates_removed} content-level " \
        "duplicate(s) from API response (#{transactions.count} → #{result.count} transactions)"
      )
    end
    result
  end
  # Build a composite key for deduplication. Two transactions with different
  # entry_reference values but identical content fields (including
  # transaction_id and credit_debit_indicator) are considered duplicates.
  # transaction_id is included as one component, not a standalone key,
  # because the Enable Banking API docs state it is not guaranteed to be
  # unique. credit_debit_indicator (CRDT/DBIT) is included because
  # transaction_amount.amount is always positive; without it, a payment
  # and a same-day refund of the same amount would produce identical keys.
  # Known limitation: when transaction_id is nil for both, pure content
  # comparison applies. This means two genuinely distinct transactions
  # with identical content (same date, amount, direction, creditor, etc.)
  # and no transaction_id would collapse to one. In practice, banks that
  # omit transaction_id rarely produce such exact duplicates in the same
  # API response; timestamps or remittance info usually differ. (Issue #954)
  def build_transaction_content_key(tx)
    date = tx[:booking_date].presence || tx[:value_date]
    amount = tx.dig(:transaction_amount, :amount).presence || tx[:amount]
    currency = tx.dig(:transaction_amount, :currency).presence || tx[:currency]
    creditor = tx.dig(:creditor, :name).presence || tx[:creditor_name]
    debtor = tx.dig(:debtor, :name).presence || tx[:debtor_name]
    remittance = tx[:remittance_information]
    remittance_key = remittance.is_a?(Array) ? remittance.compact.map(&:to_s).sort.join("|") : remittance.to_s
    status = tx[:status]
    tid = tx[:transaction_id]
    direction = tx[:credit_debit_indicator]
    [ date, amount, currency, creditor, debtor, remittance_key, status, tid, direction ].map(&:to_s).join("\x1F")
  end
def determine_sync_start_date(enable_banking_account)
has_stored_transactions = enable_banking_account.raw_transactions_payload.to_a.any?

@@ -0,0 +1,120 @@
require "test_helper"
class EnableBankingEntry::ProcessorTest < ActiveSupport::TestCase
setup do
@family = families(:dylan_family)
@account = accounts(:depository)
@enable_banking_item = EnableBankingItem.create!(
family: @family,
name: "Test Enable Banking",
country_code: "DE",
application_id: "test_app_id",
client_certificate: "test_cert"
)
@enable_banking_account = EnableBankingAccount.create!(
enable_banking_item: @enable_banking_item,
name: "N26 Hauptkonto",
uid: "eb_uid_1",
currency: "EUR"
)
AccountProvider.create!(
account: @account,
provider: @enable_banking_account
)
end
test "uses entry_reference as external_id when transaction_id is nil" do
tx = {
entry_reference: "31e13269-03fc-11f1-89d2-cd465703551c",
transaction_id: nil,
booking_date: Date.current.to_s,
transaction_amount: { amount: "11.65", currency: "EUR" },
creditor: { name: "Spar Dankt 3418" },
credit_debit_indicator: "DBIT",
status: "BOOK"
}
assert_difference "@account.entries.count", 1 do
EnableBankingEntry::Processor.new(tx, enable_banking_account: @enable_banking_account).process
end
entry = @account.entries.find_by!(
external_id: "enable_banking_31e13269-03fc-11f1-89d2-cd465703551c",
source: "enable_banking"
)
assert_equal 11.65, entry.amount.to_f
assert_equal "EUR", entry.currency
end
test "uses transaction_id as external_id when present" do
tx = {
entry_reference: "ref_123",
transaction_id: "txn_456",
booking_date: Date.current.to_s,
transaction_amount: { amount: "25.00", currency: "EUR" },
creditor: { name: "Amazon" },
credit_debit_indicator: "DBIT",
status: "BOOK"
}
EnableBankingEntry::Processor.new(tx, enable_banking_account: @enable_banking_account).process
entry = @account.entries.find_by!(external_id: "enable_banking_txn_456", source: "enable_banking")
assert_equal 25.0, entry.amount.to_f
end
test "does not create duplicate when same entry_reference is processed twice" do
tx = {
entry_reference: "unique_ref_abc",
transaction_id: nil,
booking_date: Date.current.to_s,
transaction_amount: { amount: "50.00", currency: "EUR" },
creditor: { name: "Rewe" },
credit_debit_indicator: "DBIT",
status: "BOOK"
}
assert_difference "@account.entries.count", 1 do
EnableBankingEntry::Processor.new(tx, enable_banking_account: @enable_banking_account).process
end
assert_no_difference "@account.entries.count" do
EnableBankingEntry::Processor.new(tx, enable_banking_account: @enable_banking_account).process
end
end
test "raises ArgumentError when both transaction_id and entry_reference are nil" do
tx = {
transaction_id: nil,
entry_reference: nil,
booking_date: Date.current.to_s,
transaction_amount: { amount: "10.00", currency: "EUR" },
creditor: { name: "Test" },
credit_debit_indicator: "DBIT",
status: "BOOK"
}
assert_raises(ArgumentError) do
EnableBankingEntry::Processor.new(tx, enable_banking_account: @enable_banking_account).process
end
end
test "handles string keys in transaction data" do
tx = {
"entry_reference" => "string_key_ref",
"transaction_id" => nil,
"booking_date" => Date.current.to_s,
"transaction_amount" => { "amount" => "15.00", "currency" => "EUR" },
"creditor" => { "name" => "Lidl" },
"credit_debit_indicator" => "DBIT",
"status" => "BOOK"
}
assert_difference "@account.entries.count", 1 do
EnableBankingEntry::Processor.new(tx, enable_banking_account: @enable_banking_account).process
end
entry = @account.entries.find_by!(external_id: "enable_banking_string_key_ref", source: "enable_banking")
assert_equal 15.0, entry.amount.to_f
end
end

@@ -0,0 +1,346 @@
require "test_helper"
class EnableBankingItem::ImporterDedupTest < ActiveSupport::TestCase
setup do
@family = families(:dylan_family)
@enable_banking_item = EnableBankingItem.create!(
family: @family,
name: "Test Enable Banking",
country_code: "AT",
application_id: "test_app_id",
client_certificate: "test_cert",
session_id: "test_session",
session_expires_at: 1.day.from_now
)
mock_provider = mock()
@importer = EnableBankingItem::Importer.new(@enable_banking_item, enable_banking_provider: mock_provider)
end
test "removes content-level duplicates with different entry_reference IDs" do
transactions = [
{
entry_reference: "ref_aaa",
transaction_id: nil,
booking_date: "2026-02-07",
transaction_amount: { amount: "11.65", currency: "EUR" },
creditor: { name: "Spar Dankt 3418" },
credit_debit_indicator: "DBIT",
status: "BOOK"
},
{
entry_reference: "ref_bbb",
transaction_id: nil,
booking_date: "2026-02-07",
transaction_amount: { amount: "11.65", currency: "EUR" },
creditor: { name: "Spar Dankt 3418" },
credit_debit_indicator: "DBIT",
status: "BOOK"
}
]
result = @importer.send(:deduplicate_api_transactions, transactions)
assert_equal 1, result.count
assert_equal "ref_aaa", result.first[:entry_reference]
end
test "keeps transactions with different amounts" do
transactions = [
{
entry_reference: "ref_1",
booking_date: "2026-02-07",
transaction_amount: { amount: "11.65", currency: "EUR" },
creditor: { name: "Spar" },
status: "BOOK"
},
{
entry_reference: "ref_2",
booking_date: "2026-02-07",
transaction_amount: { amount: "23.30", currency: "EUR" },
creditor: { name: "Spar" },
status: "BOOK"
}
]
result = @importer.send(:deduplicate_api_transactions, transactions)
assert_equal 2, result.count
end
test "keeps transactions with different dates" do
transactions = [
{
entry_reference: "ref_1",
booking_date: "2026-02-07",
transaction_amount: { amount: "11.65", currency: "EUR" },
creditor: { name: "Spar" },
status: "BOOK"
},
{
entry_reference: "ref_2",
booking_date: "2026-02-08",
transaction_amount: { amount: "11.65", currency: "EUR" },
creditor: { name: "Spar" },
status: "BOOK"
}
]
result = @importer.send(:deduplicate_api_transactions, transactions)
assert_equal 2, result.count
end
test "keeps transactions with different creditors" do
transactions = [
{
entry_reference: "ref_1",
booking_date: "2026-02-07",
transaction_amount: { amount: "11.65", currency: "EUR" },
creditor: { name: "Spar" },
status: "BOOK"
},
{
entry_reference: "ref_2",
booking_date: "2026-02-07",
transaction_amount: { amount: "11.65", currency: "EUR" },
creditor: { name: "Lidl" },
status: "BOOK"
}
]
result = @importer.send(:deduplicate_api_transactions, transactions)
assert_equal 2, result.count
end
test "removes multiple duplicates from same response" do
base = {
booking_date: "2026-02-07",
transaction_amount: { amount: "3.00", currency: "EUR" },
creditor: { name: "Bakery" },
status: "BOOK"
}
transactions = [
base.merge(entry_reference: "ref_1"),
base.merge(entry_reference: "ref_2"),
base.merge(entry_reference: "ref_3")
]
result = @importer.send(:deduplicate_api_transactions, transactions)
assert_equal 1, result.count
assert_equal "ref_1", result.first[:entry_reference]
end
test "handles string keys in transaction data" do
transactions = [
{
"entry_reference" => "ref_aaa",
"booking_date" => "2026-02-07",
"transaction_amount" => { "amount" => "11.65", "currency" => "EUR" },
"creditor" => { "name" => "Spar" },
"status" => "BOOK"
},
{
"entry_reference" => "ref_bbb",
"booking_date" => "2026-02-07",
"transaction_amount" => { "amount" => "11.65", "currency" => "EUR" },
"creditor" => { "name" => "Spar" },
"status" => "BOOK"
}
]
result = @importer.send(:deduplicate_api_transactions, transactions)
assert_equal 1, result.count
end
test "differentiates by remittance_information" do
transactions = [
{
entry_reference: "ref_1",
booking_date: "2026-02-07",
transaction_amount: { amount: "100.00", currency: "EUR" },
creditor: { name: "Landlord" },
remittance_information: [ "Rent January" ],
status: "BOOK"
},
{
entry_reference: "ref_2",
booking_date: "2026-02-07",
transaction_amount: { amount: "100.00", currency: "EUR" },
creditor: { name: "Landlord" },
remittance_information: [ "Rent February" ],
status: "BOOK"
}
]
result = @importer.send(:deduplicate_api_transactions, transactions)
assert_equal 2, result.count
end
test "handles nil values in remittance_information array" do
transactions = [
{
entry_reference: "ref_aaa",
booking_date: "2026-02-07",
transaction_amount: { amount: "11.65", currency: "EUR" },
creditor: { name: "Spar" },
remittance_information: [ nil, "Payment ref 123", nil ],
status: "BOOK"
},
{
entry_reference: "ref_bbb",
booking_date: "2026-02-07",
transaction_amount: { amount: "11.65", currency: "EUR" },
creditor: { name: "Spar" },
remittance_information: [ "Payment ref 123", nil ],
status: "BOOK"
}
]
result = @importer.send(:deduplicate_api_transactions, transactions)
assert_equal 1, result.count
assert_equal "ref_aaa", result.first[:entry_reference]
end
test "preserves distinct transactions with same content but different transaction_ids" do
transactions = [
{
entry_reference: "ref_1",
transaction_id: "txn_001",
booking_date: "2026-02-09",
transaction_amount: { amount: "1.50", currency: "EUR" },
creditor: { name: "Waschsalon" },
status: "BOOK"
},
{
entry_reference: "ref_2",
transaction_id: "txn_002",
booking_date: "2026-02-09",
transaction_amount: { amount: "1.50", currency: "EUR" },
creditor: { name: "Waschsalon" },
status: "BOOK"
}
]
result = @importer.send(:deduplicate_api_transactions, transactions)
assert_equal 2, result.count
end
test "deduplicates same transaction_id even with different entry_references" do
transactions = [
{
entry_reference: "ref_aaa",
transaction_id: "txn_same",
booking_date: "2026-02-09",
transaction_amount: { amount: "25.00", currency: "EUR" },
creditor: { name: "Amazon" },
status: "BOOK"
},
{
entry_reference: "ref_bbb",
transaction_id: "txn_same",
booking_date: "2026-02-09",
transaction_amount: { amount: "25.00", currency: "EUR" },
creditor: { name: "Amazon" },
status: "BOOK"
}
]
result = @importer.send(:deduplicate_api_transactions, transactions)
assert_equal 1, result.count
assert_equal "ref_aaa", result.first[:entry_reference]
end
test "preserves transactions with same non-unique transaction_id but different content" do
# Per Enable Banking API docs, transaction_id is not guaranteed to be unique.
# Two transactions sharing a transaction_id but differing in content must both be kept.
transactions = [
{
entry_reference: "ref_1",
transaction_id: "shared_tid",
booking_date: "2026-02-09",
transaction_amount: { amount: "25.00", currency: "EUR" },
creditor: { name: "Amazon" },
status: "BOOK"
},
{
entry_reference: "ref_2",
transaction_id: "shared_tid",
booking_date: "2026-02-09",
transaction_amount: { amount: "42.00", currency: "EUR" },
creditor: { name: "Amazon" },
status: "BOOK"
}
]
result = @importer.send(:deduplicate_api_transactions, transactions)
assert_equal 2, result.count
end
test "deduplicates using value_date when booking_date is absent" do
transactions = [
{
entry_reference: "ref_1",
transaction_id: nil,
value_date: "2026-02-10",
transaction_amount: { amount: "1.50", currency: "EUR" },
creditor: { name: "Waschsalon" },
status: "BOOK"
},
{
entry_reference: "ref_2",
transaction_id: nil,
value_date: "2026-02-10",
transaction_amount: { amount: "1.50", currency: "EUR" },
creditor: { name: "Waschsalon" },
status: "BOOK"
}
]
result = @importer.send(:deduplicate_api_transactions, transactions)
assert_equal 1, result.count
assert_equal "ref_1", result.first[:entry_reference]
end
test "keeps payment and same-day refund with same amount as separate transactions" do
transactions = [
{
entry_reference: "ref_payment",
transaction_id: nil,
booking_date: "2026-02-09",
transaction_amount: { amount: "25.00", currency: "EUR" },
creditor: { name: "Amazon" },
credit_debit_indicator: "DBIT",
status: "BOOK"
},
{
entry_reference: "ref_refund",
transaction_id: nil,
booking_date: "2026-02-09",
transaction_amount: { amount: "25.00", currency: "EUR" },
creditor: { name: "Amazon" },
credit_debit_indicator: "CRDT",
status: "BOOK"
}
]
result = @importer.send(:deduplicate_api_transactions, transactions)
assert_equal 2, result.count
end
test "returns empty array for empty input" do
result = @importer.send(:deduplicate_api_transactions, [])
assert_equal [], result
end
end