Hybrid Neuro-Symbolic Fraud Detection: Guiding Neural Networks with Domain Rules


Abstract

Fraud detection datasets are extremely imbalanced, with positive rates below 0.2%. Standard neural networks trained with weighted binary cross-entropy often achieve high ROC-AUC but struggle to surface fraudulent transactions under threshold-sensitive metrics. I propose a Hybrid Neuro-Symbolic (HNS) approach that incorporates domain knowledge directly into the training objective as a differentiable rule loss — encouraging the model to assign high fraud probability to transactions with unusually large amounts and atypical PCA signatures. On the Kaggle Credit Card Fraud dataset, the hybrid achieves ROC-AUC of 0.970 ± 0.005 across 5 random seeds, compared with 0.967 ± 0.003 for the pure neural baseline under symmetric evaluation. A key practical finding: on imbalanced data, threshold selection strategy affects F1 as much as model architecture — both models must be evaluated with the identical approach for any comparison to be meaningful. Code and reproducibility materials are on GitHub [5].

The Problem: When ROC-AUC Lies

I had a fraud dataset at 0.17% positive rate. Trained a weighted BCE network, got ROC-AUC of 0.96, someone said “nice”. Then I pulled up the score distributions and threshold-dependent metrics. The model had quietly discovered that predicting “not fraud” on anything ambiguous was the path of least resistance — and nothing in the loss function disagreed with that call.

What bothered me wasn’t the mathematics. It was that the model had no idea what fraud looks like. A junior analyst on day one could tell you: large transactions are suspicious, transactions with unusual PCA signatures are suspicious, and when both occur together, you should definitely pay attention. That knowledge just… never makes it into the training loop.

So I ran an experiment. What if I encoded that analyst intuition as a soft constraint directly in the loss function — something the network has to satisfy while also fitting the labels? The result was a Hybrid Neuro-Symbolic (HNS) setup. This article walks through the full experiment: the model, the rule loss, the lambda sweep, and — critically — what a proper multi-seed variance analysis with symmetric threshold evaluation actually shows.

The Setup

I used the Kaggle Credit Card Fraud dataset [2] — 284,807 transactions, 492 of which are fraud (0.172%). The V1–V28 features are PCA components from an anonymized original feature space. Amount and Time are raw. The severe imbalance is the whole point; this is where standard approaches begin to struggle [1].

Split was 70/15/15 train/val/test, stratified. I trained four models and compared them head-to-head:

  • Isolation Forest — contamination=0.001, fit on the full training set
  • One-Class SVM — nu=0.001, fit only on the non-fraud training samples
  • Pure Neural — three-layer MLP with BCE + class weighting, no domain knowledge
  • Hybrid Neuro-Symbolic — the same MLP, with a differentiable rule penalty added to the loss

Isolation Forest and One-Class SVM serve as a gut-check. If a supervised network with 199k training samples cannot clear the bar set by an unsupervised method, that’s worth knowing before you write up results. A tuned gradient boosting model would likely outperform both neural approaches; this comparison is meant to isolate the effect of the rule loss, not benchmark against all possible methods. Full code for all four is on GitHub [5].
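For reference, a minimal sketch of how the two unsupervised baselines can be set up with scikit-learn. The parameter values match those above; the synthetic data and variable names are purely illustrative, not from the experiment:

```python
# Sketch of the two unsupervised baselines. contamination/nu values follow
# the article; the data here is a random stand-in for the 30 real features.
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(42)
X_train = rng.normal(size=(2000, 30))            # illustrative feature matrix
y_train = (rng.random(2000) < 0.002).astype(int)  # illustrative rare labels

# Isolation Forest: fit on the full training set
iso = IsolationForest(contamination=0.001, random_state=42).fit(X_train)

# One-Class SVM: fit only on the non-fraud samples
ocsvm = OneClassSVM(nu=0.001).fit(X_train[y_train == 0])

# Both return +1 (inlier) / -1 (outlier); map -1 to a fraud flag
iso_flags = (iso.predict(X_train) == -1).astype(int)
svm_flags = (ocsvm.predict(X_train) == -1).astype(int)
```

The asymmetric fitting choice matters: Isolation Forest tolerates contaminated training data by design, while One-Class SVM models the boundary of a single “normal” class, so it sees only non-fraud rows.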

The Model

Nothing exotic. A 3-layer MLP with batch normalization after each hidden layer. The batch norm matters more than you might expect — under heavy class imbalance, activations can drift badly without it [3].

import torch
import torch.nn as nn

class MLP(nn.Module):
    def __init__(self, input_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, 128),
            nn.ReLU(),
            nn.BatchNorm1d(128),
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.BatchNorm1d(64),
            nn.Linear(64, 1)
        )

    def forward(self, x):
        return self.net(x)

For the loss, BCEWithLogitsLoss with pos_weight [4] — computed as the ratio of non-fraud to fraud counts in the training set. On this dataset that’s roughly 577. A single fraud sample in a batch therefore generates about 577 times the gradient of a non-fraud one.
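The arithmetic is quick to sketch. The counts below are approximate stand-ins for the 70% stratified split (in the real run they come from `y_train`):

```python
# pos_weight = ratio of non-fraud to fraud counts in the training split.
# Counts are approximate: 70% of 284,807 transactions at 0.172% fraud.
n_pos = 345        # labeled fraud in the training split (approx.)
n_neg = 199_020    # non-fraud in the training split (approx.)

pos_weight = n_neg / n_pos   # ~577: each fraud sample carries ~577x the gradient

# In PyTorch this would be passed as:
#   criterion = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([pos_weight]))
```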

That weight provides a directional signal when labeled fraud does appear. But the model still has no concept of what “suspicious” looks like in feature space — it only knows that fraud examples, when they do show up, should be heavily weighted. That’s different from knowing where to look on batches that happen to contain no labeled fraud at all.

The Rule Loss

Here is the core idea. Fraud analysts know two things empirically: unusually high transaction amounts are suspicious, and transactions that sit far from normal behavior in PCA space are suspicious. I want the model to assign high fraud probabilities to transactions that match both signals — even when a batch contains no labeled fraud examples.

The trick is making the rule differentiable. An if/else threshold — flag any transaction where amount > 1000 — is a hard step function. Its gradient is zero everywhere except at the threshold itself, where it’s undefined. That means backpropagation has nothing to work with; the rule produces no useful gradient signal and the optimizer ignores it. Instead, I use a steep sigmoid centered on the batch mean. It approximates the same threshold behavior but stays smooth and differentiable everywhere — the gradient is small far from the boundary and peaks near it, which is exactly where you want the optimizer paying attention. The result is a smooth suspicion score between 0 and 1:

def rule_loss(x, probs):
    # x[:, -1]   = Amount  (last column in creditcard.csv after dropping Class)
    # x[:, 1:29] = V1–V28  (PCA components, columns 1–28)
    amount   = x[:, -1]
    pca_norm = torch.norm(x[:, 1:29], dim=1)

    suspicious = (
        torch.sigmoid(5 * (amount   - amount.mean())) +
        torch.sigmoid(5 * (pca_norm - pca_norm.mean()))
    ) / 2.0

    penalty = suspicious * torch.relu(0.6 - probs.squeeze())
    return penalty.mean()

A note on why PCA norm specifically: the V1–V28 features are the result of a PCA transform applied to the original anonymized transaction data. A transaction that sits far from the origin in this compressed space has unusual variance across multiple original features simultaneously — it’s an outlier in the latent representation. The Euclidean norm of the PCA vector captures that distance in a single scalar. This is not a Kaggle-specific trick. On any dataset where PCA components represent normal behavioral variance, the norm of those components is a reasonable proxy for atypicality. If your features are not PCA-transformed, you’d replace this with a domain-appropriate signal — Mahalanobis distance, isolation score, or a feature-specific z-score.
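To illustrate that last substitution, here is a minimal z-score-based suspicion signal for a raw feature. The function name, steepness, and statistics are my own illustrative choices — in practice the mean and standard deviation would come from your training set:

```python
# Sketch: a smooth z-score-based atypicality signal for non-PCA features.
# train_mean/train_std are frozen training-set statistics (illustrative here).
import numpy as np

def zscore_suspicion(feature, train_mean, train_std, steepness=5.0):
    """Smooth 0-1 suspicion score that rises as the feature moves far
    above the training mean; a differentiable analogue of a hard cutoff."""
    z = (feature - train_mean) / (train_std + 1e-8)
    # shift by 1.0 so the score starts firing past roughly one sigma
    return 1.0 / (1.0 + np.exp(-steepness * (z - 1.0)))

amounts = np.array([10.0, 50.0, 120.0, 5000.0])
score = zscore_suspicion(amounts, train_mean=88.0, train_std=250.0)
# small amounts score near 0, the extreme one near 1
```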

The relu(0.6 − probs) term is the constraint: it fires only when the model’s predicted fraud probability is below 0.6 for a suspicious transaction. If the model is already confident (prob > 0.6), the penalty is zero. That is intentional — I’m not penalizing the model for being too aggressive on suspicious transactions, just for being too conservative. The asymmetry means the rule can never fight against a correct high-confidence prediction.

Formally, the combined objective is:

L_total = L_BCE + λ · L_rule

The λ hyperparameter controls how hard the rule pushes. At λ=0 you get the pure neural baseline. The full training loop:

for xb, yb in train_loader:
    xb, yb = xb.to(DEVICE), yb.to(DEVICE)

    logits = model(xb)
    bce    = criterion(logits.squeeze(), yb)
    probs  = torch.sigmoid(logits)
    rl     = rule_loss(xb, probs)
    loss   = bce + lambda_rule * rl

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

Tuning Lambda

Five values tested: 0.0, 0.1, 0.5, 1.0, 2.0. Each model was trained to best validation PR-AUC with early stopping at patience=7, seed=42:

Lambda 0.0  →  Val PR-AUC: 0.7580
Lambda 0.1  →  Val PR-AUC: 0.7595
Lambda 0.5  →  Val PR-AUC: 0.7620   ← best
Lambda 1.0  →  Val PR-AUC: 0.7452
Lambda 2.0  →  Val PR-AUC: 0.7504

Best Lambda: 0.5

λ=0.5 wins narrowly on validation PR-AUC. The gap between λ=0.0, 0.1, and 0.5 is small — within the range of seed variance, as the multi-seed analysis below shows. The meaningful drop at λ=1.0 and 2.0 suggests that aggressive rule weighting can override the BCE signal rather than complement it. On new data, treat λ=0 as the default and confirm any improvement holds across seeds before trusting it.

One thing to be careful about with threshold selection: I computed the optimal F1 threshold on the validation set and applied it to the test set — for both models symmetrically. On a 0.17% positive-rate dataset, the optimal decision boundary is nowhere near 0.5. Applying different thresholding strategies to different models means measuring the threshold gap, not the model gap. Both must use the same approach:

from sklearn.metrics import precision_recall_curve
import numpy as np

def find_best_threshold(y_true, probs):
    precision, recall, thresholds = precision_recall_curve(y_true, probs)
    # precision/recall have one more entry than thresholds — drop the last point
    f1_scores = 2 * (precision[:-1] * recall[:-1]) / (precision[:-1] + recall[:-1] + 1e-8)
    best = np.argmax(f1_scores)
    return thresholds[best], f1_scores[best]

# Applied symmetrically to BOTH models — val set only
hybrid_thresh, _ = find_best_threshold(y_val, hybrid_val_probs)
pure_thresh,   _ = find_best_threshold(y_val, pure_val_probs)

Results

Model            F1      PR-AUC   ROC-AUC
Pure Neural      0.776   0.806    0.969
Hybrid (λ=0.5)   0.767   0.745    0.970

On this seed, the hybrid and pure baseline are competitive on F1 (0.767 vs 0.776) and identical on Recall@1%FPR. The hybrid’s PR-AUC is lower on this particular seed (0.745 vs 0.806). The cleanest signal is ROC-AUC — 0.970 for the hybrid vs 0.969 for the pure baseline. ROC-AUC is threshold-independent, measuring ranking quality across all possible cutoffs. That edge is where the rule loss shows up most consistently.

Precision-Recall Curve

Figure 1 — Precision-Recall curve for the Hybrid model (seed=42). PR-AUC = 0.745. Image by Author.

Strong early precision is what you want in a fraud system. The curve holds reasonably before dropping — meaning the model’s top-ranked transactions are genuinely fraud-heavy, not just a lucky threshold. In production you’d tune the threshold to your actual cost ratio: the cost of a missed fraud versus the cost of a false alarm. The val-optimized F1 threshold used here is a reasonable middle ground for reporting, not the only valid choice.

Confusion Matrix

Figure 2 — Confusion matrix for the Hybrid model at validation-tuned threshold (seed=42). Image by Author.

Rating Distributions

Figure 3 — Predicted probability distributions for non-fraud (blue) and fraud (orange) classes, Hybrid model (seed=42). Non-fraud clusters near 0; fraud is pushed higher by the rule penalty. Image by Author.

This histogram is what I look at first after training any classifier on imbalanced data. The non-fraud distribution should spike near zero; the fraud distribution should spread toward 1. The overlap region in the middle is where the model is genuinely uncertain — that’s where your threshold lives.

Variance Analysis — 5 Random Seeds

A single-seed result on a dataset this imbalanced is not enough to trust. I ran both models across seeds [42, 0, 7, 123, 2024], applying val-optimized thresholds symmetrically to both in every run:

Seed   42 | Hybrid F1: 0.767  PR-AUC: 0.745 | Pure F1: 0.776  PR-AUC: 0.806
Seed    0 | Hybrid F1: 0.733  PR-AUC: 0.636 | Pure F1: 0.788  PR-AUC: 0.743
Seed    7 | Hybrid F1: 0.809  PR-AUC: 0.817 | Pure F1: 0.767  PR-AUC: 0.755
Seed  123 | Hybrid F1: 0.797  PR-AUC: 0.756 | Pure F1: 0.757  PR-AUC: 0.731
Seed 2024 | Hybrid F1: 0.764  PR-AUC: 0.745 | Pure F1: 0.826  PR-AUC: 0.763
Model            F1 (mean ± std)   PR-AUC (mean ± std)   ROC-AUC (mean ± std)
Pure Neural      0.783 ± 0.024     0.760 ± 0.026         0.967 ± 0.003
Hybrid (λ=0.5)   0.774 ± 0.027     0.740 ± 0.058         0.970 ± 0.005
Figure 4 — F1 and PR-AUC mean ± std across 5 seeds for pure neural and hybrid models. Differences on threshold-dependent metrics are within noise range. Image by Author.

Three observations from the variance data. The hybrid wins on F1 in 2 of 5 seeds; the pure baseline wins in 3 of 5. Neither dominates on threshold-dependent metrics. The hybrid’s PR-AUC variance is notably higher (±0.058 vs ±0.026), meaning the rule loss makes some initializations better and some worse — it’s a sensitivity, not a guaranteed improvement. The one result that holds without exception: ROC-AUC is higher for the hybrid across all 5 seeds. That’s the cleanest signal from this experiment.

Why Does the Rule Loss Help ROC-AUC?

ROC-AUC is threshold-independent — it measures how well the model ranks fraud above non-fraud across all possible cutoffs. A consistent improvement across 5 seeds is a real signal. Here’s what I think is happening.

With 0.172% fraud prevalence, most 2048-sample batches contain only 3–4 labeled fraud examples. The BCE loss receives almost no fraud-relevant gradient on the majority of batches. The rule loss fires on every suspicious transaction regardless of label — it generates gradient signals on batches that would otherwise tell the optimizer almost nothing about fraud. This gives the model consistent direction throughout training, not just on the rare batches where labeled fraud happens to appear.
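The batch-sparsity claim is easy to sanity-check with back-of-envelope arithmetic, treating each transaction in a batch as an independent draw at the fraud prevalence:

```python
# Expected labeled fraud per batch, and the chance a batch has none at all,
# treating transactions as i.i.d. draws at 0.172% prevalence.
batch_size = 2048
fraud_rate = 0.00172

expected_fraud_per_batch = batch_size * fraud_rate    # ~3.5 labeled fraud
p_zero_fraud = (1 - fraud_rate) ** batch_size         # ~3% of batches: none
```

So roughly one batch in thirty carries no fraud-label gradient whatsoever — exactly the batches where the rule loss is the only fraud-relevant signal.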

The penalty is also feature-selective. By pointing the model specifically toward amount and PCA norm, the rule reduces the chance that the model latches onto irrelevant correlations in the other dimensions. It functions as soft regularization over the feature space, not just the output space.

The one-sided relu matters too. I’m not penalizing the model for being too aggressive on suspicious transactions — just for being too conservative. The rule cannot fight against a correct high-confidence prediction, only push up underconfident ones. That asymmetry is deliberate.

On Threshold Evaluation in Imbalanced Classification

One finding from this experiment is worth its own section because it applies to any imbalanced classification problem, not just fraud.

On a dataset with 0.17% positive rate, the optimal F1 threshold is nowhere near 0.5. A model can rank fraud almost perfectly and still score poorly on F1 at a default threshold, simply because the decision boundary must be calibrated to the class imbalance. That means if two models are evaluated with different thresholding strategies — one at a fixed cutoff, the other with a val-optimized cutoff — you are not comparing models. You’re measuring the threshold gap.
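A toy example makes the point concrete. The numbers below are synthetic, not from the fraud experiment: a classifier that ranks every positive above every negative, but whose probabilities all sit below 0.5.

```python
# Perfect ranking, terrible default-threshold F1: the failure mode described above.
import numpy as np
from sklearn.metrics import roc_auc_score, f1_score

y = np.array([0] * 995 + [1] * 5)                 # 0.5% positive rate
probs = np.concatenate([
    np.linspace(0.001, 0.010, 995),   # negatives: very low scores
    np.linspace(0.020, 0.050, 5),     # positives: low, but above every negative
])

auc        = roc_auc_score(y, probs)       # 1.0 — ranking is perfect
f1_default = f1_score(y, probs >= 0.5)     # 0.0 — nothing crosses 0.5
f1_tuned   = f1_score(y, probs >= 0.015)   # 1.0 — a calibrated cutoff fixes it
```

Same model, same scores; the only thing that changed between F1 = 0.0 and F1 = 1.0 is the threshold.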

The practical checklist for a clean comparison on imbalanced data:

  • Both models evaluated with the same thresholding strategy
  • Threshold chosen on validation data, never on test data
  • PR-AUC and ROC-AUC reported alongside F1 — both are threshold-independent
  • Variance across multiple seeds to separate real differences from lucky initialization

Things to Watch Out For

Batch-relative statistics. The rule computes “high amount” and “high PCA norm” relative to the batch mean, not a fixed population statistic. During training with large batches (2048) and stratified sampling, batch means are stable enough. In online inference, scoring individual transactions, freeze those statistics to training-set values. Otherwise the “suspicious” boundary shifts with every call.
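A minimal sketch of what freezing those statistics could look like at scoring time. The constants and function name here are illustrative, not from the repo; the real values would be computed once from the training set:

```python
# Inference-time version of the rule: anchored to frozen training statistics
# so a single transaction gets a stable score, not a batch-relative one.
import numpy as np

def sigmoid(z):
    z = np.clip(z, -60.0, 60.0)   # numerical safety for extreme inputs
    return 1.0 / (1.0 + np.exp(-z))

# Computed once on the training set, then frozen (values illustrative)
TRAIN_AMOUNT_MEAN   = 88.35
TRAIN_PCA_NORM_MEAN = 4.2

def suspicion_online(amount, pca_norm):
    """Same two-signal rule as training, but relative to frozen
    training-set means instead of the current batch mean."""
    return (sigmoid(5 * (amount   - TRAIN_AMOUNT_MEAN)) +
            sigmoid(5 * (pca_norm - TRAIN_PCA_NORM_MEAN))) / 2.0

s = suspicion_online(amount=2500.0, pca_norm=12.0)   # both signals fire
```

With batch-relative statistics, the same transaction would score differently depending on what else happened to arrive in the same request — the frozen version removes that dependence.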

PR-AUC variance increases with the rule loss. Hybrid PR-AUC ranges from 0.636 to 0.817 across seeds versus 0.731 to 0.806 for the pure baseline. A rule that helps on some initializations and hurts on others requires multi-seed validation before drawing conclusions. Single-seed results are not enough.

High λ degrades performance. λ=1.0 and 2.0 show a meaningful drop in validation PR-AUC. Aggressive rule weighting can override the BCE signal rather than complement it. Start at λ=0.5 and confirm on your own data before going higher.

A natural extension would make the rule weights learnable rather than fixed at 0.5/0.5:

# In the model's __init__ — learnable combination weights
self.rule_w = nn.Parameter(torch.tensor([0.5, 0.5]))

# Inside rule_loss — normalize the weights and combine the two signals
w = torch.softmax(self.rule_w, dim=0)
suspicious = (
    w[0] * torch.sigmoid(5 * (amount   - amount.mean())) +
    w[1] * torch.sigmoid(5 * (pca_norm - pca_norm.mean()))
)

This lets the model decide whether amount or PCA norm is more predictive for the specific data, rather than hard-coding equal weights. This variant has not been run yet — it’s the next thing on the list.

Closing Thoughts

The rule loss does something real — the ROC-AUC improvement is consistent and threshold-independent across all 5 seeds. The improvement on threshold-dependent metrics like F1 and PR-AUC is within noise range and depends on initialization. The honest summary: domain rules injected into the loss function can improve a model’s underlying score distributions on rare-event data, but the magnitude depends heavily on how you measure it and how stable the improvement is across seeds.

If you work in fraud detection, anomaly detection, or any domain where labeled positives are rare and domain knowledge is rich, this pattern is worth experimenting with. The implementation is simple — a handful of lines on top of a standard training loop. The more important discipline is measurement: use symmetric threshold evaluation, report threshold-independent metrics, and always run multiple seeds before trusting a result.

The repo [5] has the full training loop, lambda sweep, variance analysis, and eval code. Download the CSV from Kaggle, drop it in the same directory, run app.py. The numbers above should reproduce — if they don’t on your machine, open an issue and I’ll take a look.

References

[1] A. Dal Pozzolo, O. Caelen, R. A. Johnson and G. Bontempi, “Calibrating Probability with Undersampling for Unbalanced Classification” (2015), IEEE SSCI. https://dalpozz.github.io/static/pdf/SSCI_calib_final_noCC.pdf

[2] ULB Machine Learning Group, Credit Card Fraud Detection dataset (Kaggle). https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud (Open Database license)

[3] S. Ioffe and C. Szegedy, “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift” (2015), arXiv:1502.03167. https://arxiv.org/abs/1502.03167

[4] PyTorch Documentation — BCEWithLogitsLoss. https://pytorch.org/docs/stable/generated/torch.nn.BCEWithLogitsLoss.html

[5] Experiment code and reproducibility materials. https://github.com/Emmimal/neuro-symbolic-fraud-pytorch/

Disclosure

This article is based on independent experiments using publicly available data (Kaggle Credit Card Fraud dataset) and open-source tools (PyTorch). No proprietary datasets, company resources, or confidential information were used. The results and code are fully reproducible as described, and the GitHub repository contains the complete implementation. The views and conclusions expressed here are my own and do not represent any employer or organization.
