Pavan Kumar Dubasi

Overview

HealthFraudMLChain is a healthcare insurance fraud detection system built as an M.Sc. dissertation at NIT Patna. It combines a weighted ensemble of five Optuna-tuned classifiers with a SHA-256 blockchain audit trail and ECIES encryption for provider PII protection.

Key Results

F1 = 0.7345 (fixed threshold t = 0.444, 10-fold stratified CV, zero data leakage)
Precision: 73.7%, Recall: 74.7%, ROC-AUC: 0.9587
Friedman test: p = 0.00089, indicating statistically significant differences among the models
Bootstrap 95% CI: F1 [0.7118, 0.7715]
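
As a rough illustration of how a fixed-threshold F1 and a bootstrap interval of this kind can be computed, here is a minimal standard-library sketch. It assumes a percentile bootstrap over (label, score) pairs; the dissertation's exact resampling scheme is not stated here, and the labels and scores below are made-up toy values, not project data.

```python
import random


def f1_at_threshold(y_true, y_prob, threshold=0.444):
    """F1 for the positive (fraud) class after thresholding probabilities."""
    tp = fp = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def bootstrap_f1_ci(y_true, y_prob, threshold=0.444, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample (label, score) pairs with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    scores = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        sample_true = [y_true[i] for i in idx]
        sample_prob = [y_prob[i] for i in idx]
        scores.append(f1_at_threshold(sample_true, sample_prob, threshold))
    scores.sort()
    lo = scores[int((alpha / 2) * n_boot)]
    hi = scores[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Toy labels and scores (illustrative only)
y_true = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.5, 0.3, 0.6, 0.2, 0.1, 0.4, 0.8, 0.05, 0.45]

point = f1_at_threshold(y_true, y_prob)
lo, hi = bootstrap_f1_ci(y_true, y_prob)
print(f"F1 = {point:.4f}, 95% CI = [{lo:.4f}, {hi:.4f}]")
```

Keeping the threshold fixed inside every resample means the interval reflects sampling variability only, not threshold re-tuning.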

Technical Architecture

The system aggregates 558,211 Medicare claims into 5,410 provider-level records with 190 engineered features. Five tree-ensemble classifiers (XGBoost, LightGBM, CatBoost, GradientBoosting, RandomForest) are each tuned with Optuna TPE (60 trials per model) and combined by weighted voting, with the weights optimized for AUC-PR. Predictions are encrypted with ECIES (secp256k1 + HKDF-SHA256 + AES-256-GCM) and recorded on a custom SHA-256 blockchain with Merkle tree integrity verification.
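
The flow from ensemble score to audit record can be sketched with the standard library alone. This is an illustrative sketch, not the project's implementation: it assumes soft voting (a weighted average of probabilities), uses hypothetical weights, probabilities, and a placeholder provider ID, and omits the ECIES encryption step, which would need a third-party cryptography library. The block layout and the helper names (`weighted_vote`, `merkle_root`, `make_block`, `verify_chain`) are likewise assumptions.

```python
import hashlib
import json


def weighted_vote(probs, weights):
    """Weighted soft vote: weight-normalized average of per-model probabilities."""
    return sum(p * w for p, w in zip(probs, weights)) / sum(weights)


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def merkle_root(leaves):
    """Merkle root over leaf byte strings; the last node is duplicated on odd levels."""
    if not leaves:
        return sha256_hex(b"")
    level = [sha256_hex(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [
            sha256_hex((level[i] + level[i + 1]).encode())
            for i in range(0, len(level), 2)
        ]
    return level[0]


def make_block(index, prev_hash, records):
    """Build a block whose hash covers its index, previous hash, and Merkle root."""
    leaves = [json.dumps(r, sort_keys=True).encode() for r in records]
    header = {"index": index, "prev_hash": prev_hash, "merkle_root": merkle_root(leaves)}
    block_hash = sha256_hex(json.dumps(header, sort_keys=True).encode())
    return {**header, "records": records, "hash": block_hash}


def verify_chain(chain):
    """Recompute each Merkle root and block hash, and check the prev_hash links."""
    prev = "0" * 64
    for block in chain:
        expected = make_block(block["index"], block["prev_hash"], block["records"])
        if block["prev_hash"] != prev:
            return False
        if block["merkle_root"] != expected["merkle_root"]:
            return False
        if block["hash"] != expected["hash"]:
            return False
        prev = block["hash"]
    return True


# Hypothetical per-model fraud probabilities and voting weights for one provider
model_probs = [0.62, 0.55, 0.70, 0.48, 0.51]
model_weights = [0.25, 0.25, 0.20, 0.15, 0.15]
score = weighted_vote(model_probs, model_weights)

genesis = make_block(0, "0" * 64, [{"note": "genesis"}])
block1 = make_block(1, genesis["hash"], [{"provider": "PRV_EXAMPLE", "score": round(score, 4)}])
chain = [genesis, block1]
print(verify_chain(chain))
```

Because each block hash covers the Merkle root and the previous block's hash, changing any stored record after the fact causes `verify_chain` to fail for that block.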

Explainability

SHAP (global feature importance) and LIME (local conditional rules) provide dual explainability. Top fraud indicators: deductible payment patterns, maximum reimbursement amounts, and claim duration anomalies.
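
To make the idea behind SHAP attributions concrete, the sketch below computes exact Shapley values for a toy scoring function by averaging marginal contributions over all feature orderings. It does not use the SHAP library's API, which relies on much faster algorithms for tree models. The toy model, the three feature names, and the all-zero baseline are hypothetical.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values for one instance: average marginal contribution of
    each feature over all orderings; absent features take their baseline value."""
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        prev_out = model(current)
        for j in order:
            current[j] = x[j]
            out = model(current)
            phi[j] += out - prev_out
            prev_out = out
    return [v / len(orderings) for v in phi]


def global_importance(model, instances, baseline):
    """Global importance as the mean absolute Shapley value per feature."""
    totals = [0.0] * len(baseline)
    for x in instances:
        for j, v in enumerate(shapley_values(model, x, baseline)):
            totals[j] += abs(v)
    return [t / len(instances) for t in totals]


# Hypothetical toy scoring function over three made-up provider features
def toy_model(features):
    deductible, max_reimb, duration = features
    return 0.5 * deductible + 0.3 * max_reimb + 0.2 * deductible * duration


baseline = [0.0, 0.0, 0.0]
x = [1.0, 2.0, 3.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions sum to the difference between the model output at `x` and at the baseline, and the interaction term is split evenly between the two features involved.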

Related Projects

Attestix

CodeSage

VibeMCP
