PetFinder AdoptionSpeed prediction — comparing Baseline ResNet, Two-Stage ResNet, and LightGBM pipeline across accuracy, QWK, feature attribution, and ablation.
| Rank | Model | Type | Val Accuracy | QWK | Notes |
|---|---|---|---|---|---|
Validation accuracy and QWK for each model. LightGBM leads both metrics.
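
As a reference for how the two metrics in this comparison can be computed, here is a minimal pure-Python sketch. It assumes integer class labels from 0 to `n_classes - 1` (AdoptionSpeed is taken to have five classes, 0 to 4). The function names are illustrative and do not come from the project code.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def quadratic_weighted_kappa(y_true, y_pred, n_classes=5):
    """Cohen's kappa with quadratic disagreement weights (QWK).

    Assumes labels are integers in [0, n_classes - 1].
    """
    n = len(y_true)
    # Observed confusion counts: rows = true class, columns = predicted class.
    observed = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1

    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(n_classes))
                  for j in range(n_classes)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            # Expected count if true and predicted labels were independent.
            expected = row_totals[i] * col_totals[j] / n
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0.0:
        # Degenerate case: every label and prediction is the same class.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected
```

Unlike accuracy, QWK penalises a prediction more the further it lands from the true class, which suits an ordinal target such as AdoptionSpeed.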
Multi-axis comparison scaled 0–1. Accuracy (×100), QWK (×10 for visibility).
Raw prediction counts per (true class, predicted class).
Row-normalised recall matrix — each row sums to 1.0. Diagonal = per-class recall.
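
The relationship between the raw count matrix and the row-normalised recall matrix can be sketched as follows. This is an illustrative helper, not the report's plotting code. It assumes integer labels starting at 0, and leaves a class with no true samples as a row of zeros.

```python
def confusion_counts(y_true, y_pred, n_classes):
    """Raw counts: counts[t][p] = number of samples of true class t predicted as p."""
    counts = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        counts[t][p] += 1
    return counts


def row_normalise(counts):
    """Divide each row by its total so it sums to 1.0; the diagonal is per-class recall."""
    normalised = []
    for row in counts:
        total = sum(row)
        # A class with no true samples has no defined recall; keep zeros.
        normalised.append([c / total if total else 0.0 for c in row])
    return normalised


def per_class_recall(counts):
    """Read the diagonal of the row-normalised matrix."""
    normalised = row_normalise(counts)
    return [normalised[i][i] for i in range(len(normalised))]
```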
Stage 1 trains each model to predict tabular attributes from the image alone. High accuracy means the visual signal contains enough information to infer the attribute.
| Feature | # Classes | ResNet (No Norm) | ResNet (GradNorm) | LightGBM | XGBoost | CatBoost | DT | Best |
|---|---|---|---|---|---|---|---|---|
Per-feature validation accuracy across all Stage-1 configurations. Type and Health exceed 90% in every model; Breed1 is the hardest feature for LightGBM and CatBoost, but XGBoost (63.7%) matches the CNN accuracy.
Combinatorial ablation on the LightGBM Stage-2 classifier (298 subsets, max combo size 3). In each run, every feature group in the subset is zeroed out; ΔQWK is the change in QWK relative to the full-feature baseline (QWK = 0.2616).
ΔQWK when each feature group is individually zeroed. A negative value means removing the group lowers QWK (harmful to remove); a positive value means QWK improves without it (the group is redundant).
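
The ablation procedure can be sketched roughly as below. All names here are hypothetical: `qwk_with_ablation` stands in for whatever routine zeroes the given groups and returns the validation QWK, and `group_columns` is an assumed mapping from group name to feature columns. As a sanity check on the subset count, 298 combinations of size 1 to 3 is consistent with 12 feature groups (12 + 66 + 220), though the report does not state the number of groups explicitly.

```python
from itertools import combinations


def ablation_subsets(groups, max_size=3):
    """Yield every non-empty combination of feature groups up to max_size."""
    for size in range(1, max_size + 1):
        yield from combinations(groups, size)


def zero_groups(row, ablated, group_columns):
    """Return a copy of one feature row with the ablated groups' columns set to 0."""
    dropped = {col for group in ablated for col in group_columns[group]}
    return {col: (0.0 if col in dropped else val) for col, val in row.items()}


def run_ablation(groups, qwk_with_ablation, max_size=3):
    """Score every subset and report delta QWK against the full-feature baseline.

    qwk_with_ablation(subset) is a hypothetical callable returning the QWK
    obtained when the groups in `subset` are zeroed; () means nothing is ablated.
    """
    baseline = qwk_with_ablation(())
    deltas = {}
    for subset in ablation_subsets(groups, max_size):
        # Negative delta: QWK dropped, so the ablated groups were useful.
        deltas[subset] = qwk_with_ablation(subset) - baseline
    return baseline, deltas
```

Sorting `deltas` by value then gives the most harmful removals first and the most redundant groups last.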
All 298 ablation combinations plotted by combo size. Each point is one subset; colour encodes ΔQWK magnitude.
| Features Ablated | n | Accuracy | QWK | ΔACC | ΔQWK |
|---|---|---|---|---|---|
| Features Ablated | n | Accuracy | QWK | ΔACC | ΔQWK |
|---|---|---|---|---|---|
Gradient-weighted Class Activation Mapping (Grad-CAM) back-propagates the AdoptionSpeed prediction score to the last ResNet block
(backbone.layer4[-1]) and uses the resulting gradients to highlight the image regions that drove the prediction.
Warmer colours (red) indicate higher influence; cooler colours (blue/green) indicate lower influence.
Overlays are from the TwoStage ResNet + GradNorm Stage-2 head on 16 validation samples.
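
For reference, the standard Grad-CAM formulation is given below. The overlays are assumed to follow it, but the project's exact implementation is not shown in this report. Here $A^{k}$ is the $k$-th feature map of the chosen layer, $y^{c}$ is the pre-softmax score for class $c$, and $Z$ is the number of spatial locations in the feature map.

```latex
\alpha_k^{c} = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^{c}}{\partial A^{k}_{ij}},
\qquad
L^{c}_{\text{Grad-CAM}} = \operatorname{ReLU}\!\left( \sum_{k} \alpha_k^{c} A^{k} \right)
```

The ReLU keeps only regions with a positive influence on the class score. The coarse map is then upsampled to the input resolution and blended with the image to produce the colour overlay.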