Chest radiography is the most common medical imaging procedure worldwide. Getting reliable, low-cost triage to every patient — especially in under-served regions — is a true ML-for-healthcare problem.
A shared feature extractor feeds two heads: a fast triage decision and a detailed pathology profile. The heads are trained jointly with a combined loss so that both objectives shape the shared representation.
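The combined loss can be sketched in plain Python. This is a minimal illustration, not the project's code: it assumes the triage head emits one binary logit, the pathology head emits one logit per class (multi-label), and the two terms are summed with weights. The names `bce_with_logits`, `combined_loss`, `triage_weight`, and `pathology_weight` are hypothetical.

```python
import math


def bce_with_logits(logit, target, pos_weight=1.0):
    """Numerically stable binary cross-entropy on a raw logit."""
    # softplus(-x) = log(1 + exp(-x)), written so exp() never overflows.
    softplus_neg = max(-logit, 0.0) + math.log1p(math.exp(-abs(logit)))
    # -log(sigmoid(x)) = softplus(-x); -log(1 - sigmoid(x)) = x + softplus(-x)
    return pos_weight * target * softplus_neg + (1 - target) * (logit + softplus_neg)


def combined_loss(triage_logit, triage_target,
                  pathology_logits, pathology_targets,
                  triage_weight=1.0, pathology_weight=1.0):
    """Weighted sum of the triage loss and the mean per-class pathology loss.

    The weights are illustrative assumptions; the source does not state them.
    """
    triage_term = bce_with_logits(triage_logit, triage_target)
    per_class = [bce_with_logits(z, y)
                 for z, y in zip(pathology_logits, pathology_targets)]
    pathology_term = sum(per_class) / len(per_class)
    return triage_weight * triage_term + pathology_weight * pathology_term


# An undecided model (all logits 0) versus a confident, correct one.
undecided = combined_loss(0.0, 1, [0.0, 0.0], [1, 0])
confident = combined_loss(5.0, 1, [5.0, -5.0], [1, 0])
```

Because both terms are computed from the same shared features, a gradient step on this sum updates the feature extractor for both objectives at once.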
Standard transfer-learning hygiene: the high learning rate that freshly initialized heads need can destroy the pretrained backbone's weights, so we split training into two phases.
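One common way to arrange such a split is sketched below. This is an assumption about the general pattern, not a description of the project's schedule: the first phase trains only the heads with the backbone frozen, and the second trains everything at a lower learning rate. The phase names, epoch counts, and learning rates are placeholders, and `Phase` and `param_group_specs` are hypothetical helpers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    epochs: int
    backbone_lr: float  # 0.0 means the backbone stays frozen in this phase
    head_lr: float


# Placeholder values for illustration only; not taken from the project.
SCHEDULE = [
    Phase("head warm-up", epochs=2, backbone_lr=0.0, head_lr=1e-3),
    Phase("full fine-tune", epochs=10, backbone_lr=1e-5, head_lr=1e-4),
]


def param_group_specs(phase):
    """Describe per-group optimizer settings for one training phase."""
    return [
        {"name": "heads", "lr": phase.head_lr, "trainable": True},
        {"name": "backbone", "lr": phase.backbone_lr,
         "trainable": phase.backbone_lr > 0.0},
    ]


for phase in SCHEDULE:
    for group in param_group_specs(phase):
        state = "train" if group["trainable"] else "frozen"
        print(f"{phase.name:>14} | {group['name']:<8} | lr={group['lr']:g} | {state}")
```

In PyTorch, a plan like this would typically map to toggling `requires_grad` on the backbone parameters and passing one parameter group per learning rate to the optimizer.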
`snapshot_download` pins a Parquet revision; 78,468 / 11,210 / 22,442 split.

`torch.cuda.amp` · fp16: roughly 2× throughput on a T4.

`pos_weight` per class · AdamW · CosineAnnealingLR · grad clip 1.0 · grad accum ×4 → effective batch 96.

Classes with clear anatomical signatures (cardiomegaly, emphysema) are the easy wins. Diffuse infiltrative patterns are the hard ones, which matches radiologists' own reliability.
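The arithmetic behind these settings can be checked with a short, framework-free sketch. It makes three assumptions: that `pos_weight` is the negative-to-positive count ratio per class (a common choice, not confirmed by the source), that the learning rate follows the closed-form cosine annealing curve, and that the three split sizes partition one dataset. The toy labels and the helper names are hypothetical.

```python
import math


def pos_weight_per_class(labels):
    """Return n_negative / n_positive for each column of multi-hot labels."""
    n_rows = len(labels)
    weights = []
    for c in range(len(labels[0])):
        n_pos = sum(row[c] for row in labels)
        # Guard against division by zero when a class has no positives.
        weights.append((n_rows - n_pos) / max(n_pos, 1))
    return weights


def cosine_annealing_lr(step, t_max, lr_max, lr_min=0.0):
    """Closed-form cosine decay from lr_max (step 0) to lr_min (step t_max)."""
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * step / t_max))


# Toy multi-hot labels (hypothetical): 4 images, 2 classes, 1 positive each.
weights = pos_weight_per_class([[1, 0], [0, 0], [0, 1], [0, 0]])

# Gradient accumulation: effective batch = micro-batch size x accumulation steps.
accum_steps = 4
effective_batch = 96
micro_batch = effective_batch // accum_steps

# Split sizes from the text, expressed as fractions of the total.
split = (78_468, 11_210, 22_442)
total = sum(split)
fractions = [round(n / total, 3) for n in split]
```

With accumulation ×4 and an effective batch of 96, each forward/backward pass sees 24 images, and the three split sizes work out to roughly 70% / 10% / 20% of 112,120 images.
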
| | Custom SE-ResNet (scratch) | DenseNet-121 (transfer) |
|---|---|---|
| Parameters | ~23 M | 7.9 M |
| Epochs to best | 41 | 18 |
| Macro AUC-ROC | 0.8141 | 0.8459 (+3.2 pp) |
| Binary AUC / F1 | 0.7739 / 0.6587 | 0.7867 / 0.6736 |
| When it fails | Rare, diffuse, subtle classes: with random init, the model never sees enough examples of them. | Same failure modes, but with a higher floor: pretrained edge and texture features transfer cleanly. |
| Our read | A strong "sanity baseline" and ablation for the ImageNet-prior question. | Our strongest model — research-grade. Informative, but not clinically actionable: held-out test and external validation come first. |
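For readers unfamiliar with the headline metric, macro AUC-ROC is the unweighted mean of the per-class AUCs. The sketch below computes it from its pairwise definition (the probability that a positive scores above a negative, with ties counted as half). It is an illustration on toy data, not the evaluation code behind the table, and `auc_roc` and `macro_auc` are hypothetical names.

```python
def auc_roc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auc(scores_by_class, labels_by_class):
    """Unweighted mean of per-class AUCs, so rare classes count equally."""
    aucs = [auc_roc(s, y) for s, y in zip(scores_by_class, labels_by_class)]
    return sum(aucs) / len(aucs)


# Toy example (hypothetical scores): two classes, four images each.
toy_macro = macro_auc(
    [[0.9, 0.8, 0.3, 0.1], [0.9, 0.2, 0.8, 0.1]],
    [[1, 0, 1, 0], [1, 0, 1, 0]],
)

# Gap between the two models in the table, in percentage points.
gain_pp = round((0.8459 - 0.8141) * 100, 1)
```

Because the mean is unweighted, a weak score on a rare, diffuse class pulls the macro figure down as much as a weak score on a common one.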