Chest radiography is the most common medical imaging procedure worldwide. Getting reliable, low-cost triage to every patient — especially in under-served regions — is a true ML-for-healthcare problem.
A shared feature extractor feeds two heads: a fast triage decision and a detailed pathology profile. The two heads are trained jointly with a combined loss, so both objectives shape the shared representation.
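A minimal sketch of the shared-backbone, two-head idea in PyTorch. Everything specific here is an assumption made for illustration: the class name `TwoHeadNet`, the tiny linear stand-in for the backbone, the 14 pathology classes, the rule that a study is "abnormal" if any pathology label is set, and the `triage_weight` mixing factor. The source only says that the two objectives are combined into one loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TwoHeadNet(nn.Module):
    """Shared feature extractor with a binary triage head and a multi-label head."""

    def __init__(self, in_features=32, feat_dim=16, num_pathologies=14):
        super().__init__()
        # Stand-in backbone (assumption); a real model would use a CNN here.
        self.backbone = nn.Sequential(nn.Linear(in_features, feat_dim), nn.ReLU())
        self.triage_head = nn.Linear(feat_dim, 1)
        self.pathology_head = nn.Linear(feat_dim, num_pathologies)

    def forward(self, x):
        feats = self.backbone(x)
        return self.triage_head(feats).squeeze(-1), self.pathology_head(feats)


def combined_loss(triage_logit, path_logits, triage_target, path_targets,
                  triage_weight=0.5):
    """Weighted sum of the two BCE losses (the weighting is hypothetical)."""
    triage = F.binary_cross_entropy_with_logits(triage_logit, triage_target)
    pathology = F.binary_cross_entropy_with_logits(path_logits, path_targets)
    return triage_weight * triage + (1.0 - triage_weight) * pathology


torch.manual_seed(0)
model = TwoHeadNet()
x = torch.randn(8, 32)
path_targets = torch.randint(0, 2, (8, 14)).float()
# Assumed triage label: abnormal if any pathology is present.
triage_target = path_targets.max(dim=1).values

triage_logit, path_logits = model(x)
loss = combined_loss(triage_logit, path_logits, triage_target, path_targets)
loss.backward()  # gradients from both heads reach the shared backbone
```

Because both loss terms backpropagate through `backbone`, each objective influences the shared features, which is the point of joint training.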
Standard transfer-learning hygiene: the high learning rate that freshly initialized heads need would destroy the pretrained backbone's features if applied to the whole network. We therefore split training into two phases.
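One common way to realize a two-phase schedule is sketched below. The source does not spell out what each phase does, so treat this as an assumption: phase 1 freezes the backbone and trains only the heads, and phase 2 unfreezes the backbone with a much smaller learning rate. The helper names and both learning-rate values are hypothetical.

```python
import torch
import torch.nn as nn

# Tiny stand-ins (assumption): a "pretrained" backbone and fresh heads.
model = nn.ModuleDict({
    "backbone": nn.Linear(32, 16),
    "heads": nn.Linear(16, 15),
})


def set_requires_grad(module, flag):
    """Freeze (False) or unfreeze (True) every parameter of a module."""
    for p in module.parameters():
        p.requires_grad = flag


def make_optimizer(model, phase, head_lr=1e-3, backbone_lr=1e-5):
    """Build the optimizer for a phase; the learning rates are illustrative."""
    if phase == 1:
        # Phase 1: backbone frozen, only the heads are updated.
        set_requires_grad(model["backbone"], False)
        return torch.optim.AdamW(model["heads"].parameters(), lr=head_lr)
    # Phase 2: everything trains, but the backbone moves slowly.
    set_requires_grad(model["backbone"], True)
    return torch.optim.AdamW([
        {"params": model["backbone"].parameters(), "lr": backbone_lr},
        {"params": model["heads"].parameters(), "lr": head_lr},
    ])


phase1_optimizer = make_optimizer(model, phase=1)
# ... train the heads for a few epochs ...
phase2_optimizer = make_optimizer(model, phase=2)
# ... fine-tune the whole network ...
```

Using separate parameter groups in phase 2 keeps the heads at their original learning rate while protecting the pretrained weights from large updates.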
snapshot_download pins a Parquet revision; 78,468 / 11,210 / 22,442 split.

torch.cuda.amp with fp16: roughly 2× throughput on a T4.

Per-class pos_weight, AdamW, CosineAnnealingLR, gradient clipping at 1.0, and gradient accumulation ×4 for an effective batch of 96.

Classes with clear anatomical signatures (cardiomegaly, emphysema) are the easy wins. Diffuse infiltrative patterns are the hard ones, which matches radiologists' own reliability.
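The optimization recipe listed above (per-class pos_weight, AdamW, CosineAnnealingLR, gradient clipping at 1.0, accumulation ×4) can be sketched as a single toy epoch. Several details are assumptions: the micro-batch of 24 is inferred from 96 / 4, the 14 classes and the random data are placeholders, pos_weight is computed with the common negatives-over-positives ratio, and the scheduler is stepped once per epoch. Mixed precision is left out so the sketch also runs on CPU.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
NUM_CLASSES = 14    # assumption, not stated in the text
MICRO_BATCH = 24    # inferred: 24 x 4 accumulation steps = 96
ACCUM_STEPS = 4

# Placeholder data: 960 samples with sparse multi-label targets.
inputs = torch.randn(960, 32)
labels = (torch.rand(960, NUM_CLASSES) < 0.1).float()

# Per-class pos_weight = negatives / positives (a common convention).
pos = labels.sum(dim=0).clamp(min=1)
pos_weight = (labels.shape[0] - pos) / pos
criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

model = nn.Linear(32, NUM_CLASSES)  # stand-in for the real network
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10)

optimizer_steps = 0
optimizer.zero_grad()
for step, start in enumerate(range(0, inputs.shape[0], MICRO_BATCH), start=1):
    x = inputs[start:start + MICRO_BATCH]
    y = labels[start:start + MICRO_BATCH]
    # Divide by ACCUM_STEPS so the summed gradients match one batch of 96.
    loss = criterion(model(x), y) / ACCUM_STEPS
    loss.backward()
    if step % ACCUM_STEPS == 0:
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        optimizer.step()
        optimizer.zero_grad()
        optimizer_steps += 1

scheduler.step()  # one cosine step per epoch (assumption)
```

With 960 samples this performs 40 micro-batch passes but only 10 optimizer updates, which is how accumulation trades memory for a larger effective batch.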
| | Custom SE-ResNet (scratch) | DenseNet-121 (transfer) |
|---|---|---|
| Parameters | ~23 M | 7.9 M |
| Epochs to best | 60+ (still climbing) | 18 |
| Macro AUC-ROC | 0.8008 | 0.8459 (+4.5 pp) |
| Binary AUC / F1 | 0.7571 / 0.6474 | 0.7867 / 0.6736 |
| When it fails | Rare, diffuse, subtle classes: with random initialization the model never sees enough examples. | Same failure modes, but with a higher floor: pretrained edge and texture features transfer cleanly. |
| Our read | A strong "sanity baseline" and ablation for the ImageNet-prior question. | Production-grade. We'd ship this if the project had a clinical partner. |
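For readers unfamiliar with the headline metric, the sketch below shows the usual definition of macro AUC-ROC: compute the AUC for each class separately, then take the unweighted mean. It is written in plain Python with toy data and is not the evaluation code behind the table; the function names are hypothetical. The last line only re-derives the gap reported in the table from its two macro AUC values.

```python
def auc_roc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auc(label_columns, score_columns):
    """Unweighted mean of per-class AUCs, so rare classes count as much as common ones."""
    aucs = [auc_roc(y, s) for y, s in zip(label_columns, score_columns)]
    return sum(aucs) / len(aucs)


# Toy data: two classes, four samples each (one column per class).
toy_labels = [[1, 0, 1, 0], [0, 0, 1, 1]]
toy_scores = [[0.9, 0.2, 0.6, 0.7], [0.1, 0.4, 0.35, 0.8]]
toy_macro = macro_auc(toy_labels, toy_scores)

# Gap between the two macro AUCs in the table, in percentage points.
gap_pp = round((0.8459 - 0.8008) * 100, 1)
```

Because every class gets equal weight, weak performance on rare, diffuse classes pulls the macro figure down even when common classes score well.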