
Case Study

OncoVLM: Domain-Specific Foundation Models

Proving that focused training beats raw scale

Training oncology-specific multimodal models that outperform larger general-purpose models. Multi-teacher knowledge distillation at three scales: 4B, 1.7B, and 500M parameters.

  • 92.4% PubMedQA
  • 3 Model Sizes
  • $0 Training Cost
  • 3 Teacher Models

The Scale Assumption

The AI field long assumed that bigger models are always better. But in specialized domains like oncology, general-purpose 70B models often miss domain-specific nuances that smaller, focused models can capture.

  • General models lacking oncology-specific knowledge
  • Expensive inference costs for large models
  • No multimodal understanding of pathology/radiology
  • Hallucinations in clinical contexts

Multi-Teacher Distillation

Instead of training one massive model, we distilled knowledge from multiple specialized teachers into smaller, focused students optimized for oncology tasks.

  • Three model scales: 4B, 1.7B, 500M parameters
  • Multi-teacher distillation from MedGemma, GPT-OSS-20B, Qwen-3-30B (loss sketched after this list)
  • Multimodal: pathology images, radiology, clinical text
  • LoRA fine-tuning for parameter efficiency
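
Concretely, each training example pairs a hard label with tempered soft targets from the teachers. Below is a minimal sketch of how such a multi-teacher distillation loss can be written in PyTorch; the uniform teacher weighting, temperature, and mixing coefficient are illustrative assumptions, not the project's actual training code.

```python
import torch.nn.functional as F

def multi_teacher_kd_loss(student_logits, teacher_logits_list, labels,
                          temperature=2.0, alpha=0.5, teacher_weights=None):
    """Blend hard-label cross-entropy with soft targets from several teachers.

    student_logits:      (N, vocab) student logits (tokens flattened across the batch)
    teacher_logits_list: list of (N, vocab) logit tensors, one per teacher
    labels:              (N,) reference token ids
    alpha:               weight on the distillation term vs. the hard-label term
    """
    # Hard-label term: standard cross-entropy against the reference answer.
    ce = F.cross_entropy(student_logits, labels)

    # Soft-label term: KL divergence to each teacher's tempered distribution.
    if teacher_weights is None:
        teacher_weights = [1.0 / len(teacher_logits_list)] * len(teacher_logits_list)

    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    kd = 0.0
    for w, t_logits in zip(teacher_weights, teacher_logits_list):
        p_teacher = F.softmax(t_logits / temperature, dim=-1)
        kd = kd + w * F.kl_div(log_p_student, p_teacher, reduction="batchmean")

    # Scale the KD term by T^2 so its gradients keep a comparable magnitude.
    return (1 - alpha) * ce + alpha * (temperature ** 2) * kd
```

Note that these three teachers use different tokenizers, so token-level logits are not directly comparable across them; a common alternative is sequence-level distillation, where the student is trained on filtered teacher-generated responses instead of raw logits.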

Architecture

The training pipeline runs on a DGX Spark, using its 128GB of unified memory for full-batch training with automated experiment tracking.

  • Teachers: MedGemma, GPT-OSS-20B, Qwen-3-30B
  • Distillation: Knowledge Extraction → Response Alignment → Quality Filtering
  • Student Models: 4B Flagship, 1.7B Balanced, 500M Edge
  • Infrastructure: DGX Spark, NGC Containers, Autonomous Researcher
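
The student models are adapted with LoRA rather than full fine-tuning, which helps keep three separate model scales trainable on a single workstation. Below is a minimal sketch of how that setup could look with Hugging Face PEFT; the base checkpoint, adapter rank, and target modules are illustrative assumptions, not the project's actual configuration (the tech stack lists Gemma and PaliGemma as the underlying model families).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Illustrative small base checkpoint; the real students are Gemma/PaliGemma-derived.
base_id = "google/gemma-2-2b"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# LoRA adapters on the attention projections keep the trainable parameter
# count small enough to fit distillation runs on a single DGX Spark.
lora_cfg = LoraConfig(
    r=16,                     # adapter rank (assumed)
    lora_alpha=32,            # scaling factor (assumed)
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # typically a fraction of a percent of the base weights
```

Training only the adapter weights is what makes iterating on three student scales practical on personal GPU infrastructure.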

Timeline

  • Mar 2025: Nanochat experiments validate focused training
  • Apr 2025: Initial OncoVLM architecture design
  • May 2025: 4B model training complete
  • Jun 2025: 1.7B and 500M variants trained
  • Jul 2025: 92.4% PubMedQA achieved

Key Lessons

1. 10K focused examples can outperform 500K general examples
2. Multi-teacher distillation captures complementary strengths
3. Personal GPU infrastructure enables research-grade experiments
4. Smaller models can beat larger ones on specialized tasks

Tech Stack

PyTorch · Gemma · PaliGemma · LoRA · DGX Spark