TL;DR
Efficient Continued Pre-Training, Streamlined for Medicine
One critical but lesser-known feature of ModernBERT is its learning rate schedule, which lets researchers continue pre-training seamlessly and eliminates cold restarts. Stable-phase checkpoints, combined with a fresh decay phase, let models converge efficiently on specific domains.
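To make the idea concrete, here is a minimal sketch of a warmup-stable-decay (trapezoidal) schedule in Python; the warmup length, decay shape, and peak learning rate are illustrative assumptions and may differ from ModernBERT's exact setup.

```python
def wsd_lr(step, total_steps, warmup_steps, decay_steps, peak_lr):
    """Warmup-Stable-Decay learning rate: ramp up, hold flat, then decay.

    Checkpoints saved during the flat (stable) phase can be resumed and given
    a fresh decay phase on new data, which is the trick continued pre-training
    exploits. The linear decay below is illustrative; other shapes work too.
    """
    if step < warmup_steps:
        return peak_lr * step / warmup_steps          # warmup: linear ramp up
    if step < total_steps - decay_steps:
        return peak_lr                                # stable: constant learning rate
    remaining = total_steps - step
    return peak_lr * max(remaining, 0) / decay_steps  # decay: linear ramp down
```

Because the stable phase runs at a constant learning rate, a stable-phase checkpoint can be picked up on a new corpus without re-warming; only the decay phase needs to be re-run on the domain data.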
Leveraging this scheduling feature, Thomas Sounack continued the pre-training of ModernBERT on an extensive collection of medical texts. The result: BioClinical ModernBERT, a new model that outperforms all existing encoders on medical classification and Named Entity Recognition (NER) tasks, setting a new state of the art for medical NLP.
Optimized for the Realities of Clinical Context
Real-world medical texts can be very long, spanning full clinical notes as well as lengthy reports. Built on the ModernBERT backbone, BioClinical ModernBERT supports long contexts of up to 8,192 tokens, with hybrid (alternating global/local) attention and unpadding for fast processing, which is crucial for healthcare and clinical workflows.
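As a sketch of what this looks like in practice, the snippet below encodes a long clinical note in a single pass with the Hugging Face transformers API; the model ID is an assumption, so check the collection for the exact names.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Assumed model ID; see the BioClinical ModernBERT collection for the exact name.
model_id = "thomas-sounack/BioClinical-ModernBERT-base"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

long_note = "Admission note: 67-year-old patient presents with ..."  # a full clinical note

# ModernBERT's native context window is 8,192 tokens, so most notes fit in one pass.
inputs = tokenizer(long_note, return_tensors="pt", truncation=True, max_length=8192)
with torch.no_grad():
    outputs = model(**inputs)

token_embeddings = outputs.last_hidden_state  # one vector per token, e.g. for an NER head
```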
A Recipe for Continued Pre-Training
Beyond the model itself, this work provides a refined recipe for continued pre-training for domain adaptation. The approach is reproducible: BioClinical ModernBERT demonstrates robust transfer to new domains, opening the door for anyone seeking to tailor ModernBERT to their own specialized data.
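A minimal sketch of that recipe with the transformers Trainer is shown below. The corpus path, masking rate, and learning rate are assumptions to adapt, and in practice you would resume from one of ModernBERT's released stable-phase (pre-decay) checkpoints rather than the final model.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Assumptions: swap in a stable-phase ModernBERT checkpoint and your own domain corpus.
checkpoint = "answerdotai/ModernBERT-base"
corpus = load_dataset("text", data_files={"train": "domain_corpus.txt"})

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForMaskedLM.from_pretrained(checkpoint)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=8192)

tokenized = corpus.map(tokenize, batched=True, remove_columns=["text"])

# Masked language modeling on domain text; treat the masking rate as a starting point.
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.3)

args = TrainingArguments(
    output_dir="modernbert-domain",
    learning_rate=3e-4,              # assumption: stay near the stable-phase rate
    lr_scheduler_type="constant",    # extend the stable phase; run a decay phase afterwards
    per_device_train_batch_size=8,
    num_train_epochs=1,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()
```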
Try It Out
Interested in leveraging continued pre-training or ModernBERT's long-context capabilities in a different domain? Explore the BioClinical ModernBERT collection and see how it can advance specialized NLP tasks in your field.
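As a quick start, here is a hedged sketch of loading the model with a classification head for fine-tuning; the model ID and label count are assumptions, so check the collection for the exact base/large variants.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumed model ID and label count; see the BioClinical ModernBERT collection.
model_id = "thomas-sounack/BioClinical-ModernBERT-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

# From here, fine-tune with the Trainer API on your labeled clinical classification data.
```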