TL;DR
Enterprise AI with Sovereignty and Scale
At LightOn, multiple large language models (LLMs) work in concert, each solving specific enterprise search and reasoning challenges that organizations face every day.
Our flagship platform, Paradigm, runs on a powerful sovereign LLM designed for on-premise or private-cloud deployment. Businesses and government agencies gain full control of their data and workflows without sacrificing AI performance or flexibility.
Why Paradigm is Different
Beyond infrastructure, Paradigm orchestrates an ensemble of specialized language models. Each model contributes distinct strengths in retrieval, semantic search, and reasoning, resulting in intelligence that far exceeds what any single model can deliver.
👉 Learn more about Paradigm for Enterprises: lighton.ai
Open Source Models Pushing the Boundaries of AI
LightOn has trained and released a portfolio of state-of-the-art open source models that continue to shape the AI landscape:
- ModernBERT (2024) - A modernized bidirectional encoder with an 8,192-token context length, trained on 2 trillion tokens, developed in collaboration with AnswerAI.
- Ettin (2025) - The first suite of paired encoder and decoder models (17M–1B params), developed with Johns Hopkins University, enabling fair performance comparisons between the two architectures. https://github.com/JHU-CLSP/ettin-encoder-vs-decoder
- BioClinical ModernBERT (2025) - A benchmark-setting medical NLP encoder, developed with Dana-Farber Cancer Institute and Harvard University. https://huggingface.co/thomas-sounack/BioClinical-ModernBERT-base
- Reason-ModernColBERT (2025) - A multi-vector model purpose-built for Deep Research and reasoning-intensive retrieval. https://huggingface.co/lightonai/Reason-ModernColBERT
- GTE-ModernColBERT (2025) - A state-of-the-art multi-vector retrieval model with extended context handling. https://huggingface.co/lightonai/GTE-ModernColBERT-v1
- Mambaoutai (2024) - A 1.6B parameter model based on the Mamba State Space architecture. https://huggingface.co/lightonai/mambaoutai
- BLOOM (2023) - LightOn contributed to the architecture of BLOOM, the world’s largest open source language model project at the time.
- RITA (2022) - Autoregressive generative protein models (up to 1.2B parameters), developed with Oxford and Harvard. https://github.com/lightonai/rita
- PAGnol (2021) - The first large French generative model, with 1.5B parameters, trained by LightOn.
- Monoqwen - A vision-language model collection for multimodal understanding and visual reasoning. https://huggingface.co/collections/lightonai/vision-66c5ab2d18743b108e90723a
- ArabicWeb24 ablation models - Several 900M-parameter models, each trained on 25B tokens, to compare different data processing choices on Arabic data.
From Research to Enterprise Deployment
While Paradigm is LightOn’s commercial centerpiece, the real value lies in how this carefully orchestrated portfolio powers enterprise knowledge management:
- Advanced document retrieval
- Context-aware reasoning
- Secure on-premise AI deployment
All of this is supported by production-ready infrastructure:
- PyLate - Flexible training & retrieval for late interaction models. https://github.com/lightonai/pylate
- FastPlaid - Multi-vector search with +554% throughput improvement.
- MCPylate - A space where you can try multi-vector search. Multi-vector search has shown very strong performance compared to single dense vector search in numerous domains, including out-of-domain, long-context, and reasoning-intensive retrieval. This repository lets you search the LeetCode split of the BRIGHT dataset using Reason-ModernColBERT.
These libraries ensure cutting-edge research doesn’t stay in the lab, but scales seamlessly in enterprise environments.
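To make the difference between multi-vector and single dense vector search concrete, here is a minimal sketch of late-interaction (MaxSim) scoring in plain Python. It is an illustration only, not the PyLate or FastPlaid implementation: the function names are hypothetical, and the toy 2-dimensional vectors stand in for the per-token embeddings a model such as Reason-ModernColBERT would produce.

```python
import math

def normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def maxsim_score(query_vecs, doc_vecs):
    """Late interaction: each query token keeps its best cosine match
    among the document tokens, and the per-token maxima are summed."""
    q = [normalize(v) for v in query_vecs]
    d = [normalize(v) for v in doc_vecs]
    return sum(max(dot(qv, dv) for dv in d) for qv in q)

def mean_pool(vecs):
    """Collapse token vectors into one vector by averaging each dimension."""
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def single_vector_score(query_vecs, doc_vecs):
    """Single dense vector baseline: cosine similarity of mean-pooled vectors."""
    return dot(normalize(mean_pool(query_vecs)), normalize(mean_pool(doc_vecs)))

def rank(query_vecs, docs):
    """Return (doc_name, score) pairs sorted by MaxSim score, best first."""
    scored = [(name, maxsim_score(query_vecs, vecs)) for name, vecs in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy data (assumed for illustration): two query tokens, two documents.
query = [[1.0, 0.0], [0.0, 1.0]]
docs = {
    "doc_a": [[1.0, 0.0], [0.0, 1.0]],  # one token matching each query token
    "doc_b": [[1.0, 1.0]],              # a single "averaged" token
}

if __name__ == "__main__":
    print("multi-vector ranking:", rank(query, docs))
    for name, vecs in docs.items():
        print(name, "single-vector score:", single_vector_score(query, vecs))
```

In this toy case the mean-pooled baseline gives both documents the same score, while MaxSim ranks `doc_a` first because it matches each query token individually. Real systems apply the same scoring to learned token embeddings and rely on an index to avoid scoring every document exhaustively.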
👉 Discover LightOn’s Infrastructure Tools
Research Collaborations
Over the years, LightOn has actively collaborated with leading research universities and R&D labs worldwide, advancing the state of AI and building a strong foundation for enterprise deployment.
Our partners include:
Criteo Labs, Dana-Farber Cancer Institute, École Normale Supérieure (LPENS), École Polytechnique Fédérale de Lausanne (EPFL), EURECOM, Harvard Medical School, Hugging Face, Inria, Johns Hopkins University, Massachusetts Institute of Technology (MIT), McGill University, Microsoft Research, NVIDIA, OpenEvidence, Queen’s University, and the University of Oxford.
These collaborations are a key part of LightOn’s DNA, ensuring that our enterprise solutions are built on top of cutting-edge research and global scientific excellence.
LightOn in One Line
Deep tech, simple delivery.
We turn extreme engineering complexity into an effortless enterprise AI experience.
🔑 Ready to bring AI sovereignty and state-of-the-art enterprise search to your organization?
👉 Contact LightOn today to see how Paradigm can transform your data workflows.