Research And Development

Advancing Generative AI through Innovation

The R&D team at LightOn advances the field of generative AI through continuous innovation and development. Its expertise spans creating and fine-tuning the large language models (LLMs) that form the backbone of the Paradigm platform, a comprehensive AI solution designed for enterprise use. The platform simplifies the integration of generative AI into business workflows and offers both on-premises and cloud options, giving businesses flexibility and scalability.

Recent R&D Posts

Multi-vector retrieval now on the infrastructure you already run

ColBERT-grade retrieval quality on the infrastructure teams already operate, at a fraction of the storage and serving cost.

June 16, 2026
R&D

LightOn Demonstrates the Flexibility of Its OCR Model by Adapting It to Arabic Through Targeted Training

LightOn demonstrates the flexibility of LightOnOCR-2, its document understanding model, by adapting it to Arabic through fine-tuning.

June 12, 2026
R&D

Adaptive Chunking: Reasoning Starts Before the LLM Sees a Token

Document-aware chunking selection for production RAG systems

May 19, 2026
R&D

Deep Research is now Open

Agent-ModernColBERT adds ~10% over Reason-ModernColBERT on BrowseComp-Plus, stays at 149M parameters, and brings GPT-5 + Qwen3-8B-level retrieval performance to a fully open stack.

May 12, 2026
R&D

🔴 The Retriever You Actually Need

Introducing LateOn and DenseOn, two Apache 2.0 retrievers: SOTA on BEIR, built to generalize.

April 21, 2026
R&D

Document Intelligence at First Sight

OriOn-Qwen-SR1: Fast Implicit Reasoning for Long Documents

April 9, 2026
R&D

Open-Source LightOnOCR-2 Just Outscored Claude, GPT-5, Qwen3, Mistral and Mathpix at Table Extraction

The most valuable information in enterprise documents doesn't live in paragraphs. It lives in tables.

April 7, 2026
R&D

The Bloated Retriever Era Is Over

Reason-ModernColBERT tops BrowseComp-Plus, the most rigorous agentic search benchmark, across every metric, with 54× fewer parameters and fewer search calls.

March 19, 2026
R&D

NOVA: A Guide to Actually Measuring How Your Agent Works on Your Data

Numbers Over Vibes: Building a RAG Evaluation Framework That Actually Works

March 11, 2026
R&D
