Summary of LightOn AI meetup #14

WeightWatcher: a Diagnostic Tool for Deep Neural Networks

June 4, 2021

TL;DR

It has been about a year since we started our (virtual) LightOn AI Meetups, and to mark the anniversary 🥳, we had Charles Martin as a guest. Charles is the Chief Scientist at Calculation Consulting, and he presented his work on WeightWatcher: a Diagnostic Tool for Deep Neural Networks, a Python package built around a series of papers.

The 📺 recording of the meetup is on LightOn’s YouTube channel. Subscribe to the channel and join our Meetup group to get notified of the next videos and events!

WeightWatcher is a Python package for analyzing trained models and inspecting models that are difficult to train 🏋️. It can be used to gauge improvements in model performance and to predict test accuracies across different models 🔮 (without ever looking at the data!). It can also detect potential problems when compressing or fine-tuning pre-trained models 🗜️.

It is based on ideas from Random Matrix Theory, Statistical Mechanics, and Strongly Correlated Systems. The main idea is to fit a power law to the tail of the empirical spectral density (ESD) of the layer weights. The power-law exponent α is what helps us detect potential problems.

Fitting a power law to the tail of the ESD on a log-log scale needs to be done carefully!
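To make the fitting step concrete, here is a minimal sketch of estimating a power-law exponent α from a set of eigenvalues. It is not WeightWatcher's own code: the function names `fit_tail_exponent` and `sample_power_law` are hypothetical, the estimator is the standard maximum-likelihood formula for a continuous power law, and the tail cutoff `xmin` is assumed to be given. Choosing that cutoff well is a large part of what makes the fit delicate.

```python
import math
import random


def fit_tail_exponent(eigenvalues, xmin):
    """Maximum-likelihood estimate of alpha for p(x) ~ x^(-alpha), x >= xmin.

    Only eigenvalues at or above the (assumed, user-supplied) cutoff xmin
    are treated as belonging to the power-law tail.
    """
    tail = [x for x in eigenvalues if x >= xmin]
    if len(tail) < 2:
        raise ValueError("not enough eigenvalues in the tail to fit")
    log_sum = sum(math.log(x / xmin) for x in tail)
    return 1.0 + len(tail) / log_sum


def sample_power_law(alpha, xmin, n, seed=0):
    """Draw n synthetic 'eigenvalues' from a continuous power law.

    Uses inverse-transform sampling: x = xmin * (1 - u)^(-1 / (alpha - 1)).
    """
    rng = random.Random(seed)
    return [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(n)]


# Synthetic stand-in for the tail of a layer's ESD, with a known exponent.
synthetic = sample_power_law(alpha=3.0, xmin=1.0, n=20000)
alpha_hat = fit_tail_exponent(synthetic, xmin=1.0)
print(f"estimated alpha = {alpha_hat:.2f}")
```

On real layers the eigenvalues would come from the layer's weight matrix rather than from a synthetic sampler, and a badly chosen `xmin` can shift the estimate substantially.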

Poorly trained models tend to have layers with large α, as can be seen, for example, by comparing GPT and GPT-2: the same model trained on dirty versus well-curated data.

GPT is trained on dirtier data than GPT-2, and it shows in the unusually large α values for some of the layers.

In particular, a weighted α can predict the test accuracy for models in the same architecture series across varying depths, as well as across other architectures and regularization parameters 📉.

The correlation between test accuracy and weighted alpha is remarkable.
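This summary does not spell out how the weighted α is computed. As a rough sketch, assuming the definition we understand the WeightWatcher papers to use (the average over layers of each layer's α multiplied by the log of that layer's largest eigenvalue), it could look like the following. The function name `weighted_alpha` and the example numbers are hypothetical.

```python
import math


def weighted_alpha(layer_alphas, layer_lambda_max):
    """Average of alpha_l * log10(lambda_max_l) over layers.

    Assumed definition, not taken from this summary: each layer's
    power-law exponent is weighted by the log of its largest eigenvalue.
    """
    if not layer_alphas or len(layer_alphas) != len(layer_lambda_max):
        raise ValueError("need one alpha and one lambda_max per layer")
    total = sum(a * math.log10(lam)
                for a, lam in zip(layer_alphas, layer_lambda_max))
    return total / len(layer_alphas)


# Made-up per-layer values, for illustration only.
alphas = [2.0, 3.0, 4.0]
lambda_maxes = [10.0, 100.0, 1000.0]
print(f"weighted alpha = {weighted_alpha(alphas, lambda_maxes):.3f}")
```

The result is a single number per model, which is what makes it convenient to compare against test accuracy across a series of models.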

Finally, there is some early research on extending this idea to decide when to stop training early 🛑, to set per-layer learning rates 🎛️, or to detect over-fitting 🔍. Quite a program! We look forward to even more insightful empirical metrics in Charles’ WeightWatcher in the future. The video of the meetup is available on LightOn’s YouTube channel.
