in text-based data, one way I think about sample efficiency is the marginal gain in model performance per additional unit of training input.

as of May 2024, a paper from Rohan Pandey explored whether gzip compressibility can serve as a proxy for the sample efficiency of text data, and reported evidence that it can.
- implications? need to think more broadly, as the paper suggests that information theory is relevant when applied to text data.
- connecting compression to other modalities, through the lens of information or energy, could be a [fun project to do](Compressibility and sample efficiency?).
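
a minimal sketch of what "gzip compressibility" could mean for a chunk of text, assuming it is measured as compressed size divided by raw UTF-8 size. the function name `gzip_compressibility` and the toy inputs are my own illustration, not the paper's exact procedure.

```python
import gzip
import random
import string


def gzip_compressibility(text: str) -> float:
    """Compressed size / raw size in bytes (assumed definition; lower = more compressible)."""
    raw = text.encode("utf-8")
    if not raw:
        raise ValueError("text must be non-empty")
    return len(gzip.compress(raw)) / len(raw)


# Toy inputs (hypothetical): highly repetitive text vs. near-random characters.
repetitive = "the cat sat on the mat. " * 200
rng = random.Random(0)  # fixed seed so the comparison is reproducible
varied = "".join(rng.choice(string.ascii_lowercase + " ") for _ in range(len(repetitive)))

if __name__ == "__main__":
    print(f"repetitive: {gzip_compressibility(repetitive):.3f}")
    print(f"varied:     {gzip_compressibility(varied):.3f}")
```

the repetitive string should come out with a much lower ratio than the near-random one. the open question for the note is how a number like this relates to how much a model learns per token.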

Driving Questions

Gaia Prime