In compression of image/video/audio/numerical data, especially lossless, we often subtract a (context-based) predictor, then encode the difference (the residual), usually assuming a Laplace distribution, e.g. through Golomb coding (~2% more bits/pixel than the Shannon limit for the parameters used).
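To make the setup concrete, here is a minimal sketch (not any particular codec's exact pipeline) of how Laplace-distributed residuals are typically handled: map each signed residual to a non-negative integer, then count the bits a Golomb-Rice code would spend on it.

```python
# Minimal sketch: cost of Golomb-Rice coding for prediction residuals,
# the standard choice when residuals are roughly Laplace-distributed.

def zigzag(r):
    # Map signed residual to non-negative integer: 0,-1,1,-2,2 -> 0,1,2,3,4
    return 2 * r if r >= 0 else -2 * r - 1

def golomb_rice_bits(n, k):
    # Code length in bits for value n with Rice parameter k:
    # unary-coded quotient (n >> k, plus terminating bit) + k remainder bits.
    return (n >> k) + 1 + k

residuals = [0, -1, 3, -2, 0, 1, -5, 2]  # toy data
k = 1  # Rice parameter; ideally derived from the (context-dependent) Laplace width
total = sum(golomb_rice_bits(zigzag(r), k) for r in residuals)
print(total)  # -> 27
```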

The tough question is: how to choose the width of such a Laplace distribution / the Golomb parameter depending on the context?
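One standard answer, used in JPEG-LS: per context, keep running counts N (number of occurrences) and A (sum of absolute residuals), and pick the Rice parameter k as the smallest k with N·2^k ≥ A, i.e. roughly k ≈ log2(mean |r|):

```python
# JPEG-LS-style Rice parameter selection from per-context statistics:
# N = count of symbols seen in this context, A = accumulated sum of |residual|.

def rice_parameter(N, A):
    # Smallest k such that N << k >= A, i.e. k ~ log2(A / N).
    k = 0
    while (N << k) < A:
        k += 1
    return k

print(rice_parameter(4, 1))   # mean |r| ~ 0.25 -> k = 0
print(rice_parameter(4, 30))  # mean |r| ~ 7.5  -> k = 3
```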

LOCO-I ( https://en.wikipedia.org/wiki/Lossless_JPEG ) quantizes the context into 365 discrete possibilities, treated independently.
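For reference, a rough sketch of how those 365 contexts arise in LOCO-I/JPEG-LS (simplified, with the default 8-bit thresholds): three local gradients are each quantized into 9 regions, giving 9·9·9 = 729 combinations, which sign symmetry merges into (729+1)/2 = 365 contexts.

```python
# Simplified LOCO-I/JPEG-LS context formation.

def quantize_gradient(d, T1=3, T2=7, T3=21):
    # Default JPEG-LS thresholds for 8-bit data; 9 regions: -4..4.
    s = -1 if d < 0 else 1
    d = abs(d)
    if d == 0:
        q = 0
    elif d < T1:
        q = 1
    elif d < T2:
        q = 2
    elif d < T3:
        q = 3
    else:
        q = 4
    return s * q

def context_index(a, b, c, d):
    # Neighbors: a = left, b = above, c = above-left, d = above-right.
    q1 = quantize_gradient(d - b)
    q2 = quantize_gradient(b - c)
    q3 = quantize_gradient(c - a)
    # Merge (q1,q2,q3) with its sign-flipped twin -> 365 distinct contexts.
    if (q1, q2, q3) < (-q1, -q2, -q3):
        q1, q2, q3 = -q1, -q2, -q3
    return q1 * 81 + q2 * 9 + q3  # unique id per merged context
```

Sign-flipped neighborhoods map to the same context, e.g. `context_index(0, 5, 0, 10) == context_index(10, 5, 10, 0)`.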

I didn’t like such a brute-force approach – it treats the contexts independently, without exploiting the dependencies between them.

So, recently working on modelling conditional probability distributions, I decided to test low-parametric ARCH-like models – it turns out models with just a few parameters can give similar or even better compression ratios than the 365 parameters of LOCO-I: https://arxiv.org/pdf/1906.03238 , files with a Mathematica notebook:
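To illustrate the ARCH-like idea, here is a hedged toy sketch (my reading of the general approach, not the paper's exact model): instead of a separate Golomb parameter per quantized context, predict the Laplace width b of the current residual as a linear combination of absolute neighboring residuals, b = β0 + β1·|r_prev| + ..., so only a handful of betas need to be learned.

```python
# Toy ARCH-like width model: fit b = beta0 + beta1 * |previous residual|
# by least squares, using the Laplace identity E|r| = b.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
r = np.zeros(n)
b_true = np.zeros(n)
for i in range(1, n):
    # Synthetic data: true width depends on the previous residual's magnitude.
    b_true[i] = 0.5 + 0.8 * abs(r[i - 1])
    r[i] = rng.laplace(scale=b_true[i])

# Moment-based fit: regress |r_i| on [1, |r_{i-1}|].
X = np.column_stack([np.ones(n - 1), np.abs(r[:-1])])
beta, *_ = np.linalg.lstsq(X, np.abs(r[1:]), rcond=None)
print(beta)  # should recover roughly [0.5, 0.8]
```

In a real codec one would fit the betas by maximum likelihood (or online) and feed the predicted width into the entropy coder's parameter choice; the least-squares fit above is just the simplest consistent estimator for the sketch.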

Any thoughts on this – other used or interesting approaches to model conditional continuous distributions, like a context-dependent width of the Laplace distribution for residual encoding?