> I think you mean "unless the lossy algorithm has a perverse encoding."
That's right.
> It's true that no lossless algorithm can beat a lossy algorithm if averaged across all possible images
I concede the point. I thought that this argument applied to the average of any input set, not just the set of all possible images, but then I realized that it's actually pretty easy to come up with counterexamples e.g. N input images, which can be losslessly compressed to log2(N) bits, which will easily outperform jpg for small N. That's not a very realistic situation, but it does torpedo my "proof".
It's probably still true that no lossless format could possibly beat lossy JPEG for typical photos.
If I understand your example correctly, it only has the lossless algorithm beating JPEG by cheating: the lossless algorithm contains a dictionary of the possible highly compressible images in it, so the algorithm plus the compressed images still weighs more than libjpeg plus the JPEGs. But there are other cases where the lossless algorithm doesn't have to cheat in this way; for example, if you have a bunch of images generated from 5×5-pixel blocks of solid colors whose colors are generated by, say, a known 68-bit LFSR, you can code one of those images losslessly in about 69 bits, but JPEG as typically used will probably not compress them by an atypical amount. Similarly for images generated from the history of an arbitrary 1-D CA, or—to use a somewhat more realistic example—bilevel images of pixel-aligned monospaced text in a largish font neither of whose dimensions is a multiple of 8, say 11×17. All of these represent images with structure that can be straightforwardly extracted to find a lossless representation that is more than an order of magnitude smaller than the uncompressed representation, but where JPEG is not good at exploiting that structure.
That's right.
> It's true that no lossless algorithm can beat a lossy algorithm if averaged across all possible images
I concede the point. I thought that this argument applied to the average of any input set, not just the set of all possible images, but then I realized that it's actually pretty easy to come up with counterexamples e.g. N input images, which can be losslessly compressed to log2(N) bits, which will easily outperform jpg for small N. That's not a very realistic situation, but it does torpedo my "proof".
So you were right, and I was wrong.