Hacker News | liamYC's comments

Developed a UI with Claude here:

https://ljsimpkin.github.io/pi-compress

It really shows how inefficient such a compression scheme would be. Haha, nice idea.


Point taken about the index potentially being really long. Why would the length be longer than the data? Don’t you need to find the right sequence?

For a given data length, considering all possible data of that length, it's impossible for the median index length to be shorter than the data length. There aren't enough positions that early in the digits: at most 10^(n-1) indices have fewer than n digits, but there are 10^n possible n-digit strings.
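To make the counting argument concrete, here is a rough sketch. It does not use the real digits of pi; it assumes a seeded pseudo-random digit stream as a stand-in, and the helper names (`first_positions`, `index_length_stats`) are made up for this example. It finds the first index of every 3-digit block and checks how many of those indices can be written in fewer than 3 digits.

```python
import random
from statistics import median

def first_positions(digits, n):
    """Map each n-digit block to the 0-based index of its first occurrence."""
    first = {}
    for i in range(len(digits) - n + 1):
        first.setdefault(digits[i:i + n], i)
    return first

def index_length_stats(digits, n):
    """Return (fraction of blocks whose index is shorter than n digits,
    median number of digits needed to write the index)."""
    first = first_positions(digits, n)
    total = 10 ** n
    lengths = []
    for v in range(total):
        block = f"{v:0{n}d}"
        # A block that never appears has no usable index at all.
        lengths.append(len(str(first[block])) if block in first else float("inf"))
    shorter = sum(1 for length in lengths if length < n)
    return shorter / total, median(lengths)

# ASSUMPTION: a pseudo-random stream stands in for the digits of pi.
rng = random.Random(0)
STREAM = "".join(rng.choice("0123456789") for _ in range(200_000))

fraction_shorter, median_len = index_length_stats(STREAM, 3)
print(f"blocks with an index shorter than 3 digits: {fraction_shorter:.1%}")
print(f"median index length: {median_len}")
```

Whatever digit stream you plug in, only indices 0 to 99 are shorter than three digits, so at most 100 of the 1000 three-digit blocks can come out shorter, and the median index can never be shorter than the data.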

I wonder if it might make more sense to come at it from the opposite angle. Take pi as the sequence you want to compress with. But pi, whose digits behave as if random, has redundancies in it that make it less than optimal. So instead, for a given size of block you want to look up, design the optimal number to use for compression. For instance, if you want to compress "594" in the digits of pi, the sequence 253 appears before it twice. That means any three-digit sequence that first appears only after the second 253 costs you extra to get past that repeat, and "pi, but with all the 253s removed after the first one" is clearly a more efficient encoder for 3-digit numbers than pi itself.

So, instead of using pi, design an optimal number to encode with.

What you'll find is that the optimal sequence ends up being exactly as efficient as listing the blocks in order and indexing by block number. There are a number of other solutions; you could use superpermutations to get "all possible subsequences" with fewer digits in your target number. But then you need to give the encoder and decoder a table of where the digit sequences appear, since they are no longer regularly spaced, and indexing into that table costs exactly the same as writing your number as the concatenation of all the blocks and indexing by block rather than by digit position.

This actually has some natural overlap with the "normal numbers", in that one of the earlier normal numbers was the Champernowne constant (https://en.wikipedia.org/wiki/Champernowne_constant). I'm not sure whether this is necessarily optimal for an arbitrary block size. (My quick intuitive check suggests it may be, but "my quick intuitive check" in the time of an HN post is not something I'd count on.) In this scheme, you can rely on the fact that the person encoding with this constant knows how it is built: an index of 0-9 means a single digit, and an index into the two-digit blocks means the data has length two. Since the encoder and decoder both know that, they can also skip the middle of the block and just index into "the n'th number"... which degenerates into "the index of number N is N", which means this is not a compression scheme.
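Here is a minimal sketch of that degenerate case. It is a simplification of my own, not the actual Champernowne constant: it assumes fixed-width, zero-padded blocks listed in numeric order, and the names `block_table`, `encode` and `decode` are invented for illustration.

```python
def block_table(n):
    """Concatenate every n-digit block in numeric order, zero-padded."""
    return "".join(f"{v:0{n}d}" for v in range(10 ** n))

def encode(block):
    """Return the block number where `block` sits in the ordered table."""
    return int(block)

def decode(index, n):
    """Look up block number `index` in the table of n-digit blocks."""
    table = block_table(n)
    return table[index * n:(index + 1) * n]

block = "594"
index = encode(block)
restored = decode(index, len(block))
# Writing the index at the same width as the block gives the block back.
index_text = f"{index:0{len(block)}d}"

print(index, restored, index_text)
```

The index for "594" is 594, and writing it out takes exactly as many digits as the data did, so the lookup table adds nothing.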

To put all that in a nutshell: if you want to deeply understand why this compression scheme doesn't work, I think you can get there by trying to optimize it.


What does ADRS stand for?


This blog post has more accessible writing and diagrams: https://www.sigops.org/2025/barbarians-at-the-gate-how-ai-is...

From TFA: https://arxiv.org/pdf/2510.06189

> We term this approach as AI-Driven Research for Systems (ADRS), which iteratively generates, evaluates, and refines solutions.

> The central thesis of this paper is that a new class of AI-driven approaches, which we term AI-Driven Research for Systems (ADRS), is beginning to show promising results in automated algorithm discovery, and will ultimately prompt a re-evaluation of the traditional role of systems researchers.


Smart. How do you back up the users’ encryption keys?


A set of encryption keys is a lot smaller than the set of all user data, so it's much more viable to have both more redundant hot storage and more frequently rotated cold storage of just the keys.


You’re assuming they would attempt to produce gold exactly the same way. The process would likely evolve to become better. What happens if you add a growth rate?


I’d like to third the thanks for Aider; it’s fantastic!


How do you measure positive and negative societal impact of this technology?

I find mobile phone face unlock so useful. Giving every citizen the power to use face recognition could be better than leaving it to a few people. Robots that identify someone and give them lifesaving medication are great (but the opposite, a robot assassin, can also be created). I guess it comes down to good people building good tools. Humans are generally kind and empathetic.


It'll be a net positive for things like that. Passports might be rendered useless in the future, since facial recognition is already used at customs. It can also help solve crimes, since it can recognize faces in CCTV footage, among other things.

There is a certain level of distrust since it can be abused and people think it will lead to a dystopian police state.


Yet also short-sighted hairless apes with all the genetic programming that comes with that, for better AND for worse.


It depends on the tool you use and, I guess, the use case too. Some are language-model agnostic, like aider on the command line; I use Sonnet sometimes and 4o other times. I wonder if or when language models will become highly differentiated. Right now I see them more as a commodity, relatively interchangeable, but that is shifting slightly with other features as they battle to become platforms.


Haha yes!!


“You can’t make this stuff up”

