Hacker News | hiroakiaizawa's comments

Good reminder that raw tokens/sec numbers can be misleading without latency and context-window considerations.


Interesting approach. I like that the implementation focuses on scalability rather than only visualization.


It's extremely inefficient, since it uses pointers to neighboring cells.

If you want to handle the grid edges (whether for a wrap-around "infinite" grid, or not) without too much special code, then leave a 1-cell border around the grid and fill this with the appropriate data (empty cells, or wraparound cells). If you really want to be efficient then just write the special-case edge code.
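A minimal sketch of the 1-cell border idea, in case it helps. Everything here is an assumption for illustration: the names `pad_grid` and `step` are made up, the grid is a plain list of lists of 0/1, and the update rule is Conway's Game of Life, which the thread may or may not be about. The point is only that the inner loop never has to check for edges once the border is filled.

```python
def pad_grid(grid, wrap=False):
    """Return a copy of grid surrounded by a 1-cell border.

    wrap=False: border cells are empty (0).
    wrap=True:  border cells mirror the opposite edge (torus).
    """
    rows, cols = len(grid), len(grid[0])
    padded = [[0] * (cols + 2) for _ in range(rows + 2)]
    for r in range(rows):
        padded[r + 1][1:cols + 1] = grid[r]
    if wrap:
        for r in range(1, rows + 1):
            padded[r][0] = padded[r][cols]       # left border <- rightmost real column
            padded[r][cols + 1] = padded[r][1]   # right border <- leftmost real column
        # Copy whole rows last so the four corners are filled correctly too.
        padded[0] = padded[rows][:]              # top border <- last real row
        padded[rows + 1] = padded[1][:]          # bottom border <- first real row
    return padded


def step(grid, wrap=False):
    """One Game of Life generation (assumed rule), with no edge special cases."""
    rows, cols = len(grid), len(grid[0])
    p = pad_grid(grid, wrap)
    new = [[0] * cols for _ in range(rows)]
    for r in range(1, rows + 1):
        for c in range(1, cols + 1):
            # Sum the 3x3 block, then subtract the cell itself.
            n = sum(p[r + dr][c + dc]
                    for dr in (-1, 0, 1)
                    for dc in (-1, 0, 1)) - p[r][c]
            new[r - 1][c - 1] = 1 if n == 3 or (n == 2 and p[r][c]) else 0
    return new


# Usage: a horizontal blinker on a 5x5 grid.
blinker = [[0] * 5 for _ in range(5)]
blinker[2][1:4] = [1, 1, 1]
next_gen = step(blinker)
```

Re-padding on every step costs a copy. A real implementation would more likely keep the padded buffer around and refresh only the border cells, or write the special-case edge code as suggested above.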


The notebook is intentionally minimal.

It is not a prediction model or a causal explanation, just a reproducible concentration check with fixed ex-ante definitions and minimal outputs.

Runs in ~30 sec on Colab.


One thing I've started appreciating with LLM-assisted workflows is how important fixed evaluation protocols are.

Without definitions and procedures locked in advance, it's extremely easy to mistake iterative adaptation for genuine signal.


Curious what domains people would try this on. Would love to see other datasets.


Tested on finance, power, and earthquake data.

Minimal version only. Happy to adapt to other datasets.


Interesting. What are the main latency bottlenecks in practice?


Nice. What scale does this realistically reach on a single machine?


Model: 36L/36H/576D, 144.2M params

Runs on a Blackwell 6000 Max-Q, using 86GB VRAM. Training supposedly takes 3h40m.
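A rough sanity check on those figures, under assumptions the comment does not state: that 36L/36H/576D means 36 layers, 36 attention heads, and a model width of 576, and that each layer is a standard GPT-style block (four d x d attention projections plus an MLP with 4x expansion, with biases and norms ignored). The helper name `approx_block_params` is made up for this sketch.

```python
def approx_block_params(n_layers, d_model, mlp_ratio=4):
    """Approximate weight count of the transformer blocks only.

    Assumes a GPT-style block: Q, K, V and output projections (4 * d^2)
    plus a two-matrix MLP with hidden size mlp_ratio * d (2 * mlp_ratio * d^2).
    Embeddings, biases, and norm parameters are not counted.
    """
    attn = 4 * d_model * d_model
    mlp = 2 * mlp_ratio * d_model * d_model
    return n_layers * (attn + mlp)


n_layers, n_heads, d_model = 36, 36, 576
head_dim = d_model // n_heads                       # 16 dims per head
block_params = approx_block_params(n_layers, d_model)
print(f"head_dim={head_dim}, block params ~ {block_params / 1e6:.1f}M")
```

Under those assumptions the blocks alone come to about 143.3M weights, which is in the same range as the quoted 144.2M. What makes up the difference (embeddings, norms, something else) can't be determined from the comment.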


Interesting. What are the main trade-offs they expect from the switch?

