
Okay, but does anyone actually _want_ a reasoning model at such low tok/sec speeds?!


Lots of use cases don’t require low latency. Background work for agents. CI jobs. Other stuff I haven’t thought of.


If my "automated" CI job takes more than 5 minutes, I'll do it myself.


I bet the Raspberry Pi takes a smaller salary though.


There are tasks I don't want to do that only need to be delivered within 24 hours, i.e. they can run while I'm sleeping.


Where I’ve been doing CI, 5 minutes was barely enough to warm caches on a cold runner.


No, but the alternative in some places is no reasoning model at all. It's like how people don't want old cell phones or new phones with old chips, but often that's all that is affordable in some places.

If we can get something working, then improving it will come.


You can have questions that are not urgent. It's like Cursor: I'm fine with the slow version up to a point. I launch the request, then I alt-tab to something else.

Yes, it's slower, but for free (or cheap) it is acceptable.


Only interactive use cases need high tps. If you just want a process running somewhere ingesting and synthesizing data, it’s fine. It’s done when it’s done.


Having one running in the background, constantly looking at your Home Assistant instance, might be an interesting use case for a smart home.


It's local?



