In the dot-com boom there were companies spending $100+ on ads per $1 of revenue. The cost of customer acquisition was insanely high because of the e-commerce hype, and it was being subsidized by VC money and IPOs.
This AI boom feels similar: a lot of hype, and the AI usage costs are being subsidized by private equity/VC so far. IPOs are supposed to happen this fall for OpenAI and Anthropic. They're going to have to face the music of corporate governance, accounting rules, reporting revenue, earnings, etc. Subsidizing users seems unsustainable; they need to either jack up rates or downgrade the usage allowed per plan. Then there are the circular investments between all of them and Google, Microsoft, etc. Seems like a house of cards.
It’s hilarious to me that, given that one of the most fundamental challenges and frustrations of the software development profession is management's inability to meaningfully measure productivity without resorting to proxy measures, a huge and vocal community of software developers has itself become fixated on a metric that maps directly to dollars spent.
I was already astounded that, after decades of universal agreement that lines of code is a terrible metric for software engineering productivity, devs are now using it as proof that agentic software development is the future. I can't wrap my head around how lighting money on fire is now apparently something to maximize [0].
At this point capitalism has created its own circular economy without providing any benefit to its surroundings. In this specific case it's closer to a human centipede of circular funding.
This is quite a misleading title, because this is the raw API cost, but he (obviously) has unlimited usage as an OpenAI employee. Moreover, if you use e.g. the $200 Codex sub and exhaust your allowance every week, you get roughly $5k-$6k of monthly API usage, if not more, which shows that the raw API price is not what this (likely) costs OpenAI, unless they're subsidizing all of it.
He did clarify that it was with fast mode. Without fast mode it'd "only" be $300k in raw API cost, or the usage of roughly sixty $200 Codex subscriptions.
I ran 50 instances and had them all fix the same bugs at the same time, then analyzed the results of all 50 runs, having AI score each attempt, sort them, and compare them to each other in a round-robin, double-elimination tournament to ensure I got the best result. Then I had AI convert this into a skill, ran all 50 attempts again, and repeated the process to ensure I had the absolute best result. It was amazing and I used 1.3 billion tokens!
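A minimal sketch of that selection loop, for anyone who wants to incinerate tokens at home; run_agent() and judge_pair() are stand-ins for the actual LLM calls:

```python
# Minimal sketch of the 50-way selection loop; run_agent() and
# judge_pair() are stand-ins for the actual (token-incinerating) LLM calls.
import random

N_ATTEMPTS = 50

def run_agent(bug: str, seed: int) -> str:
    """Stand-in for one agent instance attempting a fix."""
    return f"patch-{seed}-for-{bug}"

def judge_pair(a: str, b: str) -> str:
    """Stand-in for an LLM judge picking the better of two patches."""
    return random.choice([a, b])  # the real judge burns ~100k tokens here

def best_patch(bug: str) -> str:
    attempts = [run_agent(bug, seed) for seed in range(N_ATTEMPTS)]
    # Round-robin: every attempt faces every other attempt; the patch
    # with the most pairwise wins is declared "the absolute best result".
    wins = {attempt: 0 for attempt in attempts}
    for i, a in enumerate(attempts):
        for b in attempts[i + 1:]:
            wins[judge_pair(a, b)] += 1
    return max(wins, key=wins.get)

print(best_patch("off-by-one-in-pagination"))
```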
Claude, somewhere in this codebase I've mispelled a common word; the word is also a homophone and further, is easily confused with another word that has three r's; please start up a subagent for each file and count the r's and verify how many r's there are's; if there's three, then make sure to review potential homophones and check that I've spelt the correct worrrd incorrectly correct.
Inference is highly marked up. Total costs including training may be subsidized, in a sense, since the AI companies are widely reported to not yet break even.
Peter shows the near-term future. The raw consumer-facing API price is arbitrary (the frontier labs can put a 100x markup on it to cover other operational expenses). The true cost of inference with same-capability models keeps dropping at dizzying rates, especially at data-center batch sizes, due to both Nvidia hardware and algorithmic changes. So the things Peter can achieve today with internal support from OpenAI will be doable by anyone in a few years without breaking the bank.
Peter shows shit. What did Peter meaningfully achieve? What additional revenue is he creating? Ah yes: shit and more shit on all counts, it seems.
But... why? I read his thing on how he spends the tokens [0] and it sounds like satire.
He has agents write shitty code for features other agents think other people want, then has it reviewed by other agents in hopes of catching bugs that the first agent put there, then has some more agents try to find security bugs in the now double-agented code to make it triple-agented and at the end of the day, he spent a shitton of tokens, probably emitted enough carbon to heat our planet by another degree, and has a feature nobody really asked for that might or might not work.
He then has the sense of humor to call this grotesque process "incredibly lean".
What's the point in all of this? What problems is this solving? Who's benefiting?
I don’t use openclaw myself anymore, but this agonizing is thin and unbearable. He did a thing. People use the thing. He got paid for the thing. He iterates the thing. What’s hard to understand about this?
The moral issues around consumption and climate impact are not his alone, and are not unique to his endeavor. Every company with an enterprise LLM agreement has a share, for instance.
Firstly, who TF would use that crap in the first place?
Yeah, he did some crap he got paid for. So did the people who created the addictive algorithms for social media, and the creators of the brainrot videos that infest kids' minds. Should we applaud them too?
You can hate it, but pretending it has no value isn't a meaningful counter, especially given its user base. Gary Tan built GBrain on it. Poor logical fallacy-ing on your part.
It was a very simple question. You created this subthread by reducing everything to "he did a thing" and calling a comment you didn't actually engage with "agonizing".
Why not rather leave it at "they wrote a comment"? What is so hard to understand about that, to use your words?
> He then has the sense of humor to call this grotesque process "incredibly lean".
> What's the point in all of this? What problems is this solving? Who's benefiting?
The economy doesn't work the way you think it does. It's not central planning. Uses aren't all detailed in a specification, submitted for approval to 100 agencies, and then allowed.
It shows lack of intellectual curiosity to not engage deeply with obviously profound technology and what the implications are. I find this exercise helpful.
Peter is predicting how LLMs will be used in the future when the prices go down. And they will definitely go down. I think his predictions are correct and we will definitely have something similar to OpenClaw.
Like one bot finding similar issues and PRs, then another bot closing issues for "lack of activity", while people react and plead to speak to a real human?
Congrats builders of the future, you've turned software development into automated voice systems.
> The economy doesn't work the way you think it does. It's not central planning.
I'm aware. That is in fact my central critique. The way it works is incredibly wasteful of our limited resources, as illustrated by this guy burning through fuel during a time of crisis for no perceptible gain.
> It shows lack of intellectual curiosity to not engage deeply with obviously profound technology and what the implications are.
The "obviously profound" is an assertion without proof.
The rest I agree with, we should engage with the implications of burning through energy to build features that bots think humans want, but nobody actually asked for, all while climate scientists are telling us we're heading for the apocalypse. It is intellectually incurious to just ignore the questions of why and at what cost, maybe even dangerously so.
> The way it works is incredibly wasteful of our limited resources
You should try playing the game “Workers & Resources”; it’s a SimCity-like game, but based on the Soviet system of central planning, not capitalism. It will make you loathe the inefficiencies of central planning.
The appropriate comparison is command vs. market economies. Capitalism is efficient at harnessing human characteristics to bring about the expansion of markets.
Mario Zechner wrote the main part of this IP laundering application.
I didn't know that studying photocopiers is suddenly linked to "intellectual curiosity". Being a photocopier maintenance guy was always considered boring.
What you put on top of the machine was intellectually interesting.
I don't understand how he is a scam artist. Lots of people are using the things he built. TBH this kind of rhetoric makes for a bit of a degrading experience on this website.
“He has /people/ write shitty code for features other /people/ think other people want, then has it reviewed by other /people/ in hopes of catching bugs that the first /people/ put there, then has some more /people/ try to find security bugs in the now /double-peopled/ code to make it /triple-peopled/ and at the end of the day, he spent a shitton of /money, the people/ probably emitted enough carbon to heat our planet by another degree, and has a feature nobody really asked for that might or might not work.”
Honestly sounds like a normal tech company to me. Just with much dumber “people” who are getting exponentially smarter, will eventually never die, and will eventually never forget.
You have to skate to where the puck is going, not where it is.
Even at unlimited budget, there is a crossover where the overhead of outsourcing thinking to the machine costs more than the machine itself.
What I mean by this:
1. Intern-, analyst-, junior-, or offshore-level coding is cheaper when done by the machine.
// Side note: there is a good reason the industry invests in suboptimal output from this group (it is how future senior engineers get made), and that benefit moves to the "cost" column when using an LLM, but nobody's accounting for that.
2. Getting the interns, analysts, juniors, or offshore teams to do the right thing costs a multiple of the coding effort: the PdM/PjM work of course, but also the stakeholder, product owner, architect, principal engineer, QA, and SRE work.
3. If you are not a principal- or staff-level engineer, you are likely unqualified to catch and fix the errors LLMs make across engineering, much less across the rest of the PDLC loop (the product development lifecycle, which includes the SDLC and SRE).
4. For LLM output to be useful, your 'harness' has to incorporate all of that as well, which, because it is so much harder than transliterating spec to code, balloons token usage exponentially.
5. Today it is faster, more efficient, and cheaper to work with LLMs "XP" (eXtreme Programming) style: pairing with the LLM, actively co-creating and co-reviewing, and steering for more effective turns.
So, your options are:
- ship garbage while costing less than a median first world SWE
- pair with the LLM actively for the benefits of XP
- add enough harness and steering that the LLM costs more than SWEs, and still needs a human in the loop, "move fast and break things to find out what's broken" style
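A back-of-the-envelope version of that crossover, with all numbers invented purely for illustration:

```python
# Back-of-the-envelope crossover, all numbers invented: once the harness
# multiplies token use, the machine can cost more than the SWE it replaces.
TOKEN_PRICE = 10 / 1_000_000      # $10 per million tokens (assumed)
RAW_CODING_TOKENS = 20_000_000    # tokens/month for spec-to-code alone
HARNESS_MULTIPLIER = 15           # PdM/QA/SRE/review loops, per point 4
SWE_MONTHLY_COST = 15_000         # fully loaded median first-world SWE

machine = TOKEN_PRICE * RAW_CODING_TOKENS * HARNESS_MULTIPLIER
print(f"machine: ${machine:,.0f}/mo vs SWE: ${SWE_MONTHLY_COST:,}/mo")
# At these prices, a harness multiplier of ~75 would hit the crossover.
```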
I would expect that within a couple years, these other disciplines can be baked in enough the machine costs less for everything but surprises.
> I would expect that within a couple years, these other disciplines can be baked in enough the machine costs less for everything but surprises.
They already are. I'm successfully using frameworks like bmad to deliver complex apps at that level. My job is to manage the QA, UX, and SRE processes and catch errors.
I spend more time refining PRDs, epics, and stories than I do elbows-deep in code.
If I don't like the output of a story, I nuke it, change the story, and have the agent try again.
I'm using the open-source GLM, Kimi, and DeepSeek models. I expect the full pipeline to be good enough by the end of the year.
> I spend more time refining PRDs, epics, and stories than I do elbows-deep in code.
And do you enjoy this more than writing code? I used to look forward to writing code, solving these little optimization puzzles, learning, and staying sharp. Working with agents is dreadful in comparison. They lie, rarely learn, and I feel like a proctor.
Sure, you sometimes get to see something amazing, but usually I am just very annoyed by their performance and the ever-changing but never-ending billing issues. First with Claude Code, now with Codex, which was fine for a minute, but now I am out of tokens the majority of the time. (I don't have the income for those Pro INTx plans.)
I think it's less misleading this way, because every other reader would have to pay $1.3M to emulate his workflow for a similar-size project. His discounted internal costs are relevant only to OpenAI.
But even the $5k-$6k monthly usage on a $200 Codex subscription, going over its limits, is unrealistic in the long term, and that is just ONE person.
Let's say I was at the casino and spending a lot on casino chips, but I also happen to work at the casino. I'm not really losing money whether I win or lose, since I'm playing with the house's money and there's little risk in every dice roll or press of the button. The risk is far higher if I don't have that level of access and continue to spend the same amount of money on lots of tokens (or casino chips, spins, or button presses).
The same is true here with these agents. Some companies will realize that they can no longer afford to spend millions a month on tokens, or, for startups, even $5k-$6k per person per month.
I can only see efficient local models making sense as a way to recover from this unnecessary spending, or even from light gambling on tokens.
"All that automation allows us to run extremely lean"
He has a different idea of what it means to be lean than almost everyone else. That's fine, he's allowed to, but it's something you have to understand to make sense of any of his comments. He has a radically different set of values from most people.
His team is basically him and two other humans, powering an ambitious well-known project so successful an industry titan ended up acquihiring him/them. That's pretty lean, no?
The ambitious idea is actually giving a chatbot/agent access to a bunch of personal data and having it self-modify its harness and context to some extent.
If these methods prove successful it isn't going to matter. A user doesn't care if code is 'slop' or artisanal, so long as the app/site/whatever works.
If you can combine autonomous flows (and millions of dollars in tokens) to produce work comparable to a traditional engineering team, then why would the user care which wrote the app/site/whatever?
But it's a self-fulfilling prophecy. They need all this stuff because it's a vibe-coded app where bugs are randomly introduced, the architecture is overcomplicated and sucks, and stuff is added just for the fun of it.
Do existing companies run entire end-to-end product integration tests on every single change they make to a repo to make sure something hasn't broken? No, they just architect things in a way such that a minor change to something can be tested in isolation. And that can be automated, deterministically and efficiently.
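To make "tested in isolation" concrete: the boring version is a pure function behind a stable interface, checked by a fast, deterministic test. A trivial sketch:

```python
# Trivial sketch: when logic is isolated behind a pure function,
# a minor change is verifiable by a fast, deterministic test.
# No end-to-end run, no agent in the loop.

def apply_discount(price_cents: int, percent: int) -> int:
    """Return the discounted price, rounded down to whole cents."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be within 0..100")
    return price_cents * (100 - percent) // 100

def test_apply_discount() -> None:
    assert apply_discount(1000, 0) == 1000
    assert apply_discount(1000, 25) == 750
    assert apply_discount(999, 50) == 499  # integer floor: deterministic

test_apply_discount()
```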
Where I work we can release changes to our production site in minutes almost completely autonomously with high confidence with absolutely zero AI agents in the loop. How did we do it? With lessons learned from the past 5 decades of professional software development experience.
Let's not forget what OpenClaw is at its core: a glorified cron scheduler. Why on earth does any of this effort need to exist? It's not that deep, it's not that complex; it's all AI for AI's sake.
OpenClaw has surprisingly few "dumb" bugs. Is it as stable and secure as the Linux kernel? God no, obviously not. But it has never just crashed for me, for example. Bugs are of the type "X with Y and Z disabled and T turned on - doesn't work", where you're likely one of a few people that have ever tried this combination. Not to mention it can then debug itself and file a bug report, with a bugfix - if you give it a GitHub token.
I run it in a firewalled VM and am very conscious about any tokens I give it access to - so far for all I know this was unnecessary.
PS: for me the core feature of OpenClaw isn't the cron, though that is nice. It's the memory and the instant extensibility. It takes 5-15 minutes to add, say, an SSH tool where all agent requests go through manual review, together with a good auto-loaded description that just works in all future sessions.
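For a sense of the shape of that SSH tool, heavily simplified; the names and registration details here are made up for illustration rather than OpenClaw's actual plugin interface, and the manual-review gate is the point:

```python
# Hypothetical sketch of the "SSH tool with manual review" idea.
# TOOL_DESCRIPTION and the surrounding wiring are illustrative, not
# OpenClaw's real plugin API; the human-approval gate is the point.
import subprocess

def ssh_tool(host: str, command: str) -> str:
    """Run a command on a remote host, but only after a human approves."""
    print(f"[agent requests] ssh {host} {command!r}")
    if input("approve? [y/N] ").strip().lower() != "y":
        return "denied by human reviewer"
    result = subprocess.run(
        ["ssh", host, command],
        capture_output=True, text=True, timeout=60,
    )
    return result.stdout + result.stderr

# The auto-loaded description is what makes this "just work" in future
# sessions: the agent reads it and knows when to reach for the tool.
TOOL_DESCRIPTION = (
    "ssh_tool(host, command): run a shell command on a remote host. "
    "Every invocation is shown to the human operator for approval first."
)
```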
For the few weeks in which I've been using it, it has brought down the Raspberry Pi it's running on several times with extreme resource hogging, local history/memory search is broken due to a trivial bug for which all issues are auto-closed by bots, and it has changed its configuration standards a handful of times in ways that broke my instant-messaging access to it, just to name a few gripes.
This is clearly an implementation and not a conceptual issue, as I had none of these issues using the same model with Hermes, for example.
> And was he 5x more productive in those 30d than a years worth of a dev making 200k/yr?
He was, when it comes to marketing. This is what most people don't understand. Peter is a great marketing guy who got hired because of a hype vision, not because he is an outstanding engineer. Think of it like OpenAI hiring the MrBeast of the coding world.
I'd say he is an outstanding engineer as well. He may favor output over security more than outstanding engineers did in 2025, but in the 2026 world what he does is impressive. And with OpenAI's resources he has turned OpenClaw's security woes around. The latest versions are much more secure than they were two months ago.
It’s much worse than that. Openclaw cosplays security, which is much more dangerous than just outright defining the security model as “this is new and risky, better sandbox this thing thoroughly at a different layer”.
It sucks at both security and usability as a result (all the vibe-designed security layers are constantly getting in my way).
If you review the OpenClaw release schedule and code output you will see that yes, he was. I'm not saying you'll like what you see, but the OpenClaw release cadence is well beyond any human's ability to assess it.
With a lot of these AI tools, yeah, they release very often. But half the features they add aren't even that useful. They just add shit because they can, and they introduce bugs and change behaviour all the time.
Opencode has the same problems. They often do multiple releases a day, yet within the span of a week or two I've had to update my config because some random change altered behaviour and broke my permissions. Or I've noticed the app suddenly renders differently.
Yet, my day to day usage has barely changed since the version I installed last year. It's like everything changes but nothing changes.
Even Claude Code has this happen, though perhaps to a lesser extent. I'm getting really tired of new bugs popping up on me, or subtle behaviour changes near daily that require me to change things. The most annoying one, just introduced, is a giant spew of context-mode crap that Claude aggressively adds to every CLAUDE.md file, and I can't find a way to turn it off. I just have to `git checkout CLAUDE.md` repeatedly right now. If I have to add a bash alias to work around your annoying bug, that's pretty bad.
I read the OpenClaw subreddit for comedy. Every release brings floods of posts about how everything is constantly broken and people stopping using it because of how broken it is.
Oh I agree. They've said LTS is coming; that will be a relief. I wonder what "LTS" means in this context. Monthly? I'd settle for it just not randomly dying on point-version updates to config files, TBF.
I am not joking when I say this: if you pay me 1.3 million dollars today, I will get so much more done with just a single $200 Codex sub in 30 days than he has in 30 days. I can promise you that.
I just checked the code and feature output, and I could build all of that in 15 days for 1.3M USD. Fuck, I would do it for 1M...
Scratch that: if it's 300K, then sure, I could do the same, if you paid me that for 30 days of work. Lmao, the quality and feature volume are just not worth paying that much money for.
I am not saying this because I dislike LLMs or think AI coding can't work, but folks, whatever OpenClaw has built for that much money is not worth nearly that much...
Yes, because doing things 12x faster is extremely valuable. How much do you think companies would pay to do things 12x faster? Your competitors take a year to ship a feature that you can ship in a month
Generating 12x the amount of code/commits/releases isn't a useful metric. That could just as easily be 12x more code to maintain or iterations needed to get it working. Products are measured by the value they provide, not by the resources it cost to create them.
Regardless of one's opinion about AI, from a product perspective this seems somewhat similar to a dev using their 48 GB RAM machine and latest iPhone to test an app that will be used by consumers on entry-level devices.
The mentioned menu bar app is a MITM (man in the middle) and rightly discloses that it gets all your session creds and uses them, along with Keychain and Full Disk Access:
Privacy: Reuses existing provider sessions — OAuth, device flow, API keys, browser cookies, local files — so no passwords are stored.
macOS permissions: Full Disk Access for Safari cookies, Keychain access for cookie decryption and OAuth flows...
It's excellent this is disclosed as a reminder of how things work and the tradeoffs you're making to use it.
After trying OpenClaw a bit myself, no wonder. Without the best models, capabilities drop significantly. And I guess he has a lot of automations and such, which explains the $19,000 daily spend. I hit my personal spend limit when it cost something like $40 to get Google auth tokens working, which is very complicated when you run OpenClaw on a VPS. And it even broke about a week later. Maybe one could justify the $40 if it saved me time, but I was babysitting OpenClaw the whole way through, so I actually spent double: money plus time.
Btw, same frustration for me setting up Signal, WhatsApp, or Slack...
I work at a big tech company and we're being measured on how many tokens we consume.
We know it's totally stupid, but unfortunately tokenmaxxing is real. I know our management line isn't that dumb, but this is what you get when the business is selling tokens.
A lot of online presence seems to be tied to consumerism, that is, to consuming anything, the more ostentatious the better. This is just the digital version of that.
Or, more probably, he will be wrong. We really need to stop amplifying marketing statements like the bullshit Huang and Amodei tell all the time. There's not much thought behind them, just marketing and wishes.
Cui bono? Jensen Huang wants you to believe AI is a necessity and that we will need 1000x the energy because he gets even richer if you believe him. It isn't true, though.
I would bet money Anthropic and OpenAI are actually profitable on inference. The problem is they have to spend large sums of money to train models that are essentially worthless after a few months.
They make more money from inference than they spend training the model, but then the next model gets so much more expensive to train that their annual figures have been in the red.
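As a toy illustration of that claim, with numbers entirely made up:

```python
# Toy illustration with made-up numbers: inference can be margin-positive
# while the company as a whole stays in the red on training spend.
inference_revenue = 4_000_000_000   # hypothetical annual inference revenue
inference_cost    = 1_500_000_000   # hypothetical serving (GPU, power) cost
training_cost     = 5_000_000_000   # hypothetical next-model training run

inference_margin = inference_revenue - inference_cost   # +2.5B: profitable
net_income       = inference_margin - training_cost     # -2.5B: in the red
print(inference_margin, net_income)
```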
One could say "that's a great point, we should take more direct ideological action to address this issue!", but expounding upon the finer details would likely get one banned here.
What I truly don't understand, as a daily heavy Opus 4.7 user, is how you can coherently prompt 15 different parallel conversations at the same time.
For me it's not even a "what the hell are you working on" so much as a complete inability to understand how you can keep so many different processes working on distinct tasks. It simply doesn't map onto how I use these tools.
I spend most of my day writing extremely detailed prompts and that's how I'm able to get the sort of excellent results that confound skeptics. But I have to be honest with you: I don't think I can write (or think) fast enough to do two of these at a time, much less 15.
I definitely could not review what they are generating with any degree of confidence.
I'm really hoping you can explain what the heck your usage pattern actually looks like, because reading this makes me feel like I'm missing something.
Yeah good luck with that. I find SystemVerilog is probably the thing that AI is worst at, presumably because there's not that much training data out there, and pretty much everything about the commercial tools is paywalled.
AI bros love hyping about their insanely inefficient token usage. It's become some sort of a dick-measuring contest. And if you work for OpenAI, of course you can claim insane measurements.
Just last week I saw a dude boasting about how he used his $20/month ChatGPT subscription to earn $15 (or some similarly trivial amount) in a bug bounty by running the model all day. Sam Altman replied to that tweet, but not entirely positively.
OpenAI has been removing limits on token usage to take on Anthropic, but I'm sure most of the users they are acquiring are these AI bros burning tokens for the sake of it. Massive price hikes are coming after the OpenAI and Anthropic IPOs, probably an order of magnitude larger than what happened to ride sharing.
tl;dr: Peter Steinberger shared a product demo for CodexBar [0] with a graph of OpenAI token usage. The graph shows one million dollars spent in total, a preference for gpt-5.5, and twenty thousand spent today.
However, I do not see a strong reason to believe that this is his actual, personal usage. It could be all OpenClaw usage, or some subset of OpenAI usage, given that he is inside the company. I suspect it is far more likely to be fake data [1] that exercises the graph library in a visually satisfying way. Notice that it shows no usage for a "week" after April 15 (a Wednesday), but picks up a bunch later. As marketing copy it needn't have any basis in reality [2]. I should hope OpenAI would put a procedure in front of their entrepreneur acquisitions that prevents accidentally exposing trade secrets [3].
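For what it's worth, demo data like that is trivial to produce. A purely illustrative sketch of a spend series that would exercise a usage graph nicely, quiet week included:

```python
# Purely illustrative: a synthetic daily "token spend" series of the
# kind that exercises a usage graph nicely, quiet week included.
import datetime
import random

random.seed(42)
start = datetime.date(2026, 3, 15)
quiet = (datetime.date(2026, 4, 16), datetime.date(2026, 4, 22))

series = []
for day in range(60):
    date = start + datetime.timedelta(days=day)
    if quiet[0] <= date <= quiet[1]:
        spend = 0.0  # the suspicious no-usage week after April 15
    else:
        spend = random.lognormvariate(9.8, 0.4)  # long-tailed, ~$18k median
    series.append((date.isoformat(), round(spend, 2)))

print(series[:3], "...", series[-3:])
```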
I view this type of post (his, not yours) as meta-deception. I only became aware of this type of deception and its power from reading a bit about magicians and stagecraft in the last few months. There's a video on YouTube that does a great job of breaking down a Derren Brown stunt that uses it to great effect, manipulating the TV viewing audience.
I'd actually seen the original DB episode years before, when it first aired, and it definitely had an effect on me through this form of manipulation: it altered my internal understanding of marketing/advertising, which was the actual underlying purpose of the episode.
It’s altered how I internally accept and process information from any 2nd or 3rd hand source. BTW, people aren’t necessarily always aware they’re doing it. We all suffer from our own internal biases and deceptions, and sometimes we spread them unknowingly!
I built my personal app mostly with Ollama and it's been smooth sailing so far. Basically OpenClaw + Hermes-style agents running on Android phones, and the stuff it can do is kinda insane.
I've worked for companies with less than that in total personnel expenditures (100-ish people) that were actually economically productive and provided a tangible service. With 15 million and a year(ish) I could spin up dedicated teams to build multiple profitable applications. That sounds like hubris, and maybe it is, but I'm pretty confident given my domain knowledge.
Why just him, and not every other user on HN who has said things like:
"Programmers don't need unions or professional standards, it will stand in our way of making as much money as we can, and will slow down the speed of software development".
That said, HN does provide the tools to find users who said things like this, and if I weren't lazy I'd love to find at least a few who said the above but are now pearl-clutching over AI being so bad it's going to crush things down to singularities.
What a clown. And Twitter bozos will cheer and clap. As far as money spent, this is still much better than rounding up and/or bombing brown people, but shows insanity of the current market. The saddest part is that bootlickers/temporarily embarrassed AI millionaires will defend this.
And of course I'm just yet another envious hater from "the orange website". Your conscience is clear, AI bros. /s
I'm sorry, where did you get millions of deployments? I see 300k+ GitHub stars, which are worth about as much as bookmarked pages (why don't we count those too?), and 2 million (alleged) website views, which is also next to nothing.
This site digs in more: https://www.trendingtopics.eu/openclaw-numbers/, and refers to stats from gradually.ai. Stepfun flash alone had 3.4 trillion tokens used for OpenClaw as of mid-April. That's not counting GLM, Kimi, Claude (which was being used so heavily for this that Anthropic instituted emergency policy changes mid billing cycle), etc. In fact, Hermes, a smaller competitor harness from Mistral (153k stars), was large enough to have a custom "kill" pathway in Claude Code (https://github.com/anthropics/claude-code/issues/53262).
I don't feel the need to spend all day auditing, and I don't care very much, but generally I think the combination of Nvidia corporate enthusiasm, the available GitHub stats, and industry analysis all tell a pretty coherent story: a project with 70k forks on GitHub is likely to have more than, say, 700k users. My own fork-to-usage ratio is far lower than that.
Put another way, I would suggest that most public evidence points one direction. If you believe something else, that’s fine. But if you want to convince me there’s less than, say, 100k deployments worldwide, I’d want to understand where those numbers came from before being convinced.