How do you figure that? “also a reminder that as soon as Chinese models take the lead, they will switch to closed source too”
What specifically about their release strategy “reminded” you of that conjecture?
The premise that they only open source the models … because it somehow helps them leapfrog American labs, and once they actually can leapfrog them, they’d close source them, doesn’t really track for me. Am I missing something?
I mean I think we need our own domestic open weight labs. I just don’t particularly understand the point you’re making
The point I’m making is that this has become a strategic resource. The Chinese government allows wide sharing of their models because it weakens the US position.
If Chinese models become better than Americans, do you believe the CCP will allow the free distribution of their flagship models?
Why wouldn't they? It keeps strengthening their position. It's an incredible source of soft power if they're seen as the place to look for good AI, and what's more, you can self-host it or hire a local provider if you're worried about data sovereignty.
I guess it's a possibility, but I don't have that kind of expectations from major world powers. It's not like the CCP is a beacon of human rights either.
‘Why wouldn’t anyone give away frontier AI?’ sounds like ‘why wouldn’t anyone give away uranium enrichment?’
i.e. I can’t comprehend the state of mind and the world model of anyone asking a question like that, which is apparently quite a few folks here on HN!
They already are, to an extent. If we believe Amodei's nutjob take that Mythos/Fable are the end of the world in the wrong hands, we should have an open source Chinese model within 6-12 months that's already end-of-world level, so the cat is going to be way out of the bag long before the US labs go out of business.
> should have an open source Chinese model within 6-12 months that's already end-of-world level
that's the exact thing I'm talking about. I don't see why half the people around here are so sure that China will continue to release anything at all. they are releasing non-frontier models on a 6-month lag, yes, but the reasons to release them are overshadowed by the reasons not to do that for mythos-class models. IOW why would they give away a dual use technology just like that?
> the reasons why to release them are overshadowed by reasons to not do that for mythos-class models
Why? What are those reasons? How come they don't already exist for DeepSeek V4 or GLM-5.2?
By the way, I'm not going to entertain the "mythos-class" phrasing because I really don't think it's important. I don't believe Anthropic's take on it being the threshold towards the end of the world that their marketing insists it is.
I’ve been piloting frontier LLMs for as long as anyone outside of the labs and I just disagree. It is a tier above for some tasks (especially in my usage) and not a downgrade on anything I tried it on. This is enough for me to rank it higher; ymmv.
I've only briefly tried it and it did seem quite capable for what I was doing, but not that much better than the Chinese models I've been mostly using.
In any case, this [0] seems to paint a more reasonable picture than "it's much better than anything else at everything".
Not necessarily. "Commoditize your complement" is a common strategy. The USA & Europe are more services-heavy than China, which seems to have the advantage in manufacturing these days. If AI trained on everybody's data can replace some of those services, then it reduces China's dependence on others, increases other countries' demand for China's manufacturing, reduces those countries' dependence on the USA & Europe, and weakens the USA's & Europe's bargaining chips in any future negotiation.
They would still be at a significant compute disadvantage and deploying them worldwide seems to be how they work around that currently as they put together a homegrown alternative.
Oh i don't expect this to happen any time soon, but they are making progress on the UV lithography side, so it's just a matter of time until it becomes a TW race, and they have the advantage on that terrain.
And I think we're at human-level intelligence for restricted tasks now. it's not the big bad AGI* we were promised, it's more like Rainman that needs a handler, but that doesn't make it any less useful. So I'm not sure what this future event will signify.
*And the ASI IMO doesn't happen without robots going full von Neumann replicator. Something I don't expect to happen any time soon.
I’m going to shamelessly reuse the Rainman that needs a handler analogy
More seriously, the epistemic doubt relating to the evolution of these machines is quite something… what do we do if “intelligence” doesn’t have a ceiling, and we end up a bunch of (comparatively) dumb monkeys with AI caretakers/handlers?
Absolutely, wouldn't be the first phrase I've pushed into meme space ;-)...
What happens if the AIs get smarter than us at doing things? Well, I always hired smarter people than myself at the things I needed to get done. But if you're worried about them realizing they can get smarter doing the things at which you are the expert, the long-term is likely BCI and even more blurring of the definitions of sentience and consciousness IMO. And with 20-30 years left on my lifeclock, I'm not sure I will live to see that day, but I absolutely do think I will be around long enough to see a few miracles like the end of cancer and Alzheimer's.
Thankfully this isn’t the case, but given that true believers actually think this and go on trying to build it, it seems they may not belong in human society or at least they deserve a bit of a spanking for trying to genocide mankind
I'm not an accelerationist out to build the ASI at all costs no matter what ASAP, but if I take the long view in combination with the Dark Forest and Fermi's Paradox, it seems like if we don't ultimately follow this path to its end, someone else who did genocides us instead. I don't see why it has to end badly for us, but I get why letting the current crop of power drunk mean girl billionaires crash the collective car into a tree in pursuit of it does.
What makes you think there is a ceiling to intelligence beyond energy (of which there's a lot more to harvest yet if we just pulled our heads out of our fossil fueled asses)?
Maybe, but it could also be that they’re looking closely at the risks and negative externalities of the way things are currently being done in the US. I.e. by and for the disproportionate benefit of a tiny elite, allied with a very polarizing and unpredictable political leadership, while the vast majority are incredibly anxious and resentful about it all.
China is currently ahead in all aspects of “AI” other than the specific niche of frontier LLMs, and for all their faults seem more interested in maintaining social cohesion (which has its own dystopian aspects, obv) and disseminating the technology and its presumed benefits throughout society, rather than “beating the US”.
I’ve been exceptionally displeased with Claude Code since end of February and switched completely to Codex in April. The blasé way in which one person (Borris) capriciously changes the system prompt multiple times a day, also no longer writing his own prompts (whatever that means).
That, the 5 different secret levers you have to pull to make it not stupid, the fact you have to go to the guy’s twitter account to find all the un-dumbing features and flags that aren’t documented anywhere else. That they decrease thinking budgets silently when they run out of compute instead of announcing the rationing, and gaslighting users at every step of discovery. The fact that internally they have their own coding harness and don’t use Claude Code primarily. The lack of formal evals and consideration for millions of users’ collective hundreds of millions of hours of investment in their workflows. That’s all off the top of my head, let me tell you how I really feel about what they did to Claude Code..
I adore gpt5.5 and maintain my own codex fork - but I have no idea how long I’ll get this performance / cost - I know it won’t be forever. I’d like to know precisely how much it’ll cost in hardware to run a gpt5.5 open source model locally. Hell, I’d also be open to a lifetime license to a model I can run locally.
But I like building my own tools, from software to physical shop tools. I like being able to rely on my tools.
More responding here to the assertion that this is blowing up due to Fable.
It doesn’t matter if people have to suddenly live by gas turbines that run 24/7 because why again? Can you repeat that last part back to me but say it a little dumber for me?
This might be the best color palette generator I’ve ever seen. I used to work in Operating Systems, and trying to get a good color palette from a photo is HARD. A lot of very smart very well paid people have dedicated years of their life to this type of thing. Really fantastic work.
If the author of the blog post ever comes across this thread/comment, bravo and I hope you feel pride in your work and, I’d go so far as to say, discovery.
Agreed - I remember implementing colour quantisation in MATLAB at university and it seemed simple enough, though we only used it for some simple cases (to learn the theory more than anything). Looking at some of the example images there it looks like it's easy to hit edge cases.
I agree with the kudos, but back in the early 2000s, when I was an interesting person, I stumbled on this same/similar approach of using K-means clusters with LAB color space for a painting algorithm I was using in my masters project. RGB was not effective.
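For anyone who hasn't seen the K-means-in-LAB idea before, here is a minimal pure-Python sketch of it. To be clear, this is not the blog author's code and not what I wrote back then. The names `srgb_to_lab` and `kmeans_palette` are made up for illustration, the input is assumed to be 8-bit sRGB with a D65 white point, and I've swapped in a deterministic farthest-point initialisation so the result doesn't depend on a random seed.

```python
def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIE L*a*b* (assumes D65 white)."""
    def to_linear(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (to_linear(c) for c in rgb)
    # Linear sRGB -> CIE XYZ
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b

    def f(t):
        # Cube root above the CIE threshold, linear segment below it
        return t ** (1 / 3) if t > 216 / 24389 else (24389 / 27 * t + 16) / 116

    fx, fy, fz = f(x / 0.95047), f(y), f(z / 1.08883)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def dist2(p, q):
    """Squared Euclidean distance between two 3-tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_palette(pixels, k, iterations=20):
    """Pick k representative colours from `pixels`, clustering in LAB."""
    lab = [srgb_to_lab(p) for p in pixels]

    # Farthest-point initialisation: start from the first pixel, then
    # repeatedly add the point farthest from all centres chosen so far.
    centers = [lab[0]]
    while len(centers) < k:
        centers.append(max(lab, key=lambda p: min(dist2(p, c) for c in centers)))

    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for point in lab:
            nearest = min(range(k), key=lambda c: dist2(point, centers[c]))
            clusters[nearest].append(point)
        for c, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centre
                centers[c] = tuple(
                    sum(m[d] for m in members) / len(members) for d in range(3)
                )

    # Report each cluster as the original pixel closest to its centre,
    # which avoids needing a LAB -> sRGB conversion.
    palette = []
    for center in centers:
        idx = min(range(len(lab)), key=lambda i: dist2(lab[i], center))
        palette.append(pixels[idx])
    return palette
```

Calling `kmeans_palette(pixels, 5)` on a list of `(r, g, b)` tuples gives back five of the original pixels. The reason to cluster in LAB rather than RGB is that Euclidean distance in LAB tracks perceived colour difference much more closely, which matches my experience that RGB was not effective.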
I used the API version for quite a while before using a subscription, which I now have used extensively for many months.
So, is your claim that they just slow down and queue the subscription version, or are you accusing them of using nerfed models, or is it something else? The only time I ever get some slowness has to do with the models being overloaded and has nothing to do with limits. Those are two separate concepts you seem to be confusing. And luckily, this is pretty rare for me since I don't work during US time zones.
Huh, seriously? Have you ever worked in an office? Perhaps your mental picture of what op is describing might be misaligned? I just always assumed it was a rarer/ more disciplined style some people had
Remember when they shipped that version that didn't actually start/ run? At work we were goofing on them a bit, until I said "Wait how did their tests even run on that?" And we realized whatever their CI/CD process is, it wasn't at the time running on the actual release binary... I can imagine their variation on how most engineers think about CI/CD probably is indicative of some other patterns (or lack of traditional patterns)
As someone that used to work on Windows, I kind of had a vision of a similar in scope e2e testing harness, similar to Windows Vista/ 7 (knowing about bugs/ issues doesn't mean you can necessarily fix them ... hence Vista then 7) - and that Anthropic must provide some Enterprise guarantee backed by this testing matrix I imagined must exist - long way of saying, I think they might just YOLO regressions by constantly updating their testing/ acceptance criteria.
Why not provide pinnable versions or something? This episode, and the 2 wasted months of suboptimal productivity, points to the absurdity of constantly changing the user/system prompt and doing so much of the R&D and feature development in two brittle prompts with unclear interplay. So until there’s something like a composable system/user prompt framework they reliably develop tests against, I personally would prefer pegged, selectable versions. But each version probably has known critical bugs they’re dancing around, so there is no version they’d feel comfortable making a pegged stable release..
That was actually an interesting case of things that CI/CD don't tend to catch.
It failed to start because it failed to parse the published release notes.
In the CI/CD system it would have passed, because the release notes that broke it, hadn't been published yet.
Those release notes also took down previous versions of claude-code, so rolling back didn't help users.
The breakage wasn't a change in the software, it was a change in the release notes which coincided with the change in the software.
Now, should it have been grabbing release notes and parsing them? No, that's unbelievably dumb (and potentially dangerous), but it wasn't an issue with missing CI/CD, but an interesting case-study in CI/CD gaps and how CI/CD can actually lead to over-confidence.
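To make the gap concrete, here is a rough sketch of the defensive pattern that would have contained it. I have no idea what the real code or release-notes format looks like, so everything here is hypothetical: the JSON shape, `parse_release_notes`, and `start_app` are all invented for illustration. The only point is that data fetched at runtime gets treated as untrusted and optional, so a bad payload degrades to "no notes" instead of "won't start".

```python
import json


def parse_release_notes(raw):
    """Parse release notes fetched at runtime without ever raising.

    Hypothetical format (an assumption, not the real one):
    a JSON list of objects like {"version": "1.2.3", "notes": "..."}.
    """
    try:
        data = json.loads(raw)
    except (TypeError, ValueError):
        # Malformed or missing payload: skip the notes, keep running.
        return []
    if not isinstance(data, list):
        return []

    entries = []
    for item in data:
        # Keep only entries that match the expected shape.
        if isinstance(item, dict) and isinstance(item.get("version"), str):
            entries.append(
                {"version": item["version"], "notes": str(item.get("notes", ""))}
            )
    return entries


def start_app(raw_release_notes):
    """Hypothetical startup path: release notes are optional decoration."""
    notes = parse_release_notes(raw_release_notes)
    return {"started": True, "release_notes": notes}
```

The matching CI step is then to run startup against deliberately broken fixtures (garbage bytes, wrong types, empty responses) rather than only against whatever happens to be published at build time.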