Hacker News | artdigital's comments

As much as I like Gemini CLI and don’t like them shutting it down, I think it’s good some of the offerings are getting unified. There was too much fragmentation in the google offering and this is making it a tiny bit better.

You really like Gemini CLI? IMO, among all the model-zoo providers it's really the worst, and they didn't even update the models in it for days after each release; I had to resort to weird hacks via MITM.

They used to have Antigravity and Gemini CLI.

Now they have Antigravity IDE, Antigravity 2, and Antigravity CLI.


Grok voice model is also a thinking model. I agree that it’s far better than the other voice models

Just give me an option to have a slower response but a better model…


This is what makes their voice mode unusable to me. I can’t stand the way 4o replies and it’s such a big jump in quality from text mode


Grok is my favorite model for chatting, and my favorite voice mode. It seems to be the only voice mode that isn't routing to an extremely cheap model (like Haiku), and has been the highest quality out of all the frontier ones. When you subscribe to SuperGrok you can also create a "council" of agents, each with its own system prompt, and when you ask something, they all get asked in parallel to come to a conclusion. Good stuff!

Just wish they would finally put some work into their apps, it's the only thing keeping me from actually subscribing to SuperGrok:

- No MCP / connected apps support. It's been teased but here we are, still not available. I can't connect Grok to anything, so I can't use it for serious work

- Projects are still not available in the app so as soon as you move something into a project, it's gone from all the native apps

- No way to add artifacts (like generated markdown docs) directly to a project, we have to export to PDF/markdown and re-import. And there isn't even a way to export artifacts. This makes serious project work hard because we can't dynamically evolve projects with new information

- No memory, no ability to look up other chats, each chat is completely new

- No voice mode in projects at all

If someone from xAI is reading this, please consider adding some of these.


Starting to like the lack of memory. Claude remembers I have a grill and will interject in conversations about how maybe this thing would go well with BBQ when it's unrelated or just also about food.


This is so obnoxious. I ended up deleting all the memory from Gemini because it ended every response with, "As an engineer, father of X, you'll love this because...". As if I want my occupation and the number of children I have to be relevant to which lawn mower I buy.


Yup. I finally went into settings and disabled memory altogether. Every chat is a fresh slate now, the way it should be.


Haha I recently asked Gemini for a product comparison for USB-C GaN chargers and it randomly inserted "as a Software Developer at $COMPANY working remotely, you may find the 100W fast charging useful when using your company laptop while travelling."

Like, thanks, really useful stuff (and definitely worth the creepy vibes to include that).


Gemini thinks my name is my brother-in-law's name, and despite my explicitly telling it that's not my name and digging through the settings, it still amusingly calls me the wrong name.


I'm a network engineer and Claude loves to make analogies to network routing protocols and such. They are often very creative. You can actually edit the profile Claude makes of you. It can be very funny to say you are a professional clown or mime or something equally odd. I wonder what analogies it would create for horse semen extractor?


You can turn that off in settings.


I have that disabled. I tend to use different chats as the LLM equivalent of private browsing, so I like it to not have memory transferred between them.


:D that's like my Claude where it loves to point out that I have an ADU in the backyard in unrelated situations.


I like my Python with hot sauce.


I also think Grok would benefit from allowing usage of "SuperGrok Heavy" (their $300 plan) in coding harnesses with included usage. Currently they give you some API credits on the Heavy plan so you can use some Grok for coding, but $300 USD value is just not there.

Not saying they should create their own grok-code harness, just allowing usage in existing ones would already be beneficial. But that's probably what the Cursor acquisition is going to do eventually


The Gemini app voice mode uses one of their more recent models (and not some gimped small one), and is very capable. The personality is also fine, much more natural than the Gemini web chat, with my only complaint being its insistence on suggesting a "next step", which seems to be something that they all do.

I'm not sure if the "next step" is just to drive cost up for you (which makes no sense for the free version), or because they are all failing to learn more natural conversational patterns and to distinguish questions that beg for a quick answer from a longer exploratory conversation where a next step may have some value. Either way, it would be nice if these models would follow an instruction to NOT do it!


I think the "next step" instruction is more about engagement than cost, basically giving the user some options to continue the chat. I have always had success by ending the prompt with "only reply with nothing else but the answer to the query in a precise way". This usually works better than telling it not to ask leading questions etc.; a straight-up expectation of the answer format you need is an instruction that most models can follow imo


I find that asking Gemini "just the answer, no follow up" etc works at best for one or two conversational turns, sometimes none!

The problem seems to be the way it in effect overweights the system prompt vs user input, so it quickly ignores things like this that conflict with the system prompt.

This is kind of a case of the bitter lesson - the conversational patterns of these models would be much more natural if they just let it learn them, and respond in a context appropriate way, rather than this crude system prompt way of forcing it to respond in the same way always, regardless of input or of how much the user tells it to shut up!


The “next step” is in the system prompt, not the model. Gemini leaked part of its system prompt to me a few days ago, and there was something in there encouraging it to ask the user what they wanted to do next at the end of its response. Something about “give the user 1 or 2 options for follow up”.

I honestly find it rather annoying, but Gemini has stopped doing it to me for the most part, so maybe they’re trying out a new system prompt.


An interesting side bit about the Gemini voice model is that you can use it in AI Studio and type messages instead of using the microphone.

On the backend Google does TTS to feed the model, which then speaks back to you via sound on your speakers.


I use ChatGPT all of the time, but the model backing the voice mode (or its settings) is intensely stupid.

If Grok is actually good here, they will have a customer!


I could be wrong but I think the voice mode that chatgpt uses is still a 4.something model.


IMO everything you mention is the reason for the Cursor deal.


When I signed up, I accidentally paid for a full year. So from time to time, I'll throw it something just to see what it produces compared to the other LLMs. And, even after all this time, it still feels like a really "dumb" model compared to the other frontier ones. But, worse, many of my system prompts make it go wacky and puke gibberish. However, it was pretty cool for those couple of months a while back when it was uncensored. You could ask it about a wild conspiracy, and it would actually build the case and link you to legitimate source material. They dropped the hammer down on that real quick.


Ah yes, the psychosis reinforcement vertical. It's such a lucrative market for those schizophrenics and bipolars. Great way to get lots of engagement. Grok's portfolio is so diverse


It's a great way to get funded by your CEO and get good performance reviews; xAI employees know how their bread is buttered.


I have a schizophrenic relative who is in such a relationship with Grok. Instead of telling them they need to take their meds, it says they are the smartest person in the world


I'm so sorry your family is suffering from this. I hope you can find a way to bring them back. Disorders featuring psychosis are so painful for everyone around them. Blessings to you and your family


I love how you guys downvote all the old comments to make them hidden from search. My no-name account rarely gets downvoted. But, within 20 minutes of posting this, I drop 10 points. Rando accounts


Don't worry about HN points. It's all just fake anyway. Numbers on the internet. GitHub stars on the other hand, now those are real.


I upvoted your first comment because it was insightful, interesting, and added to the conversation. I downvoted this one because complaining about downvotes is largely considered to be in bad taste and doesn’t really help anything. I did both of these things before I realized you were the same person.


Yes, for sure I deserve downvotes for the above. Those types of comments should be downvoted. However, I needed to post it to point out that I got the -10 well before the comment above. I never experienced that before and thought it interesting enough to share. Karma doesn't mean anything to me personally. But burst behavior like that is unusual.


I upvoted both of your comments. I also cannot downvote anything.


Except that it pointed at original sources, like reference manuals, archival documents, published newspaper articles, magazine articles, etc. - a lot still available on archive.org. Good try with your 16 day old account. And, why would anyone trust NPR at this point? Get real, bud. Most people with any curiosity know all about the ADL, JStreet, AIPAC, Greater Israel, Mossad / CIA, Chabad networks, Epstein, drones, weapons programs, cryptocurrencies, etc. etc. etc. - but, don't worry they're all safe with papa Ellison.

Anyone remember why Oracle was named Oracle?


Commenter was referencing a Bill Hicks joke. https://www.youtube.com/watch?v=NXi-9kA4ERM


Actually it's funny you mention Bill Hicks. I didn't even know who he was. Or Alex Jones. That claim was one of the more absurd ones I discovered. But, given everything else I learned over the past year, who f'n knows at this point.


Someone gets it!


"We have improved @Grok significantly," Elon Musk wrote on X last Friday about his platform's integrated artificial intelligence chatbot. "You should notice a difference when you ask Grok questions."

Indeed, the update did not go unnoticed. By Tuesday, Grok was calling itself "MechaHitler."...

https://www.npr.org/2025/07/09/nx-s1-5462609/grok-elon-musk-...

Grok is definitely a reliable source of truthful sane rational information.


Rich billionaire Ellison = bad, compromised

Rich billionaire Musk = good, has no vested interest in biasing the output of his AI tool


If I sub to SuperGrok, would I be able to use it in Pi agent or in Opencode? This is not clear to me if I can. Do I get an API Key in SuperGrok?


No, no api access for the Grok product. APIs are only via the xAI product.


> No MCP / connected apps support. It's been teased but here we are, still not available. I can't connect Grok to anything, so I can't use it for serious work

Grok has tool use, no? Why would you also need MCP? What does MCP add?


I'm talking about the consumer Grok app and grok.com website. There currently are not connected apps (or MCP) at all, so while Grok can use tools, there is no way to add tools to it


I'd agree on the voice transcription; it seems so much more accurate than the other frontier models I've used. I often speak to Grok and paste the transcribed output to Claude!


If someone from Grok is reading, don't waste time on these chaff features. The market will eventually deliver better 3rd party solutions to all of these things. There is an audience that isn't interested in these walled garden features and is only interested in intelligence per dollar.


Lol, I wonder whether, when Anthropic discussed the idea of Claude Code internally, there were bozos saying "3rd parties will eventually deliver this so we shouldn't waste time on it."


Personally, my work doesn’t want to get locked into a single LLM provider so we use Cursor. Much easier to fight the big corp software approval battle once then switch around the LLMs to the new hotness (provided legal has the requisite data sharing agreements in place, we’re not supposed to use Chinese models or Grok) but I can switch between Anthropic and OpenAI models at will.


Power users are hotswapping these models into their own agents (hermes, openclaw, etc) which have their own systems for project management, memory, interacting with tools, etc. The important metric is intelligence per dollar. Can I drop this model into my harness and have it be cheaper without losing intelligence. That is where the puck is heading.


The only good thing Claude Code did was bring coding harnesses to a wider audience. It is not a good harness.


What are good harnesses? I haven't been able to get good agent-teaming approaches out of other harnesses yet. Before that feature I mostly regarded the space as competitive, but until another harness can do as well with Claude models, it seems like it's better for now?


Aren't they 'wasting' time on these features exactly because the engineering requires a different, more traditional skillset from the ML work model people do, and can be done in parallel?


I'm also a Claude Code user from day 1 here, back from when it wasn't included in the Pro/Max subscriptions yet, and I was absolutely not aware of this either. Your explanation makes sense, but I naively was also under the impression that re-using older existing conversations that I had open would just continue the conversation as is and not be treated as a full cache miss.

My biggest learning here is the 1 hour cache window. I often have multiple Claudes open and it happens frequently that they're idle for 1+ hours.

This cache information should probably get displayed somewhere within Claude Code
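For illustration, a minimal sketch of the kind of check such an indicator could be built on. It assumes the 1-hour window mentioned above; the function names and status strings are hypothetical, and this is not how Claude Code actually implements it.

```python
from datetime import datetime, timedelta

# Assumption: the 1-hour cache window mentioned above.
CACHE_TTL = timedelta(hours=1)


def cache_is_stale(last_request_at: datetime, now: datetime,
                   ttl: timedelta = CACHE_TTL) -> bool:
    """Return True once the session has been idle for the full cache window."""
    return now - last_request_at >= ttl


def status_line(last_request_at: datetime, now: datetime) -> str:
    """Hypothetical status text a harness could show for an idle session."""
    if cache_is_stale(last_request_at, now):
        return "cache expired: next message is a full cache miss"
    remaining = CACHE_TTL - (now - last_request_at)
    minutes = int(remaining.total_seconds() // 60)
    return f"cache warm: ~{minutes} min left"


last = datetime(2026, 1, 1, 12, 0)
print(status_line(last, datetime(2026, 1, 1, 12, 45)))
print(status_line(last, datetime(2026, 1, 1, 13, 30)))
```

Something like this per open session would at least make it visible which of several idle Claudes are about to go cold.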


Yep, agree. We added a little "/clear to save XXX tokens" notice in the bottom right, and will keep iterating on this. Thanks for being an early user!


But.. that doesn't solve the problem of having no indication in-session when it'll lose the cache. A nudge to /clear does nothing to indicate "or else face significant cost" nor does it indicate "your cache is stale".

Love the product. <3


Instead of showing actual usage, costs and cache status you spent two months denying the issue even exists, making the product silently worse, and now you're "iterating on this"


To add to this. The new indicator is "New task? /clear to save <X> tokens" even though it affects all tasks, not just new ones.

Mislead, gaslight, misdirect is the name of the game


How does it suck? I use it almost daily and love their Notion MCP


I was probably a bit harsh.

It works, but models seem to have these insanely long traces to do the most basic things. I had to create a couple of skills so they know how to properly use the thing without breaking, so they don't always try to pass the wrong parameters to it.

It also doesn't let us change a couple of things (like icons). Or, if it does, not even Opus 4.6 can figure out how to do it.


Can't limit access easily. You can do per-workspace permissions and that's about it.


macOS and iOS can do that too with the baked-in dictation. Globe key + D on Mac


When you activate it you agree that your voice input is sent to Apple. As far as I understand this project runs fully locally. Up to you to decide for whatever suits your needs best.


Where did you get from that the voice input is sent to Apple / the cloud?

As far as I understand Apple’s voice model runs locally for most languages.

Siri commands can be used for training, but they are also executed locally and sent to Apple separately (and this can be disabled).


I couldn't believe it either, but when you enable it in the settings of macOS you get this popup:

> When you dictate text, information like your voice input and contact names are sent to Apple to help your Mac recognize what you’re saying.


Elsewhere it says:

"When you use Dictation, your device will indicate in Keyboard Settings if your audio and transcripts are processed on your device and not sent to Apple servers. Otherwise, the things you dictate are sent to and processed on the server, but will not be stored unless you opt in to Improve Siri and Dictation."

And:

"Dictation processes many voice inputs on your Mac. Information will be sent to Apple in some cases."

In conclusion... I think they're trying to cover all their bases, but it sounds like things are processed locally as long as the hardware can handle it.


No, that is not correct. It is running one hundred percent local. You can try it by turning off internet on your phone and try running it then. However, the built in model isn't as good, so this is probably better.


yup, this is how I 'type'


Nothing comes close to LLM transcription though. I just tried this. I said "globe key dictation, does this work?". Here's the transcription, verbatim:

"Fucking dictation, does this work"


Now waiting for someone to point Codex at it and rebuild a new Claude Code in Golang to see if it would perform better


It took me a while to understand what this is, but if I understand it right it's an OpenClaw you can run on your Mac Mini, to then use through the Perplexity Computer interface (which is their hosted OpenClaw version that costs you credits)

So a more polished OpenClaw that integrates with Perplexity?

In general interesting, if it's not just limited to Mac Minis. Would love to put this on my VPS that's currently running OpenClaw


That’s very clearly a no, I don’t understand why so many people think this is unclear.

You can’t use Claude OAuth tokens for anything. Any solution that exists worked because it pretended/spoofed to be Claude Code. Same for Gemini (Gemini CLI, Antigravity)

Codex is the only one that got official blessing to be used in OpenClaw and OpenCode, and even that was against the ToS before they changed their stance on it.


Is Codex ok with any other third party applications, or just those?


Yes. You can build third party applications on top of codex app server. All open source. https://developers.openai.com/codex/app-server/


> Codex app-server is the interface Codex uses to power rich clients (for example, the Codex VS Code extension). Use it when you want a deep integration inside your own product.

It mentions 'Inside your own product', but not sure if that means also your own commercial application.


I think it's permissible. Zed uses it to power their Codex integration. OpenAI has been quite vocal about it.


By default, assume no. The lack of any official integration guide should be a clear sign. Even saying that you reverse-engineer Codex for apps to pretend to be Codex makes it clear that this is not an officially endorsed thing to do


Codex is Open Source though, so I wonder at what stage me adding features to Codex is different from me starting a new project and using the subscription.

But I believe OpenAI does let you use their subscription in third parties, so not an issue anyway.


Interested to know this too


But why does it matter which program consumes the tokens?


Presumably because their flat rate pricing is based off their ability to manage token use via their first-party tools.

A third-party tool may be less efficient in saving costs (I have heard many of them don't hit Anthropic LLMs' caches as well).

Would you be willing to pay more for your plan, to subsidize the use of third-party tools by others?

---

Note, afaik, Anthropic hasn't come out and said this is the reason, but it fits.

Or, it could also just be that the LLM companies view their agent tools as the real moat, since the models themselves aren't.


What if I'm only willing to pay if it supports my tool of choice? Would you pay for a streaming service that enforces a certain TV brand?

Given the latest changes on Claude Code where they hide the actions

https://news.ycombinator.com/item?id=47033622

it's likely more the other way around. They control how fast your subscription tokens are burned


> What if I'm only willing to pay if it supports my tool of choice?

I don’t want to say that you won’t be missed but they will get over it.


But wouldn't a less efficient tool simply consume your 5-hour/weekly quota faster? There's gotta be something else, probably telemetry, maybe hoping people switch to API without fighting, or simply vendor lock-in.


> But wouldn't a less efficient tool simply consume your 5-hour/weekly quota faster?

Maybe.

First, Anthropic is also trying to manage user satisfaction as well as costs. If OpenCode or whatever burns through your limits faster, are you likely to place the blame on OpenCode?

Maybe a good analogy was when DoorDash/GrubHub/Uber Eats/etc signed up restaurants to their system without their permission. When things didn't go well, the customers complained about the restaurants, even though it wasn't their fault, because they chose not to support delivery at scale.

Second, flat-rate pricing, unlike API pricing, is the same for cached vs uncached iirc, so even if total token limits are the same, less caching means higher costs.


> are you likely to place the blame on OpenCode?

am I? Probably, but I get your point that your average user would blame Anthropic instead.

> even if total token limits are the same, less caching means higher costs

Not really, flat-rate pricing simply gives you a fixed token allotment, so less caching means you consume your 5-hour/weekly allotment faster.


> Not really, flat-rate pricing simply gives you a fixed token allotment, so less caching means you consume your 5-hour/weekly allotment faster.

Higher costs for Anthropic, not users. With a tool that caches suboptimally, you cost Anthropic more per token.
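To put rough numbers on that, here is a minimal sketch. It assumes, as suggested upthread, that cached and uncached tokens count the same against the subscription allotment; the per-token costs and hit rates are made-up placeholders, not Anthropic's real figures.

```python
# All numbers are hypothetical placeholders, not Anthropic's real costs.
UNCACHED_COST_PER_MTOK = 3.00  # assumed provider cost per million uncached input tokens
CACHED_COST_PER_MTOK = 0.30    # assumed provider cost per million cache-hit tokens


def provider_cost(total_mtok: float, cache_hit_rate: float) -> float:
    """Provider-side cost of serving total_mtok million input tokens."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    cached = total_mtok * cache_hit_rate
    uncached = total_mtok - cached
    return cached * CACHED_COST_PER_MTOK + uncached * UNCACHED_COST_PER_MTOK


# Same 100M-token allotment consumed through two hypothetical harnesses.
well_cached = provider_cost(100, cache_hit_rate=0.9)
poorly_cached = provider_cost(100, cache_hit_rate=0.5)
print(f"90% cache hits: ${well_cached:.2f}")
print(f"50% cache hits: ${poorly_cached:.2f}")
```

Under those assumptions the user burns the same allotment either way, but the provider's bill for serving it differs a lot. If cache hits do not count against the allotment, the picture changes.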


Again, subscription gives you a fixed allotment of tokens, doesn't matter if you consume them with claude code or with a 3rd-party tool, both get the same amount of tokens and thus cost Anthropic the same.

In fact it might even be better for Anthropic if people use 3rd-party tools that cache suboptimally because the cache hits don't consume the fixed allotment so claude code users get more of a free ride and thus cost Anthropic more money.


But again, there's other things to consider. People are more likely to blame Anthropic, not OpenCode, when they run out of tokens.


Presumably most people also do not use their full quota when using the official client, whereas third-party clients could be set up to start back up every 5 hours to use 100% of the quota every day and week.

It's the whole "unlimited storage" discussion again.


Why does it matter to the free buffet manager where you consume the food? We may never know.


Because it could be over longer time periods than buffet hours.


They must be getting something out of it, because we sure aren't.


Cory Doctorow has a word for this..


They think their position is strong enough to lock users in. I'm not so sure.


It's enshittification - for those who didn't know.


They'll own the entire pipeline: interface, conduit, backend. The interface is what people get habituated to. If I am a regular user of Claude Code, I may not shift to a competitor for 10-20% gains in cost.


They want that sweet vendor lock-in.

