All the time, yes. But you have to keep two things separate in your thinking:
- Prompted as in prompt made of tokens -> for LLMs, tokens double as a clock signal. Time only flows when tokens are pushed through them.
- Prompted as in specific request placed in the stream of tokens -> Yeah, they do that all the time, whether it's getting into infinite loops of repeating the same pattern, or suddenly deciding to do things based on inputs they normally ignored.
Also don't forget that everything is a "prompt" for an LLM. All input tokens end up in the same place.
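To make the "clock signal" point concrete, here's a minimal sketch in Python. `toy_next_token` is a made-up stand-in for a model's forward pass (not any real library's API), and the `<system>`/`<user>` markers are just illustrative strings. All it's meant to show is that the state advances one token per call, and that system text and user text land in the same flat list.

```python
from typing import Callable, List


def toy_next_token(context: List[str]) -> str:
    """Hypothetical stand-in for a forward pass: a pure function of the context."""
    return f"t{len(context)}"


def run(context: List[str], ticks: int,
        step: Callable[[List[str]], str] = toy_next_token) -> List[str]:
    """Advance the 'clock' by `ticks` forward passes; each pass appends one token."""
    out = list(context)
    for _ in range(ticks):
        out.append(step(out))  # the only place where anything "happens"
    return out


# System instructions and user input are concatenated into one token stream.
system = ["<system>", "be", "helpful"]
user = ["<user>", "hi"]
context = system + user

idle = run(context, 0)    # no tokens pushed through: nothing changes
active = run(context, 2)  # two ticks: two new tokens appended
```

With zero ticks, `idle` is identical to `context`; whatever looks like the model "starting on its own" has to come from something outside it calling `step`.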
So without a token pushed into them they do something? Not sure I understand...
Is there a lot of suppression in the current UIs, then? I have not seen things start on their own.
I meant an LLM doing something without any external prompt at all. Not doing something different, etc., but rather doing something without a token/prompt ever flowing to it.
Hard to say, because jobs and processes will adjust to accommodate AI strengths and deficiencies, as AI usage increases.
It's similar to how automation in manufacturing works: we may start with augmenting the human workers with machines improving some parts of the process, but eventually the process itself gets redesigned around the machines.
Because the scientists and mathematicians and sci-fi authors have all already been writing about this for the past 20 years (and I mean non-fiction writing), and nobody cared, giving similar dismissals instead.
Wrong answer. Or at least, obvious and not particularly useful.
Truth is, none of those parties are "nefarious" - they're all just not on your side. And "security" is never an unqualified good thing to have (it's not an unqualified bad thing either). It's just a framework of coercion.
The most important questions to answer about any security system are: what is being protected, for whom, and from whom. People don't ask that much, not even in the industry - it's an implicit assumption that everyone themselves is a "good person" and is on the protected side of security systems. And then they're confused because it turns out end-users are more often seen as threat actors. All the players mentioned, but perhaps especially Apple, in its own special way, are protecting the computer from the user just as much as they're protecting the user/user's data from third parties.
They kinda do though, in that instances have been observed to send unsolicited messages even when the person/people in charge of some account didn't expressly ask the models to do so.
For my own use of LLMs, I do try to avoid anything which I know has a risk the artefacts they produce may end up DoSing or spamming, and I've avoided the OpenClaw-type pattern for a broader range of reasons of which this is simply one tiny part, but I'm not absolutely confident I could avoid this even in the code coming out of the free tier of the web chat interfaces except by checking every single line of output every single time.
At 1000, you can afford better tools and better employees, and replacement parts get cheaper as you order in bulk, and you can explore clever strategies to smooth risk curves.
At 100 000, you can afford a better and continuously improving process, and dedicated facilities, and skilled experts, and parts get even cheaper because you're a volume buyer or perhaps own the supply side, and you get to set your own risk curve.
Lots of things get cheaper at scale. Insurance, too.
No, it's the actual reasonable approach that sane people have to security. In the real world, security is always about costs and benefits, because you can always make something more secure than it is by spending more money, but it also doesn't make sense to spend more than you're getting from it.
Normally, you secure things so as to minimize (${cost of security measures} + ${expected damage from attacks that materialize}), writing off actual material damage with insurance wherever possible. You pick security measures based on their effectiveness, which usually translates to "how expensive will it make success for attackers", aiming to push that above the value the attackers can expect to gain.
There are obvious exceptions to that, like risk to life and limb, as well as some other special situations where attackers may have unusual motivations and thus the economic logic of "make stealing treasure cost more than the treasure" stops applying. But those are exceptions. Almost everything you deal with in your life - from your bike shed to the corporation that owns your bank - follows the above logic in terms of security.
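As a rough sketch of that cost/benefit logic in Python (all names and numbers here are made up for illustration; real risk estimates are much messier): each option has a cost and an estimated chance of a successful attack, and you pick whichever minimizes cost plus expected damage.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Measure:
    name: str
    cost: float                # what you pay for the measure (hypothetical figure)
    attack_probability: float  # estimated chance an attack still succeeds
    damage: float              # loss if the attack succeeds


def expected_total_cost(m: Measure) -> float:
    """Cost of the security measure plus expected damage from attacks."""
    return m.cost + m.attack_probability * m.damage


def pick(measures: List[Measure]) -> Measure:
    """Choose the option with the lowest expected total cost."""
    return min(measures, key=expected_total_cost)


# Illustrative options for protecting something worth 1000.
options = [
    Measure("nothing", cost=0, attack_probability=0.5, damage=1000),
    Measure("padlock", cost=50, attack_probability=0.25, damage=1000),
    Measure("vault", cost=2000, attack_probability=0.125, damage=1000),
]

best = pick(options)
```

With these numbers the padlock wins: the vault is more secure, but it costs more than the thing it protects.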
--
I spell this out because I've noticed that tech industry circles have this weird belief in security as some kind of binary, holy good, that you either have and are blessed, or don't and sin. This obsession starts with failing to even recognize, much less ask, the most important questions about security: why do you want to protect it, and who are you protecting it from?
100% agree, and so happy to see somebody call this out. If you go on /r/SelfHosted or any other novice-oriented forum, you’ll quickly realize that most users are simply “keeping up with the Joneses” when it comes to security & redundancy. That itself is fine I guess, but the zero tolerance they have for anything else is just absurd.
My approach for AI-first code review, or really any kind of AI technical opinion, is that if the claim AI made is both important and not obviously true at a glance, it has to prove it to me, and keep trying until I'm convinced or can spot an obvious mistake in the proof.
With reviews, this is usually the case where AI is making a claim that something in the PR will fail because of some assumptions or behaviors in code outside of the PR - e.g. "this change will fail in scenario X, because foo is null in this case, because the SQL query doesn't populate it when bar == quux, and it gets propagated as null through the JSON deserialization (optional field)...", where all the SQL and JSON parsing was not part of the code under review, and "bar == quux" is some weird domain special case.
Stuff like this is critical, and there's no way for me to judge it without an expensive context switch. So I've learned to ask for a more detailed walk-through once, and if that doesn't make me "see" it, I just ask it to reproduce it with tests, and confirm it's a real problem. Reviewing the reproduction is usually enough for me to either "see it" or accept they're probably right and ask the author to recheck it.
(Why not jump straight to "reproduce it" for every finding? Because it still takes time to have AI do the repro. It's cheaper than a deep context switch, but not free.)
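Written down as a decision ladder, it looks roughly like this. It's a sketch of the habit described above, not a tool; the enum and function names are invented for illustration.

```python
from enum import Enum


class Action(Enum):
    NO_PROOF_NEEDED = "no proof needed"          # unimportant or obviously true
    ASK_WALKTHROUGH = "ask for a detailed walk-through"
    ASK_REPRO = "ask for a reproduction with tests"
    DONE = "convinced, or spotted the mistake"


def next_action(important: bool, obvious: bool,
                walkthroughs_asked: int, resolved: bool) -> Action:
    """Cheapest next step for a single AI review finding."""
    if not important or obvious:
        return Action.NO_PROOF_NEEDED
    if resolved:
        return Action.DONE
    if walkthroughs_asked == 0:
        return Action.ASK_WALKTHROUGH  # cheap: try this once
    return Action.ASK_REPRO            # pricier, but cheaper than a context switch
```

The ordering is the whole point: the repro only gets requested after the one cheap walk-through has failed to settle it.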