
I've also noticed that when I communicate with Grok in my native language, its tone is more natural than other models. I think this is due to the advantage of being trained on a large amount of Twitter data. However, as Twitter contains more and more AI-generated content now, I'm afraid continued training will make it less natural.


The causation could also be the other way round.

Twitter language has started to seem like normal casual language to us, rather than us using normal casual language on Twitter.


Sadly, it's more likely that people will just start talking like bots


I've seen this expressed as a concern even from one of my colleagues. My retort was:

"English is not my native language and LLMs taught me quite a few very useful formalisms that do land well for people and they change their attitude towards you to be more respectful afterwards. It also showed me how to frame and reframe certain arguments. I agree sounding like an LLM is kind of sad but I am getting a lot of educational value -- and with time I'll sneak my own voice back in these newly learned idioms and ways to talk."


Since you seem interested in the ins and outs of English, I want to say that "retort" has a connotation of anger or sharpness. Your response reads more like a "rebuttal" to me.

This is not a correction; maybe retort is what you meant and I'm not trying to be the English police. I just like discussing the intricacies of language :)


Actually super helpful, thank you!


Like most widely spoken languages, English has a lot of regional variation. There are even a bunch of quizzes online where you answer 20 questions about phrasings, and they can tell you where you're from with a disconcertingly high degree of accuracy.

In my experience a "retort" is sharp or witty, but certainly not angry, whereas the word "rebuttal" is itself essentially antagonistic. You might use it when referring to something or someone that you look down upon, whereas a more neutral term would simply be "response."


Personally, I tend to regard a retort as short and reactive and a rebuttal as a longer, more considered disagreement. A retort could be defensive and wrong or it could be sharp and insightful - it doesn't imply one or the other. A rebuttal is mostly an attempt to correct something, while a retort doesn't need to be a correction (although it could be).

Even something like "piss off!" could be a retort, but usually never a rebuttal :)


Just as I was reading your comment I remembered that Samuel Jackson used "retort" in his speech in the "Pulp Fiction" movie and was wondering whether he was openly antagonistic there (I mean, he killed a bunch of guys with a pistol shortly afterwards but still) or was it a witticism.

I admit I am lost on these nuances and I usually kind of use whatever idiom comes to mind, which yes, likely would net me some weird looks depending on where I am geographically.


It's impressive that you've even managed to use an em-dash in spoken language. /s


I did spot the /s but it's not relevant: I use two normal dashes actually. :)


You're absolutely right!


So human language will improve and become more precise? I'm all for it, especially if we get more emojis in speech! Why is that sadly? Humans will learn to imitate their more intelligent betters.


There was already evidence last year[1] that pointed to ChatGPT-specific words like "meticulous," "delve," etc becoming more frequently used than they were previously. The linked study used audio of academic talks and podcasts to determine this.

[1] https://arxiv.org/abs/2409.01754


Part of me wanted to object to those two examples, which I’ve used frequently since reaching adulthood in the 80s. Another part of me has been triggered by an apparent uptick in the word “crisp”, which my gut takes as a coding-LLM tell.


Opus 4.7 loves to use the word “substrate” whenever it gets the chance. It’s a really weird tic. How do these models end up with these sorts of behaviors?


Did you try Meta? I was into Grok, but now Meta works well for me.


I'm sure Twitter knows which are the bot accounts and is surely excluding them from their model training. Twitter bots aren't a new phenomenon after all.


I don't think Twitter/X knows for sure who the bots are. Elon has been pretty vocal about trying to stop them for ages, yet I still get lots of spam DMs (as do others with far fewer followers/reach).

Even if 95% of the spam gets actively reported and dealt with, that still leaves a ton of nonsense on the platform, getting fed into the LLM. And spam has only gotten worse over the years, as the barrier to entry has lowered and lowered.


"Elon has been pretty vocal about trying to stop them for ages"

Elon lies a lot. Like ALL THE TIME.


Are the spam DMs advertisements or more generally something linked to a product or service? I wouldn't be surprised if X is more lenient towards bots that pay them for adverts.


Most of what I get seem to be advertisements or automated messages if you follow large(r) accounts.

One of the most interesting things that I've noticed is these advertisements will be triggered if you follow accounts that are positioned as influencers. I followed one out of curiosity and received a DM from that account advertising some cryptocurrency service.

It's a good way to filter out and block accounts that have almost certainly not grown organically.


I'd have guessed that at least some of the bots are Twitter itself, trying to draw you in with some sense of engagement. Given that Musk is the owner, and everything we know about him and have seen him do, I'd not be surprised if some of the MAGA bots are his too.


>Elon has been pretty vocal about trying to stop them for ages

You know people lie, right? Especially when the lie casts them in a better light and/or makes them more money.


Elon lied on record many times, admitting to the lies only when forced, under oath.


Highly doubtful, seeing as my 14-year-old Twitter account got caught in a recent bot ban wave with no means of contacting a human for recovery.


There are bots everywhere. It has nothing to do with the platform; it has to do with attackers having an incentive to do mass account farming. No platform is secure against it.


Super easy, just make a web-of-trust type of thing: messages are only visible to those who already vouched for you. Otherwise, you pay $0.01 per message per user reached.
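
To make the arithmetic of that proposal concrete, here is a minimal sketch, assuming a hypothetical `vouches` mapping (each user to the set of users they have vouched for) and a flat fee matching the $0.01 figure. The function and variable names are made up for illustration and do not describe any real platform's API.

```python
from decimal import Decimal

# Assumed flat fee per non-vouching recipient, taken from the $0.01 figure above.
PRICE_PER_RECIPIENT = Decimal("0.01")


def delivery_cost(sender, recipients, vouches):
    """Split recipients into free and paid, and price one message.

    `vouches` is a hypothetical mapping: user -> set of users they vouched for.
    A recipient who has vouched for the sender sees the message for free;
    every other recipient costs the sender PRICE_PER_RECIPIENT.
    """
    free = [r for r in recipients if sender in vouches.get(r, set())]
    paid = [r for r in recipients if sender not in vouches.get(r, set())]
    return free, paid, PRICE_PER_RECIPIENT * len(paid)


# alice has vouched for bob; carol and dave have not.
vouches = {"alice": {"bob"}, "carol": set()}
free, paid, cost = delivery_cost("bob", ["alice", "carol", "dave"], vouches)
print(free, paid, cost)
```

Under these assumptions, reaching one million users who have not vouched for you would cost $10,000 per message, which is the deterrent the proposal relies on.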


How would that solve it? If I pay, I can still push the content I want (factual or not), which is equivalent to paying for accounts directly.


By buying accounts, you are buying reputation. By paying for the posts, you are maybe paying for reach at first, but (a) it will be costly and (b) it does not guarantee that the reached ones will spread anything further.


With banning and deboosting they need to be very accurate, but with filtering they can be more liberal in what they exclude.


not really. there are easy heuristics to filter out bots with good confidence. FWIW i don't see any bots posting anything in my feed


Yes, but your individual feed isn't really relevant if we talk about the masses. Reddit accounts are for sale quite cheap, HN as well, X too, and so on. It's literally just a matter of means/methodology. If I wanted to make 1000 random posts today talking about a certain thing, I could.


my individual feed does matter because it shows that it is possible to curate something without bots, which is obviously what xAI would do


congratulations, you have solved anti-scam. go make your billion since it's easy.


it's easy to solve at the offline level, where you have time to filter things out. in fact this is already done in pre-training by OpenAI and other companies.

you think it's hard?


Yes I think it's hard.

OpenAI has already been proven to be easily gamed through very unsophisticated poisoning (fake information in a web page + an edit to a wiki page pointing at it, fake information in a reddit post), so I'm not sure we should hold up their efforts at data cleaning as a gold standard.

https://www.sei.cmu.edu/blog/data-poisoning-in-ai-models-the...



