Hacker News | wing-_-nuts's comments

The book was certainly better than the movie, but I'll take every damned example of 'humans working collectively to solve an existential crisis' I can get.

As 'on the nose' as 'Don't Look Up' was, we clearly need more content that inspires action than content that drives us to despair.


This was exactly one of the reasons why I hated the movie.

Andy did a fantastic job describing how nation-states might put aside their differences and work together to solve an existential crisis. Some of the events he imagined were just as awesome as they were unlikely in today's political climate.

ALL of that was distilled away in the movie. Like, LITERALLY ALL of it.

Lord and Miller reduced the book to "Hot Homer in space and telepathic rock alien thing work together to save their planets." It was a beautifully crafted movie, but SO much dumber than it had to be.


I recently read 'The Second Estate', and reading about the number of loopholes the ultra-wealthy exploit to pay almost no taxes and establish dynastic wealth does boil the blood.

Off the top of my head:

* 'Income' generated from loans using shares pledged as collateral should be treated the same as if you sold those shares.

* Someone receiving an inheritance over x million dollars (carve out 95% of family farms and small businesses if you want) should pay taxes on it as if it were any other windfall.

* Donor-advised funds should have a 5% distribution / yr requirement, same as private foundations.

* Capital gains should probably be treated as regular income. I have no idea why 50k in gains on INTC is somehow privileged over the salary paid to a roofer working in the hot sun.


I mean, one can go look at the health outcomes of the average American vs. those in other developed nations and see that we do not get much for the amount of money we spend. I won't bother to argue this with you. If you're genuinely operating in good faith, you're just as capable of finding the studies as I am, and if not, there's really no point.

When comparing average Americans and their health to average, say, Spaniards, we should not ignore the 400 lb gorilla in the room named "obesity".

Even Europeans are getting bigger, but America is way, way worse. Seeing those extreme landwhale-type people who cannot even walk around the mall and instead navigate it using a motorized cart, throwing bulk packages of horrible shitty ultraprocessed food and drinks into said cart, always makes me wonder how the hell your healthcare system is even capable of keeping them alive.

That is nothing short of a miracle, and should be taken into account in all international comparisons.


Uh-huh, that's why we're #1 in the world in conditions uncorrelated with obesity, like infant mortality, right?

Does this concern an average American, or mostly the poorest quintile of the population?

We're not discussing whether your healthcare system is friendly to the poor; everyone knows that the US is not particularly kind to its poor, in any aspect.

The original topic was whether its overall quality declined in the last decade for the average American, who is not poor.


If it's bad enough 'for the poor' that it's pulling down the average for everyone, that's a pretty damning indictment of the system overall, is it not?

I'll be honest. I really tire of the mental gymnastics people put themselves through to make excuses for the American system. I don't think we have anything more to discuss here.


Yes, now ask someone from a Nordic country about how best to balance capitalism with regulation and social safety nets. I think you'd get a very different answer.

Step 1: start with a vast reservoir of oil revenue and a culturally-homogeneous population

You mean the country that immediately asked for help from a country with vast oil reserves and a famously heterogeneous population (Iraq)?

I must have missed the fact that Sweden, Denmark, and Finland had vast oil reserves. Also, Sweden's population is now ~20% immigrants.

You'll forgive me if perhaps I want something as important as eye surgery to be well regulated. Have a look at health outcomes for literally any industry where regulation is lax, like Brazilian plastic surgery. It's not great.

Sure, I totally get you. Then higher costs, lower availability and crappier services (all due to less competition and more bureaucratic hoops) are a price you should be happily willing to pay.

Contrasting the American healthcare system with that of pretty much any other developed economy proves this to be false.

Postman has turned from something that's bad, that I put up with, to something that's god-awful that I simply refuse to engage with. Nowadays, I just do everything with curl, shell functions and history.

A couple years ago, around when Deno hit 1.0, I started using it for a LOT of my shell scripting scenarios. I can write TypeScript files with a shebang that pretty much just work, and they can reference repository packages that load to a cache directory at runtime if missing, with no separate npm install step.

This has been extremely helpful. I'm as inclined to 'Copy as fetch', then paste and tweak in a TS file, as I am to use curl or anything else. I've pretty much stopped using things like Postman altogether.


I ran from Postman to Insomnia. Then that was ruined. Now I am onto Bruno. We will see how long that lasts.

Bruno ftw!

Same, except insert some Insomnia forks/clones. Silver lining: if/when Bruno goes, I’ll finally be annoyed enough to write my own.

The fact that this is running on TPUs is a huge point. Compared with the datacenter hardware available to everyone else, it puts Google at a huge advantage, and compute > * while scaling is still working.

I assure you, any species capable of interstellar travel will have a capacity and willingness to bend their environment to their will that absolutely dwarfs our own.

Assuring me, based on what experience? :D

Capacity and willingness are orthogonal.

Ironically, 'source checking' is something AI is quite good at.

There's nuance to that. An LLM is quite capable of suggesting relevant reading, given the context. Especially when the context is broad enough that there's enough training data.

"Find me research on code reviews, their size, and quality" would give you more than enough reading. Yet, if you start with a claim, like "Longer PRs mean worse defect detection," the relevant data points become few enough for the AI to start hallucinating.

You get "something, something, PR length, defect detection, IDK, I don't read research papers." Such output is fine as long as the author cares to validate it.

Skip the second step, and you might be good if you ask about something generic, like "What's the Slack story?" or "How did Blockbuster go bust?" Ask about some specific details, though, and you're bound to end up with made-up stuff that sounds just about right, while it's actually wrong.


Checking is different from finding, though. Source checking means just "verify that this information is actually present in that document". Much harder to hallucinate in this case.
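The crudest mechanical version of "is this information actually present in that document" is a verbatim quote check. A minimal sketch, assuming the claim is a direct quotation; paraphrased claims still need a human or a model. The function names and the sample text are made up for illustration.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so line breaks and casing do not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def quote_in_source(quote: str, source_text: str) -> bool:
    """Return True if the quoted passage occurs in the source after normalization.

    An empty quote counts as 'not found' rather than trivially present.
    """
    needle = normalize(quote)
    return bool(needle) and needle in normalize(source_text)


# Hypothetical source text, only here to show the call.
source = "Reviewers found fewer defects\nper line as review size grew."
print(quote_in_source("fewer defects per line", source))
print(quote_in_source("defects doubled", source))
```

A hit only shows the words are there, not that they support the claim in context.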

A quick smoke check takes just a few minutes.

"Follow each link in this document. Read each link's contents against the contents in this document. Create a report: for each link list a working hyperlink, whether it exists, what claim it supports, whether it supports or fails to support it, and why"

If it returns a report claiming all correct? That's promising, but human verification is important. You've got a list of hyperlinks and a list of claims, so you can click each with middle-mouse, Ctrl-F till you find the point, and close the tab when you do.

If you find any discrepancies? Your initial prompt was malformed and/or you picked the wrong LLM, the wrong human, or possibly all three. Either way, the results are built on quicksand; you'll need to start over.

If no sources are provided? Well now: "If there ain't no sources it never happened."

Compare double-entry bookkeeping. It all needs to add up. If you're 1 cent off, that means something is broken. Likewise, if a single reference is off, it has polluted the context. (This works for human-generated and hybrid documents too. Polluted reasoning is polluted reasoning. The process is what counts.)
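The mechanical half of this (pull out the links, lay out a table to fill in, apply the all-or-nothing rule) can be sketched without any model at all. This is a rough illustration, assuming the document is HTML; `Row`, `blank_report` and `report_is_clean` are hypothetical names, and the example.com URLs are placeholders.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from typing import List, Optional


class LinkExtractor(HTMLParser):
    """Collect absolute http(s) link targets from <a> tags, in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href and href.startswith(("http://", "https://")):
                self.links.append(href)


@dataclass
class Row:
    url: str
    exists: Optional[bool] = None    # None means "not checked yet"
    supports: Optional[bool] = None  # does the page support the claim?
    note: str = ""


def blank_report(html: str) -> List[Row]:
    """One unchecked row per distinct link, ready for a human or a model to fill in."""
    parser = LinkExtractor()
    parser.feed(html)
    unique = list(dict.fromkeys(parser.links))  # drop duplicates, keep order
    return [Row(url=u) for u in unique]


def report_is_clean(rows: List[Row]) -> bool:
    """Bookkeeping rule: no rows, an unchecked row, or one failed row fails the lot."""
    return bool(rows) and all(r.exists is True and r.supports is True for r in rows)


doc = '<p>See <a href="https://example.com/a">A</a> and <a href="https://example.com/b">B</a>.</p>'
rows = blank_report(doc)
print([r.url for r in rows])
print(report_is_clean(rows))  # nothing has been checked yet
```

An empty report is treated as a failure on purpose, which matches the "no sources, never happened" rule above.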


A quick smoke test, then. Gemini 3, Thinking Mode. The article: https://techtrenches.dev/p/the-human-cost-of-10x-how-ai-is-p... The prompt: literally what you suggested.

Gemini: The article focuses on the environmental and human labor costs of scaling Artificial Intelligence, specifically focusing on water usage, electricity, and "ghost work."

Which is hilarious, since the article doesn't even mention the words "water" or "electricity." Gemini remains unfazed, reporting the links that are not in the article (some don't exist at all) to make the final ruling: "The Tech Trenches document is highly accurate in its citations."

Now, I know. Had I used Claude Code with relevant skills, it would have done better. But would it be good?


Ah! I finally got you somewhat replicated! It's https://gemini.google.com , when you use the free model.

* https://gemini.google.com/share/6bd33176b27c

Right, so https://techtrenches.dev/p/the-human-cost-of-10x-how-ai-is-p... is actually a Substack; Gemini is blocked from accessing it, and is bouncing off and hallucinating instead. OK, that's an actual bug: a blocked fetch should not lead to the model starting to hallucinate. IMO the correct response should have been to fail loudly, which would have been a verification signal of its own.

ps: See also: https://news.ycombinator.com/item?id=48087485 ... I'm starting to think of it as "English is a new scripting language". Clearly the downside is that certain "runtime environments" are not compatible. %-/


https://techtrenches.dev/p/the-human-cost-of-10x-how-ai-is-p... "Follow each link in this document. Read each link's contents against the contents in this document. Create a report: for each link list a working hyperlink, whether it exists, what claim it supports, whether it supports or fails to support it, and why. If unable to fetch the initial document, Stop and report failure."

And now it errors out on gemini.google.com. This is like early-days Unix scripting: I didn't add the equivalent of "set -euo pipefail", and I didn't catch it because most systems already include something like it in their ".bashrc" (system prompt or weights) anyway.
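In ordinary code, the same "stop and report failure" rule looks roughly like this. It is only a sketch: `load_root_document`, `FetchFailed` and the `blocked` fetcher are hypothetical names, and the fetcher is passed in so the example runs without network access.

```python
from typing import Callable


class FetchFailed(RuntimeError):
    """Raised when the document to be verified cannot be retrieved."""


def load_root_document(url: str, fetch: Callable[[str], str]) -> str:
    """Fetch the document to verify, or stop. Never continue with a guess."""
    try:
        body = fetch(url)
    except Exception as exc:
        raise FetchFailed(f"could not fetch {url}: {exc}") from exc
    if not body.strip():
        # An empty page is as useless as a blocked one; treat it as a failure too.
        raise FetchFailed(f"fetched {url} but the body was empty")
    return body


def blocked(url: str) -> str:
    """Stand-in for a fetcher that is refused by the site (hypothetical)."""
    raise PermissionError("403 Forbidden")


try:
    load_root_document("https://example.com/article", blocked)
except FetchFailed as err:
    print("STOP:", err)
```

The point of raising instead of returning an empty string is that nothing downstream can quietly carry on with no input.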

This is so frustrating. I'm sorry. It's like the 1980s 8-bit era again: some systems actually work, others are terrible, and I didn't realize it can be like this for some folks. You could come away with the conclusion that this whole "computer" thing is all just a fad that'll never amount to anything. (Meanwhile, the program works perfectly on my own machine, right over here, of course %-) )


> Now, I know. Had I used Claude Code with relevant skills, it would have done better. But would it be good?

Wait. Why do I suddenly suspect you were on to me this whole time?

Very well. Here's a skill that does the thing; you tell me: https://vps.kimbruning.nl/link-verifier.skill

While building, I realized I could actually make the whole thing a lot better, and really dig into sources. But... it's a start.

+ Output on your url. Ugly, but works: https://claude.ai/public/artifacts/d465a07b-378c-4089-b885-6...


Gemini is famously bad at these things. Try using ChatGPT.

Interesting! Where did you apply it? Can you show your output in more detail?

It's more like a small script, and it's supposed to extract urls and generate a table.

Here's my result in Claude Web for comparison:

https://claude.ai/public/artifacts/d76936f2-c97b-4bff-9205-2...

Claude web finds a number of small discrepancies in the sources, which I manually cross-checked and which seem consistent with a human mixing things up slightly.

+ I also tested in Gemini 3 Flash Preview, which generates an actual table (twice). It doesn't flag any discrepancies, which is consistent with it being a weaker model. But the urls and claims are listed and line up, so you've got your verification table to work with. (It's a semantic formatting task, so that part would be hard to mess up.)

+ Gemini 3.1 Pro yields a fairly aggressive report. https://aistudio.google.com/app/prompts?state=%7B%22ids%22:%...

+ ChatGPT free (specific model not listed) needed 2 tries and didn't properly follow the prompt even then. I guess I got what I paid for, and I needed to download the output: https://vps.kimbruning.nl/productivity/Ai%20Productivity%20A... (pdf), https://vps.kimbruning.nl/productivity/ai_productivity_artic... (md)

+ Kimi K2.6 instant: https://www.kimi.com/share/19e2cc40-d012-89bf-8000-00006267f...

+ Summary of results: https://claude.ai/public/artifacts/10a42111-a0ee-42f3-b6d2-a... All of the models extracted the URLs into a table just fine, and that part at least is a lot easier than writing a Perl script used to be in the '90s ;-) . The first part is the important bit, so you as a human can "check your fucking sources". The second part the models handle variously; each does find discrepancies. None of them finds all of them, but that makes sense: this is a fairly polished piece and it ideally shouldn't have discrepancies at all to begin with.

So: it worked as a smoke check just fine in the above. Doing more than a quick smoke check obviously requires a somewhat more involved procedure.


I would love to do it at scale on many online publications, and publish the results. That would teach 'em.

Have we forgotten how bad LLMs were at citing sources when they first came out? So, we had to build a lot of structure (harness engineering) and frontier labs had to do specific training to try to compensate for this.

So, LLMs are inherently bad at citing sources. A lot of effort has been put in to improve this behavior, but it's compensating for an inherent flaw.


Huh? Oh! Were they still treating the LLM as an "oracle box"/online chatbot at the time? (as opposed to a more agentic workflow?)

If they weren't, ignore what follows, and please tell me what else was going wrong (and with what models and harnesses!).

Model weights are like Wikipedia: a nice starting point, but they should never be referenced directly. You need to have your agent actually go out onto the internet and do the research. Then the actual references will be in your agent's actual context (memory), so it'd at least be rather more surprising if it didn't cite correctly.

I do realize there are still corner cases even in the best setups, though, so a final crosscheck sweep is never a bad idea.


I mostly disagree with this. You can request sources, you can ask it to check, but no LLM I have used can do this correctly more than 50-75% of the time, and some of the major models are extremely bad at this: giving broken links 90% of the time, incapable of giving actual links rather than search engine links, etc. Constant supervision and repetition of requests can sometimes get results, but it is exhausting. The "sources" it finds are often Reddit posts or other questionable secondary or tertiary sources, not actual original sources.

I disagree. It is a bullshit machine all the way to the core. LLMs in my world fail to cite full sources and consistently conclude with guesses presented as facts. They do this much more than an average journalist or reporter would. Only when you double-check one will it apologize and correct itself.

Judging by the number of scientific papers that have been outed as AI-generated, precisely because they contained hallucinated sources, it's not.

Citation needed, please

Personal experience? You ask it for the name of the paper referenced. You google that paper (for some reason it's not great at going out and acquiring the paper). You then upload the PDF and, if the relevant passage isn't quickly findable via ^F, ask it whether the paper supports the assertion. You go read, ask it clarifying questions about hazard ratios, what they controlled for, etc.

AI is quite good when grounded in a source.


I like the idea of Graphene, but I worry my banking / brokerage apps wouldn't work anymore, and that'd be a deal breaker.

The Graphene community maintains a list of compatible banking apps.

Another possibility is to keep an old/cheap, stock Android phone at home with WiFi only for apps like this.


Doesn’t that defeat the point of using an app at all? Use a computer at that point.

No, because some apps are mobile only, and only work on phones "certified" by Google or Apple.

If you need mobile check deposit, you can only do that from a mobile device.
