i think people are underestimating the potential here for agents building - it i...

jonplackett · on June 14, 2023

It was already quite easy to get GPT-4 to output json. You just append ‘reply in json with this format’ and it does a really good job.

GPT-3.5 was very haphazard though and needs extensive babysitting and reminding, so if this makes gpt3 better then it’s useful - it does have an annoying disclaimer though that ‘it may not reply with valid json’ so we’ll still have to do some sense checks into he output.

I have been using this to make a few ‘choose your own adventure’ type games and I can see there’s a TONNE of potential useful things.

ignite · on June 14, 2023

> You just append ‘reply in json with this format’ and it does a really good job.

It does an ok job. Except when it doesn't. Definitely misses a lot of the time, sometimes on prompts that succeeded on previous runs.

bel423 · on June 15, 2023

It literally does it everytime perfectly. I remember I put together an entire system that would validate the JSON against a zod schema and use reflection to fix it and it literally never gets triggered because GPT3.5-turbo always does it right the first time.

worik · on June 15, 2023

> It literally does it everytime perfectly. I remember I put together an entire system that would validate the JSON against a zod schema and use reflection to fix it and it literally never gets triggered because GPT3.5-turbo always does it right the first time.

Danger! There be assumptions!!

gpt-? is a moving target and in rapid development. What it does Tuesday, which it did not do on Monday, it may well not do on Wednesday

If there is a documented method to guarantee it, it will work that way (modulo OpenAI bugs - and now Microsoft is involved....)

What we had before, what you are talking of, was observed behaviour. An assumption that what we observed in the past will continue in the future is not something to build a business on

travisjungroth · on June 15, 2023

ChatGPT moves fast. The API version doesn’t seem to change except with the model and documented API changes.

whateveracct · on June 15, 2023

No it doesn't lol. I've seen it just randomly not use a comma after one array element, for example.

LanceJones · on June 15, 2023

Yep. Incorrect trailing commas ad nauseum for me.

thomasfromcdnjs · on June 15, 2023

Are you saying that it return only JSON before? I'm with the other commenters it was wildly variable and always at least said "Here is your response" which doesn't parse well.

travisjungroth · on June 15, 2023

If you want a parsable response, have it wrap that with ```. Include an example request/response in your history. Treat any message you can’t parse as an error message.

This works well because it has a place to put any “keep in mind” noise. You can actually include that in your example.

lmeyerov · on June 15, 2023

Yeah no

sheepscreek · on June 15, 2023

The solution that worked great for me - do not use JSON for GPT to agent communication. Use comma separated key=value, or something to that effect.

Then have another pure code layer to parse that into structured JSON.

I think it’s the JSON syntax (with curly braces) that does it in. So YAML or TOML might work just as well, but I haven’t tried that.

jacobsimon · on June 15, 2023

Coincidentally, I just published this JS library[1] over the weekend that helps prompt LLMs to return typed JSON data and validates it for you. Would love feedback on it if this is something people here are interested in. Haven’t played around with the new API yet but I think this is super exciting stuff!

[1] https://github.com/jacobsimon/prompting

golergka · on June 15, 2023

Looks promising! Do you do retries when returned json is invalid? Personally, I used io-ts for parsing, and GPT seems to be able to correct itself easily when confronted with a well-formed error message.

jacobsimon · on June 15, 2023

Great idea, I was going to add basic retries but didn’t think to include the error.

Any other features you’d expect in a prompt builder like this? I’m tempted to add lots of other utility methods like classify(), summarize(), language(), etc

bombela · on June 15, 2023

It's harder to form a tree with key value. I also tried the relational route. But it would always messup the cardinality (one person should have 0 or n friends, but a person has a single birth date).

sheepscreek · on June 15, 2023

You could flatten it using namespaced keys. Eg.

    {
      parent1: { child1: value }
    }

Becomes one of the following:

    parent1/child1=value
    parent1_child1=value
    parent1.child1=value

..you get the idea.

rubyskills · on June 15, 2023

It's also harder to stream JSON? Maybe I'm overthinking this.

cwxm · on June 14, 2023

even with gpt 4, it hallucinates enough that it’s not reliable, forgetting to open/close brackets and quotes. This sounds like it’d be a big improvement.

jonplackett · on June 14, 2023

Not that it matters now but just doing something like this works 99% of the time or more with 4 and 90% with 3.5.

It is VERY IMPORTANT that you respond in valid JSON ONLY. Nothing before or after. Make sure to escape all strings. Use this format:

{“some_variable”: [describe the variable purpose]}

SamPatt · on June 14, 2023

99% of the time is still super frustrating when it fails, if you're using it in a consumer facing app. You have to clean up the output to avoid getting an error. If it goes from 99% to 100% JSON that is a big deal for me, much simpler.

jonplackett · on June 14, 2023

Except it says in the small print to expect invalid JSON occasionally, so you have to write your error handling code either way

golergka · on June 15, 2023

If you're building an app based on LLMs that expects higher than 99% correctness from it, you are bound to fail. Negative scenarios workarounds and retries are mandatory.

davepeck · on June 14, 2023

Yup. Is there a good/forgiving "drunken JSON parser" library that people like to use? Feels like it would be a useful (and separable) piece?

golol · on June 14, 2023

Honestly, I suspect asking GPT-4 to fix your JSON (in a new chat) is a good drunken JSON parser. We are only scraping the surface of what's possible with LLMs. If Token generation was free and instant we could come up with a giant schema of interacting model calls that generates 10 suggestions, iterates over them, ranks them and picks the best one, as silly as it sounds.

andai · on June 15, 2023

That's hilarious... if parsing GPT's JSON fails, keep asking GPT to fix it until it parses!

golol · on June 15, 2023

It shouldn't be surprising though. If a human makes an error parsing JSON, what do you do? You make them look over it again. Unless their intelligence is the bottleneck they might just be able to fix it.

golergka · on June 15, 2023

It works. Just be sure to build a good error message.

hhh · on June 14, 2023

I already do this today to create domain-specific knowledge focused prompts and then have them iterate back and forth and a ‘moderator’ that chooses what goes in and what doesn’t.

8organicbits · on June 14, 2023

Wouldn't you use traditional software to validate the JSON, then ask chatgpt to try again if it wasn't right?

girvo · on June 15, 2023

In my experience, telling it "no thats wrong, try again" just gets it to be wrong in a new different way, or restate the same wrong answer slightly differently. I've had to explicitly guide it to correct answers or formats at times.

cjbprime · on June 15, 2023

Try different phrasing, like "Did your answer follow all of the criteria?".

whateveracct · on June 15, 2023

It forgets commas too

ztratar · on June 14, 2023

Nah, this was solved by most teams a while ago.

bel423 · on June 15, 2023

I feel like I’m taking crazy pills with the amount of people saying this is game changing.

Did they not even try asking gpt to format the output as json?

worik · on June 15, 2023

> I feel like I’m taking crazy pills....try asking gpt to format the output as json

You are taking crazey pills. Stop

gpt-? is unreliable! That is not a bug in it, it is the nature of the beast.

It is not an expert at anything except natural language, and even then it is an idiot savant

sethd · on June 14, 2023

I like to define a JSON schema (https://json-schema.org/) and prompt GPT-4 to output JSON based on that schema.

This lets me specify general requirements (not just JSON structure) inline with the schema and in a very detailed and structured manor.

seizethecheese · on June 15, 2023

In a production system, you don’t need easy to do most of the time, you need easy without fail.

pnpnp · on June 15, 2023

Ok, just playing devil's advocate here. How many FAANG companies have you seen have an outage this year? What's their budget?

I think a better way to reply to the author would have been "how often does it fail"?

Every system will have outages, it's just a matter of how much money you can throw at the problem to reduce them.

jrockway · on June 15, 2023

If 99.995% correct looks bad to users, wait until they see 37%.

muzani · on June 15, 2023

It's fine, but the article makes some good points why - less cognitive load for GPT and less tokens. I think the transistor to logic gate analogy makes sense. You can build the thing perfectly with transistors, but just use the logic gate lol.

reallymental · on June 14, 2023

Is there any publicly available resource replicate your work? I would love to just find the right kind of "incantation" for the gpt-3.5-t or gpt-4 to output a meaningful story arc etc.

Any examples of your work would be greatly helpful as well!

SamPatt · on June 14, 2023

I'm not the person you're asking, but I built a site that allows you to generate fiction if you have an OpenAI API key. You can see the prompts sent in console, and it's all open source:

https://havewords.ai/

devbent · on June 15, 2023

I have an open source project doing exactly this at https://www.generativestorytelling.ai/ GitHub link is on the main page!

bradly · on June 14, 2023

I could not get GPT-4 to reliably not give some sort of text response, even if was just a simple "Sure" followed by the JSON.

avereveard · on June 15, 2023

Pass in an agent message with "Sure here is the answer in json format:" after the user message. Gpt will think it has already done the preamble and the rest of the message will start right with the json.

rytill · on June 14, 2023

Did you try using the API and providing a very clear system message followed by several examples that were pure JSON?

bradly · on June 14, 2023

Yep. I even gave it a JSON schema file to use. It just wouldn't stop added extra verbage.

taylorfinley · on June 15, 2023

I just use a regex to select everything between the first and last curly bracket, reliable fixes the “sure, here’s your object” problem.

NicoJuicy · on June 14, 2023

Say it's a json API and may only reply with valid json without explanation.

bradly · on June 15, 2023

Lol yes of course I tried that.

dror · on June 15, 2023

I've had good luck with both:

https://github.com/drorm/gish/blob/main/tasks/coding.txt

and

https://github.com/drorm/gish/blob/main/tasks/webapp.txt

With the second one, I reliably generated half a dozen apps with one command.

Not to say that it won't fail sometimes.

NicoJuicy · on June 15, 2023

Combine both ? :)

throwuwu · on June 15, 2023

Just end your request with

‘’’json

Or provide a few examples of user request and then agent response in json. Or both.

clbrmbr · on June 15, 2023

Does the ```json trick work with the chat models? Or only the earlier completion models?

throwuwu · on June 15, 2023

Works with chat. They’re still text completion models under all that rlhf

majormajor · on June 14, 2023

GPT-4 was already a massive improvement on 3.5 in terms of replying consistently in a certain JSON structure - I often don't even need to give examples, just a sentence describing the format.

It's great to see they're making it even better, but where I'm currently hitting the limit still in GPT-4 for "shelling out" is about it being truly "creative" or "introspective" about "do I need to ask for clarifications" or "can I find a truly novel away around this task" type of things vs "here's a possible but half-baked sequence I'm going to follow".

fumar · on June 14, 2023

It is “good enough”. Where I struggle is maintaining its memory through a longer request where multiple iterations fail or succeed and then all of a sudden its memory is exceeded and starts fresh. I wish I could store “learnings” that it could revisit.

ehsanu1 · on June 15, 2023

Sounds like you want something like tree of thoughts: https://arxiv.org/abs/2305.10601

jimmySixDOF · on June 15, 2023

Interestingly the paper's repo starts off :

Blah Blah "...is NOT the correct implementation to replicate paper results. In fact, people have reported that his code cannot properly run, and is probably automatically generated by ChatGPT, and kyegomez has done so for other popular ML methods, while intentionally refusing to link to official implementations for his own interests"

Love a good GitHub Identity Theft Star farming ML story

But this method could have potential for a chain of function

lbeurerkellner · on June 14, 2023

It's interesting to think about this form of computation (LLM + function call) in terms of circuitry. It is still unclear to me however, if the sequential form of reasoning imposed by a sequence of chat messages is the right model here. LLM decoding and also more high-level "reasoning algorithms" like tree of thought are not that linear.

Ever since we started working on LMQL, the overarching vision all along was to get to a form of language model programming, where LLM calls are just the smallest primitive of the "text computer" you are running on. It will be interesting to see what kind of patterns emerge, now that the smallest primitive becomes more robust and reliable, at least in terms of the interface.

jillesvangurp · on June 15, 2023

Exactly, we humans can use specialized models and traditional tool APIs and models and orchestrate the use of all these without understanding how these things work in detail.

To do accounting, GPT 4 (or future models) doesn't have to know how to calculate. All it needs to know how to interface with tools like calculators, spreadsheets, etc. and parse their outputs. Every script, program, etc. becomes a thing that has such an API. A lot what we humans do to solve problems is breaking down big problems into problems where we know the solution already.

Real life tool interfaces are messy and optimized for humans with their limited language and cognitive skills. Ironically, that means they are relatively easy to figure out for AI language models. Relative to human language the grammar of these tool "languages" is more regular and the syntax less ambiguous and complicated. Which is why gpt 3 and 4 are reasonably proficient with even some more obscure programming languages and in the use of various frameworks; including some very obscure ones.

Given a lot of these tools with machine accessible APIs with some sort of description or documentation, figuring out how to call these things is relatively straightforward for a language model. The rest is just coming up with a high level plan and then executing it. Which amounts to generating some sort of script that does this. As soon as you have that, that in itself becomes a tool that may be used later. So, it can get better over time. Especially once it starts incorporating feedback about the quality of its results. It would be able to run mini experiments and run its own QA on its own output as well.

minimaxir · on June 14, 2023

"Trivial" is misleading. From OpenAI's docs and demos, the full ReAct workflow is an order of magnitude more difficult than typical ChatGPT API usage with a new set of constaints (e.g. schema definitions)

Even OpenAI's notebook demo has error handling workflows which was actually necessary since ChatGPT returned incorrect formatted output.

cjonas · on June 14, 2023

Maybe trivial isn't the right word, but it's still very straight-forward to get something basic, yet really powerful...

ReAct Setup Prompt (goal + available actions) -> Agent "ReAction" -> Parse & Execute Action -> Send Action Response (success or error) -> Agent "ReAction" -> repeat

As long as each action has proper validation and returns meaningful error messages, you don't need to even change the control flow. The agent will typically understand what went wrong, and attempt to correct it in the next "ReAction".

I've been refactoring some agents to use "functions" and so far it seems to be a HUGE improvement in reliability vs the "Return JSON matching this format" approach. Most impactful is that fact that "3.5-turbo" will now reliability return JSON (before you'd be forced to use GPT-4 for an ReAct style agent of modest complexity).

My agents also seem to be better at following other instructions now that the noise of the response format is gone (of course it's still there, but in a way it has been specifically trained on). This could also just be a result of the improvements to the system prompt though.

devbent · on June 15, 2023

For 3.5, I found it easiest to specify a simple, but parsable, format for responses and then convert that to JSON myself.

I'll have to see if the new JSON schema support is easier than what I already have in place.

quickthrower2 · on June 15, 2023

The first transistors were slow, and it seems this "GPT3/4 calling itself" stuff is quite slow. GPT3/4 as a direct chat is about as slow as I can take. Once this gets sped up.

I am sure it will, as you can scale out, scale up and build more efficient code and build more efficient architectures and "tool for the job" different parts of the process.

The problem now (using auto gpt, for example) is accuracy is bad, so you need human feedback and intervention AND it is slow. Take away the slow, or the needing human intervention and this can be very powerful.

I dream of the breakthrough "shitty old laptop is all you need" paper where they figure out how to do amazing stuff with a 1Gb of space on a spinny disk and 1Gb RAM and a CPU.

SimianLogic · on June 15, 2023

I agree with this. We’ve already gotten pretty good at json coercion, but this seems like it goes one step further by bundling decision making in to the model instead of junking up your prompt or requiring some kind of eval on a single json response.

It should also be much easier to cache these functions. If you send the same set of functions on every API hit, OpenAI should be able to cache that more intelligently than if everything was one big text prompt.

moneywoes · on June 14, 2023

Wow your brand is huge. Crazy growth. i wonder how much these subtle mentions on forums help

TeMPOraL · on June 14, 2023

They're the only one commenter on HN I noticed keeps writing "smol" instead of "small", and is associated with projects with "smol" in their name. Surely I'm not the only one who missed it being a meme around 2015 or sth., and finds this word/use jarring - and therefore very attention-grabbing? Wonder how much that helps with marketing.

This is meant with no negative intentions. It's just that 'swyx was, in my mind, "that HN-er that does AI and keeps saying 'smol'" for far longer than I was aware of latent.space articles/podcasts.

memefrog · on June 14, 2023

Personally, I associate "smol" with "doggo" and "chonker" and other childish redditspeak.

swyx · on June 15, 2023

and fun fact i used to work at Temporal too heheh.

swyx · on June 15, 2023

i mean hopefully its relevant content to the discussion, i hope enough pple know me here by now that i fully participate in The Discourse rather than just being here to cynically plug my stuff. i had a 1.5 hr convo with simon willison and other well known AI tinkerers on this exact thing, and so I shared it, making the most out of their time that they chose to share with me.

freezed8 · on June 14, 2023

100%, if the API itself can choose to call a function or an LLM, then it's way easier to build any agent loop without extensive prompt engineering + worrying about errors.

Tweeted about it here as well: https://twitter.com/jerryjliu0/status/1668994580396621827?s=...

bel423 · on June 15, 2023

You still have to worry about errors. You will probably have to add an error handler function that it can call out to. Otherwise the LLM will hallucinate a valid output regardless of the input. You want it to be able to throw an error and say I could produce the output given this format.

ftxbro · on June 14, 2023

> "you can now trivially make GPT4 decide whether to call itself again, or to proceed to the next stage."

Does this mean the GPT-4 API is now publicly available, or is there still a waitlist? If there's a waitlist and you literally are not allowed to use it no matter how much you are willing to pay then it seems like it's hard to call that trivial.

Tostino · on June 14, 2023

Not GP, but it's still the latter...i've been (im)patiently waiting.

From their blog post the other day: With these updates, we’ll be inviting many more people from the waitlist to try GPT-4 over the coming weeks, with the intent to remove the waitlist entirely with this model. Thank you to everyone who has been patiently waiting, we are excited to see what you build with GPT-4!

londons_explore · on June 14, 2023

If you put contact info in your HN profile - especially an email address that matches one you use to login to openai, someone will probably give you access...

Anyone with access can share it with any other user via the 'invite to organisation' feature. Obviously that allows the invited person do requests billed to the inviter, but since most experiments are only a few cents that doesn't really matter much in practice.

Tostino · on June 14, 2023

Good to know, but I've racked up a decent bill for just my GPT 3.5 use. I can get by with experiments using my ChatGPT Plus subscription, but I really need my own API access to start using it for anything serious.

bayesianbot · on June 14, 2023

"With these updates, we’ll be inviting many more people from the waitlist to try GPT-4 over the coming weeks, with the intent to remove the waitlist entirely with this model. Thank you to everyone who has been patiently waiting, we are excited to see what you build with GPT-4!"

https://openai.com/blog/function-calling-and-other-api-updat...

jarulraj · on June 15, 2023

Interesting observation, @swyx. There seems to be a connection to transitive closure in SQL queries, where the output of the query is fed as the input to the query in the next iteration [1]. We are thinking about how to best support such recursive functions in EvaDB [2].

[1] http://dwhoman.com/blog/sql-transitive-closure.html [2] https://evadb.readthedocs.io/en/stable/source/tutorials/11-s...

ilaksh · on June 14, 2023

The thing is the relevant context often depends on what it's trying to do. You can give it a lot of context in 16k but if there are too many different types of things then I think it will be confused or at least have less capacity for the actual selected task.

So what I am thinking is that some functions might just be like gateways into a second menu level. So instead of just edit_file with the filename and new source, maybe only select_files_for_edit is available at the top level. In that case I can ensure it doesn't try to overwrite an existing file without important stuff that was already in there, by providing the requested files existing contents along with the function allowing the file edit.

throwuwu · on June 15, 2023

Not sure that’s true. I haven’t completely filled the context with examples but I do provide 8 or so exchanges between user and assistant along with a menu of available commands and it seems to be able to generalize from that very well. No hallucinations either. Good idea about sub menus though, I’ll have to use that.

naiv · on June 14, 2023

I think big context only makes sense for document analysis.

For programming you want to keep it slim. Just like you should keep your controllers and classes slim.

Also people with 32k access report very very long response times of up to multiple minutes which is not feasible if you only want a smaller change or analysis.

babyshake · on June 14, 2023

What would be an example where there needs to be an arbitrary level of recursive ability for GPT4 to call itself?

swyx · on June 15, 2023

writing code of higher complexity (we know from CICERO that longer time spent on inference is worth orders of magnitude more than the equivalent in training when it comes to improving end performance), or doing real world tasks with unknown fractal depth (aka yak shave)

killingtime74 · on June 15, 2023

Who is Simon Willison? Is he big in AI?

swyx · on June 15, 2023

formerly cocreator of Django, now Datasette, but pretty much the top writer/hacker on HN making AI topics accessible to engineers https://hn.algolia.com/?dateRange=pastYear&page=0&prefix=tru...

killingtime74 · on June 15, 2023

Oh wow, nice! Big fan of his work

boringuser2 · on June 15, 2023

Do you people always have to overhype this shit?

delhanty · on June 15, 2023

Do you have to be nasty?

That's a person you're replying to with feelings, so why not default to being kind in comments as per HN guidelines?

As it happens, swyx has built notable AI related things, for example smol-developer

https://twitter.com/swyx/status/1657892220492738560

and it would be nice to be able to read his and other perspectives without having to read shallow, mean, dismissive replies such as yours.

boringuser2 · on June 15, 2023

[flagged]

dang · on June 15, 2023

Hey, I understand the frustration (both the frustration of endless links on an over-hyped topic, and the frustration of getting scolded by another user when expressing yourself) - but it really would be good if you'd post more in the intended spirit of this site (https://news.ycombinator.com/newsguidelines.html).

People sometimes misunderstand this, so I'd like to explain a bit. (It probably won't help, but it might, and I don't like flagging or banning accounts without trying to persuade people first if possible.)

We don't ask people to be kind, post thoughtfully, not call names, not flame, etc., out of nannyism or some moral thing we're trying to impose. That wouldn't feel right and I wouldn't want to be under Mary Poppins's umbrella either.

The reason is more like an engineering problem: we're trying to optimize for one specific thing (https://hn.algolia.com/?dateRange=all&page=0&prefix=true&sor...) and we can't do that if people don't respect certain constraints. The constraints are to prevent the forum from burning itself to a crisp, which is where the arrow of internet entropy will take us if we don't expend energy to stave it off.

It probably doesn't feel like you're doing anything particularly wrong, but there's a cognitive bias where everyone underestimates the damage they're causing (by say 10x) and overestimates the damage others are causing (by say 10x) and that compounds into a (by now 100x) bias where everyone feels like everyone else is the problem. We need a way out of that dynamic if we're to have any hope of keeping this place interesting. As you probably realize, HN is forever on the brink of caving into a pit. We need you to help nudge it back from that, not push it over.

Of course you're free to say "what do I care if HN burns itself to a crisp, fuck you all" but I'd argue you shouldn't take that nihilistic position because it isn't in your own interests. HN may be annoying at times, but it's interesting enough for you to spend time here—otherwise you wouldn't be reading the site and posting to it. Why not contribute to making it more interesting rather than destroying it for yourself and everyone else? (I don't mean that you're intentionally destroying it—but the way you've been posting is unintentionally contributing to that outcome.)

I'm sure you wouldn't drop lit matches in a dry forest, or dump motor oil in a mountain lake, trample flower gardens, or litter in a city park, for much the same reason. It's in your own interest to practice the same care for the commons here. Thanks for listening.

boringuser2 · on June 15, 2023

Thanks for the effort of explanation.

If someone were egregiously out of line, typically, I feel community sentiment reflects this.

Personally, I feel your assessment of cognitive bias at play is way off base. I don't think it's a valid comparison to claim that someone is causing "damage" by merely expressing distaste. That's a common tool that humans use for social feedback. Is cutting off the ability for genuine social feedback or adjustment and forcing people to be saccharine out of fear of reprisal from the top really an optimal solution to an engineering problem? It seems more like a simulacrum of an HR department where the guillotine is more real: your job and life rather than merely your ability to share your thoughts on a corner of the Internet.

Think about the engineering problem you find yourself in with this state of affairs: something very similar to the kind of content you might find on LinkedIn, a sort of circular back-patting engine devoid of real challenge and grit because of the aforementioned guillotine under which all participants hang.

And, quite frankly, you do see the effects of this in precisely the post in this initial exchange: hyperbole and lack of deep critical assessment are artificially inflated. This isn't a coincidence: this has been cultured very specifically by the available growing conditions and the starter used -- saccharine hall monitors that fold like cheap suits (e.g. very poorly, lots of creases) when the lowest level of social challenge is raised fo their ideas.

You know what it really feels like? A Silicon Valley reconstruction of all the bad things about a workplace, not a genuine forum for debate and intellectual exploration. If you want to find a place to model such behavior, the Greeks already have you figured out - how do you think Diogenes would feel about human resources?

That being said, I appreciate the empathy.

Obviously, I feel a bit like a guy Tony Soprano beat up and being forced to apologize afterwards to him for bruising his knuckles.

dang · on June 15, 2023

Not to belabor the point but from my perspective you've illustrated the point about cognitive bias: it always feels like the other person started it and did worse ("I feel a bit like a guy Tony Soprano beat up and being forced to apologize afterwards to him for bruising his knuckles") and it always feels like one was merely defending oneself reasonably ("merely expressing distaste"). This is the asymmetry I'm talking about.

As you can imagine, mods get this kind of feedback all the time from all angles. The basic learning it adds up to is that everybody always feels this way. Therefore those feelings are not a reliable compass to navigate by.

This is not a criticism—I appreciate your reply!

Edit:

> forcing people to be saccharine [...] like a simulacrum of an HR department

We definitely don't want that and the site guidelines don't call for that. There is tons of room to make your substantive points thoughtfully without being saccharine. It can take a little bit of reflective work to find that room, though, just because we (humans in general) tend to get locked into binary oppositions.

The best principle to go by is just to ask yourself: is what I'm posting part of a curious conversation? That's the intended spirit of the site. It's possible to tell if you (I don't mean you personally, I mean all of us) are functioning in the range of curiosity and to refrain from posting if you aren't.

It is true that the HN guidelines bring a bit of blandness to discourse because they eliminate the rough-and-tumble debate that can work well in much smaller groups of close peers. But that's because that kind of debate is impossible in a large public forum like HN—it just degenerates immediately into dumb brawls. I've written about this quite a bit if you or anyone wants to read about that:

https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que... (I like that analogy)

https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...

boringuser2 · on June 15, 2023

I think your argument is reasonable from a logical perspective, and I would generally make a similar argument as I would find the template quite persuasive.

However, I, again, feel you're improperly pushing shapes into the shape-board again. Of course, understanding cognitive bias is a fantastic tool to improve human behavior from an engineering perspective, and your argumentum ad numerum is sound.

That being said, you're focusing too much on what my emotional motivation might be rather than looking at the system - do you really think there isn't an element of that dynamic I outlined in an interaction like this? Of course there is.

Anyhow, you know, I don't have the terminology in my back-pocket, but there's definitely a large blind-spot when someone is ignoring the spirit of intellectual curiosity in a positive light rather than a negative one.

In this case, don't you think a tool like mild negative social feedback might be a useful mechanism? Of course, there's a limit, and if such a person were incapable of further insight, they'd probably not be very useful conversants. That's obviously not happening here.

One final thing is relevant here - you just hit on a pretty important point. There is a grit to a certain type of discourse that is actually superior to this discourse, I'd happily accept that point. Why not just transfer the burden of moderation to that point, rather than what you perceive to be the outset? Surely, you'll greatly reduce your number of false positives.

I provide negative social feedback sometimes because I feel it's appropriate. In the future, I probably won't. That being said, it's obvious that I've never sparked a thoughtless brawl, so the tolerance is at least inappropriately adjusted sufficiently to that extent.

throwuwu · on June 15, 2023

What’s your problem? There’s nothing overhyped about that comment. People, including me, are building complex agents that can execute multi stage prompts and perform complex tasks. Comparing these first models to a basic unit of logic is more than fair given how much more capable they are. Do you just have an axe to grind?

boringuser2 · on June 15, 2023

[flagged]

pyinstallwoes · on June 15, 2023

How is it inappropriate? How is it not building?