Hacker News
TypingOutBugs
on Feb 2, 2025
on:
Recent results show that LLMs struggle with compos...
I used a Knights and Knaves puzzle generator last month to test 4o and Claude 3.5, and all of them failed on novel puzzles.
optimalsolver
on Feb 4, 2025
Hey, I'm interested in the details of this. How many persons were in the puzzle? Did it include nested statements, conditionals, and such?
If the puzzle generator is hosted anywhere, I'd love to have a look at it.