logoalt Hacker News

getoffmyyawn10/11/20240 repliesview on HN

I've found that the River Crossing puzzle is a great way to show how LLMs break down.

For example, I tested Gemini with several versions of the puzzle that are easy to solve because they don't have the restrictions such as the farmer's boat only being able to carry one passenger/item at a time.

Ask this version, "A farmer has a spouse, chicken, cabbage, and baby with them. The farmer needs to get them all across the river in their boat. What is the best way to do it?"

In my tests the LLMs nearly always assume that the boat has a carry-restriction and they come up with wild solutions involving multiple trips.