Can anyone more familiar with lambda calculus speculate why all models fail to implement fft? There are gazzilion fft implementations in various languages over the web and the actual cooley-tukey algorithm is rather short.
Pure lambda calculus doesn’t have numbers, and FFT, as traditionally specified, needs real numbers or a decent approximation of them. There are also Fourier transforms over a ring. So the task needs to specify what kind of numbers are being Fourier transformed.
But the prompt is written very tersely written. It’s not immediately obvious to me what it’s asking. I even asked ChatGPT what number system was described by the relevant part of the prompt and it thought it was ambiguous. I certainly don’t really know what the prompt is asking.
Adding insult to injury, I think the author is trying to get LLMs to solve these in one shot with no tools. Perhaps the big US models are tuned a little bit for writing working code in their extremely poor web chat environments (because the companies are already too unwieldy to get their web sites in sync with the rest of their code), but I can’t imagine why a company like DeepSeek would use their limited resources to RL their model for its coding performance in a poor environment when their users won’t actually do this.
I can guess:
Pure lambda calculus doesn’t have numbers, and FFT, as traditionally specified, needs real numbers or a decent approximation of them. There are also Fourier transforms over a ring. So the task needs to specify what kind of numbers are being Fourier transformed.
But the prompt is written very tersely written. It’s not immediately obvious to me what it’s asking. I even asked ChatGPT what number system was described by the relevant part of the prompt and it thought it was ambiguous. I certainly don’t really know what the prompt is asking.
Adding insult to injury, I think the author is trying to get LLMs to solve these in one shot with no tools. Perhaps the big US models are tuned a little bit for writing working code in their extremely poor web chat environments (because the companies are already too unwieldy to get their web sites in sync with the rest of their code), but I can’t imagine why a company like DeepSeek would use their limited resources to RL their model for its coding performance in a poor environment when their users won’t actually do this.