Hacker News

Xmd5a · last Wednesday at 8:54 PM

I guess this really depends on the problem but from the PromptWizard (PW) paper:

    | Approach | API calls | IO Tokens | Total tokens  | Cost ($) |
    |----------|-----------|-----------|---------------|----------|
    | Instinct | 1730      | 67        | 115910        | 0.23     |
    | InsZero  | 18600     | 80        | 1488000       | 2.9      |
    | PB       | 5000      | 80        | 400000        | 0.8      |
    | EvoP     | 69        | 362       | 24978         | 0.05     |
    | PW       | 69        | 362       | 24978         | 0.05     |

They ascribe this gain in efficiency to a balance between exploration and exploitation: a first phase of instruction mutation, followed by a phase where the instruction and the few-shot examples are optimized jointly. They also rely on "textual gradients" (criticism enhanced by chain-of-thought reasoning), as well as on synthesizing examples and counter-examples.
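As a quick sanity check (my own arithmetic, not a claim from the paper), the total-token column is just API calls × IO tokens per call:

```python
# Verify Total tokens = API calls * IO tokens for each row of the table above.
rows = {
    "Instinct": (1730, 67, 115910),
    "InsZero": (18600, 80, 1488000),
    "PB": (5000, 80, 400000),
    "EvoP": (69, 362, 24978),
    "PW": (69, 362, 24978),
}
for name, (api_calls, io_tokens, total) in rows.items():
    assert api_calls * io_tokens == total, name
```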

What I gathered from reading those papers plus some others is that textual feedback, i.e. using an LLM to reason about how to carry out each step of the optimization, is what gives structure to the search space.
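A minimal sketch of what one "textual gradient" step could look like (this is my own illustration, not the actual PromptWizard implementation; `llm` is a hypothetical text-completion callable you supply):

```python
# One "textual gradient" step: the LLM first criticizes the current
# instruction on failing cases (the critique plays the role of the
# gradient), then rewrites the instruction to address that critique.
# `llm` is a hypothetical completion function: str -> str.

def textual_gradient_step(llm, instruction, failures):
    critique = llm(
        "Instruction: " + instruction + "\n"
        "Failing cases: " + "; ".join(failures) + "\n"
        "Think step by step about why the instruction fails on these "
        "cases, then state one concrete improvement."
    )
    revised = llm(
        "Original instruction: " + instruction + "\n"
        "Critique: " + critique + "\n"
        "Rewrite the instruction to address the critique. "
        "Return only the new instruction."
    )
    return revised
```

Iterating this step on the worst-performing examples is what turns free-form criticism into a directed search over prompts.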


Replies

sgt101 · last Wednesday at 9:15 PM

Super interesting.

I will have to read it. I'll be looking to figure out whether the tasks they are working on are significant/realistic, and whether the improvements they find are robust.
