A fun game for playing moral dilemmas with friends. I gave 12 AI agents zodiac personalities (not that I believe in them) using the same LLM with different personality prompts.
A related trick - if you want to teach your agent a specific kind of behavior, and want this behavior to be calibrated and safe, what you can do is:
1. enumerate the actions (policies) your agent takes, collected from prior runs
2. infer the states that correspond to each of these policies, make a state atlas (similar to the zodiac here)
3. infer the maximally discriminative features that can identify the state from current context
4. label a few examples and train a small policy model that predicts your action from those state features
I think LLMs should be used more often like this: as feature extractors for toy models, which can then be used like tools. This way you can encode arbitrary logic in a small tool model that does not depend on the biases of the base model. For example, this setup could power a "skill" that reliably implements your policy.
The trick here is that you carefully identify states that predict policy reliably, and features that distinguish between states, instead of using embeddings or pure LLM reasoning. You can decouple the logic from the feature extraction, and have it calibrated to your goals.
All 4 steps can be done by a coding agent under your supervision, with zero coding on your part. It's the LLM as a generic feature extractor, with small models sitting on top.
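A minimal sketch of the four steps above, under loud assumptions: the feature names, the example contexts, and the action labels are all made up for illustration, and a keyword check stands in for the LLM call that would normally answer each feature question. The small model here is a Bernoulli naive Bayes written in plain Python; any small classifier could take its place.

```python
import math
from collections import Counter, defaultdict

# Hypothetical state features (step 3). In the real setup an LLM would answer
# these yes/no questions about the context; keywords stand in for that call
# so the sketch runs offline.
FEATURES = {
    "mentions_deadline": ("deadline", "urgent", "asap"),
    "mentions_risk": ("risk", "unsafe", "danger"),
    "asks_question": ("?",),
}

def extract_features(context: str) -> tuple:
    """Stand-in for the LLM feature extractor: one boolean per feature."""
    text = context.lower()
    return tuple(any(k in text for k in kws) for kws in FEATURES.values())

class TinyPolicy:
    """Bernoulli naive Bayes with Laplace smoothing over boolean features."""

    def fit(self, rows, actions):
        self.n_features = len(rows[0])
        self.action_counts = Counter(actions)
        self.true_counts = defaultdict(lambda: [0] * self.n_features)
        for row, action in zip(rows, actions):
            for i, value in enumerate(row):
                self.true_counts[action][i] += int(value)
        return self

    def predict_proba(self, row):
        total = sum(self.action_counts.values())
        log_scores = {}
        for action, n in self.action_counts.items():
            score = math.log(n / total)
            for i, value in enumerate(row):
                p_true = (self.true_counts[action][i] + 1) / (n + 2)
                score += math.log(p_true if value else 1 - p_true)
            log_scores[action] = score
        # Normalize in a numerically stable way.
        top = max(log_scores.values())
        exp = {a: math.exp(s - top) for a, s in log_scores.items()}
        z = sum(exp.values())
        return {a: e / z for a, e in exp.items()}

    def predict(self, row):
        probs = self.predict_proba(row)
        return max(probs, key=probs.get)

# Steps 1 and 4: a few labeled (context, action) examples, all hypothetical.
LABELED = [
    ("Deadline is tomorrow, ship it asap", "act_now"),
    ("Urgent: customer needs this today", "act_now"),
    ("This looks unsafe, there is a real risk of data loss", "escalate"),
    ("Danger: the migration may drop a table", "escalate"),
    ("What does this flag do?", "answer"),
    ("How do I run the tests?", "answer"),
]

policy = TinyPolicy().fit(
    [extract_features(context) for context, _ in LABELED],
    [action for _, action in LABELED],
)
print(policy.predict(extract_features("This is unsafe")))
```

The point of the split is that `extract_features` is the only part that touches the LLM, while the decision logic lives in `TinyPolicy` and can be inspected, retrained, or thresholded on `predict_proba`. Whether those probabilities are actually calibrated would still need checking against held-out labeled examples.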
I use Hacker News commenters.
There was someone a while ago who made a funny post about the types of Hacker News commenters. So I have 5 of them that review things, and they ended up being way more effective than I ever imagined they'd be.
│ contrarian-provocateur-roaster │ Challenge premises, explore alternatives │ "Have you considered..." │
│ enthusiastic-newcomer-roaster │ Accessibility, onboarding friction │ "Wait, how do I even..." │
│ pragmatic-builder-roaster │ Operational reality, production concerns │ "This won't survive 3AM pages" │
│ skeptical-senior-roaster │ Long-term maintenance, sustainability │ "Who maintains this in 2 years?" │
│ well-actually-pedant-roaster │ Terminology precision, verifiability │ "Technically, that's not..." │
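One way the five reviewers could be wired up, as a rough sketch: the names, focus areas, and opening lines come from the table, while the prompt wording and the `build_review_prompt` helper are assumptions for illustration, not the commenter's actual setup.

```python
# Reviewer personas taken from the table: name -> (focus, signature opener).
REVIEWERS = {
    "contrarian-provocateur-roaster": (
        "Challenge premises, explore alternatives", "Have you considered..."),
    "enthusiastic-newcomer-roaster": (
        "Accessibility, onboarding friction", "Wait, how do I even..."),
    "pragmatic-builder-roaster": (
        "Operational reality, production concerns", "This won't survive 3AM pages"),
    "skeptical-senior-roaster": (
        "Long-term maintenance, sustainability", "Who maintains this in 2 years?"),
    "well-actually-pedant-roaster": (
        "Terminology precision, verifiability", "Technically, that's not..."),
}

def build_review_prompt(name: str, document: str) -> str:
    """Build a persona-specific review prompt (hypothetical wording)."""
    focus, opener = REVIEWERS[name]
    return (
        f"You are '{name}', a Hacker News commenter.\n"
        f"Review focus: {focus}.\n"
        f"Typical opening line: \"{opener}\"\n"
        f"Review the following and list concrete objections:\n\n{document}"
    )

# Same document, five different review prompts for the same underlying model.
prompts = {name: build_review_prompt(name, "Draft design doc goes here.")
           for name in REVIEWERS}
print(prompts["skeptical-senior-roaster"])
```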
That is a fun experiment that could be interesting when applied to all sorts of things.
Imagine being captain of a ship and using the same AI with different profiles as background. E.g. what's your opinion on data based on a geologist profile, vs. a profile based on some other profession...
this is fantastic. Really interesting to see which signs decided what.
From the HN discussion of "Motive.c: The Soul of the Sims (1997) (donhopkins.com)":
https://news.ycombinator.com/item?id=14997725
https://www.donhopkins.com/home/images/Sims/
https://news.ycombinator.com/item?id=15002840
DonHopkins on Aug 13, 2017, on: Motive.c: The Soul of the Sims (1997)
The trick of optimizing games is to off-load as much of the simulation as possible from the computer into the user's brain, which is MUCH more powerful and creative. Implication is more efficient (and richer) than simulation.
During development, when we first added Astrological signs to the characters, there was a discussion about whether we should invent our own original "Sim Zodiac" signs, or use the traditional ones, which have a lot of baggage and history (which some of the designers thought might be a problem).
Will Wright argued that we actually wanted to leverage the baggage and history of the traditional Astrological signs of the Zodiac, so we should just use those and not invent our own.
The way it works is that Will came up with twelve archetypal vectors of personality traits corresponding to each of the twelve Astrological signs, so when you set their personality traits, it looks up the sign with the nearest Euclidean distance to the character's personality, and displays that as their sign. But there was absolutely no actual effect on their behavior.
That decision paid off almost instantly and measurably in testing, after we implemented the user interface for showing the Astrological sign in the character creation screen, without writing any code to make their sign affect their behavior: The testers immediately started reporting bugs that their character's sign had too much of an effect on their personality, and claimed that the non-existent effect of astrological signs on behavior needed to be tuned down. But that effect was totally coming from their imagination!
They should call them Astrillogical Signs!
DonHopkins on Aug 13, 2017
The create-a-sim user interface hid the corresponding astrological sign for the initial all-zero personality you first see before you've spent any points, because that would be insulting to 1/12th of the players (implying [your sign] has zero personality)!
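The lookup described in the quoted comments is a nearest-neighbor search over archetype vectors. Below is a small sketch of that idea, not the game's actual code: the trait names are assumed for illustration, the archetype values are placeholders rather than the game's real numbers, and only three signs are included to keep it short.

```python
import math

# Trait order for every vector below (names assumed for illustration).
TRAITS = ("neat", "outgoing", "active", "playful", "nice")

# PLACEHOLDER archetype vectors: invented values, NOT the game's data.
ARCHETYPES = {
    "Aries": (2, 8, 8, 4, 3),
    "Taurus": (6, 3, 3, 5, 8),
    "Gemini": (3, 8, 5, 8, 1),
}

def nearest_sign(personality):
    """Return the sign whose archetype is closest in Euclidean distance.

    Returns None for the all-zero personality, mirroring the UI choice of
    hiding the sign before any points have been spent.
    """
    if not any(personality):
        return None
    return min(ARCHETYPES, key=lambda sign: math.dist(ARCHETYPES[sign], personality))

print(nearest_sign((5, 3, 2, 5, 9)))
print(nearest_sign((0, 0, 0, 0, 0)))
```

Note that the sign is purely a display label derived from the traits; nothing in the sketch feeds it back into behavior, which matches the point of the anecdote.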
I somehow assumed this was going to be about the Zodiac killer and was really confused.
I don’t see the point of using zodiacs. Might as well use any kind of personality test like Myers-Briggs.
I think my takeaway is that you are mostly seeing mode collapse here. There is high consistency across all of the supposedly different personalities (higher than the naive count would indicate: remember that the stochastic nature of responses will inflate the number of 'different' responses, since OP doesn't say anything about sampling a large number of times to get the true response).
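To make the sampling point concrete, here is a hedged sketch of how one might check it: sample each personality many times, take its most common answer, and count how many distinct modal answers remain. The `fake_llm` sampler, the answer strings, and the 80/20 split are all invented stand-ins; a real check would call the model with each personality prompt instead.

```python
import random
from collections import Counter

def modal_answers(sample_fn, personas, n_samples=50):
    """Sample each persona n_samples times.

    Returns {persona: (most common answer, share of samples with that answer)}.
    """
    result = {}
    for persona in personas:
        counts = Counter(sample_fn(persona) for _ in range(n_samples))
        answer, hits = counts.most_common(1)[0]
        result[persona] = (answer, hits / n_samples)
    return result

def distinct_modal_answers(modes):
    """Number of different modal answers across all personas."""
    return len({answer for answer, _ in modes.values()})

# Hypothetical stand-in for the LLM: every persona draws from the SAME
# distribution, i.e. complete mode collapse plus sampling noise.
rng = random.Random(0)

def fake_llm(persona):
    return "pull the lever" if rng.random() < 0.8 else "do nothing"

personas = ["Aries", "Taurus", "Gemini", "Cancer"]
single_shot = {p: fake_llm(p) for p in personas}   # one noisy sample each
modes = modal_answers(fake_llm, personas)          # repeated sampling
print(len(set(single_shot.values())), distinct_modal_answers(modes))
```

With one sample per personality, noise alone can make identical personalities look different; comparing modal answers over many samples removes most of that inflation.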