Very interesting, but the slight issue I see here is one of data: the information recorded in the training data here is heavily skewed toward those intelligent/recognized enough to have recorded it and had it preserved - a far cry from the "everyone can trivially document their thoughts and life" diorama of information we have today to train LLMs on. I suspect that a frontier model today has 50+ TB of training data in the form of text alone - several orders of magnitude more information, and from a much more diverse set of viewpoints, than what would have survived from that period. The output from the question "what happened in 1834" reads like a newspaper/bulletin, which is likely because newspapers and the like make up a huge part of the data that was digitized.
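To put rough numbers on the "orders of magnitude" point, here's a back-of-envelope sketch. The 50 GB figure for surviving digitized period text is purely an assumption for illustration, not a measured number, and the variable names are just mine:

    import math

    modern_corpus_bytes = 50e12    # "50+ TB" of modern training text, per the estimate above
    surviving_period_bytes = 50e9  # assumed ~50 GB of surviving digitized period text (pure guess)

    ratio = modern_corpus_bytes / surviving_period_bytes
    print(f"~{ratio:.0f}x more text, i.e. ~{math.log10(ratio):.1f} orders of magnitude")
    # with these assumed numbers: ~1000x more text, i.e. ~3.0 orders of magnitude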
Still, a very cool concept, but it definitely has some bias.
Biases exposed through artificial constraints help to make visible the hidden/obscured/forgotten biases of state-of-the-art systems.
Models today will be biased based on what's in their training data. If trained on English, they'll be biased heavily toward Western, post-1990s views. Then the suppliers do alignment training that forces the models to speak according to the supplier's morals. When I used them years ago, that was Progressive, atheist, evolutionist, and CRT.
So, the OP model will accidentally reflect the biases of its time. The current commercial models intentionally reflect specific biases. The exception is uncensored models, which accidentally reflect whatever biases are in the training data, as modified by the uncensoring set.
> but it definitely has some bias.
To be frank though, I think this is a better approach than having all people's thoughts all of the time.
I think the "crowd" of information makes the end output of an LLM worse rather than better. Specifically in our inability to know really what kind of Bias we're dealing with.
Right now it feels really muddy to me trying to work out how the information is biased, beyond just the hallucinations and factual inconsistencies.
But as far as I can tell, correctness of the content aside, frontier LLMs sometimes respond like freshman college students, other times with the rigor of a mathematics PhD candidate, and sometimes like a marketing hit piece.
This dataset has a consistency that I think is actually a really useful feature. I agree that having many perspectives in the dataset is good, but as an end user, being able to rely on some level of consistency from an AI model is something I really think is missing.
Put more succinctly: I want frontier LLMs to have a known, specific response style and bias that I can rely on, because there is already a lot of noise.