I spent at least 10 hours testing it yesterday. It was a big relief when the number badge incremented, telling me that someone had commented on this post. Thank you.
To me the most interesting thing is the different red team adversary agents I'm using. There is a Jony Ive design critic agent, which is surprisingly good; a red team agent that does normal code review and bug hunting by injecting logging into the code and running it in isolation in the /tmp/ folder; a red team agent that code reviews and finds bugs in the test harnesses; and an agent that does mutation testing by breaking the code, creating regressions to make sure that the test harness catches them. I wanted to call that last one the trickster agent but didn't want to drift away from terms that are well represented in the model's training data.
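For anyone unfamiliar with mutation testing, here is a minimal sketch of the idea in Python. It is not my agent's actual implementation: `unread_count`, the three harness checks, and the mutation list are all hypothetical, made up for illustration. Each mutant is a deliberately broken copy of the code; a good harness should fail on ("kill") every one of them.

```python
# Hypothetical code under test, kept as source text so it can be mutated.
ORIGINAL_SOURCE = """
def unread_count(replies, seen):
    return len([r for r in replies if r not in seen])
"""

# (name, text to find, replacement) -- each entry introduces one regression.
MUTATIONS = [
    ("invert membership", "if r not in seen", "if r in seen"),
    ("off by one", "return len(", "return 1 + len("),
    ("special-case 99", "if r not in seen", "if r not in seen or r == 99"),
]


def load(source):
    """Compile a source string and return its unread_count function."""
    namespace = {}
    exec(source, namespace)
    return namespace["unread_count"]


def harness_passes(fn):
    """A tiny, hypothetical test harness: True if every check passes."""
    try:
        assert fn([1, 2, 3], {1}) == 2
        assert fn([], set()) == 0
        assert fn([5], {5}) == 0
        return True
    except Exception:
        # A failed assertion or a crash both count as the harness catching it.
        return False


def run_mutation_tests():
    """Apply each mutation to the original and report killed vs. survived."""
    assert harness_passes(load(ORIGINAL_SOURCE)), "harness must pass unmutated"
    results = {}
    for name, old, new in MUTATIONS:
        assert old in ORIGINAL_SOURCE, f"mutation {name!r} does not apply"
        mutant = ORIGINAL_SOURCE.replace(old, new, 1)
        results[name] = "survived" if harness_passes(load(mutant)) else "killed"
    return results


if __name__ == "__main__":
    for name, outcome in run_mutation_tests().items():
        print(f"{name}: {outcome}")
```

In this sketch the third mutant survives, because no check ever passes a reply with id 99. A surviving mutant is the useful signal: it points at a gap in the harness rather than a bug in the code.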
I did a huge amount of experimentation last week and found that if a model misses a bug or gets something wrong, running an adversary agent using the same model or family of models will not surface it. Everyone has that intuition, but I can describe why using data. So Claude writes the code, which is orders of magnitude better than any project I inherited in the past 15 years, and I'd have ChatGPT run all the adversaries.
Surfacing replies to posts and comments requires a huge number of requests, so I needed to figure out the optimal request rate based on the frequency of replies over time. Few posts get replies after a week, so there isn't any reason to surface them. After analysis, I can conclude that a request every 5 minutes in the background is enough. That works out to 288 (pollComments) + 144 (author-sync) = 432 requests/day per user. I spent a couple of hours on that. Actually, I started with the Hacker News API and then realized that I should check https://hn.algolia.com/api, but I wanted to know which is optimal, including using both. After experimentation and research I found that ~432 requests a day at Algolia is enough.
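The arithmetic behind that number, as a small Python sketch. The 5-minute interval for pollComments is the one stated above; the 10-minute interval for author-sync is an assumption, inferred from the 144/day figure. The constant and function names are only illustrative.

```python
MINUTES_PER_DAY = 24 * 60

# Stated above: one pollComments request every 5 minutes.
POLL_COMMENTS_INTERVAL_MIN = 5
# Assumption: 144 requests/day implies author-sync runs every 10 minutes.
AUTHOR_SYNC_INTERVAL_MIN = 10


def requests_per_day(interval_minutes):
    """Number of requests made in 24 hours at one request per interval."""
    return MINUTES_PER_DAY // interval_minutes


poll_comments = requests_per_day(POLL_COMMENTS_INTERVAL_MIN)  # 288
author_sync = requests_per_day(AUTHOR_SYNC_INTERVAL_MIN)      # 144
total_per_user = poll_comments + author_sync                  # 432

print(f"{poll_comments} + {author_sync} = {total_per_user} requests/day per user")
```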