https://gitlab.com/usecaliper/caliper-python-sdk
An LLM observability SDK that lets you store pre- and post-request metadata with every call, while keeping the SDK itself as lightweight as possible.
It stores data to S3 in batched JSON files, so it plugs easily into existing tooling such as DuckDB for analysis.
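To make the idea of batched JSON files concrete, here is a minimal sketch of grouping records into batches before upload. It is not Caliper's implementation: the function `batch_records`, the field names, and the choice of newline-delimited JSON are all assumptions made for illustration, and the S3 upload step is left out.

```python
import json
from typing import Any


def batch_records(records: list[dict[str, Any]], batch_size: int) -> list[str]:
    """Group records into newline-delimited JSON payloads, one per batch.

    Hypothetical helper: each returned string would become one file.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    payloads = []
    for start in range(0, len(records), batch_size):
        chunk = records[start:start + batch_size]
        # One JSON object per line, so each batch is a single small file.
        payloads.append("\n".join(json.dumps(r, sort_keys=True) for r in chunk))
    return payloads


# Hypothetical records: the field names are illustrative only.
records = [{"request_id": i, "model": "model-a"} for i in range(5)]
payloads = batch_records(records, batch_size=2)
```

Writing a few larger files rather than one object per call keeps the number of uploads low, and newline-delimited JSON is a layout that DuckDB's `read_json` function can read.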
It's designed to answer questions like: "How do different user tiers of my service rate these two different models and three different system prompts?" You can capture all the information required to answer this through the SDK, then run queries over the stored data to get the answers.
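As a rough illustration of that kind of question, the sketch below averages ratings per user tier, model, and system prompt using only the Python standard library. The record fields (`user_tier`, `model`, `system_prompt`, `rating`) and the sample values are hypothetical. They are not Caliper's actual schema, and in practice the same grouping could be done as a query in a tool like DuckDB.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records, shaped like metadata captured around each LLM call.
records = [
    {"user_tier": "free", "model": "model-a", "system_prompt": "prompt-1", "rating": 4},
    {"user_tier": "free", "model": "model-a", "system_prompt": "prompt-1", "rating": 2},
    {"user_tier": "pro", "model": "model-b", "system_prompt": "prompt-2", "rating": 5},
    {"user_tier": "pro", "model": "model-a", "system_prompt": "prompt-3", "rating": 3},
]


def average_rating(rows):
    """Return the mean rating for each (user_tier, model, system_prompt) group."""
    groups = defaultdict(list)
    for row in rows:
        key = (row["user_tier"], row["model"], row["system_prompt"])
        groups[key].append(row["rating"])
    return {key: mean(ratings) for key, ratings in groups.items()}


averages = average_rating(records)
for key, value in sorted(averages.items()):
    print(key, value)
```

The pre-request fields (tier, model, prompt) and the post-request field (rating) live in the same record, which is what makes a single group-by enough to answer the question.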