
codelion yesterday at 11:30 AM

The DAG feature for subjective metrics sounds really promising. I've been struggling with the same "good email" problem. Most of the existing benchmarks are too rigid for nuanced evaluations like that. Looking forward to seeing how that part of DeepEval evolves.


Replies

jeffreyip yesterday at 5:43 PM

Definitely. Feel free to join our Discord if you have any questions about it: https://discord.com/invite/a3K9c8GRGt