
softwaredoug, last Wednesday at 4:28 PM

It’s the mean, at least in Lucene. Using the median would be an interesting experiment.

Do you know of a search dataset with very large differences in document length? MS MARCO, for example, is pretty consistent in length.
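A rough sketch of what that experiment might look like, assuming the average in question is the average document length in BM25's length normalization (the thread context isn't shown here, so that is a guess). The corpus lengths are made up, `bm25_length_norm` is a hypothetical helper rather than anything from Lucene, and `b = 0.75` is just the commonly used default:

```python
import statistics

def bm25_length_norm(doc_len, avg_len, b=0.75):
    """BM25 length-normalization factor: 1 - b + b * (doc_len / avg_len).

    Hypothetical helper for illustration; not Lucene's implementation.
    """
    return 1.0 - b + b * (doc_len / avg_len)

# Made-up corpus: mostly short documents plus one very long outlier.
doc_lengths = [100, 110, 90, 105, 95, 10_000]

mean_len = statistics.mean(doc_lengths)      # pulled up by the outlier
median_len = statistics.median(doc_lengths)  # stays near the typical length

for name, avg in (("mean", mean_len), ("median", median_len)):
    norm = bm25_length_norm(100, avg)
    print(f"{name}: avg_len={avg:.1f}, norm for a 100-token doc={norm:.3f}")
```

With a skewed corpus like this, the mean makes a typical 100-token document look very short, so its normalization factor drops well below 1, while the median keeps it close to 1. On a dataset with consistent lengths the two would barely differ, which is why a dataset with large length differences matters for testing this.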