Netflix are right that 5 stars is too many. It translates to a 6-point scale when you include non-rating, and I don't think there is a consistent view on what "3 stars" means, or how it's different from either 4 stars or 2 stars (depending on the person).
For some people 3 stars is an acceptable rating, closer to 4 stars than 2 stars. For others, 3 stars is a bad rating, closer to 2 stars than 5 stars. And for others still, it doesn't give any signal beyond what a non-rating would: it's "I don't have a strong opinion about this".
Effectively chopping out the 3-star rating leaves it with a better scale of:
- Excellent, I want to put effort into seeking out similar content
- Fine, I'd be happy to watch more like it
- Bad, I didn't enjoy this
- Terrible, I want to put effort into avoiding this
With the implicit: - I have no opinion on this
But since it's not a survey, it doesn't need to be explicit; that's coded into not rating it instead.
These are comparable to a 5-point Likert scale:
"I enjoy this content"
- Strongly agree
- Agree
- Neither Agree nor Disagree
- Disagree
- Strongly Disagree
The current Netflix scale effectively merges Disagree and Strongly Disagree, and for matters of taste that may well be fine.
It would be interesting to conduct social science research with a similar scale that merges Disagree and Strongly Disagree, to see if that gives any better consistency.
> The current Netflix scale effectively merges Disagree and Strongly Disagree, and for matters of taste that may well be fine.
I'm a bit skeptical about this.
To me there's a big difference between "This didn't spark joy" and "I actively hated this": I might dislike a poorly-made sequel of a movie I previously enjoyed, but I never ever want to see baby seals getting clubbed to death again.
Every series has that one bad episode you have to struggle through during a full rewatch. Very few series have an episode bad enough that it'll make you quit watching the series entirely, and ruin any chance at a future rewatch.
When given a 5-star choice “very bad/bad/ok-ish/good/very good”, I rarely pick one of the extremes.
I suspect there are others who rarely click “bad” or “good”.
Because of that, I think you first need to train a model on scaling each user’s judgments to a common unit. That likely won’t work well for users that you have little data on.
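To make "scaling each user's judgments to a common unit" concrete, here is a minimal sketch using per-user z-scores. This is only one possible approach and not something I know Netflix to use; the function name `normalize_per_user` and the `(user, item, stars)` tuple layout are made up for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def normalize_per_user(ratings):
    """Rescale star ratings to per-user z-scores.

    ratings: list of (user, item, stars) tuples (hypothetical layout).
    Returns a list of (user, item, z) tuples in the same order.
    """
    stars_by_user = defaultdict(list)
    for user, _item, stars in ratings:
        stars_by_user[user].append(stars)

    stats = {}
    for user, values in stars_by_user.items():
        mu = mean(values)
        sigma = pstdev(values)
        # With one rating (or identical ratings) the spread is zero, so there
        # is nothing to scale by. Falling back to 1.0 is an arbitrary choice
        # and shows why this is weak for users with little data.
        stats[user] = (mu, sigma if sigma > 0 else 1.0)

    return [
        (user, item, (stars - stats[user][0]) / stats[user][1])
        for user, item, stars in ratings
    ]


if __name__ == "__main__":
    sample = [
        ("cautious", "a", 2), ("cautious", "b", 3), ("cautious", "c", 4),
        ("extreme", "a", 1), ("extreme", "b", 3), ("extreme", "c", 5),
    ]
    for row in normalize_per_user(sample):
        print(row)
```

In the sample, the cautious user's 4 and the extreme user's 5 end up with the same normalized score, which is the effect I have in mind. A user with a single rating gets a score of 0 for it, so the rating carries no usable signal after scaling.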
So, it’s quite possible that an ML model trained on a 3-way choice “very bad or bad/OK-ish/good or very good” won’t do much worse than one given the full 5-way choice.
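As a rough illustration of that 3-way collapse (the function name `collapse_to_three` and the -1/0/+1 labels are my own choices, not anything from Netflix):

```python
def collapse_to_three(stars):
    """Map a 1-5 star rating to -1 (bad), 0 (OK-ish) or +1 (good)."""
    if not 1 <= stars <= 5:
        raise ValueError(f"expected a rating from 1 to 5, got {stars!r}")
    if stars <= 2:
        return -1  # "very bad" and "bad" are merged
    if stars == 3:
        return 0   # "OK-ish" stays on its own
    return 1       # "good" and "very good" are merged


if __name__ == "__main__":
    cautious = [2, 3, 4]  # a user who avoids the extremes
    extreme = [1, 3, 5]   # a user who rarely clicks "bad" or "good"
    print([collapse_to_three(s) for s in cautious])
    print([collapse_to_three(s) for s in extreme])
```

Both example users produce the same labels after collapsing, so the difference in how they use the scale disappears without any per-user calibration. The cost is that a user who really does distinguish "bad" from "very bad" loses that distinction.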
I also think users become less likely to answer a question the more choices you give them (that certainly is the case if the number of choices gets very high, as in having to separately rate a movie’s acting, scenery, plot, etc.).
Combined, that may mean giving users less choice leads to better recommendations.
I’m sure Netflix has looked at their data well and knows more about that, though.