They mention this in the article. This is why private (non-public) benchmark tasks, written from scratch, are necessary.