It's not just better performance on latency benchmarks; it likely improves throughput as well, because the writes are batched together.
Many applications do not require true durability, and many of them likely benefit from lazy fsync. Whether it should be the default is much more questionable, though.
I also think fsync before acking writes is a better default. That aside, if you were to choose async for batching writes, their default value surprises me. 2 minutes seems like an eternity. Wouldn't you get very good batching for throughput even at something like 2 seconds? Still not safe, but safer.
You can batch writes while still not acknowledging them to clients until they are flushed; it just takes more bookkeeping.
For transactional durability, the writes will definitely be batched ("group commit"), because otherwise throughput would collapse.
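For anyone wondering what that bookkeeping looks like, here is a minimal sketch of group commit in Python. It is not how any particular database implements it; `GroupCommitLog`, `append`, and `fsync_calls` are hypothetical names made up for illustration. Writers queue a record and get back a future, and a single flusher thread writes whatever has accumulated, calls `os.fsync` once, and only then completes the futures for that batch.

```python
import os
import tempfile
import threading
from concurrent.futures import Future


class GroupCommitLog:
    """Hypothetical append-only log: one fsync per batch, ack only after it."""

    def __init__(self, path):
        self._fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self._cond = threading.Condition()
        self._pending = []      # (record, future) pairs that are not yet durable
        self._closed = False
        self.fsync_calls = 0    # counter kept only to make the batching visible
        self._flusher = threading.Thread(target=self._run, daemon=True)
        self._flusher.start()

    def append(self, record: bytes) -> Future:
        """Queue a record; the returned future completes after the fsync."""
        fut = Future()
        with self._cond:
            if self._closed:
                raise RuntimeError("log is closed")
            self._pending.append((record, fut))
            self._cond.notify()
        return fut

    def _run(self):
        while True:
            with self._cond:
                while not self._pending and not self._closed:
                    self._cond.wait()
                if not self._pending:
                    return  # closed and fully drained
                # Take everything that piled up during the previous fsync.
                batch, self._pending = self._pending, []
            try:
                view = memoryview(b"".join(rec for rec, _ in batch))
                while view:  # os.write may write fewer bytes than requested
                    view = view[os.write(self._fd, view):]
                os.fsync(self._fd)  # one flush covers the whole batch
                self.fsync_calls += 1
            except OSError as exc:
                for _, fut in batch:
                    fut.set_exception(exc)  # never ack a failed flush
                continue
            for _, fut in batch:
                fut.set_result(None)  # the ack: the record is now on disk

    def close(self):
        with self._cond:
            self._closed = True
            self._cond.notify()
        self._flusher.join()
        os.close(self._fd)


if __name__ == "__main__":
    log = GroupCommitLog(os.path.join(tempfile.mkdtemp(), "wal.log"))
    acks = [log.append(f"record {i}\n".encode()) for i in range(1000)]
    for ack in acks:
        ack.result()  # a real server would reply to the client here
    log.close()
    print(f"1000 records made durable with {log.fsync_calls} fsync calls")
```

The batch size adapts on its own: while one fsync is in flight, new records accumulate and ride along with the next one, so a slower disk or heavier load means bigger batches rather than a growing window of acknowledged-but-unflushed data.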
> Many applications do not require true durability
Pretty much no application requires true durability.
It’s like using a non-cryptographically secure RNG: if you don’t know enough to go looking for the flag that turns fsync off yourself, it’s unlikely you know enough to evaluate the impact of durability on your application.