spindump8930 • yesterday at 8:17 PM • 1 reply

Remember that the same model served on different inference platforms may not give exactly the same results, which adds another axis of non-determinism to development. Quantization, custom model-serving silicon, batching, or other inference optimizations can mean a hosted model performs differently from the one served by the original provider :/

This paper isn't exactly the same scenario, since it uses an auditable open-weight Llama model, but it shows the symptoms: https://arxiv.org/pdf/2410.20247
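
A toy sketch of two of the effects mentioned above, in case it helps. All the numbers are made up for illustration and don't come from any real model or provider; `quantize`, `logit`, and `greedy_token` are hypothetical helpers, and the rounding is a crude stand-in for real quantization schemes. It shows that rounding weights can flip a greedy token choice when two logits are close, and that float addition order (which batching or different kernels can change) alters the result in the last bits.

```python
# Toy numbers, invented for illustration; not taken from any real model.

def quantize(weights, step):
    """Round each weight to the nearest multiple of `step` (crude uniform quantization)."""
    return [round(w / step) * step for w in weights]

def logit(weights, hidden):
    """Dot product of one output row with the hidden state."""
    return sum(w * h for w, h in zip(weights, hidden))

def greedy_token(rows, hidden):
    """Index of the row with the largest logit (greedy decoding)."""
    logits = [logit(row, hidden) for row in rows]
    return logits.index(max(logits))

hidden = [1.0, 1.0]
rows = [[0.54, 0.54],   # token 0: logit 1.08 at full precision
        [0.56, 0.46]]   # token 1: logit 1.02 at full precision

full_choice = greedy_token(rows, hidden)
quant_choice = greedy_token([quantize(r, 0.1) for r in rows], hidden)

# Same three values, different summation order.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

print(full_choice, quant_choice)       # prints: 0 1
print(left_to_right == right_to_left)  # prints: False
```

In a real model the differences are far smaller per operation, but they can compound over many layers and many generated tokens.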

Replies

bossyTeacher • yesterday at 9:12 PM

Anyone who has used GPT-x via OpenAI vs. Microsoft has experienced this very clearly.
