At a past job (hedge fund), my role was to co-ordinate investigations into why latency may have changed when sending orders.
A couple of quants had built a random forest regression model that could take inputs like time of day, exchange, order volume etc. and spit out the range that latency had historically fallen in under those conditions.
If the latency moved outside that range, an alert would fire and then I would co-ordinate a response with a variety of teams, e.g. trading, networking, Linux etc.
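To make the alerting idea concrete, here is a minimal sketch. It is not the quants' model: instead of a random forest it uses a plain empirical percentile band per (exchange, hour) bucket, which is a much cruder stand-in. The function names, the 5th/95th percentile cut-offs, the microsecond unit and the bucketing are all assumptions for illustration.

```python
from collections import defaultdict
from statistics import quantiles


def build_bands(history, low_pct=5, high_pct=95):
    """Build a historical latency band per (exchange, hour) bucket.

    history: iterable of (exchange, hour, latency_us) tuples (hypothetical layout).
    Returns {(exchange, hour): (low, high)}.
    """
    buckets = defaultdict(list)
    for exchange, hour, latency_us in history:
        buckets[(exchange, hour)].append(latency_us)

    bands = {}
    for key, samples in buckets.items():
        if len(samples) < 2:
            continue  # statistics.quantiles needs at least two data points
        # n=100 yields 99 cut points; index p-1 is the p-th percentile.
        cuts = quantiles(samples, n=100, method="inclusive")
        bands[key] = (cuts[low_pct - 1], cuts[high_pct - 1])
    return bands


def should_alert(bands, exchange, hour, latency_us):
    """Return True when the observed latency falls outside the historical band."""
    band = bands.get((exchange, hour))
    if band is None:
        return False  # no history for this bucket, so nothing to compare against
    low, high = band
    return latency_us < low or latency_us > high
```

A real model would condition on many more inputs (order volume, for one) and would generalise across buckets rather than treating each one separately, which is presumably part of why a random forest was used.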
If we excluded changes on our side as the culprit, we would reach out to the exchange and talk to our sales rep there, who might also pull in networking etc.
Some exchanges, EUREX comes to mind, were phenomenal at helping us identify issues. For example, they once swapped out a cable that was a few feet longer than the older cable and that's why the latency increased.
One day, it's IEX, of Flash Boys fame, that triggers an alert. Nothing changed on our side so we call them. We are going back and forth with the networking engineer and then the sales rep says, in almost hushed tones:
"Look, I've worked at other exchanges so I get where you are coming from in asking these questions. Problem is, b/c of our founding ethos, we are actually not allowed to track our own internal latency so we really can't help you identify the root cause. I REALLY wish it was different."
I love this story b/c HN, as a technology-focused site, often thinks all problems have technical solutions, but sometimes the solution is actually a people or process one.
Also, incentives and "philosophy of the founders" matter a lot too.
Can you talk a bit more about the incentives to trade latency-sensitive strategies on IEX in the first place? Is it still lucrative for its liquidity despite them artificially slowing down orders? Does a meta game evolve with HFTs all working around their system, essentially making it still an HFT playground but with extra steps? Do you think their unexpected latency increase for you guys was intentional, to rid the water of sharks?
Curious what your actual role was -- sounds very interesting! Project manager? Dev? Operations specialist? E.g. were you hired into this role, and what were the prerequisites?
> they once swapped out a cable that was a few feet longer than the older cable and that's why the latency increased
That was not why. Possibly the cable made a difference (e.g. it had an open circuit that made the NICs fall back to a lower speed, or it was noisy, leading to retransmissions), but it wasn't the length per se.
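For scale, a back-of-envelope sketch of what the extra length alone would add. The velocity factor of 0.66 is an assumption (roughly typical for twisted-pair copper or fiber, but it varies by cable), and "a few feet" is taken as 3 ft purely for illustration.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458
FEET_TO_METERS = 0.3048


def propagation_delay_ns(extra_length_ft, velocity_factor=0.66):
    """One-way propagation delay added by extra cable length, in nanoseconds.

    velocity_factor is the signal speed as a fraction of c (assumed, cable-dependent).
    """
    extra_length_m = extra_length_ft * FEET_TO_METERS
    signal_speed_m_per_s = SPEED_OF_LIGHT_M_PER_S * velocity_factor
    return extra_length_m / signal_speed_m_per_s * 1e9


print(f"{propagation_delay_ns(3):.1f} ns")  # about 4.6 ns one way for 3 extra feet
```

Under those assumptions that is a handful of nanoseconds each way. A NIC negotiating down to a lower link speed, or retransmissions, would typically cost far more than that.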
All technical problems are people problems
What kind of founding ethos doesn't allow tracking internal latency? Is their founding ethos "Never Admit Responsibility"? "Never Leave A Paper Trail"?
This company's official ethical foundation is "Don't Get Caught."