
anthuswilliams · today at 1:34 AM

> How do you handle versioning/updates when datasets change?

For data MCPs, we use remote MCPs served over a stdio bridge. Our configuration is just mcp-proxy[0] pointed at a fixed URL we control. The server has an /mcp endpoint that provides the tools, and that endpoint is hit whenever the desktop LLM starts up, so adding/removing/altering tools is simply a matter of changing that service and redeploying the API. (Note: there are sometimes complications. For example, if I change an endpoint that used to return data directly so that it now writes a file to cloud storage and returns a URL instead — because the result is too large, i.e. to work around the aforementioned broken factor of MCP — we have to sync with our IT team to deploy a configuration change to everyone's machine.)
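To make the moving parts concrete, here's a minimal sketch of what the remote side of a setup like this could look like, assuming the official MCP Python SDK's FastMCP helper (a recent version with streamable HTTP support). The tool names and the storage helper are hypothetical placeholders, not the actual service described above:

```python
# Minimal sketch of a remote MCP server that exposes tools at /mcp.
# Assumes the official MCP Python SDK (`pip install mcp`); the tool
# names and upload_to_bucket helper are hypothetical placeholders.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("internal-data-tools")


def upload_to_bucket(payload: str) -> str:
    """Hypothetical helper: write the payload to cloud storage and
    return a signed URL. Stubbed out here."""
    return "https://storage.example.com/results/abc123"


@mcp.tool()
def list_datasets() -> list[str]:
    """List datasets the desktop LLM can query (placeholder data)."""
    return ["assay_results", "compound_library"]


@mcp.tool()
def query_dataset(name: str, limit: int = 100) -> str:
    """Run a query, but return a URL to the result file rather than
    the rows themselves, to avoid oversized MCP responses."""
    rows = f"first {limit} rows of {name}"  # stand-in for a real query
    return upload_to_bucket(rows)


if __name__ == "__main__":
    # Streamable HTTP serves the MCP endpoint at /mcp by default.
    mcp.run(transport="streamable-http")
```

On the client side, mcp-proxy is then pointed at that fixed URL in the desktop app's MCP configuration, so redeploying this service is usually the only change needed when tools are added or removed.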

I have seen nicer implementations that use a full MCP gateway, which adds another proxy step to the upstream MCP servers; I haven't used one myself, though I want to. The added benefit is that you can log/track which MCPs your users use most often and how well they're working, and you can abstract away a lot of the auth details, monitor for security issues, etc. One of the projects I've looked at in that space is Mint MCP.

> What's your hit rate on researchers actually converting LLM explorations into permanent artifacts vs just using it as a one-off?

Low. Which in our case is ideal, since most research ideas can be quickly discarded, saving us a ton of time and money that would otherwise be spent running doomed lab experiments, etc. As you get later in the drug discovery pipeline you have a larger team built around the program, and at that point the artifacts are more helpful. There still isn't much of a norm in the biotech industry of having an engineering team support an advanced drug program (a mistake, IMO), so these artifacts go a long way given those teams don't have dedicated resources.

> Do you think this pattern (LLM exploration > traditional tools) generalizes outside domains with high uncertainty?

I don't know for sure, as I don't live in that world. My instinct is that I wouldn't necessarily roll something like this out to external customers if you have a well-defined product. (IMO there just isn't much of a market for the uncertain outputs of such products, which is why all of the SaaS companies that have launched integrated AI tools haven't seen much success with them.) But even within a domain like that, it can be useful to e.g. your customer support team, your engineers, etc. For example, one of the ideas on my "cool projects" list is an SRE toolkit that can query across K8s, Loki/Prometheus, your cloud provider, and your git provider to help quickly diagnose production issues (a rough sketch of what one such tool might look like is below). I imagine the result of such an exploration would almost always be a new dashboard/alert/etc.
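Purely to gesture at that idea, one tool in such a toolkit could be as simple as an MCP wrapper around Prometheus's standard /api/v1/query HTTP API. The server name and PROMETHEUS_URL below are assumptions for illustration, not a real deployment:

```python
# Very rough sketch of one tool in a hypothetical SRE MCP toolkit:
# wrap Prometheus's /api/v1/query instant-query HTTP API as an MCP tool.
# The server name and PROMETHEUS_URL are assumptions, not a real setup.
import os
import urllib.parse
import urllib.request

from mcp.server.fastmcp import FastMCP

PROMETHEUS_URL = os.environ.get("PROMETHEUS_URL", "http://prometheus:9090")

mcp = FastMCP("sre-toolkit")


@mcp.tool()
def prometheus_query(promql: str) -> str:
    """Run an instant PromQL query and return the raw JSON response,
    letting the LLM inspect the result during an incident."""
    url = f"{PROMETHEUS_URL}/api/v1/query?" + urllib.parse.urlencode({"query": promql})
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode()


if __name__ == "__main__":
    mcp.run(transport="streamable-http")
```

Analogous tools for kubectl, Loki, the cloud provider's APIs, and the git provider would round out the kit; the exploration output would then get hardened into the dashboard/alert mentioned above.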

[0] https://github.com/sparfenyuk/mcp-proxy - don't know much about this repo, but it was our starting point