logoalt Hacker News

mixedbityesterday at 7:41 PM1 replyview on HN

When I was doing Fast.AI Deep Learning course, I was surprised by the number of Python dependencies machine learning projects bring. Web front-end projects were always considered very third-party dependencies heavy, but to me, the machine learning ecosystem looks much more entangled. In addition, unlike web development, which is considered security critical and has over the many years accumulated a lot of wisdom and good security-related practices, machine learning development looks much more ad-hoc, with many common software engineering practices not applied.

For example, at that time, one way to distribute machine learning models was via Python pickles. Which are executable objects with no restriction built in. Models in this format could do anything on a computer where the model was imported. Such an early 'wild-west' ecosystem can definitely make security compromises easier and resulting supply chain attacks more common.


Replies

zelphirkaltyesterday at 10:19 PM

There are many people in that ecosystem, who are not primarily software engineers. Some just learned some coding along the way. Some are mathematicians. Some are devs who are AI drunk or something. Some have the mindset of "code doesn't matter any longer, if it works it works". For many proper dependency management is just a chore, that they don't want to care about. These things come together in various ML projects, even though ML projects should be amongst the projects most focused on reproducibility.