Can you talk more about why you chose CLJ for datascience / ML. Are there any benefits of usi...

uxcolumbo • yesterday at 9:40 PM • 1 reply

Can you talk more about why you chose CLJ for datascience / ML.

Are there any benefits of using it over Python?

And how is the interop with Python libs?

Replies

> Can you talk more about why you chose CLJ for datascience / ML.

I use Python for a lot of machine learning. My vision transformers, for example, are in Python. There is a lot to like about the Python ecosystem. Throwing away libraries like albumentations and PyTorch because you move to a different ecosystem is a real loss. You probably ought to be using Python if you're doing machine learning of the sort that people immediately think of when they see "ML".

That said, data science and machine learning are words that cover a lot of ground.

Python often works because it serves as glue code to more optimized libraries. Sometimes, it is annoying to use it as glue code. For example, when you're working on computational game theory problems, the underlying data model tends to be a tree structure and the exploration algorithm walks that tree. There is a lot of branching, so little of the work can be handed off to an optimized library. Vanilla Python in such a case is horrifically slow.
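To make the shape of that workload concrete, here is a minimal sketch, assuming a toy take-away game (remove 1 to 3 stones, whoever takes the last stone wins). The game and the names `solve` and `stones` are made up for illustration; this is not the problem I was actually working on, just the same kind of branching recursion in plain Python.

```python
from typing import Tuple


def solve(stones: int) -> Tuple[int, int]:
    """Exhaustively search a toy take-away game (hypothetical example).

    Returns (value, nodes_visited), where value is +1 if the player to
    move can force a win and -1 otherwise.
    """
    if stones == 0:
        # The previous player took the last stone, so the player to move lost.
        return -1, 1

    best = -1
    visited = 1  # count this node
    for take in (1, 2, 3):
        if take > stones:
            break
        # Every branch is a separate Python-level call; nothing here can be
        # batched into a single vectorized library call.
        child_value, child_nodes = solve(stones - take)
        visited += child_nodes
        # The child's value is from the opponent's point of view, so negate it.
        best = max(best, -child_value)
    return best, visited


if __name__ == "__main__":
    for n in (4, 5, 20):
        value, visited = solve(n)
        print(n, value, visited)
```

Even in this toy, the number of visited nodes grows by a bit under a factor of two for each extra stone, and every node costs an interpreted function call. Real game trees branch far more than three ways.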

I was looking at progress bars in tqdm reporting 10,000 years until the computation was done, and that was after I had already reached for numba and done some optimizations. Computational game theory is quite brutal. You're very often reminded that there are fewer atoms in the universe than objects you would need to consider to calculate exactly what you want to calculate.

Most people use C, C++, and CUDA kernels for the sort of program I was writing. Some people have tried to do things in Python.

> Are there any benefits of using it over Python?

There is an open source implementation of a thing I built. It solves the same problem, but in Python, less well, and with a lot of features missing. It has a comment in it noting that the universe would end before the code finished if it were run at a non-trivial size. The code I wrote worked at the non-trivial size. Clojure, for me, finished. The universe hasn't ended yet, so I can't yet tell you how much faster my code was than the Python code I'm talking about.

> And how is the interop with Python libs?

Worked for me without issue, but I eventually got annoyed that I had to wait for two rounds of dependency resolution in some builds. Conda builds can sometimes have issues with dependency resolution taking an unreasonable amount of time. I was hitting that despite using very few libraries.
