logoalt Hacker News

hermitcrabtoday at 2:29 PM3 repliesview on HN

I guess this article is an interesting exercise from a pure maths point of view. But, as someone developing a drag and drop data wrangling tool the important thing is creating a set of composable operations/primitive that are meaningful and useful to your end user. We have ended up 73 distinct transforms in Easy Data Transform. Sure they overlap to an extent, but feel they are at the right semantic level for our users, who are not category theorists.


Replies

tikhonjtoday at 4:23 PM

You can have both: you start with a small, mathematically inspired algebraic core, then you express the higher-level more user-friendly operations in terms of the algebraic core.

As long as your core primitives are well designed (easier said than done!), this accomplishes two things: it makes your implementation simpler, and it helps guide and constrain your user-facing design. This latter aspect is a bit unintuitive (why would you want more constraints to work around?), but I've seen it lead to much better interface designs in multiple projects. By forcing yourself to express user-level affordances in terms of a small conceptual core, you end up with a user design that is more internally consistent and composable.

show 1 reply
mrlongrootstoday at 2:40 PM

Algebras are also nice for implementations. If you can decompose a domain into a few algebraic primitives you can write nice SIMD/CUDA kernels for those primitives.

To your point, I wonder if the 73 distinct transforms were just different defaults/usability wrappers over these. And you may also get into situations where kernels can be fused together or other batching constraints enable optimizations that nice algebraic primitives don't capture. But that's just systems---theory is useful in helping rethink API bloats and keeping us all honest.

show 1 reply
whattheheckhecktoday at 3:16 PM

Have you heard of the book Mathematics for Big data

https://github.com/Accla/d4m

He says himself the ideas are more important than the software package

show 1 reply