logoalt Hacker News

adolphlast Wednesday at 6:14 PM0 repliesview on HN

Since Pandas 2, Apache Arrow replaced NumPy as the backend for Pandas. Arrow is also used by Polars, DuckDB, Ibis, the list goes on.

https://arrow.apache.org/overview/

Apache Arrow solves most discussed problems, such as improving speed, interoperability, and data types, especially for strings. For example, the new string[pyarrow] column type is around 3.5 times more efficient. [...] The significant achievement here is zero-copy data access, mapping complex tables to memory to make accessing one terabyte of data on disk as fast and easy as one megabyte.

https://airbyte.com/blog/pandas-2-0-ecosystem-arrow-polars-d...