datapythonista – Telegram
Data, Python, Free Software and Memes
Rust is showing impressive results on its way into the Linux kernel. Not long ago, Linus anticipated that Rust could land as a kernel language next year, or even in the next release. Today, at the Linux Plumbers Conference, results for a new NVMe driver written in Rust were presented: performance is on par with the existing C driver, but with the readability and memory safety of Rust. For context, 25 years ago there were attempts to introduce C++ in the kernel, but the efforts were cancelled not long after.
2
Meta (Facebook) just announced that PyTorch is moving to a new independent PyTorch Foundation under the umbrella of the Linux Foundation.

Congrats to the PyTorch team on what seems like an important milestone for the future of the project.
👏10
Today is release day for pandas. The pandas 1.5.0 packages for the different architectures are being built right now, and should be available via pip and mamba/conda in the next few hours.

While waiting for the official announcement, some interesting things about the new pandas:

- Support for the Apache ORC file format has been added 🧌
- A new function, from_dummies, the inverse of get_dummies, has been implemented
- The documentation for pandas 1.5.0 will also be available in dark mode 🌙
- The new version will implement the dataframe interchange protocol. This will allow libraries like scikit-learn or matplotlib to receive dataframes as parameters, without caring whether the dataframe is a pandas, dask, vaex, polars... object.
- Around 270 people contributed to the new release, 68% of them being first-time contributors
- I was the release manager for this version 🤓

The official announcement will be made in @pandas_dev once the packages are ready
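
To illustrate the from_dummies item above, here is a minimal sketch, assuming pandas 1.5.0 or later is installed. The sample data and variable names are made up for the example; only get_dummies and from_dummies come from pandas.

```python
import pandas as pd

# Hypothetical sample data, only for illustration.
colors = pd.Series(["red", "green", "red", "blue"], name="color")

# One indicator column per category: blue, green, red.
dummies = pd.get_dummies(colors)

# from_dummies (new in pandas 1.5.0) rebuilds the categorical values
# from the indicator columns. It returns a one-column DataFrame.
restored = pd.from_dummies(dummies)

# Take the single column by position and compare with the input.
restored_values = restored.iloc[:, 0].tolist()
print(restored_values)
```

The round trip only works when every row has exactly one indicator set, unless a default category is passed to from_dummies.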
🔥6👍1
I'll be speaking about the future of data engineering in Python at GITEX Dubai next week. Ping me if you are around. https://globaldevslam.com/
🔥1
Very interesting discussion about memory profilers with Pablo Galindo Salgado, CPython core developer and release manager. Pablo has been doing an amazing job at optimizing CPython memory usage, and he is the main developer of memray, a memory profiler. A memory profiler helps you understand which parts of a program are responsible for memory usage, see how a program uses memory over time, or find its peak memory usage (to know how much available memory is needed to run the program): https://realpython.com/podcasts/rpp/128/
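
As a rough illustration of what a memory profiler reports, here is a small sketch using tracemalloc from the Python standard library. This is not memray and says nothing about how memray works internally; it is just a way to see the same kind of information (peak usage, and which lines allocated the most) without installing anything. The function build_squares is a made-up workload.

```python
import tracemalloc


def build_squares(n):
    # Hypothetical workload that allocates a list of n integers.
    return [i * i for i in range(n)]


tracemalloc.start()

data = build_squares(100_000)

# Memory currently traced and the peak since start(), in bytes.
current, peak = tracemalloc.get_traced_memory()

# Snapshot of the live allocations, grouped by source line below.
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

top = snapshot.statistics("lineno")[:3]
for stat in top:
    print(stat)

print(f"current={current} bytes, peak={peak} bytes")
```

The peak value answers the "how much memory do I need to run this" question, while the per-line statistics point to the code responsible for the allocations.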
👍2🔥2
Happy Diwali to all my friends in India, Nepal, Sri Lanka, and to everybody who celebrates! 🙏🏾
4
CPython 3.11 has been released this week. The main change is an increase in performance: it is between 10% and 60% faster based on the CPython benchmarks.

I ran the #pandas benchmarks with Python 3.10 and Python 3.11, and they are less than 1% faster with the new version (all critical code in #Python data projects is in C, not in Python).

Exceptions got a couple of improvements, and there are several additions to typing.

For the Python data community, in my opinion the main improvement to Python would be the ability to overload the "and" and "or" operators in our libraries (pandas and numpy mainly). I wrote about it in this post.
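
A small sketch of the limitation, in plain Python so it runs without pandas or numpy. Mask is a hypothetical class written for this example; it mimics how array libraries behave. Python lets a class define & and | through __and__ and __or__, but "and" and "or" cannot be customized: they only ask the left operand for its truth value via __bool__.

```python
class Mask:
    """Hypothetical element-wise boolean container, for illustration only."""

    def __init__(self, values):
        self.values = list(values)

    def __and__(self, other):
        # Called for `a & b`: element-wise logical and.
        return Mask(x and y for x, y in zip(self.values, other.values))

    def __or__(self, other):
        # Called for `a | b`: element-wise logical or.
        return Mask(x or y for x, y in zip(self.values, other.values))

    def __bool__(self):
        # Called for `a and b` / `a or b`: there is no element-wise hook.
        raise ValueError("The truth value of a Mask is ambiguous, use & or |")


a = Mask([True, True, False])
b = Mask([True, False, False])

both = (a & b).values
either = (a | b).values

try:
    a and b
    keyword_failed = False
except ValueError:
    keyword_failed = True

print(both, either, keyword_failed)
```

This is why filters in pandas and numpy are written with & and | plus parentheses, instead of the more readable keywords.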
😁8🤣5🤩2
Do you have any questions about #pandas? A few core devs, including myself, will be answering questions in an AMA (ask me anything) session. It is officially scheduled for tomorrow, Thursday, at 5:30pm UTC, but it is already open.

https://www.reddit.com/r/Python/comments/11fio85/we_are_the_developers_behind_pandas_currently/
👍8