Python Daily – Telegram
Daily Python News
Question, Tips and Tricks, Best Practices on Python Programming Language
Tuesday Daily Thread: Advanced questions

# Weekly Wednesday Thread: Advanced Questions 🐍

Dive deep into Python with our Advanced Questions thread! This space is reserved for questions about more advanced Python topics, frameworks, and best practices.

## How it Works:

1. **Ask Away**: Post your advanced Python questions here.
2. **Expert Insights**: Get answers from experienced developers.
3. **Resource Pool**: Share or discover tutorials, articles, and tips.

## Guidelines:

* This thread is for **advanced questions only**. Beginner questions are welcome in our [Daily Beginner Thread](#daily-beginner-thread-link) every Thursday.
* Questions that are not advanced may be removed and redirected to the appropriate thread.

## Recommended Resources:

* If you don't receive a response, consider exploring r/LearnPython or joining the [Python Discord Server](https://discord.gg/python) for quicker assistance.

## Example Questions:

1. **How can you implement a custom memory allocator in Python?**
2. **What are the best practices for optimizing Cython code for heavy numerical computations?**
3. **How do you set up a multi-threaded architecture using Python's Global Interpreter Lock (GIL)?**
4. **Can you explain the intricacies of metaclasses and how they influence object-oriented design in Python?**
5. **How would you go about implementing a distributed task queue using Celery and RabbitMQ?**
6. **What are some advanced use-cases for Python's decorators?**
7. **How can you achieve real-time data streaming in Python with WebSockets?**
8. **What are the

/r/Python
https://redd.it/1pnn7xm
Looking for CSS frameworks, recommendations?

For my next project I'm staying with full-stack Django templating with htmx. I'm terrible at CSS and I hate writing it. A few of you will moan about that, but I like frameworks that have lots of components.

Do you have any recommendations?

Bootstrap
Metro UI
Beer CSS
Basecoat UI


All great 👍 but are there any more hiding in the woodwork?

/r/django
https://redd.it/1pnlff8
Why don't dataclasses or attrs derive from a base class?

Both the standard `dataclasses` and the third-party `attrs` package follow the same approach: if you want to tell if an object or type is created using them, you need to do it in a non-standard way (call dataclasses.is_dataclass(), or catch attrs.NotAnAttrsClassError). It seems that both of them rely on setting a magic attribute in generated classes, so why not have them derive from an ABC with that attribute declared (or make it a property), so that users could use the standard isinstance? Was it performance considerations or something else?
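To make the question concrete, here is a minimal sketch of what an `isinstance`-friendly check could look like. `DataclassLike` is a hypothetical name, not part of the standard library or `attrs`; it only wraps the existing `dataclasses.is_dataclass()` call in an ABC's `__subclasshook__`, so no generated class has to inherit from anything.

```python
import dataclasses
from abc import ABC


class DataclassLike(ABC):
    """Hypothetical ABC: any dataclass type counts as a virtual subclass."""

    @classmethod
    def __subclasshook__(cls, subclass):
        # Only answer for DataclassLike itself, not for classes derived from it.
        if cls is DataclassLike:
            return dataclasses.is_dataclass(subclass)
        return NotImplemented


@dataclasses.dataclass
class Point:
    x: int
    y: int


class Plain:
    pass


print(isinstance(Point(1, 2), DataclassLike))
print(isinstance(Plain(), DataclassLike))
```

This keeps the decorator approach (no forced base class, no metaclass conflicts) while still allowing the standard `isinstance` spelling. It says nothing about why the library authors chose not to ship something like it.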

/r/Python
https://redd.it/1pnne6l
[P] Built semantic PDF search with sentence-transformers + DuckDB - benchmarked chunking approaches

I built DocMine to make PDF research papers and documentation semantically searchable. 3-line API, runs locally, no API keys.



Architecture:

PyMuPDF (extraction) → Chonkie (semantic chunking) → sentence-transformers (embeddings) → DuckDB (vector storage)



Key decision: Semantic chunking vs fixed-size chunks

- Semantic boundaries preserve context across sentences
- ~20% larger chunks but significantly better retrieval quality
- Tradeoff: 3x slower than naive splitting
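The difference between the two strategies can be illustrated with a small stdlib-only sketch. This is not how DocMine or Chonkie chunk text: real semantic chunking uses embeddings to find boundaries, whereas the hypothetical `sentence_chunks` below only respects sentence ends. It still shows why fixed-size cuts lose context.

```python
import re


def fixed_size_chunks(text: str, size: int) -> list[str]:
    """Naive splitting: cut every `size` characters, even mid-sentence."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def sentence_chunks(text: str, max_chars: int) -> list[str]:
    """Pack whole sentences into chunks of at most `max_chars` characters.

    A single sentence longer than `max_chars` is kept whole rather than cut.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow: close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


text = "Cas9 cuts DNA. Guide RNA selects the site. Repair introduces the edit."
print(fixed_size_chunks(text, 30))
print(sentence_chunks(text, 45))
```

The fixed-size version splits sentences wherever the character count lands, while the sentence-aware version never cuts inside a sentence, at the cost of uneven chunk sizes.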



Benchmarks (M1 Mac, Python 3.13):

- 48-page PDF: 104s total (13.5s embeddings, 3.4s chunking, 0.4s extraction)
- Search latency: 425ms average
- Memory: Single-file DuckDB, <100MB for 1500 chunks



Example use case:

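```python
from docmine.pipeline import PDFPipeline

pipeline = PDFPipeline()
# Extract, chunk, embed, and store every PDF in the directory
pipeline.ingest_directory("./papers")
# Return the 5 best-matching chunks for the query
results = pipeline.search("CRISPR gene editing methods", top_k=5)
```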

GitHub: https://github.com/bcfeen/DocMine



Open questions I'm still exploring:

1. When is semantic chunking worth the overhead vs simple sentence splitting?

2. Best way to handle tables/figures embedded in PDFs?

3. Optimal chunk_size for different document types (papers vs manuals)?



Feedback on the architecture or chunking approach welcome!

/r/Python
https://redd.it/1pnvuhf
I built PyGHA: Write GitHub Actions in Python, not YAML (Type-safe CI/CD)

# What My Project Does

PyGHA (v0.2.1, early beta) is a Python-native CI/CD framework that lets you define, test, and transpile workflow pipelines into GitHub Actions YAML using real Python instead of raw YAML. You write your workflows as Python functions, decorators, and control flow, and PyGHA generates the GitHub Actions files for you. It supports building, testing, linting, deploying, conditionals, matrices, and more through familiar Python constructs.

from pygha import job, default_pipeline
from pygha.steps import shell, checkout, uses, when
from pygha.expr import runner, always

# Configure the default pipeline to run on:
# - pushes to main
# - pull requests
default_pipeline(on_push=["main"], on_pull_request=True)

# ---------------------------------------------------
# 1. Test job that runs across 3 Python versions
# ---------------------------------------------------

@job(
    name="test",
    matrix={"python": ["3.11", "3.12", "3.13"]},
)
def test_matrix():


/r/Python
https://redd.it/1pni2se
YourTimeStarts.now: A Small Flask App for Taskmaster-style Tasks
https://yourtimestarts.now

/r/flask
https://redd.it/1pnpqu1
Tool for splitting sports highlight videos into individual clips

Hi folks, I am looking for a way to split rugby highlight videos automatically into single clips containing tries. For example: https://www.youtube.com/watch?v=rnCF2VqYwdM to be split into videos of each of the 9 tries during the match.


Here are some of the complications involved:

- Scenes have multiple camera angles and replays, so cutting based on visual scene detection alone isn't feasible.
- Not every scene is a try.
- Not every highlight video has consistent graphics. Some show a graphic between scenes, some do a cross fade. The scoreboard looks different in different competitions.


I imagine that the solution to this is some combination of frame-by-frame analysis for scene detection, OCR of the scoreboard/time, audio analysis, and commentary dialog. The solution may also have to be different for each broadcast, so there might not even be a one-size-fits-all solution.


Any suggestions?

/r/Python
https://redd.it/1pnznd9
Front end

So, I know backend (Django), at least to the point where I know what to search for, you know? I can somewhat build the backend of an app, but I'm pretty bad at frontend; I don't understand it at all. (I've always hated templates, static files, and DTL.) But I do want to learn it now (P.S. someone told me they couldn't give me an opportunity since I'm not a full-stack guy). How do I approach frontend? From the basics? I would appreciate it if you experienced folks could guide this hermit 😔✋🏻

/r/django
https://redd.it/1po18nb
Recommended approach for single-endpoint, passwordless email-code login with domain restrictions with django-allauth

Hi, I am looking for guidance on implementing the following authentication flow using django-allauth.

Requirements

1. Restrict URL access. Only /accounts/login/ should be accessible. All other django-allauth endpoints (signup, logout, password reset, email management, etc.) should be inaccessible. This applies regardless of whether the user is authenticated.
2. Passwordless login via email code. No passwords are used: a user submits their email address on the login form and a one-time login code is sent to that email. If the email does not already exist, automatically create the user and send the login code, then log the user in after code verification.
3. Domain-restricted access. Only email addresses from a whitelist of allowed domains may log in or be registered. Attempts from other domains should be rejected before user creation.
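Requirement 3 can be sketched independently of django-allauth. The helper names and the `ALLOWED_DOMAINS` values below are hypothetical, and where this check gets called inside allauth (for example from a custom account adapter) is an assumption that is not shown here. The point is only that the comparison should be an exact match on the normalized domain, made before any user row is created.

```python
def email_domain(email: str) -> str:
    """Return the lower-cased domain part of an email address."""
    local, sep, domain = email.strip().rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not a valid email address: {email!r}")
    return domain.lower()


# Hypothetical allowlist; replace with the institution's real domains.
ALLOWED_DOMAINS = frozenset({"example.edu", "students.example.edu"})


def is_allowed_email(email: str, allowed=ALLOWED_DOMAINS) -> bool:
    """True only if the address's domain is exactly on the allowlist."""
    try:
        return email_domain(email) in allowed
    except ValueError:
        # Malformed addresses are rejected rather than raised to the caller.
        return False


print(is_allowed_email("alice@Example.EDU"))
print(is_allowed_email("bob@example.edu.evil.com"))
```

Matching the whole domain, rather than using `endswith` or a substring test, avoids accepting look-alike domains such as `example.edu.evil.com`.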

I am building a service that depends on the student having access to the email address they are authenticating with, so email based verification is a core requirement. I want to avoid exposing any user facing account management or password based flows.

How may I achieve this?

/r/django
https://redd.it/1po8pxg
WhatsApp Wrapped with Polars & Plotly: Analyze chat history locally

I've always wanted something like Spotify Wrapped but for WhatsApp. There are some tools out there that do this, but every one I found either runs your chat history on their servers or is closed source. I wasn't comfortable with all that, so this year I built my own.

## What My Project Does

WhatsApp Wrapped generates visual reports for your group chats. You export your chat from WhatsApp (without media), run it through the tool, and get an HTML report with analytics. Everything runs locally or in your own Colab session. Nothing gets sent anywhere.

Here is a Sample Report.

Features include message counts, activity patterns, emoji stats, word clouds, and calendar heatmaps. The easiest way to use it is through Google Colab - just upload your chat export and download the report. There's also a CLI for local use.
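As a rough idea of what the message-count feature involves, here is a stdlib-only sketch. It is not the project's code: it uses `collections.Counter` instead of Polars, and the line format in `LINE_RE` is an assumption, since WhatsApp export formats vary by locale and platform.

```python
import re
from collections import Counter

# Assumed line format: "12/31/23, 10:15 PM - Alice: Happy new year!"
LINE_RE = re.compile(
    r"^\d{1,2}/\d{1,2}/\d{2,4}, [^-]+ - (?P<sender>[^:]+): (?P<text>.*)$"
)


def message_counts(lines):
    """Count messages per sender, skipping system and continuation lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[match.group("sender")] += 1
    return counts


sample = [
    "12/31/23, 10:00 PM - Alice created group",     # system line, no sender
    "12/31/23, 10:15 PM - Alice: Happy new year!",
    "and many more to come",                        # continuation of a message
    "12/31/23, 10:16 PM - Bob: You too",
    "1/1/24, 9:02 AM - Alice: Morning",
]
print(message_counts(sample).most_common())
```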

## Target Audience

Anyone who wants to analyze their WhatsApp chats without uploading them to someone else's server. It's ready to use now.

## Comparison

Unlike other web tools that require uploading your data, this runs entirely on your machine (or your own Colab). It's also open source, so you can see exactly what it does with your chats.

Tech: Python, Polars, Plotly, Jinja2.

Links:
- GitHub
- Sample Report
- Google

/r/Python
https://redd.it/1po9n17
Looking for Django developer for long term collaboration

Hello, I am looking for a developer for my work.

It's easy, long-term, part-time work.

Only US-, America-, or Europe-based developers are eligible.

DM for details.

/r/django
https://redd.it/1podw9b
I made FastAPI Clean CLI – Production-ready scaffolding with Clean Architecture

Hey r/Python,

What My Project Does
FastAPI Clean CLI is a pip-installable command-line tool that instantly scaffolds a complete, production-ready FastAPI project with strict Clean Architecture (4 layers: Domain, Application, Infrastructure, Presentation). It includes one-command full CRUD generation, optional production features like JWT auth, Redis caching, Celery tasks, Docker Compose orchestration, tests, and CI/CD.

Target Audience
Backend developers building scalable, maintainable FastAPI apps – especially for enterprise or long-term projects where boilerplate and clean structure matter (not just quick prototypes).

Comparison
Unlike simpler tools like cookiecutter-fastapi or manage-fastapi, this one enforces full Clean Architecture with dependency injection, repository pattern, and auto-generates vertical slices (CRUD + tests). It also bundles more production batteries (Celery, Prometheus, MinIO) in one command, while keeping everything optional.
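For readers unfamiliar with the terms, the repository pattern with dependency injection can be sketched in plain Python as follows. This is not the code the CLI generates; all class names are hypothetical, and the layer labels in the comments follow the four layers named in the post (the Presentation layer, e.g. a FastAPI route, is omitted).

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class User:  # Domain layer: plain entity, no framework imports
    id: int
    email: str


class UserRepository(Protocol):  # Application layer: the port the use case needs
    def add(self, user: User) -> None: ...
    def get(self, user_id: int) -> Optional[User]: ...


class InMemoryUserRepository:  # Infrastructure layer: one possible adapter
    def __init__(self) -> None:
        self._users: dict[int, User] = {}

    def add(self, user: User) -> None:
        self._users[user.id] = user

    def get(self, user_id: int) -> Optional[User]:
        return self._users.get(user_id)


class RegisterUser:  # Application layer: use case with an injected repository
    def __init__(self, repo: UserRepository) -> None:
        self._repo = repo

    def __call__(self, user_id: int, email: str) -> User:
        if self._repo.get(user_id) is not None:
            raise ValueError(f"user {user_id} already exists")
        user = User(id=user_id, email=email)
        self._repo.add(user)
        return user


repo = InMemoryUserRepository()
register = RegisterUser(repo)
print(register(1, "a@example.com"))
```

Because `RegisterUser` depends only on the `UserRepository` protocol, a database-backed repository could be swapped in without touching the use case, and tests can keep using the in-memory one.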

Quick start:
pip install fastapi-clean-cli
fastapi-clean init --name=my_api --db=postgresql --auth=jwt --docker

It's on PyPI with over 600 downloads in the first few weeks!

GitHub: https://github.com/Amirrdoustdar/fastclean
PyPI: https://pypi.org/project/fastapi-clean-cli/
Stats: https://pepy.tech/project/fastapi-clean-cli

This is my first major open-source tool. Feedback welcome – what should I add next (MongoDB support coming soon)?

Thanks! 🚀

/r/Python
https://redd.it/1poh525
Spark can spill to disk, so why do OOM errors still happen?

I was thinking about Spark's spill-to-disk feature. My understanding is that spark.local.dir acts as a scratchpad for operations that don't fit in memory. In theory, anything that doesn't fit should spill to disk, which would mean OOM errors shouldn't happen.

Here are a few scenarios that confuse me:

- A shuffle between executors. The receiving executor might get more data than RAM can hold, but shouldn't it just start writing to disk?
- A coalesce with one partition triggers a shuffle. The executor gathers a large chunk of data. Spill-to-disk should prevent OOM here too.
- A driver running collect on a massive dataset. The driver keeps all data in memory, so OOM makes sense, but what about executors?

I can't think of cases where OOM should happen if spilling works as expected. Yet it does happen.

I want to understand what actually causes these OOM errors and how people handle them.

/r/Python
https://redd.it/1poqgba
Looking for a collaborator with some web development skills and strong communication

I am looking for an American or European individual with strong English skills and general knowledge of programming languages.

They should be able to fluently explain general concepts of program development in English and possess excellent communication skills.

The pay is $50 or more per hour, and specific details will be discussed after we meet.
If you don't mind, let me know your thoughts.
Thanks for your attention.

/r/django
https://redd.it/1povgtc