How prompt caching works - Paged Attention and Automatic Prefix Caching plus practical tips
Prompt caching in large language models (LLMs) is an optimization technique that stores and reuses intermediate computational states (key-value caches) of repeated prompt prefixes, significantly reducing redundant processing and speeding up responses. By breaking prompts into fixed-size token blocks and using a hash-based prefix-matching system, prompt caching lets multiple requests that share a prefix reuse the cached computation instead of recomputing it.
https://sankalp.bearblog.dev/how-prompt-caching-works
A deep dive into prompt caching - practical tips to improve cache hits and how vLLM's paged attention enables KV-cache reuse across requests via automatic prefix-caching
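A minimal sketch of the mechanism described above, assuming a toy block size and a stand-in for the real KV computation (this is not vLLM's code): split the token sequence into fixed-size blocks, hash each block together with the hash of its prefix, and serve any block whose hash is already cached.

```python
import hashlib

BLOCK_SIZE = 16                       # tokens per block; illustrative value
kv_cache: dict[str, object] = {}      # block hash -> cached KV data (stand-in object)

def block_hash(prefix_hash: str, block_tokens: list[int]) -> str:
    """Hash a block together with the hash of everything before it,
    so a block is reused only when its entire prefix matches."""
    data = prefix_hash + ":" + ",".join(map(str, block_tokens))
    return hashlib.sha256(data.encode()).hexdigest()

def compute_kv(block_tokens: list[int]) -> object:
    """Stand-in for the expensive attention key/value computation."""
    return {"kv_for": tuple(block_tokens)}

def run_prompt(tokens: list[int]) -> int:
    """Process a prompt block by block; return how many blocks hit the cache."""
    hits = 0
    prefix_hash = ""
    # Only full blocks are cacheable; a trailing partial block is always recomputed.
    for start in range(0, len(tokens) - len(tokens) % BLOCK_SIZE, BLOCK_SIZE):
        block = tokens[start:start + BLOCK_SIZE]
        h = block_hash(prefix_hash, block)
        if h in kv_cache:
            hits += 1                 # prefix cache hit: skip recomputation
        else:
            kv_cache[h] = compute_kv(block)
        prefix_hash = h
    return hits

system_prompt = list(range(40))                    # shared 40-token system prompt
print(run_prompt(system_prompt + [101, 102]))      # 0 hits: cold cache
print(run_prompt(system_prompt + [201, 202]))      # 2 hits: the shared 32-token prefix is reused
```

Because each block's hash folds in the hash of its prefix, two requests only share cache entries while their prompts agree token for token, which is why keeping the stable parts of a prompt (system message, tool definitions) at the front and appending the variable parts at the end improves hit rates.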
Can LLMs give us AGI if they are bad at arithmetic?
Wes McKinney's post questions whether large language models (LLMs) can achieve artificial general intelligence (AGI) given their persistent struggles, even in top models, with basic arithmetic tasks like adding single-digit numbers. Through experiments and analysis, he shows that LLMs perform inconsistently on simple math (e.g., summing ~10 numbers) and argues that this unreliability at precise computation matters for claims about AGI.
https://wesmckinney.com/blog/llms-arithmetic/
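A hedged sketch of the kind of check the post describes: draw about ten small numbers, ask a model for their sum, and compare against the exact answer. The ask_llm function is a hypothetical placeholder for whatever model API you use; only the harness around it is shown.

```python
import random
import re

def ask_llm(prompt: str) -> str:
    """Hypothetical placeholder: call your LLM API of choice and return its text reply."""
    return "The total is 527."        # dummy reply so the script runs end to end

random.seed(0)
numbers = [random.randint(10, 99) for _ in range(10)]   # ~10 two-digit numbers
truth = sum(numbers)

prompt = "Add these numbers and reply with only the total: " + " + ".join(map(str, numbers))
reply = ask_llm(prompt)

match = re.search(r"-?\d+", reply)
model_answer = int(match.group()) if match else None

print(f"numbers={numbers}  truth={truth}  model={model_answer}  correct={model_answer == truth}")
```

Repeating this over many random draws and scoring exact-match accuracy turns a single anecdote into a measurement, which is roughly the kind of experiment the post reports before drawing conclusions.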
Modernising Django Packages Without Breaking Everything
To modernize a mature Django package without breaking user code, the maintainer phases in new tooling gradually and consolidates configuration into a single pyproject.toml file. Key strategies include streamlining the developer experience with fast tools like uv and Ruff, using a Justfile for memorable commands, and automating changelog management for releases with Towncrier.
https://lincolnloop.com/blog/modernising-django-packages-without-breaking-everything/
A case study in upgrading django-countries to v8. I’m the solo maintainer for django-countries, which provides a country field for …
vllm-omni
A framework for efficient model inference with omni-modality models.
https://github.com/vllm-project/vllm-omni
Django 6.0 released
Django 6.0 introduces major new features: built-in support for template partials (for cleaner, reusable templates), a native background-task framework, a built-in Content Security Policy (CSP) system, and a more modern, Unicode-friendly email API. This release marks the end of mainstream support for Django 5.2; developers are encouraged to upgrade to 6.0 to benefit from the new features.
https://www.djangoproject.com/weblog/2025/dec/03/django-60-released/
Posted by Natalia Bidart on Dec. 3, 2025
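For orientation, a minimal sketch of the shape of the new background-task framework, modeled on DEP 14 (which the built-in implementation follows); treat the exact import path, decorator, and backend dotted path as assumptions to verify against the 6.0 release notes.

```python
# settings.py -- assumed backend path; an "immediate" backend runs tasks inline (handy for tests)
TASKS = {
    "default": {
        "BACKEND": "django.tasks.backends.immediate.ImmediateBackend",
    }
}

# tasks.py -- assumed import path per DEP 14
from django.tasks import task

@task()
def send_welcome_email(user_id: int) -> None:
    # look up the user and send the email here
    print(f"sending welcome email to user {user_id}")

# in a view or service: enqueue instead of calling the function directly
result = send_welcome_email.enqueue(user_id=42)
```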
Can Google's ADK Replace LangChain and MCP?
Christina Lin (Google) demos the Agent Development Kit (ADK), an open-source Python framework for agentic pipelines: assemble LLMs, tools (via MCP servers or function calling), and prompts into complex workflows like version control or Friday-night bookings, with grounding that cites real-time data to cut hallucinations and token costs.
https://www.youtube.com/watch?v=nMnQ63YkftE
How do you build systems with AI? Not code-generating assistants, but production systems that use LLMs as part of their processing pipeline. When should you chain multiple agent calls together versus just making one LLM request? And how do you debug, test…
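A rough sketch of the agent-plus-tools pattern the talk demonstrates; the package name, import path, and constructor arguments are assumptions modeled on ADK's quickstart rather than a verified API reference.

```python
# pip install google-adk   (assumed package name)
from google.adk.agents import Agent   # assumed import path

def book_table(restaurant: str, time: str) -> dict:
    """A plain Python function exposed to the model as a callable tool."""
    return {"status": "confirmed", "restaurant": restaurant, "time": time}

# The agent bundles a model, an instruction prompt, and tools; the framework runs the
# function-calling loop, deciding when to call book_table and with which arguments.
friday_agent = Agent(
    name="friday_booking_agent",
    model="gemini-2.0-flash",          # assumed model id
    instruction="Help the user book a table for Friday night; use tools for real actions.",
    tools=[book_table],
)
```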
python-injection
Fast and easy dependency injection framework.
https://github.com/100nm/python-injection
Stop Hardcoding Everything: Use Dependency Injection
The video explains Dependency Injection (DI) in Python with a practical data pipeline example, showing how DI improves code flexibility, testability, and separation of concerns by injecting dependencies like loaders, transformers, and exporters rather than hardcoding them. It covers manual DI with functions and classes, abstraction with protocols, building a simple DI container, and DI using a dedicated framework.
https://www.youtube.com/watch?v=Xhzn1eAxoXk
In this video, I explore how Dependency Injection can make your Python code cleaner, more testable, and easier to extend, using a real-world data pipeline…
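A small sketch of the pattern the video walks through, assuming a loader/exporter pipeline and typing.Protocol for the abstractions; the class names are illustrative, not the video's exact code.

```python
from typing import Protocol

class Loader(Protocol):
    def load(self) -> list[dict]: ...

class Exporter(Protocol):
    def export(self, rows: list[dict]) -> None: ...

class CsvLoader:
    def __init__(self, path: str) -> None:
        self.path = path
    def load(self) -> list[dict]:
        # real code would parse the CSV at self.path
        return [{"name": "ada"}, {"name": "grace"}]

class ConsoleExporter:
    def export(self, rows: list[dict]) -> None:
        for row in rows:
            print(row)

def run_pipeline(loader: Loader, exporter: Exporter) -> None:
    """The pipeline never constructs its dependencies; they are injected by the caller."""
    rows = loader.load()
    exporter.export([{**row, "name": row["name"].title()} for row in rows])

# Composition root: swap in fake implementations in tests without touching run_pipeline.
run_pipeline(CsvLoader("people.csv"), ConsoleExporter())
```

Because run_pipeline depends only on the Protocol interfaces, tests can pass in-memory fakes while production code passes real CSV or database implementations, which is the flexibility and testability argument the video makes.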
Context Data Platform for Self-learning Agents
One Place for Agents to Store, Observe, and Learn. Designed to simplify context engineering, improve agent reliability and task success rates.
https://github.com/memodb-io/Acontext
PyTogether: Collaborative lightweight real-time Python IDE for teachers/learners
https://github.com/SJRiz/pytogether
Source code for pytogether.org — a fully browser-based collaborative IDE with real-time editing, live drawing, and voice chat.
kubesdk — async-first, fully typed Python SDK for Kubernetes
Open-source Python SDK for Kubernetes automation with fully typed models, an async client, multi-cluster support, and a CRD & API model generator.
https://github.com/puzl-cloud/kubesdk
FunAudioLLM / CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
https://github.com/FunAudioLLM/CosyVoice
facebookresearch / sam3
The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
https://github.com/facebookresearch/sam3
We Got Claude to Fine-Tune an Open Source LLM
We gave Claude the ability to fine-tune language models using a new tool called Hugging Face Skills: not just write training scripts, but actually submit jobs to cloud GPUs, monitor progress, and push finished models to the Hugging Face Hub. This tutorial shows you how it works and how to use it yourself.
https://huggingface.co/blog/hf-skills-training
Learn NLP Research: 7 Papers Implemented
This video traces the evolution of neural machine translation from RNNs and LSTMs to attention mechanisms, Transformers, and multilingual models like GNMT. It includes PyTorch implementations of 7 landmark papers, mathematical explanations, and tools like Transformer Playground for hands-on learning.
https://www.youtube.com/watch?v=kRv2ElPNAdY
From RNNs to Transformers: The Complete Neural Machine Translation Journey
This course is a comprehensive journey through the evolution of sequence models and neural machine translation (NMT). It blends historical breakthroughs, architectural innovations, mathematical insights, and hands-on PyTorch replications of landmark papers…
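The attention step at the center of that progression fits in a few lines of PyTorch; this is the standard scaled dot-product attention formula, shown for reference rather than taken from the course materials.

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)        # (..., seq_q, seq_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)                   # one distribution per query position
    return weights @ v                                        # weighted sum of value vectors

q = torch.randn(2, 5, 64)   # (batch, query positions, d_k)
k = torch.randn(2, 7, 64)   # (batch, key positions, d_k)
v = torch.randn(2, 7, 64)
print(scaled_dot_product_attention(q, k, v).shape)            # torch.Size([2, 5, 64])
```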
Python Workers redux: fast cold starts, packages, and a uv-first workflow
https://blog.cloudflare.com/python-workers-advancements/
Recent advancements in Cloudflare Python Workers mean fast cold starts, comprehensive package support, and a great developer experience. We explain how they were achieved and show how Python can be used to build serverless applications on Cloudflare.
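To make the announcement concrete, a minimal sketch of what a Python Worker handler looks like, modeled on Cloudflare's published examples; treat the workers import and the on_fetch signature as assumptions to check against the current docs.

```python
# entry point deployed with wrangler (e.g. `wrangler dev` locally, `wrangler deploy` to ship)
from workers import Response   # assumed import from the Workers Python runtime

async def on_fetch(request, env):
    # request is the incoming HTTP request; env exposes bindings such as KV namespaces and secrets
    return Response("Hello from a Python Worker!")
```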