Reddit Programming – Telegram
I will send you the newest posts from the subreddit /r/programming
Agent Tech Lead + RTS game
https://www.reddit.com/r/programming/comments/1ptwzo7/agent_tech_lead_rts_game/

Wrote a blog post about using the Cursor Cloud API to manage multiple agents in parallel — basically a kanban board where each task is a separate agent. Calling it "Agent Tech Lead". The main idea: software engineering is becoming an RTS game. Your company is the map, coding agents are your units, and your job is to place them, unblock them, and intervene when someone gets stuck. Job description for this role, if anyone wants to reuse it: https://github.com/kyryl-opens-ml/ai-engineering/blob/main/blog-posts/agent-tech-lead/JobDescription.md submitted by /u/Such_Tale_9830 (https://www.reddit.com/user/Such_Tale_9830)
[link] (https://kyrylai.com/2025/12/23/becoming-an-aiagent-tech-lead/) [comments] (https://www.reddit.com/r/programming/comments/1ptwzo7/agent_tech_lead_rts_game/)
OS virtual memory concepts from 1960s applied to AI: PagedAttention code walkthrough
https://www.reddit.com/r/programming/comments/1ptxiqe/os_virtual_memory_concepts_from_1960s_applied_to/

I came across vLLM and PagedAttention while trying to run an LLM locally. It's a two-year-old paper, but it was very interesting to see how an OS virtual-memory concept from the 1960s is applied to optimize GPU memory usage for AI. The post walks through vLLM's elegant use of block tables, doubly-linked LRU queues, and reference counting to manage GPU memory. submitted by /u/noninertialframe96 (https://www.reddit.com/user/noninertialframe96)
[link] (https://codepointer.substack.com/p/vllm-pagedattention-saving-millions) [comments] (https://www.reddit.com/r/programming/comments/1ptxiqe/os_virtual_memory_concepts_from_1960s_applied_to/)
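A minimal Python sketch of the block-table and reference-counting idea described above (an illustration of the concept only; vLLM's real implementation also manages LRU eviction queues and copy-on-write, which are omitted here):

```python
# KV-cache memory is split into fixed-size blocks. Each sequence holds a
# "block table" mapping logical positions to physical blocks, much like a
# page table, and reference counts let sequences share blocks.

class BlockAllocator:
    def __init__(self, num_blocks: int):
        self.free_blocks = list(range(num_blocks))
        self.ref_counts = [0] * num_blocks

    def allocate(self) -> int:
        block = self.free_blocks.pop()
        self.ref_counts[block] = 1
        return block

    def fork(self, block: int) -> int:
        # Sharing a block (e.g., a common prompt prefix) just bumps the refcount.
        self.ref_counts[block] += 1
        return block

    def free(self, block: int) -> None:
        self.ref_counts[block] -= 1
        if self.ref_counts[block] == 0:
            self.free_blocks.append(block)


class Sequence:
    """Maps logical KV-cache positions to physical blocks."""

    def __init__(self, allocator: BlockAllocator, block_size: int = 16):
        self.allocator = allocator
        self.block_size = block_size
        self.block_table: list[int] = []
        self.num_tokens = 0

    def append_token(self) -> None:
        # Allocate a new physical block only when the current one fills up,
        # so memory waste is bounded by one partial block per sequence.
        if self.num_tokens % self.block_size == 0:
            self.block_table.append(self.allocator.allocate())
        self.num_tokens += 1


alloc = BlockAllocator(num_blocks=8)
seq = Sequence(alloc)
for _ in range(20):
    seq.append_token()
print(seq.block_table)  # two 16-token blocks cover 20 tokens
```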
An interactive explanation of recursion with visualizations and exercises
https://www.reddit.com/r/programming/comments/1pty3uv/an_interactive_explanation_of_recursion_with/

Code simulations are in pseudocode. Exercises are in JavaScript (Node.js) with test cases listed. The visualizations work best on larger screens; on smaller ones they're truncated. submitted by /u/dExcellentb (https://www.reddit.com/user/dExcellentb)
[link] (https://larrywu1.github.io/recursion) [comments] (https://www.reddit.com/r/programming/comments/1pty3uv/an_interactive_explanation_of_recursion_with/)
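As a taste of the kind of exercise described, a small recursive function with test cases (my own Python illustration; the site's exercises are in JavaScript):

```python
def sum_digits(n: int) -> int:
    """Sum the decimal digits of a non-negative integer, recursively."""
    if n < 10:          # base case: a single digit sums to itself
        return n
    return n % 10 + sum_digits(n // 10)  # recursive case: last digit + rest

assert sum_digits(0) == 0
assert sum_digits(7) == 7
assert sum_digits(1234) == 10
```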
We reduced transformer inference calls by ~75% without changing model weights (MFEE control-plane approach)
https://www.reddit.com/r/programming/comments/1puks9j/we_reduced_transformer_inference_calls_by_75/

I've been working on a systems paper proposing a simple idea: instead of optimizing how transformers run, decide whether they need to run at all. We introduce Meaning-First Execution (MFEE), a control-plane layer that gates transformer inference and routes requests into:
- RENDER (run the model)
- DIRECT (serve from cache / deterministic logic)
- NO_OP (do nothing)
- ABSTAIN (refuse safely)
On a representative replay workload (1,000 mixed prompts), this reduced transformer execution by 75.1% while preserving 100% output equivalence when the model was invoked. Below is a derived economic-impact table showing what that reduction implies at scale. These are not claims about any specific company, just linear extrapolations from the measured reduction.

Economic Impact (Derived): Example Workload Savings (Based on Original Paper Results)

Workload Type     Daily Requests   Transformer Reduction   Annual GPU Cost Savings
Web Search-like   8.5B             75%                     $2.1B – $4.2B
Code Assist       100M             80%                     $292M – $584M
Chat-style LLM    1.5B             70%                     $511M – $1.0B
Enterprise API    10M              75%                     $27M – $55M

Assumptions:
- GPU cost: $1.50–$3.00/hr
- Standard transformer inference costs
- Linear scaling with avoided calls
- Based on the 75.1% measured reduction from the paper

If you think these numbers are wrong, the evaluation harness is public. What's surprising to me is that a lot of effort in the ecosystem goes toward squeezing marginal gains out of model execution, while the much larger question of when execution is even necessary seems to be the more important one. MFEE isn't meant to replace those optimizations. It sits upstream of them and reduces how often they're even needed in the first place. Thoughts? submitted by /u/anima-core (https://www.reddit.com/user/anima-core)
[link] (https://zenodo.org/records/18045379) [comments] (https://www.reddit.com/r/programming/comments/1puks9j/we_reduced_transformer_inference_calls_by_75/)
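To make the routing concrete, a minimal Python sketch of a control-plane gate in the RENDER/DIRECT/NO_OP/ABSTAIN style; the is_disallowed and run_model stubs are hypothetical stand-ins, and this is my reading of the idea, not the paper's implementation:

```python
from enum import Enum

class Route(Enum):
    RENDER = "render"    # run the model
    DIRECT = "direct"    # serve from cache / deterministic logic
    NO_OP = "no_op"      # do nothing
    ABSTAIN = "abstain"  # refuse safely

def is_disallowed(prompt: str) -> bool:
    # Placeholder policy check; a real system would use a classifier.
    return "forbidden" in prompt.lower()

def run_model(prompt: str) -> str:
    # Stand-in for the expensive transformer call.
    return f"model output for: {prompt}"

def route_request(prompt: str, cache: dict[str, str]) -> tuple[Route, str | None]:
    if not prompt.strip():
        return Route.NO_OP, None            # nothing to do
    if is_disallowed(prompt):
        return Route.ABSTAIN, "Request refused."
    if prompt in cache:
        return Route.DIRECT, cache[prompt]  # avoid a transformer call entirely
    return Route.RENDER, None               # fall through to the model

def serve(prompt: str, cache: dict[str, str]) -> str:
    route, response = route_request(prompt, cache)
    if route is Route.RENDER:
        response = run_model(prompt)        # the only expensive path
        cache[prompt] = response
    return response or ""

cache: dict[str, str] = {}
print(serve("What is 2+2?", cache))  # RENDER on first call
print(serve("What is 2+2?", cache))  # DIRECT on the repeat
```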
GitHub repos aren’t documents — stop treating them like one
https://www.reddit.com/r/programming/comments/1pun9oq/github_repos_arent_documents_stop_treating_them/

Most repo-analysis tools still follow the same pattern: embed every file, store vectors, and rely on retrieval later. That model makes sense for docs. It breaks down for real codebases, where structure, dependencies, and call flow matter more than isolated text similarity. What I found interesting in an OpenCV write-up is a different way to think about the problem: don't index the repo first, navigate it. The system starts with the repository structure, then uses an LLM to decide which files are worth opening for a given question. Code is parsed incrementally, only when needed, and the results are kept in state so follow-up questions build on earlier context instead of starting over. It's closer to how experienced engineers explore unfamiliar code: look at the layout, open a few likely files, follow the calls, ignore the rest. In that setup, embeddings aren't the foundation anymore; they're just an optimization. submitted by /u/Different-Opinion973 (https://www.reddit.com/user/Different-Opinion973)
[link] (https://learnopencv.com/how-to-build-a-github-code-analyser-agent/) [comments] (https://www.reddit.com/r/programming/comments/1pun9oq/github_repos_arent_documents_stop_treating_them/)
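A rough Python sketch of that navigate-first loop; ask_llm is a hypothetical placeholder for an LLM client, and the write-up's actual agent is more involved than this:

```python
import os

def ask_llm(prompt: str) -> str:
    # Hypothetical placeholder; swap in a real LLM client here.
    return "ANSWER: (stub - connect a real model to navigate)"

def repo_tree(root: str, max_entries: int = 200) -> str:
    """Render the repository layout the agent starts from."""
    lines = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        for name in filenames:
            lines.append(os.path.relpath(os.path.join(dirpath, name), root))
            if len(lines) >= max_entries:
                return "\n".join(lines)
    return "\n".join(lines)

def answer_question(root: str, question: str, max_steps: int = 5) -> str:
    state: dict[str, str] = {}  # files opened so far, kept across turns
    for _ in range(max_steps):
        prompt = (
            f"Repo layout:\n{repo_tree(root)}\n\n"
            f"Already opened: {sorted(state)}\n"
            f"Question: {question}\n"
            "Reply with a file path to open next, or ANSWER: <answer>."
        )
        reply = ask_llm(prompt)
        if reply.startswith("ANSWER:"):
            return reply.removeprefix("ANSWER:").strip()
        path = reply.strip()
        with open(os.path.join(root, path)) as f:
            state[path] = f.read()  # open files only when the LLM asks
    return "No answer within step budget."

print(answer_question(".", "Where is the entry point?"))
```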
Choosing the Right C++ Containers for Performance
https://www.reddit.com/r/programming/comments/1pup1re/choosing_the_right_c_containers_for_performance/

I wrote a short article on choosing C++ containers, focusing on memory layout and performance trade-offs in real systems. It discusses when vector, deque, and array make sense, and why node-based containers are often a poor fit for performance-sensitive code. submitted by /u/Clean-Upstairs-8481 (https://www.reddit.com/user/Clean-Upstairs-8481)
[link] (https://techfortalk.co.uk/2025/12/24/optimal-c-containers-for-performance-efficiency/) [comments] (https://www.reddit.com/r/programming/comments/1pup1re/choosing_the_right_c_containers_for_performance/)
lwlog 1.5.0 Released
https://www.reddit.com/r/programming/comments/1pviztk/lwlog_150_released/

What's new since the last release:
- A lot of stability/edge-case issues have been fixed
- The logger is now available in vcpkg for easier integration

What's left to do:
- Add Conan packaging
- Add FMT support(?)
- Update the benchmarks for spdlog and add comparisons with more loggers (performance has improved a lot since the benchmarks shown in the README)
- Rewrite pattern formatting (planned for 1.6.0, mostly done, see the pattern_compiler branch; I plan to release it next month). The pattern is parsed once by a tiny compiler, which generates a set of bytecode instructions (literals, fields, color codes). On each log call, the logger executes these instructions, appending their results to produce the final message. This completely eliminates per-call pattern scans, strlen calls, and the memory shifts needed for replacing and inserting, which has a huge performance impact: both sync and async logging get even faster.

I would be very honoured if you could take a look and share your critique, feedback, or any kind of idea. I believe the library could be of good use to you. submitted by /u/ChrisPanov (https://www.reddit.com/user/ChrisPanov)
[link] (https://github.com/ChristianPanov/lwlog) [comments] (https://www.reddit.com/r/programming/comments/1pviztk/lwlog_150_released/)
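The compile-once idea is language-neutral; here is a small Python sketch of it (lwlog itself is C++, and the %t/%l/%m pattern syntax below is invented for illustration):

```python
import datetime

def compile_pattern(pattern: str):
    """Parse the pattern once into a list of append-instructions."""
    fields = {
        "%t": lambda rec: rec["time"],
        "%l": lambda rec: rec["level"],
        "%m": lambda rec: rec["message"],
    }
    instructions = []
    i = 0
    while i < len(pattern):
        token = pattern[i:i + 2]
        if token in fields:
            instructions.append(fields[token])
            i += 2
        else:
            # Literal character captured as a constant instruction; a real
            # compiler would coalesce runs of literals into one instruction.
            ch = pattern[i]
            instructions.append(lambda rec, ch=ch: ch)
            i += 1
    return instructions

def log(instructions, level: str, message: str) -> str:
    rec = {
        "time": datetime.datetime.now().isoformat(timespec="seconds"),
        "level": level,
        "message": message,
    }
    # Executing precompiled instructions avoids rescanning the pattern
    # string on every log call.
    return "".join(instr(rec) for instr in instructions)

fmt = compile_pattern("%t [%l] %m")
print(log(fmt, "INFO", "pattern compiled once, executed per call"))
```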
What building with AI taught me about the role of struggle in software development
https://www.reddit.com/r/programming/comments/1pvo93k/what_building_with_ai_taught_me_about_the_role_of/

Technical writeup: built a CLI tool with Claude Code in 90 minutes (React Ink + Satori). Covers the technical challenges (font parsing bugs, TTY handling, shell history formats) and an unexpected realization: when AI removes the mechanical struggle, you lose something important about the learning process. Not about whether AI will replace us, but about what "the wrestling" actually gives us as developers. submitted by /u/knutmelvaer (https://www.reddit.com/user/knutmelvaer)
[link] (https://www.knut.fyi/blog/2025-12-25/what-vibe-coding-taught-me-about-why-i-build) [comments] (https://www.reddit.com/r/programming/comments/1pvo93k/what_building_with_ai_taught_me_about_the_role_of/)
How Versioned Cache Keys Can Save You During Rolling Deployments
https://www.reddit.com/r/programming/comments/1pvomnn/how_versioned_cache_keys_can_save_you_during/

Hi everyone! I wrote a short article about a pattern that's helped my team avoid cache-related bugs during rolling deployments: 👉 Version your cache keys — by baking a version identifier into your cache keys, you can ensure that newly deployed code always reads/writes fresh keys while old code continues to use the existing ones. This simple practice can prevent subtle bugs and hard-to-debug inconsistencies when you're running different versions of your service side by side. I explain why cache invalidation during rolling deploys is tricky and walk through a clear versioning strategy with examples. Check it out here: https://medium.com/dev-genius/version-your-cache-keys-to-survive-rolling-deployments-a62545326220 Would love to hear thoughts or experiences you've had with caching problems in deployments! submitted by /u/Specific-Positive966 (https://www.reddit.com/user/Specific-Positive966)
[link] (https://medium.com/dev-genius/version-your-cache-keys-to-survive-rolling-deployments-a62545326220) [comments] (https://www.reddit.com/r/programming/comments/1pvomnn/how_versioned_cache_keys_can_save_you_during/)
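A minimal sketch of the pattern in Python, assuming a generic cache client with get/set; the DictCache and FakeDB classes are stand-ins for illustration:

```python
CODE_VERSION = "v42"  # e.g., injected at build time from the release tag

def cache_key(resource: str, ident: str) -> str:
    # The version prefix means v42 pods never read entries written by v41
    # pods during a rolling deploy; old keys simply age out via TTL.
    return f"{CODE_VERSION}:{resource}:{ident}"

class DictCache:
    """In-memory stand-in for a real cache client (e.g., Redis)."""
    def __init__(self):
        self.data = {}
    def get(self, key):
        return self.data.get(key)
    def set(self, key, value, ttl=None):
        self.data[key] = value  # a real client would honor the TTL

class FakeDB:
    def load_user(self, user_id):
        return {"id": user_id, "name": "Ada"}

def get_user_profile(cache, db, user_id: str) -> dict:
    key = cache_key("user_profile", user_id)
    cached = cache.get(key)
    if cached is not None:
        return cached
    profile = db.load_user(user_id)
    cache.set(key, profile, ttl=300)
    return profile

cache, db = DictCache(), FakeDB()
print(get_user_profile(cache, db, "7"))  # miss: hits the DB
print(get_user_profile(cache, db, "7"))  # hit: served under the v42 prefix
```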
ACE - a tiny experimental language (function calls as effects)
https://www.reddit.com/r/programming/comments/1pvqqcl/ace_a_tiny_experimental_language_function_calls/

I spent Christmas alone at home, talking with AI and exploring a weird language idea I've had for a while. This is ACE (Algebraic Call Effects) — a tiny experimental language where every function call is treated as an effect and can be intercepted by handlers. The idea is purely conceptual. I'm not a PL theorist, I'm not doing rigorous math here, and I'm very aware this could just be a new kind of goto. Think of it as an idea experiment, not a serious proposal. The interpreter is written in F# (which turned out to be a really nice fit for this kind of language work), the parser uses XParsec, and the playground runs in the browser via WebAssembly using Bolero (ACE Lang Playground: https://lee-wonjun.github.io/ACE/). Curious what people think — feedback welcome. submitted by /u/See-Ro-E (https://www.reddit.com/user/See-Ro-E)
[link] (https://github.com/Lee-WonJun/ACE) [comments] (https://www.reddit.com/r/programming/comments/1pvqqcl/ace_a_tiny_experimental_language_function_calls/)
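One way to picture calls-as-effects, as a Python toy of my own (not ACE's semantics; ACE's interpreter is F#): every call goes through a dispatcher that a stack of handlers can intercept:

```python
handlers = []  # innermost handler last

def perform(name: str, *args):
    """Route a call through the handler stack instead of calling directly."""
    for handler in reversed(handlers):
        result = handler(name, args)
        if result is not NotImplemented:
            return result           # a handler intercepted the call
    return FUNCTIONS[name](*args)   # no handler claimed it: default behavior

def double_add(name, args):
    if name == "add":
        return 2 * (args[0] + args[1])
    return NotImplemented           # let other handlers / the default run

FUNCTIONS = {"add": lambda a, b: a + b}

print(perform("add", 1, 2))  # 3: no handlers installed
handlers.append(double_add)
print(perform("add", 1, 2))  # 6: the handler intercepted the call
```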