Reddit DevOps – Telegram
Need help choosing a stack for a SaaS that has the potential to become a superapp; priority is performance and response speed, not animations and useless features that will slow down my app

I have an idea for a SaaS and I'm looking for technologies to build it and make it real, but I'm confused about a few things. My priority is performance and user experience, because it has the potential to become a superapp. So what frontend tech should I use? Also, on the backend I want to use Node.js (Express), plus FastAPI for ML tasks. Is that the best option, with a REST API and JSON as the data format? For databases I will use PostgreSQL, MongoDB and Redis.

https://redd.it/1ppx81x
@r_devops
What unfinished side-project are you hoping to finally finish over the holidays?

With the holidays coming up, I'm curious what side-projects everyone has sitting in the "almost done" (or "started... then life happened") pile.

It could be:

A repo that's 80% complete
An app missing "just one more feature"
A tool you built for yourself that never got polished
Something you want to open-source but haven't yet

What is it, and what's stopping you from finishing it?

Bonus points if you drop a link or explain what "done" actually looks like for you.

Hoping this thread gives some motivation (and maybe accountability) to finally ship something before the new year.

https://redd.it/1pq2zhu
@r_devops
The Future of Kubernetes Networking: Gateway API Explained

Hi All,

I put together a video explaining Gateway API purely from an architectural and mental-model perspective (no YAML deep dive, no controller comparison).

Video: The Future of Kubernetes Networking: Gateway API Explained

Your feedback is welcome, and comments (good and bad) are welcome as well :-)


Cheers

https://redd.it/1pq4vkq
@r_devops
Looking for Career Advice


Hello, everyone.

I don’t know where to begin, but I’ll try. I want to learn DevOps for the long term. However, there are programming courses in my city that also promise to hire you if you end up being the best student. The programming course has 3 phases, each month costs $110, and my salary is around $650 in my country.

Currently, I don’t know what to do: save money to learn DevOps ($210 each month), or go for the programming course, where I might end up getting hired if I perform the best?


https://redd.it/1pq6ng7
@r_devops
Inference is underpriced. Designing systems as if that’s permanent feels risky.

From an ops perspective, something about current AI system design feels off.



Inference for LLM-backed systems is often priced below marginal cost right now to accelerate adoption. The gap is being covered by venture capital.



That creates incentives that look fine short-term but feel risky operationally:

- Heavy fan-out and retry loops instead of tighter control
- Latency + quality prioritized over efficiency
- Deep coupling to a single provider’s API semantics
- Little pressure to build portability, guardrails, or eval infra
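
As one illustration of "tighter control", here is a minimal sketch of a retry loop with a hard cap on both attempts and spend. Everything in it is an assumption made for illustration: `call_model` stands in for whatever provider client is in use, and the per-call cost is a made-up number rather than any real price.

```python
import time
from typing import Callable, Optional


class BudgetExceeded(Exception):
    """Raised when the attempt or cost budget is used up without a success."""


def call_with_budget(
    call_model: Callable[[str], str],  # hypothetical provider call, injected by the caller
    prompt: str,
    max_attempts: int = 3,
    cost_per_call: float = 0.002,      # assumed cost per call, not a real quote
    max_cost: float = 0.005,
    backoff_s: float = 0.0,
) -> str:
    """Retry a model call, but stop once attempts or spend would pass a hard cap."""
    spent = 0.0
    last_error: Optional[Exception] = None
    for attempt in range(max_attempts):
        # Refuse to start a call that would push spend over the budget.
        if spent + cost_per_call > max_cost:
            break
        spent += cost_per_call
        try:
            return call_model(prompt)
        except Exception as exc:  # a real client would catch narrower errors
            last_error = exc
            time.sleep(backoff_s * (2 ** attempt))  # exponential backoff
    raise BudgetExceeded(f"gave up after spending {spent:.4f}") from last_error
```

Because the provider call is passed in as a plain callable, swapping providers does not touch the budgeting logic, which is one small step toward the portability mentioned in the list.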



We’ve seen this movie before (fiber glut, early cloud pricing).



The interesting question isn’t “is AI overhyped?”

It’s "which systems survive when pricing and providers normalize?"



Curious how other teams are thinking about this from a durability and cost-containment perspective.


Wrote up a clearer explanation + simple diagram here if helpful.

https://redd.it/1pq814z
@r_devops
Hiring JavaScript / React Developer (2+ Years Experience) | Long-Term Contract

We’re expanding our development team and are searching for a skilled JavaScript & React developer who’s interested in a long-term hourly engagement.



💼 Role Overview

Develop and enhance front-end features using React and modern JavaScript (ES6+)

Translate designs and requirements into clean, reusable components

Integrate APIs and handle dynamic data flows

Improve performance, fix bugs, and refactor existing code

Communicate progress clearly and meet agreed timelines



Requirements

2+ years of professional experience with JavaScript and React

Strong understanding of hooks, component lifecycle, and state management

Experience working with RESTful APIs

Ability to work independently and take ownership of tasks

Clear communication and reliability

Bonus Skills

Next.js, TypeScript, Redux, or similar tools

Familiarity with Git and collaborative workflows

Eye for UI/UX details

💰 Compensation

Hourly rate: $35 – $42

Consistent workload with long-term potential

📩 To Apply

Please include:

A short intro about your experience

Relevant portfolio, GitHub, or live project links

Your availability (hours/week)

We’re looking for someone dependable who wants to grow with an ongoing project—not a short-term gig. If that’s you, let’s talk.

https://redd.it/1pqb4j2
@r_devops
Is Bare Metal Kubernetes Worth the Effort? An Engineer's Experience Report

I wrote an experience report on setting up a production-ready, high-availability k3s cluster on OVHcloud bare metal servers. My goal was to significantly reduce infrastructure costs compared to managed services like AWS EKS, and this setup costs just $178/month compared to $550+/month for a comparable cloud setup.

The post is a practical walk-through covering:

Provisioning servers and a private network with Terraform.
Building a resilient 3-node k3s control plane with HAProxy and Keepalived.
Using Cloudflare for cheap load balancing.
Securing the cluster with mTLS and Kubernetes Network Policies.

Here is the link: https://academy.fpblock.com/blog/ovhcloud-k8s/

https://redd.it/1pqby1b
@r_devops
What are some examples of devops/SRE/cloud projects to pin on GitHub?

Is having stuff on GitHub even necessary for us? I mean, what kind of stuff would be there? I just noticed that I had mostly front-end code (React), which probably made me look like a React developer, not the DevOps/SRE/cloud guy that I am. Anyway, I'm open for jobs and just wondering what works these days.

https://redd.it/1pqdsht
@r_devops
Built an open-source CLI to deterministically remove secrets from logs (no ML, no guessing)

Hi r/devops,

I’ve been working on a small open-source CLI called **LogShield**.
The idea was to explore whether **deterministic, rule-based log sanitization** can be safer than probabilistic masking when logs are shared or shipped.

Key characteristics:

* Reads from **stdin**, writes sanitized logs to **stdout**
* Explicit, inspectable rules (no ML, no heuristics)
* Same input → same output (deterministic)
* Designed to minimize false positives that break debugging
* Works as a drop-in filter in pipelines

Typical use cases I had in mind:

* Sanitizing logs before uploading CI/CD artifacts
* Preventing accidental secret leaks when logs are shared in tickets or Slack
* Pre-filtering logs before shipping to third-party services

Example:

    cat app.log | logshield scan --strict > safe.log


The ruleset is intentionally conservative and fully inspectable.
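
To make the idea of deterministic, rule-based redaction concrete, here is a minimal sketch of a filter built from an ordered list of explicit regex rules. It is not LogShield's implementation or ruleset; the rule names and patterns below are illustrative assumptions only.

```python
import re
from typing import IO, List, Pattern, Tuple

# Illustrative rules only; these are NOT LogShield's actual rules.
RULES: List[Tuple[str, Pattern[str]]] = [
    ("aws_access_key_id", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    ("bearer_token", re.compile(r"(?i)\bbearer\s+[A-Za-z0-9._~+/-]+=*")),
    ("password_kv", re.compile(r"(?i)\b(password|passwd|secret)=\S+")),
]


def sanitize_line(line: str) -> str:
    """Apply every rule in a fixed order, so the same input always gives the same output."""
    for name, pattern in RULES:
        line = pattern.sub(f"<redacted:{name}>", line)
    return line


def sanitize_stream(src: IO[str], dst: IO[str]) -> None:
    """Copy src to dst line by line, redacting as it goes."""
    for raw in src:
        dst.write(sanitize_line(raw))
```

Calling `sanitize_stream(sys.stdin, sys.stdout)` would turn this into a stdin-to-stdout pipeline filter. Since the rules are plain data applied in a fixed order, they can be reviewed and unit-tested like any other code.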

I’d really appreciate feedback from a DevOps perspective on:

* Whether deterministic redaction is something you’d trust in pipelines
* Edge cases where this would break real-world workflows
* Cases where you’d prefer masking to *fail closed* vs *fail open*

Repo: [https://github.com/afria85/LogShield](https://github.com/afria85/LogShield)
Landing page: [https://logshield.dev](https://logshield.dev)

Thanks — looking forward to criticism.

https://redd.it/1pqep6a
@r_devops
Finding newbits & netnum in Terraform's cidrsubnet()

Does anyone have a quick way, either within TF or externally, to take the base_cidr and your "desired CIDR" and spit out the needed newbits and netnum?

If the subnets are fairly simple I can usually just guess them and verify using the console. Anything more complex I calculate by hand.


So I'm hoping there's something more sophisticated available (short of writing my own tool).
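
For reference, the by-hand calculation can be sketched with Python's standard `ipaddress` module. The function name `cidrsubnet_args` is made up for this example. It relies on `newbits` being the difference in prefix length and `netnum` being the index of the desired block among the equally sized blocks inside the base network.

```python
import ipaddress
from typing import Tuple


def cidrsubnet_args(base_cidr: str, desired_cidr: str) -> Tuple[int, int]:
    """Return (newbits, netnum) so that cidrsubnet(base_cidr, newbits, netnum) yields desired_cidr."""
    base = ipaddress.ip_network(base_cidr, strict=True)
    desired = ipaddress.ip_network(desired_cidr, strict=True)
    # Check the IP version first: subnet_of() raises TypeError on mixed versions.
    if base.version != desired.version or not desired.subnet_of(base):
        raise ValueError(f"{desired_cidr} is not inside {base_cidr}")
    # newbits: how many bits the prefix grows by.
    newbits = desired.prefixlen - base.prefixlen
    # netnum: offset of the desired block, counted in blocks of its own size.
    host_bits = desired.max_prefixlen - desired.prefixlen
    netnum = (int(desired.network_address) - int(base.network_address)) >> host_bits
    return newbits, netnum


print(cidrsubnet_args("10.0.0.0/16", "10.0.3.0/24"))  # (8, 3)
```

The result can still be verified in `terraform console` with `cidrsubnet("10.0.0.0/16", 8, 3)`.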


Thanks in advance.

https://redd.it/1pqfn36
@r_devops
Confusion about the “Plan” phase in DevOps, is it official and what is it based on?

Hi everyone,
I’m studying DevOps from an academic perspective, and I’m a bit stuck on the “Plan” phase that is often shown as the first phase of the DevOps lifecycle.

Many blogs and diagrams mention phases like Plan → Code → Build → Test → Release → Deploy → Operate → Monitor.
However, I’m struggling to find clear, authoritative references (papers, books, or standards) that explicitly define:
1. What the Plan phase in DevOps exactly is.
2. What it is based on (Agile planning? business requirements? product management?)
3. Whether it is an official DevOps concept or more of a conceptual/educational abstraction.
4. How it differs from planning in Agile/Scrum.

Most explanations online are high-level blog posts, and they don’t clearly cite academic or industry sources.
If you know of a book, research paper, or credible industry reference, or have practical experience with how planning actually works in real DevOps teams, I’d really appreciate your insights.

Thanks in advance!

https://redd.it/1pqfj53
@r_devops
How to get into cloud/DevOps with 2-3 years of experience in Infrastructure Administration (Virtualization)

I'm currently working at a service-based company, and my project is basically about virtualization using vSphere and Nutanix. I do find cloud computing interesting and I've been trying to self-learn, improving my bash scripting skills by doing projects and acquiring certifications. But the issue I face is: how can I transition from a Virtualization Engineer role to a cloud computing role without much hands-on experience? Would working on projects on my own count as experience, since every job opening requires 4+ years of experience? What are the best choices I could make? Switching internally to a cloud-based project and then trying to switch companies?

What could be a better roadmap to get into cloud? At times I feel like I'm just going around in circles without a definitive idea. It feels like I need to master bash and move on to automating things with Python, then learn Docker, Kubernetes, Terraform, Jenkins, etc. Sometimes it feels overwhelming, but I really want to crack it. I just need some advice.

Could you please help me out?

https://redd.it/1pqf0tm
@r_devops
Where can I host an API for free so a friend can pentest it?

Hey guys, I want to ask something.

I have an API built using Golang, and I want to host it so my friend can test it. He’s a pen tester, and I want to give him access to the API endpoint rather than sharing my API folders and source files right away.

The problem is, I’m not sure where to host it for free, just for testing purposes. This is mainly for security testing, not production.

Do you have any recommendations for free platforms or setups to host a Go API temporarily for testing?

Thanks in advance!

https://redd.it/1pqi9aa
@r_devops
Who's responsible for contract testing on your team?

We are just starting off with contract testing in our organization and would love your input on which team typically owns the effort.

View Poll

https://redd.it/1pqj775
@r_devops
Resistance against implementing "automation tools"

Hi all,

I'm seeing the same pattern in different companies: "IT"/"DevOps" teams are mostly doing old-school manual deployments and post-configuration.

This seems to be related to a few factors: time pressure, idleness, lack of understanding from management, or even many silos, where some teams already use these tools while others just carry on as before.

Have you seen this?

This backfires, as people are getting out of touch with the market. Plus, learning is left to their free time and their own determination, which doesn't help either.

https://redd.it/1pqk6m6
@r_devops
Content Delivery Network (CDN) - what difference does it really make?

It's a system of distributed servers that deliver content to users/clients based on their geographic location - requests are handled by the closest server. This closeness naturally reduces latency and improves speed/performance by caching content at various locations around the world.

It makes sense in theory but curiosity naturally draws me to ask the question:

>ok, there must be a difference between this approach and serving files from a single server, located in only one area - but what's the difference exactly? Is it worth the trouble?

**What I did**

I deployed a simple frontend application (`static-app`) with a few assets to multiple regions. I used DigitalOcean as the infrastructure provider, but obviously you can also use something else. I chose the following regions:

* **fra** \- Frankfurt, Germany
* **lon** \- London, England
* **tor** \- Toronto, Canada
* **syd** \- Sydney, Australia

Then, I've created the following droplets (virtual machines):

* static-fra-droplet
* test-fra-droplet
* static-lon-droplet
* static-tor-droplet
* static-syd-droplet

Then, `static-app` was deployed to each *static* droplet, where it served a few static assets using Nginx. `load-test` ran on *test-fra-droplet*; I used it to make lots of requests to droplets in all regions and compare the results to see what difference a CDN makes.

Approximate distances between locations, in a straight line:

* Frankfurt - Frankfurt: \~ as close as it gets on the public Internet, the best possible case for CDN
* Frankfurt - London: \~ 637 km
* Frankfurt - Toronto: \~ 6 333 km
* Frankfurt - Sydney: \~ 16 500 km

Of course, distance is not everything - network connectivity between regions varies, but we do not control that; distance is the one thing we can compare objectively.
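
As a rough sanity check on what distance alone can explain, here is a small sketch that turns a straight-line distance into a lower bound on round-trip time. It assumes signals travel at roughly 200,000 km/s in optical fiber (about two thirds of the speed of light in vacuum); real routes are longer than a straight line and add queuing and processing delay, so measured values should sit above this floor.

```python
# Assumed propagation speed in optical fiber; an approximation, not a measurement.
FIBER_SPEED_KM_PER_S = 200_000.0

# Straight-line distances quoted in the post.
DISTANCES_KM = {
    "Frankfurt - London": 637,
    "Frankfurt - Toronto": 6_333,
    "Frankfurt - Sydney": 16_500,
}


def min_rtt_ms(distance_km: float) -> float:
    """Lower bound on round-trip time: there and back at fiber speed, in milliseconds."""
    return 2 * distance_km / FIBER_SPEED_KM_PER_S * 1000


for route, km in DISTANCES_KM.items():
    print(f"{route}: at least {min_rtt_ms(km):.1f} ms round trip")
```

Under that assumption, the floor for Frankfurt - Sydney is about 165 ms, which is why no amount of server tuning brings a single far-away origin close to a nearby one.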

**Results**

**Frankfurt - Frankfurt**

* Distance: as good as it gets, same location basically
* Min: 0.001 s, Max: 1.168 s, Mean: 0.049 s
* **Percentile 50 (Median): 0.005 s**, Percentile 75: 0.009 s
* **Percentile 90: 0.032 s**, Percentile 95: 0.401 s
* Percentile 99: 0.834 s

**Frankfurt - London**

* Distance: \~ 637 km
* Min: 0.015 s, Max: 1.478 s, Mean: 0.068 s
* **Percentile 50 (Median): 0.020 s**, Percentile 75: 0.023 s
* **Percentile 90: 0.042 s**, Percentile 95: 0.410 s
* Percentile 99: 1.078 s

**Frankfurt - Toronto**

* Distance: \~ 6 333 km
* Min: 0.094 s, Max: 2.306 s, Mean: 0.207 s
* **Percentile 50 (Median): 0.098 s**, Percentile 75: 0.102 s
* **Percentile 90: 0.220 s**, Percentile 95: 1.112 s
* Percentile 99: 1.716 s

**Frankfurt - Sydney**

* Distance: \~ 16 500 km
* Min: 0.274 s, Max: 2.723 s, Mean: 0.406 s
* **Percentile 50 (Median): 0.277 s**, Percentile 75: 0.283 s
* **Percentile 90: 0.777 s**, Percentile 95: 1.403 s
* Percentile 99: 2.293 s

*In all cases, 1000 requests were made at a rate of 50 r/s.*

If you want to reproduce the results and play with it, I have prepared all relevant scripts on my GitHub: [https://github.com/BinaryIgor/code-examples/tree/master/cdn-difference](https://github.com/BinaryIgor/code-examples/tree/master/cdn-difference)
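
For readers who want to compute the same kind of summary from their own raw timings, here is a minimal sketch using the nearest-rank percentile definition. This is an assumption about the method; the `load-test` tool used for the numbers above may compute percentiles differently.

```python
import math
from statistics import mean
from typing import Dict, Sequence


def percentile(samples: Sequence[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of values at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(samples: Sequence[float]) -> Dict[str, float]:
    """Min, max, mean and the percentiles reported in the results above."""
    stats = {"min": min(samples), "max": max(samples), "mean": mean(samples)}
    for p in (50, 75, 90, 95, 99):
        stats[f"p{p}"] = percentile(samples, p)
    return stats
```

Reporting percentiles next to the mean matters here: in the Frankfurt - Frankfurt case the mean (0.049 s) is roughly ten times the median (0.005 s), because a small number of slow requests pull the mean up.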

https://redd.it/1pql6h1
@r_devops
Data analytics or full stack?

I come from a lower-middle-class family, so which field should I go into where I can get a high package and, most importantly, where freshers can get a job quickly without experience? If I do full stack, I will later become an SDE; if I do data analytics, a data scientist or AI/ML engineer. Where will freshers get a job? I can wait 10 months to find a job.

Where will I get a high package?
Tell me guys

https://redd.it/1pqnsh3
@r_devops
Liftbridge is back: Lightweight message streaming for distributed systems

Tyler Treat's Liftbridge project has been transferred to Basekick Labs for continued maintenance. It's been dormant since 2022, and we're reviving it.

TL;DR: Durable message streaming built on NATS. Think Kafka's log semantics in a Go binary.

Technical Overview:

Liftbridge sits alongside NATS and persists messages to a replicated commit log. Key design decisions:

- Dual consensus model: Raft for cluster metadata, ISR (Kafka-style) for data replication. Avoids writing messages to both a Raft log and message log (like NATS Streaming did).
- Commit log structure: Append-only segments with offset and timestamp indexes. Memory-mapped for fast lookups.
- NATS integration: Can subscribe to NATS subjects and persist transparently (zero client changes), or use gRPC API for explicit control.
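
To illustrate what offset and timestamp indexes buy you, here is a toy in-memory sketch of an append-only log that supports replay from an offset and lookup by time. It is not Liftbridge's code, which is written in Go and uses on-disk segments, memory mapping and replication; the class and method names are invented for this example.

```python
import bisect
from typing import List, Optional, Tuple


class CommitLog:
    """Toy append-only log: the list position is the offset, timestamps form a sorted index."""

    def __init__(self) -> None:
        self._messages: List[bytes] = []
        self._timestamps: List[float] = []  # kept non-decreasing so it can be binary-searched

    def append(self, payload: bytes, timestamp: float) -> int:
        """Append a message and return its offset."""
        if self._timestamps and timestamp < self._timestamps[-1]:
            timestamp = self._timestamps[-1]  # clamp clock skew to keep the index sorted
        self._messages.append(payload)
        self._timestamps.append(timestamp)
        return len(self._messages) - 1

    def read_from(self, offset: int) -> List[Tuple[int, bytes]]:
        """Replay every message at or after the given offset."""
        if offset < 0:
            raise ValueError("offset must be non-negative")
        return list(enumerate(self._messages[offset:], start=offset))

    def offset_for_time(self, timestamp: float) -> Optional[int]:
        """First offset whose timestamp is at or after the given time, if any."""
        i = bisect.bisect_left(self._timestamps, timestamp)
        return i if i < len(self._timestamps) else None
```

A consumer that remembers its last offset can resume with `read_from(last + 1)`, and `offset_for_time` supports "replay everything since 09:00" style queries. Those are the replay/offset semantics the post refers to.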

Why this matters:

IBM's $11B Confluent acquisition has teams looking at alternatives. Liftbridge fills a gap: lighter than Kafka, more durable than plain NATS.

Useful for:

- Edge computing (IoT, retail, industrial)
- Go ecosystems wanting native tooling
- Teams needing replay/offset semantics without JVM ops

What's next:

Modernizing the codebase (Go 1.25+, updated deps), security audit, and first release in January.

GitHub: https://github.com/liftbridge-io/liftbridge

Technical details: https://basekick.net/blog/liftbridge-joins-basekick-labs

Happy to answer questions about the architecture.

https://redd.it/1pqpe3z
@r_devops
Help with EKS migration from CloudFormation to Terraform

Hi all,

I am currently working on a project where I want to set up a new environment in a new account. Until now we used CloudFormation templates, but I have always liked IaC, so I wanted to do some learning and decided to use Terraform for it. My DevOps and cloud engineering knowledge is rather limited, as I am mostly a full-stack dev. Regardless, I decided to first import everything from Env A and then just apply it to Env B. That worked quite well, except for the EKS load balancer.

For EKS we used eksctl in CloudShell and just configured it that way. Later we connected to the cluster via a bastion host and added Helm, eks-chart and then the AWS Load Balancer Controller. First I just imported the cluster, nodes and load balancer, but a target group was not created. Then I imported the target group, but it's not connecting to the load balancer and the nodes.

I also tried the EKS module from AWS, but that one can't find the subnets of the VPC even though I pass them directly as an array (everywhere else it works).

TL;DR: What I now need help with is finding resources. It's holiday season, and while I do not have to work, I want to read some stuff and finally understand how to set up an EKS cluster in a VPC with a correctly working load balancer and a target group where the nodes are linked via IP address. THANK YOU VERY MUCH (and happy holidays)

EDIT: you can also recommend some books for me

https://redd.it/1pqq6jq
@r_devops
What is one piece of complexity in your stack that you would happily remove if you could?

More teams are quietly stepping back from complexity. Not because they cannot handle it, but because they are tired of it. 

Distributed systems are powerful. They are also exhausting. I hear more engineers saying they want systems they can reason about at 2am. 

This shows up in small ways. Fewer services. Clearer boundaries. More boring tech. And that is meant as a compliment. 

It also shows up in how teams think about reliability. Not chasing five nines, but aiming for fast recovery and clear failure modes. 
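
The trade-off between chasing nines and recovering fast can be put in numbers. The sketch below uses the standard steady-state formula, availability = MTBF / (MTBF + MTTR); the function names are made up for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def allowed_downtime_minutes(nines: int) -> float:
    """Yearly downtime budget for an availability of N nines (e.g. 3 -> 99.9%)."""
    return MINUTES_PER_YEAR * 10 ** (-nines)


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability from mean time between failures and mean time to recovery."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


for n in (3, 4, 5):
    print(f"{n} nines: {allowed_downtime_minutes(n):.2f} minutes of downtime per year")
```

Five nines leaves a budget of about 5.26 minutes per year. The formula also shows why recovery speed is the cheaper lever: halving MTTR reduces unavailability about as much as doubling MTBF does.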

Observability has helped here. When systems tell you what they are doing, you do not need as many layers of abstraction. 

https://redd.it/1pqrijg
@r_devops
when high eCPMs trick you into thinking a network performs well

i used to chase the “top” network by looking at ecpm alone. big mistake. one partner showed some crazy ecpm on paper, but the fill was so low that real revenue flatlined.

the wake up was a week in india where a “lower” network filled most of the requests and beat the fancy one on arpu. i removed the high ecpm one for two days and arpu jumped. felt kinda stupid ngl.

now i test for at least a week unless stuff breaks. i watch retention, session drops, and uninstall spikes, not only ecpm. i also added extra placements ahead of time and toggle them remote, which saves time and helps me test quick ideas without rebuilding.

if you’re stuck with unstable revenue, i’d look at arpu, fill, and session length together, not only ecpm.
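
the arithmetic behind this is simple enough to sketch. ecpm is revenue per 1000 impressions, and only filled requests become impressions, so what you actually earn per 1000 ad requests is ecpm times fill rate. the numbers below are made up for illustration, not from my dashboards.

```python
def revenue_per_1000_requests(ecpm_usd: float, fill_rate: float) -> float:
    """Effective revenue per 1000 ad requests: only the filled share earns the eCPM."""
    if not 0.0 <= fill_rate <= 1.0:
        raise ValueError("fill_rate must be between 0 and 1")
    return ecpm_usd * fill_rate


# Hypothetical networks: (eCPM in USD, fill rate).
networks = {
    "fancy_high_ecpm": (12.0, 0.10),
    "boring_high_fill": (4.0, 0.85),
}

for name, (ecpm, fill) in networks.items():
    print(f"{name}: ${revenue_per_1000_requests(ecpm, fill):.2f} per 1000 requests")
```

with these example numbers the "fancy" network earns $1.20 per 1000 requests and the "boring" one $3.40, even though its ecpm is three times lower.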



https://redd.it/1pqpf3q
@r_devops