Reddit DevOps – Telegram
Cloud vs. On-Prem Cost Calculator

Every "cloud pricing calculator" I’ve used is either from a cloud provider or a storage vendor. Surprise: their option always comes out cheapest.

So I built my own tool that actually compares **cloud vs on-prem** costs on equal footing:

* Includes hardware, software, power, bandwidth, and storage
* Shows breakeven points (when cloud stops being cheaper, or vice versa)
* Interactive charts + detailed tables
* Export as CSV for reporting
* Works nicely on desktop & mobile, dark mode included

It gives a full yearly breakdown without hidden assumptions.
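
To make the breakeven idea concrete, here is a minimal sketch. It is not the tool's actual code: the function names and every figure below are made up for illustration, and it assumes the simplest possible model, where on-prem is an upfront hardware cost plus a flat yearly running cost and cloud is a flat yearly bill.

```python
def cumulative_costs(years, onprem_capex, onprem_opex_per_year, cloud_cost_per_year):
    """Return a list of (year, onprem_total, cloud_total) cumulative costs."""
    rows = []
    for year in range(1, years + 1):
        onprem_total = onprem_capex + onprem_opex_per_year * year
        cloud_total = cloud_cost_per_year * year
        rows.append((year, onprem_total, cloud_total))
    return rows


def breakeven_year(rows):
    """First year in which cumulative on-prem cost is no higher than cloud, or None."""
    for year, onprem_total, cloud_total in rows:
        if onprem_total <= cloud_total:
            return year
    return None


# Hypothetical inputs: 100k of hardware, 20k/year to run it, 50k/year cloud bill.
rows = cumulative_costs(
    years=5,
    onprem_capex=100_000,
    onprem_opex_per_year=20_000,
    cloud_cost_per_year=50_000,
)
for year, onprem_total, cloud_total in rows:
    print(f"year {year}: on-prem {onprem_total:>8,} | cloud {cloud_total:>8,}")
print("breakeven year:", breakeven_year(rows))  # 4 with these inputs
```

A real comparison would also need hardware refresh cycles, growth in usage, and price changes, which is why the yearly breakdown matters more than a single number.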

I’m curious about your workloads. Have you actually found cloud cheaper in the long run, or does on-prem still win?

[https://infrawise.sagyamthapa.com.np/](https://infrawise.sagyamthapa.com.np/)

https://preview.redd.it/r5px17b6mzrf1.png?width=1080&format=png&auto=webp&s=c50f1bb0a86a023482d3807cf0f3365c7a8e33ea

https://redd.it/1nt2ib6
@r_devops
Offer Discounted Azure Certification Voucher (AZ-104 or Advanced Certs)

Hey everyone,

I’ve got an extra Azure certification voucher that’s valid for AZ-104 or any other advanced Azure certification exam.

👉 I’m willing to give it away for half the official price.
👉 If you’re interested, just DM me and we can work it out.

Cheers!

https://redd.it/1nt1vyo
@r_devops
Why does my Go Docker build take 15 minutes on GitHub Actions while Turborepo builds in 3-4 minutes?

I'm building a Go application in a Docker container on GitHub Actions and pushing it to Docker Hub. The entire process takes 12-15 minutes, which seems excessive for a compiled language that's supposed to be fast.

For context, I have a Turborepo project with a similar workflow that completes in 3-4 minutes. I'm using standard GitHub-hosted runners for both.

Is this normal for Go builds on GitHub Actions, or am I missing something obvious in my setup? What are the typical bottlenecks people run into with Go Docker builds in CI/CD?

https://redd.it/1nt873y
@r_devops
DR/FO

I am implementing DR in case of region failure. I have created a managed identity and a bunch of resources in a rg in EastUS. If disaster occurs, will this managed identity also go down? Will I have to create a new managed identity in a new region?

https://redd.it/1nt96cd
@r_devops
If you're running AI agents in production, they probably have way more access than they should. Podcast where we talk about how to secure MCP servers.

MCP servers are becoming some of the highest-privilege components in infrastructure. They sit between AI agents and APIs/data with broad service account permissions. When things go wrong, for example prompt injection, session bugs, etc., the blast radius is huge.

So I wanted to share this podcast episode with you all. It covers what MCP is, why it’s needed and used, and how it changes the game for all of us with regard to securing our applications.

The episode also covers how to actually secure MCP servers: it's done with dynamic, contextual authorization policies used as guardrails.

P.S. If you want, you can watch the entire episode, or just read the write-up.

45 min: [https://www.cerbos.dev/news/securing-ai-agents-model-context-protocol](https://www.cerbos.dev/news/securing-ai-agents-model-context-protocol)

I'm interested if anyone here is dealing with this. How are you handling permissions for AI tooling without just giving it admin access to everything?

**Here's an extract on the part about securing MCP servers:**

Bringing together the above points, what might a secure architecture for AI agents using MCP look like? A likely pattern is emerging:

* Establish identity for the agent’s session. When a user initiates an AI agent session, for example, connecting an AI assistant to their Slack or database via MCP, the system should go through an OAuth authorization flow. The result is the agent obtains a token that represents *“User X, via Agent Y”* with appropriate scopes. This token might even be a special *transaction token* limited to just this session. Standards and tools are still catching up here, but the idea is to avoid blind trust in the agent. All actions carry an identifier that ties back to the real user and the specific delegated rights.
* Use an external Policy Decision Point (PDP). The MCP server - which actually executes the tool actions - should not hardcode the permission logic for each action. That would get very messy and hard to update (imagine littering `if (role == admin)` checks all over your code). Instead, the MCP server can ask an external PDP service whether the current identity is allowed to invoke a given tool. This is exactly the model of Cerbos and similar policy engines. The MCP server defines all the possible tools it *could* perform, but right before execution it checks “Can user X (through agent) do action Y on resource Z now?”. The PDP evaluates the policies and says “allow” or “deny” (or even “require elevate” if we implement step-up prompts). In the Cerbos integration demo, this pattern is used to dynamically enable or disable each tool for the AI session - so the agent literally only sees the tools it’s permitted to use. If the user’s permissions don’t allow deletes, the delete command might not even be advertised to the AI model, preventing it from even attempting a forbidden operation.
* Maintain audit logs and visibility. Every action attempted and its outcome (allowed, denied, etc.) should be logged. This is critical not just for compliance, but for building trust with these AI systems. If something goes wrong, you need to trace back and see, *“What did the AI try to do? Why was it allowed? Who approved it?”* In a way, AI agents will force the issue of robust auditing - something that is good security hygiene regardless.
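
The three points above can be sketched in a few lines. This is only an illustration under stated assumptions: `pdp_check` is an in-memory stand-in for a call to an external PDP (it is not the Cerbos API), and the `Principal` type, the `POLICIES` table, and the tool names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Principal:
    """Identity carried by the session token: 'User X, via Agent Y'."""
    user: str
    agent: str
    roles: frozenset


# Hypothetical policy table: tool name -> roles allowed to invoke it.
# In a real setup this lives in the PDP, not in the MCP server.
POLICIES = {
    "read_record": {"viewer", "editor", "admin"},
    "update_record": {"editor", "admin"},
    "delete_record": {"admin"},
}

AUDIT_LOG = []  # every decision is recorded, allowed or denied


def pdp_check(principal, tool):
    """Stand-in for asking an external PDP; returns True for 'allow'."""
    allowed = bool(principal.roles & POLICIES.get(tool, set()))
    AUDIT_LOG.append({
        "user": principal.user,
        "agent": principal.agent,
        "tool": tool,
        "decision": "allow" if allowed else "deny",
    })
    return allowed


def advertised_tools(principal):
    """Only tools the PDP allows are exposed to the AI session."""
    return sorted(tool for tool in POLICIES if pdp_check(principal, tool))


def invoke(principal, tool):
    """Re-check right before execution, then run the (fake) tool."""
    if not pdp_check(principal, tool):
        raise PermissionError(f"{principal.user} via {principal.agent} may not call {tool}")
    return f"{tool} executed for {principal.user}"


editor = Principal(user="alice", agent="assistant", roles=frozenset({"editor"}))
print(advertised_tools(editor))  # ['read_record', 'update_record']
print(invoke(editor, "update_record"))
```

The check runs twice on purpose: once to decide what the model sees, and again at execution time, since permissions can change mid-session.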

https://redd.it/1ntebp9
@r_devops
I built GoCraft – an open-source generator for Go projects (Auth, DB, Docker, Swagger, gRPC)

Hey folks

I’ve been working on a project called [**GoCraft**](https://gocraft.online/) – an **open-source backend generator for Go** that helps developers skip boilerplate and jump straight into coding.

Instead of spending hours wiring up the same configs (Auth, DB, Docker, Swagger, etc.), GoCraft lets you:

* Add JWT Auth or OAuth2
* Choose DBs (PostgreSQL, MySQL, MongoDB, SQLite, Redis)
* Auto-generate Dockerfile + Docker Compose
* Get Swagger docs + Postman collection
* Add gRPC or WebSocket support
* Even plug in AI APIs like OpenAI

The idea is simple → **pick your stack, generate, and start coding**.
No more copy-pasting boilerplate.

Repo: [github.com/telman03/gocraft-backend](https://github.com/telman03/gocraft-backend)
Website: [gocraft.online](https://gocraft.online/)

I’d love feedback from the community

* Is this something you’d use?
* What features would you want added?
* Any ideas on making it more useful for real-world projects?

Thanks for reading! Excited to hear what you think

https://redd.it/1ntff4s
@r_devops
How do you manage your Vault/OpenBao policies as-code?

We're starting to use OpenBao which gets deployed by ArgoCD using the official Helm chart.
I would like to manage the policies etc. as-code via GitOps too, but I'm getting lost in all the options.

How are you guys solving this?

https://redd.it/1ntfesd
@r_devops
Terragrunt with GitLab Pipeline

I am using Terragrunt to deploy my infra. I have a folder structure similar to this:

infrastructure-aws/ ← AWS-specific pipeline
├── vpc/
│   ├── terragrunt.hcl
│   └── tfvars.hcl
├── ec2/
│   ├── terragrunt.hcl
│   └── tfvars.hcl
└── loadbalancer/
    ├── terragrunt.hcl
    └── tfvars.hcl


My tfvars.hcl contains some variables, e.g. image, ami, etc.
These variables are used in the terragrunt.hcl file: it reads the values from tfvars.hcl and passes them on in the inputs section.

I have been asked to take user input from the pipeline and pass it into my tfvars, but I am unsure how to do that.
I haven't found any examples yet.

So basically, in GitLab I will ask the user to supply a value for, say, image, then run the pipeline, and Terragrunt should take that value directly from the pipeline and use it.

https://redd.it/1nthq48
@r_devops
AI for DevOps. Related courses.

I’ve been searching for AI-related courses to improve my skills. Maybe someone can suggest something they’ve done?

I don’t mind a good online uni course. Doesn’t have to be Udemy and such.

Suggestions can cover a broad spectrum, as long as they’re related to automation and everyday DevOps routines.

Thanks in advance.

https://redd.it/1nte1yf
@r_devops
Our security team wants zero CVEs in production. Our containers have 200+. What's realistic here?

Our security team is on a mission for zero CVEs in production. Sounds great, to be honest. But in reality, it's proving almost impossible. Our container images are showing upwards of 200 vulnerabilities each.

We scan constantly, patch aggressively, but new CVEs pop up almost daily. It's basically overwhelming. The developers are frustrated, productivity grinds to a halt with all the remediations, and prioritizing which vulnerabilities really matter feels impossible. Not to mention the false alarms that eat up tons of our time.

So I’m wondering, what’s a realistic target here? Is zero CVEs in production a pipe dream for container-heavy environments? Or are there smarter approaches?

I’m trying to figure out how to keep the dream alive without burning out the team in the process.

https://redd.it/1ntlgek
@r_devops
Brief Overview of Release Orchestration 2025

I just finished writing a brief series of articles exploring how teams manage release orchestration. I'm posting this in case anyone else is facing comparable difficulties.

The articles go over the various strategies and patterns that contemporary development teams employ to plan their deployment procedures.

I'm always interested in hearing about the experiences of the community, so it would be wonderful to hear how others are handling their releases!

https://redd.it/1ntmpem
@r_devops
where is the moderation on this sub

this sub has turned into a bunch of advertisements, low effort "how 2 fix, halp lol?!111", and "Hi! I just graduated with a degree in MIS, how do I get a devops job?"

do we even have mods?

https://redd.it/1nto87v
@r_devops
Four Months Into DevOps: Humbling and Challenging

My background has mostly been in supporting internal IT, and recently I got put on a plan to transition into DevOps. I was really excited about it at first. Four months in, it’s been a ride, humbling, for sure.

I’ve been struggling to get my head around Kubernetes, AWS, and Terraform. It’s been frustrating because I haven’t felt this stuck in a long time. In IT, I could usually figure out a solution with enough digging. DevOps feels different, there are so many possible solutions to any problem that it’s hard to know if I’m on the right track.

Even though it’s discouraging at times, I’m determined to keep learning. I know it’s part of the process, and hopefully, with time and practice, these concepts will start clicking. I think I just needed to vent.

https://redd.it/1ntndu3
@r_devops
Quick question: Is envoy not supported on ubuntu 24.04?

Hi

I'm new to reverse proxy.

I wanted to look into using envoy proxy for a project, and went to install it. I'm running ubuntu 24.04 both on my laptop and on the server I'm going to deploy to.

Much to my surprise the latest ubuntu version in the official installation documentation is ubuntu 22.04.

https://www.envoyproxy.io/docs/envoy/latest/start/install#install-binaries

Is Envoy nearing EOL, or has it moved to another project (maybe a name change?), or is there another explanation?

There seems to not be a single hit when searching for "24.04" and "envoy".

What other proxy servers would be a good choice to use on Ubuntu 24.04?

Thanks.

https://redd.it/1ntsqfp
@r_devops
Built a Datadog pricing estimator — what service should I add next?

Hey folks, I’ve been working on interactive pricing calculators similar to what AWS/Azure offer today.

I started with Datadog (probably not the easiest first choice 😅). You can check it out here: uniqalc.com/datadog.

I’m considering doing OpenAI next, but curious — are there other tools/services you’d want to see supported?

https://redd.it/1ntpzuj
@r_devops
Flaky login tests due to 2FA — how to handle it?

We’ve got 2FA enabled in staging. Our Selenium tests fail half the time because the OTP flow blocks automation. I don’t want to disable 2FA entirely. Has anyone else run into this?

https://redd.it/1ntvona
@r_devops
Shall I make the move to DevOps?

Working as a senior Infrastructure engineer currently looking after network, VMware and Azure/M365 platform in a hybrid cloud environment. Working as a lead overseeing architecture, design and implementation. Worked heavily in Azure, IaC, pipelines, observability and other DevOps tools in the past 2 years. Shall I make the move to DevOps or aim for an Architect-type path? I want to stay hands-on technically. Any advice is much appreciated.

https://redd.it/1ntx3vu
@r_devops
Easiest way to keep internal documentation up to date other than doing it manually every time?

I understand that engineers need to state the reasoning behind code in docs, but what about facts like retry mechanisms, constants, API specs, etc.? These little mundane things could change at any time...


https://redd.it/1nty6ia
@r_devops
Easy Cron Job in JSON?

I’d like to get some feedback on my project…

It's a cron job for Linux systems. It differs from the system cron job in that you write jobs in JSON, a more user-friendly format, and you can specify system conditions for the job.

```json
{
  "jobs": [
    {
      "description": "Nightly backup",
      "command": "/usr/local/bin/backup.sh",
      "schedule": {
        "minute": "0",
        "hour": "2",
        "day_of_month": "*",
        "month": "*",
        "day_of_week": "*"
      },
      "conditions": {
        "cpu": "<80%",
        "ram": "<90%",
        "disk": {
          "/": "<95%"
        }
      }
    }
  ]
}
```
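
For readers wondering how conditions like these might be evaluated, here is a rough sketch in Python. It is not NanoCron's actual implementation: the function names are hypothetical, the threshold grammar (an operator, a number, a `%` sign) is assumed from the example above, and the current metrics are passed in as a plain dict instead of being read from the system.

```python
import re

# Assumed grammar: comparison operator, a number, then a percent sign.
_THRESHOLD = re.compile(r"^(<=|>=|<|>)\s*(\d+(?:\.\d+)?)%$")

_OPS = {
    "<": lambda value, limit: value < limit,
    "<=": lambda value, limit: value <= limit,
    ">": lambda value, limit: value > limit,
    ">=": lambda value, limit: value >= limit,
}


def check(expr, value):
    """Evaluate one threshold such as '<80%' against a usage percentage."""
    match = _THRESHOLD.match(expr.strip())
    if match is None:
        raise ValueError(f"unsupported condition: {expr!r}")
    op, limit = match.group(1), float(match.group(2))
    return _OPS[op](value, limit)


def conditions_met(conditions, metrics):
    """True only if every condition holds; nested dicts (e.g. disk) recurse."""
    for key, expected in conditions.items():
        actual = metrics[key]
        if isinstance(expected, dict):
            if not conditions_met(expected, actual):
                return False
        elif not check(expected, actual):
            return False
    return True


conditions = {"cpu": "<80%", "ram": "<90%", "disk": {"/": "<95%"}}
metrics = {"cpu": 35.0, "ram": 62.5, "disk": {"/": 97.0}}  # made-up readings
print(conditions_met(conditions, metrics))  # False: "/" is at 97%, not below 95%
```

With these readings the job would be skipped at 02:00 because the root filesystem is over its limit, even though CPU and RAM are fine.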

GitHub: https://github.com/GiuseppePuleri/NanoCron

Video demo: https://nanocron.puleri.it/nanocron_video.mp4

Could this be useful in Docker?

https://redd.it/1ntv1gl
@r_devops
Career choices


I've been in CS for about a year now and I've discovered that I can't stand frontend. I respect everyone who commits to that side of software engineering, but it's not for me. Can a software engineer take on Cloud and DevOps roles alongside backend work, if that's what interests them, without touching frontend at all? In other words, can someone combine these two areas and be highly paid without needing to know frontend?

https://redd.it/1nu0m8o
@r_devops
To all the devs out there, how do you like to be sold to?

Don't just say "let me test it and see for myself"; I know you do. But what else? What kind of messaging and marketing do you like? I know you won't get on a sales call; you need to try it first or build it yourself. But if I have to sell to you, how do you actually buy?

https://redd.it/1nu23j0
@r_devops