Reddit DevOps – Telegram
How do you track IaC drift caused by ClickOps?

I'm learning IaC right now. I learned that IaC often faces drift problems caused by ClickOps. How do you detect the drift? Or you just can't...?

https://redd.it/1py07yg
@r_devops
How do you decide whether to touch a risky but expensive prod service?

I’m curious how this works on other teams.

Say you have a production service that you know is overprovisioned or costing more than it should. It works today, but it’s brittle or customer facing, so nobody is eager to touch it.

When this comes up, how do you usually decide whether to leave it alone or try to change it?

Is there a real process behind that decision, or does it mostly come down to experience and risk tolerance?

Would appreciate hearing how people handle this in practice.

https://redd.it/1py0ptl
@r_devops
My experiences on the best kinds of documentation, what are yours?

Like many of you, I've often felt that writing documentation didn't serve its stated purpose.

I've heard people say "what if you get hit by a bus?", but then whatever I write becomes irrelevant before anyone reads it. Tribal knowledge and personal experience seem to matter more.

That said, I've found a few cases where documentation actually works:

Architecture diagrams - Even when they're not 100% accurate, they help people understand the system faster than digging through config panels or code.

Quick reference for facts - URLs for different environments, account numbers, repo names. Things you need to recall but don't use every day.

Vision/roadmap documents - Writing down multi-year plans helps the team align on direction. Everyone reads the same thing instead of having different interpretations from meetings.

But detailed how-to guides or step-by-step procedures? Those seem to go stale immediately and nobody maintains them.

What's the most useful documentation you've seen, and what made it actually work?

https://redd.it/1py3xes
@r_devops
What is the most difficult thing you had to implement as a DevOps engineer?

I had to complete some DevOps tickets in the past, but I didn't do anything particularly difficult, as I am primarily a software developer. So I'm interested to know what actual DevOps engineers do on a day-to-day basis, beyond just clearing basic infrastructure tickets.

https://redd.it/1py4eb0
@r_devops
Looking for a cheap Linux server for Spring Boot app + domain

Hi everyone,

I’m a beginner when it comes to deploying applications and servers, and I’m planning to deploy my first Spring Boot Application.

Right now I’m searching for a cheap Linux server / VPS to host a small project (nothing high-load yet). I’d appreciate recommendations for reliable low-cost providers.

I also have a few related questions:

- Where is the best place to buy a domain name?

- Is it reasonable to run the database on the same server as the API for a small project, or is it better to separate them from the start?

If you have any tips, warnings, or best practices to share - I’d be happy to hear them.
Thanks in advance!

https://redd.it/1py2546
@r_devops
Did DevOps Get Harder or Did We Overdo the Tools

Sometimes it feels like DevOps didn’t get harder, we just kept adding tools over time. One team on ArgoCD, another on Jenkins or GitHub Actions, workflows in Prefect, infra split between Terraform and Pulumi, monitoring across Datadog and Prometheus, plus Cosine for code navigation into daily work.

Each tool is fine on its own. Together, every deploy feels like walking through old decisions and duct tape. When something breaks, we end up debugging the toolchain more than the product.

How do you deal with this? Standardize, let teams choose, or accept the chaos?

https://redd.it/1pyek0i
@r_devops
MacBook Air or Pro? Urgent!!

Hello,



I currently work in AWS with networking services, and I want to start learning DevOps in the coming days so I can switch to a full DevOps role. Learning will involve setting up and running Kubernetes and Docker.

For this, I am buying a personal laptop with enough storage to set up and run all of this. There's no real performance requirement, as it is purely for learning. I'm also not sure what else I will need to set up along the way, since DevOps is still new to me.

Considering all this, would a MacBook Air with 256 GB be enough for learning?

Or should I buy a Pro?

The thing is, I am buying this in the US, and if I go for the Air with 512 GB, I might as well pay a little extra and get a Pro. So please help me choose between the MacBook Air 256 GB and the MacBook Pro.

Thanks in advance!

https://redd.it/1pyftgb
@r_devops
How does the Podman team expect people to learn it?

I've been instructed by our infra team that my proposed project should be deployed with Podman (and not Docker) because they are afraid of giving root access.

I said "no biggie", just another tool in my belt, but I am quite clueless about where to start. The docs are frighteningly sparse. It's even worse with Quadlets. The top 3 results on Google are a Reddit thread, Podman Desktop, and the podman-quadlet docs, which have even less info than the Podman ones.

It feels like I'm not in on some joke. Sure, I can google tutorials (I prefer official documentation, as I find tutorials too ad hoc), but is that really everything there is? I almost don't believe it. Does the Podman team expect tech influencers to write tutorials/books based on trial and error?

https://redd.it/1pyhwak
@r_devops
Is the "DevOps" title just becoming a fancy name for a 24/7 Support Engineer?

I’ve been in the industry for some time, and I’m starting to worry about the direction the "DevOps" role is taking in a lot of companies. Originally, it was supposed to be about breaking down silos and shared responsibility, but in many places, it has just turned into a dumping ground for everything the dev team doesn't want to deal with.

If a deployment fails, it’s a DevOps problem. If the cloud bill is too high, it’s a DevOps problem. If a database is slow, call DevOps. We’ve gone from "building platforms" to just being the people who get paged at 3 AM because a script we didn't write failed in a way we couldn't predict. We are spending so much time putting out fires that we don't have the bandwidth to actually automate the systems that prevent them.

I’ve been trying to document some better boundaries and automation patterns on OrbonCloud lately.
Are we just stuck as the "everything" engineers now?

https://redd.it/1pyi2tp
@r_devops
Chainguard vs Docker DHI

Docker releasing their hardened images for free - does that affect Chainguard at all or are people fully locked in?

https://redd.it/1pyjhc7
@r_devops
What's the best way to deploy?

Hi everyone,
I need to deploy a web app (Redmine, an open-source project management app). It is an internal web app.
The app is currently running on-prem on a VM with RHEL 7.
We have over 1000 active users.
We want to use Azure, but I really don't know whether to go with Azure App Service (container) or Azure Container Apps.
I'll also deploy Azure Files and Azure Database for MySQL.
I'd appreciate any help or advice.



https://redd.it/1pyjdus
@r_devops
I’m building a DevOps simulation, what real-world pain points should I add to make it feel authentic

I want to build something that for sure nobody is ever going to use, but I just hate my free time and I find it interesting enough to build.

The idea is a game with a similar vibe to Among Us, but aimed at devs / DevOps.

You’re all on the same team, responsible for keeping a company’s software running. One of the players is a saboteur whose goal is to take things down. The rest of the team has to keep production alive and figure out who’s causing the incidents.

The problem: I’m not a real DevOps engineer. I’m a developer who ends up doing DevOps because the companies I work for are too cheap to hire one. So while I know some pain, I’m very aware I probably don’t know half of it.

For now, each round spawns a fresh Ubuntu container that represents the company’s main machine. Every player gets a Linux user on that machine. One player is the “manager” with sudo access and decides who gets elevated privileges and when. The system starts in a working state: applications are already running under a process manager (currently PM2), nginx or Apache is preconfigured (based on player choice), DNS is set up, and there’s a mocked certbot-like setup handling SSL.

For now there are three possible initial system states:

• “Setup by DevOps” – everything is where it’s supposed to be (assuming I didn’t mess anything up).
• “Setup by children” – things mostly work, but there are some mistakes.
• “Setup by a frontend dev” – everything runs as sudo and nothing is where it’s supposed to be.

The game features an in-game terminal, a browser, and some other unimportant apps. The player can interact with the pages via the in-game browser, and with the machine via the in-game terminal or any terminal that can SSH into the container.

Now I am at the stage where I need to create tasks, like "the company changed its name, so the website should no longer be www.company.com but www.newcompany.com". The players should then buy the domain (mocked providers), set up the nameservers and DNS records, and then update nginx. Or change the port of the xBackendService to whatever.
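One way a rename task like this could be scored is to inspect the resulting nginx config rather than trusting the player. The Python sketch below is only an illustration under assumptions: the function names `server_names` and `rename_task_done` are hypothetical, the config is passed in as plain text, and a real check would also need to cover DNS records and the mocked registrar.

```python
import re


def server_names(nginx_conf: str) -> set:
    """Collect every hostname listed in active server_name directives."""
    names = set()
    # Commented-out directives ("# server_name ...") do not match,
    # because only whitespace is allowed before the keyword.
    pattern = r"^\s*server_name\s+([^;]+);"
    for match in re.finditer(pattern, nginx_conf, flags=re.MULTILINE):
        names.update(match.group(1).split())
    return names


def rename_task_done(nginx_conf: str, old_host: str, new_host: str) -> bool:
    """Hypothetical task check: new hostname is served, old one is gone."""
    names = server_names(nginx_conf)
    return new_host in names and old_host not in names


if __name__ == "__main__":
    conf = "server {\n    listen 80;\n    server_name www.newcompany.com;\n}\n"
    print(rename_task_done(conf, "www.company.com", "www.newcompany.com"))
```

A text-based check like this keeps scoring deterministic, which matters if a saboteur can edit the same file moments later.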

And this is where I’d really appreciate some help: without making it too daunting or frustrating, and while keeping things balanced for both teams, what other DevOps pain points should I add to keep it authentic while still making it somewhat fun? (It's a simulation after all, and making it really fun would break the immersion, I guess.)

PS: I am not trying to advertise this, as I am pretty sure it will never go to market. I'm a nerd and just enjoy building interesting things for myself, and this turned out to be surprisingly fun to work on.

https://redd.it/1pym5hv
@r_devops
Simple PoW challenge system against AI bots/scrapers of all sorts.

Remember when bots were just annoying? Googlebot, Bingbot, maybe some sketchy SEO crawlers. You'd throw a robots.txt at them and call it a day.

Those days are gone.

Now it's OpenAI, Anthropic, Perplexity, ByteDance, and god knows how many "AI agents" that everyone's suddenly obsessed with. They don't care about robots.txt. They don't care about your bandwidth. They don't care that your home $2/month VPS is getting hammered 24/7 by scrapers training models worth billions.

These companies are scraping content to build AI that will eventually replace the people who created that content. We're literally feeding the machine that's coming for us.

So I built a SHA-256 proof-of-work challenge system for Nginx/OpenResty. Nothing like Anubis, yet still effective.

https://github.com/terem42/pow-ddos-challenge/

Here's the idea:

Every new visitor solves a small computational puzzle before accessing content

Real browsers with real humans? Barely noticeable — takes <1 second

Scrapers hitting you at scale? Now they need to burn CPU for every single request

At difficulty 5, each request costs ~2 seconds of compute time

Want to scrape 1 million pages? That'll be ~$2,000 in compute costs. Have fun.
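As a rough illustration of the puzzle described above, here is a minimal Python sketch of one SHA-256 proof-of-work round. It assumes that "difficulty" means the number of leading zero hex digits in the hash of the challenge string plus a nonce. The linked repository may define difficulty and the challenge format differently, and the names `is_valid` and `solve` are made up for this example.

```python
import hashlib
import itertools


def is_valid(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: one hash to check a submitted nonce.

    Assumption: a solution is valid when the hex digest starts
    with `difficulty` zero characters.
    """
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force the smallest nonce that passes is_valid."""
    for nonce in itertools.count():
        if is_valid(challenge, nonce, difficulty):
            return nonce


if __name__ == "__main__":
    # Low difficulty so the demo finishes quickly.
    nonce = solve("demo-challenge", 3)
    print(nonce, is_valid("demo-challenge", nonce, 3))
```

Under this assumption, each extra difficulty level multiplies the expected client work by 16, so difficulty 5 needs about 16^5 (roughly one million) hashes on average, while the server still verifies with a single hash.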

The beauty is the economics flip. Instead of YOU paying for their requests, THEY pay for their requests. With their own electricity. Their own CPU cycles.

Yes, if a scraper solves one challenge and saves the cookie, they get a free pass for the session duration. That's why I recommend shorter sessions (POW_EXPIRE=3600) for sensitive APIs.

The economics still work: they need to solve PoW once per IP per session. A botnet with 10,000 IPs still needs 10,000 PoW solutions. It's not perfect, but it's about making scale expensive, not impossible.
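To sanity-check these numbers, the short Python sketch below redoes the arithmetic using only the figures quoted in this post (about 2 seconds per solution at difficulty 5, 1 million pages, $2,000, 10,000 IPs). The per-CPU-hour price it prints is simply what those figures imply, not a measured cloud price, and the function names are hypothetical.

```python
def cpu_hours(solutions: int, seconds_per_solution: float) -> float:
    """Total CPU time, in hours, needed to solve `solutions` challenges."""
    return solutions * seconds_per_solution / 3600


def implied_price_per_cpu_hour(total_cost: float, solutions: int,
                               seconds_per_solution: float) -> float:
    """Price per CPU-hour that would make `total_cost` the scraping bill."""
    return total_cost / cpu_hours(solutions, seconds_per_solution)


if __name__ == "__main__":
    # One solution per page: 1,000,000 pages at ~2 s each.
    print(f"{cpu_hours(1_000_000, 2.0):.1f} CPU-hours")  # 555.6 CPU-hours
    # Price implied by the quoted $2,000 figure.
    print(f"${implied_price_per_cpu_hour(2000, 1_000_000, 2.0):.2f}/CPU-hour")  # $3.60/CPU-hour
    # One solution per IP per session: 10,000 IPs at ~2 s each.
    print(f"{cpu_hours(10_000, 2.0):.1f} CPU-hours")  # 5.6 CPU-hours
```

So the $2,000 estimate only holds if a scraper pays about $3.60 per CPU-hour and solves one challenge per page. With session cookies, a 10,000-IP botnet needs only about 5.6 CPU-hours per session window, which is why the session length matters so much.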

It won't stop a determined attacker with deep pockets. Nothing will. But it makes mass scraping economically stupid. And that's really all we can ask for.

https://redd.it/1pylqty
@r_devops
We had a credential leak scare and now I do not trust how we share access

We had a close call last week where an old API key showed up in a place it absolutely should not have been. Nothing bad happened, but it was enough to make me realize how messy our access setup actually is. Between Slack, docs, and password managers, credentials have been shared far more casually than I am comfortable with.
The problem is that people genuinely need access. Contractors, accountants, devs jumping in to help, sometimes even temporary automation. Rotating everything constantly is not realistic, but keeping things as they are feels irresponsible.
I am looking for recommendations on better ways to handle this. Ideally something where access can be granted without exposing credentials and can be revoked instantly without breaking everything else. How are others solving this after a scare like this?

https://redd.it/1pyo1hh
@r_devops
Are you guys using SOPs and runbooks?

I’m about to start writing SOPs and runbooks for my infra and wanted to see how others are doing it.

Are you actually using SOPs/runbooks in prod, or do they just rot over time?
What tools do you use to draft and maintain them (Notion, Confluence...)?
How are you handling alerts?

would love to hear what setups are actually working (or not) in real companies.

https://redd.it/1pynmxg
@r_devops
How do you enforce escalation processes across teams?

In environments with multiple teams and external dependencies, how do you enforce that escalation processes are actually respected?

Specifically:

* required inputs are always provided
* ownership is clear
* escalations don’t rely on calls or tribal knowledge

Or does it still mostly depend on people chasing others on Slack?

Looking for real experiences, not theoretical frameworks.

https://redd.it/1pynic2
@r_devops
how to combine 2 different framework in devops temple

OK guys, I know this might not make sense.

1) English is not my first language

2) I am not a DevOps professional, just practicing

So, I want to set up a WordPress app to write blog posts (I already host one WordPress site on my EC2 instance, so I am a little familiar with WordPress). I also have another app as a side project, and I want to set up a CI/CD pipeline for it and post its progress on the blog. Here is where I am struggling:

1) WordPress is written in PHP, while my side project is written in Java with Spring Boot. Is it common for two different frameworks to interact?

2) I want to keep my WordPress container always up. Would that cost too much?

3) Does it make sense to host my WordPress site as a container?



https://redd.it/1pytdp5
@r_devops
How much code are you writing daily

What is the DevOps workflow like? Are you always writing automation scripts, or is a large chunk of it reviewing others' scripts? How much of the job is actually writing scripts? And what is the best advice you can give me on becoming a DevOps engineer? What do you feel you really need to understand to make it in the field?

https://redd.it/1pyx2x1
@r_devops