Reddit DevOps – Telegram
Brief Overview of Release Orchestration 2025

I just finished writing a brief series of articles exploring how teams manage release orchestration. I'm posting this in case anyone else is facing comparable difficulties.

The articles go over the various strategies and patterns that contemporary development teams employ to plan their deployment procedures.

I'm always interested in hearing about the experiences of the community, so it would be wonderful to hear how others are handling their releases!

https://redd.it/1ntmpem
@r_devops
where is the moderation on this sub

this sub has turned into a bunch of advertisements, low effort "how 2 fix, halp lol?!111", and "Hi! I just graduated with a degree in MIS, how do I get a devops job?"

do we even have mods?

https://redd.it/1nto87v
@r_devops
Four Months Into DevOps: Humbling and Challenging

My background has mostly been in supporting internal IT, and recently I got put on a plan to transition into DevOps. I was really excited about it at first. Four months in, it’s been a ride, humbling, for sure.

I’ve been struggling to get my head around Kubernetes, AWS, and Terraform. It’s been frustrating because I haven’t felt this stuck in a long time. In IT, I could usually figure out a solution with enough digging. DevOps feels different, there are so many possible solutions to any problem that it’s hard to know if I’m on the right track.

Even though it’s discouraging at times, I’m determined to keep learning. I know it’s part of the process, and hopefully, with time and practice, these concepts will start clicking. I think I just needed to vent.

https://redd.it/1ntndu3
@r_devops
Quick question: Is envoy not supported on ubuntu 24.04?

Hi

I'm new to reverse proxy.

I wanted to look into using envoy proxy for a project, and went to install it. I'm running ubuntu 24.04 both on my laptop and on the server I'm going to deploy to.

Much to my surprise the latest ubuntu version in the official installation documentation is ubuntu 22.04.

https://www.envoyproxy.io/docs/envoy/latest/start/install#install-binaries

Is Envoy nearing EOL, or has it moved to another project (maybe a name change?), or is there another explanation?

There seems to not be a single hit when searching for "24.04" and "envoy".

What other proxy servers would be a good choice to use on Ubuntu 24.04?

Thanks.

https://redd.it/1ntsqfp
@r_devops
Built a Datadog pricing estimator — what service should I add next?

Hey folks, I’ve been working on interactive pricing calculators similar to what AWS/Azure offer today.

I started with Datadog (probably not the easiest first choice 😅). You can check it out here: uniqalc.com/datadog.

I’m considering doing OpenAI next, but curious — are there other tools/services you’d want to see supported?

https://redd.it/1ntpzuj
@r_devops
Flaky login tests due to 2FA — how to handle it?

We’ve got 2FA enabled in staging. Our Selenium tests fail half the time because the OTP flow blocks automation. I don’t want to disable 2FA entirely. Has anyone else run into this?

https://redd.it/1ntvona
@r_devops
Shall I make the move to DevOps?

I'm a senior infrastructure engineer, currently looking after network, VMware, and the Azure/M365 platform in a hybrid cloud environment. I work as a lead overseeing architecture, design, and implementation, and I've worked heavily with Azure, IaC, pipelines, observability, and other DevOps tools over the past 2 years. Should I make the move to DevOps or aim for an architect-type path? I want to stay hands-on technically. Any advice is much appreciated.

https://redd.it/1ntx3vu
@r_devops
Easiest way to keep internal documentation up to date other than doing it manually every time?

I understand that engineers need to state the reasoning behind code in docs, but what about facts like retry mechanisms, constants, API specs, etc.? These are little mundane things that could change at any time...


https://redd.it/1nty6ia
@r_devops
Easy Cron Job in JSON?

I'd like to get some feedback on my project…

It's a cron-style job scheduler for Linux systems. It differs from the system cron in that you write jobs in JSON, a more user-friendly format, and you can specify system conditions that must hold for a job to run.

```json
{
  "jobs": [
    {
      "description": "Nightly backup",
      "command": "/usr/local/bin/backup.sh",
      "schedule": {
        "minute": "0",
        "hour": "2",
        "day_of_month": "*",
        "month": "*",
        "day_of_week": "*"
      },
      "conditions": {
        "cpu": "<80%",
        "ram": "<90%",
        "disk": {
          "/": "<95%"
        }
      }
    }
  ]
}
```
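To make the `conditions` block more concrete, here is a rough Python sketch of how a threshold string such as `"<80%"` could be checked against current system readings. This is not NanoCron's actual implementation: the function names and the hard-coded readings are hypothetical, and only the `<N%` form that appears in the example config is handled.

```python
import re

# Hypothetical helper: parse a threshold like "<80%" into a number.
# Only the "<N%" form from the example config is supported here.
_THRESHOLD = re.compile(r"^<\s*(\d+(?:\.\d+)?)\s*%$")


def below_threshold(rule: str, current_percent: float) -> bool:
    """Return True if current_percent is strictly below the limit in rule."""
    match = _THRESHOLD.match(rule.strip())
    if match is None:
        raise ValueError(f"unsupported condition: {rule!r}")
    return current_percent < float(match.group(1))


def conditions_met(conditions: dict, readings: dict) -> bool:
    """Check every condition; nested dicts (per-mount disk limits) recurse."""
    for key, rule in conditions.items():
        current = readings[key]
        if isinstance(rule, dict):
            if not conditions_met(rule, current):
                return False
        elif not below_threshold(rule, current):
            return False
    return True


conditions = {"cpu": "<80%", "ram": "<90%", "disk": {"/": "<95%"}}
# Made-up readings; a real scheduler would sample these from the system.
readings = {"cpu": 35.0, "ram": 92.5, "disk": {"/": 60.0}}
print(conditions_met(conditions, readings))  # False: RAM is above its 90% limit
```

With these readings the nightly backup would be skipped, because RAM usage is over its limit even though CPU and disk are fine.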

GitHub: https://github.com/GiuseppePuleri/NanoCron

Video demo: https://nanocron.puleri.it/nanocron_video.mp4

Could this be useful in Docker?

https://redd.it/1ntv1gl
@r_devops
Career choices


I've been in CS for about a year now and I've discovered that I can't stand frontend. I respect everyone who commits to that side of software engineering, but it's not for me. Can a software engineer take on cloud and DevOps roles alongside backend work, if that's what interests them, without touching frontend at all? In other words, can someone combine these two areas and be highly paid without needing to know frontend?

https://redd.it/1nu0m8o
@r_devops
To all the devs out there, how do you like to be sold to?

Don't just say "let me test it and see for myself"; I know you do that. But what else? What kind of messaging and marketing do you like? I know you won't get on a sales call, and that you need to try it first or build it yourself. But if I have to sell to you, how do you actually buy?

https://redd.it/1nu23j0
@r_devops
Introducing Upyng – A Powerful Offline Utility App for DevOps & Techies!

Hey everyone,

I’ve been working on something I’m really excited to share – my app Upyng. It’s currently available for macOS, and I’m actively working on bringing it to Windows and Linux by October 15.

Originally, I planned to launch Upyng as an online website, but I ran into issues integrating Google Ads. Since the entire project is built using Flutter, I decided to pivot and build proper desktop apps instead. This turned out to be a great decision — now everything works completely offline, with no dependency on third-party websites.

Upyng brings together several commonly used developer and debugging tools into one clean, fast, and modern app, so you don’t have to juggle multiple sites or separate utilities.

Current features include:
• Regex tester
• JSON / YAML / XML / CSV formatter & viewer
• Grok tester
• Text compare
• Cron helper
• QR code generator

For this launch month, Upyng is available at a reduced price until October 31. After that, the price will increase, so it’s a good time to grab it early and support the project.

Current status:
• Available now: macOS
• Coming October 15: Windows & Linux

Mac App Store link: https://apps.apple.com/in/app/upyng-devtools-more/id6752918289?mt=12

I’d love to get your feedback, suggestions, and support to help shape Upyng’s future development.

Thanks so much,
— Suraj


https://redd.it/1nu2nh9
@r_devops
Switching from Data Science to Cloud Engineering? Need opinions from people in the industry

Hi everyone,
I’ve been learning and practicing data science and ML for the last 6 months. I also hold Cisco and IBM certifications in this field, and I feel somewhat comfortable with the basics now.

But recently, I’ve noticed that almost everyone is getting into data science/ML, and the competition seems extremely high. That’s why I’m considering shifting my focus toward cloud computing and cloud engineering roles — something that feels more engineering-focused and potentially in higher demand.

For those of you already working in tech (especially in cloud or data-related roles):

Do you think it’s worth pivoting from data science to cloud engineering at this stage?

What’s the job market like for cloud engineers compared to data science right now?

Are there clear entry-level paths/resources you’d suggest?


Any honest suggestions or experiences would be really helpful. Thanks a lot!

https://redd.it/1nu4n3h
@r_devops
Bare metal OpenStack vs K8s-first for a self-service regional cloud?

Hi folks, I currently run a private DC with paying customers acquired through direct B2B sales. I'd like to switch to self-service (sign up, provision, pay). I’m torn between:

A) Bare metal (Ubuntu 24.04) → OpenStack control plane (Ansible, Galera) → tenants via Terraform
B) Bare metal (Ubuntu 24.04) → Kubernetes mgmt layer → OpenStack on top → Terraform for tenants

3 questions:
1. From an operations POV, is OpenStack directly on metal simpler to run/upgrade, or is K8s-first more maintainable long term?
2. What’s your favorite portal + IAM + billing combo for dev-friendly self-service (API keys, projects/quotas, usage graphs)?
3. What guardrails are non-negotiable for open signups (quotas, egress controls, WAF/DDoS, rate limits, abuse detection)?

Bonus: Opinions on OVN vs OVS, Ceph design, Cells v2/regions, SSO/OIDC, blue/green upgrades, and GPU/MIG quotas welcome.

🙏

https://redd.it/1nu744r
@r_devops
Managing test analytics & flaky test detection - tools?

We have a growing suite, and flakiness is a nightmare. CI logs aren’t enough to see patterns. Are there analytics dashboards that track flaky tests over time?


https://redd.it/1nu7zkk
@r_devops
I'm a backend developer who wants to extend my knowledge to DevOps and infrastructure

I made a book list, but I think it's overkill. I'm here to ask for recommendations on how to approach this.

My list is:

"The Linux Command Line" by William Shotts (2019)

"Deployment from Scratch"

"Fundamentals of DevOps and Software Delivery"

"Learn Docker in a Month of Lunches"

"Learn Kubernetes in a Month of Lunches"

"Release It!"

"Systems Performance"

I have some experience with Docker.

https://redd.it/1nu97i8
@r_devops
DevOps Audit/Auditor

Hi all,

I need to find a DevOps /SaaS auditor, any clue how I would find one?

Thanks

Ssushi

https://redd.it/1nua0jz
@r_devops
FluxCD webhook receivers setup in large orgs

Hi there,

As I was implementing FluxCD at a large org, I wondered how many of you using Flux proactively use the webhook component to send events and trigger reconciliations for Git repositories, image automation, Kustomizations, etc.

In a development environment, one would want quick updates when building a new image or editing manifests, which means the ImageUpdateAutomation has to commit quickly and then trigger a GitRepository and Kustomization reconciliation. That is the use case for Receivers. It would also allow for longer update intervals, which could help reduce resource usage (in the forge and the controllers) in a setup with tens of GitRepositories, Kustomizations, and lots of clusters. But then again, how do you use that efficiently in a multi-cluster setup, since the application being built knows neither the namespace(s) it should be deployed in nor the destination Flux instances?

I went quite far in this rabbit hole, even wondering if I should somehow build some kind of Receiver router that would then dispatch received events to the correct flux instances using some CRDs, etc. but then I thought I might not be the only one with this use case (it seemed pretty standard) so I should ask the community how they're doing it.

Please advise!

https://redd.it/1nub7ax
@r_devops
I see enterprises make these 3 cloud mistakes constantly. What's the biggest 'oops' you've ever seen?

**Your Monolith is Groaning, and Your CFO is Asking Questions.**

Let's be honest. Your on-premise servers are running hot, scaling for the holiday rush is a year-long panic attack, and every new feature deployment feels like open-heart surgery. You know the cloud is the answer, but the path from your current state to a nimble, cloud-native enterprise application seems foggy and filled with buzzwords.

This isn't another high-level whitepaper. This is a practical, no-BS guide to getting it done right. I'll cover the critical decisions, the tools that actually work, and the traps that'll burn your budget.

# Part 1: The "Why" - The No-Fluff Benefits of the Cloud

Forget "digital transformation." Here's what you actually get.

* Stop Guessing Your Capacity: Remember ordering servers 6 months in advance? Now you can scale your resources up or down in minutes. Pay for what you use, not what you might use.
* Go Faster (Seriously): With the right setup, your developers can go from writing code to deploying it in a single afternoon. This isn't a fantasy; it's what a well-oiled CI/CD pipeline in the cloud provides.

* Global Reach, Local Speed: With a few clicks, you can deploy your application in data centers from Virginia to Frankfurt to Tokyo, giving users a low-latency experience anywhere in the world.

# Part 2: Your Enterprise Cloud Roadmap: A 5-Step Practical Guide

**Step 1: Choose Your Playground (AWS vs. Azure vs. GCP)**

This is the first holy war you'll encounter. All three are excellent, but they have different personalities.

|Factor|AWS (Amazon Web Services)|Azure (Microsoft)|GCP (Google Cloud Platform)|
|:-|:-|:-|:-|
|**The Vibe**|The undisputed market leader. Has a service for everything. The "default choice."|The enterprise champion. Deep integration with Microsoft products (Windows Server, Office 365, Active Directory).|The data & container expert. King of Kubernetes, Big Data, and AI/ML services.|
|**Best For...**|Companies wanting the widest array of services and the largest community support.|Enterprises heavily invested in the Microsoft ecosystem.|Companies focused on data analytics, machine learning, and container orchestration.|
|**Watch Out For**|The sheer number of services can be overwhelming. The billing can get complex fast.|The user interface can sometimes feel less intuitive than the others.|Smaller market share means a slightly smaller talent pool in some areas.|

*Pro-Tip: Don't get paralyzed by choice. For most general-purpose enterprise apps, any of the three will work. Make the decision based on your team's existing expertise and your company's strategic alliances (e.g., if you're a Microsoft shop, Azure is a natural fit).*

**Step 2: Pick Your Architecture (Don't Just Default to Microservices)**

How you structure your app is the most critical decision you'll make.

Monolith: Your entire application is a single, unified unit.

* Pro: Simple to develop, test, and deploy initially.
* Con: Becomes a nightmare to update and scale as it grows. A bug in one small part can bring down the entire app. This is likely what you're moving away from.

Microservices: Your application is broken down into small, independent services that communicate with each other via APIs.

* Pro: Highly scalable and resilient. Teams can work on different services independently. You can use different tech stacks for different services.
* Con: Way more complex. You have to manage a distributed system, which adds challenges in networking, monitoring, and data consistency. **Don't adopt microservices just because it's trendy.**

Serverless (Functions as a Service): You don't manage any servers. You just write code (functions) that runs in response to events (like an API call or a file upload).

* Pro: Ultimate scalability and cost-efficiency (you truly pay for what you use, down to the millisecond).
* Con: Can lead to vendor lock-in. Not suitable for long-running, computationally intensive tasks.

*Pro-Tip: Start with a "well-structured monolith" or a few key microservices. Avoid breaking everything down into 100 tiny services from day one. Evolve your architecture; don't try to perfect it on the first attempt.*

**Step 3: Embrace Automation (Your DevOps Playbook)**

The cloud's power is wasted if your deployment process is still manual.

CI/CD is Non-Negotiable: Set up a Continuous Integration/Continuous Deployment pipeline from day one. Every code change should automatically be built, tested, and deployed.

* Tools: GitHub Actions (great if you're on GitHub), GitLab CI (excellent all-in-one solution), Jenkins (the old, powerful workhorse).

Infrastructure as Code (IaC): Define your servers, databases, and networks in code. This makes your infrastructure repeatable, version-controlled, and easy to manage.

* Tools: Terraform (the cloud-agnostic standard), AWS CloudFormation (AWS-specific).

**Step 4: Lock It Down (Security is NOT an Afterthought)**

The cloud provider secures the cloud, but you are responsible for security in the cloud. This is the "Shared Responsibility Model." Don't get caught out.

* Identity & Access Management (IAM): Grant the least privilege necessary. Don't give a junior developer admin access to your production database.
* Network Security: Use Virtual Private Clouds (VPCs) and subnets to isolate your resources from the public internet.
* Encrypt Everything: Encrypt your data both at rest (in the database) and in transit (over the network).
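As a rough illustration of the least-privilege point, the Python sketch below scans an AWS-style IAM policy document for `Allow` statements that grant every action or target every resource. The policy, the bucket name, and the function names are made up for the example, and the check is deliberately crude: it only catches a bare `*`, not service-wide patterns such as `s3:*`.

```python
def as_list(value):
    """IAM policy fields may be a single value or a list of values."""
    return value if isinstance(value, list) else [value]


def overly_broad_statements(policy: dict) -> list:
    """Return Allow statements whose Action or Resource is a bare wildcard."""
    flagged = []
    for stmt in as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        actions = as_list(stmt.get("Action", []))
        resources = as_list(stmt.get("Resource", []))
        if "*" in actions or "*" in resources:
            flagged.append(stmt)
    return flagged


# Hypothetical policy: one narrowly scoped statement, one admin-style grant.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadReports",
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example-reports/*",
        },
        {"Sid": "TooBroad", "Effect": "Allow", "Action": "*", "Resource": "*"},
    ],
}

print([s["Sid"] for s in overly_broad_statements(policy)])  # ['TooBroad']
```

A check like this could run in CI against policy files kept in version control, so that an admin-style grant is caught in review rather than in production.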

**Step 5: Tame the Beast (Cloud Cost Management)**

Your biggest post-launch surprise will be the bill. Get ahead of it.

Tag Everything: Tag every resource (server, database, etc.) with its owner, project, and environment (dev, staging, prod). This is the only way to know where your money is going.

Set Billing Alerts: Create alerts that notify you when your spending exceeds a certain threshold.

Shut Down Dev/Test Environments: Don't run development and testing servers 24/7. Automate scripts to shut them down on nights and weekends. This alone **can save you 60-70% on non-production costs.**
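The 60-70% figure follows from simple arithmetic on running hours. The sketch below assumes compute is billed per hour of uptime and uses two illustrative schedules; the function name and the schedules are assumptions, and storage or other always-on charges are not covered, so real savings will usually be somewhat lower.

```python
HOURS_PER_WEEK = 24 * 7  # 168


def weekly_savings_fraction(hours_per_day: float, days_per_week: int) -> float:
    """Fraction of always-on compute hours avoided by running on a schedule."""
    running = hours_per_day * days_per_week
    return 1 - running / HOURS_PER_WEEK


# Assumed schedule: 12 hours a day, Monday to Friday (60 of 168 hours).
print(f"{weekly_savings_fraction(12, 5):.0%}")  # 64%
# Assumed schedule: 10 hours a day, Monday to Friday (50 of 168 hours).
print(f"{weekly_savings_fraction(10, 5):.0%}")  # 70%
```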

# Part 3: The "Oops" File - 3 Common Cloud Pitfalls to Avoid

1. **The Blind "Lift and Shift":** Just moving your old, inefficient monolith from your on-premise server to a cloud server (like an EC2 instance) is the fastest way to get a massive bill with zero benefits. You're just renting a more expensive data center.
2. **Ignoring Cost Governance:** Teams will spin up resources and forget about them. Without a clear governance and tagging strategy, your cloud bill will spiral out of control.
3. **The "It's the Cloud's Problem" Security Myth:** Assuming AWS/Azure/GCP handles all security is a recipe for disaster. You are still responsible for configuring firewalls, managing user access, and securing your application code.

# TL;DR & Conclusion

Moving your enterprise application to the cloud isn't just a technical shift; it's a cultural one.

* **Start Small:** Don't try to boil the ocean. Begin with a single application.
* **Choose Wisely:** Pick your cloud and architecture based on your team and needs, not just trends.
* **Automate Everything:** Your CI/CD pipeline and IaC are your best friends.
* **Govern Costs & Security:** From day one, treat cost and security as primary features.

*The journey is complex, but the payoff, in speed, scalability, and resilience, is undeniable.*

https://redd.it/1nud3km
@r_devops
How the hell are you all handling AI jailbreak attempts?

We have a public-facing customer support AI assistant, and lately it feels like every day someone’s trying to break it. I'm talking multi-layer prompts, hidden instructions in code blocks, base64 payloads, and images with steganographically hidden text and QR codes.

While we’ve patched a lot, I’m worried about the ones we’re not catching. We’ve looked at adding external guardrails and red teaming tools, but I’d love to hear from anyone who’s been through this at scale.

How do you detect and block these attacks without rendering the platform unusable for normal users? And how do you keep up when the attack patterns evolve so fast?

https://redd.it/1nudj4x
@r_devops