Reddit DevOps – Telegram
rolling back to bare metal kubernetes on top of Linux?

Since Broadcom is raising our license cost 300% (after negotiation and discount) we're looking for options to reduce our license footprint.

Our existing k8s is just running on Linux VMs in our vSphere with Rancher. We have some workloads in Tanzu but nothing critical.

Have I just been out of the game in running OSes on bare metal servers, or is there a good reason why we don't just convert a chunk of our ESX servers to Debian and run Kubernetes on there? It'll make a couple hundred thousand dollars difference annually...

https://redd.it/1oit22p
@r_devops
Suggestion

Honestly, Linode's fine but it feels kinda outdated. The support's okay, but the UI and performance can be inconsistent. I know there's GCP, Azure, and AWS out there. Which one's the best to learn that's modern, flexible, and still affordable?

https://redd.it/1oiv2d5
@r_devops
Suggestion about learning active directory

Hello all,
I am learning DevOps from scratch from YouTube. I started with AWS and recently learned IAM; after that there is a topic called Active Directory setup. The use case the YouTuber gave was that with many users (for example, 2000) it becomes difficult to set up each user, set up IAM roles, do role switching, and so on. While learning this topic I can understand what he is doing and how he is doing it, but it is difficult to correlate because I do not have a networking background. Should I learn this topic? Is it important for DevOps learning? Please share your inputs.

https://redd.it/1oiw1rp
@r_devops
Self-hosted alternatives to Jira that don't require a PhD to set up?

We want to move away from Atlassian but every self-hosted alternative seems to require days of configuration or is missing critical features. What are people actually using that works out of the box?

https://redd.it/1oijtow
@r_devops
kafka complexity was killing our team's productivity so we switched to something with zero dependencies


Look, I love Kafka. I really do. But running it for the past 18 months has been exhausting. We're a team of 4 backend engineers. That's it. And somehow we were spending like 30% of our time just keeping Kafka alive. Not building features. Not fixing bugs. Just babysitting infrastructure. Rolling updates felt like defusing a bomb. Debugging cluster issues meant we'd all gather around someone's monitor staring at JMX metrics and heap dumps like we were reading tea leaves. When something broke (and it would break), you could kiss the next 6 hours goodbye.

I remember this one time we had a broker go down at 2am. Spent until 8am trying to get the cluster stable again. My wife was not thrilled, and the kicker? The actual issue was some obscure zookeeper connection timeout that only showed up under specific load conditions. Cool, great, love that for us.

An option was hiring a dedicated Kafka admin but we absolutely couldn't afford it, so I started digging into alternatives. Tried rabbitmq first because everyone talks about it, solid tool but it didn't really solve the distributed streaming thing we needed. Then I looked at pulsar and yeah no, that's even more complex with bookkeeper on top of zookeeper. Hard pass.

My thinking shifted. I stopped asking "what's exactly like Kafka but simpler" and started asking "what could solve our problems?" We needed reliable messaging, streaming, replay capability. Did we need Kafka's specific partition model? Honestly... probably not. We were using it because it's what you're "supposed" to use, you know?

I found NATS with JetStream and I'll be honest, I was unsure about it. It looked too simple. Single binary, no ZooKeeper, no massive config files. My brain was like "this won't work." But I spun it up anyway. Clustering just worked out of the box with no drama. We've been running it in production for 3 months now, and we don't even manage the infrastructure, which is nice because remember, we're 4 people, processing about 500k messages per day. Haven't had a single issue that took more than 10 minutes to resolve.

My team is shipping features again, actual features. Not spending our days in Kafka land. It's wild how much mental energy we got back. I'm not saying Kafka is bad. If you're at massive scale and you've got the team to manage it, it's probably perfect for you. But for smaller teams? Sometimes you're using enterprise tools when you don't have enterprise problems, and that's okay.

https://redd.it/1oixphz
@r_devops
AI was implemented as a trial in my company, and it’s scary.

I know that almost every day someone comes up and says "AI will take my job and I'm scared", but I promise to keep this short and maybe different.

I am currently a junior DevOps engineer, so I don't have huge experience or knowledge, but I was told that the team is trying to implement Claude Code in VS Code for the dev team, and MCPs for provisioning and later for monitoring generally and taking action when something fails.

In the trial, Claude Code was so good in testing that it scared me a little: it planned and worked with hundreds of files, found what it needed to do, and did it first try (it is now fully implemented).

With the MCP, it was like a junior DevOps/SRE, and after that trial the company stopped the hiring cycle and the team is kept at only 4 instead of expanding to 6 as planned. Honestly, from what I saw, I even think they might view it as "4 too many".

This is all happening 3 years after ChatGPT was released. 3 years, and people are already getting scared shitless. I thought AI was a good boost, but I don't think management would see it as a boost. They'd see it as a junior replacement and maybe later a full replacement.

https://redd.it/1oiytfa
@r_devops
[Advice] Best way to build and distribute an internal CLI tool (PHP vs Node)?

Hey!

I’m currently working on an internal CLI tool for my team — mainly to automate recurring tasks like syncing databases, uploading assets, or triggering deployments.

I’m hesitating between several approaches, and I’d love to get some feedback, especially about distribution and updates.

**Context:**

* Team setup: mixed macOS (only me) / Windows
* Goal: make it easy for anyone to just run commands like "*toto sync*" or "*toto deploy*" through SSH, without a complex setup.

**Options I’m considering:**

1. Node.js CLI (using clack js) → Publish on a private GitLab npm registry, install globally
2. PHP CLI (using Laravel Prompts or Laravel Zero) → Distributed either as a Composer global package or a single PHAR binary via GitLab Releases.

**My questions:**

* Which approach feels the most maintainable and easiest to distribute internally?
* Is the PHAR format still considered a good modern option?
* How do you handle auto-updates for your internal CLI tools?
* Do you prefer Composer global, npm -g, or shipping a standalone binary?

Looking for the cleanest, cross-platform way to build a team CLI that’s:

* easy to install,
* modern (interactive prompts, spinners, multiselect, etc.),
* and able to run SSH/rsync commands to deploy WordPress projects.

Would love to hear how you do it internally or see examples if you’ve built something similar!

https://redd.it/1oiz484
@r_devops
How does your team promote your products? Which channel?

Hi all, I’m curious about how web developers and their teams promote their own products or tools.

Do you mainly use email marketing to reach your audience or do you rely more on social media, blogs, or other channels?



https://redd.it/1oj0lfc
@r_devops
Help! My side project is burning cash on Google Cloud SQL 😅 need a free database host

I’ve deployed my machine learning web app on Google Cloud, but I’ve started incurring charges. I’m now looking for a free alternative for hosting.

The app consists of:

* A frontend hosted on Vercel
* Two APIs (one for data processing and another for connecting to the ML .pkl model)
* A MySQL database that stores all the data used by the APIs

From what I understand, the costs are coming from the MySQL database hosted on Cloud SQL. It’s already cost me around $3 in just a week, which is not sustainable since the app doesn’t generate any income.

I’m looking for a free MySQL hosting option (or something similar) that can work with my current setup. I’ve tried alternatives like CockroachDB and Firebase, but I found them a bit confusing. Before committing to another platform, I wanted to ask for recommendations.

Thanks in advance!

https://redd.it/1oj1lyp
@r_devops
We’re building a small fintech app – AWS vs Azure? Need advice on structure, security, and cost

Hey everyone,

I’m part of a small team building a mobile app (iOS & Android) for home financing. The app’s purpose is to let users create a profile, go through a credit evaluation via a third-party integration, and eventually manage parts of their financing process in a secure and compliant way.

We’re at the stage where we need to decide on the overall backend and authentication setup, and I’d really appreciate some insight from people who’ve been there before.

Here’s what we care about:

- Keeping costs low, especially early on (MVP phase).

- Minimizing our data responsibility – ideally, we don’t want to directly handle sensitive personal data due to GDPR.

- Maintaining a secure and scalable architecture.

- Using something our team (mostly .NET/C# devs) can work with comfortably.


We’ve been comparing three main approaches:

1. AWS (Cognito + API Gateway + Lambda + DynamoDB)

- Super low cost for early usage (Cognito free up to ~10k MAU, Lambda pay-per-use).

- Easy to scale, and no server maintenance.

- .NET 8 works great with Lambda now.

- Slightly less integrated if we ever need to connect with Microsoft services later.

2. Azure (Entra ID B2C + Azure Functions + CosmosDB)

- Strong enterprise-level security and compliance.

- Better if we end up needing Office 365 / Power BI / MS ecosystem integration.

- B2C is free up to 50k users, but setup and maintenance seem more complex.

- Costs and admin overhead might ramp up faster.

At this point, I’m leaning toward AWS because it seems cheaper, easier to maintain, and gives us a clean, serverless architecture with minimal ops.

But I’d love to hear your experiences:

- Have you built similar apps (fintech, identity-heavy, serverless)?

- How have you handled user authentication and third-party integrations securely?

- Any surprises or gotchas you’ve faced with Cognito, Entra B2C, or Auth0?

- Would you choose differently if you had to start over?


Any advice, lessons learned, or real-world insights would be massively appreciated 🙏

Thanks!

https://redd.it/1oj2vky
@r_devops
Any tool for debugging mobile viewport breakpoints remotely?

Our responsive app works fine on desktop but certain breakpoints on Android Chrome look broken. I can’t tether every phone to inspect it. Is there any way to live-debug mobile browsers remotely?

https://redd.it/1oiws4y
@r_devops
No Kubernetes experience, Am I cooked?

I'm currently in a role in which everything is deployed via AWS ECS Fargate containers. I have been supporting these applications for a little bit now. There is not a TON of net new things to work on and learn. Just browsing roles and job descriptions, I am seeing a ton of companies asking for Kubernetes experience. It seems like 80-90% of the roles want this for a mid-level engineer. Are this many companies actually using Kubernetes, whether it be AWS EKS, Azure AKS, or Google's Kubernetes offering?

I have no experience with it and, frankly, Kubernetes is overkill for my current work application, so I wouldn't be able to gain on-the-job experience. That said, am I cooked in this job market (outside of the market already being doo-doo in general)? I have come across posts from folks who study for the cert but seem to have no hands-on experience, which is a route I DON'T want to go down. Not sure what the thought process is on that lol.

I thought about doing it in my spare time, but kids and wife take up a good majority of my weekend, and I'm not sure which learning method is the most effective and which one the community recommends.

https://redd.it/1oj5dlq
@r_devops
How N26 builds reliability at scale — with Bruno Paulino (Tech Lead at N26)

What does reliability actually look like when every deploy touches millions of bank customers?

In this episode of Señors @ Scale, Bruno Paulino (Tech Lead at N26) shares how his teams build resilient FinTech systems — from CI/CD pipelines and server-driven UIs to AI-powered customer support.

We cover:

* Cutting deploy times from 1 hour to 5 minutes
* Rolling out server-driven UI across mobile and web
* Using LLMs and RAG to scale customer support
* Statsig and safe experimentation in production
* Balancing speed, compliance, and reliability in FinTech
* Lessons from outages, testing, and developer culture

🎧 Watch or listen:
▶️ YouTube: https://youtu.be/XA42xUQlxRY
🎧 Spotify: https://open.spotify.com/episode/1cVpylsiGZphf8Pr6ocFgv
🍎 Apple Podcasts: https://podcasts.apple.com/us/podcast/reliability-at-scale-with-bruno-paulino-n26/id1827500070?i=1000733534640

If you’re into DevOps, platform engineering, or CI/CD at scale — this one’s for you.

https://redd.it/1oj50af
@r_devops
Octopus Deploy vs speed/safety tradeoffs

One of the biggest tensions in DevOps is shipping faster vs shipping safer. Octopus Deploy gives us approvals, audit logs, and runbooks, but those can also slow things down if overused.

How do you balance speed and safety in Octopus Deploy? Feature flags? Progressive deployments? Manual approvals only in certain environments? Would love to hear how other teams approach this.

https://redd.it/1oj5szo
@r_devops
Apple's new container runtime vs Docker Desktop

Hi everyone

I was curious how Apple’s new container system compares to Docker Desktop, so I ran some benchmarks.
I tested CPU, memory, disk I/O, and startup time.

|Category|Docker|Apple|Units|
|:-|:-|:-|:-|
|CPU 1 thread|10939.81|11080.05|events/s|
|CPU all threads|53881.70|55415.57|events/s|
|Memory|81634.45|108588.00|MiB/s|
|Startup time|0.21|0.92|seconds|
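
For anyone who wants the relative differences rather than raw numbers, here is a minimal Python sketch that works only from the figures in the table above. The function and variable names are illustrative, and it assumes that higher is better for the throughput rows and lower is better for startup time.

```python
# Figures copied from the table above; nothing here is newly measured.
# Each entry maps a category to (docker_value, apple_value, units).
RESULTS = {
    "CPU 1 thread": (10939.81, 11080.05, "events/s"),
    "CPU all threads": (53881.70, 55415.57, "events/s"),
    "Memory": (81634.45, 108588.00, "MiB/s"),
    "Startup time": (0.21, 0.92, "seconds"),
}

# Assumption: startup time is the only row where a smaller value is better.
LOWER_IS_BETTER = {"Startup time"}


def apple_vs_docker(docker: float, apple: float) -> float:
    """Return Apple's value relative to Docker's as a signed percentage."""
    return (apple - docker) / docker * 100.0


def winner(category: str, docker: float, apple: float) -> str:
    """Name the runtime with the better value for this category."""
    if category in LOWER_IS_BETTER:
        return "Apple" if apple < docker else "Docker"
    return "Apple" if apple > docker else "Docker"


def summarize(results: dict) -> list:
    """Build one human-readable line per benchmark category."""
    lines = []
    for category, (docker, apple, units) in results.items():
        delta = apple_vs_docker(docker, apple)
        best = winner(category, docker, apple)
        lines.append(
            f"{category}: Apple {delta:+.1f}% vs Docker ({units}), better: {best}"
        )
    return lines


if __name__ == "__main__":
    for line in summarize(RESULTS):
        print(line)
```

Read this way, the CPU rows are within a few percent of each other, memory throughput is roughly a third higher on Apple's runtime, and Docker starts containers several times faster.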

Full charts and results are available here: Full Benchmark

Let me know if you’d like me to run additional tests

https://redd.it/1oj9wxs
@r_devops
DevOps engineers: What Bash skills do you actually use in production that aren't taught in most courses?

I'm a DevOps Team Lead managing Kubernetes/AWS infrastructure at an FDA-compliant medical device company. My colleague works at Proofpoint doing security automation.

We've both noticed that most Bash courses teach toy examples, but production Bash is different. We're curious what real-world skills you wish you'd learned earlier:

* Are you parsing CloudWatch/Splunk logs?
* Automating CI/CD pipelines?
* Handling secrets management in scripts?
* Debugging production incidents with Bash one-liners?
* Something else entirely?

What Bash skills have been most valuable in your DevOps career that you had to learn the hard way?

https://redd.it/1ojcrdo
@r_devops
The Vi editor Survival Guide for devs like me

I have put together a simple guide to vi commands that actually helped me all these years when editing configs or scripts on Linux.
Short, practical, and focused on real examples.

Let me know if I have missed any. I would love to get feedback and make it an exhaustive list!

Read it here

https://redd.it/1ojac48
@r_devops
Do I build "api-core" layer as an always-on container (App Runner / Fargate) — or as event-driven Lambda functions?

It would cover things such as user auth, billing, and usage: the core business logic that my web apps will call about my customers (B2C/B2B).

The api-core would be like an internal service, with its own CI/CD pipeline.

https://redd.it/1ojgtza
@r_devops
Taking the CKAD exam this week after CKS and CKA. Any advice?

Hi All!

I am taking the CKAD exam next week. I was urged to become a Kubestronaut by my co-workers. Any advice for me? I have yet to do the killer.sh practice tests (I want to do them just before the exam).

My past experiences with the exam have been that the questions are really not what you expect. Is it going to be the same with CKAD? I am going in with just a week's prep so I am feeling a bit unprepared. Should I work for another week?

Any particular topics that I should focus on?

Thanks in advance for all your help!

https://redd.it/1ojargr
@r_devops
Does every DevOps role really need Kubernetes skills?

I’ve noticed that most DevOps job postings these days mention Kubernetes as a required skill. My question is, are all DevOps roles really expected to involve Kubernetes?

Is it not possible to have DevOps engineers who don't work with Kubernetes at all? For example, a small startup that is just trying to scale up might find Kubernetes to be overkill and quite expensive to maintain.

Does that mean such a company can’t have a DevOps engineer on their team? I’d like to hear what others think about this.


https://redd.it/1ojj08t
@r_devops