SSL/TLS explained (newbie-friendly): certificates, CA chain of trust, and making HTTPS work locally with OpenSSL
I kept hearing “just add SSL” and realized I didn’t actually understand what a certificate proves, how browsers trust it, or what’s happening during verification—so I wrote a short “newbie’s log” while learning.
In this post I cover:
What an “SSL certificate” (TLS, really) is: issuer info + public key + signature
Why the signature matters and how verification works
The chain of trust (Root CA → Intermediate CA → your cert) and why your OS/browser already trusts certain roots
A practical walkthrough: generate a local root CA + sign a localhost cert (SAN included), then serve a local site over HTTPS with a tiny Python server + import the root cert into Firefox
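To make the "public key + signature" idea above concrete, here is a minimal toy sketch of signing and verifying a certificate body with textbook RSA. It is an illustration only: the tiny key built from two Mersenne primes, the `cert_body` string, and the helper names are all assumptions made for this example, and real CAs use much larger keys plus a padding scheme (or ECDSA).

```python
import hashlib

# Toy parameters only: two well-known Mersenne primes.
# Real certificates use 2048-bit+ RSA or ECDSA with proper padding.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent, known only to the "CA"


def digest(data: bytes) -> int:
    """Hash the certificate body and reduce it to a number below N."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """The CA signs the hash with its private key."""
    return pow(digest(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """Anyone holding the CA's public key (N, E) can check the signature."""
    return pow(signature, E, N) == digest(data)


# Hypothetical certificate body, made up for this example.
cert_body = b"subject=localhost;issuer=Local Root CA;public_key=<server key>"
signature = sign(cert_body)

print(verify(cert_body, signature))                                       # True
print(verify(cert_body.replace(b"localhost", b"evil.test"), signature))  # False
```

The second check fails because changing any byte of the certificate changes its hash, so the signature no longer matches. A chain of trust repeats this check one level up: the intermediate's certificate is verified with the root's public key, and the root is trusted because it already ships in your OS or browser store.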
Blog Link: https://journal.farhaan.me/ssl-how-it-works-and-why-it-matters
https://redd.it/1r07ejx
@r_devops
Monitoring performance and security together feels harder than it should be
One thing I have noticed is how disconnected performance monitoring and cloud security often are. You might notice latency or error spikes, but the security signals live somewhere else entirely. Or a security alert fires with no context about what the system was doing at that moment.
Trying to manage both sides separately feels inefficient, especially when incidents usually involve some mix of performance, configuration, and access issues. Having to cross-check everything manually slows down response time and makes postmortems messy.
I am curious if others have found ways to bring performance data and security signals closer together so incidents are easier to understand and respond to.
https://redd.it/1r0dbxa
@r_devops
When is it time to quit?
I wrapped up a tech panel for a Principal Azure Engineer role at an investment bank a couple of hours ago. This followed an interview with the hiring manager last Wednesday. We know each other from the past, i.e., I’ve interviewed for multiple roles at this firm over the last 5-6 years.
This role landed on my LinkedIn feed randomly. I commented on the post and emailed the hiring manager directly, we had a short back-and-forth, and his recruiter called me almost immediately. The process has been unusually smooth by modern standards.
Today’s panel felt strong. I’m confident I cleared the bar with both the Azure SME and the hiring manager. I saw visible agreement on several answers, got verbal acknowledgment more than once and handled questions from a junior panelist with ease. I was told that I’m “first in line” (not sure if that means FIFO or first on the shortlist), however, it seemed to be directionally positive.
Here’s the problem: I was laid off a little over six months ago and I am EXHAUSTED. It's like I've been on the hamster wheels of interviews since 8/4/2025. I’ve done the prep, the loops, the panels, the follow-ups. I know I’m good enough to be gainfully employed as a DevOps engineer.
If this role doesn’t turn into an offer, I’m seriously questioning whether I want to continue in tech at all. I don’t know if I have it in me to keep doing 5–7 round interview gauntlets, only to be rejected for vague reasons like “culture fit” or not smiling enough. I’ve given my adult life to STEM / engineering / corporate IT / tech and I am exhausted from having to engage with recruiters who want someone to take managerial roles for IC level pay.
I’m not bitter about rejection. I’m tired of dysfunction: hiring managers who don’t know the difference between EC2 and AWS Lambda, recruiters who can’t distinguish an AWS account from an Azure subscription, and BS interview processes that ding candidates for being "too intense".
So I’m asking honestly: when is it time to walk away?
For those who’ve been at a similar crossroads...did you step back temporarily, change strategy or leave tech altogether?
TL;DR: Six months, countless interviews, strong signals in today's tech panel. If today's tech panel doesn’t result in an offer, I’m seriously considering being done with the tech interview industrial complex.
https://redd.it/1r0jghq
@r_devops
Security findings come in Jira tickets with zero context
Security scanner runs nightly and I wake up to 15 Jira tickets. Each one says fix CVE-2025-XXXX in dependency Y with no explanation of what the dependency does, where it's used, or why it matters.
I'm supposed to drop whatever sprint work I'm on, research the CVE, find where we use that package, assess actual risk, test the upgrade, and hope nothing breaks.
Meanwhile the ticket was auto-generated and the security team has no idea what they're asking me to fix. Just scanner said critical so here's a ticket.
Why can't these tools give actual context? Like this package is used in auth flow, vulnerability allows account takeover, here's how to fix it. Instead of just screaming CVE numbers at me.
https://redd.it/1r4xpz9
@r_devops
Duplicate writes in multi-step automation: where do you enforce idempotency?
Genuine question.
We run multi-step automation that touches tickets, DB writes, API calls, and emails.
A step partially failed or timed out. We restarted the run. A downstream write had already happened. Result: duplicate tickets, duplicate notifications.
This does not feel like a simple retry problem. It is about where step boundaries live and how side effects stay idempotent across an entire run.
Things we are trying:
Treating write-capable steps differently from read-only steps
Requiring idempotency keys or operation ids for side effects
Making re-runs step-scoped instead of whole-run
Keeping a durable per-step ledger with inputs, outputs and timestamps
Adding manual pause or cancel before certain write steps
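As a rough illustration of two of the ideas in the list above (idempotency keys plus a durable per-step ledger), here is a minimal sketch using SQLite. The `StepLedger` class, the `run_id:step` key format, and the `create_ticket` stand-in are hypothetical names invented for this example, not any particular workflow engine's API.

```python
import sqlite3


class StepLedger:
    """Durable per-step ledger keyed by a (run_id, step) idempotency key."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS ledger ("
            " idem_key TEXT PRIMARY KEY,"
            " output TEXT NOT NULL,"
            " finished_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
        )

    def run_once(self, run_id: str, step: str, side_effect):
        key = f"{run_id}:{step}"
        row = self.db.execute(
            "SELECT output FROM ledger WHERE idem_key = ?", (key,)
        ).fetchone()
        if row is not None:
            return row[0]           # step already completed: replay its output
        output = side_effect(key)   # pass the key downstream so it can dedupe too
        self.db.execute(
            "INSERT INTO ledger (idem_key, output) VALUES (?, ?)", (key, output)
        )
        self.db.commit()
        return output


tickets = []  # stands in for an external ticketing system


def create_ticket(idem_key: str) -> str:
    tickets.append(idem_key)
    return f"TICKET-{len(tickets)}"


ledger = StepLedger()
first = ledger.run_once("run-42", "create_ticket", create_ticket)
again = ledger.run_once("run-42", "create_ticket", create_ticket)  # simulated re-run
print(first, again, len(tickets))  # TICKET-1 TICKET-1 1
```

A ledger like this does not close the gap where the process dies after the side effect but before the ledger write. That is why the sketch also hands the key to the side effect: the receiving system would need to reject or merge a second request carrying the same key.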
It still feels easy to get wrong.
Where do you enforce idempotency in practice?
Application layer
Workflow engine
Middleware or sidecar
Sagas or outbox pattern
Approval gates
If you have shipped long-running automation with real side effects, what worked and what caused incidents?
https://redd.it/1r4u7zr
@r_devops
How are you handling AI agent inventory and compliance in your infrastructure?
With the EU AI Act enforcement date coming up (August 2026), we've been dealing with a problem that I think a lot of DevOps teams are going to hit: figuring out what AI agents are actually running in your infrastructure.
Our situation: we had n8n workflows calling OpenAI, LangChain agents deployed by different teams, random Zapier integrations making API calls to Claude — and nobody had a central view of all of it. Classic shadow AI problem.
The compliance angle made it urgent. The EU AI Act requires organizations to classify AI systems by risk level, maintain documentation, and demonstrate oversight. Can't do any of that if you don't even have an inventory.
What we ended up building was a scanner that walks through your infra and maps AI components — models, agents, API calls, data flows. We open-sourced it as AI-BOM (github.com/Trusera/ai-bom) since we figured other teams are hitting the same wall.
But I'm curious how others are approaching this:
- Do you have visibility into what AI/LLM integrations are running across your org?
- Is anyone tracking AI agents as part of their CMDB or asset inventory?
- How are you thinking about EU AI Act compliance from an infrastructure perspective?
- Anyone using SBOM-style approaches for AI components?
Would love to hear what other teams are doing — or if this just isn't on your radar yet.
https://redd.it/1r4y6b7
@r_devops
Any resources to help a senior backend engineer moving into a lead data platform engineering role? My DevOps knowledge is elementary at best and I don't know everything AWS but I'm the most qualified to do this.
For context, I'm a strong backend engineer and I've used Terraform to create my own services and whatnot but I've never done anything this in-depth like the SREs and lead platform engineers at my previous companies.
Establishing engineering best practices for the team, platform monitoring, observability, security/governance, failover, design patterns, architecture, and the whole 9 yards are going to be my main responsibility (this absolutely terrifies me). I'm going to be the main engineer that data/analytics engineers, ml engineers, and management can come to for advice.
My vision here is to build a boring but reliable and well-oiled machine. Ideally costs are optimized and we're not being idiots by leaving resources unattended. Everything's being built from scratch so I have the final say, but I'm worried about screwing it up and doing something stupid that'll cost the company thousands for no reason.
Tooling-wise, it's mainly AWS and Snowflake, and I'm thinking of introducing GitLab instead of GitHub.
https://redd.it/1r50dcd
@r_devops
Need help preparing for internship
Hi, I was lucky enough to get a cloud/DevOps engineer internship, but I really only know the basics of the cloud and don’t know much beyond that.
Are there any resources/books you recommend to learn more about cloud technologies and do well during the internship?
Thank you so much!
https://redd.it/1r52nkk
@r_devops
Book recommendation
What is the best book to learn networking? I have a general idea about DNS, firewalls, NAT, switches, hubs, etc., but I still don’t feel confident about networking and want to dig deeper.
https://redd.it/1r4wpu8
@r_devops
Do you feel the Heat of AI in DevOps Roles?
As the title suggests, do you feel AI is after your DevOps job?
Have you seen it helping effectively in your role, or eliminating your role?
Helping --> generating IaC and Python code for automation, decision making when you're confused about using anything in DevOps, etc.
Eliminating --> AI can replace you in every possible way.
I can go first:
Helping --> I have seen juniors using it effectively and writing better code with faster turnaround time. My junior is nothing without AI, and he is so arrogant that he tells himself and others that he knows everything. True to this, my manager supports him because he fixes and provisions infra in no time. But he keeps us on calls for hours just so he can understand the requirements.
Eliminating --> I strongly feel our roles will vanish in the years to come, maybe 5 years at most. The reason I see is the bug, the startup bug. Everyone wants to do something and they feel as if they are doing society a favour. But no, they are satisfying their ego. They are looking very closely at all roles to see what can be automated and targeting them. DevOps is no exception here. That's how Amazon also had to let go of many DevOps/cloud engineers.
https://redd.it/1r56d15
@r_devops
ACA autoscaling killing long running jobs — best practice?
Using Azure Container Apps with HTTP autoscaling (set to 10 concurrent users) for report generation. During scale up/down, replicas get terminated and reports fail mid-execution.
Questions:
• Is this the right pattern for long-running jobs on ACA?
• Any Service Bus lock timeout gotchas?
https://redd.it/1r4hkzu
@r_devops
Dual boot or VMware
I started learning DevOps a while ago. I used to practice on VMware, but sometimes the machine freezes, especially when I am learning k8s. So I started thinking about dual boot, but I don’t know if it is good enough for learning and practicing all the tools, or whether I should give the machine more specs.
https://redd.it/1r57rba
@r_devops
Homelab or digital ocean?
I need to do projects to learn and show off on my resume, but I'm a student and I don't have money. I thought that maybe I should do some cloud provider free trial in order to show competency with servers (Terraform), but all signs lead me to believe that homelabbing will guarantee a special interview I have in a month and a half from now. Should I make the investment and homelab, or try to do projects with a cloud provider?
https://redd.it/1r58xhz
@r_devops
What does “config hell” actually look like in the real world?
I've heard about "Config Hell" and have looked into different things like IAM sprawl and YAML drift but it still feels a little abstract and I'm trying to understand what it looks like in practice.
I'm looking for war stories on when things blew up, why, what systems broke down, who was at fault. Really just looking for some examples to ground me.
I'd take anything worth reading on it too.
https://redd.it/1r5ew1g
@r_devops
where can I find courses
hello all,
I want advice on where to find good courses about DevOps, Kubernetes, Docker, and AWS.
A course that tackles most of this in one go would be even better.
https://redd.it/1r5kgn9
@r_devops
I made a single binary alternative to Grafana+Prometheus for monitoring Docker on remote servers
I got tired of needing a full grafana + prometheus + loki + alertmanager stack just to monitor a handful of docker containers across a couple VPSs. So I built a simpler alternative.
A single binary agent runs on your server collecting host metrics from /proc, monitoring containers via the docker socket (read-only), tailing logs, and evaluating alert rules. You define alert conditions in a TOML config (container down, high CPU, disk filling up, unhealthy health checks, restart loops) and get notified via email or webhooks. You connect from your machine over SSH via a TUI: no exposed ports, no HTTP server, nothing to firewall.
It deploys as a docker compose service or a systemd unit. Sub-50 MB RAM usage on my own servers currently, SQLite storage with 7-day retention, config reload via SIGHUP.
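For readers wondering what "host metrics from /proc" plus a "high CPU" rule might look like in general, here is a small sketch that computes CPU busy percentage from two samples of the aggregate `cpu` line in `/proc/stat`. It is not taken from the tool itself: the function names, the sample values, and the 90% threshold are assumptions made up for illustration.

```python
def read_cpu_times(stat_text: str) -> tuple:
    """Return (total, idle) jiffies from the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            # Fields: user nice system idle iowait irq softirq steal.
            # Guest time is already counted inside user, so it is skipped.
            fields = [int(v) for v in parts[1:9]]
            idle = fields[3] + (fields[4] if len(fields) > 4 else 0)
            return sum(fields), idle
    raise ValueError("no aggregate cpu line found")


def cpu_busy_percent(before: str, after: str) -> float:
    """Busy share of CPU time between two /proc/stat snapshots."""
    total_a, idle_a = read_cpu_times(before)
    total_b, idle_b = read_cpu_times(after)
    total_delta = total_b - total_a
    if total_delta <= 0:
        return 0.0
    return 100.0 * (total_delta - (idle_b - idle_a)) / total_delta


def high_cpu(busy_percent: float, threshold: float = 90.0) -> bool:
    """Hypothetical alert rule: fire when busy percentage reaches the threshold."""
    return busy_percent >= threshold


# Made-up snapshots; on a real host you would read /proc/stat twice, a few seconds apart.
SAMPLE_1 = "cpu  100 0 100 800 0 0 0 0 0 0\ncpu0 100 0 100 800 0 0 0 0 0 0\n"
SAMPLE_2 = "cpu  300 0 200 1500 0 0 0 0 0 0\ncpu0 300 0 200 1500 0 0 0 0 0 0\n"

busy = cpu_busy_percent(SAMPLE_1, SAMPLE_2)
print(busy, high_cpu(busy))  # 30.0 False
```

Two snapshots are needed because the counters in `/proc/stat` are cumulative since boot, so only the difference between samples says anything about current load.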
There's a gif of how the TUI looks on the repo if you want to see it in action. MIT licensed, I really just built it to solve my own problem so feel free to check it out but expect bugs if you do :)
https://github.com/thobiasn/tori-cli
https://redd.it/1r5mp8g
@r_devops
Can the CKA replace real k8s experience in job hunting?
Senior DevOps engineer here, at a biotech company. My specific team supports more on the left side of the SDLC, helping developers create and improve build pipelines, integrating cloud resources into that process like S3, EC2, and creating self-help jobs on Jenkins/GitHub actions.
TLDR, I need to find another job. However, most DevOps jobs I've seen require k8s at scale, focusing on reliability/observability. I have worked with Kubernetes lightly, inspecting pod failures etc., but nothing that would allow me to deploy and maintain a Kubernetes cluster. Because of this, I'm in the process of obtaining the CKA to address those gaps.
To hiring managers out there: Would you hire someone or accept the CKA as a replacement for X years of real Kubernetes experience?
For those of you who obtained the CKA for this reason, did it help you in your job search?
https://redd.it/1r5nh8z
@r_devops
How I Built a Production-Grade Kubernetes Homelab on 2 Recycled PCs (Proxmox + Talos Linux, ~€150)
I wrote a detailed walkthrough on building a production-grade Kubernetes homelab using 2 recycled desktop PCs (~€150 total). The stack covers Proxmox for virtualization, Talos Linux as an immutable K8s OS, ArgoCD for GitOps, and Traefik + Cloudflare Tunnel for external access.
Key topics: Infrastructure as Code with Terraform, GlusterFS for replicated storage, External Secrets Operator with Bitwarden, and a full monitoring stack (Prometheus + Grafana + Loki).
Full article: https://medium.com/@sylvain.fano/how-i-built-a-production-grade-kubernetes-homelab-in-2-weekends-with-claude-code-b92bca5091d3
Happy to discuss architecture decisions or answer any questions!
https://redd.it/1r5m7ir
@r_devops
Weekly/temp DevOps ENTRY LEVEL - internship / fresher & changing careers
This is a weekly thread to ask questions about getting into DevOps.
If you are a student, or want to start a career in DevOps but do not know how, ask here.
Changing careers but do not have basic prerequisites? Ask here.
Before asking:
* search to see whether your question has already been asked and answered
* try these resources:
  * [https://roadmap.sh/devops](https://roadmap.sh/devops)
  * (please suggest more)
_____________
Individual posts of this type may be removed and redirected here.
Please remember to follow the rules and remain civil and professional.
This is a trial weekly thread.
https://redd.it/1r659ga
@r_devops
roadmap.sh
DevOps Roadmap: Learn to become a DevOps Engineer or SRE
Step by step guide for DevOps, SRE or any other Operations Role in 2026
I've run Docker Swarm in production for 10 years. $166/year. 24 containers. Two continents. Zero crashes. Here's why I never migrated to Kubernetes.
Every week on Reddit someone asks about Docker Swarm and the responses are always the same: "Swarm is dead." "Just use K8s." "Nobody runs Swarm in production."
I've run Swarm in production for a decade. Not a toy setup — multi-node clusters, manager redundancy, 4-6 replicas per service, rolling deployments in batches of two with automatic rollback on healthcheck failure. Zero customer downtime. Over the years I optimized the architecture down to 24 containers across two continents on $166/year total infrastructure.
I finally wrote the article I wish existed when I made my choice ten years ago. 7,400 words. Real production numbers. Working code. No affiliate links. No "it depends" cop-out.
**What's in it:**
* Side-by-side YAML comparison: 27 lines (Compose) → 42 lines (Swarm) → 170+ lines (K8s) for the same app
* Healthcheck comparison table testing 6 failure scenarios — K8s wins 2 out of 6
* A working 150-line autoscaler that's actually smarter than K8s HPA (adaptive polling vs fixed 15s intervals)
* Cost breakdown: $166/year vs $1,584-2,304/year minimum for EKS
* CAST AI 2024 data: 87% idle CPU, 68% of pods overprovisioned 3-8x, $50-500K annual waste per cluster
* Why your Node.js containers are 7x bigger than they need to be and how that drives false demand for autoscaling
* Why you should never expose Node.js directly to the internet (and what to do instead)
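To make the "adaptive polling vs fixed 15s intervals" bullet concrete, here is a rough sketch of the idea. It is not the article's autoscaler: the function name, the 70% CPU threshold, and the 2-30 second bounds are all assumptions chosen for illustration. The point is only that the poll interval shrinks as load approaches the scaling threshold instead of staying constant.

```python
def next_poll_interval(cpu_percent, threshold=70.0, min_s=2.0, max_s=30.0):
    """Return how many seconds to wait before the next metrics poll.

    Hypothetical helper: all names and default values are illustrative
    assumptions, not taken from the article's code.
    """
    # At or above the scaling threshold, poll as fast as allowed.
    if cpu_percent >= threshold:
        return min_s
    # Otherwise scale the interval linearly with the distance from the
    # threshold: idle services are polled rarely, busy ones often.
    distance = (threshold - cpu_percent) / threshold  # in (0, 1] for 0 <= cpu < threshold
    return min_s + (max_s - min_s) * min(distance, 1.0)


if __name__ == "__main__":
    for cpu in (0, 35, 60, 70, 90):
        print(f"cpu={cpu:>3}% -> poll again in {next_poll_interval(cpu):.1f}s")
```

A fixed-interval poller would return the same number for every row; here a quiet service is checked every 30 seconds and one near the threshold every few seconds.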
The only feature K8s genuinely has that Swarm lacks is autoscaling, and Datadog's own 2023 report shows only ~50% of K8s organizations even use HPA. So half the industry is paying the full complexity tax for a feature they don't use.
Not saying K8s is bad. It's an incredible system for the 1% who need it. But the data shows 99% don't — they're paying 10-100x more for capabilities they never touch while 87% of their CPU does nothing.
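For a sense of scale, the two infrastructure figures quoted in the cost bullet above can be compared directly. This is plain arithmetic on the post's own numbers ($166/year for Swarm, $1,584-2,304/year minimum for EKS); the variable and function names are made up for the example, and it covers infrastructure cost only, not engineering time.

```python
# Figures quoted in the post, in USD per year.
SWARM_COST = 166
EKS_MIN_LOW = 1584
EKS_MIN_HIGH = 2304


def cost_multiple(other, baseline):
    """How many times more expensive `other` is than `baseline`."""
    return other / baseline


low = cost_multiple(EKS_MIN_LOW, SWARM_COST)
high = cost_multiple(EKS_MIN_HIGH, SWARM_COST)

if __name__ == "__main__":
    print(f"EKS minimum is {low:.1f}x to {high:.1f}x the Swarm setup")
```

On these two figures alone the multiple works out to roughly 10-14x, which is the low end of the "10-100x" range mentioned above.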
[Read Full Web Article Here](https://thedecipherist.com/articles/docker_swarm_vs_kubernetes/?utm_source=reddit&utm_medium=post&utm_campaign=docker-swarm-vs-kubernetes&utm_content=launch-post&utm_term=r-devops)
Happy to answer any questions. I've been running this setup since before K8s hit 1.0.
https://redd.it/1r6krmk
@r_devops
The Decipherist
Docker Swarm vs Kubernetes in 2026 — The Decipherist
10 years of Docker Swarm in production — 24 containers, two continents, live SaaS platform, zero crashes, $166/year. Side-by-side YAML comparisons, real production numbers, a working autoscaler that's smarter than K8s HPA, and a cost breakdown that should…
Security Scanning, SSO, and Replication Shouldn't Be Behind a Paywall — So I Built an Open-Source Artifact Registry
Side project I've been working on — but more than anything I'm here to pick your brains.
I felt like there was no truly open-source solution for artifact management. The ones that exist cost a lot of money to unlock all the features. Security scanning? Enterprise tier. SSO? Enterprise tier. Replication? You guessed it. So I built my own.
Artifact Keeper is a self-hosted, MIT-licensed artifact registry. 45+ package formats, built-in security scanning (Trivy + Grype + OpenSCAP), SSO, peer mesh replication, WASM plugins, Artifactory migration tooling — all included. No open-core bait-and-switch.
What I really want from this post:
- Tell me what drives you crazy about Artifactory, Nexus, Harbor, or whatever you're running
- Tell me what you wish existed but doesn't
- If something looks off or missing in Artifact Keeper, open an issue or start a discussion
GitHub Discussions: https://github.com/artifact-keeper/artifact-keeper/discussions
GitHub Issues: https://github.com/artifact-keeper/artifact-keeper/issues
You don't have to submit a PR. You don't even have to try it. Just tell me what sucks about artifact management and I'll go build the fix.
But if you do want to try it:
https://artifactkeeper.com/docs/getting-started/quickstart/
Demo: https://demo.artifactkeeper.com
GitHub: https://github.com/artifact-keeper
https://redd.it/1r6pwxy
@r_devops
GitHub
GitHub is where people build software. More than 150 million people use GitHub to discover, fork, and contribute to over 420 million projects.