Reddit DevOps – Telegram
Finally solved my "which port is this app on" problem with a simple Caddy trick

If you work on multiple projects (or a monorepo), you know the pain:
- "Was the API on 4323 or 4321?"
- "Is the marketing site 5400 or 5450?"
- opens 5 browser tabs with different ports
- checks package.json for the 10th time

I finally got tired of this and set up Caddy with wildcard localhost routing. Now I just type my-app.dev.localhost and I'm there. No ports. Real HTTPS. Green padlock.

The key insight that took me a few failed attempts: *.localhost doesn't work (browsers reject the wildcard cert), but *.dev.localhost does. Adding one subdomain level makes it work.



Basic setup:
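    *.dev.localhost {
        tls internal

        # Named matchers pick the upstream by hostname.
        @api host my-api.dev.localhost
        handle @api {
            reverse_proxy localhost:4323
        }

        @app host my-app.dev.localhost
        handle @app {
            reverse_proxy localhost:5400
        }

        # Fallback for hostnames with no route.
        handle {
            respond "Not configured" 404
        }
    }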


Run caddy trust once, then sudo caddy start --config Caddyfile. Done.

Bonus: also solves OAuth callback URL issues since your local URLs look like production (https://my-app.dev.localhost/auth/callback).
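
If you end up with more than a handful of apps, the site block can be generated from a name-to-port map instead of edited by hand. This is a minimal Python sketch, assuming Caddy's named-matcher form (`@name host ...` followed by `handle @name`). The `render_caddyfile` helper and the route names are hypothetical and not taken from the linked write-up.

```python
def render_caddyfile(routes: dict, domain: str = "dev.localhost") -> str:
    """Render one wildcard site block from a {app name: local port} mapping."""
    lines = [f"*.{domain} {{", "    tls internal", ""]
    for name, port in sorted(routes.items()):
        # Matcher names are derived from the app name; hyphens are swapped
        # for underscores purely as a conservative choice.
        matcher = name.replace("-", "_")
        lines += [
            f"    @{matcher} host {name}.{domain}",
            f"    handle @{matcher} {{",
            f"        reverse_proxy localhost:{port}",
            "    }",
            "",
        ]
    # Fallback for hostnames that have no route yet.
    lines += ["    handle {", '        respond "Not configured" 404', "    }", "}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Ports taken from the example above.
    print(render_caddyfile({"my-api": 4323, "my-app": 5400}), end="")
```

Redirect the output to your Caddyfile and reload Caddy whenever the map changes.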



I wrote up the full thing with gotchas and a dev dashboard if anyone wants: https://thesashka.com/blog/posts/technical/taming-localhost-ports-with-caddy/

https://redd.it/1puid36
@r_devops
Feeling Like an Outsider a Few Months into Job

Hey everyone!

I'm relatively new to my job, just a few months in full time. I interned with my team before, so I knew what to expect going in.

During my internship, I felt so incredibly confused the entire time. During the time between my internship and starting full time, I did some personal projects and filled in some gaps with containerization and other things.
Now that I am full time, I feel like I somewhat know what I'm doing, but I think what gets me is that my team is able to come up with new things to automate, find gaps in things that I don't see, and come up with better solutions with new technologies. I work for a good company, and my team is really smart, so I know if they are willing to have me, I must be okay.

I think what gets me sometimes is the vast amount of knowledge about tons of different things that DevOps involves, and not having much of a background in anything else. There is so much to learn, and only over the past few months have I REALLY worked with RHEL, containerization, CI/CD, AWS, and of course the systems we have created. On top of this, I sometimes get so invested in the tasks themselves that I overlook small details in PRs, or forget to keep my Jira stories updated (moving them to in progress or closing them out).

My team is also extremely organized, and although I find myself to be a very organized person, I feel like I make so many small mistakes during my work. I know I'm only a few months in, but things still take me time and even then, there are so many comments on my PRs. I want to be really good at this, and I really do enjoy it.

If anyone has any tips as far as organization, dealing with imposter syndrome in this field, and/or gaining confidence in my skills and knowledge, I would love to hear it.

Thank you!



https://redd.it/1puc8te
@r_devops
How do you prevent PowerShell scripts from turning into a maintenance nightmare?

In many DevOps teams, PowerShell scripts start as quick fixes for specific issues, but over time more scripts get added, patched, or duplicated until they become hard to maintain and reason about. I’m curious how teams handle this at scale: how do you keep PowerShell scripts organized, maintainable, and clean as they pile up? Do you eventually turn them into proper modules or tools, enforce standards through CI/automation, or replace them with something else altogether? Interested in hearing what’s actually worked in real-world environments.


https://redd.it/1puabzv
@r_devops
Help resolving "connection refused" between two sites with cert-manager

I have three nodes at one site and one node at another, all with private IPs only. The three nodes sit behind the same VIP: I ran kubeadm init with the VIP and joined all three as control plane nodes. The node at the other location is the worker.

The worker has ICMP and TCP connectivity to the three nodes, and all ports are open between the two sites.

I deployed cert-manager on worker 3.
When I try to apply a YAML manifest, it says https://svc:443 connection refused.

As far as I know, I have opened all the ports.

Can you help me resolve this issue?
I've been stuck on it for the past 3 days.

https://redd.it/1pugpce
@r_devops
EnvX-UI: Local, Encrypted & Editable .env

EnvX-UI was built to manage and edit .env files across multiple projects, including encrypted ones. A clean, intuitive interface for developers who need secure and centralized environment variable management.


https://github.com/litepacks/envx-ui

https://redd.it/1pulboq
@r_devops
Got actions/flows you swear by?

Just wondering what people have as defaults when they start a repo?

We have linters and code stylers on production code repos.
Just wondering if there are others out there that may be handy?

https://redd.it/1punbp6
@r_devops
State backend on AWS

How do you deal with the “chicken and egg” situation when creating the backend for your infra on AWS? I’ve seen people use a bootstrap directory that deploys an S3 bucket and a DynamoDB table, and I have grown accustomed to it as well. I’m wondering how others approach it, especially with DynamoDB being deprecated for state locking.

https://redd.it/1pum2l9
@r_devops
About stack in 2026

I have 4 years of experience in full-stack development with PHP, Node, Python, MySQL, MongoDB, and Redis, plus Vue and React on the frontend.

I have knowledge of Linux, nginx, Apache, AWS, Docker, Terraform, Ansible, and GitHub and GitLab pipelines, plus a little Prometheus and Grafana.

I have done some infra deployments on AWS and DigitalOcean, but I feel I'm not good enough yet.

Next month I have an interview for a mid/senior DevOps engineer job, and I really want to get this right.

What stack do you guys recommend I learn or revise to do well in the interview?

I really love DevOps engineering much more than coding, and I really want to move into this role, but I feel very insecure because it's a mid/senior job. I was referred for it by a friend, the same friend who taught me a lot about DevOps.

https://redd.it/1pult38
@r_devops
Zero-trust inside an early LLM platform: did you implement it from day one?

We’re building an internal LLM platform and compared two access models:

Option A - strict zero-trust between microservices (mTLS/JWT per call, sidecars, IdP).
Option B - a trusted boundary at the Docker network level (no per-request auth inside, strong boundary controls)

Current choice: Option B for the MVP. Context: single operator domain, no external system callers to the LLM service.

Why now
• Lower inference latency, faster delivery, lower integration cost

Main risk
• Lateral movement if a node inside the boundary is compromised

Compensators we use
• Network isolation/firewall, minimal images, read-only secrets with rotation, CI dependency scans, centralized logs/alerts, audit of outbound calls to external LLM APIs, isolated job containers without internal network

What we actually measure
• LLM service latency under load
• Secret rotation cadence
• Vulnerability scan score/drift
• Anomaly rate on outbound calls

Switch criteria to zero-trust later
• External integrations, multi-tenant mode, third-party operators/contractors, regulatory pressure

Questions to the community

1. On small teams: which mTLS/JWT pattern kept ops simple enough (service mesh vs per-service libs)?
2. What was the real latency/complexity tax you observed when going zero-trust inside the boundary?
3. Any “gotchas” with token management between short-lived jobs/containers?

https://redd.it/1puloxu
@r_devops
Where do you start when automating things for a series-A/B startup, low headcount?

Hey all

I’m curious how others approach this:

I’m working with a startup, they’re 2 years in and have some solid customers, and a dev team of about 8.

Software assets

- spring boot/react typical web app for a UI, a bunch of LLM interactions, and data management
- admin app where prompt engineers work with poorly/manual git versioned workflow

Testing

- no unit
- no integration
- limited selenium coming online now
- thousands of manual test cases, regression takes 5 days (!)

Deploy:

- everything is non-CI, some shell scripts
- liquibase rolls into schema JARs

Infra:

- stale terraform, likely significant config drift

Envs:

- AWS
- dev/qa/preprod/prod, but also a handful of “prod v1.x” instances where customers are being migrated from

Git:

- trunk based, release branches, feature branches

Your reply could come from any experience. I’m just setting the level a little here so that we’re on the same page about where they are in dev maturity. I have my thoughts too, and a plan, and I’m curious how other folks see it. There's always something to learn.

Cheers!

https://redd.it/1pus3sb
@r_devops
Google Cloud CDN vs CloudFront: help me decide?

Hey guys
I'm building a video heavy app with long form stuff like 30 mins each and trying to figure out which CDN to use as a backup.
I use Cloudflare as my main right now but after the recent outages I really need a solid secondary. I'm torn between Google Cloud CDN and AWS CloudFront.
GCP seems faster because of their private fiber network but AWS is just everywhere. For anyone who actually used both for video streaming or large files which one was less of a headache to set up? And how is the caching for long videos?
Not really looking for marketing fluff just want to know from someone who’s been in the trenches which one is more reliable when things go south?
Cheers

https://redd.it/1puutzv
@r_devops
what does a DevOps engineer actually do day-to-day?

Hi everyone,

I’m currently getting into DevOps and had a few beginner questions that I’ve been thinking about.

From a real-world perspective, what does a DevOps engineer usually do on a daily basis?
Do you mostly write scripts and automation, or do you also write application code?

Another thing I’m curious about is command usage. As a beginner, it feels overwhelming to remember so many commands and configurations. In real jobs, do engineers memorize most commands, or is it normal to rely on documentation, notes, and previously written scripts?

Also, how different is interview expectation compared to actual on-the-job work?
I’m asking this genuinely to understand what I should focus on while learning.

https://redd.it/1puwcya
@r_devops
A little cookiecutter script to add logging and redirect to circusd

I've recently set up a home server slash IoT hub (router with three wifi access points, zigbee server, file server, a bunch of little web server apps) and ended up using circusd. Mostly to keep services nicely separate from one another and systemd. It lets me look at the pstree for an entire service, watch for restarts and look at all the logs together.

I have a pattern where each service gets its own user with files for running circus, rsyslog etc. I've done this enough times that I've set up a little cookiecutter script to set up the user and I thought I might as well share this here. It's very much tuned for the "home network" setting (e.g. I am publishing services on mdns using avahi etc). Also people probably want autoscaling container magic for things used in anger, but it works pretty well for single user stuff.

`https://github.com/talwrii/cookiecutter-circus`

https://redd.it/1puyj6m
@r_devops
Why am I getting rejected from internships?

Hi there DevOps community
This year I'm looking for an internship, so I apply as a student should. There are actually a lot of offers, and I know that where I'm applying there aren't that many DevOps students, so I was expecting to get responses quickly, but for some reason I get rejected right away.
There is a pattern that I noticed though: big companies take a long time and then reject me, while smaller companies take a medium amount of time and move me to the next stage.
My resume is strong, and the few other DevOps students are facing the same issues.
Does anyone have an idea what's happening?

https://redd.it/1puyjs2
@r_devops
How are you handling CI/CD for AI Agents?

I’m a dev working on a tool to help audit and deploy AI agents. I realized that standard CI/CD breaks down with agents because a code rollback doesn't necessarily fix a "behavior" regression caused by a prompt drift or model update. If you are deploying LLMs in production: Do you treat prompts as config files (Helm charts/Env vars) or code? If an agent starts hallucinating in prod, does your current pipeline allow you to "hot swap" the prompt version without a full redeploy?
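
To make the "hot swap" question concrete, here is one way it could look if prompts are treated as versioned data with an active pointer, separate from the code release. This is only a sketch under that assumption: the `PromptRegistry` class, the agent name, and the version labels are all hypothetical, and a real setup would keep the versions in a config store or database rather than in memory.

```python
from dataclasses import dataclass, field


@dataclass
class PromptRegistry:
    """In-memory prompt store: every version is kept, one is active per agent."""

    versions: dict = field(default_factory=dict)  # agent -> {version: prompt text}
    active: dict = field(default_factory=dict)    # agent -> active version label

    def publish(self, agent: str, version: str, text: str) -> None:
        # Publishing never changes what is live; it only records a version.
        self.versions.setdefault(agent, {})[version] = text

    def activate(self, agent: str, version: str) -> None:
        # Switching the pointer is the "hot swap": no code redeploy involved.
        if version not in self.versions.get(agent, {}):
            raise KeyError(f"unknown prompt version {version!r} for {agent!r}")
        self.active[agent] = version

    def current(self, agent: str) -> str:
        return self.versions[agent][self.active[agent]]


if __name__ == "__main__":
    registry = PromptRegistry()
    registry.publish("support-bot", "v1", "Answer briefly and cite the docs.")
    registry.publish("support-bot", "v2", "Answer in detail.")
    registry.activate("support-bot", "v2")
    registry.activate("support-bot", "v1")  # roll back the behavior only
    print(registry.current("support-bot"))
```

A model update can still change behavior under a pinned prompt, so the model identifier would probably need to be versioned alongside the prompt in the same way.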

https://redd.it/1pv11x0
@r_devops
Which AWS consulting partners in Europe are actually worth it? Top 10

Let’s be honest, browsing the AWS Partner Network directory feels like trying to find a needle in a haystack where every needle claims to be Premier. Everyone has badges, everyone promises seamless digital transformation, but how many actually deliver when production is on fire? Finding top AWS consultants who don't just bill you for hours but actually fix your cloud infrastructure is harder than it looks.

I’ve dealt with enough agencies to know that a shiny sales deck doesn't equal clean code. So this isn't a ranked leaderboard, but rather a curated list of companies that actually bring value to the table, depending on whether you need AWS managed services or deep engineering muscle:

1. Nordcloud: They are essentially the IBM of the cloud world in Europe now. If you are a massive enterprise needing standardized compliance and have the budget to match, they are a solid bet.
2. Beetroot: A strong choice if you need AWS certified developers but want them embedded in your team rather than just consulting from the outside. They specialize in building dedicated teams and handling complex DevOps pipelines. Their focus is big on the "human" side of tech, which helps when retention matters.
3. DoiT International: Go to them if your bill is bleeding you dry. They are absolute wizards at cost optimization and reselling, though less focused on building custom apps from scratch.
4. The Scale Factory: Great for SaaS businesses. They understand scalability and don't just throw hardware at problems.
5. Storm Reply: Very strong on the technical execution side, particularly in Germany and Italy. They handle heavy IoT and industrial cloud projects well.
6. AllCloud: If you are stuck between Salesforce and AWS, these guys bridge that gap better than most.
7. tecRacer: Another heavy hitter in the DACH region. Their training is top-tier, which usually translates to competent consultants.
8. SoftwareOne: Good for licensing and general management, though sometimes feels a bit corporate for agile startups.
9. Contino: Excellent for the transformation culture. They focus heavily on cloud-native adoption rather than just "lift and shift."
10. Caylent: While they have a heavy US presence, their European operations are growing and they are deep into AWS Lambda and serverless architectures.

When you interview these firms, ask about their DevOps culture. Do they automate security checks? Do they use Terraform or CloudFormation? If they stare blankly, run. You want partners who push for serverless where it saves money and containers where it makes sense, not just whatever is easiest for them to bill. If you just need hands, standard outsourcing works. But for architecture, you need top AWS consultants who will challenge your bad ideas. The best cloud migration services often involve telling the client that their legacy app shouldn't be migrated as-is. It makes a massive difference in the long run.

https://redd.it/1puy3t4
@r_devops
Vagrant SSH CTRL C Bug Workaround - Decoding DevOps

Hi everyone!

I'm new in my DevOps journey, following a Udemy course named Decoding DevOps, and so far I'm liking it a lot. The only thing that was quite annoying is that the vagrant ssh command would exit the SSH client whenever you sent a CTRL+C. I couldn't find a way around it apart from using the normal SSH client through Git Bash, so I made a simple, tidy script that automatically gets all the info needed from the VM and creates an alias for easy SSH connecting. Here is my repo. It's the first time I'm doing something like this. I know it's really simple, but tbh having it work on my end made me very happy and I just want to share it somewhere.

https://github.com/jovanjungic/vssh-sync

https://redd.it/1puxrzo
@r_devops
Building a deterministic policy firewall for AI execution — would love infra feedback

I’m experimenting with a control-plane style approach for AI systems and looking for infra/architecture feedback.



The system sits between AI (or automation) and execution and enforces hard policy constraints before anything runs.



Key points:

- It does NOT try to reason like an LLM
- Intent normalization is best-effort and replaceable
- Policy enforcement is deterministic and fails closed
- Every decision generates an audit trail



I’ve been testing it in fintech, health, legal, insurance, and gov-style scenarios, including unstructured inputs.



This isn’t monitoring or reporting — it blocks execution upfront.
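
For readers who want the "deterministic, fails closed, always audited" idea in concrete form, here is a minimal Python sketch. It is not code from the repo: the `PolicyGate` and `Decision` names, the intent shape (a dict with an `action` key), and the reason strings are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


@dataclass
class PolicyGate:
    """Allow-list check that denies on anything unexpected and logs every call."""

    allowed_actions: frozenset
    audit_log: list = field(default_factory=list)

    def check(self, intent) -> Decision:
        try:
            action = intent["action"]
            if not isinstance(action, str):
                raise TypeError("action must be a string")
            if action in self.allowed_actions:
                decision = Decision(True, "action_allowed")
            else:
                decision = Decision(False, "action_not_in_policy")
        except Exception as exc:
            # Fail closed: a malformed or unreadable intent is a denial.
            decision = Decision(False, f"malformed_intent:{type(exc).__name__}")
        # Every decision, allow or block, leaves an audit record.
        self.audit_log.append(
            {"intent": intent, "allowed": decision.allowed, "reason": decision.reason}
        )
        return decision


if __name__ == "__main__":
    gate = PolicyGate(frozenset({"read_record"}))
    for intent in ({"action": "read_record"}, {"action": "wire_transfer"}, {}):
        print(gate.check(intent))
```

The caller would run the action only when `check` returns an allowed decision, so the best-effort intent normalization can fail or be replaced without ever widening what is permitted.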



Repo here: https://github.com/LOLA0786/Intent-Engine-Api



Genuinely curious:

- What assumptions would you attack?
- Where would this be hard to operate?
- What would scare you in prod?



https://redd.it/1pv6ox2
@r_devops
Mist: self-hostable PaaS for deploying apps on your own infrastructure

Over the past few months, me and a friend have been building Mist, a self-hostable PaaS aimed at people running their own VPS or homelab setups.
Mist helps you deploy and manage applications on infrastructure you control using a Docker-based workflow, while keeping things lightweight and predictable.

Current features:
- auto-deployments on git push
- Docker-based application deployments
- multi-user architecture
- domain and TLS management

The project is fully open source. There’s a fairly large roadmap ahead, and we’re actively looking for contributors and early feedback from people who self-host or build infra tools.

Docs / project site: https://trymist.cloud
Source code: https://github.com/corecollectives/mist

Happy to answer questions or hear suggestions.

We’re still relatively new to software development and are building this in the open while learning and iterating.

https://redd.it/1pv7pk9
@r_devops
DevOps or Development as a fresher

I don’t have much in-depth knowledge of web dev. I know only basic HTML and CSS, and I built some vibe-coded projects from scratch and deployed them on Vercel. Through this I got to know how backend and frontend work. I have surface knowledge of how different tech stacks work: React, Angular, backend frameworks like Django and FastAPI, middleware and where it is used, build tools like Vue, runtime environments, CRUD databases, Supabase, SQL, hiding .env before pushing to git, different package managers, microservices, REST API integration and other API options, and the difference between 2-tier and 3-tier web architecture, all because of curiosity and AI. If you tell me to code without AI, I will know which tech stack to use and what to build, but not how to build it, as I don’t know the syntax of each language, though I understand the logic behind the structure of a project.

I am confused as a 4th-semester BTech student (tier 3). I'm not much inclined towards learning web dev from scratch or writing long code, but I like the top-down, big-picture approach: how different systems work and manage lots of interactions without breaking, and how they scale. Most importantly, I like to automate tasks rather than write long code. So I got to know about DevOps, which fits my interests, as I know Linux, scripting, networking, and YAML, and I'm also interested in learning cloud computing.

So I wanted to ask: should I go for pure DevOps instead of development, and will I get entry-level jobs and internships?

Your guidance will be much appreciated 🙏

https://redd.it/1pv6csu
@r_devops
I’m building runtime “IAM for AI agents” policies, mandates, hard enforcement. Does this problem resonate?

I’m working on an MVP that treats AI agents as **economic actors**, not just scripts or prompts, and I want honest validation from people actually running agents in production.



The problem I keep seeing

Agents today can:

* spend money (LLM calls, APIs)
* call tools (email, DB, infra, MCP servers)
* act repeatedly and autonomously

But we mostly “control” them with:

* prompts
* conventions
* code



There’s no real concept of:

* agent identity
* hard authority
* budgets that can’t be bypassed
* deterministic enforcement



If an agent goes rogue, you usually find out **after** money is spent or damage is done.



What I’m building

A small infra layer that sits **outside** the LLM and enforces authority mechanically.



Core ideas:

* **Agent** = stable identity (not a process)
* **Policy** = static, versioned authority template (what **could** be allowed)
* **Rule** = context-based selection (user tier, env, tenant, etc.)
* **Mandate** = short-lived authority issued per invocation
* Enforcement = allow/block tool/MCP + LLM calls at runtime



No prompt tricks. No AI judgment. Just deterministic allow / block.



Examples:

* Free users → agent can only read data, $1 budget
* Paid users → same agent code, higher budget + more tools
* Kill switch → instantly block all future actions
* All actions audited with reason codes
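
To show how these pieces could fit together mechanically, here is a small Python sketch. It is not the code from the repo: the `POLICIES` table, the tool names, the $20 paid budget, the 60-second mandate lifetime, and the reason codes are all assumptions for illustration. Only the $1 free-tier budget comes from the examples above. Budgets are tracked in integer cents to avoid float rounding.

```python
from dataclasses import dataclass
import time

# Hypothetical policy templates: user tier -> (allowed tools, budget in cents).
POLICIES = {
    "free": (frozenset({"read_data"}), 100),
    "paid": (frozenset({"read_data", "send_email"}), 2000),
}


@dataclass
class Mandate:
    """Short-lived authority issued for one invocation of an agent."""

    agent_id: str
    allowed_tools: frozenset
    budget_cents: int
    expires_at: float
    spent_cents: int = 0


def issue_mandate(agent_id, tier, now=None, ttl_seconds=60.0):
    # The "rule" step: context (here only the tier) selects a policy template.
    now = time.time() if now is None else now
    tools, budget = POLICIES[tier]
    return Mandate(agent_id, tools, budget, now + ttl_seconds)


class Enforcer:
    def __init__(self):
        self.killed = set()  # agent ids blocked by the kill switch
        self.audit = []      # (agent_id, tool, reason_code) per decision

    def kill(self, agent_id):
        self.killed.add(agent_id)

    def authorize(self, mandate, tool, cost_cents, now=None):
        now = time.time() if now is None else now
        if mandate.agent_id in self.killed:
            reason = "BLOCK_KILL_SWITCH"
        elif now >= mandate.expires_at:
            reason = "BLOCK_MANDATE_EXPIRED"
        elif tool not in mandate.allowed_tools:
            reason = "BLOCK_TOOL_NOT_ALLOWED"
        elif mandate.spent_cents + cost_cents > mandate.budget_cents:
            reason = "BLOCK_BUDGET_EXCEEDED"
        else:
            mandate.spent_cents += cost_cents  # spend is recorded only on allow
            reason = "ALLOW"
        self.audit.append((mandate.agent_id, tool, reason))
        return reason == "ALLOW"
```

The same agent code runs for both tiers. Only the mandate handed to it differs, and the enforcer never consults the model when deciding.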



What this is NOT

* Not an agent framework
* Not AI safety / content moderation
* Not prompt guardrails
* Not model alignment



It’s closer to IAM / firewall thinking, but for agents.



Why I’m unsure

This feels **obvious** once you see it, but also very infra-heavy.

I don’t know if enough teams feel the pain **yet**, or if this is too early.



I’d love feedback on:

1. If you run agents in prod: what failures scare you most?
2. Do you rely on prompts for control today? Has that burned you?
3. Would you adopt a hard enforcement layer like this?
4. What would make this a “no-brainer” vs “too much overhead”?



I’m not selling anything, just trying to validate whether this is a real problem worth going deeper on.

github repo for mvp (local only): [https://github.com/kashaf12/mandate](https://github.com/kashaf12/mandate)

https://redd.it/1pvat3j
@r_devops