Question on the stack for blog/mobile app
I'm setting up the infrastructure for a news and contest blog (and a future React Native app). The focus is on maximum optimization and low operating costs at scale (aiming for 200k+ users).
I'd like a reality check on my stack:
• Frontend Web: Next.js (Vercel Hosting + Cloudflare CDN).
• Mobile: React Native.
• CMS/Backend API: Strapi, hosted on Fly.io.
• Database: PostgreSQL via Neon (Serverless DB).
• Authentication/Users: Firebase.
Is this combination the best possible to ensure efficiency and low infrastructure costs in the long run, or is there any bottleneck (mainly in the Strapi/Fly.io/Neon trio) that I should correct before launching the app?
https://redd.it/1phr5jk
@r_devops
Developers, pls stop treating Datadog like ur personal diary
I’m slowly losing my mind here because some of our devs refuse to filter their logs.
Our Datadog bill is skyrocketing, and for what? to store masterpieces like:
Process starting...
Process started...
Process REALLY started...
Plus 300 lines of “not-an-error” stack traces.
Every time I ask them to log less, or "let me create a filter for that", I get
“we might need it later” or “it’s only a few lines” - sure, times 3 million.
Anyone else fighting the “log everything forever” cult? How do you handle this type of battle? I need them to agree to drop much of that spend, while respecting that they may need some of it...
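One possible middle ground is to filter at the source so that errors always ship while known chatter is dropped and debug output is only sampled. The following is a minimal sketch using Python's standard `logging` module; the class name `NoiseFilter`, the prefixes, and the sample rate are assumptions for illustration, not anything from the thread or from Datadog's tooling.

```python
import logging
import random


class NoiseFilter(logging.Filter):
    """Drop or sample low-value records before they reach a paid log backend."""

    def __init__(self, drop_prefixes, debug_sample_rate=0.01, rng=None):
        super().__init__()
        self.drop_prefixes = tuple(drop_prefixes)  # hypothetical chatter prefixes
        self.debug_sample_rate = debug_sample_rate
        self.rng = rng or random.Random()

    def filter(self, record):
        # Warnings and errors are always kept.
        if record.levelno >= logging.WARNING:
            return True
        # Known chatter is dropped outright.
        if record.getMessage().startswith(self.drop_prefixes):
            return False
        # DEBUG records are kept only for a small random sample.
        if record.levelno <= logging.DEBUG:
            return self.rng.random() < self.debug_sample_rate
        return True


if __name__ == "__main__":
    handler = logging.StreamHandler()
    handler.addFilter(NoiseFilter(["Process start", "Process REALLY"]))
    logging.basicConfig(level=logging.DEBUG, handlers=[handler])
    logging.info("Process starting...")   # dropped
    logging.error("payment failed")       # kept
```

Attaching the filter to the handler rather than to individual loggers means developers keep writing the log lines they want, and the decision about what gets shipped lives in one reviewable place.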
https://redd.it/1phs661
@r_devops
Amateur Docker mistake
Hello all,
VERY much an amateur here, just now learning Docker and things. I have been working on a small project to learn using Nexus and Docker.
Since I have a new Mac, I was informed running Nexus via Docker was best due to some OS limitations. Well, everything worked fine until I made one dumb rookie mistake.
I created a repo named “docker hosted” on Nexus and needed to add port 8083. So I stopped my container, removed it, and recreated it with the additional port. What my uneducated amateur brain didn’t realize was that doing this would generate a new admin password and wipe all the users, roles, blob stores and rules I had created.
If you ask about backups, the project I’ve been following along with didn’t do that or hadn’t talked about that yet. So no backups. I looked for the volumes on my machine and unfortunately the previous one wasn’t there.
All this to say.. when you were first learning.. did you make any silly mistakes like this?
I feel real dumb. lol thankfully this is just for learning experience and not for work.
https://redd.it/1phs7xc
@r_devops
6 years in devops — do i need to study dsa now?
hey folks,
i’ve been a devops engineer for about 6 years, mostly working with kubernetes and cloud infra. my role hasn’t really involved much coding.
now i’m aiming for bigger companies in India, and i keep hearing that they ask dsa in the first round even for devops roles. i don’t mind learning dsa if it’s actually needed, but i’m wondering if it’s worth the time.
for those who’ve interviewed recently, is dsa really required for devops/sre roles at big companies, or should i focus more on system design, cloud, and infra instead?
thanks in advance!
https://redd.it/1phfe4o
@r_devops
Need Suggestions
I've completed as much of my DevOps learning journey as a fresher needs to get a job.
I've started applying, and I know it takes time to get a job now, because I'm a fresher from a non-IT background without an IT degree.
So I need to be patient.
Alongside applying, I need to practice regularly so that I don't forget anything.
So my question is: how should I divide my time between the two? I have 3.5 hours in total each day.
Consider these points as well before answering:
I need a job; it's very important to me.
But I also need to be patient.
Revising and continuing to practice is also important.
Note: just divide the time between applying and practice.
https://redd.it/1phweuz
@r_devops
What’s an AI tool you tried recently that actually earned a permanent spot in your workflow?
Lately it feels like there’s a new “game-changing” AI tool dropping every 10 minutes, slick websites, big claims, and then… I use it once and never open it again.
I keep finding myself going back to the same few tools, so I’m genuinely curious:
Has anything you’ve tried recently stuck enough to become part of your daily or weekly routine?
Not talking about hype or one-off demos, I mean a tool that genuinely surprised you and proved useful long-term.
Always looking for real recommendations from people who actually use this stuff, not marketing pages.
https://redd.it/1phyeoo
@r_devops
Need your opinions
Hey DevOps folks, I have a question that I think every dev runs into: what do you think about DSA? I've seen many resumes in this sub, but none of them mention DSA. Isn't DSA important in DevOps? I put this question to my aunt's son, who works in Japan. He started in SRE and later moved to DevSecOps; he used to work at Rakuten and is now at a different company. He told me that without coding skills no company will even consider you, and that both coding and DSA knowledge matter. He said that tools like Docker, k8s, AWS, Linux, etc. are common and show up on every resume. I mean, he's not wrong, and he has 7+ YOE. What do you think? I've seen DevOps courses from Scaler and GeeksforGeeks that include DSA in their curriculum. Please share your opinions, because I think 90%+ of students go into DevOps just to avoid DSA and coding, or for a better package. I haven't seen any YouTuber discuss DSA in DevOps. So is DSA as important as the other tools we learn in DevOps?
https://redd.it/1phziiz
@r_devops
GitLab CI trigger merge request pipeline on push to target branch
Is there any way to trigger a merge request pipeline on push/merge to the TARGET (aka main) branch? The default behavior of `if: $CI_PIPELINE_SOURCE == 'merge_request_event'` does not provide this.
Maybe there is another way to handle it? It's important to retrigger tests on MRs after any change in the main branch, as their results may no longer be valid.
For now I'm looking into server hooks, or just restarting the MR test jobs via the API from an additional job that runs on merge/push to main.
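The API route could look roughly like the sketch below, which lists open MRs targeting a branch and asks GitLab to create a new merge request pipeline for each. Treat it as an untested outline: the host `gitlab.example.com` and the function names are placeholders, the token is assumed to be a project access token with `api` scope, and only the first page of results is read. The two endpoints are GitLab's REST calls for listing merge requests and creating a merge request pipeline.

```python
import json
import urllib.parse
import urllib.request

API = "https://gitlab.example.com/api/v4"  # hypothetical GitLab host


def open_mrs_url(project_id, target_branch):
    """URL listing open merge requests that target the given branch."""
    query = urllib.parse.urlencode(
        {"state": "opened", "target_branch": target_branch, "per_page": 100}
    )
    return f"{API}/projects/{project_id}/merge_requests?{query}"


def mr_pipeline_url(project_id, mr_iid):
    """URL that creates a merge request pipeline when POSTed to."""
    return f"{API}/projects/{project_id}/merge_requests/{mr_iid}/pipelines"


def retrigger_open_mrs(project_id, target_branch, token):
    """Start a fresh MR pipeline for every open MR on target_branch (first page only)."""
    headers = {"PRIVATE-TOKEN": token}
    request = urllib.request.Request(
        open_mrs_url(project_id, target_branch), headers=headers
    )
    with urllib.request.urlopen(request) as response:
        merge_requests = json.load(response)
    for mr in merge_requests:
        post = urllib.request.Request(
            mr_pipeline_url(project_id, mr["iid"]), headers=headers, method="POST"
        )
        urllib.request.urlopen(post).close()
    return [mr["iid"] for mr in merge_requests]
```

A job restricted to the main branch could call `retrigger_open_mrs` with `$CI_PROJECT_ID` and a token stored as a masked CI/CD variable. With many open MRs this starts many pipelines at once, so some throttling or filtering is probably needed.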
https://redd.it/1pi16io
@r_devops
Is it possible to run iOS CI/CD from a Jenkins Linux build node? (Mac agents isn't an option) - Anyone used xtool?
I'm trying to set up CI/CD for an iOS app, but we cannot use Jenkins macOS agents (no EC2 Mac, no on-prem Mac minis; Mac-based EC2 instances are crazy-ass costly).
Our entire pipeline runs on Linux-based Jenkins nodes, and we’d prefer to keep it that way.
I came across xtool, which claims to let you run iOS builds from Linux by offloading the actual Xcode build to their cloud macOS environment: https://github.com/xtool-org/xtool
Has anyone here:
1. Run iOS CI/CD entirely from Linux Jenkins using something like xtool?
2. Used xtool in production? How reliable is it?
3. Faced any limitations (signing, keychain handling, test runners, caching, build times)?
Basically:
Is xtool a viable alternative to running a Jenkins Mac node?
Or am I missing something fundamental in the iOS build pipeline that still requires macOS locally?
Any guidance or real-world experience would be super helpful :)
https://redd.it/1pi1gwj
@r_devops
Is anyone using feature flags to implement chaos engineering techniques?
I'm thinking of failure injections like additional latency, API timeouts, dependency errors, etc.
It sounds useful to have a deploy-free way to inject chaos using a flag. But you also have automatic circuit breakers and other mechanisms in place to remediate issues. Is there an overlap?
How do you integrate feature flags and kill switches with chaos experiments, circuit breakers, and so on?
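To make the idea concrete, here is a minimal sketch of flag-driven fault injection in Python. Everything named here is an assumption for illustration: `FLAGS` stands in for a real flag service, and the `chaos` decorator, the flag keys, and `InjectedFault` are hypothetical.

```python
import functools
import random
import time

# Hypothetical in-memory flag store; a real setup would query a flag service.
FLAGS = {"chaos.payments.latency_ms": 0, "chaos.payments.error_rate": 0.0}


class InjectedFault(RuntimeError):
    """Raised in place of a real dependency error during an experiment."""


def chaos(prefix, flags=FLAGS, rng=random.random, sleep=time.sleep):
    """Wrap a dependency call with latency and error injection read from flags."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Flags are read on every call, so a flag change takes effect
            # without a deploy.
            latency_ms = flags.get(f"{prefix}.latency_ms", 0)
            error_rate = flags.get(f"{prefix}.error_rate", 0.0)
            if latency_ms:
                sleep(latency_ms / 1000)
            if rng() < error_rate:
                raise InjectedFault(f"injected by {prefix}")
            return func(*args, **kwargs)

        return wrapper

    return decorator


@chaos("chaos.payments")
def charge(amount):
    return f"charged {amount}"
```

In this arrangement the injected fault travels the same path as a real failure, so a circuit breaker wrapped around `charge` would be exercised by the experiment instead of competing with it, and setting both flags back to zero acts as the kill switch.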
https://redd.it/1pi4x79
@r_devops
How do you manage an application on a single server (eg hetzner)
I've been having a play recently with a hetzner server and, though I wouldn't be surprised to hear it's a "skill issue", I can't seem to see how people manage applications on them.
That isn't anything against hetzner, I enjoyed using it. But I found I ended up gravitating to multi-cloud (GCP and Hetzner) in order to have access to secrets, artifact registry (for docker images), service accounts and so on.
So I'm just curious whether using things like this obviously requires something like GCP (or whatever other services other than Hetzner), or if there are approaches / workflows I'm unaware of.
Cheers!
https://redd.it/1pi4eqg
@r_devops
Security Misconfiguration: The 90% Problem That Never Goes Away ⚙️
https://instatunnel.my/blog/security-misconfiguration-the-90-problem-that-never-goes-away
https://redd.it/1pi7y7v
@r_devops
I built a CLI tool to deploy to Docker Swarm like it's Vercel (Secrets rotation, Multi-env)
Hi everyone,
I love Docker Swarm for its simplicity, but I hated managing deployments manually. Kubernetes felt like overkill for my use case, but writing bash scripts to handle `docker build`, `docker tag`, `docker secret create`, and `docker stack deploy` was becoming a nightmare.
So I wrote Rollwave.
It's an open-source CLI tool written in Go that acts as a wrapper around Docker Swarm to give you a modern deployment experience.
Key Features:
🔒 Zero-Downtime Secret Rotation: It automatically versions your secrets (e.g., `db_pass_v1`, `db_pass_v2`) and updates your services without downtime.
🌍 Multi-Environment Support: You can define `staging` and `production` environments in one `rollwave.yml` and deploy with `rollwave deploy --env staging`.
🧹 Auto-Cleanup: It automatically removes old, unused secrets after a successful deploy.
🏗️ Build & Push: It handles the entire build pipeline (including private registry auth) based on your standard `docker-compose.yml`.
It's currently in Alpha/MVP, but I'm using it for my own projects. I'd love to know what you think!
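Versioned names are needed because Docker secrets are immutable, so rotation means creating a new secret and pointing the service at it. The naming and cleanup bookkeeping might look like the Python sketch below. This is not Rollwave's implementation (the tool is written in Go), and the function names are made up for illustration.

```python
import re

# Matches names such as "db_pass_v2" and captures the base and the version.
VERSIONED = re.compile(r"^(?P<base>.+)_v(?P<version>\d+)$")


def versions_of(base, existing):
    """Return the sorted version numbers already used for this base name."""
    found = []
    for name in existing:
        match = VERSIONED.match(name)
        if match and match.group("base") == base:
            found.append(int(match.group("version")))
    return sorted(found)


def next_secret_name(base, existing):
    """Name to use for the next rotation, starting at _v1."""
    versions = versions_of(base, existing)
    return f"{base}_v{(versions[-1] if versions else 0) + 1}"


def stale_secret_names(base, existing, keep=1):
    """Older versions that could be removed once the deploy has succeeded."""
    versions = versions_of(base, existing)
    return [f"{base}_v{v}" for v in versions[: max(len(versions) - keep, 0)]]
```

Given `["db_pass_v1", "db_pass_v2"]`, the next name is `db_pass_v3`, and after a successful deploy `db_pass_v1` would be the cleanup candidate when one previous version is kept.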
GitHub: https://github.com/rollwave-dev/rollwave
https://redd.it/1pi9hz0
@r_devops
is 40% infrastructure waste just the industry standard?
Posted yesterday in r/kubernetes about how every cluster I audit seems to have 40-50% memory waste, and the thread turned into a massive debate about fear-based provisioning.
The pattern i'm seeing everywhere is developers requesting huge limits (e.g., 8Gi) for apps that sit at 500Mi usage. When asked why, the answer is always "we're terrified of OOMKills."
We are basically paying a fear tax to AWS just to soothe anxiety.
Wanted to get the r/devops perspective on this since you guys deal with the process side more: is this a tooling failure (we need better VPA/autoscaling) or a culture failure (devs have zero incentive to care about costs)?
I wrote a bash script to quantify this gap and found ~$40k/yr of fear waste on a single medium cluster.
Curious if you guys fight this battle or just accept the 40% waste as the cost of doing business?
The script I used to find the waste is here if you want to check your own ratios: https://github.com/WozzHQ/wozz
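The arithmetic behind such a ratio is simple enough to sketch in a few lines of Python. This is not the linked script; the function names are hypothetical and only binary suffixes (`Ki`, `Mi`, `Gi`, `Ti`) and plain byte counts are handled. Note that the scheduler reserves capacity based on requests rather than limits, so requests are the figure to compare against usage when estimating cost.

```python
# Binary suffixes used by Kubernetes memory quantities (decimal ones are omitted).
UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_memory(quantity):
    """Convert a quantity such as '500Mi' into bytes."""
    for suffix, factor in UNITS.items():
        if quantity.endswith(suffix):
            return float(quantity[: -len(suffix)]) * factor
    return float(quantity)  # assumed to be plain bytes


def waste_ratio(requested, used):
    """Fraction of the requested memory that is not actually used."""
    requested_bytes = parse_memory(requested)
    return (requested_bytes - parse_memory(used)) / requested_bytes


if __name__ == "__main__":
    print(f"{waste_ratio('8Gi', '500Mi'):.1%}")
```

For the 8Gi versus 500Mi example in the post, roughly 94% of the reservation sits idle.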
https://redd.it/1pib0u7
@r_devops
What's a "don't do this" lesson that took you years to learn?
After years of writing code, I've got a mental list of things I wish I'd known earlier. Not architecture patterns or frameworks — just practical stuff like:
* Don't refactor and add features in the same PR
* Don't skip writing tests "just this once"
* Don't review code when you're tired
Simple things. But I learned most of them by screwing up first.
What's on your list? What's something that seems obvious now but took you years (or a painful incident) to actually follow?
https://redd.it/1pic0a4
@r_devops
Non-UNIX administration?
Hey! I'm interested in some less popular operating systems. For example, right now I'm looking at FreeBSD to learn jails, play around with ZFS and stuff like that.
My question: is it actually a useful skill? As I understand the field, non-UNIX administration is really not something that companies look for when hiring DevOps Engineers. Maybe I am wrong and there is an area where (for example) FreeBSD is thriving and cannot be replaced?
https://redd.it/1piakms
@r_devops
I built envsgen: generate docker-compose files, dotenvs, JSON, and YAML from a single TOML config (with imports, variables, shell commands expansion)
Managing multiple services for my self-hosted projects meant rewriting the same env vars in a dozen places. Eventually I snapped and wrote **envsgen**, a small Go CLI that makes one TOML file the “master config” for everything.
Keep in mind it may have bugs, as this is my first release, but it works.
Repo: [https://github.com/mcisback/envsgen](https://github.com/mcisback/envsgen)
Medium: [https://marcocaggiano.medium.com/awesome-devops-share-data-between-docker-dotenvs-secrets-and-apps-b909ff346cd3](https://marcocaggiano.medium.com/awesome-devops-share-data-between-docker-dotenvs-secrets-and-apps-b909ff346cd3)
Features:
* Imports (`#!import`)
* `${path.to.value}` references
* `${envs.MY_VAR}` for environment lookups
* `` ${`shell command`} `` if you enable `--allow-shell`
* Inheritance (e.g. `backend.local` inherits `backend`)
* Output to dotenv, JSON, YAML, or **docker-compose.yaml**
* `--expand` flattens nested sections for .env formats
Now I can generate docker-compose + backend.env + production.env from the same file, no more duplication.
Happy to hear ideas or improvements!
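For readers unfamiliar with this style of config, the `${path.to.value}` references can be pictured as dotted lookups into the parsed TOML tree. The Python sketch below illustrates that idea only. It is not envsgen's code (the tool is written in Go), it ignores `${envs...}` lookups, shell expansion and inheritance, and the function names are hypothetical.

```python
import re

# Matches ${dotted.path} references inside a string value.
REFERENCE = re.compile(r"\$\{([A-Za-z0-9_.]+)\}")


def lookup(config, dotted_path):
    """Walk a nested dict following a path such as 'db.host'."""
    node = config
    for key in dotted_path.split("."):
        node = node[key]
    return node


def resolve(value, config):
    """Replace every ${path.to.value} reference in value with its target."""
    return REFERENCE.sub(lambda m: str(lookup(config, m.group(1))), value)


if __name__ == "__main__":
    config = {"db": {"host": "localhost", "port": 5432}}
    print(resolve("postgres://${db.host}:${db.port}/app", config))
```

A missing key raises `KeyError` here, which is probably the behaviour you want from a config generator: fail loudly instead of emitting an empty value.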
https://redd.it/1pib7qz
@r_devops
Is there a good way to route requests to a specific instance of an API?
I am setting up a service that will be consumed exclusively through a client library. We will have multiple instances of the service with some instances being shared by multiple customers and some being dedicated to a specific customer. In our database, we have a table that maps the customer id to the specific instance ip their requests are supposed to go to. I am now trying to figure out how to route requests to the correct instance. Note, we already have an authentication mechanism set up that will reject requests if they are sent to the wrong instance, so here I am just figuring out how to route requests assuming the service is being used as intended.
My first thought was to send all requests to one load balancer or api gateway, include a header with the customer id, and have the load balancer route the request to the correct instance based on the customer id. We would want to use one of GCP or AWS's managed load balancers for this though, and I was not able to find a good way to manually specify fine grained routing rules like this for those services. They allow you to specify url maps with routing conditions, but this seems intended for routing requests to different apis rather than routing to specific instances of the same api.
My next thought was to have our client library make an initial request to a shared service that holds the customer id/instance ip map, get the ip of the customer's service and then make requests directly to that service (which will have its own load balancer in front of it) from there. This would work, but it feels a little hacky and has a fair number of edge cases that would need to be handled in the client library.
Anyone have ideas on how you would handle this kind of routing?
Edit: Here by "instance" I really mean a stand alone scalable deployment. Due to some stateful dependencies we need all of the requests from a single customer to go to one deployment.
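The second option (look up the deployment once, then talk to it directly) might be sketched as below in Python. The class and parameter names are hypothetical, and `lookup` stands in for whatever call the shared mapping service would expose. Caching with a TTL and an explicit `invalidate` are one way to handle the main edge case, a customer being moved to a different deployment.

```python
import time


class InstanceResolver:
    """Client-side cache mapping a customer id to its deployment's base URL."""

    def __init__(self, lookup, ttl_seconds=300, clock=time.monotonic):
        self._lookup = lookup        # hypothetical callable: customer_id -> base URL
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}             # customer_id -> (base_url, expires_at)

    def resolve(self, customer_id):
        """Return the cached base URL, refreshing it once the TTL has passed."""
        now = self._clock()
        entry = self._cache.get(customer_id)
        if entry is not None and entry[1] > now:
            return entry[0]
        base_url = self._lookup(customer_id)
        self._cache[customer_id] = (base_url, now + self._ttl)
        return base_url

    def invalidate(self, customer_id):
        """Forget a mapping, e.g. after the deployment rejected a request."""
        self._cache.pop(customer_id, None)
```

Since the service already rejects requests sent to the wrong deployment, the client library can treat that rejection as the signal to call `invalidate` and resolve again, which keeps the mapping service off the hot path.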
https://redd.it/1pigr8t
@r_devops
Join the on-call roster, it’ll change your life
Joining an on-call rotation might change the future of your career and maybe even you as a person. Joining the roster 9 years ago has definitely changed me. In this article, I shared my experience being on-call.
Link: https://serce.me/posts/2025-12-09-join-oncall-it-will-change-your-life
https://redd.it/1pijp5t
@r_devops
Need brutally honest feedback: Am I employable as an internal tools/automation engineer with my background?
I'd really appreciate candid, unbiased feedback.
I’m based in Toronto and trying to understand where I realistically fit into the tech job market. My background is non-traditional, and I’ve developed a fear that I’m underqualified for most software roles despite being able to build a lot of things.
My background:
I was the main tech person at a small hedge fund that launched in 2021.
I built all the internal trading and operations tools from scratch:
PnL/exposure dashboards
Efficient trade executors
Signal engines built with insights from the PM, deployed on EC2, that communicated with client-side (traders') scripts over sockets
automated margin checks
reconciliation pipelines
Excel/Python hybrid tools for ops
Basically: if the team needed something automated or streamlined, I designed and built it.
Where I feel confident:
I’m very comfortable:
understanding messy business processes
abstracting them into clean systems
building reliable automations
shipping internal tools quickly
integrating APIs
automating workflows for non-technical users
designing guardrails so people don’t make mistakes
Across domains, I feel I could pick up any internal bottleneck and automate it.
Where I feel unprepared / insecure:
Because I was the only technical person:
I never learned Agile/Scrum
never used Jira or any formal ticketing
barely used SQL (everything was Python + Excel)
never worked with other engineers
didn’t learn proper software development patterns
no pull requests, no code reviews
no experience building public products or services
I worry that I’m mostly a “script kiddie” who built robust systems by intuition, but not a “proper software engineer.”
The fund manager was a trained software engineer but gave me full freedom as long as the tools worked — which I loved, but now I’m worried I skipped important foundational learning.
My questions for people working in tech today:
1. Is someone with my background employable for internal tools or automation engineering roles in Canada?
2. If not, what specific skills should I prioritize learning to become employable?
SQL?
TypeScript/React?
DevOps?
Software architecture?
3. What kinds of roles would someone like me realistically be competitive for?
Internal tools engineer?
Automation engineer?
Operations engineer?
AI automation roles?
4. Is it realistic for someone with mostly Python + automation experience (but little formal SWE experience) to land roles in the ~80–110k range in Canada?
5. If you were in my position, what would you do next to fix the gaps and move forward?
I’m not looking for comfort — I genuinely want realistic, even harsh feedback from people who understand the current job market.
Thanks in advance to anyone who takes the time to answer.
https://redd.it/1piihs4
@r_devops
Malware on application server
I’m a 3rd year IT student on the ops team for a DevOps class where the devs are building a .NET application.
Earlier today I noticed a suspicious process called b0s running from `/var/tmp/.b0s` and eating a ridiculous amount of CPU. After digging into it I realized the application server was actually compromised. There were:
* strange binaries dropped in `/var/tmp` and `/tmp`
* a fake sshd running from `/var/tmp/sshd`
* cronjobs that kept recreating themselves every minute
With some AI help I cleaned everything up: I killed the active malware processes and removed all the persistence mechanisms, so the server is stable again.
I built the application server with Ansible so rebuilding it tomorrow will be easy… still mad embarrassing though ngl.
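One way to spot processes like the ones described above is to check where each running process's executable lives. Here is a minimal sketch, assuming a Linux host with `/proc` mounted. The function names and the `SUSPECT_DIRS` list are hypothetical and cover only the two directories mentioned in this post.

```python
import os
import posixpath
from typing import Iterator, List, Tuple

# Directories where the dropped binaries were found in this post (an assumed, non-exhaustive list).
SUSPECT_DIRS = ("/tmp", "/var/tmp")


def is_suspect_path(exe_path: str) -> bool:
    """Return True if an executable path lives under one of SUSPECT_DIRS."""
    # Linux appends " (deleted)" to the /proc/<pid>/exe target once the file is unlinked.
    suffix = " (deleted)"
    if exe_path.endswith(suffix):
        exe_path = exe_path[: -len(suffix)]
    normalized = posixpath.normpath(exe_path)
    return any(normalized.startswith(d + "/") for d in SUSPECT_DIRS)


def iter_process_exes(proc_root: str = "/proc") -> Iterator[Tuple[int, str]]:
    """Yield (pid, executable path) for every process we are allowed to inspect."""
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue
        try:
            yield int(name), os.readlink(os.path.join(proc_root, name, "exe"))
        except OSError:
            # Process exited, is a kernel thread, or we lack permission.
            continue


def find_suspect_processes(proc_root: str = "/proc") -> List[Tuple[int, str]]:
    """List processes whose executable runs from a suspect directory."""
    return [(pid, exe) for pid, exe in iter_process_exes(proc_root) if is_suspect_path(exe)]


if __name__ == "__main__":
    for pid, exe in find_suspect_processes():
        print(f"{pid}\t{exe}")
```

Run as root to see every process. A check like this only flags candidates and says nothing about how the attacker got in.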
https://redd.it/1piol32
@r_devops