FastAPI with Celery worker
Deployment strategy: GitHub Actions - ECS - EC2
EC2: 2 CPU, 4 GB
Nginx serving the front end: less than 500 MB
FastAPI: 1 GB
Celery worker (FastAPI image)
The API has an upload requirement, but any time there’s an upload the FastAPI service restarts with 137 OOM (out of memory)…
File size: 2 KB
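For context on the 137: by shell convention, an exit status above 128 means the process was terminated by a signal numbered (status - 128), so 137 maps to signal 9, SIGKILL, which is the signal the kernel OOM killer sends. A minimal sketch for decoding such codes; the function name `describe_exit_code` is made up for illustration and is not part of FastAPI, Celery, or ECS:

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain a process or container exit code using the shell convention."""
    if code > 128:
        # Codes above 128 encode "terminated by signal (code - 128)".
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = "unknown signal"
        return f"killed by {name} (signal {signum})"
    if code == 0:
        return "exited normally"
    return f"exited with error status {code}"


if __name__ == "__main__":
    print(describe_exit_code(137))
```

A SIGKILL only says the process was killed from outside; it does not by itself say which memory limit (container or host) was exceeded.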
https://redd.it/1psjbug
@r_devops
Reddit
From the devops community on Reddit
Explore this post and more from the devops community
Why the hell do container images come with a full freaking OS I don't need?
Seriously, who decided my Go binary needs bash, curl, and 47 other utilities it'll never touch? I'm drowning in CVE alerts for stuff that has zero business being in production containers. Half my vulnerability backlog is noise from base image bloat.
Anyone actually using distroless or minimal images in prod? How'd you sell the team on it? Devs are whining they can't shell into containers to debug anymore but honestly that sounds like a feature not a bug.
Need practical advice on making the switch without breaking everything.
https://redd.it/1pskpsd
@r_devops
6 years in DevOps (~14 projects) and I’m burning out — considering management or cybersecurity
I’m looking for some perspective from people who’ve been in this field longer than me.
I’ve been working in DevOps for ~6 years. I wouldn’t call myself a “rockstar” or a deep specialist in one niche, but I’ve had decent breadth: I’ve worked across ~14 projects (many different technologies). I’ve touched a lot of the standard DevOps stack: AWS/Azure, Terraform, Kubernetes, different CI/CD tools, Helm charts, the usual stuff.
And lately I’ve been asking myself: do I actually want to keep doing this long-term?
I’m not quitting tomorrow, but I’m noticing something that looks a lot like burnout (or at least the early version of it). The biggest issue isn’t that I can’t do the work — it’s that I’m losing interest in the idea of being a “Senior DevOps” whose life is just… shipping more Helm charts and deployment pipelines forever. I’m starting to worry that if I keep pushing the same path, I’ll end up stuck and miserable.
On top of that, I’ve been thinking a lot about doing something that feels genuinely useful / meaningful. For me, “useful” looks like working on problems that matter beyond shipping features — and honestly, I’ve always seen military work as something with a stronger sense of purpose. That made me consider a longer-term plan: move into cybersecurity and potentially transition from civilian work into a military role (or defense-related work). My hope is that it would give me a stronger feeling that I’m doing something important.
So I’ve started thinking about alternative directions that still use my background, but feel like forward movement rather than “more of the same.” A few paths I’m considering:
- Engineering manager / technical project manager / delivery-type role. I have a 3-year degree in IT Project Management.
- Cybersecurity (especially cloud/Kubernetes security, incident response, defensive security). Potentially aiming for a role that could translate into defense/military work later.
What I’m hoping to get from this post:
1. If you hit this “I don’t want to do the same DevOps work forever” phase — what did you do?
2. For people who moved from DevOps into management (EM/PM/TPM) — what skills mattered most and what surprised you?
3. For people who moved from DevOps into cybersecurity — what was the best entry point (cloud security, detection/response, security engineering, GRC, etc.) and what would you do differently?
4. Any advice for figuring out whether this is real burnout vs just needing a change of project/company?
5. If anyone has experience moving from civilian tech into defense/military-related work (even indirectly) — what should I know upfront?
I’d really appreciate any stories, recommendations, or even “here’s what I wish I knew earlier.”
Thanks.
https://redd.it/1psn7om
@r_devops
What are the biggest observability challenges with AI agents, ML, and multi‑cloud?
As more teams adopt AI agents, ML‑driven automation, and multi‑cloud setups, observability feels a lot more complicated than “collect logs and add dashboards.”
My biggest problem right now: I often wait hours before I even know what failed or where in the flow it failed. I see symptoms (alerts, errors), but not a clear view of which stage in a complex workflow actually broke.
I’d love to hear from people running real systems:
1. What’s the single biggest challenge you face today in observability with AI/agent‑driven changes or ML‑based systems?
2. How do you currently debug or audit actions taken by AI agents (auto‑remediation, config changes, PR updates, etc.)?
3. In a multi‑cloud setup (AWS/GCP/Azure/on‑prem), what’s hardest for you: data collection, correlation, cost/latency, IAM/permissions, or something else?
4. If you could snap your fingers and get one “observability superpower” for this new world (agents + ML + multi‑cloud), what would it be?
Extra helpful if you can share concrete incidents or war stories where:
- Something broke and it was hard to tell whether an agent/ML system or a human caused it.
- Traditional logs/metrics/traces weren’t enough to explain the sequence of stages or who/what did what when.
Looking forward to learning from what you’re seeing on the ground.
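To make the "which stage broke, and who did it" problem concrete, here is a rough sketch of stage-level structured events that carry a run ID and an actor (human or agent). Everything in it (`stage`, `first_failed_stage`, the `EVENTS` list, the actor labels) is hypothetical and only illustrates the idea; it is not tied to any specific observability product:

```python
import json
import time
import uuid
from contextlib import contextmanager

# In-memory sink for the sketch; a real system would ship events to a log pipeline.
EVENTS = []


@contextmanager
def stage(run_id, name, actor):
    """Record one workflow stage with its outcome, duration, and who ran it."""
    start = time.monotonic()
    record = {"run_id": run_id, "stage": name, "actor": actor, "status": "ok"}
    try:
        yield record
    except Exception as exc:
        record["status"] = "failed"
        record["error"] = repr(exc)
        raise
    finally:
        record["duration_s"] = round(time.monotonic() - start, 3)
        EVENTS.append(record)
        print(json.dumps(record))


def first_failed_stage(events, run_id):
    """Return the name of the first failed stage for a run, or None."""
    for event in events:
        if event["run_id"] == run_id and event["status"] == "failed":
            return event["stage"]
    return None


run_id = str(uuid.uuid4())
try:
    with stage(run_id, "fetch-config", actor="human:alice"):
        pass
    with stage(run_id, "apply-change", actor="agent:remediator"):
        raise RuntimeError("permission denied")
except RuntimeError:
    pass

print(first_failed_stage(EVENTS, run_id))
```

Because every event shares the same `run_id` and names its actor, the failing stage and whether an agent or a human ran it can be read from one query instead of being inferred from symptoms.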
https://redd.it/1psn5qc
@r_devops
Lewin and modern DevOps
I recently read an amazing piece by Dr. Richard Claydon called “Lewin, Rewritten: Rethinking ‘How Change Works’ for a Run / Serve / Change World”.
It explores Kurt Lewin’s change models in a modern context, and my thoughts immediately wandered into the world of DevOps.
We spend so much time talking about the "DevOps" toolchain: Kubernetes, Cloud platforms, DORA metrics. But anyone who has led a transformation knows the tools are rarely (if ever) the hard part.
The hard part is the human system.
I realized that Lewin’s 3-stage model (Unfreeze, Change, Refreeze) maps very well to the engineering challenges we face today. It explains why we hit the "J-curve" of poor performance, why "Unfreezing" habits is so hard, and why we need to rethink what "Refreezing" means in an agile world.
I’ve written up my reflections on how Lewin’s thinking applies to modern DevOps and engineering leadership here:
https://cladam.github.io/2025/12/22/lewin-and-devops/
https://redd.it/1psuvjv
@r_devops
cladam.github.io
Lewin's teachings and modern DevOps | Claes Adamsson
A personal site for projects, documentation, and thoughts on software development and delivery.
New Article: Startup CPU Boost in Kubernetes with In-Place Pod Resize
https://piotrminkowski.com/2025/12/22/startup-cpu-boost-in-kubernetes-with-in-place-pod-resize/
https://redd.it/1psuh6i
@r_devops
Piotr's TechBlog
Startup CPU Boost in Kubernetes with In-Place Pod Resize - Piotr's TechBlog
This article explains how to use the In-Place Pod Resize in Kubernetes with Kube Startup CPU Boost to speed up Java application startup.
Application-layer attacks bypassing traditional defenses
Hey all, even strong posture programs sometimes miss runtime risks like application-layer exploits, which trigger alerts only after significant damage.
This ArmoSec blog on cloud runtime threats highlights the most common runtime vectors and practical detection strategies. Have you seen runtime attacks in production? How did you detect them early?
https://redd.it/1pswoea
@r_devops
ARMO
The Real Cloud Attack Vectors to Watch in 2026- ARMO
Learn the 3 most prevalent runtime threat vectors behind modern cloud breaches: application-layer attacks, supply chain compromises, and stolen cloud identities.
Found a really clean kubectl cheat sheet with 100+ essential commands
Was looking for a simple kubectl reference that doesn’t require jumping through the docs every time.
Came across this cheat sheet that groups 100+ commonly used kubectl commands by use case — getting resources, debugging, logs, exec, contexts, namespaces, rollouts, etc.
What I liked:
- It’s task-based, not just a random command dump
- Easy to scan when you’re in the middle of debugging
- Covers the stuff you actually use day-to-day
Link:
https://www.makcloudhance.com/kubectl-cheat-sheet/
Sharing in case it helps someone else. If you know similar resources, drop them here too.
https://redd.it/1psyaqv
@r_devops
www.makcloudhance.com
Kubectl Cheat Sheet – 41 Unique Kubernetes Commands Every Admin Should Know - Part 1 -
Experiences with Agentless security (Wiz / Orca), any concerns?
Hi all,
For those of you using Agentless Cloud Security tools like Wiz or Orca, I’m curious about your experience so far.
Are you generally happy with the agentless model?
Do you have any concerns around the fact that disk snapshots are copied to the vendor’s infrastructure and scanned from there?
In particular, I’m wondering:
How comfortable are you with the data exposure / trust model?
Did this raise concerns from security, legal, or compliance teams?
Were there specific mitigations or contractual guarantees that made this acceptable?
Or is the operational simplicity worth the trade-off for you?
Not trying to argue one way or another, just looking to understand how practitioners are thinking about this in real-world environments.
Thanks!
https://redd.it/1psz2ra
@r_devops
restricting user list to those assigned to project
I'm new so sorry if this is a dumb question, but I'm getting complaints from users editing work items in the web interface:
1. Clicking in the assigned user textbox is confusing people because they expect a dropdown, and when they don't see one they assume they don't have permission to edit. There is no affordance telling them they need to type something first.
2. It searches over the entire organization. I have a project manager that says this is unacceptable, visibility needs to be restricted to those who have been assigned to the project.
There's too much search noise trying to google this so maybe someone can tell me what's going on here, if they plan to fix this or what the rationale is.
https://redd.it/1pt0vyx
@r_devops
GenAI is fun… until you try to keep it running in prod
I’ve been seeing tons of GenAI demos lately and yeah, they look great. But every time I end up thinking, okay cool, but how do you operate this thing after the demo?
Recently AWS started talking more seriously about GenAIOps.
GenAI just doesn’t behave like normal apps. Same prompt, different output. “Works” but not always right. Tokens quietly draining money. Stuff breaks in weird ways.
Funny thing is, just recently I found myself using shell scripts and multi-stage Azure DevOps pipelines to build some guardrails and ops around GenAI workflows. Not fancy, but very real. And that’s when it hit me: yeah, this absolutely needs its own ops mindset.
AWS is basically saying the same: treat prompts, models, agents like deployable artifacts. Monitor quality, not just uptime. Add safety, cost controls, evals. It’s like MLOps… but leveled up for GenAI chaos.
This feels less like hype and more like reality catching up. We’re clearly moving from GenAI experiments to GenAI systems. And systems always need ops.
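As a rough illustration of "treat prompts like deployable artifacts" and "cost controls" from the points above, here is a small sketch. The names `PromptArtifact` and `TokenBudget` are invented for this example and are not taken from the AWS post or from any Bedrock API:

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptArtifact:
    """A prompt pinned to a model, versioned by content like a build artifact."""
    name: str
    template: str
    model: str

    @property
    def version(self) -> str:
        # Any change to the template or the target model yields a new version.
        payload = f"{self.model}\n{self.template}".encode("utf-8")
        return hashlib.sha256(payload).hexdigest()[:12]


class TokenBudget:
    """Hard cap on tokens spent, checked before each call is made."""

    def __init__(self, max_tokens: int):
        self.max_tokens = max_tokens
        self.used = 0

    def charge(self, tokens: int) -> None:
        if self.used + tokens > self.max_tokens:
            raise RuntimeError(
                f"token budget exceeded: {self.used} + {tokens} > {self.max_tokens}"
            )
        self.used += tokens


prompt = PromptArtifact("summarize", "Summarize: {text}", "example-model-v1")
budget = TokenBudget(max_tokens=100)
budget.charge(60)
print(prompt.name, prompt.version, budget.used)
```

Logging the `version` next to each output makes it possible to tell which prompt revision produced a bad answer, the same way an image digest identifies a deployed build.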
Good reads if you’re curious: https://aws.amazon.com/blogs/machine-learning/operationalize-generative-ai-workloads-and-scale-to-hundreds-of-use-cases-with-amazon-bedrock-part-1-genaiops/
I hope you are happy now @mods. 😜
#AWS #GenAIOps #GenerativeAI #DevOps #MLOps #CloudEngineering
https://redd.it/1pt3b7w
@r_devops
Amazon
Operationalize generative AI workloads and scale to hundreds of use cases with Amazon Bedrock – Part 1: GenAIOps | Amazon Web Services
In this first part of our two-part series, you'll learn how to evolve your existing DevOps architecture for generative AI workloads and implement GenAIOps practices. We'll showcase practical implementation strategies for different generative AI adoption levels…
LLMs in prod: are we replacing deterministic automation with trust-based systems?
Hi,
Lately I’m seeing teams automate core workflows by wiring business logic in prompts directly to hosted LLMs like Claude or GPT.
Example I’ve seen in practice:
A developer says in chat that a container image is ready, the LLM decides it’s safe to deploy, generates a pipeline with parameters, and triggers it. No CI guardrails, no policy checks, just “the model followed the procedure”.
This makes me uneasy for a few reasons:
• Vendor lock-in at the reasoning/decision layer, not just APIs
• Leakage of operational knowledge via prompts and context
• Loss of determinism: no clear audit trail, replayability, or hard safety boundaries
I’m not anti-LLM. I see real value in summarization, explanation, anomaly detection, and operator assistance. But delegating state-changing decisions feels like a different class of risk.
Has anyone else run into this tension?
• Are you keeping LLMs assistive-only?
• Do you allow them to mutate state, and if so, how do you enforce guardrails?
• How are you thinking about this from an architecture / ops perspective?
Curious to hear how others are handling this long-term.
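One possible shape for the guardrail question above: the LLM only proposes a structured action, and a deterministic policy function decides and writes an audit record. This is a sketch under assumptions; `ProposedAction`, `evaluate`, `gate`, the registry prefix, and the environment names are all hypothetical:

```python
import json
import time
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ProposedAction:
    """What the model is allowed to emit: a proposal, never a direct trigger."""
    kind: str
    environment: str
    image: str


# Hypothetical policy values, for illustration only.
ALLOWED_REGISTRIES = ("registry.example.com/",)
AUTO_APPROVE_ENVS = {"dev", "staging"}


def evaluate(action: ProposedAction) -> tuple[bool, str]:
    """Deterministic checks: same input always gives the same decision."""
    if action.kind != "deploy":
        return False, "unsupported action kind"
    if not action.image.startswith(ALLOWED_REGISTRIES):
        return False, "image not from an allowed registry"
    if "@sha256:" not in action.image:
        return False, "image must be pinned by digest"
    if action.environment not in AUTO_APPROVE_ENVS:
        return False, "environment requires human approval"
    return True, "all policy checks passed"


def gate(action: ProposedAction, audit_log: list) -> bool:
    """Decide and record the decision so it can be replayed later."""
    allowed, reason = evaluate(action)
    entry = {
        "ts": time.time(),
        "action": asdict(action),
        "allowed": allowed,
        "reason": reason,
    }
    audit_log.append(json.dumps(entry, sort_keys=True))
    return allowed


audit = []
proposal = ProposedAction("deploy", "prod", "registry.example.com/app@sha256:abc")
print(gate(proposal, audit), audit[-1])
```

The model can still draft the proposal, but the state change only happens if `gate` returns True, and each decision leaves a replayable record.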
https://redd.it/1pt3xw5
@r_devops
Teleport!
I recently did a POC on Teleport as an intern, mainly around Kubernetes access, databases, and auditing. It feels like a pretty powerful “all-in-one” access layer, so I’m curious about real-world usage beyond the obvious basics. For folks using Teleport in production: what’s the most interesting or non-obvious use case you’ve implemented? I’d love to hear scenarios that are practical from a DevOps engineer’s POV.
https://redd.it/1pt2enr
@r_devops
AWS IAM for Startup Teams: Autonomy Without Chaos
We had developers blocked on infra emails for basic AWS provisioning because no one trusted IAM permissions.
I wrote about how we moved from “infra as a bottleneck” to developer autonomy using permission boundaries, without handing out admin access.
Would love feedback from folks who’ve solved (or struggled with) this in their orgs.
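For readers new to permission boundaries: in IAM, a boundary does not grant anything by itself; an action is effectively allowed only if both the identity policy and the boundary allow it. Below is a deliberately simplified toy model of that intersection. It ignores explicit denies, resources, conditions, and SCPs, and the function names are made up for this sketch:

```python
from fnmatch import fnmatchcase


def matches(action: str, patterns: list[str]) -> bool:
    """True if the action matches any pattern; '*' works as a wildcard."""
    action = action.lower()
    return any(fnmatchcase(action, pattern.lower()) for pattern in patterns)


def effectively_allowed(action: str, identity_allow: list[str],
                        boundary_allow: list[str]) -> bool:
    """Allowed only when BOTH the identity policy and the boundary allow it."""
    return matches(action, identity_allow) and matches(action, boundary_allow)


# A developer role with a broad identity policy, capped by a boundary.
identity_policy = ["*"]
boundary = ["s3:*", "dynamodb:*", "lambda:*"]

for action in ("s3:GetObject", "iam:CreateUser"):
    print(action, effectively_allowed(action, identity_policy, boundary))
```

That intersection is what lets developers create their own roles and policies without being able to exceed the cap set by the boundary.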
Link: https://medium.com/aws-in-plain-english/how-i-designed-an-aws-permissions-model-that-gave-developers-autonomy-without-losing-control-d50d03ca2a1d?sk=3d1d0ad4b5e3eb2c8a94cdb41f7f6a65
https://redd.it/1pt7gfk
@r_devops
Medium
How I Designed an AWS Permissions Model That Gave Developers Autonomy Without Losing Control
I joined a startup as an Engineering Manager, inheriting a team of about ten developers who were just beginning their journey with AWS. The…
First experience
Hello :D,
I've been in my first DevOps role for 3 months now, and I wanted to ask: what was your first experience like?
I used to be a developer with 2 years of experience, and I’m curious about how it felt for you when you started.
Right now I honestly feel really bad at it—I make a lot of silly mistakes and I’m starting to get discouraged. How did things go for you in the beginning?
https://redd.it/1pt9ug6
@r_devops
Suggestions on training.
Hi,
I've worked as a sysadmin for the past 15 years, always in the Linux world, initially with Red Hat and more recently with the Debian family. I've learned the main parts of AWS, GCP, and Terraform, and I also have recent experience with Git and GitHub (Actions, CI/CD). I have an intermediate understanding of Python and networking.
The project I was working on has ended, and I'd like to hear your suggestions on what I should study to stay current.
https://redd.it/1ptau5o
@r_devops
Automations inside mid-size DevOps for non technical users
Hey everyone,
I’ve talked to a lot of non-technical people working within DevOps teams, especially at smaller companies, and I keep seeing the same pain points come up when it comes to automating workflows:
- Tools like Zapier or n8n are tough to maintain. If someone builds a workflow and then leaves the team, it turns into a black box, especially for teammates without a technical background.
- A lot of automation lives outside the team’s main communication tools like Slack or Teams, which makes it feel disconnected and awkward to trigger or adjust in context.
- There’s usually very little visibility into what an automation is actually doing unless you dig into it, which makes trust and debugging harder.
We’ve been working on something in this area that focuses on natural-language-driven, context-aware automations that live directly inside tools like Slack, Discord, or Google Teams, so even non-technical users can trigger, review, and tweak automations from where they already work.
I’m still trying to gather more feedback and get some opinions:
What’s been your experience with automation tools in small or mid-size DevOps teams?
What’s worked well, and what hasn’t?
https://redd.it/1ptc6gh
@r_devops
I built khaos - a Kafka traffic simulator for testing, learning, and chaos engineering
Just open-sourced a CLI tool I've been working on. It spins up a local Kafka cluster and generates realistic traffic from YAML configs.
Built it because I was tired of writing throwaway producer/consumer scripts every time I needed to test something.
It can simulate:
- Consumer lag buildup
- Hot partitions (skewed keys)
- Broker failures and rebalances
- Backpressure scenarios
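To make the "hot partitions (skewed keys)" scenario concrete, here is a minimal sketch of why a skewed key distribution overloads one partition. It is not taken from khaos and does not use its YAML format; the partition count, the key names, and the `partition_for` and `simulate` helpers are all hypothetical, and `zlib.crc32` is only a stand-in for Kafka's default murmur2-based partitioner.

```python
import random
import zlib
from collections import Counter

NUM_PARTITIONS = 6  # hypothetical topic size, chosen for illustration


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition using a stable hash.

    Kafka's default partitioner hashes keys with murmur2; crc32 is a
    stand-in here so the sketch needs only the standard library.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def simulate(num_messages: int, hot_key_share: float, seed: int = 0) -> Counter:
    """Count messages per partition when one key gets `hot_key_share` of traffic."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    counts = Counter()
    for _ in range(num_messages):
        if rng.random() < hot_key_share:
            key = "tenant-hot"  # hypothetical dominant key
        else:
            key = f"tenant-{rng.randrange(1000)}"  # long tail of other keys
        counts[partition_for(key)] += 1
    return counts


if __name__ == "__main__":
    # Compare an even key distribution with one where a single key dominates.
    for share in (0.0, 0.8):
        counts = simulate(10_000, share)
        print(f"hot_key_share={share}:", [counts[p] for p in range(NUM_PARTITIONS)])
```

Because every message with the same key hashes to the same partition, the partition that owns the dominant key receives most of the traffic while the others stay close to idle, which is the condition a hot-partition test is meant to reproduce.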
Also works against external clusters with SASL/SSL if you need that.
Repo: https://github.com/aleksandarskrbic/khaos
What Kafka testing scenarios do you wish existed?
---
Install instructions are in the README.
https://redd.it/1pte4o9
@r_devops
How do I not waste my time in school?
I am a network engineer by trade, working in consulting. I was fortunate to get into this position, but as time goes on I'd like to move to the platform engineering side, since I want to build systems other than network infrastructure.
Now, I know I can't just snap my fingers and hop over, so at 28 I am pursuing my bachelor's in software engineering (specifically WGU's BS and MS program; I plan to go for their master's in DevOps once I finish my bachelor's). I am happy to finally be at a point in life where I can earn a degree.
How should I spend my time while in school to be in the best position to land at least a junior platform engineer role? I'm pretty unsure how to go about building a portfolio, connecting with people already in DevOps, and finding any other extracurriculars I can leverage to get in. I appreciate any insight you folks might have, or your own experience getting into the field.
https://redd.it/1ptf665
@r_devops
Traditional devops experience thought
So I don't use cloud as a primary part of my job. I do use it occasionally as a tool. I do an astronomical amount of automation for build and deploy. I am about to spend about 8 months standing up a front end for my automation to make centralized signing and deployment much more user friendly.
However, I feel like my career at my current company is heading into the sunset. I just don't have much passion for mobile applications, there isn't a lot of room for me to grow into anything else, and the depth of expertise the role already demands is a lot further than I wanted to go.
The problem is that I don't have a lot of Kubernetes experience. So I was thinking about creating a portfolio website that is essentially just a site that monitors its own infrastructure and gives a visual representation of the automation.
However, I don't know if that's a worthwhile exercise. I've had a hard time getting interviews lately, even though I am a significant contributor at my current company, which is on the Fortune 200 list.
I know the hiring landscape is kind of bad right now, and I honestly don't know if a personal project would even help me get hired, as it seems like I'm competing with thousands of people who have the traditional DevOps experience.
But I can do a lot: mobile application architecture, standing up a web app on a small scale. I've been on the governance board for AI adoption in medical applications, and I have completely reworked a really old mobile application pipeline. When I first came to this company they had 400 Bash scripts and over 10,000 lines of code that handled all of their mobile application signing. The guy who wrote the system intentionally did not document it, so that it ensured his employment.
In the last 2 years I have fully documented the process and become a subject matter expert in my own right for mobile application signing and deployment. I've entirely rewritten his tool to move off of Jenkins and onto GitLab, and positioned it to be deployed into the cloud if that is ever necessary.
I have also trained an entire team of business analysts to handle every aspect of the mobile release process that isn't technical. I feel like I have overcome a lot, and I feel like my resume doesn't do me justice. Because I was so pigeonholed into this shit hole of a team (that is now amazing), I've kind of stunted my growth.
I could develop and architect solutions like this on a whim, very easily, but at the same time nobody's going to let me touch their hybrid infrastructure because I don't have enough experience in the cloud. I don't know if you guys have any advice.
https://redd.it/1ptdb0o
@r_devops
Anyone using Linear? I've got a couple 1-year coupons lying around.
I ended up with a few unused Linear 1-year credits from a deal I got earlier this month. I don't need all of them anymore, and they'll expire soon, so I figured I'd pass them on to people who want to improve their project + task workflow.
Linear really streamlined my planning + daily workflow. Instead of letting the credits expire, I'd rather give them to people who will actually use them to stay organized and ship faster.
If you want one, just comment "interested" or DM me and I'll send details.
https://redd.it/1ptekn0
@r_devops