Any CI/CD tools in the wind today?
I have been trying to find a simple tool for handling git-based deployments, and I am not super happy with GitHub. It does the core tasks fine, but I want a dedicated tool.
I have been testing Coolify and it works fine, yet I feel it is not aimed directly at CI and CD and is more of a Portainer clone, but I might be wrong.
Can anyone recommend alternatives that support CI, CD, and test management?
I’m open to self-hosted or paid options (but not at enterprise prices).
They should be GUI tools, as I want them to be team friendly.
https://redd.it/1o2q8wo
@r_devops
Reddit
From the devops community on Reddit
Explore this post and more from the devops community
How can we track PRs and merges efficiently with monday dev?
We integrated GitHub with monday dev to automatically update task status when PRs merge. How do other dev teams handle tracking PRs without switching between multiple tools?
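For teams wiring this up by hand instead of using a built-in integration, the merge-detection step might look roughly like the sketch below. It relies on GitHub's `pull_request` webhook payload, where a merge arrives as `action == "closed"` with `pull_request.merged` set to true. The `status_for_event` helper and the status names are hypothetical, and pushing the status into monday dev is deliberately left out.

```python
import json
from typing import Optional


def status_for_event(event_name: str, payload: dict) -> Optional[str]:
    """Map a GitHub webhook delivery to a task status, or None to ignore it.

    event_name is the value of the X-GitHub-Event header.
    The status strings are illustrative placeholders, not monday dev values.
    """
    if event_name != "pull_request":
        return None
    action = payload.get("action")
    pr = payload.get("pull_request", {})
    if action == "opened":
        return "In review"
    if action == "closed":
        # A closed PR is only "done" if it was actually merged.
        return "Done" if pr.get("merged") else "Closed without merge"
    return None


if __name__ == "__main__":
    body = json.loads('{"action": "closed", "pull_request": {"merged": true}}')
    print(status_for_event("pull_request", body))
```

Keeping the mapping in one pure function makes it easy to unit test before connecting it to any board.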
https://redd.it/1o2tdzo
@r_devops
Help with task tracking for development teams?
I’ve been using monday dev for about a month. It’s been great for our dev workflow, but I’d love to know if anyone has tips for better task tracking?
https://redd.it/1o2tnxp
@r_devops
4600 stars: the story of our open-source agent!
Hey devops 👋
I wanted to share the journey behind a wild couple of days building Droidrun, our open-source agent framework for automating real Android apps.
We started building Droidrun because we were frustrated: everything in automation and agent tech seemed stuck in the browser. But people live on their phones and apps are walled gardens. So we built an agent that could actually tap, scroll, and interact inside real mobile apps, like a human.
A few weeks ago, we posted a short demo: no pitch, just an agent running a real Android UI. Within 48 hours:
* We hit [4600+ GitHub Stars](https://github.com/droidrun/droidrun)
* Got devs joining our Discord
* Landed on the radar of investors
* And closed a $2M+ funding round shortly after
What worked for us:
* We led with a real demo, not a roadmap
* Posted in the right communities, not product forums
* Asked for feedback, not attention
* And open-sourced from day one, which gave us credibility + momentum
We’re still in the early days, and there’s a ton to figure out. But the biggest lesson so far:
Don’t wait to polish. Ship the weird, broken, raw thing. If the core is strong, people will get it.
If you’re working on something agentic, mobile, or just bold, I’d love to hear what you’re building too.
AMA if helpful!
https://redd.it/1o2urs8
@r_devops
www.droidrun.ai
Droidrun - The First Native Mobile Agent
Give AI native control of mobile apps and phones. Automate mobile workflows and unlock data from any app
Best project management tools for developer teams?
We have looked at Asana, Trello, and monday dev so far. monday dev was more usable for dev teams than Trello, but I’m curious what others think. Any underrated free tools you’d recommend?
https://redd.it/1o2var6
@r_devops
Self healing PRs: Bots and AI agents working together to deal with infosec toil
Keeping dependencies updated with bots like Renovate is a great practice but it can lead to lots of PRs to review and fix. What if this was done with AI coding agents?
We answered this question in my team by adding a build step to "fix the code", and the results were as positive as they were surprising. It led to a more general question: what if any pull request in your repository could fix itself as part of the build pipeline?
This is the full story: https://www.elastic.co/search-labs/blog/ci-pipelines-claude-ai-agent
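The general shape of such a "fix the code" step can be sketched as a bounded retry loop. This is an illustration under stated assumptions, not the pipeline from the linked post: `run_build` and `ask_agent_to_fix` are hypothetical callables standing in for the real build command and the AI agent invocation.

```python
from typing import Callable, Tuple


def self_healing_build(
    run_build: Callable[[], Tuple[bool, str]],
    ask_agent_to_fix: Callable[[str], None],
    max_attempts: int = 3,
) -> bool:
    """Run the build; on failure, hand the log to the agent and retry.

    Returns True if any of at most max_attempts builds passes.
    Both callables are hypothetical stand-ins supplied by the caller.
    """
    for attempt in range(1, max_attempts + 1):
        ok, log = run_build()
        if ok:
            return True
        # Only ask for a fix if there is another build left to verify it.
        if attempt < max_attempts:
            ask_agent_to_fix(log)
    return False


if __name__ == "__main__":
    # Fake build that fails until the "agent" has been called once.
    state = {"fixed": False}

    def fake_build() -> Tuple[bool, str]:
        return state["fixed"], "" if state["fixed"] else "compile error"

    def fake_agent(log: str) -> None:
        state["fixed"] = True

    print(self_healing_build(fake_build, fake_agent))
```

Capping the attempts matters here: an agent that cannot fix the build should fail the pipeline instead of looping forever.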
https://redd.it/1o2ths0
@r_devops
Elasticsearch Labs
CI/CD pipelines with agentic AI: How to create self-correcting monorepos - Elasticsearch Labs
How our team introduced GenAI into CI pipelines to create self-correcting pull requests, automating the update of hundreds of dependencies in large monorepos
How can small dev teams reduce context switching using monday dev?
We consolidated GitHub, Slack, and email notifications in monday dev boards to reduce distractions. How do other teams keep workflows smooth without hopping between apps?
https://redd.it/1o2xbmm
@r_devops
How are you validating backend performance before every deploy?
We started running custom load tests on our backend with every merge. If no tests exist, we generate them from OpenAPI and recent traffic logs. Our pipeline reports P95 latency and error rate and can hold rollout for approval if thresholds are breached. This helped cut failed production rollouts by 60 percent.
How are you gating backend releases or generating traffic scenarios for new services?
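A gate of the kind described above could be sketched like this. It is a minimal example, assuming the load test already produced a list of latencies and an error count. The threshold defaults (300 ms, 1%) and the function names are hypothetical, and the P95 uses the simple nearest-rank definition.

```python
from typing import List, Tuple


def p95(latencies_ms: List[float]) -> float:
    """Nearest-rank 95th percentile of the samples."""
    if not latencies_ms:
        raise ValueError("no latency samples")
    ordered = sorted(latencies_ms)
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def gate(
    latencies_ms: List[float],
    errors: int,
    total: int,
    max_p95_ms: float = 300.0,      # hypothetical threshold
    max_error_rate: float = 0.01,   # hypothetical threshold
) -> Tuple[bool, str]:
    """Return (passed, reason); a failed gate should hold the rollout."""
    observed_p95 = p95(latencies_ms)
    error_rate = errors / total if total else 0.0
    if observed_p95 > max_p95_ms:
        return False, f"P95 {observed_p95:.0f} ms exceeds {max_p95_ms:.0f} ms"
    if error_rate > max_error_rate:
        return False, f"error rate {error_rate:.2%} exceeds {max_error_rate:.2%}"
    return True, "within thresholds"


if __name__ == "__main__":
    passed, reason = gate([120.0, 140.0, 500.0, 130.0], errors=0, total=4)
    print(passed, reason)
```

In a pipeline, a failed gate would typically map to a non-zero exit code or a manual approval step.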
https://redd.it/1o33fyv
@r_devops
What should I focus on to switch to devops
Hi everyone,
I've been working as an SRE for a few months, but it's just an ops role in a large organisation where I am siloed.
I also have a few years of experience as a cloud sysadmin with a focus on AWS, plus other sysadmin and support roles, but I feel like I'm losing my skills in my current role.
So I'd like to ask for advice on tools, areas, and projects I could focus on to improve my chances of getting a shot at a DevOps role.
https://redd.it/1o367ou
@r_devops
I inherited a problem and need your advice
The company I work for has 6 custom websites that are hosted by a relatively small hosting company (~10 employees). This company also serves as our DevOps. They control everything after our GitHub account. This includes managing Cloudflare, which is used to help with security and performance, particularly its firewall and caching.
A decision was made before I got involved that this vendor would own the Cloudflare account. I'm honestly not sure what the reason was, but our websites' Cloudflare licenses are within their company-wide account. We've been told that we cannot have visibility into the account or share access for security reasons, partly because we would see the instances of their other clients, but also because it's a safety precaution to not allow devs to meddle in DevOps. Our devs have no interest in doing DevOps, but they often need to look at logs to debug issues, which they can't do right now. I'm also concerned about portability if our relationship with this vendor sours.
So, I'm stepping into this situation thinking we should absolutely own and control the Cloudflare account that contains the licenses that our websites depend on. We don't have control of or visibility into this part of our stack. I'm looking for advice on whether I'm looking at this from the right perspective. I'm also interested in hearing what the industry best practices are for a client/vendor relationship in terms of ownership, control, and visibility. Thank you
https://redd.it/1o37sv7
@r_devops
Requesting Recommendations: AI CLI Agent for DevOps/SRE Workflow (Warp/Gemini-CLI alternatives?)
Hey everyone, I'm trying to level up my terminal game with an AI CLI agent and I'm a total noob. I'm a DevOps/SRE guy, so my job is basically a mix of:
* **25% Coding:** Python, Go, shell scripts.
* **50% CLI Hell:** Heavy `kubectl`, `aws cli`, `terraform`, and diving into logs/configs to troubleshoot.
* **25% Think Tank:** Architecting stuff, writing docs, and runbooks.
I've been playing with `gemini-cli` and `Warp`, and they're clutch for troubleshooting—the ability for the AI to read a giant `kubectl describe` or a tricky log file to diagnose an issue is a lifesaver.
But I know I'm barely scratching the surface. I need the community's brainpower!
**Quick Questions for the Experts:**
1. **What else is out there?** Besides `gemini-cli`, `qwen`, and `Warp`, what other **agentic CLI tools** are you using? Any good open-source or local-first options (`Aider`, `Claude Code CLI`, etc.) that crush it for infrastructure work?
2. **Multi-Model Setup:** I hate vendor lock-in. I assume `gemini-cli` is Google-only. What are the best CLI agents that let you swap models easily (Gemini, X.ai, Claude, OpenAI, or even Ollama for local models)?
3. **VSCode Terminal Flow:** Can I get this same deep, context-aware utility using something like **Cline in VSCode**? Or is a dedicated terminal like Warp still better for the full experience?
4. **Warp Pro:** I saw a thread (link in comments/PM) mentioning a $56/year deal for Warp Pro. Could that be a scam? What do you think?
Thanks in advance for any insights.
https://redd.it/1o39bur
@r_devops
Career Advice for junior platform engineer
I'm fresh out of college and landed a platform engineering role.
I was completely new to the "ops" side of the development cycle.
I was trained for 2 months on AWS, K8s, Linux, and Docker.
Six months into the job, I still find I have lots of learning to do, but I cannot find the time to do it.
I'm still expected to finish the task, which sometimes involves a technology or framework I'm completely unfamiliar with.
And to solve an issue, most of the time you need knowledge of the application and how the infra is set up to support it.
While I can understand the infra side, I don't know the application side, and I find myself asking my seniors silly questions, which feels dumb six months into the job.
So I overthink simple tasks and take too much time completing them, since I spend a lot of time trying to learn or understand the tech or the task itself.
FYI, the product I work on is complex, and fully getting to know how it works might take me months.
Any advice on how I can do my job better from here on?
What should I focus on, and what is a realistic goal at this point?
I still want to be useful to my team and wish to get over this HUGE learning curve ASAP.
https://redd.it/1o3cx5a
@r_devops
Anyone else feel like the AI crowd is mostly JS people?
Every conference I watch, like OpenAI's etc., has people showcasing stuff in TypeScript. Every training I participated in had people showcasing how fast they can bootstrap a JS project in React, Angular, or Vue.
All of them sit in VSCode pumping out the next 4000-star GH project that does as much as a single terminal command.
They move so fast that none of them even asks "does it even make sense?" Who cares, ship it, let's make some money.
In DevOps I'm struggling to find a real use case for non-deterministic agents. We had one for monitoring, but once in a blue moon it decided it was a good idea to restart services while the issue was transient, causing more harm than good.
Any time I bootstrap a k8s operator, I have to refactor the whole project, even when using a pretty strict instructions.md.
When refactoring I still get method calls that don't even exist. That's with GPT-5.
Dunno if I'm too old and stupid or the hype is too much, driven by people who don't even care Oo
https://redd.it/1o3c2fd
@r_devops
Need some advice regarding role change
I am a system admin working mostly on Linux, the Citrix suite, and a little bit of networking and WebSphere. I am trying to move to DevOps or cloud ops. I have some course-level knowledge of DevOps tools. I'm getting a few interview calls for roles that require only Linux and networking, but they sound like totally customer-facing roles where I would troubleshoot issues that customers encounter. Right now, my role involves deployments, app support, and on-call rotations. Would it be bad for my career to move to a supposedly customer-facing support role? The pay would definitely be 2x or 3x what I'm making currently, as I'm still a junior. Thoughts, please.
https://redd.it/1o3c10r
@r_devops
5 Years of Development Experience... to Write YAML?
It's surprising how many DevOps/SRE roles require 5+ years of software development experience and include LeetCode-style interviews, when in reality you're most likely going to be writing YAML, Terraform, or Python scripts.
Would love to hear others' experiences. Do people actually do professional software development in these roles? At that point, doesn’t the role just become a standard software engineering position?
P.S. On a side note, would you count writing custom glue code and TypeScript/Python scripts as software development experience?
P.P.S. The title may read as sarcastic, but I'm just trying to navigate the job market and am frustrated with the job requirements.
https://redd.it/1o3fwe1
@r_devops
Homelabs and DevOps related experience.
Hello everyone. I’ve been browsing this sub for similar questions. I gathered some valuable information but want to dig a little deeper.
Basically, I just want to know which projects would be great to have in your own home lab so you can practice and even showcase them on your GitHub account.
What can reinforce sysadmin/SRE/DevOps-related knowledge? Or… is it even worth it in the professional world?
I have some sysadmin experience, but it was so long ago that I do not even feel comfortable in Linux tech interviews.
I’m from Colombia and not sure how similar things are in your countries. Anyway, any information will be appreciated.
https://redd.it/1o3id7q
@r_devops
Finding git base branch
While coding, I often need to know: from which base branch did I create this feature branch? This Bash script answers that question instantly, and it is useful in automation as well as in my daily dev workflow.
What can be improved further?
Link to the script code
Author Credit: Abhishek, SDE II at RudderStack
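One common heuristic for this problem, which is not necessarily the approach the linked script takes, is to compare the current branch against a list of candidate branches and pick the one whose fork point is the fewest commits behind `HEAD`. The sketch below uses the real `git merge-base` and `git rev-list --count` commands; the helper names and the candidate list are hypothetical, and the candidates should not include the feature branch itself.

```python
import subprocess
from typing import Dict, List, Optional


def _git(*args: str) -> str:
    """Run a git command in the current directory and return trimmed stdout."""
    result = subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def commits_since_fork(candidate: str, head: str = "HEAD") -> int:
    """Number of commits on head since it diverged from candidate."""
    fork_point = _git("merge-base", head, candidate)
    return int(_git("rev-list", "--count", f"{fork_point}..{head}"))


def closest_branch(distances: Dict[str, int]) -> Optional[str]:
    """Pick the candidate with the smallest distance; ties go to name order."""
    if not distances:
        return None
    return min(sorted(distances), key=lambda name: distances[name])


def find_base_branch(candidates: List[str]) -> Optional[str]:
    """Guess the base branch among candidates such as ["main", "develop"]."""
    return closest_branch({c: commits_since_fork(c) for c in candidates})
```

Calling `find_base_branch(["main", "develop"])` inside a repository returns the likelier parent. The heuristic can be wrong when a branch was rebased or when two candidates share the same fork point.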
https://redd.it/1o3j66q
@r_devops
Gist
This will find the immediate parent/base branch. You need to make at least one commit for it to work efficiently. - findBaseBranch.sh
Laptop for DevOps
Cloud services cost a lot, and the worst part is, you don’t even own the machine.
Initially, building a desktop PC appeared to be a cost-effective option. However, after accounting for additional expenses such as a UPS (due to frequent power outages), a monitor, and other peripherals, a laptop proves to be a better value in my situation.
The second-hand market is a trap in Nepal.
Earlier I had a 7th-generation i5 laptop with 16GB RAM. It would start to cry whenever I ran more than three virtual machines. The host OS was Windows 10 and the guest OS was Rocky Linux minimal inside Hyper-V/VirtualBox. And I would like to keep it that way.
Thus I will require 32GB RAM.
And a solid processor should be non-negotiable. But I am not sure which processor would be the best value for money, i.e. give me the highest ROI for the smallest increase in budget.
My budget is around 500 US dollars or 65000 INR. That is 100K NPR (the Nepal price after tax and the like, not the conversion value). I cannot go beyond that because I do not have any more savings. (Currently unemployed)
https://redd.it/1o3mwiz
@r_devops
Every Monday our dev server dies and I have to ping DevOps to restart 😩 — anyone else deal with this?
I’m working at a small SaaS startup.
Our dev & staging environments (on AWS EC2) randomly go down — usually overnight or early morning.
When I try to test something in the morning, I get the lovely “This site can’t be reached”.
Then I Slack our DevOps guy — he restarts the instance, and it magically works again.
It happens like 3–4 times a week, wasting 20–30 mins each time for me + QA.
I was thinking of building a small tool to automatically detect and restart instances (via AWS SDK) when this happens.
Before I overthink —
👉 does anyone else face this kind of recurring downtime in dev/staging?
👉 how do you handle it? (auto scripts, CloudWatch, or just manual restart?)
Curious if it’s common enough that a small self-healing tool could actually be useful.
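A rough sketch of what such a watchdog could look like, assuming an HTTP health URL per environment and credentials allowed to reboot the instance. The function names and the three-strikes threshold are hypothetical. `boto3` is a third-party dependency imported lazily, and `reboot_instances` only helps if the instance is still running (a stopped instance would need `start_instances` instead).

```python
import urllib.error
import urllib.request
from typing import Tuple


def is_healthy(url: str, timeout: float = 5.0) -> bool:
    """True if the URL answers with a 2xx/3xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # Covers connection refused, DNS failures, timeouts, and HTTP errors.
        return False


def watchdog_step(failures: int, healthy: bool, threshold: int = 3) -> Tuple[int, bool]:
    """Update the consecutive-failure counter.

    Returns (new_failure_count, should_restart). Requiring several failures
    in a row avoids restarting on a transient blip.
    """
    if healthy:
        return 0, False
    failures += 1
    if failures >= threshold:
        return 0, True
    return failures, False


def restart_instance(instance_id: str, region: str) -> None:
    """Reboot one EC2 instance (requires boto3 and AWS credentials)."""
    import boto3  # third-party; imported here so the rest runs without it

    ec2 = boto3.client("ec2", region_name=region)
    ec2.reboot_instances(InstanceIds=[instance_id])
```

A scheduler (cron, a small loop with `time.sleep`) would call `is_healthy`, feed the result into `watchdog_step`, and call `restart_instance` when it returns `True`. CloudWatch alarms can also trigger an EC2 reboot action directly, and either way it is worth finding out why the instance goes down, since an automatic restart only hides the cause.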
https://redd.it/1o3nzcs
@r_devops
How can monday dev help run daily standups without meetings?
We set up boards and automations so updates happen asynchronously. What strategies have other dev teams used to make standups faster and more effective?
https://redd.it/1o3psa8
@r_devops
Trixter: A Chaos Proxy for Simulating Network Faults
Hey folks 👋
I’ve just published a post about **Trixter**, a high-performance chaos proxy written in Rust for simulating unreliable networks in CI/CD or staging environments.
Unlike Linux `tc netem`, it runs entirely in user space (no root, no kernel modules), and you can tweak network faults dynamically via a REST JSON API: latency, throttling, loss, terminations, corruption, etc.
Example use:
$ docker run --network host ghcr.io/brk0v/trixter \
  --listen 0.0.0.0:8080 \
  --upstream 127.0.0.1:3000 \
  --api 127.0.0.1:8888 \
  --delay-ms 300 \
  --slice-size-bytes 128 \
  --terminate-probability-rate 0.01
💡 Run tests with random seeds, and if something fails — extract the seed from logs and reproduce the chaos locally.
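To illustrate why logging the seed makes a failure replayable, here is a small generic sketch. It is not Trixter's implementation or API: `fault_plan` is a hypothetical function that derives a sequence of injected faults from a seed, so the same seed always yields the same sequence.

```python
import random
from typing import List, Tuple


def fault_plan(
    seed: int,
    n_requests: int,
    delay_ms: int = 300,
    terminate_probability: float = 0.01,
) -> List[Tuple[str, int]]:
    """Derive a deterministic list of (fault_kind, delay_ms) from a seed."""
    rng = random.Random(seed)  # private generator, unaffected by global state
    plan = []
    for _ in range(n_requests):
        if rng.random() < terminate_probability:
            plan.append(("terminate", 0))
        else:
            plan.append(("delay", rng.randint(0, delay_ms)))
    return plan


if __name__ == "__main__":
    seed = random.randrange(2**32)
    print(f"chaos seed: {seed}")  # log it so a failing run can be replayed
    first = fault_plan(seed, 20)
    replay = fault_plan(seed, 20)
    print(first == replay)
```

Because every random decision flows from one seeded generator, a failing CI run can be reproduced locally by feeding the logged seed back in.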
Full post with architecture, comparison to `tc netem`, and reproducible chaos setup here: https://biriukov.dev/posts/trixter-chaos-proxy/
https://redd.it/1o3rkri
@r_devops
Viacheslav Biriukov
Trixter: A Chaos Proxy for Simulating Network Faults
Posted: Oct 2025 Github: https://github.com/brk0v/trixter Contents
Chaos Engineering and Network Fault Injection Introducing Trixter – A Chaos Monkey for TCP Why Trixter vs GNU/Linux tc netem (Kernel Network Emulator) Using Trixter: Examples of Injecting…