Domain monitoring tool - looking for feedback/advice!
Hi guys!
For the past few months now I've been working on a little tool that routinely monitors the WHOIS/RDAP data, DNS records and the SSL status of domains. If any of this changes, you'll get a little email immediately letting you know.
I would really appreciate feedback on any aspect of the project, whether that's the landing page or something inside the app itself.
It doesn't have any ghastly AI features (nor does it need them!) and has only been worked on by me, so I'm pretty eager for feedback.
You can find the project here: https://domainwarden.app
Thank you so much for any feedback! I do appreciate it. :)
https://redd.it/1p5q7y3
@r_devops
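For readers curious what "email me when anything changes" might boil down to, here is a minimal sketch of the comparison step. It is not Domainwarden's implementation: the snapshot shape, the field names (`registrar`, `ns`, `tls_not_after`) and the helper `diff_snapshots` are all made up for illustration, and fetching the WHOIS/RDAP, DNS and certificate data is left out entirely.

```python
from typing import Dict, List, Tuple

# A snapshot is a flat mapping of field name -> value (hypothetical shape).
Snapshot = Dict[str, str]


def diff_snapshots(old: Snapshot, new: Snapshot) -> List[Tuple[str, str, str]]:
    """Return (field, old_value, new_value) for every field that differs."""
    changes = []
    for field in sorted(set(old) | set(new)):
        before = old.get(field, "<absent>")
        after = new.get(field, "<absent>")
        if before != after:
            changes.append((field, before, after))
    return changes


# Hypothetical snapshots taken on two consecutive checks.
previous = {
    "registrar": "Example Registrar",
    "ns": "ns1.example.net",
    "tls_not_after": "2026-01-01",
}
current = {
    "registrar": "Example Registrar",
    "ns": "ns2.example.net",
    "tls_not_after": "2026-01-01",
}

for field, before, after in diff_snapshots(previous, current):
    print(f"{field}: {before} -> {after}")
```

A non-empty result would be the trigger for a notification; an empty list means nothing changed between the two checks.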
domainwarden.app
Domainwarden - Domain Monitoring
Stay ahead of domain expirations, SSL issues, and DNS changes with Domainwarden's domain monitoring.
Small but useful DevOps project: CPU usage monitor in Bash (alerts + logs)
Exploring small automation ideas. Built a Bash-based CPU monitor with thresholds + logging.
Tutorial: https://youtu.be/nVU1JIWGnmI
Source code: https://github.com/Abhilashchauhan1994/bash\_noscripts/blob/main/cpu\_usage.sh
Please review it and share any suggestions that would make it better.
https://redd.it/1p5jrkl
@r_devops
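For comparison with the Bash script in the post above, here is a rough Python sketch of the same idea (CPU usage from two samples, plus a threshold check). It is not a port of the linked script: the function names are hypothetical, the 80% default threshold is an assumption, and the two sample lines are invented. It assumes the Linux `/proc/stat` layout, where the first line starts with `cpu` followed by tick counters.

```python
from typing import Tuple


def parse_cpu_line(line: str) -> Tuple[int, int]:
    """Return (idle_ticks, total_ticks) from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Columns: user nice system idle iowait irq softirq steal
    # (guest time is already included in user/nice, so it is skipped).
    ticks = [int(value) for value in fields[1:9]]
    idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)
    return idle, sum(ticks)


def cpu_usage_percent(first: str, second: str) -> float:
    """CPU busy percentage between two samples of the 'cpu' line."""
    idle_a, total_a = parse_cpu_line(first)
    idle_b, total_b = parse_cpu_line(second)
    total_delta = total_b - total_a
    if total_delta <= 0:
        return 0.0
    return 100.0 * (1.0 - (idle_b - idle_a) / total_delta)


def check_threshold(usage: float, threshold: float = 80.0) -> str:
    """Format one log line; 80% is an assumed default, not a recommendation."""
    level = "ALERT" if usage >= threshold else "OK"
    return f"{level} cpu={usage:.1f}% threshold={threshold:.1f}%"


# Invented samples standing in for two reads of /proc/stat taken a moment apart.
sample_a = "cpu  100 0 100 800 0 0 0 0 0 0"
sample_b = "cpu  190 0 110 900 0 0 0 0 0 0"
print(check_threshold(cpu_usage_percent(sample_a, sample_b)))
```

On a real Linux box the two samples would come from reading the first line of `/proc/stat` twice with a short sleep in between, and the log line would be appended to a file instead of printed.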
YouTube
YOU WON'T BELIEVE How Easy Monitoring CPU Usage Is with Bash
In this step-by-step guide, you’ll learn how to create a CPU Usage Monitor Script using Bash scripting, one of the most powerful tools for Linux automation.
Whether you’re just starting out or looking to enhance your DevOps and Linux skills, this Bash Scripting…
Upcoming interview, what to expect?
This is my first ever interview for a DevOps (Associate) role; I want to transition from SQA/automation.
What should I expect in these weird times we are living in?
https://redd.it/1p60srm
@r_devops
Reddit
From the devops community on Reddit
Explore this post and more from the devops community
Which metrics are most reliable?
Recently I noticed there is always a difference between the EC2 instance utilization metrics (CPU, memory) and the ones provided by the New Relic agent.
I want to keep only one of them in New Relic and base alerts and decisions on that alone.
Any insights on which is more reliable?
https://redd.it/1p656av
@r_devops
Words of new CEO - "Why hire seniors when a single junior with AI can do the work of seniors"
It's silly how the tide has turned in IT because of AI.
Besides offshoring to cheaper countries, AI seems to be the new way to push people to do more and more with fewer staff on board.
The CEO said he literally sees zero reasons to hire for senior roles now. GPT seems to be at a level good enough to replace all of them. AI agents replaced all of our less senior testers, the support call centre was replaced by an AI call centre, and senior devs were fired and replaced with 1/10 the number of juniors with AI at hand.
Funny thing is the company did not slow down; rather, releases got faster, the number of issues decreased and overall customer satisfaction went up.
Sad days to be someone continuing an IT journey without AI :/
On the other hand - amazing news for senior people in less expensive countries.
“This looks like the times when whole floors of switchboard operators were replaced by a few technicians maintaining automated systems.”
https://redd.it/1p669oq
@r_devops
Need real-world CI/CD issues
Hi, I know CI/CD pipelines and how to set them up, but I need to know what kind of real-world issues companies run into when implementing CI/CD. It could be caching issues, long-running pipelines, or anything else. I need someone to explain them well enough that I can replicate the same thing in my homelab and explore it further.
I would appreciate it if people could share their insights on this one.
https://redd.it/1p65gf3
@r_devops
Why do project-management refugees think a weekend AWS course makes them engineers?
Project-management refugees wandering into tech like they can just cosplay engineering for a weekend is beyond insulting. Years grinding through real systems, debugging at 3 a.m., tearing down and rebuilding your own understanding of how machines behave – all of that gets flattened by someone who thinks an AWS bootcamp slapped on top of zero technical substrate makes them your peer. They drain the fun out of the craft, flatten the discipline, and then act confused when they faceplant the moment anything non-clickops appears. The arrogance isn’t just annoying; it’s a contamination of the field by people who never respected it in the first place.
https://redd.it/1p68qzc
@r_devops
anyone else feel like ai tools are either quiet helpers or complete chaos?
i’ve been messing around with a bunch of these ai coding tools lately, and honestly some of them feel like they’re trying way too hard. a few of the agent-style ones start touching files i didn’t even bring up. cool demos, scary in real projects.
the ones that actually stick for me are the calmer ones that stay in lane like aider when i need clean multi-file edits, windsurf or cursor when i want a simple plan instead of a magic trick, and cosine whenever i’m lost in a big repo and need to follow the logic across a bunch of files. i’ve tried tabnine and continue dev too, but they’re hit or miss depending on the day.
curious if anyone else is going through this, what tools ended up becoming part of your routine, and which ones did you quietly uninstall because they made more mess than progress?
https://redd.it/1p69elx
@r_devops
testing platforms with actual AI (not just marketing fluff) do they exist?
Every vendor pitch i sit through now mentions "AI powered" something but when you dig into it, it's just basic automation with maybe a chatgpt integration slapped on top.
I'm looking for a test automation platform that actually uses AI in meaningful ways, like understanding user intent, adapting to ui changes without breaking, generating test scenarios from app exploration, that kind of stuff. Not just keyword matching or basic ml.
We're running a pretty standard ci/cd pipeline with github actions, about 300 tests across ui and api. Current setup is playwright which works fine but maintenance is brutal. Every release we spend half a day fixing tests that broke due to ui changes.
Has anyone actually used an ai test automation platform that delivered on the promises? Or is this all just next gen marketing speak for the same old stuff?
Genuinely curious because if the tech is there i want to try it, but i'm not interested in another "revolutionary" tool that's just selenium with extra steps.
https://redd.it/1p6a3zp
@r_devops
Failing every DevOps interview, need help
Hey everyone, I’m going through a tough phase and could really use some advice from this community.
I was laid off on 10th October 2025, and since then I’ve been actively interviewing for DevOps roles. It’s been a little over 2 months now, but I keep failing interviews. Some rounds feel like they go well, yet I still end up rejected, and I’m honestly not sure where I’m falling short.
I’ve been practicing Jenkins, Git, Linux, AWS basics, Terraform, CI/CD pipelines, and doing hands-on labs, but I feel like something is still missing, either in my preparation or in the way I communicate during interviews.
If anyone here has been through something similar or is currently working in DevOps, I’d really appreciate any guidance. What should I focus on the most?
How do you approach DevOps interviews?
Any good resources/labs/mock interview groups to improve?
What helped you break into your first DevOps job?
Any help or honest feedback would mean a lot. Thanks in advance.
https://redd.it/1p6bwsk
@r_devops
AI-Powered Attack Automation: When Machine Learning Writes the Exploit Code 🤖
https://instatunnel.my/blog/ai-powered-attack-automation-when-machine-learning-writes-the-exploit-code
https://redd.it/1p6b944
@r_devops
InstaTunnel
AI-Powered Attack Automation: Machine Learning is Transformi
Discover how cybercriminals use AI to automate exploit generation, polymorphic malware, and phishing campaigns. Learn why adaptive, learning-based attacks
Are Azure DevOps pipelines hard to use or is it just me?
Hello all. This one is a bit of a discussion/rant but I wanted to get some opinions on the state of Azure DevOps Pipelines versus the competitors. Have been banging my head against it just trying to do simple stuff such as having it work with combinations of static and dynamic inputs and I feel like I'm finding 1,000 ways to do it wrong and zero ways to get it working.
I think I understand the difference between compile-time and runtime parameters, but it seems incredibly difficult to find the right magic incantation to get runtime parameters to evaluate correctly, especially when using lots and lots of templates (I'm currently working at a place with an existing pipeline setup that I'm trying to amend and there are several layers of nested templates to deal with).
I've been working either directly in DevOps teams or adjacent to them for well over a decade now and have worked with TeamCity, Octopus, Jenkins and GitLab pipelines and I have never had so many headaches as I've had with Azure DevOps pipelines. Is this a common experience?
If it's not, and it's actually just down to my own lack of understanding (very possible) then can anyone recommend some good training resources?
https://redd.it/1p6e02r
@r_devops
Tools like Graphite and Coderabbit any good?
I’ve been seeing people talk about Graphite and CodeRabbit on twitter and in some YT breakdowns, but it’s hard to tell what’s hype and what’s actually useful when you’re still new to the skill.
I’m a junior backend dev and my biggest struggle is keeping PRs readable and making sure I’m not missing stuff when reviewing others’ work.
Looking for tool recommendations pls 🙏
https://redd.it/1p6ebdi
@r_devops
AWS Lambda deployments: SAM vs aws deploy
What should be used in production:
SAM or aws deploy scripts?
Since SAM does a lot of the management for you, is it OK for startups to use SAM in CI/CD?
https://redd.it/1p6fk6z
@r_devops
Asked a fresher to shut down an EC2 server… he shut down his own laptop instead
So this happened at work and I’m still laughing about it.
I told a fresher on our team to shut down an EC2 instance before he left for the day so we could save on AWS costs.
Next morning, I log in and see the server is still running.
I ask him, “Hey, did you actually shut it down?”
He nods confidently, “Yes sir, I did. I ran the shutdown command in the terminal.”
Now I’m confused, so I ask him to show me what he did.
He opens his laptop, types the shutdown command in his local terminal, hits enter… and his laptop instantly goes black. Just shuts off.
He looks at me like, “See? It works.”
https://redd.it/1p6jlfv
@r_devops
Need advice scaling YOLOv11/OpenCV warehouse analytics to ~1000 sites – edge vs centralized?
I am currently working on a computer vision analytics project, and it is now time for deployment.
The project is used for operational analytics inside warehouses.
The stack I am using is OpenCV and YOLO v11.
Each warehouse is going to have a minimum of 3 CCTV cameras.
I want to know:
Should I use a centralised server to process images in real time, or edge computing?
What is your opinion and suggestion?
If anybody has worked on something similar, could you please share how you actually did it?
Thanks in advance
https://redd.it/1p6k2b7
@r_devops
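One way to frame the edge vs centralised question above is a back-of-envelope estimate of how much video a central site would have to ingest. The sketch below uses only two figures from the post (about 1000 sites, at least 3 cameras each); the per-camera bitrate of 2 Mbit/s is purely an assumption for illustration and should be replaced with a measured value.

```python
def total_streams(sites: int, cameras_per_site: int) -> int:
    """Number of camera streams across all sites."""
    return sites * cameras_per_site


def aggregate_mbps(streams: int, mbps_per_stream: float) -> float:
    """Total ingest bandwidth in Mbit/s if every stream is sent to one place."""
    return streams * mbps_per_stream


SITES = 1000           # "~1000 sites" from the post title
CAMERAS_PER_SITE = 3   # stated minimum per warehouse
ASSUMED_MBPS = 2.0     # hypothetical per-camera bitrate; measure your own streams

streams = total_streams(SITES, CAMERAS_PER_SITE)
ingest_gbps = aggregate_mbps(streams, ASSUMED_MBPS) / 1000
print(f"{streams} streams, roughly {ingest_gbps:.1f} Gbit/s at the assumed bitrate")
```

If the resulting number is uncomfortable for a central site, that points toward running inference at the edge and shipping only detections or aggregates; if it is manageable, centralised processing stays on the table.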
Production and at scale Kubernetes learning advice
I currently manage 2 non-prod clusters and 1 prod cluster that are dedicated to my team only. I have a pretty decent setup: push-based GitOps, Helm, cluster autoscalers, HPA, Fluentd logging to CloudWatch, and a Prometheus/Grafana/Thanos stack for observability.
I'm looking for jobs that require K8s in production and at scale.
What do I have to learn and do to get to that level?
https://redd.it/1p6izwe
@r_devops
Senior Devops contractor in Zurich
Hey everyone,
Apologies if this sub is not the right one to ask, but I was wondering if anyone knows what the current daily rate is for a Senior Devops in Zurich. I am interviewing for a 'long term' contract (B2B) and relocation to Zurich is needed (I don't live in Switzerland). I was offered 700-800 CHF per day.
My suspicion, knowing the cost of living in Zurich, is that this is significantly on the lower side.
Thanks for your help!
https://redd.it/1p6nvlj
@r_devops
Job Skills to Gain
This is going to sound like a weird ask, but I am asking for some suggestions on some skills I should learn.
I’m currently a senior cloud engineer and have a lot of the tech stuff down. If it’s something new, I am good enough to put it together and leverage AI to help me fill in the gaps.
I’m looking at things that could help advance my career to architect or manager level. I was thinking about doing a communication course, but the ones I found on Udemy were super dry.
I was also thinking of data analytics, but I can't see where I would use it since I’m a consultant.
Any suggestions would be appreciated.
https://redd.it/1p6qj05
@r_devops
Stop looking at CPU usage, start looking at PSI
Simple example with two Linux servers:
Server A: CPU ~100%. Latency is low, requests are fast. Doing video encode.
Server B: CPU ~40%. API calls are timing out, SSH is lagging.
If you only look at CPU graphs, A looks worse than B. In reality A is just busy. B is the one under pressure because tasks are waiting for CPU.
I still see alerts / autoscaling rules like:
>CPU > 80% for 5 minutes
CPU% just says “cores are busy”. It does not say “tasks are stuck”.
Linux (4.20+) has PSI (Pressure Stall Information) in /proc/pressure/*. This tells you how much time tasks are stalled on CPU / memory / IO.
Example from /proc/pressure/cpu:
some avg10=0.00 avg60=5.23 avg300=2.10 total=1234567
Here avg60=5.23 means: in the last 60 seconds, tasks were stalled 5.23% of the time because there was no CPU.
For a small observability project I hack on (Linnix, eBPF-based), I stopped using load average and switched to /proc/pressure/cpu for the “is this box in trouble?” logic. False alarms dropped a lot.
Longer write-up with more details is here:
https://parth21shah.substack.com/p/stop-looking-at-cpu-usage-start-looking
Anyone here actually using PSI in prod alerts?
https://redd.it/1p6rur8
@r_devops
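To make the PSI idea above concrete, here is a small sketch that parses the text format quoted in the post and applies an "is this box in trouble?" check. It is not taken from Linnix: the function names, the choice of the `avg60` window and the 5.0 limit are assumptions for illustration only, and the sample string is the one from the post.

```python
from typing import Dict


def parse_psi(text: str) -> Dict[str, Dict[str, float]]:
    """Parse /proc/pressure/* content into {"some": {...}, "full": {...}}."""
    result: Dict[str, Dict[str, float]] = {}
    for line in text.strip().splitlines():
        kind, *pairs = line.split()
        result[kind] = {
            key: float(value)
            for key, value in (pair.split("=", 1) for pair in pairs)
        }
    return result


def under_pressure(text: str, window: str = "avg60", limit: float = 5.0) -> bool:
    """True when the 'some' stall percentage for the window exceeds the limit.

    The window and limit defaults are hypothetical; tune them per workload.
    """
    some = parse_psi(text).get("some", {})
    return some.get(window, 0.0) > limit


# Sample line quoted in the post.
sample = "some avg10=0.00 avg60=5.23 avg300=2.10 total=1234567"
print(parse_psi(sample)["some"])
print(under_pressure(sample))
```

On a kernel with PSI enabled, the same functions could be fed the contents of `/proc/pressure/cpu`, `/proc/pressure/memory` or `/proc/pressure/io` read from disk.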
Substack
Stop looking at CPU Usage. Start looking at Pressure.
Why "100% CPU" doesn't mean your server is dead.
I built an open-source tool for debugging Kubernetes with LLMs - Kubently
Hey y'all - been working on a side project and figured this community might find it useful (or tear it apart, or most likely both) and I've learned a lot just building it. I've been part of another agentic platform engineering project (CAIPE) which introduced me to a lot of the concepts so definitely grateful for that, but building this from scratch was a bigger undertaking than I think I originally intended, ha!
Full disclosure - there's lots of room for improvement and I have lots of ideas on how to make it better, but I wanted to get some community feedback on what I have so far to understand if this is something people are actually interested in or if it's a total miss.
I think it's useful as is, but I definitely built with future enhancements in mind (i.e. black box architecture/easy to swap out core agent logic/LLM/etc.) so it's not an insane undertaking when I get around to tackling them.
**Kubently** is an open-source tool for troubleshooting Kubernetes agentically - basically lets you debug clusters through natural conversation with any major LLM. The name is a play on "Kubernetes" + "agentically" if that wasn't obvious.
Why I built it: kubectl output is verbose, debugging is manual, managing multiple clusters means constant context-switching, and honestly agents debug faster than I can half the time. So I built something that fixes this.
**What it does:**
* \~50ms command delivery via SSE
* Read-only operations by default (secure by design)
* Native A2A protocol support - works with whatever LLM you're running
* Integrates with existing A2A systems like [CAIPE](https://cnoe-io.github.io/ai-platform-engineering/)
* LangGraph/LangChain
* Runs on any K8s cluster - EKS, GKE, AKS, bare metal, doesn't matter
* Multi-cluster from day one - deploy lightweight executors to each cluster, manage from a single API
Docs: [https://kubently.io](https://kubently.io)
GitHub: [https://github.com/kubently/kubently](https://github.com/kubently/kubently)
Would love feedback, bug reports, or feature requests. And if you find it useful, a star on GitHub would be awesome.
https://redd.it/1p6sld9
@r_devops
cnoe-io.github.io
Introduction | CAIPE (Community AI Platform Engineering)
What is CAIPE (Community AI Platform Engineering)