Azure Engineer or SRE: which has more of a future?
I am a fresh grad with 1 year of working experience (including an internship) as a backend developer. I am really interested in cloud and DevOps. I recently got two interviews: Azure Engineer and SRE. I wonder which path has more of a future?
The Azure Engineer position basically focuses on IaaS, deployment, and writing Terraform. They said I might get the chance to do CI/CD pipelines in the future... I am wondering whether this is a good path toward a Cloud Engineer / DevOps Engineer role, because they also mentioned it is very easy to pick up... I am afraid it is just a simple deployment job. But they did mention that I will design infrastructure etc., and that there is a lot to learn.
Or is SRE better? Which path has more of a future?
Hoping to hear opinions from you all 🙏🏻
https://redd.it/1pcrhhd
@r_devops
List of 50 top companies in 2025 that hire DevOps engineers!
https://devopsprojectshq.com/role/top-devops-companies-2025/
https://redd.it/1pcwt6t
@r_devops
DevOps Projects
Top 50 DevOps Companies Hiring DevOps Engineers in 2025
Data-driven ranking of companies hiring DevOps engineers based on active job openings. Find your next DevOps career opportunity.
Is Continuous Exposure Management the true SecDevOps endgame?
We talk a lot about "Shift Left," but the reality is security findings often hit the CI/CD pipeline late, or they are generated by a vulnerability scanner that doesn't understand the context of the running application.
I'm looking at this idea of Exposure Management, which seems like the natural evolution of SecDevOps/SRE practices. It forces security to be integrated and continuous, covering the entire lifecycle: code repos, cloud configurations, deployed application, and user identity. The goal is to continuously assess risk, not just find flaws.
If you are running a mature SecDevOps pipeline, how are you ensuring that security findings from different tools (SAST, DAST, CSPM, etc.) are unified and prioritized to show a single, clear measure of risk, rather than just raw vulnerability counts?
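Not an answer, but as a rough illustration of what "unified and prioritized" can look like in practice, here is a minimal Python sketch that maps findings from different scanners into one schema and ranks them by a context-weighted score instead of raw counts. All field names, weights, and sample data are illustrative assumptions, not any specific tool's output.

```python
# Hypothetical sketch: normalize findings from different scanners into one
# schema and rank them by a context-weighted risk score instead of raw counts.
from dataclasses import dataclass

SEVERITY = {"low": 1, "medium": 4, "high": 7, "critical": 10}

@dataclass
class Finding:
    source: str          # e.g. "sast", "dast", "cspm"
    asset: str           # repo, cloud account, or deployed service
    severity: str        # normalized severity bucket
    internet_facing: bool
    exploit_known: bool

def risk_score(f: Finding) -> float:
    """Combine severity with exposure context."""
    score = float(SEVERITY[f.severity])
    if f.internet_facing:
        score *= 1.5     # reachable assets matter more
    if f.exploit_known:
        score *= 2.0     # known exploits jump the queue
    return score

def prioritize(findings: list[Finding]) -> list[tuple[float, Finding]]:
    return sorted(((risk_score(f), f) for f in findings), key=lambda t: -t[0])

if __name__ == "__main__":
    sample = [
        Finding("sast", "payments-repo", "high", internet_facing=False, exploit_known=False),
        Finding("cspm", "prod-account/public-bucket", "medium", internet_facing=True, exploit_known=True),
    ]
    for score, f in prioritize(sample):
        print(f"{score:5.1f}  {f.source:5s}  {f.asset}")
```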
https://redd.it/1pcxb2x
@r_devops
Devops tool builder
Hi. I have 7+ years of DevOps experience and have been building SaaS products for a while. I want to contribute to the DevOps community. Is there any tool that would help DevOps folks? I thought about incident management and auto-resolution, but some companies are already doing those, and AWS also announced its AWS DevOps Agent today. Is there any part of the daily work life of DevOps, SRE, and sysadmin roles that's often overlooked by DevOps tooling companies, with or without AI?
https://redd.it/1pcxab8
@r_devops
Switching to a product-based company
Question on programming languages and switching to developer role
Just a general question: in product-based companies, does the programming language have to be OOP-based, or is Go fine? Consider both interviews and regular day-to-day work.
The thing is, I have almost 15 years of experience, have never coded in my life, and recently picked up Go. I know it will take a lot of time to develop the skillset, considering I won't have practical exposure. Still, a few questions if anyone can help.
1) I know I can never match, or at least get an entry into, MAANG/FAANG or whatever. But will there be a chance at other product companies?
I don't know how tough the struggle will be in their day-to-day work.
2) In interviews, if I choose Go with no idea about classes or OOP, will that be a reject?
3) I know that at this stage system design etc. is expected, but again I don't think I can understand it unless I have practical exposure. If I am ready to lower my designation, will that be OK?
https://redd.it/1pd033f
@r_devops
Transparently and efficiently forward connection to container/VM via load balancer
**TLDR**: How can my load balancer efficiently and transparently forward an incoming connection to a container/VM in Linux?
**Problem**: For educational purposes, and maybe to write a patch for liburing in case some APIs are missing, I would like to learn how to implement a load balancer capable of scaling a target service from zero to hero. LB and target services are on the same physical node.
I would like for this approach to be:
* **Efficient**: as little memory copying as possible, as little CPU utilization as possible
* **Transparent**: the target service should not understand what's happening
I looked at systemd socket activation, but it seems it can only scale from 0 to 1 and does not handle further scaling. Also, the socket hand-off code felt a bit hard to follow, but maybe I'm just a noob.
**Current status**: After playing a bit I managed to do this either efficiently or transparently, but not both. I would like to do both.
The load balancer process is written in Rust and uses `io_uring`.
**Efficient approach**:
* The LB binds to a socket and fires a multishot accept
* On client connection, the LB performs some business logic to decide which container should handle the incoming request
* If the service is scaled to zero, it fires up the first container
* If the service is overloaded, it fires up more instances
* It passes the socket file descriptor to the container via `sendmsg`
* The container receives the FD and fires a multishot receive to handle incoming data
This approach is VERY efficient (no memory copying, very little CPU usage), but the receiving process needs to be aware of what's happening in order to receive and correctly handle the socket FD.
Let's say I want to run an arbitrary Node.js container; then this approach won't work.
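As a minimal illustration of the FD hand-off this approach relies on (the post's LB is Rust + io_uring; this only demonstrates the underlying SCM_RIGHTS mechanism), here is a Python sketch, assuming Python 3.9+ on Linux:

```python
# Pass an accepted connection's FD to another process over a Unix socket.
# Illustrative only; the real LB in the post uses Rust + io_uring.
import os
import socket

def lb_side(listener: socket.socket, to_worker: socket.socket) -> None:
    conn, _ = listener.accept()                       # LB accepts the client
    socket.send_fds(to_worker, [b"fd"], [conn.fileno()])
    conn.close()                                      # worker now owns a dup of the FD

def worker_side(from_lb: socket.socket) -> None:
    _, fds, _, _ = socket.recv_fds(from_lb, 1024, 1)
    client = socket.socket(fileno=fds[0])             # adopt the passed FD
    client.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
    client.close()

if __name__ == "__main__":
    lb_to_worker, worker_from_lb = socket.socketpair(socket.AF_UNIX)
    listener = socket.create_server(("127.0.0.1", 8080))
    if os.fork() == 0:                                # child plays the "container"
        lb_to_worker.close()
        worker_side(worker_from_lb)
        os._exit(0)
    worker_from_lb.close()
    lb_side(listener, lb_to_worker)
```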
**Transparent approach**:
* The LB binds to a socket and fires a multishot accept
* On client connection, the LB performs some business logic to decide which container should handle the incoming request
* If the service is scaled to zero, it fires up the first container
* If the service is overloaded, it fires up more instances
* The LB connects to the container and fires a multishot receive
* Incoming data gets sent to the container via zero-copy send
This approach is less efficient because:
* The receiving container copies the data once (but this happens in the efficient case too)
* We double the number of active connections: for each connection between client and LB, we have a connection between LB and service
The advantage of this approach is that the receiving service is not aware of what's happening.
**Questions**:
* What can I use to efficiently forward the connection from the LB to the container? Some kind of pipe?
* Is there a way to make the container think there is a new accept event, even though the connection was already accepted, and without opening a new connection between the LB and the container?
* If the connection is TCP, can I exploit the fact that both the LB and the container are on the same physical node and use some kind of lightweight protocol? For example, I could use Unix domain sockets, but then the target app would have to be aware of this, breaking transparency
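On the "some kind of pipe" question: one Linux option is splice(2) through an intermediate pipe, which moves bytes between the two sockets in-kernel without copying them through userspace, although it still requires the second LB-to-container connection. A rough sketch of the relay loop, assuming Python 3.10+ on Linux:

```python
# In-kernel relay between two already-connected sockets via splice(2) and a
# pipe (Linux only). Keeps the LB->container hop out of userspace, but the
# second connection still exists.
import os
import socket

CHUNK = 1 << 16

def splice_relay(src: socket.socket, dst: socket.socket) -> None:
    pipe_r, pipe_w = os.pipe()
    try:
        while True:
            moved = os.splice(src.fileno(), pipe_w, CHUNK)       # socket -> pipe
            if moved == 0:                                       # peer closed
                break
            while moved:
                moved -= os.splice(pipe_r, dst.fileno(), moved)  # pipe -> socket
    finally:
        os.close(pipe_r)
        os.close(pipe_w)
```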
https://redd.it/1pd14dz
@r_devops
Transitioning from Software Engineer to DevOps
Hello everyone.
In recent years I have been working as a software engineer with a specialization in backend and now I want to make a transition to the field of DevOps.
As a developer I use a lot of common tools such as CI/CD, Docker, and Python, but unfortunately my day-to-day work doesn't cover all the tooling (I don't work with the cloud at all), so I have to learn everything myself through independent projects.
Moreover, there are more jobs in DevOps than in software development and they can pay better, which is one of the reasons I want to make the transition.
I use AI a lot for the topics and terms I need to know and, of course, to learn how things work.
Has anyone made this transition before?
What jobs should I aim for? I was thinking about mid-level roles.
Any tips that could help?
Thank you.
https://redd.it/1pd1yi2
@r_devops
Salt Typhoon: When State-Sponsored Hackers Infiltrate Telecom Infrastructure 📡
https://instatunnel.my/blog/salt-typhoon-when-state-sponsored-hackers-infiltrate-telecom-infrastructure
https://redd.it/1pd3z0z
@r_devops
InstaTunnel
Salt Typhoon: How State-Sponsored Hackers Breached US Telecom
Explore how the Salt Typhoon cyber-espionage campaign infiltrated major U.S. telecom providers to geolocate users and intercept communications
eBPF for the Infrastructure Platform:
How Modern Applications Leverage Kernel-Level Programmability
New white paper from the eBPF Foundation
https://ebpf.foundation/new-state-of-ebpf-report-explores-how-modern-infrastructure-teams-are-building-on-kernel-level-programmability/
https://redd.it/1pd516j
@r_devops
So, what do you guys think of the new AWS DevOps Agents?
According to AWS, the agent can identify, investigate, and even “resolve” incidents based on monitoring alerts, significantly reducing the number of incident responses required by an actual DevOps person.
I personally think it’s still a long shot to fully resolve incidents for larger organizations because they have resources spread across multiple clouds, on‑prem servers, and all the complexity that involves. These kinds of agents might be useful as an additional layer of monitoring by acting as a third eye on all the monitoring and observability tools an organization has.
https://aws.amazon.com/devops-agent/
Full article about the Frontier agents, which include a Developer Agent (Kiro), a Security Agent, and a DevOps Agent: https://www.aboutamazon.com/news/aws/amazon-ai-frontier-agents-autonomous-kiro?utm_source=ecsocial&utm_medium=linkedin&utm_term=36
https://redd.it/1pd481u
@r_devops
Amazon
Frontier agent – AWS DevOps Agent – AWS
AWS DevOps Agent is a frontier agent that resolves and proactively prevents incidents, continuously improving reliability and performance.
A Technical Look at Why Moving Back to a Monolith Saved $1,200/Month: Real Benchmarks
A client came to us with a backend that appeared modern but operated poorly. The system relied on 12 microservices, 12 deployments, and 12 separate repositories, which collectively created multiple points of failure. Despite the complexity, the platform was slow, expensive, and increasingly difficult to maintain. Their monthly infrastructure cost had reached $1,900, and each CI/CD cycle required 27 minutes for a single release.
After reviewing logs, traffic patterns, and operational behavior, it became clear that the architecture itself was creating the instability. We consolidated the entire system into a single, modular Node.js and Express monolith running on PM2 and Docker.
The results were immediate:
- Infrastructure cost was reduced from $1,900 to $700 per month
- Latency (P95) improved from 240 ms to 38 ms
- CI/CD time decreased from 27 minutes to 8 minutes
- Deployment failures dropped from 6 per month to 0-1 per month
- Debugging time dropped from hours to minutes
The experience highlighted a recurring pattern. Microservices often address organizational scale rather than early product requirements. In this case, a well-structured monolith delivered significantly better performance, lower overhead, and greater operational stability.
This outcome further reinforced a broader operational reality that complexity tends to accumulate faster than value when microservices are introduced before the system or the team genuinely requires them.
Much of the instability observed in this project stemmed not from technical limitations but from architectural decisions made prematurely.
This serves as a reflection on how architectural choices, when made too early, can introduce operational burdens that ultimately hinder system resilience and efficiency.
https://redd.it/1pd70n9
@r_devops
Remote team laptop setup automation - we automate everything except new hire laptops
DevOps team that prides itself on automation. Everything is infrastructure as code:
- Kubernetes clusters: Terraform
- Database migrations: Automated
- CI/CD pipelines: GitHub Actions
- Monitoring: Automated alerting
- Scaling: Auto-scaling groups
- Deployments: Fully automated
New hire laptop setup: "Here's a list of 63 things to install manually, good luck!"
A new DevOps engineer started Monday. It's Friday afternoon and they're still configuring their local environment:
- Docker (with all the WSL complications)
- kubectl with multiple cluster configs
- terraform with authentication
- AWS CLI with MFA setup
- Multiple VPN clients for different environments
- IDE with company plugins
- SSH key management across services
- Local databases for development
- Language version managers
- Company security tools
We can provision entire production environments in 12 minutes but can't ship a laptop ready to work immediately?
This feels like the most obvious automation opportunity in our entire tech stack. Why are we treating developer laptop configuration like it's 2010 while everything else is cutting-edge automated infrastructure?
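For what it's worth, even turning the checklist into a manifest plus a checker is a step toward treating the laptop like the rest of the stack. A minimal sketch, with hypothetical tool names and probes rather than the actual 63 items:

```python
# Minimal sketch: treat the laptop checklist as data. The manifest below is a
# hypothetical example, not the post's actual list of required tools.
import shutil
import subprocess

REQUIRED = {
    "docker": ["docker", "--version"],
    "kubectl": ["kubectl", "version", "--client"],
    "terraform": ["terraform", "-version"],
    "aws": ["aws", "--version"],
}

def check() -> list[str]:
    missing = []
    for name, probe in REQUIRED.items():
        if shutil.which(probe[0]) is None:
            missing.append(name)
            continue
        subprocess.run(probe, capture_output=True, check=False)  # sanity probe only
    return missing

if __name__ == "__main__":
    gaps = check()
    print("missing:", ", ".join(gaps) if gaps else "nothing - laptop is ready")
```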
https://redd.it/1pda7cv
@r_devops
Two weeks ago I posted my weekend project here. Yesterday nixcraft shared it.
Today I'm writing this as a thank you.
Two weeks ago I posted my cloud architecture game to r/devops and r/webdev.
Honestly, I was hoping for maybe a couple of comments. Just enough to understand if anyone actually cared about this idea.
But people cared. A lot.
You tried it. You wrote reviews. You opened GitHub issues. You upvoted. You commented with ideas I hadn't even thought of. And that kept me going.
So I kept building. I closed issues that you guys opened. I implemented my own ideas. I added new services. I built Sandbox Mode. I just kept coding because you showed me this was worth building.
Then yesterday happened.
I saw that nixcraft - THE nixcraft - reposted my game. I was genuinely surprised. In a good way.
250 stars yesterday morning. 1250+ stars right now. All in 24 hours.
Right now I'm writing this post as a thank you. Thank you for believing in this when it was just a rough idea. Thank you for giving me the motivation to keep going.
Because of that belief, my repository exploded. And honestly? It's both inspiring and terrifying. I feel this responsibility now - I don't have the right to abandon this. Too many people believed in it.
It's pretty cool that a simple weekend idea turned into something like this.
Play: https://pshenok.github.io/server-survival
GitHub: https://github.com/pshenok/server-survival
Thank you, r/devops and r/webdev. You made this real.
https://redd.it/1pdcmom
@r_devops
GitHub
GitHub - pshenok/server-survival: Tower defense game that teaches cloud architecture. Build infrastructure, survive traffic, learn…
Tower defense game that teaches cloud architecture. Build infrastructure, survive traffic, learn scaling. - pshenok/server-survival
How do you guys handle very high traffic?
I've come across a case where there are usually 10-15k requests per minute; during normal spikes it goes up to 40k req/min. But today, for some reason, I encountered huge spikes of 90k req/min multiple times. The servers that handle requests are in an auto-scaling group, and it scaled up to 40 servers to match the traffic, but there were also lots of 5xx and 4xx errors while it scaled up.
The architecture is as follows:
AWS WAF → ALB → Auto Scaling EC2
Some of the requests are not that important to us, meaning they can be processed later (slowly).
I need senior-level architecture suggestions to handle this better.
We considered containerization, but at the moment the app is tightly coupled to a local Redis server. Each server needs to have a Redis server and PHP Horizon.
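This is not the poster's architecture, just an illustration of the "can be processed later" point: one common pattern is to accept deferrable requests cheaply and push them onto a queue (e.g. SQS) so the web tier does less work during a spike. A sketch with a hypothetical queue URL and payload shape:

```python
# Illustrative only: defer the "not that important" requests by enqueueing
# them instead of processing them inline during a spike.
import json
import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.eu-west-1.amazonaws.com/123456789012/deferred-work"  # hypothetical

def handle_request(payload: dict, deferrable: bool) -> dict:
    if deferrable:
        sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=json.dumps(payload))
        return {"status": 202, "body": "accepted"}   # processed later by workers
    return {"status": 200, "body": process_now(payload)}

def process_now(payload: dict) -> str:
    return "done"  # placeholder for the real synchronous path
```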
https://redd.it/1pddcp8
@r_devops
Building a Platform for Provisioning Infrastructure on Different Clouds - Side Project
Hello, I hope everyone is doing well. These days I have free time because my job is very relaxed, so I decided to build a platform similar to an internal developer platform. It's just a side project to polish my skills, because I want to learn platform engineering; I am a DevOps engineer. I have a question for all platform engineers: if you were going to build such a platform, how would you design the architecture? My current stack is:
Casbin - for RBAC
Pulumi - for infrastructure provisioning
FastAPI - backend API
React - frontend
Celery + Redis - handling multiple jobs
PostgreSQL - for the database
For cloud provider authentication I am using an identity provider setup to exchange tokens automatically, so there is no need to store service accounts.
I need suggestions on the mistakes people make when building a platform and how to avoid them. Is my current stack good, or does it need to change?
Thanks everyone.
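As a rough sketch of how the listed pieces might wire together (FastAPI accepts the request, Celery runs the long provisioning job off the request path, and Pulumi's automation API applies the stack), with all names and the placeholder program being hypothetical:

```python
# Rough wiring sketch only: keep provisioning out of the HTTP request path.
# Broker URL, project name, and the trivial program are hypothetical.
import pulumi
from pulumi import automation as auto
from celery import Celery
from fastapi import FastAPI

api = FastAPI()
jobs = Celery("platform", broker="redis://localhost:6379/0")

def placeholder_program():
    # A real program would declare cloud resources here.
    pulumi.export("note", "resources would be declared here")

@jobs.task
def provision(stack_name: str):
    stack = auto.create_or_select_stack(
        stack_name=stack_name, project_name="platform-demo", program=placeholder_program
    )
    return stack.up(on_output=print).summary.result

@api.post("/environments/{name}")
def create_environment(name: str):
    provision.delay(name)               # don't block the HTTP request
    return {"status": "provisioning", "stack": name}
```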
https://redd.it/1pddqo0
@r_devops
My first website
It's basically what the title says: I created my first website (with help from AI) and wanted to get feedback. If you find the idea interesting, feel free to make a donation at the bottom of the page. Note that the site is not completely finished yet; there are still some bugs to fix.
Link: jgsp.me
https://redd.it/1pdi42e
@r_devops
Setup to deploy small one-off internal tools without DevOps input?
So,
Our DevOps guy is flooded and has become the bottleneck for deploying anything new. My team would like to be able to deploy one-off web apps to AWS without his input, as they are not mission-critical (prototypes, ideas, internal tools), but right now it takes weeks to make that happen.
I'm thinking: if we had an EKS cluster for handling these little web apps, is there a setup in which, along with the web-app code, we could include the k8s config YAML for the app and have a CI/CD script (we're using Bitbucket) that picks up this k8s config and deploys to EKS?
Hopefully not involving the poor DevOps guy, and making my team more independent while remaining secure in our VPC.
We had a third party vibe-code a quick app and deploy it to Vercel, which breaks company data privacy for our clients, not to mention the security concerns. But it's a use case we've been told we need to cater to...
Has anyone done something like this?
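The mechanical part of the pipeline step can be as small as fetching a kubeconfig for the cluster and applying whatever manifests ship with the repo. A hedged sketch wrapping the AWS CLI and kubectl, where the cluster name, region, and manifest path are hypothetical:

```python
# Sketch of the deploy step the post describes: pull kubeconfig for the EKS
# cluster, then apply the k8s manifests committed alongside the app code.
import subprocess

CLUSTER = "internal-tools"      # hypothetical EKS cluster name
REGION = "eu-west-1"            # hypothetical region
MANIFEST_DIR = "k8s/"           # manifests committed next to the app code

def deploy() -> None:
    subprocess.run(
        ["aws", "eks", "update-kubeconfig", "--name", CLUSTER, "--region", REGION],
        check=True,
    )
    subprocess.run(["kubectl", "apply", "-f", MANIFEST_DIR], check=True)

if __name__ == "__main__":
    deploy()
```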
https://redd.it/1pdio9p
@r_devops
Using ClickHouse for Real-Time L7 DDoS & Bot Traffic Analytics with Tempesta FW
Most open-source L7 DDoS mitigation and bot-protection approaches rely on challenges (e.g., CAPTCHA or JavaScript proof-of-work) or static rules based on the User-Agent, Referer, or client geolocation. These techniques are increasingly ineffective, as they are easily bypassed by modern open-source impersonation libraries and paid cloud proxy networks.
We explore a different approach: classifying HTTP client requests in near real time using ClickHouse as the primary analytics backend.
We collect access logs directly from [Tempesta FW](https://github.com/tempesta-tech/tempesta), a high-performance open-source hybrid of an HTTP reverse proxy and a firewall. Tempesta FW implements zero-copy per-CPU log shipping into ClickHouse, so the dataset growth rate is limited only by ClickHouse bulk ingestion performance - which is very high.
[WebShield](https://github.com/tempesta-tech/webshield/), a small open-source Python daemon:
* periodically executes analytic queries to detect spikes in traffic (requests or bytes per second), response delays, surges in HTTP error codes, and other anomalies;
* upon detecting a spike, classifies the clients and validates the current model;
* if the model is validated, automatically blocks malicious clients by IP, TLS fingerprints, or HTTP fingerprints.
To simplify and accelerate classification — whether automatic or manual — we introduced a new TLS fingerprinting method.
WebShield is a small and simple daemon, yet it is effective against multi-thousand-IP botnets.
The [full article](https://tempesta-tech.com/blog/defending-against-l7-ddos-and-web-bots-with-tempesta-fw/) with configuration examples, ClickHouse schemas, and queries.
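To give a flavour of the "periodically executes analytic queries" part, here is a minimal Python sketch that polls ClickHouse for the request rate and flags a spike against a trailing baseline. The table and column names are invented and will not match Tempesta FW's actual schema:

```python
# Illustrative spike check against ClickHouse; access_log and timestamp are
# invented names, not Tempesta FW's real access-log schema.
import time
import clickhouse_connect

client = clickhouse_connect.get_client(host="localhost")

def requests_last(seconds: int) -> int:
    sql = (
        "SELECT count() FROM access_log "
        f"WHERE timestamp > now() - INTERVAL {int(seconds)} SECOND"
    )
    return int(client.query(sql).result_rows[0][0])

def spike_detected(window: int = 10, baseline_window: int = 600, factor: float = 5.0) -> bool:
    current_rps = requests_last(window) / window
    baseline_rps = max(requests_last(baseline_window) / baseline_window, 1.0)
    return current_rps > factor * baseline_rps

if __name__ == "__main__":
    while True:
        if spike_detected():
            print("traffic spike - run the classification / blocking step")
        time.sleep(10)
```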
https://redd.it/1pdd2lm
@r_devops
GitHub
GitHub - tempesta-tech/tempesta: Web application acceleration, advanced DDoS protection and web security
Web application acceleration, advanced DDoS protection and web security - tempesta-tech/tempesta
YAML pipeline builder
Is there such a thing as a GUI to at least scaffold multi-stage pipelines? I'm building some relatively simple ones, and it seems to me a GUI would have been able to do what I need.
The Azure DevOps classic builder does a pretty good job, but it only works within a single job.
https://redd.it/1pdluxy
@r_devops
Deep dive into the top command — useful for performance debugging
Explained how to use `top` for real-time performance insights, sorting, and debugging.
The full tutorial can be found at https://youtu.be/vNoRFvAm52s
https://redd.it/1pdln1l
@r_devops
YouTube
How System Monitoring Can Supercharge Your Performance
Welcome to our in-depth guide on the `top` command in Linux! If you're looking to monitor system performance, manage processes, and keep an eye on resource usage in real-time, then mastering the `top` command is essential.
**In this video, you'll learn:**…