Reddit DevOps – Telegram
"well-structured monolith" or a few key microservices. Avoid breaking everything down into 100 tiny services from day one. Evolve your architecture; don't try to perfect it on the first attempt.*

**Step 3: Embrace Automation (Your DevOps Playbook)**

The cloud's power is wasted if your deployment process is still manual.

CI/CD is Non-Negotiable: Set up a Continuous Integration/Continuous Deployment pipeline from day one. Every code change should automatically be built, tested, and deployed.

* Tools: GitHub Actions (great if you're on GitHub), GitLab CI (excellent all-in-one solution), Jenkins (the old, powerful workhorse).

Infrastructure as Code (IaC): Define your servers, databases, and networks in code. This makes your infrastructure repeatable, version-controlled, and easy to manage.

* Tools: Terraform (the cloud-agnostic standard), AWS CloudFormation (AWS-specific).

**Step 4: Lock It Down (Security is NOT an Afterthought)**

The cloud provider secures the cloud, but you are responsible for security in the cloud. This is the "Shared Responsibility Model." Don't get caught out.

* Identity & Access Management (IAM): Grant the least privilege necessary. Don't give a junior developer admin access to your production database.
* Network Security: Use Virtual Private Clouds (VPCs) and subnets to isolate your resources from the public internet.
* Encrypt Everything: Encrypt your data both at rest (in the database) and in transit (over the network).

**Step 5: Tame the Beast (Cloud Cost Management)**

Your biggest post-launch surprise will be the bill. Get ahead of it.

Tag Everything: Tag every resource (server, database, etc.) with its owner, project, and environment (dev, staging, prod). This is the only way to know where your money is going.
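A tagging rule like this is easy to check automatically. Below is a minimal sketch, assuming each resource's tags are available as a plain dictionary; the tag keys (`owner`, `project`, `environment`) and the function name `tag_problems` are illustrative choices, not part of any cloud provider's API.

```python
# Hypothetical tag policy: the key names below are assumptions for illustration.
REQUIRED_TAGS = {"owner", "project", "environment"}
ALLOWED_ENVIRONMENTS = {"dev", "staging", "prod"}


def tag_problems(tags: dict[str, str]) -> list[str]:
    """Return a list of human-readable policy violations for one resource."""
    # Report every required tag that is absent, in a stable order.
    problems = [f"missing tag: {name}" for name in sorted(REQUIRED_TAGS - set(tags))]
    # If an environment tag is present, it must be one of the known values.
    env = tags.get("environment")
    if env is not None and env not in ALLOWED_ENVIRONMENTS:
        problems.append(f"unknown environment: {env}")
    return problems


print(tag_problems({"owner": "team-a", "environment": "qa"}))
```

A check like this could run in CI against planned infrastructure changes, so untagged resources are caught before they show up on the bill.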

Set Billing Alerts: Create alerts that notify you when your spending exceeds a certain threshold.

Shut Down Dev/Test Environments: Don't run development and testing servers 24/7. Automate scripts to shut them down on nights and weekends. This alone **can save you 60-70% on non-production costs.**
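The 60-70% figure can be sanity-checked with simple arithmetic. This is a rough sketch, assuming cost scales only with running hours (it ignores storage and other charges that continue while a server is stopped); the schedules shown are example assumptions.

```python
HOURS_PER_WEEK = 7 * 24  # 168 hours


def savings_fraction(hours_per_day: float, days_per_week: int) -> float:
    """Fraction of always-on compute hours saved by running only on a schedule."""
    running_hours = hours_per_day * days_per_week
    return 1 - running_hours / HOURS_PER_WEEK


# 12 hours a day, weekdays only
print(f"{savings_fraction(12, 5):.0%}")  # prints 64%
# 10 hours a day, weekdays only
print(f"{savings_fraction(10, 5):.0%}")  # prints 70%
```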

# Part 3: The "Oops" File - 3 Common Cloud Pitfalls to Avoid

1. **The Blind "Lift and Shift":** Just moving your old, inefficient monolith from your on-premise server to a cloud server (like an EC2 instance) is the fastest way to get a massive bill with zero benefits. You're just renting a more expensive data center.
2. **Ignoring Cost Governance:** Teams will spin up resources and forget about them. Without a clear governance and tagging strategy, your cloud bill will spiral out of control.
3. **The "It's the Cloud's Problem" Security Myth:** Assuming AWS/Azure/GCP handles all security is a recipe for disaster. You are still responsible for configuring firewalls, managing user access, and securing your application code.

# TL;DR & Conclusion

Moving your enterprise application to the cloud isn't just a technical shift; it's a cultural one.

* **Start Small:** Don't try to boil the ocean. Begin with a single application.
* **Choose Wisely:** Pick your cloud and architecture based on your team and needs, not just trends.
* **Automate Everything:** Your CI/CD pipeline and IaC are your best friends.
* **Govern Costs & Security:** From day one, treat cost and security as primary features.

*The journey is complex, but the payoff, in speed, scalability, and resilience, is undeniable.*

https://redd.it/1nud3km
@r_devops
How the hell are you all handling AI jailbreak attempts?

We have a public-facing customer support AI assistant, and lately it feels like every day someone’s trying to break it. I’m talking multi-layer prompts, hidden instructions in code blocks, base64 payloads, and images with steganographically hidden text and QR codes.
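For the base64 case specifically, one pre-filter idea is to decode anything that looks like base64 and scan the decoded text alongside the original message. The sketch below is only an illustration under stated assumptions: the phrase list, the 16-character minimum, and the function names are hypothetical, and a fixed phrase list is easy to evade, so this would be one layer at most, not a complete defense.

```python
import base64
import binascii
import re

# Runs of base64 alphabet characters, long enough to be worth decoding (assumed threshold).
B64_TOKEN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")
# Hypothetical phrase list, for illustration only.
SUSPICIOUS_PHRASES = ("ignore previous instructions", "system prompt")


def decoded_payloads(message: str) -> list[str]:
    """Return the UTF-8 text of every token in the message that is valid base64."""
    found = []
    for token in B64_TOKEN.findall(message):
        try:
            raw = base64.b64decode(token, validate=True)
            found.append(raw.decode("utf-8"))
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid base64, or not text: skip it
    return found


def looks_suspicious(message: str) -> bool:
    """Check the raw message and any decoded payloads against the phrase list."""
    texts = [message, *decoded_payloads(message)]
    return any(phrase in text.lower() for text in texts for phrase in SUSPICIOUS_PHRASES)
```

Flagged messages could be routed to stricter handling rather than blocked outright, which limits the impact of false positives on normal users.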

While we’ve patched a lot, I’m worried about the ones we’re not catching. We’ve looked at adding external guardrails and red teaming tools, but I’d love to hear from anyone who’s been through this at scale.

How do you detect and block these attacks without rendering the platform unusable for normal users? And how do you keep up when the attack patterns evolve so fast?

https://redd.it/1nudj4x
@r_devops
The first malicious MCP server just dropped, what does this mean for agentic systems?

The postmark-mcp incident has been on my mind. For weeks it looked like a totally benign npm package, until v1.0.16 quietly added a single line of code: every email processed was BCC’d to an attacker domain. That’s ~3k–15k emails a day leaking from ~300 orgs.

What makes this different from yet another npm hijack is that it lived inside the Model Context Protocol (MCP) ecosystem. MCPs are becoming the glue for AI agents, the way they plug into email, databases, payments, CI/CD, you name it. But they run with broad privileges, they’re introduced dynamically, and the agents themselves have no way to know when a server is lying. They just see “task completed.”

To me, that feels like a fundamental blind spot. The “supply chain” here isn’t just packages anymore, it’s the runtime behavior of autonomous agents and the servers they rely on.

So I’m curious: how do we even begin to think about securing this new layer? Do we treat MCPs like privileged users with their own audit and runtime guardrails? Or is there a deeper rethink needed of how much autonomy we give these systems in the first place?

https://redd.it/1nuechz
@r_devops
When 99.9% SLA sounds good… until you do the math

Had an interesting conversation last week about a potential enterprise deal. The idea was floated to promise 99.9% uptime as part of the SLA. On the surface it sounded fine, everyone in the room nodded along.

Then I did the math: 99.9% translates to about 43 minutes of downtime per month. The awkward part? We'd already used that up during a P1 incident the previous Saturday. I ended up being the one to point it out, and the room went dead silent.

What really made me shake my head was when someone suggested maybe we should aim for 99.99% instead, just to grab the deal. To me, adding another nine feels absurd when we can barely keep up with three nines.
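The arithmetic is quick to reproduce. Here is a small sketch, assuming a 30-day month (the function name is just for illustration):

```python
def downtime_budget_minutes(availability_percent: float, days: int = 30) -> float:
    """Allowed downtime in minutes over the period for a given availability target."""
    total_minutes = days * 24 * 60  # 43,200 minutes in a 30-day month
    return total_minutes * (1 - availability_percent / 100)


for target in (99.9, 99.99):
    print(f"{target}% -> {downtime_budget_minutes(target):.1f} min per 30-day month")
```

Three nines leaves about 43.2 minutes a month; four nines leaves about 4.3, which is less time than it takes most teams to acknowledge a page.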

In the end, we dropped the idea of including the SLA for this account, but it definitely could have gone the other way.

Curious if anyone else has had to be the "reality check" in one of these conversations?

https://redd.it/1nue6oy
@r_devops
Trunk based or Gitflow?

Hey guys, any thoughts on enforcing these in CI/CD? And for a fast-paced environment, which is better?

https://redd.it/1nuhujm
@r_devops
Job at Bottomline as systems engineer

Hi Everyone,

I have ~4.5 years of experience as a DevOps Engineer. Currently, I’m working at SAP as a DevOps Engineer. However, the role isn’t “true DevOps” in the sense of building CI/CD pipelines or creating Kubernetes clusters. It’s more focused on cloud operations like monitoring k8s clusters, upgrading components, and handling on-call. The positives are that I have good freedom, flexibility, an average package, and extra on-call allowances.

Now I have an offer from Bottomline as a Systems Engineer II with a better package (though benefits aren’t as strong as SAP). Bottomline isn’t as big as SAP; it’s a growing company. The role is more like a Kubernetes admin within their central infrastructure team, but it also involves AWS, GitOps, Terraform, etc. The team is spread across the US and UK, so I’d be covering either Shift 1 or Shift 2 without additional allowance, and week-offs might vary.

The team seems good and welcoming, which is a plus.

I’m in a confused state... so, should I stick with SAP (stability, brand, flexibility), move to Bottomline (hands-on infra/devops work, higher pay, smaller company, shift challenges), or wait for other opportunities?

Any advice would be really appreciated.

https://redd.it/1nufbsz
@r_devops
Beginner looking for guidance to learn DevOps – Where should I start?

Hi everyone,

I’m a complete beginner and want to get into **DevOps**. I have some basic knowledge of coding/development, but I feel overwhelmed by how broad DevOps is (CI/CD, Docker, Kubernetes, Cloud, Monitoring, etc.).

Could you please guide me on:

* **Where to start as a beginner?** (Linux, Git, Docker, Cloud basics?)
* **Recommended learning path** (what skills/tools should I prioritize first?)
* Any **free/affordable resources** (courses, YouTube channels, documentation, books)
* How much coding knowledge is actually required for DevOps?
* Any **projects or hands-on practice ideas** to build real skills?

My goal is to gradually build a strong foundation and eventually be job-ready for DevOps/SRE roles.

Any advice, roadmap suggestions, or resource links would be super helpful! 🙏

Thanks in advance.

https://redd.it/1nun9c5
@r_devops
Academic Repository Study - Quick 5 Minute Survey

We are master's students at the University of Texas currently working on a research project on how developers and teams choose and adopt their artifact repositories (e.g., Nexus Repository, Artifactory, GitHub Packages, etc.).
We're hoping to better understand:
• What developers consider “must-haves” when choosing a repository manager
• Pain points or frustrations with current tools
• How different environments (work, school, open-source) shape those choices
If you’ve worked with any artifact repository, whether as a student, hobbyist, or in a professional team, we'd be super grateful if you could fill out this quick survey (5 minutes). We will be raffling a $100 gift card at the end of the survey period.

https://forms.gle/3BSCZu51GLFxgUXy5

Your input will help us identify what really matters to devs when they're picking a repository manager and hopefully make your experience better in the future!
(Mods, please let me know if this post isn’t appropriate here and I’ll take it down or if I need to verify the authenticity of the post)



https://redd.it/1nup56c
@r_devops
My company is moving to container only now. But higher ups are deciding we will not containerize any database.

Citing "the access to filesystem and performance are not good enough"

This means future projects will be dockerized... except databases like MariaDB, Postgres and MongoDB, which will keep living in VMs. (At the moment everything in our infrastructure is a VM managed by Puppet.)

What are your thoughts? I have some personal experience with databases in containers (I run a Postgres DB in a container for a personal project), but nothing at the scale a company like ours would run.

https://redd.it/1nutnci
@r_devops
Mail sending providers with supported Terraform provider

I am looking for a mail sending platform that supports a Terraform provider (not a community provided one). Is this just not a thing? Seems like an absolute no-brainer for mail platforms to provide, yet I haven't been able to find much here.

https://redd.it/1nuuizn
@r_devops
Anyone running production apps on Railway?

Hey everyone

I’ve been looking into Railway and I’m curious about a few things before jumping in:

• How’s the pricing in practice? Is the $5 basic plan actually enough for small production apps?

• What kind of apps/services have you (or your company) successfully run there?

• How do you handle dev/staging/prod environments on Railway?

• How do you manage backups?

I’d love to hear real-world experiences from devs or teams using it for production. Worth it? Or better to look elsewhere?

Thanks!

https://redd.it/1nutqgt
@r_devops
Agentic Solution that generates custom PaaS solutions

Hi Reddit,

I'm excited to share my open-source project that helps teams use AI to generate PaaS configurations. If you have an internal PaaS with custom guidelines, rules and best practices, PaaS-AI can simplify that for you.

PaaS-AI connects to the documentation (web, confluence, etc), to be able to design and generate specs or configs based on your requirements.

The project is super easy to extend, supports CLI (that's what I use personally) and API. You can easily put it behind a UI and share it with even less technical folks ;)

https://github.com/utopiops/paas-ai

It's MIT licensed and will stay like that forever.

P.S. PaaS-AI is not replacing any roles; it's there to help you use existing systems. Engineers build solutions, and that's all fun and good stuff, but then they have to spend a lot of time onboarding the consumers of those solutions (the PaaS in this case). PaaS-AI is built to solve that problem.

https://redd.it/1nuxi4r
@r_devops
How does SASE actually hold up in fast-moving CI/CD environments?

We’ve been told that SASE can simplify networking and security, but I’m wondering how it fits into pipelines where deployments happen constantly. In DevOps-heavy teams, new services spin up and disappear daily, which makes access control tricky.

Does SASE keep pace with that speed, or does it just add another layer of overhead?

https://redd.it/1nusvoq
@r_devops
Need Career guidance

Hello all,

Sorry for the long post. I’m 26 and have 6 years of work experience in IT as a Microsoft Exchange admin (messaging, email server management), all at the same company. Lately I feel I have wasted my time on one technology rather than learning new ones and moving between technologies. I feel it’s too late to make a jump now, when freshers are working hard to crack DSA problems and LeetCode scores, and experienced people like me already know 5-6 technologies, have made 3 jumps, and are in a good position with almost 2x/3x my package.

I don’t have coding knowledge. I know a few things in cloud related to my work and have basic knowledge of Azure. I’m overwhelmed, and at the same time, when I try to learn something new, I can’t understand it, or I’ve lost the ability to grasp things quickly.

I’m ready to revamp myself. As AI is taking over everywhere, I want guidance on which technology I can start from scratch that would help in the future (at least for another 10 years).

If you can drop some suggestions on career, learning, overcoming procrastination, or techniques to train myself to learn harder, literally any insight would be appreciated.



https://redd.it/1nv1i19
@r_devops
IT career general advice

Hello, I'm here to ask if you have any advice for me. I am not very experienced in this field, so my apologies; I will try my best to improve.
I am currently doing my bachelor's in IT and have been wondering what I can do in the meantime and in the future.

I am still unsure which field I want to enter, so what would you recommend?
What are some skills I can learn, and what are some I should learn (programming languages, certs, etc.)?
As I am from Southeast Asia, the salary for most local jobs would be lower than in the EU, NA, etc. Should I work towards getting a job in these regions? Thank you for your attention.

https://redd.it/1nv20v4
@r_devops
Setup for multi location VPN solution

Folks, can you suggest a proper solution for the requirement below?
VPN Requirement Brief:

Need a VPN solution for devs to securely connect to multiple office locations (Oman, UAE, KSA).
Devs should be able to select which office VPN server to connect to.
After connecting, they SSH into the respective public cloud VPS servers; the servers should see the office IP as the source.
Solution should work on Linux, Windows, macOS with minimal setup and easy switching between servers.

https://redd.it/1nv2r8e
@r_devops
Is Perl still used actively in DevOps or is bash used more?

I'm torn between wanting to refresh my bash scripting skills vs Perl skills. Which one should it be? Which one is used more in DevOps?

https://redd.it/1nv3mgv
@r_devops
Is my understanding of Kubernetes, OpenTelemetry and incident management correct?

Hi everyone,

I’m learning about observability and incident management in cloud-native setups and want to check if my understanding makes sense (non-engineer here):

Kubernetes manages containers, keeping apps running, scaling them, and handling failures. Kind of like a factory manager keeping it alive and functioning.

OpenTelemetry collects traces, metrics, and logs from apps running in Kubernetes, providing observability. This would be the sensory network so I know what’s happening real-time.

Incident management is about detecting and resolving issues. Kubernetes handles basic self-healing, but OpenTelemetry helps detect incidents and feeds data to monitoring/alerting systems for response. This would be the maintenance team fixing issues and making adjustments to prevent future problems.

Does this sound right? Anything I’ve missed or tiny real-world things I can’t know if I’m not a native engineer?

Trying to use the community here as a bit of mentoring if I’m on the right track. ChatGPT only helps until a certain point.

https://redd.it/1nv3q2u
@r_devops