DevOps & SRE notes – Telegram
Helpful articles and tools for DevOps & SRE

WhatsApp: https://whatsapp.com/channel/0029Vb79nmmHVvTUnc4tfp2F

For paid consultation (RU/EN), contact: @tutunak


All ways to support https://telegra.ph/How-support-the-channel-02-19
This write-up from the Oodle.ai team covers the practical side of profiling Go applications in production, with techniques for identifying and resolving performance bottlenecks.
https://blog.oodle.ai/go-profiling-in-production/
Ably's post explains their "four pillars" engineering principle, which is designed to ensure their systems have no ceiling on scale. This philosophy guides their architecture to handle massive, unpredictable, and complex realtime workloads.
https://ably.com/blog/ablys-four-pillars-no-scale-ceiling
KubeBuddy - A PowerShell tool for monitoring and managing Kubernetes clusters. Perform health checks, resource usage insights, and configuration audits with ease. Supports AKS best practices, snapshot-based monitoring, and security checks tailored for Kubernetes environments. Available on the PowerShell Gallery.

https://github.com/KubeDeckio/KubeBuddy
This piece from Airbnb Engineering details their journey of building a centralized user signals platform. It explores the motivations, challenges, and architectural decisions behind creating a system to capture user interactions at scale.
https://medium.com/airbnb-engineering/building-a-user-signals-platform-at-airbnb-b236078ec82b
The Games24x7 Tech team shares their experience of migrating Node.js services to Kubernetes, covering the strategies and tools they used to keep the transition efficient with minimal downtime.
https://medium.com/@Games24x7Tech/how-we-seamlessly-transitioned-our-node-services-to-k8s-7e2e6067daa0
Author dotdc presents Terraflow, a CI/CD orchestrator designed to scale Terraform operations effectively. This report outlines the creation of the tool and how it helps manage complex infrastructure deployments.
https://medium.com/@dotdc/creating-terraflow-a-ci-cd-orchestrator-to-scale-terraform-3965b3f8931f
This analysis provides a deep dive into writing policies for Kubernetes clusters using OPA Gatekeeper. The Permify Tech Blog explains how to enforce custom rules and maintain security and compliance in a cloud-native environment.
https://medium.com/permify-tech-blog/opa-gatekeeper-how-to-write-policies-for-kubernetes-clusters-bb660666eb19
AWS just released their postmortem (linked below) for the October DynamoDB outage. It's thorough, technically detailed, and explains exactly what broke and how they'll "prevent" it from happening again. But this PR-approved, sanitized narrative tells us only what happened to the technology, nothing else.

https://aws.amazon.com/message/101925/
Marc Christian P. Gregorio offers a practical commentary on automating centralized NAT Gateways in AWS across multiple VPCs and regions using Terraform. The solution aims to optimize costs and simplify network management for large-scale deployments.
https://medium.com/@marcchristianp.gregorio/automating-centralized-nat-gateways-in-aws-vpcs-and-region-with-terraform-69a6f90d60da
Elliot Graebert proposes an impact-based leveling system for engineering organizations as an alternative to traditional career ladders. This treatise discusses how focusing on impact can foster a more motivated and effective engineering culture.
https://medium.com/@elliotgraebert/an-impact-based-level-system-for-engineering-organizations-2e0f9bee20e6
This article from JP Gouin provides a deep dive into implementing GitOps at scale, with a specific focus on the cluster bootstrapping process. It covers the challenges and solutions for managing numerous Kubernetes clusters efficiently and declaratively.
https://medium.com/@jp-gouin/gitops-at-scale-clusters-bootstrapping-f36695d4340d