DevOps & SRE notes – Telegram
Helpful articles and tools for DevOps & SRE

WhatsApp: https://whatsapp.com/channel/0029Vb79nmmHVvTUnc4tfp2F

For paid consultation (RU/EN), contact: @tutunak


All the ways to support the channel: https://telegra.ph/How-support-the-channel-02-19
This piece by bm54cloud covers deploying and updating Zarf packages in air-gapped environments, with advice on the challenges specific to systems disconnected from external networks.

https://medium.com/@bm54cloud/deploy-and-update-zarf-packages-in-an-air-gap-b2e3ec43abf7
In this tutorial, Noah H shows how eBPF and Tetragon can strengthen Kubernetes security through runtime monitoring and policy enforcement. The author explains how these tools can detect suspicious activity, prevent container escapes, and enforce security policies directly at the kernel level without significant performance overhead.

https://medium.com/@noah_h/kubernetes-security-ebpf-tetragon-for-runtime-monitoring-policy-enforcement-819b6ed97953
This guide by Marcin Cuber walks through implementing AWS ECR pull-through cache for an EKS cluster using Terraform. The tutorial details how to configure cache rules for multiple upstream registries (Docker Hub, GitHub, Quay, Kubernetes, and ECR Public), covering both the authentication requirements and the IAM permissions needed for your Kubernetes workloads.

https://marcincuber.medium.com/implementing-aws-ecr-pull-through-cache-for-eks-cluster-most-in-depth-implementation-details-e51395568034
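
Once cache rules exist, workloads pull through the private ECR registry instead of the upstream one. A minimal sketch of that image-reference rewrite follows. The prefixes in `UPSTREAM_PREFIXES`, the function name, the account ID, and the region are all hypothetical placeholders, not values from the article; the prefix for each rule is whatever you chose when creating it.

```python
# Hypothetical mapping from upstream registry to the repository prefix
# chosen in each pull-through cache rule (the prefix names are assumptions).
UPSTREAM_PREFIXES = {
    "docker.io": "docker-hub",
    "ghcr.io": "github",
    "quay.io": "quay",
    "registry.k8s.io": "k8s",
    "public.ecr.aws": "ecr-public",
}


def to_pull_through_uri(image: str, account_id: str, region: str) -> str:
    """Rewrite an upstream image reference to its pull-through cache URI."""
    first, sep, rest = image.partition("/")
    # The first path component is a registry host only if it looks like one.
    if sep and ("." in first or ":" in first or first == "localhost"):
        registry, path = first, rest
    else:
        # Bare references such as "nginx:1.27" default to Docker Hub.
        registry, path = "docker.io", image
    if registry not in UPSTREAM_PREFIXES:
        raise ValueError(f"no cache rule configured for {registry}")
    # Docker Hub official images live under the "library/" namespace.
    if registry == "docker.io" and "/" not in path:
        path = "library/" + path
    prefix = UPSTREAM_PREFIXES[registry]
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{prefix}/{path}"


if __name__ == "__main__":
    print(to_pull_through_uri("nginx:1.27", "123456789012", "eu-west-1"))
```

The `library/` handling matters for Docker Hub official images: a reference like `nginx` has to be requested as `library/nginx` behind the cache prefix.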
This blog post by Rodrigo Fior Kuntzer describes how Miro’s Compute team uses Kyverno’s mutating webhooks to automate complex Kubernetes workflows. With practical examples, it demonstrates how Kyverno policies can dynamically modify resources, enforce best practices, and improve both security and operational efficiency across Kubernetes environments.

https://medium.com/@rodrigofk/automating-kubernetes-workflows-with-kyvernos-mutating-webhooks-ae3f0a81d4d7
This post details Amazon’s ambitious migration from Apache Spark to Ray on Amazon EC2 for exabyte-scale data processing, revealing how Ray’s flexibility and efficiency enabled massive cost savings and performance improvements. Readers will discover the technical strategies and real-world results that made this transformation a success for Amazon’s Business Data Technologies team.

https://aws.amazon.com/blogs/opensource/amazons-exabyte-scale-migration-from-apache-spark-to-ray-on-amazon-ec2/
This article by Ahmet Alp Balkan highlights common pitfalls in generating Kubernetes CustomResourceDefinitions (CRDs) with controller-gen, emphasizing the importance of explicit validation, careful use of required and optional markers, and understanding how Go’s zero values interact with CRD schemas. Through practical examples, it warns developers about issues like unvalidated nested fields, marker typos, and the challenges of defaulting and validation, offering actionable advice to avoid subtle bugs in custom Kubernetes APIs.

https://ahmet.im/blog/crd-generation-pitfalls/index.html
This retrospective by Marc Olson offers a detailed look at the evolution of AWS Elastic Block Store (EBS), tracing its journey from a simple network-attached block storage service launched in 2008 to a massive, distributed SSD-based system now handling over 140 trillion operations daily. The post highlights key lessons learned in performance engineering, organizational structure, and continuous incremental improvement, illustrating how EBS overcame challenges like noisy neighbors, hardware transitions from HDDs to SSDs, and the need for robust measurement and instrumentation to deliver ever-lower latency and higher reliability for AWS customers.

https://www.allthingsdistributed.com/2024/08/continuous-reinvention-a-brief-history-of-block-storage-at-aws.html
This blog post by Zach Loeber introduces Atmos, an opinionated infrastructure deployment tool from CloudPosse designed to simplify and scale Terraform state management for multi-state projects. Loeber walks through adopting Atmos, its stack-based structure, and its YAML-driven configuration, and highlights both the flexibility and the initial learning curve that come with integrating Atmos into existing workflows.

https://dev.to/zloeber/atmos-wield-terraform-like-a-boss-3bfc
This blog post introduces KWOK (Kubernetes WithOut Kubelet), a lightweight tool designed to simulate large-scale Kubernetes clusters by emulating nodes and pods without running real workloads. ZaradarTR explains how KWOK, with its core components kwok and kwokctl, allows developers to quickly create and manage thousands of simulated nodes and pods on local machines, which makes it well suited to scalability testing, API interaction, and stress-testing Kubernetes environments with minimal resource consumption.

https://medium.com/@ZaradarTR/hello-kwok-af2cafec35b4
This piece examines the limitations of AWS native security tooling, particularly focusing on AWS IAM Access Analyzer and its effectiveness in detecting publicly exposed resources across various AWS services. The article highlights critical observability gaps that can leave organizations vulnerable, emphasizing the need for enhanced security measures and proactive monitoring to address blind spots and reduce the risk of cloud security incidents.

https://www.securityrunners.io/post/exposing-security-observability-gaps-in-aws
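
To make "publicly exposed" concrete, here is a deliberately simplified sketch of one check for a wildcard principal in a resource policy. It is an illustration only, not how IAM Access Analyzer works and not code from the article. The function name `allows_public_access` is hypothetical, and skipping every statement that has a `Condition` is a simplifying assumption (a real analyzer evaluates the conditions).

```python
def allows_public_access(policy: dict) -> bool:
    """Return True if any Allow statement grants access to everyone ("*")
    with no Condition attached. Simplified: conditions are not evaluated."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        # A policy may contain a single statement object instead of a list.
        statements = [statements]
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        if stmt.get("Condition"):
            # Conditions may narrow access; this sketch conservatively skips them.
            continue
        principal = stmt.get("Principal")
        if principal == "*":
            return True
        if isinstance(principal, dict):
            aws = principal.get("AWS")
            values = [aws] if isinstance(aws, str) else (aws or [])
            if "*" in values:
                return True
    return False


if __name__ == "__main__":
    public_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Principal": "*", "Action": "s3:GetObject",
             "Resource": "arn:aws:s3:::example-bucket/*"}
        ],
    }
    print(allows_public_access(public_policy))
```

A check like this only sees the policy it is handed, so any resource type or exposure path it is never pointed at stays invisible, which is the kind of blind spot the article is about.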