DevOps & SRE notes – Telegram
Helpful articles and tools for DevOps & SRE

WhatsApp: https://whatsapp.com/channel/0029Vb79nmmHVvTUnc4tfp2F

For paid consultation (RU/EN), contact: @tutunak


All ways to support https://telegra.ph/How-support-the-channel-02-19
Tail-based sampling unlocks deeper insights into distributed systems by allowing OpenTelemetry users to prioritize traces that matter most, such as those with errors or slow responses. This guide explains how tail-based sampling works, its differences from head-based sampling, and provides a practical walkthrough for setting up a two-tier OpenTelemetry Collector architecture that intelligently filters traces for more actionable observability.

https://itnext.io/empower-your-observability-tail-based-sampling-for-better-tracing-with-opentelemtry-243ca2cc55d1
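The tail-based decision described in this post can be sketched in a few lines of Python. This is only an illustration of the idea, not the Collector's implementation: the `Span` class, the `tail_sample` function, and the 500 ms threshold are all assumptions made up for the example. The key point is that the keep/drop decision is made per trace, after all of its spans have been seen.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Span:
    """Hypothetical, simplified span: just what the sampler needs."""
    trace_id: str
    duration_ms: float
    is_error: bool = False


def tail_sample(spans, latency_threshold_ms=500.0):
    """Return the trace IDs worth keeping, decided on complete traces."""
    # Group spans by trace first: the decision needs the whole trace.
    traces = defaultdict(list)
    for span in spans:
        traces[span.trace_id].append(span)

    kept = set()
    for trace_id, trace_spans in traces.items():
        has_error = any(s.is_error for s in trace_spans)
        is_slow = max(s.duration_ms for s in trace_spans) >= latency_threshold_ms
        # Keep the trace if any span failed or any span was slow.
        if has_error or is_slow:
            kept.add(trace_id)
    return kept


spans = [
    Span("a", 20.0),
    Span("a", 35.0, is_error=True),  # trace "a" contains an error
    Span("b", 900.0),                # trace "b" is slow
    Span("c", 12.0),                 # trace "c" is fast and healthy
]
kept_traces = tail_sample(spans)
```

A head-based sampler would have to decide when the first span of each trace arrives, before knowing whether trace "a" fails or trace "b" turns out slow.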
Achieving end-to-end visibility for Python data pipelines is essential for ensuring quality and reliability in modern data architectures. This hands-on walkthrough from Elastic Observability Labs explains how to implement OpenTelemetry (OTel) in your Python ETL scripts, covering automatic instrumentation, manual tracing, performance metrics, and anomaly-driven alerting, so you can proactively monitor, troubleshoot, and optimize your entire pipeline lifecycle using Elastic’s platform.

https://www.elastic.co/observability-labs/blog/monitor-your-python-data-pipelines-with-otel
While GitOps has brought consistency and innovation to Kubernetes deployments, its reliance on Git-based workflows and tools like Argo CD and Flux still leaves important challenges unsolved. This article explores both the real-world progress and the limitations of GitOps, from deployment strategies and multi-cluster rollouts to issues around permissions, secrets management, and the need for solutions that go beyond Git as the sole source of truth.

https://itnext.io/realizing-the-potential-of-gitops-263051baff04
Meeting customers’ rising expectations for security, speed, and personalization demands a new approach to computing infrastructure, which is exactly where distributed cloud comes in. This feature explains why developers must look beyond traditional centralized cloud models—adopting distributed cloud computing to optimize performance, comply with data regulations, and deliver truly customized services at scale.

https://thenewstack.io/why-developers-need-to-care-about-distributed-cloud-computing/
Upgrading from Node.js 18 to 20 brought unexpected performance impacts to a Kubernetes-deployed service, as detailed in this technical recap. The experience-driven story reveals how changing memory reservations on Kubernetes pods can shrink Node.js heap spaces—specifically the "new space"—triggering heavier garbage collection and higher CPU load, and how adjusting the --max-semi-space-size parameter restored both speed and stability.

https://deezer.io/node-js-20-upgrade-a-journey-through-unexpected-heap-issues-with-kubernetes-27ae3d325646
Securing Linux containers requires understanding tools like seccomp, which restricts the system calls available to containerized processes. This technical guide, the fourth installment of the Container Internals Series, breaks down how seccomp filters work, their real-world impact on container security, and practical steps to implement custom seccomp profiles for hardened deployments.

https://levelup.gitconnected.com/container-internals-series-part-4-seccomp-d88543988709
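As a rough illustration of what a custom seccomp profile looks like, the Python sketch below generates an allowlist-style profile in the JSON layout that Docker accepts. The helper name `build_seccomp_profile` and the tiny syscall list are assumptions for the example only; a real container needs a far longer allowlist, and the linked article may structure its profiles differently.

```python
import json


def build_seccomp_profile(allowed_syscalls):
    """Build a deny-by-default profile that allows only the given syscalls."""
    return {
        # Any syscall not listed below fails with an error.
        "defaultAction": "SCMP_ACT_ERRNO",
        "architectures": ["SCMP_ARCH_X86_64"],
        "syscalls": [
            {
                "names": sorted(set(allowed_syscalls)),  # dedupe, stable order
                "action": "SCMP_ACT_ALLOW",
            }
        ],
    }


# Illustrative allowlist only; far too small for a real workload.
profile = build_seccomp_profile(["read", "write", "exit_group", "read"])
profile_json = json.dumps(profile, indent=2)
```

Written to a file, such a profile can be passed to Docker with `--security-opt seccomp=<path-to-profile>`.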
This piece by bm54cloud covers deploying and updating Zarf packages in air-gapped environments, and how to overcome the challenges of working with systems disconnected from external networks.

https://medium.com/@bm54cloud/deploy-and-update-zarf-packages-in-an-air-gap-b2e3ec43abf7
In this tutorial, Noah H shows how eBPF and Tetragon can strengthen Kubernetes security through runtime monitoring and policy enforcement. The author explains how these tools can detect suspicious activities, prevent container escapes, and enforce security policies directly at the kernel level without significant performance overhead.

https://medium.com/@noah_h/kubernetes-security-ebpf-tetragon-for-runtime-monitoring-policy-enforcement-819b6ed97953
This guide by Marcin Cuber provides a comprehensive walkthrough for implementing AWS ECR pull-through cache for an EKS cluster using Terraform. The tutorial details how to configure cache rules for multiple upstream registries (such as Docker Hub, GitHub, Quay, Kubernetes, and ECR Public), covering both authentication requirements and IAM permissions for seamless integration with your Kubernetes workloads.

https://marcincuber.medium.com/implementing-aws-ecr-pull-through-cache-for-eks-cluster-most-in-depth-implementation-details-e51395568034
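Once cache rules exist, workloads have to reference images through the private ECR registry instead of the upstream one. The Python sketch below shows that rewrite under stated assumptions: the function name, the account ID, the region, and the repository prefixes (`docker-hub`, `quay`) are placeholders chosen for the example, since each prefix is whatever you set in your own cache rule. It is not taken from the linked article.

```python
def to_pull_through_ref(image, account_id, region, prefixes):
    """Rewrite an upstream image reference to go through ECR pull-through cache."""
    registry, _, path = image.partition("/")
    # A first component without "." or ":" is not a registry host,
    # so the image lives on Docker Hub.
    looks_like_host = "." in registry or ":" in registry or registry == "localhost"
    if "/" not in image or not looks_like_host:
        registry, path = "docker.io", image
    # Docker Hub official images sit under the implicit "library/" namespace,
    # which must be spelled out when pulling through the cache.
    if registry == "docker.io" and "/" not in path:
        path = "library/" + path
    prefix = prefixes[registry]
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{prefix}/{path}"


# Hypothetical mapping of upstream registry to cache-rule prefix.
PREFIXES = {"docker.io": "docker-hub", "quay.io": "quay"}

nginx_ref = to_pull_through_ref("nginx:latest", "123456789012", "eu-west-1", PREFIXES)
quay_ref = to_pull_through_ref(
    "quay.io/prometheus/node-exporter:latest", "123456789012", "eu-west-1", PREFIXES
)
```

Here `nginx_ref` becomes `123456789012.dkr.ecr.eu-west-1.amazonaws.com/docker-hub/library/nginx:latest`, while the Quay image keeps its original path under the `quay` prefix.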
This blog post by Rodrigo Fior Kuntzer delves into how Miro’s Compute team leverages Kyverno’s mutating webhooks to automate and streamline complex Kubernetes workflows. With practical examples, it demonstrates how Kyverno policies can dynamically modify resources, enforce best practices, and enhance both security and operational efficiency across Kubernetes environments.

https://medium.com/@rodrigofk/automating-kubernetes-workflows-with-kyvernos-mutating-webhooks-ae3f0a81d4d7
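To make the idea of a mutating policy concrete, here is a minimal Python sketch of one common kind of mutation: adding default labels to a resource without overwriting values the author already set. This is not Kyverno code and not Miro's policy; `mutate_add_defaults` and the label names are invented for illustration, and a real policy is written as Kyverno YAML and applied by the admission webhook.

```python
import copy


def mutate_add_defaults(resource, default_labels):
    """Return a copy of the resource with missing labels filled in."""
    # Work on a copy so the incoming object is left untouched.
    mutated = copy.deepcopy(resource)
    labels = mutated.setdefault("metadata", {}).setdefault("labels", {})
    for key, value in default_labels.items():
        # Add the label only when it is absent; never overwrite.
        labels.setdefault(key, value)
    return mutated


pod = {
    "kind": "Pod",
    "metadata": {"name": "demo", "labels": {"team": "payments"}},
}
mutated_pod = mutate_add_defaults(pod, {"team": "unknown", "cost-center": "shared"})
```

The existing `team` label survives, and only the missing `cost-center` label is added.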