DevOps & SRE notes – Telegram
Helpful articles and tools for DevOps & SRE

WhatsApp: https://whatsapp.com/channel/0029Vb79nmmHVvTUnc4tfp2F

For paid consultation (RU/EN), contact: @tutunak


All the ways to support the channel: https://telegra.ph/How-support-the-channel-02-19
Understanding logical replication in PostgreSQL is crucial for anyone managing data across multiple Postgres instances. This blogpost from EnterpriseDB introduces the basics of logical replication, explaining how it enables selective replication of data changes (inserts, updates, and deletes) between databases, even across different Postgres versions, and outlines the practical steps to set up publications and subscriptions for real-time data synchronization.

https://www.enterprisedb.com/blog/logical-replication-postgres-basics
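
As a rough illustration of the publication/subscription setup described above, here is a minimal Python sketch that only builds the two SQL statements involved. The names `orders_pub`, `orders_sub`, the table names, and the connection string are hypothetical, and the linked post's own steps may differ. It assumes the publisher already runs with `wal_level = logical` and that the tables exist on the subscriber, since logical replication does not copy schema changes.

```python
def create_publication_sql(name, tables, operations=("insert", "update", "delete")):
    """Build a CREATE PUBLICATION statement (run on the publisher).

    Identifiers are interpolated as-is, so they must come from trusted input.
    """
    table_list = ", ".join(tables)
    publish = ", ".join(operations)
    return (
        f"CREATE PUBLICATION {name} FOR TABLE {table_list} "
        f"WITH (publish = '{publish}');"
    )


def create_subscription_sql(name, conninfo, publication):
    """Build a CREATE SUBSCRIPTION statement (run on the subscriber)."""
    return (
        f"CREATE SUBSCRIPTION {name} "
        f"CONNECTION '{conninfo}' PUBLICATION {publication};"
    )


if __name__ == "__main__":
    # Hypothetical names and connection string, for illustration only.
    print(create_publication_sql("orders_pub", ["orders", "customers"]))
    print(create_subscription_sql(
        "orders_sub", "host=pub.example dbname=shop", "orders_pub"))
```

Restricting `operations` (for example to `("insert",)`) is one way to express the selective replication mentioned above: only the listed change types are published.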
Figma’s migration onto Kubernetes is a compelling case study in how a high-growth company can modernize its infrastructure for scalability, reliability, and developer productivity. This article recounts Figma’s decision to move from AWS ECS to Kubernetes (EKS), the challenges they faced with ECS—such as lack of support for StatefulSets, Helm charts, and advanced autoscaling—and the benefits they unlocked by embracing the broader CNCF ecosystem and Kubernetes’ popularity within the industry.

https://www.figma.com/blog/migrating-onto-kubernetes/
This newsletter issue explains the "hot shard" problem: a disproportionate amount of traffic targets a single shard, causing resource saturation and degraded performance. It outlines practical strategies to address this, such as vertical scaling, adding read replicas or caches, distributing hot keys across more shards, choosing better sharding keys and algorithms, implementing load balancing and queueing, controlling traffic with backpressure, and monitoring the cluster for early detection of issues.

https://newsletter.scalablethread.com/p/how-to-handle-hot-shard-problem
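
One of the strategies listed above, distributing a hot key across more shards, can be sketched roughly as follows. This is a minimal illustration and not the newsletter's implementation: the shard count, the fan-out, the `#<n>` suffix scheme, and all function names are assumptions made for the example.

```python
import hashlib

NUM_SHARDS = 8       # assumed cluster size (hypothetical)
HOT_KEY_FANOUT = 4   # assumed number of sub-keys per hot key (hypothetical)


def shard_for(key, num_shards=NUM_SHARDS):
    """Map a key to a shard using a stable hash (unlike Python's salted hash())."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def physical_key(key, request_id, hot_keys, fanout=HOT_KEY_FANOUT):
    """Pick the key to write under: hot keys rotate across salted sub-keys."""
    if key in hot_keys:
        return f"{key}#{request_id % fanout}"
    return key


def shards_to_read(key, hot_keys, fanout=HOT_KEY_FANOUT, num_shards=NUM_SHARDS):
    """Return every shard a reader must query to reassemble the key's data."""
    if key not in hot_keys:
        return {shard_for(key, num_shards)}
    return {shard_for(f"{key}#{i}", num_shards) for i in range(fanout)}


if __name__ == "__main__":
    hot = {"user:42"}  # hypothetical hot key
    for request_id in range(4):
        pk = physical_key("user:42", request_id, hot)
        print(pk, "-> shard", shard_for(pk))
    print("read from shards:", sorted(shards_to_read("user:42", hot)))
```

The trade-off in this sketch is that writes to a hot key become cheaper per shard while reads must fan out to several sub-keys and merge the results, so it suits write-heavy hot keys better than read-heavy ones.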
Migrating from MetalLB to Cilium streamlines Kubernetes networking by consolidating load balancer, IP address management, and network advertisement features into a single tool. This article details how Cilium natively supports LoadBalancer IP management (LB IPAM, available since version 1.13), BGP (Layer 3) announcements, and Layer 2 (ARP) announcements (added in version 1.14), eliminating the need for MetalLB in most self-managed clusters. Through practical YAML examples, it demonstrates configuring Cilium IP pools, service selectors, specific IP assignments, and both IPv4 and IPv6 support, as well as advertising service IPs to the network using BGP or ARP, offering a more integrated and simplified approach to Kubernetes networking.

https://isovalent.com/blog/post/migrating-from-metallb-to-cilium/
Dropbox has built a flexible messaging system model to support its evolving async platform. This blogpost explores how the new architecture enhances decoupling and scalability across their infrastructure services.

https://dropbox.tech/infrastructure/infrastructure-messaging-system-model-async-platform-evolution
Sven Eliasson benchmarks Hetzner’s Kubernetes storage classes to evaluate their suitability for database workloads. This report highlights the significant performance differences between instance-attached NVMe storage and cloud volumes, offering practical insights for infrastructure planning.

https://sveneliasson.de/benchmarking-hetzners-storage-classes-for-database-workloads-on-kubernetes
Instant's engineering team shares their journey of upgrading an Aurora Postgres instance to version 16 with zero downtime. This experience report details the challenges faced, including performance bottlenecks and failed upgrade attempts, ultimately leading to a successful migration strategy.

https://www.instantdb.com/essays/pg_upgrade
Oilbeater presents k8gb as a standout open-source GSLB solution, seamlessly integrating with Kubernetes to manage cross-cluster domain names and traffic with minimal external dependencies. This blogpost delves into how k8gb leverages DNS protocols to achieve automated, multi-cloud traffic routing and disaster recovery, positioning it as a top choice for cloud-native environments.


https://oilbeater.com/en/2024/04/18/k8gb-best-cloudnative-gslb/
Ahmet Alp Balkan offers a candid look into the common pitfalls developers face when building Kubernetes controllers. This essay outlines practical patterns and anti-patterns—from CRD design to reconciliation logic—that can make or break production-grade controllers.


https://ahmet.im/blog/controller-pitfalls/
🌍 Terraform Model Context Protocol (MCP) Tool - An experimental CLI tool that enables AI assistants to manage and operate Terraform environments. Supports reading Terraform configurations, analyzing plans, applying configurations, and managing state with Claude Desktop integration.

https://github.com/nwiizo/tfmcp
Tobias Andersen demonstrates how to architect a multi-cluster Kafka environment using Strimzi on Kubernetes. This article details the setup of two Kafka clusters with MirrorMaker2 for cross-cluster replication, ensuring high availability and scalability for the Heimdall platform.

https://medium.com/@ZaradarTR/multi-cluster-kafka-with-strimzi-io-fafd36c2b413
Fernando Borretti critiques SQL's limitations in testing and business logic reuse, proposing composable, statically-typed query fragments—'functors'—as a solution. This article explores how functors can enhance modularity, testability, and maintainability in complex SQL systems.

https://borretti.me/article/composable-sql