DevOps&SRE Library – Telegram
A library of articles on DevOps and SRE.

Advertising: @ostinostin
Content: @mxssl

RKN: https://www.gosuslugi.ru/snet/67704b536aa9672b963777b3
Comparing Open Source k8s Load Balancers

In this article we discuss three open source load-balancer controllers that can be used with any distribution of Kubernetes.


https://medium.com/thermokline/comparing-k8s-load-balancers-2f5c76ea8f31
From four to five 9s of uptime by migrating to Kubernetes

https://workos.com/blog/from-four-to-five-9s-of-uptime-by-migrating-to-kubernetes
Let’s Consign CAP to the Cabinet of Curiosities

CAP? Again? Still?


https://brooker.co.za/blog/2024/07/25/cap-again.html
tntk-infra

Put your DevOps skills to the test with our hands-on capstone project. Designed for anyone interested in gaining practical experience, this project challenges you to integrate AWS, Terraform, Kubernetes, GitHub Actions, ArgoCD, Datadog, and PagerDuty to build and manage a production-like environment. Showcase your ability to create a complete, real-world solution by building cloud infrastructure, implementing observability, developing CI/CD pipelines, and managing incidents.


https://github.com/tntk-io/tntk-infra
kardinal

Kardinal is a framework for creating extremely lightweight ephemeral development environments within a shared Kubernetes cluster. In Kardinal, an environment is called a "flow" because it represents a path that a request takes through the cluster. Versions of services that are under development are deployed on-demand, and then shared across all development work that depends on that version. Read more about Kardinal in our docs.


https://github.com/kurtosis-tech/kardinal
grpcmd

grpcmd is a simple, easy-to-use, and developer-friendly CLI tool for gRPC.


https://github.com/grpcmd/grpcmd
The Art of System Debugging — Decoding CPU Utilization

This blog post describes a case study of how we diagnosed, root-caused, and then mitigated a performance issue in one of our applications at Flipkart. Along the way, we describe the different tools (eBPF and traditional) that can be used to debug performance issues.


https://blog.flipkart.tech/the-art-of-system-debugging-decoding-cpu-utilization-da75f09ef1ff
Observability at the Edge

How Chick-fil-A provides observability for 2,800+ K8s clusters


https://medium.com/chick-fil-atech/observability-at-the-edge-b2385065ab6e
The raise of Hosted Control Plane in Kubernetes

In the early days of Kubernetes adoption, single-cluster deployments were the norm, offering a straightforward approach to managing applications and services. As the adoption of Kubernetes expanded, the limitations of single-cluster models surfaced. The increasing demand for Kubernetes clusters requires a shift to multicluster deployments and an innovative Hosted Control Plane architecture.


https://clastix.io/post/the-raise-of-hosted-control-plane-in-kubernetes
Implementing Scalable GitOps With Argo CD and ApplicationSets: A Case Study

https://aviadhaham.me/posts/implementing-gitops-with-argo-cd-and-applicationsets
Managing many Helm Charts with Kluctl

Learn how easy it is to manage multiple Helm Charts from one deployment project using Kluctl.


https://kluctl.io/blog/2023/02/28/managing-many-helm-charts-with-kluctl
Kubernetes on Proxmox

In this article we’ll create a two-node cluster with one control-plane node and one worker node as a proof-of-concept. As an extra challenge we’ll also take a look at how to do PCIe passthrough for the worker node.


https://blog.stonegarden.dev/articles/2024/03/proxmox-k8s-with-cilium
book6

A collaborative IPv6 book.

The intention is a practical introduction to IPv6 for technical people, kept up to date by active practitioners.


https://github.com/becarpenter/book6
Piloting through the Fog: A Tale of Migrating to a New Kubernetes Platform

It’s a tale as old as UNIX_MIN_TIMESTAMP. Your team owns a service that you treat like a black box as long as it’s working. Sure, there’s a small maintenance task here and there that the most tenured member of the team almost exclusively picks up. How they fix it might as well be a wizard’s incantation with a sprinkle of fairy dust. But this time around they’re busy on another task, or worse, gone from the company altogether. Here’s my story of such a maintenance task. In this post I go through my journey of migrating one such service from Klaviyo’s legacy Kubernetes platform to our new spiffy, well-managed platform.


https://klaviyo.tech/piloting-through-the-fog-a-tale-of-migrating-to-a-new-kubernetes-platform-7fe5677310fa
How our data team handles incidents

Historically, data teams have not been closely involved in the incident management process (at least, not in the traditional “get woken up at 2AM by a SEV0” sense). But with a growing involvement of data (and therefore data teams) in core business processes, decision making, and user-facing products, data-related incidents are increasingly common, and more important than ever.

At incident.io, the Data team works across multiple areas of the business, enabling go-to-market and product teams alike to make data-driven decisions. Given our broad involvement, we’re no strangers to data incidents and are heavy users of our own product to monitor, triage, and respond to them. Here’s a quick run-through of how we’ve set this up.


https://incident.io/blog/how-our-data-team-handles-incidents
What is an SLA?

A Service Level Agreement (SLA) is a formal document that outlines the expectations, responsibilities, and performance metrics between a service provider and a customer.
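
One performance metric an SLA often states is an availability percentage. As a rough illustration (not taken from the linked article), the sketch below converts such a target into the downtime it permits. The function name and the 30-day period are assumptions chosen for the example.

```python
# Hypothetical helper, not from the linked article: converts an availability
# target (in percent) into the downtime it permits over a given period.
def allowed_downtime_minutes(availability_percent: float, period_days: int = 30) -> float:
    """Return the minutes of downtime permitted by an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Print the budget for two through five nines over an assumed 30-day period.
    for target in (99.0, 99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% -> {minutes:.2f} minutes per 30 days")
```

Each additional nine shrinks the permitted downtime by a factor of ten, which is why the measurement period and the exact definition of "downtime" matter when reading an SLA.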


https://uptimerobot.com/blog/what-is-an-sla
Optimizing global message transit latency: a journey through TCP configuration

https://ably.com/blog/optimizing-global-message-transit-latency-a-journey-through-tcp-configuration
kubetrim

kubetrim tidies up old and broken cluster and context entries from your kubeconfig file.


https://github.com/alexellis/kubetrim
outline

The fastest knowledge base for growing teams. Beautiful, realtime collaborative, feature packed, and markdown compatible.


https://github.com/outline/outline