DevOps&SRE Library – Telegram
DevOps&SRE Library
18.4K subscribers
466 photos
4 videos
2 files
5K links
Библиотека статей по теме DevOps и SRE.

Реклама: @ostinostin
Контент: @mxssl

РКН: https://www.gosuslugi.ru/snet/67704b536aa9672b963777b3
Download Telegram
Kafka 101

Originally developed in LinkedIn during 2011, Apache Kafka is one of the most popular open-source Apache projects out there. So far it has had a total of 24 notable releases and most intriguingly, its code base has grown at an average rate of 24% throughout each of those releases.


https://highscalability.com/unnoscriptd-2
A Git story: Not so fun this time

Linus Torvalds once wrote in a book that he created Linux just for fun, but it ended up sparking a revolution. Git, his second major creation, also an accidental revolution. It’s now a standard tool for software engineers, but its origin story wasn’t so much fun this time, at least for Linus.


https://blog.brachiosoft.com/en/posts/git
Becoming a Senior Site Reliability Engineer: A Guide to Upskilling

Learn how to upskill yourself to become senior site reliability engineer


https://reliabilityengineering.substack.com/p/becoming-a-senior-site-reliability
Mastering Terraform Workflows: apply-before-merge vs apply-after-merge

Discover the two main Terraform and OpenTofu workflows: apply-before-merge and apply-after-merge, and learn why apply-after-merge is likely the better choice.


https://terramate.io/rethinking-iac/mastering-terraform-workflows-apply-before-merge-vs-apply-after-merge
Terraform Development Pipeline

The purpose of a development pipeline is to deploy with confidence and therefore at high frequencies.


https://mycloudrevolution.com/2024/05/23/terraform-development-pipeline
Terramaid

Terramaid transforms your Terraform resources and plans into visually appealing Mermaid diagrams. By converting complex infrastructure into easy-to-understand diagrams, Terramaid enhances documentation, simplifies review processes, and fosters better collaboration among team members. Whether you're looking to enrich your project's documentation, streamline reviews, or just bring a new level of clarity to your Terraform configurations, Terramaid is the perfect utility to integrate into your development workflow.


https://github.com/RoseSecurity/Terramaid
Percentile

What is it? Why is it used? And why is it important in the context of optimization and reliability engineering? Bonus: a browser app that lets you play with data.


https://blog.alexewerlof.com/p/percentile
Enhancing Netflix Reliability with Service-Level Prioritized Load Shedding

Applying Quality of Service techniques at the application level


https://netflixtechblog.com/enhancing-netflix-reliability-with-service-level-prioritized-load-shedding-e735e6ce8f7d
A write-ahead log is not a universal part of durability

A database does not need a write-ahead log (WAL) to achieve durability. A database can write its long-term data structure durably to disk before returning to a client. Granted, this is a bad idea! And granted, a WAL is critical for durability by design in most databases. But I think it's helpful to understand WALs by understanding what you could do without them.


https://notes.eatonphil.com/2024-07-01-a-write-ahead-log-is-not-a-universal-part-of-durability.html
ConfigMap Conundrum: Subtleties of Dynamic Updates in Kubernetes Configurations

Know the differences between ConfigMaps mounted as Volumes and ConfigMaps defined as environment variables.


https://blog.adityasamant.dev/configmap-conundrum-subtleties-of-dynamic-updates-in-kubernetes-configurations
Argo Events — Event Bus and Webhook

Argo Event is a Kubernetes based event automation engine. It is part of the Argo project. Argo Events can be used with or independent of other projects in Argo.

I will be writing a series of articles on Argo Events; in these articles I will be looking at how we can use Argo Event to automate process within and without a Kubernetes cluster.

For this first article in this series, we will examine Argo Events core concepts, installation and provisioning different event buses which Argo Event uses to forward events to their sink. Finally we will look at setting up a webhook event flow to verify our setup.


https://medium.chuklee.com/argo-events-event-bus-and-webhook-ac34e5714209
Users, Groups, Roles and API Access in Kubernetes

The nuances of how users and groups are configured in Kubernetes and how the role-based access control (RBAC) mechanism applies for them.


https://blog.adityasamant.dev/users-groups-roles-and-api-access-in-kubernetes
Kubernetes Events — News feed of your cluster

Understand Kubernetes Events and learn to use kubectl events to monitor and troubleshoot your cluster’s issues effectively.


https://decisivedevops.com/kubernetes-events-news-feed-of-your-kubernetes-cluster-826e08892d7a
GPU Virtualization in K8s: Challenges and State of the Art

Kubernetes schedules GPU workloads by assigning a whole device to a single job exclusively. This one-to-one relationship leads to massive GPU underutilization, especially for interactive jobs, characterized by significant idle periods and infrequent bursts of heavy GPU usage. Current solutions enable GPU sharing by statically assigning a fixed slice of GPU memory to each co-located job. These solutions are not suitable for interactive scenarios since the number of co-located jobs is limited by the size of physical GPU memory. Consequently, users must know the GPU memory demand of their jobs before submitting them for execution, which is impractical.


https://www.arrikto.com/blog/gpu-virtualization-in-k8s-challenges-and-state-of-the-art
gitops-bridge

The GitOps Bridge is a community project that aims to showcase best practices and patterns for bridging the process of creating a Kubernetes cluster to subsequently managing everything through GitOps. It focuses on using ArgoCD or FluxCD, both of which are CNCF-graduated projects.


https://github.com/gitops-bridge-dev/gitops-bridge
Ephemeral Values in Terraform

Since long before I worked at HashiCorp I've been interested in the problems with how Terraform interacts with security-sensitive information like passwords and private keys.


https://log.martinatkins.me/2024/05/22/terraform-ephemeral-values