DevOps&SRE Library
18.7K subscribers
451 photos
3 videos
2 files
5.07K links
A library of articles on DevOps and SRE.

Advertising: @ostinostin
Content: @mxssl

RKN: https://www.gosuslugi.ru/snet/67704b536aa9672b963777b3
ch-vmm

ch-vmm is a Kubernetes add-on for running Cloud Hypervisor virtual machines. By using Cloud Hypervisor as the underlying hypervisor, ch-vmm enables a lightweight and secure way to run fully virtualized workloads in a canonical Kubernetes cluster.


https://github.com/nalajala4naresh/ch-vmm
Enroll

Enroll inspects a Debian-like or RedHat-like system, harvests the state that matters, and generates Ansible roles/playbooks so you can bring snowflakes under management fast.


https://enroll.sh
PHP 8.5 benchmarks: The state of PHP performance across major CMSs and frameworks

PHP 8.5 has now been officially released, and developers naturally want to know what kind of performance improvements they can expect across popular CMSs and frameworks.

To find out, we benchmarked 13 widely used CMSs and frameworks, including WordPress, WooCommerce, Drupal, Joomla, Laravel, Symfony and CodeIgniter, on PHP 8.2, 8.3, 8.4, and 8.5 under identical conditions. WordPress was also tested on PHP 7.4, since a notable share of sites still run on that version.

Our intention is to provide a clear, practical look at how performance shifts across recent PHP releases and what you can expect when upgrading.


https://kinsta.com/blog/php-benchmarks
Finding the grain of sand in a heap of Salt

How do you find the root cause of a configuration management failure when you have a peak of hundreds of changes in 15 minutes on thousands of servers?

That was the challenge we faced as we built the infrastructure to reduce release delays due to failures of Salt, a configuration management tool. (We eventually reduced such failures on the edge by over 5%, as we’ll explain below.) We’ll explore the fundamentals of Salt, and how it is used at Cloudflare. We then describe the common failure modes and how they delay our ability to release valuable changes to serve our customers.

By first solving an architectural problem, we provided the foundation for self-service mechanisms to find the root cause of Salt failures on servers, datacenters and groups of datacenters. This system is able to correlate failures with git commits, external service failures and ad hoc releases. The result of this has been a reduction in the duration of software release delays, and an overall reduction in toilsome, repetitive triage for SRE.

To start, we will go into the basics of the Cloudflare network and how Salt operates within it. And then we’ll get to how we solved the challenge akin to finding a grain of sand in a heap of Salt.


https://blog.cloudflare.com/finding-the-grain-of-sand-in-a-heap-of-salt
Rethinking QA: From DevOps to Platform Engineering and SRE

A wake‑up call for QA to upskill for platform engineering and SRE, including cloud‑native practices, automation mastery, and system reliability at scale.


https://dzone.com/articles/rethinking-qa-from-devops-to-platform-engineering
Queue-Based Autoscaling Without Flapping: Rethinking App Scaling with K8s, KEDA, and RabbitMQ

https://blog.stackademic.com/autoscaling-with-message-queues-why-everyone-gets-it-wrong-with-kubernetes-keda-rabbitmq-and-f1a4c38e0df4
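The "flapping" in the title usually comes from scaling decisions reacting to every queue-length sample. As a minimal, illustrative sketch (the deployment name `worker`, queue name `tasks`, and auth reference `rabbitmq-auth` are hypothetical), a KEDA ScaledObject for a RabbitMQ-backed worker can dampen oscillation with a scale-down stabilization window:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: worker-scaler
spec:
  scaleTargetRef:
    name: worker                  # Deployment to scale (hypothetical name)
  minReplicaCount: 1
  maxReplicaCount: 20
  cooldownPeriod: 120             # seconds before scaling back to min
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 300   # ignore short dips; reduces flapping
  triggers:
    - type: rabbitmq
      metadata:
        protocol: amqp
        queueName: tasks
        mode: QueueLength
        value: "50"               # target messages per replica
      authenticationRef:
        name: rabbitmq-auth       # TriggerAuthentication holding the AMQP host
```

The stabilization window makes the HPA act on the highest desired replica count over the last five minutes rather than the instantaneous queue depth.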
helm-controller

A simple way to manage helm charts with Custom Resource Definitions in k8s.
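With this controller, a chart install is declared as a HelmChart custom resource. A minimal sketch, assuming the Grafana chart as an example (chart choice and values are illustrative):

```yaml
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: grafana
  namespace: kube-system          # namespace the controller watches by default in k3s
spec:
  repo: https://grafana.github.io/helm-charts
  chart: grafana
  targetNamespace: monitoring     # where the release is installed
  valuesContent: |-
    adminPassword: changeme       # inline values.yaml overrides
```

Applying this manifest triggers a Helm install job; editing it triggers an upgrade, and deleting it uninstalls the release.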


https://github.com/k3s-io/helm-controller
nxs-universal-chart

nxs-universal-chart is a Helm chart you can use to install any of your applications into Kubernetes/OpenShift and other orchestrators compatible with the native Kubernetes API.


https://github.com/nixys/nxs-universal-chart
percona-xtradb-cluster-operator

Percona Operator for MySQL based on Percona XtraDB Cluster (PXC) automates the creation and management of highly available, enterprise-ready MySQL database clusters on Kubernetes.
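The operator is driven by a PerconaXtraDBCluster custom resource. As a rough, illustrative sketch (sizes and image tag are assumptions; the repo's deploy/cr.yaml is the authoritative reference):

```yaml
apiVersion: pxc.percona.com/v1
kind: PerconaXtraDBCluster
metadata:
  name: minimal-cluster
spec:
  pxc:
    size: 3                                   # three PXC members for quorum
    image: percona/percona-xtradb-cluster:8.0
  haproxy:
    enabled: true                             # HAProxy pods fronting the cluster
    size: 2
```

The operator reconciles this spec into StatefulSets, Services, and the supporting secrets for a highly available MySQL cluster.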


https://github.com/percona/percona-xtradb-cluster-operator
k8z

A lightweight, modern mobile and desktop application for managing Kubernetes. Easy to use, fast, and secure.


https://github.com/k8zdev/k8z
opentelemetry-host-metrics

When you're monitoring infrastructure with OpenTelemetry, the Host Metrics Receiver (hostmetrics) is one of the most relevant components to reach for.

It fully replaces traditional agents (like Prometheus Node Exporter), and collects essential system metrics such as CPU, memory, disk, and network usage directly from the machine where the Collector is running.

Because this receiver needs direct access to the underlying system, it's intended to be used when the Collector is deployed as an Agent. For example, as a DaemonSet on Kubernetes nodes or as a service on a VM or bare-metal host, not as a centralized gateway.

In this guide, you'll learn how to configure it as a Node Exporter alternative for monitoring your server infrastructure.
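As a sketch of the kind of configuration the guide covers (the OTLP endpoint is a placeholder), a Collector config enabling the common scrapers might look like:

```yaml
receivers:
  hostmetrics:
    collection_interval: 30s
    scrapers:              # each scraper collects one metric family
      cpu:
      memory:
      disk:
      filesystem:
      network:
      load:

exporters:
  otlp:
    endpoint: backend.example.com:4317   # placeholder backend

service:
  pipelines:
    metrics:
      receivers: [hostmetrics]
      exporters: [otlp]
```

On Kubernetes, this would typically run in a DaemonSet with the host filesystem mounted so the scrapers see the node rather than the container.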


https://www.dash0.com/guides/opentelemetry-host-metrics
Migrating Kubernetes out of the Big Cloud Providers

“Move to kubernetes to save costs” they said in the early days of the k8s frenzy. This was trusting that an efficient pod bin (node) packing would save on node costs (there’s also autoscale but regular cloud already offers that).

The reality is that the overhead cost of running the control plane and auxiliary services on each node (DNS, metric and log collectors, etc.), plus the extra easy ways to make costly mistakes, makes most Kubernetes installations a more expensive proposition than running the workloads without it.

For the record, the great thing about k8s and the reason for its success (besides resume-driven technology) is in the standardization it provides and its extensibility or modularity (this plug-in advantage is the reason mediocre software like Wordpress is successful for example).

For a startup, managed k8s on the "big three" public cloud providers is expensive: Amazon Elastic Kubernetes Service (EKS) on AWS, Google Kubernetes Engine (GKE) on GCP, and Azure Kubernetes Service (AKS) on Azure.

On the other hand, I don't want to manage k8s master nodes and essential services on "bare metal" (funny that nowadays that means a virtual machine), so I was looking for an intermediate solution between expensive fully managed k8s and the cheapest (in dollars, not in time) completely self-managed k8s.


https://medium.com/@duran.fernando/migrating-kubernetes-out-of-the-big-cloud-providers-45a378943d5c
Building AWS S3-Style Storage and Automated Static Site Deployment in Kubernetes

Learn how to use MinIO to create S3-style storage in K8s and host static websites, with a simple CD pipeline that updates the site automatically


https://davidwoglo.hashnode.dev/building-aws-s3-style-storage-and-automated-static-site-deployment-in-kubernetes
Kicking off the year with an LLM launch

At a webinar on January 15, Cloud.ru experts will explain how to accurately size a configuration for running an LLM and how to tune inference parameters to save money without losing quality.

Also on the agenda:

🟢 what makes up vRAM consumption

🟢 how to accurately calculate the required GPU configuration

🟢 which LLM parameters most affect cost and performance

🟢 how to scale a model and move it to serverless mode


There will also be a hands-on part: launching an LLM in the Evolution ML Inference service, showing optimal parameters, and comparing different configurations by price and speed.

Useful for anyone who wants to avoid overspending on ML infrastructure.

Register
cluster-api-provider-hosted-control-plane

A Kubernetes Cluster API control plane provider that enables management of hosted control planes as first-class Kubernetes resources. This provider allows you to create and manage highly available Kubernetes control plane components (API Server, Controller Manager, Scheduler, and etcd) as hosted services, decoupling them from the underlying infrastructure.


https://github.com/teutonet/cluster-api-provider-hosted-control-plane
egressgateway

In a Kubernetes (k8s) cluster, when Pods access external services, their egress IP addresses are not fixed. In an Overlay network, the egress IP is determined by the node where the Pod resides, while in an Underlay network Pods use their own IP addresses for external communication. Either way, when Pods are rescheduled, their IP addresses for external communication change. This instability makes IP address management difficult for system administrators, especially as the cluster scales and during network fault diagnostics, and it makes it hard to control egress traffic outside the cluster based on a Pod's original egress IP.

To solve this problem, EgressGateway was introduced into the k8s cluster. It is an open-source egress gateway designed to resolve egress IP address issues across various CNI network modes, such as Calico, Flannel, Weave, and Spiderpool. Through flexible configuration and management of egress policies, EgressGateway allows setting egress IP addresses for tenant-level or cluster-level workloads. When Pods need to access the external network, the system consistently uses the configured egress IP as the source address, providing a stable solution for egress traffic management.


https://github.com/spidernet-io/egressgateway
Vibe coding tools observability with VictoriaMetrics Stack and OpenTelemetry

https://victoriametrics.com/blog/vibe-coding-observability
💥 eBPF: X-ray vision for production. See the network, security, and bottlenecks right in the Linux kernel

🔥 January 22 at 19:00 MSK: a free open OTUS webinar

Tired of spending hours hunting for the cause of a service outage? What if you could see everything at once: who connects where, where the network slows down, which process is behaving suspiciously, all without agents, without overhead, and without restarts?

The webinar will show real eBPF magic in live demos.

📌 What's on the program:
— Live demo: catching network problems with Cilium Hubble
— Live demo: detecting threats in real time with Tetragon
— Diagnosing performance without stopping services
— eBPF architecture in plain terms: how it actually works

🎯 After the webinar you will be able to:
— Instantly find bottlenecks in production without restarts
— Replace dozens of heavy agents with a single lightweight eBPF solution
— See security incidents that traditional tools miss
— Understand when eBPF is a lifesaver and when classic tools are a better fit

👉 Registration is open: https://vk.cc/cTlmBY

The webinar leads into the start of the "DevOps Engineer: Practices and Tools" course, where eBPF and modern observability are one of the key blocks of the program.

Advertising. OOO "Otus Online Education", OGRN 1177746618576, erid: 2Vtzqw8H2cS
taws

taws provides a terminal UI to interact with your AWS resources. The aim of this project is to make it easier to navigate, observe, and manage your AWS infrastructure in the wild.


https://github.com/huseyinbabal/taws