DevOps&SRE Library – Telegram
A library of articles on DevOps and SRE.

Advertising: @ostinostin
Content: @mxssl

Roskomnadzor (RKN): https://www.gosuslugi.ru/snet/67704b536aa9672b963777b3
Advanced Network Observability – Supercharging Container Network Observability in Azure Kubernetes Service (AKS)

https://pixelrobots.co.uk/2024/06/advanced-network-observability-supercharging-container-network-observability-in-azure-kubernetes-service-aks
Scaling Kubernetes Pods Based on HTTP Traffic using KEDA HTTP Add-on

https://blog.raulnq.com/scaling-kubernetes-pods-based-on-http-traffic-using-keda-http-add-on
system-upgrade-controller

This project aims to provide a general-purpose, Kubernetes-native upgrade controller (for nodes). It introduces a new CRD, the Plan, for defining any and all of your upgrade policies/requirements. A Plan is an outstanding intent to mutate nodes in your cluster. For up-to-date details on defining a plan please review v1/types.go.


https://github.com/rancher/system-upgrade-controller
kraan

kraan helps you deploy and manage 'layers' on top of Kubernetes. By applying layers on top of K8s clusters, you can build focused platforms on top of K8s, e.g. ML platforms, data platforms, etc. Each layer is a collection of addons and can have dependencies on other layers. For example, a "mgmt-layer" can depend on a "common-layer"; kraan will always ensure that the addons in the "common-layer" are deployed successfully before deploying the "mgmt-layer" addons. A layer is represented as a Kubernetes custom resource, and kraan is an operator that is deployed into the cluster and works constantly to reconcile the state of the layer custom resource.

kraan is powered by flux2 and builds on top of projects like source-controller and helm-controller.


https://github.com/fidelity/kraan
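The layer ordering described in the kraan post above ("common-layer" before "mgmt-layer") is a dependency ordering problem. The following is a minimal sketch of that idea using Python's standard-library topological sorter. It is not kraan's implementation; the "ml-layer" name and the `deployment_order` helper are assumptions made up for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical layers: each name maps to the set of layers it depends on.
# "common-layer" and "mgmt-layer" come from the post; "ml-layer" is made up.
layer_dependencies = {
    "common-layer": set(),
    "mgmt-layer": {"common-layer"},
    "ml-layer": {"mgmt-layer"},
}


def deployment_order(dependencies):
    """Return layer names so that every layer comes after its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


order = deployment_order(layer_dependencies)
print(order)
```

A real operator would also wait for each layer's addons to report success before moving on; this sketch only computes the order.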
intel-device-plugins-for-kubernetes

Collection of Intel device plugins for Kubernetes


https://github.com/intel/intel-device-plugins-for-kubernetes
sops-secrets-operator

Operator which manages Kubernetes Secret Resources created from user defined SopsSecrets CRs, inspired by Bitnami SealedSecrets and sops.


https://github.com/isindir/sops-secrets-operator
cubefs

As an open-source distributed storage, CubeFS can serve as your datacenter filesystem, data lake storage infra, and private or hybrid cloud storage. In particular, CubeFS enables the separation of storage/compute architecture for databases and AI/ML applications.


https://github.com/cubefs/cubefs
mani-diffy

This program walks a hierarchy of Argo CD Application templates, renders Kubernetes manifests from the input templates, and posts the rendered files back for the user to review and validate.

It is designed to be called from a CI job within a pull request, enabling the author to update templates and see the resulting manifests directly within the pull request before the changes are applied to the Kubernetes cluster.

The rendered manifests are kept within the repository, making diffs between revisions easy to parse, dramatically improving safety when updating complex application templates.


https://github.com/chime/mani-diffy
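The render-then-diff workflow in the mani-diffy post above can be sketched roughly as follows. This is an illustrative assumption, not mani-diffy's code: the template, the `render` and `manifest_diff` helpers, and the `rendered/app.yaml` path are all hypothetical, and `string.Template` stands in for whatever actually renders the Argo CD Application templates.

```python
import difflib
from string import Template

# Hypothetical, heavily trimmed manifest template used only for illustration.
MANIFEST_TEMPLATE = Template(
    "apiVersion: apps/v1\n"
    "kind: Deployment\n"
    "metadata:\n"
    "  name: $name\n"
    "spec:\n"
    "  replicas: $replicas\n"
)


def render(values):
    """Render the template into manifest text."""
    return MANIFEST_TEMPLATE.substitute(values)


def manifest_diff(old, new, path="rendered/app.yaml"):
    """Return a unified diff between two rendered manifests."""
    return "".join(
        difflib.unified_diff(
            old.splitlines(keepends=True),
            new.splitlines(keepends=True),
            fromfile=f"a/{path}",
            tofile=f"b/{path}",
        )
    )


# The committed rendering versus the rendering produced by a template change.
committed = render({"name": "web", "replicas": 2})
proposed = render({"name": "web", "replicas": 3})
diff_text = manifest_diff(committed, proposed)
print(diff_text)
```

Because the rendered output is plain text kept next to the templates, a reviewer sees the replica change as a one-line diff rather than having to render the template mentally.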
bashly

Bashly is a command line application (written in Ruby) that lets you generate feature-rich bash command line tools.

Bashly lets you focus on your specific code, without worrying about command line argument parsing, usage texts, error messages and other functions that are usually handled by a framework in any other programming language.


https://github.com/DannyBen/bashly
Dear friend, you have built a Kubernetes

I am afraid to inform you that you have built a Kubernetes. I know you wanted to "choose boring tech" to just run some containers. You said that "Kubernetes is overkill" and "it's just way too complex for a simple task" and yet, six months later, you have a pile of shell scripts that do not work, breaking every time there's a slight shift in the winds of production.


https://www.macchaffee.com/blog/2024/you-have-built-a-kubernetes
Choosing the right Postgres indexes

Indexes can make a world of difference to performance in Postgres, but it’s not always obvious when you’ve written a query that could do with an index. Here we’ll cover:

- What indexes are
- Some use cases for when they’re helpful
- Rules of thumb for figuring out which sort of index to add
- How to identify when you’re missing an index


https://incident.io/blog/choosing-the-right-postgres-indexes
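One way to spot a missing index, as the post above discusses, is to compare the query plan before and after adding one. The sketch below is not taken from the article: it uses SQLite from Python's standard library as a stand-in so that it runs without a database server, and the `users` table, `idx_users_email` index, and `query_plan` helper are hypothetical. In Postgres the equivalent check is done with `EXPLAIN`, and the plan text looks different.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable plan detail.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

lookup = "SELECT * FROM users WHERE email = ?"
param = ("user42@example.com",)

# Without an index the planner has to scan the whole table.
plan_before = query_plan(conn, lookup, param)

# With an index on the filtered column it can search the index instead.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(conn, lookup, param)

print(plan_before)
print(plan_after)
```

A plan that scans a large table for a selective filter is the usual hint that an index on the filtered column is worth trying.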
BemiDB

BemiDB is a Postgres read replica optimized for analytics. It consists of a single binary that seamlessly connects to a Postgres database, replicates the data in a compressed columnar format, and allows you to run complex queries using its Postgres-compatible analytical query engine.


https://github.com/BemiHQ/BemiDB
65,000 nodes and counting: Google Kubernetes Engine is ready for trillion-parameter AI models

As generative AI evolves, we're beginning to see the transformative potential it is having across industries and our lives. And as large language models (LLMs) increase in size — current models are reaching hundreds of billions of parameters, and the most advanced ones are approaching 2 trillion — the need for computational power will only intensify. In fact, training these large models on modern accelerators already requires clusters that exceed 10,000 nodes.

With support for 15,000-node clusters — the world’s largest — Google Kubernetes Engine (GKE) has the capacity to handle these demanding training workloads. Today, in anticipation of even larger models, we are introducing support for 65,000-node clusters.

With support for up to 65,000 nodes, we believe GKE offers more than 10X larger scale than the other two largest public cloud providers.


https://cloud.google.com/blog/products/containers-kubernetes/gke-65k-nodes-and-counting
netavark

Netavark is a Rust-based network stack for containers.


https://github.com/containers/netavark
mise

mise is a polyglot tool version manager. It replaces tools like asdf, nvm, pyenv, rbenv, etc.

mise allows you to switch sets of env vars in different project directories. It can replace direnv.

mise is a task runner that can replace make or npm scripts.


https://github.com/jdx/mise
Migrating billions of records: moving our active DNS database while it’s in use

According to a survey done by W3Techs, as of October 2024, Cloudflare is used as an authoritative DNS provider by 14.5% of all websites. As an authoritative DNS provider, we are responsible for managing and serving all the DNS records for our clients’ domains. This means we have an enormous responsibility to provide the best service possible, starting at the data plane. As such, we are constantly investing in our infrastructure to ensure the reliability and performance of our systems.


https://blog.cloudflare.com/migrating-billions-of-records-moving-our-active-dns-database-while-in-use
Against Incident Severities and in Favor of Incident Types

About a year ago, Honeycomb kicked off an internal experiment to structure how we do incident response. We looked at the usual severity-based approach (usually using a SEV scale), but decided to adopt an approach based on types, aiming to better play the role of quick definitions for multiple departments put together. This post is a short report on our experience doing it.


https://www.honeycomb.io/blog/against-incident-severities-favor-incident-types
How to Build Smaller Container Images: Docker Multi-Stage Builds

https://labs.iximiuz.com/tutorials/docker-multi-stage-builds
slackdump

Save or export your private and public Slack messages, threads, files, and users locally without admin privileges.


https://github.com/rusq/slackdump
automatisch

The open source Zapier alternative. Build workflow automation without spending time and money.


https://github.com/automatisch/automatisch