DevOps&SRE Library – Telegram
A library of articles on DevOps and SRE.

Advertising: @ostinostin
Content: @mxssl

RKN (Roskomnadzor): https://www.gosuslugi.ru/snet/67704b536aa9672b963777b3
pepr

Pepr is on a mission to save Kubernetes from the tyranny of YAML, intimidating glue code, bash scripts, and other makeshift solutions.


https://github.com/defenseunicorns/pepr
ClusterSecret

The ClusterSecret operator makes sure that all matching namespaces have the secret available and up to date.


https://github.com/zakkg3/ClusterSecret
ls-lint

An extremely fast directory and filename linter - Bring some structure to your project filesystem


https://github.com/loeffel-io/ls-lint
zxc

Terminal-based intercepting proxy written in Rust, with tmux and Vim as the user interface.


https://github.com/hail-hydrant/zxc
graft

Transactional page storage engine supporting lazy partial replication to the edge. Optimized for scale and cost over latency. Leverages object storage for durability.


https://github.com/orbitinghail/graft
liam

Automatically generates beautiful and easy-to-read ER diagrams from your database.


https://github.com/liam-hq/liam
How We Run Terraform At Scale

Managing over 165k cloud resources across hundreds of workspaces could seem daunting. But for us, it’s just another day at Benchling. Here’s how we do it.

We currently have:

- 165k cloud resources under management
- 625 Terraform workspaces
- 38 AWS accounts
- 170 engineers (40 of whom are infra specialists)

We perform:

- 225 infrastructure releases daily (terraform apply operations)
- 723 plans daily (terraform plan operations)

We’ve been successfully operating Benchling’s infrastructure release system for the past two years (spoiler: it’s Terraform Cloud). Over that time we’ve doubled our infrastructure footprint with minimal additional release overhead.
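
To put the figures quoted above in proportion, here is a small back-of-the-envelope calculation. It only uses the numbers from the post; "165k" is treated as exactly 165,000, which is an assumption, and the ratio names are made up for illustration.

```python
# Figures quoted in the post. "165k" is taken as 165_000 (an approximation).
resources = 165_000
workspaces = 625
aws_accounts = 38
applies_per_day = 225
plans_per_day = 723

# Derived ratios; the key names are illustrative, not from the article.
ratios = {
    "resources_per_workspace": resources / workspaces,
    "workspaces_per_account": workspaces / aws_accounts,
    "plans_per_apply": plans_per_day / applies_per_day,
    "applies_per_workspace_per_day": applies_per_day / workspaces,
}

for name, value in ratios.items():
    print(f"{name}: {value:.2f}")
```

On these numbers an average workspace holds a few hundred resources and is applied roughly once every three days, with about three plans run for every apply.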


https://benchling.engineering/how-we-run-terraform-at-scale-da7bb75dc394
openinfraquote

OpenInfraQuote is a lightweight, open-source CLI tool for estimating infrastructure costs using Terraform plan and state files. It runs locally or in CI/CD. No backend, no API keys, no external services.


https://github.com/terrateamio/openinfraquote
Things that go wrong with disk IO

There are a few interesting scenarios to keep in mind when writing applications (not just databases!) that read and write files, particularly in transactional contexts where you actually care about the integrity of the data and when you are editing data in place (versus copy-on-write for example).
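
As a rough illustration of avoiding in-place edits, the sketch below uses the common write-to-a-temporary-file, fsync, then rename pattern. It is not taken from the linked post; the helper name `atomic_write` is hypothetical, and the directory fsync step assumes a POSIX system.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace the file at `path` with `data` without editing it in place."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Ask the OS to push the new contents to stable storage.
            os.fsync(tmp.fileno())
        # Swap the new file into place; readers see the old or the new
        # contents, never a half-written mix.
        os.replace(tmp_path, path)
    except BaseException:
        # Best-effort cleanup of the temporary file on failure.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Assumption: POSIX. Sync the directory so the rename itself is durable.
    if hasattr(os, "O_DIRECTORY"):
        dir_fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

Even this pattern relies on `fsync` behaving as documented, which is one of the assumptions that can fail in practice on some filesystems and devices.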


https://notes.eatonphil.com/2025-03-27-things-that-go-wrong-with-disk-io.html
Hot Take: I Want Execs Closer to Incidents, Not Farther

https://uptimelabs.io/hot-take-i-want-execs-closer-to-incidents-not-farther
Improving Kubernetes-Mixin API Server Rules Consistency

A journey into troubleshooting an insidious and subtle issue that can occur with Prometheus recording rules.


https://medium.com/codex/improving-kubernetes-mixin-api-server-rules-consistency-1c0d727e8160
cloudflare-operator

A Kubernetes Operator to create and manage Cloudflare Tunnels and DNS records for (HTTP/TCP/UDP*) Service Resources


https://github.com/adyanth/cloudflare-operator
kubectl-cond

A kubectl plugin to print Kubernetes object resource conditions in a more human-readable format.
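
To give a sense of what a human-readable view of conditions can look like, here is a minimal sketch that turns the standard `status.conditions` list of a Kubernetes object (already parsed into a dict) into an aligned table. This is not the plugin's implementation or output format; the function name `format_conditions` and the column choice are assumptions for illustration.

```python
def format_conditions(obj: dict) -> str:
    """Render an object's status.conditions as an aligned text table."""
    conditions = obj.get("status", {}).get("conditions", [])
    rows = [("TYPE", "STATUS", "REASON", "MESSAGE")]
    for cond in conditions:
        rows.append(
            tuple(
                str(cond.get(key, ""))
                for key in ("type", "status", "reason", "message")
            )
        )
    # Pad each column to the width of its longest cell.
    widths = [max(len(row[i]) for row in rows) for i in range(4)]
    lines = []
    for row in rows:
        line = "  ".join(cell.ljust(width) for cell, width in zip(row, widths))
        lines.append(line.rstrip())
    return "\n".join(lines)


# Hypothetical sample data shaped like a Node's status.
node = {
    "status": {
        "conditions": [
            {"type": "Ready", "status": "True",
             "reason": "KubeletReady", "message": "kubelet is posting ready status"},
            {"type": "MemoryPressure", "status": "False",
             "reason": "KubeletHasSufficientMemory", "message": "ok"},
        ]
    }
}
print(format_conditions(node))
```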


https://github.com/ahmetb/kubectl-cond
gravity

Fully-replicated DNS, DHCP and TFTP Server backed by etcd.


https://github.com/BeryJu/gravity
GitOps: How to Manage Dynamic Network Policy Changes at Scale Across 25 Clusters?

https://itnext.io/gitops-how-to-manage-dynamic-network-policy-changes-at-scale-across-25-clusters-0727ce1145e5
Top-3 Helm Plugins: Helm Secrets, Helm Diff and Helm Git

https://dev.to/mkdev/top-3-helm-plugins-helm-secrets-helm-diff-and-helm-git-2ngb