DevOps&SRE Library
A library of articles on DevOps and SRE.

Advertising: @ostinostin
Content: @mxssl

RKN: https://www.gosuslugi.ru/snet/67704b536aa9672b963777b3
How to mount secrets as files or environment variables in Kubernetes

https://itnext.io/how-to-mount-secrets-as-files-or-environment-variables-in-kubernetes-f03d545dcd89
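
From the application's side, the two approaches named in the title differ mainly in where the value is read from: an environment variable, or a file that Kubernetes places in a mounted volume. A minimal Python sketch of a reader that accepts either, assuming a hypothetical variable name `DB_PASSWORD` and a hypothetical mount path `/etc/secrets/db-password` (neither is taken from the linked article):

```python
import os
from pathlib import Path
from typing import Optional


def read_secret(env_name: str, file_path: str) -> Optional[str]:
    """Return a secret from an environment variable, else from a mounted file."""
    # Case 1: the secret was injected as an environment variable.
    value = os.environ.get(env_name)
    if value is not None:
        return value

    # Case 2: the secret was mounted as a file inside the container.
    path = Path(file_path)
    if path.is_file():
        # Strip a trailing newline that some tools add when creating the secret.
        return path.read_text().strip()

    # Neither source is present; let the caller decide how to fail.
    return None


if __name__ == "__main__":
    # Both names below are hypothetical placeholders.
    password = read_secret("DB_PASSWORD", "/etc/secrets/db-password")
    print("secret found" if password is not None else "secret missing")
```

Checking the environment first is an arbitrary choice made for this sketch; an application could equally prefer the file.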
container-startup-autoscaler

container-startup-autoscaler (CSA) is a Kubernetes controller that modifies the CPU and/or memory resources of containers depending on whether they're starting up, according to the startup/post-startup settings you supply. CSA works at the pod level and is agnostic to how the pod is managed; it works with deployments, statefulsets, daemonsets and other workload management APIs.


https://github.com/ExpediaGroup/container-startup-autoscaler
kubectl.nvim

Processes kubectl outputs to enable vim-like navigation in a buffer for your cluster.


https://github.com/Ramilito/kubectl.nvim
falco

Falco is a cloud native runtime security tool for Linux operating systems. It is designed to detect and alert on abnormal behavior and potential security threats in real-time.


https://github.com/falcosecurity/falco
dice

DiceDB is an open-source, fast, reactive, in-memory database optimized for modern hardware. Commonly used as a cache, it offers a familiar interface while enabling real-time data updates through query subscriptions. It delivers higher throughput and lower median latencies, making it ideal for modern workloads.


https://github.com/dicedb/dice
stu

STU is a TUI explorer application for Amazon S3 (AWS S3), written in Rust using ratatui.


https://github.com/lusingander/stu
xan

xan is a command line tool that can be used to process CSV files directly from the shell.


https://github.com/medialab/xan
openproject

OpenProject is the leading open source project management software.


https://github.com/opf/openproject
Beyond “5 Whys”: A Better Way to Learn from Incidents

We all can agree that the most important purpose of a post-incident review (or post-mortem) is to learn from incidents. Implied in this learning is improving the system (people, processes, technology, and their interactions). All my reflections on the “5 Whys” technique refer back to how the technique enhances our learning (or not) from incidents.


https://uptimelabs.io/beyond-5-whys-a-better-way-to-learn-from-incidents
Systematically Terraforming a Brownfield of Cloud Infrastructure

Some thinking, trade-offs, theory building, and method-making one might end up doing, in the course of bringing Infrastructure as Code (IaC) discipline to brownfield (and greenfield) services, at a small regulated fintech company, having a smaller engineering team that serves several business units, including one of India's largest national tax gateways. Only somewhat easier than reading a long compound sentence without pausing for breath. Phew.


https://www.evalapply.org/posts/systems-approach-to-infrastructure-as-code
The Infra to handle 10M Requests in 10 Minutes for $0.0116

In this article, we'll break down the infrastructure required to achieve a target of 10 million requests in 10 minutes, all for around $0.0116. This guide goes beyond basic setup and explores practical considerations for production-ready systems, balancing cost efficiency and high availability.


https://tonywang.io/blog/infra-10m-requests-10-minutes-0.0116
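
To put the headline figures in perspective, they can be converted into a sustained request rate and a cost per million requests. The short Python calculation below uses only the three numbers from the title and assumes the load is spread evenly over the ten minutes, which the article may or may not do:

```python
# Figures taken from the article title.
requests_total = 10_000_000
window_seconds = 10 * 60
total_cost_usd = 0.0116

# Average rate, assuming requests are spread evenly over the window.
rate = requests_total / window_seconds

# Cost normalised to one million requests.
cost_per_million = total_cost_usd / (requests_total / 1_000_000)

print(f"{rate:,.0f} requests/second")
print(f"${cost_per_million:.5f} per million requests")
```

That works out to roughly 16,667 requests per second on average, at about $0.00116 per million requests.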
Understanding Kubernetes Multi-Tenancy: Models, Challenges, and Solutions

https://www.loft.sh/blog/understanding-kubernetes-multi-tenancy-models-challenges-and-solutions
We Threw Away 13 Years of Work for EKS

Thirteen years of running in EC2.

Thirteen years of custom AMIs. Thirteen years of deployment pipelines put together with toothpicks and bubblegum. Thirteen years of launch scripts that really-do-seem-to-be-an-anti-pattern-but-hey-at-least-they-work.

And we threw it all away to run in EKS.

This is the choice we made at GumGum in early 2023, and this blog post covers the problems that led to this insane idea, and why this idea wasn’t so insane after all.


https://medium.com/gumgum-tech/we-threw-away-13-years-of-work-for-eks-b0fd8f53917c
How we avoided an outage caused by running out of IPs in EKS

Solving IP exhaustion in EKS: Avoiding a network outage by implementing custom networking


https://medium.com/adevinta-tech-blog/how-we-avoided-an-outage-caused-by-running-out-of-ips-in-eks-c831ab97d0e4
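
For a rough sense of how quickly a subnet can run out of addresses, the Python sketch below counts the usable IPs in a subnet. It relies on the fact that AWS reserves five addresses in every VPC subnet; the CIDR ranges are hypothetical examples, not values from the linked post, and the sketch ignores addresses already consumed by nodes and other resources:

```python
import ipaddress

# AWS reserves the first four addresses and the last address of each subnet.
AWS_RESERVED_PER_SUBNET = 5


def usable_ips(cidr: str) -> int:
    """Return how many addresses in the subnet can be assigned to resources."""
    network = ipaddress.ip_network(cidr)
    return max(network.num_addresses - AWS_RESERVED_PER_SUBNET, 0)


if __name__ == "__main__":
    # Hypothetical subnet sizes, chosen only for illustration.
    for cidr in ("10.0.0.0/24", "10.0.0.0/19"):
        print(cidr, usable_ips(cidr))
```

With the default VPC CNI setup each pod takes an address from the VPC subnet, so a /24 with 251 usable addresses can fill up long before the nodes run out of CPU or memory.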