DevOps&SRE Library – Telegram
A library of articles on DevOps and SRE.

Advertising: @ostinostin
Content: @mxssl

RKN (Roskomnadzor) registration: https://www.gosuslugi.ru/snet/67704b536aa9672b963777b3
tailpipe

Tailpipe is the lightweight, developer-friendly way to query logs.

Cloud logs, SQL insights. Collects logs from cloud, container and application sources. Query and analyze your data instantly with the power of SQL, right from your terminal.

Fast, local, and efficient. Runs locally, powered by DuckDB's in-memory analytics and Parquet's optimized storage.

An ecosystem of prebuilt intelligence. MITRE ATT&CK-aligned queries, prebuilt detections, benchmarks, and dashboards, all open source and community-driven.

Built to build with. Define detections as code, extend functionality with plugins and write custom SQL queries.


https://github.com/turbot/tailpipe
mdq

Like jq, but for Markdown: find specific elements in a Markdown document.


https://github.com/yshavit/mdq
yaak

Yaak is a desktop API client for interacting with REST, GraphQL, Server-Sent Events (SSE), WebSocket, and gRPC APIs.


https://github.com/mountain-loop/yaak
It's a log eat log world!

Let's discuss logging - unstructured, structured and canonical log lines - what they are and what value they bring to your production systems.
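
As a rough illustration of the canonical log line idea (one structured line per request that carries all of its key fields), here is a minimal Python sketch. The field names and the `canonical_log_line` helper are assumptions made for this example and are not taken from the linked article.

```python
import json


def canonical_log_line(request_id, method, path, status, duration_ms, **extra):
    """Build one JSON log line that summarizes a whole request.

    All field names here are hypothetical; pick the ones your system needs.
    """
    fields = {
        "request_id": request_id,
        "http_method": method,
        "http_path": path,
        "http_status": status,
        "duration_ms": duration_ms,
    }
    # Extra context (user, feature flags, DB query count) goes on the same line.
    fields.update(extra)
    return json.dumps(fields, sort_keys=True)


# Emit a single line at the end of the request instead of many scattered ones.
line = canonical_log_line("req-1", "GET", "/health", 200, 12, user_id="u-42")
print(line)
```

Because every field sits on one line, a single query can filter by status, path, and user at the same time without joining separate log entries.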


https://obakeng.substack.com/p/its-a-log-eat-log-world
Redis as a Primary Database for Complex Applications

How can Redis be used as a primary database for complex applications that need to store data in multiple formats?


https://faun.pub/redis-as-a-primary-database-for-complex-applications-501ced31f923
Slicing Up—and Iterating on—SLOs

One of the main pieces of advice about Service Level Objectives (SLOs) is that they should focus on the user experience. Invariably, this leads to people further down the stack asking, “But how do I make my work fit the users?”—to which the answer is to redefine what we mean by “user.” In the end, a user is anyone who uses whatever it is you’re measuring.


https://www.honeycomb.io/blog/slicing-up-and-iterating-on-slos
Unlocking Kubernetes Observability with the OpenTelemetry Operator

https://www.dash0.com/blog/unlocking-kubernetes-observability-with-the-opentelemetry-operator
terraform-backend-git

A Terraform HTTP backend implementation that uses a Git repository as storage.


https://github.com/plumber-cd/terraform-backend-git
tfbuddy

TFBuddy allows Terraform Cloud users to get apply-before-merge workflows in their Pull Requests.


https://github.com/zapier/tfbuddy
Ensuring Effective Helm Charts with Linting, Testing, and Diff Checks

https://dev.to/hkhelil/ensuring-effective-helm-charts-with-linting-testing-and-diff-checks-ni0
Metal3

Metal3 (pronounced “metal cubed”) is an open-source project that provides a set of tools for managing bare-metal infrastructure using Kubernetes.


https://metal3.io
autotune

Kruize Autotune is an autonomous performance tuning tool for Kubernetes. Autotune accepts a user-provided "slo" goal to optimize application performance. It uses Prometheus to identify the "layers" of an application that it is monitoring and matches tunables from those layers to the user-provided slo. It then runs experiments with the help of a hyperparameter optimization framework to find values for the identified set of tunables that give a better result for the user-provided slo.

Autotune can take an arbitrarily large set of tunables and run experiments to continually optimize the user-provided slo in incremental steps. For this reason, it does not necessarily have a "best" value for a set of tunables, only a "better" one than what is currently deployed.
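
The "better, not best" idea can be sketched as a simple experiment loop. This is a minimal illustration using plain random search, not Autotune's actual optimizer or API; the `tune` function, the tunable names, and the toy latency model are all hypothetical.

```python
import random


def tune(objective, current, ranges, trials=50, seed=0):
    """Run random-search experiments and keep any config that beats the current one.

    objective: maps a dict of tunables to a score (lower is better).
    current:   the currently deployed tunables.
    ranges:    {name: (low, high)} search bounds for each tunable.
    """
    rng = random.Random(seed)
    best, best_score = dict(current), objective(current)
    for _ in range(trials):
        candidate = {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}
        score = objective(candidate)
        if score < best_score:  # only ever move to a "better" config
            best, best_score = candidate, score
    return best, best_score


def toy_latency(cfg):
    # Hypothetical stand-in for a measured response time.
    return (cfg["heap_gb"] - 4.0) ** 2 + (cfg["threads"] - 16.0) ** 2


deployed = {"heap_gb": 1.0, "threads": 4.0}
bounds = {"heap_gb": (0.5, 8.0), "threads": (1.0, 64.0)}
better, score = tune(toy_latency, deployed, bounds)
```

The loop never claims to have found the optimum. It only guarantees that the returned configuration scores no worse than the one it started from.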


https://github.com/kruize/autotune
kubeip

Kubernetes nodes don't necessarily need their own public IP addresses to communicate. However, there are certain situations where it's beneficial for nodes in a node pool to have their own unique public IP addresses.

For instance, in gaming applications, a console might need to establish a direct connection with a cloud virtual machine to reduce the number of hops.

Similarly, if you have multiple agents running on Kubernetes that need a direct server connection, and the server needs to whitelist all agent IPs, having dedicated public IPs can be useful. These scenarios, among others, can be handled on a cloud-managed Kubernetes cluster using Node Public IP.

KubeIP is a utility that assigns a static public IP to each node it manages. The IP is allocated to the node's primary network interface, chosen from a pool of reserved static IPs using platform-supported filtering and ordering.

If there are no static public IPs left, KubeIP will hold on until one becomes available. When a node is removed, KubeIP releases the static public IP back into the pool of reserved static IPs.
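
The assign, wait, and release behavior described above can be modeled with a small in-memory pool. This is a sketch of the idea only, not KubeIP's code: the `StaticIPPool` class is hypothetical, the ordering is a plain string sort, and the addresses come from a documentation range.

```python
from typing import Dict, List, Optional


class StaticIPPool:
    """Toy model of a pool of reserved static IPs shared by nodes."""

    def __init__(self, reserved_ips: List[str]) -> None:
        self._free = sorted(reserved_ips)      # assumed ordering: plain string sort
        self._assigned: Dict[str, str] = {}    # node name -> IP

    def assign(self, node: str) -> Optional[str]:
        """Return an IP for the node, or None if the pool is exhausted."""
        if node in self._assigned:
            return self._assigned[node]
        if not self._free:
            return None  # caller is expected to retry until an IP is released
        ip = self._free.pop(0)
        self._assigned[node] = ip
        return ip

    def release(self, node: str) -> None:
        """Return the node's IP to the pool when the node is removed."""
        ip = self._assigned.pop(node, None)
        if ip is not None:
            self._free.append(ip)
            self._free.sort()


pool = StaticIPPool(["203.0.113.10", "203.0.113.11"])
first = pool.assign("node-a")
second = pool.assign("node-b")
waiting = pool.assign("node-c")   # pool is empty, so node-c has to wait
pool.release("node-a")            # node-a is removed, its IP returns to the pool
retry = pool.assign("node-c")
```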


https://github.com/doitintl/kubeip
The case of the vanishing CPU: A Linux kernel debugging story

A mysterious CPU spike in ClickHouse Cloud on GCP led to months of debugging, revealing a deeper issue within the Linux kernel’s memory management. What started as random performance degradation turned into a deep dive into kernel internals, where engineer Sergei Trifonov uncovered a hidden livelock. His journey through eBPF tracing, perf analysis, and a reproducible test case ultimately led to a surprising fix, only for another kernel bug to surface right after. Curious? Read on…


https://clickhouse.com/blog/a-case-of-the-vanishing-cpu-a-linux-kernel-debugging-story
pgrouting

pgRouting extends the PostGIS/PostgreSQL geospatial database to provide geospatial routing and other network analysis functionality.


https://github.com/pgRouting/pgrouting
rsql

rsql is a command-line SQL interface for data. It is a modern, feature-rich, and user-friendly client designed to be easy to use and to provide a consistent experience across all supported data sources.


https://github.com/theseus-rs/rsql
postgresql-embedded

Install and run a PostgreSQL database locally on Linux, macOS, or Windows. PostgreSQL can be bundled with your application or downloaded on demand.


https://github.com/theseus-rs/postgresql-embedded
wait4x

Wait4X is a powerful, zero-dependency tool that waits for services to be ready before continuing.
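
To show the general "wait until ready" pattern such a tool automates, here is a minimal Python sketch that polls a TCP port until it accepts connections or a deadline passes. It does not use or reproduce Wait4X itself; the `wait_for_tcp` function and its parameters are assumptions made for this example.

```python
import socket
import time


def wait_for_tcp(host, port, timeout=10.0, interval=0.5):
    """Poll host:port until a TCP connection succeeds or the timeout expires.

    Returns True once the port accepts a connection, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect is treated as "service is ready".
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example use in a startup script (host and port are placeholders):
# if not wait_for_tcp("127.0.0.1", 5432, timeout=30):
#     raise SystemExit("database did not become ready in time")
```

An open port only proves that something is listening. A real readiness check often goes further, for example by issuing an HTTP request or a database query.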


https://github.com/atkrad/wait4x