DevOps&SRE Library – Telegram
A library of articles on DevOps and SRE.

Advertising: @ostinostin
Content: @mxssl

RKN (Roskomnadzor): https://www.gosuslugi.ru/snet/67704b536aa9672b963777b3
Product management is broken. Engineers can fix it

How we've redefined the PM and engineer relationship


https://newsletter.posthog.com/p/product-management-is-broken-engineers
feluda

🔎 Feluda is a Rust-based command-line tool that analyzes the dependencies of a project, notes down their licenses, and flags any license terms that restrict personal or commercial usage.


https://github.com/anistark/feluda
pgwatch

PGWATCH: PostgreSQL metrics monitor/dashboard


https://github.com/cybertec-postgresql/pgwatch
Postgres in the time of monster hardware

I don't know if you followed the release of the last generation of CPUs. A dual-socket server built on AMD's latest EPYC CPU (the AMD EPYC 9965) can run 768 threads: 192 cores per socket, 2 threads per core, and 2 sockets. Imagine adding 10 TB of RAM to such a beast! Of course, everyone will think of how useful it will be for virtualization. As a database person, I'd rather ask myself what Postgres could do with so many resources. I love simplicity in architecture, but I often meet customers with huge resource needs. With average hosts nowadays, the best answer for them is sometimes massively parallel processing (MPP).
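
The thread count quoted above follows from simple multiplication. A minimal sketch of that arithmetic, using only the figures given in the post (the variable names are illustrative):

```python
# Thread count for the dual-socket server described in the post.
cores_per_socket = 192
threads_per_core = 2
sockets = 2

# Hardware threads available on one socket, then on the whole machine.
threads_per_socket = cores_per_socket * threads_per_core
total_threads = threads_per_socket * sockets

print(threads_per_socket)  # 384
print(total_threads)       # 768
```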

So, with this new hardware, can we stop using horizontal scalability? To understand the impact of running PostgreSQL on it, we must examine a few technical limits. The analysis begins with NUMA (Non-Uniform Memory Access) architecture. Next, we address I/O bandwidth limits, which are a big factor no matter the CPU or memory. Then we look at how PostgreSQL behaves with many connections, a topic whose historical limits bring up key questions. Finally, we test parallel queries and examine their scalability and effectiveness on systems with many CPU threads.


https://www.enterprisedb.com/blog/postgres-time-monster-hardware
Nping

Nping is a Ping tool developed in Rust. It supports concurrent Ping for multiple addresses, visual chart display, real-time data updates, and other features.


https://github.com/hanshuaikang/Nping
kvm

JetKVM is a high-performance, open-source KVM over IP (Keyboard, Video, Mouse) solution designed for efficient remote management of computers, servers, and workstations. Whether you're dealing with boot failures, installing a new operating system, adjusting BIOS settings, or simply taking control of a machine from afar, JetKVM provides the tools to get it done effectively.


https://github.com/jetkvm/kvm
tailpipe

Tailpipe is the lightweight, developer-friendly way to query logs.

Cloud logs, SQL insights. Collects logs from cloud, container and application sources. Query and analyze your data instantly with the power of SQL, right from your terminal.

Fast, local, and efficient. Runs locally, powered by DuckDB's in-memory analytics and Parquet's optimized storage.

An ecosystem of prebuilt intelligence. MITRE ATT&CK-aligned queries, prebuilt detections, benchmarks, and dashboards, all open source and community-driven.

Built to build with. Define detections as code, extend functionality with plugins and write custom SQL queries.


https://github.com/turbot/tailpipe
mdq

Like jq but for Markdown: find specific elements in a Markdown document.


https://github.com/yshavit/mdq
yaak

Yaak is a desktop API client for interacting with REST, GraphQL, Server Sent Events (SSE), WebSocket, and gRPC APIs.


https://github.com/mountain-loop/yaak
It's a log eat log world!

Let's discuss logging - unstructured, structured and canonical log lines - what they are and what value they bring to your production systems.
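
To make the three styles named above concrete, here is a rough sketch in Python. It is not taken from the linked article: the function names, field names, and the "canonical-log-line" prefix are all hypothetical, and the canonical line is shown simply as one key=value summary emitted once per request.

```python
import json

def unstructured(user_id, status, duration_ms):
    # Free-form text: easy to read, hard to query reliably.
    return f"user {user_id} got {status} in {duration_ms}ms"

def structured(event, **fields):
    # One JSON object per event: every field is machine-parseable.
    return json.dumps({"event": event, **fields}, sort_keys=True)

def canonical_line(**fields):
    # A single summary line per request that gathers its key facts
    # (hypothetical field names) in one place.
    pairs = " ".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"canonical-log-line {pairs}"

print(unstructured(42, 200, 12))
print(structured("request_finished", status=200, duration_ms=12))
print(canonical_line(path="/checkout", status=200, duration_ms=12, user_id=42))
```

The design point is that the canonical line is emitted once at the end of a request, so a single query over it can answer most questions without joining many smaller log entries.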


https://obakeng.substack.com/p/its-a-log-eat-log-world
Redis as a Primary Database for Complex Applications

How can Redis be used as a primary database for complex applications that need to store data in multiple formats?


https://faun.pub/redis-as-a-primary-database-for-complex-applications-501ced31f923
Slicing Up—and Iterating on—SLOs

One of the main pieces of advice about Service Level Objectives (SLOs) is that they should focus on the user experience. Invariably, this leads to people further down the stack asking, “But how do I make my work fit the users?”—to which the answer is to redefine what we mean by “user.” In the end, a user is anyone who uses whatever it is you’re measuring.
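
One way to picture "a user is anyone who uses whatever it is you're measuring" is to compute the same indicator separately for each consumer of a service. The sketch below is an assumption-laden illustration, not the article's method: the consumer names, the sample data, and the 99.9% target are all hypothetical.

```python
from collections import defaultdict

SLO_TARGET = 0.999  # hypothetical availability target

def sli_by_consumer(events):
    """Return the fraction of successful requests per consumer."""
    counts = defaultdict(lambda: [0, 0])  # consumer -> [good, total]
    for consumer, ok in events:
        counts[consumer][0] += 1 if ok else 0
        counts[consumer][1] += 1
    return {c: good / total for c, (good, total) in counts.items()}

def meets_slo(events, target=SLO_TARGET):
    """Report, per consumer, whether its success ratio meets the target."""
    return {c: sli >= target for c, sli in sli_by_consumer(events).items()}

# Hypothetical traffic: an internal service and a batch job are both "users".
events = (
    [("checkout-service", True)] * 1999 + [("checkout-service", False)]
    + [("batch-job", True)] * 90 + [("batch-job", False)] * 10
)
print(meets_slo(events))
```

Slicing this way shows that one consumer can be well within the objective while another is far outside it, which a single aggregate number would hide.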


https://www.honeycomb.io/blog/slicing-up-and-iterating-on-slos
Unlocking Kubernetes Observability with the OpenTelemetry Operator

https://www.dash0.com/blog/unlocking-kubernetes-observability-with-the-opentelemetry-operator
terraform-backend-git

A Terraform HTTP backend implementation that uses a Git repository as storage.


https://github.com/plumber-cd/terraform-backend-git
tfbuddy

TFBuddy allows Terraform Cloud users to get apply-before-merge workflows in their Pull Requests.


https://github.com/zapier/tfbuddy
Ensuring Effective Helm Charts with Linting, Testing, and Diff Checks

https://dev.to/hkhelil/ensuring-effective-helm-charts-with-linting-testing-and-diff-checks-ni0
Metal3

Metal3 (pronounced “metal cubed”) is an open-source project that provides a set of tools for managing bare-metal infrastructure using Kubernetes.


https://metal3.io
autotune

Kruize Autotune is an autonomous performance tuning tool for Kubernetes. Autotune accepts a user-provided "slo" goal to optimize application performance. It uses Prometheus to identify the "layers" of an application that it is monitoring and matches tunables from those layers to the user-provided slo. It then runs experiments with the help of a hyperparameter optimization framework to arrive at better values for the identified set of tunables, improving the result for the user-provided slo.

Autotune can take an arbitrarily large set of tunables and run experiments to continually optimize the user provided slo in incremental steps. For this reason, it does not necessarily have a "best" value for a set of tunables, only a "better" one than what is currently deployed.
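
The "better, not best" idea can be sketched as an experiment loop that only replaces the deployed configuration when a trial improves on it. This is not Autotune's implementation: plain random search stands in for the hyperparameter optimization framework, and the tunables (`heap_mb`, `threads`) and the `measure_latency_ms` function are hypothetical stand-ins for real experiments.

```python
import random

def measure_latency_ms(heap_mb, threads):
    """Hypothetical stand-in for running an experiment and reading a metric."""
    return (heap_mb - 512) ** 2 / 1000 + (threads - 8) ** 2 + 20

def tune(current, trials, seed=0):
    """Keep the current config unless a trial yields a lower latency."""
    rng = random.Random(seed)
    best = dict(current)
    best_score = measure_latency_ms(**best)
    for _ in range(trials):
        # Random search: sample a candidate from the tunable ranges.
        candidate = {
            "heap_mb": rng.randint(128, 1024),
            "threads": rng.randint(1, 32),
        }
        score = measure_latency_ms(**candidate)
        if score < best_score:  # accept only improvements
            best, best_score = candidate, score
    return best, best_score

deployed = {"heap_mb": 128, "threads": 1}
tuned, tuned_score = tune(deployed, trials=50)
print(tuned, tuned_score)
```

Because the loop stops after a fixed number of trials, it never claims to have found the optimum; it only guarantees that the returned configuration is no worse than the one it started with.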


https://github.com/kruize/autotune