DevOps&SRE Library

Mastering Progressive Delivery: Implementing Canary Releases, A/B Testing, and Custom Metrics with Istio and Flagger in Kubernetes

https://ibrahimhkoyuncu.medium.com/mastering-progressive-delivery-implementing-canary-releases-a-b-testing-and-custom-metrics-with-373a21918c9e

3.67K views, 07:04

DevOps&SRE Library

Streamlining Multi-Cluster Deployments with FluxCD, GitOps, Helm, and Kustomize

https://blog.dasburo.com/streamlining-multi-cluster-deployments-with-fluxcd-gitops-helm-and-kustomize-38beffdef29b

3.49K views, 15:02

DevOps&SRE Library

Kubernetes Scheduling - The Complete Guide

In this guide, we'll break down the essentials of scheduling in Kubernetes. We'll explore how the scheduler works behind the scenes, the techniques used to optimize pod placement, and the best practices to ensure your applications run smoothly.

https://blog.kubesimplify.com/kubernetes-scheduling-the-complete-guide

3.76K views, 07:03

DevOps&SRE Library

Building a Custom Kubernetes Scheduler Plugin: Scheduling Based on Pod-Specific Node Affinity

https://blog.stackademic.com/building-a-custom-kubernetes-scheduler-plugin-scheduling-based-on-pod-specific-node-affinity-7f66b6c607f9

3.53K views, 15:03

DevOps&SRE Library

How we migrated onto K8s in less than 12 months

Migrating onto Kubernetes can take years. Here’s why we decided it was worth undertaking, and how we moved a majority of our core services in less than 12 months, all while making our compute platform easier to use.

https://www.figma.com/blog/migrating-onto-kubernetes

3.67K views, 07:02

DevOps&SRE Library

Everything You Ever Wanted to Know About Deletion and Argo CD Finalizers but Were Afraid to Ask

https://codefresh.io/blog/argocd-application-deletion-finalizers

3.47K views, 15:06

DevOps&SRE Library

How to profile and debug a Java application running on Kubernetes

https://spoud-io.medium.com/how-to-profile-and-debug-a-java-application-running-on-kubernetes-34cdb1f0c05e

3.83K views, 07:03

DevOps&SRE Library

We Can Resize Pods without Restarts! Or Can't We?

https://www.perfectscale.io/blog/resize-pods-without-restarts

3.82K views, 15:04

DevOps&SRE Library

Migrating from Istio to Linkerd

In this guide we'll walk you through a task that is increasingly common in the Kubernetes space: migrating an existing Istio deployment to Linkerd. We'll start with a general overview of our recommended strategy for approaching this task, and then dig into some of the gory details.

The good news is that most of the time, this is a pretty straightforward task that primarily consists of "removing lots of unnecessary Istio configuration". But as with all such changes, it can get a little hairy depending on the specifics of what your application does and how tightly you've (possibly accidentally) built dependencies to Istio's behavior. Happily, there is an incremental way to approach your migration which can help reduce overall risk—we'll talk about that below. Either way, be sure to read through carefully and think through your plan and strategy before you dive right in.

https://buoyant.io/blog/migrating-from-istio-to-linkerd

4.21K views, 07:03

DevOps&SRE Library

gemini

Gemini is a Kubernetes CRD and operator for managing VolumeSnapshots. This allows you to create a snapshot of the data on your PersistentVolumes on a regular schedule, retire old snapshots, and restore snapshots with minimal downtime.

https://github.com/FairwindsOps/gemini

5.3K views, 15:07

DevOps&SRE Library

television

Television is a fast and versatile fuzzy finder TUI.

It lets you quickly search through any kind of data source (files, git repositories, environment variables, docker images, you name it) using a fuzzy matching algorithm and is designed to be easily extensible.

https://github.com/alexpasmantier/television

3.73K views, 07:00

DevOps&SRE Library

Pushing the whole company into the past on purpose

Every six months or so, this neat group called the International Earth Rotation Service issues a directive on whether there will be a leap second inserted at the end of that six month period. You usually find out at the beginning of January or the beginning of July, and thus would have a leap second event at the end of June or December, respectively.

https://rachelbythebay.com/w/2025/01/09/lag

4.03K views, 15:03

DevOps&SRE Library

The SRE Report 2025 Catchpoint.pdf

3.3 MB

The SRE Report 2025

3.98K views, 07:02

DevOps&SRE Library

documentdb

DocumentDB offers a native implementation of document-oriented NoSQL database, enabling seamless CRUD operations on BSON data types within a PostgreSQL framework. Beyond basic operations, DocumentDB empowers you to execute complex workloads, including full-text searches, geospatial queries, and vector embeddings on your dataset, delivering robust functionality and flexibility for diverse data management needs.

PostgreSQL is a powerful, open source object-relational database system that uses and extends the SQL language combined with many features that safely store and scale the most complicated data workloads.

https://github.com/microsoft/documentdb

3.58K views, 15:06

DevOps&SRE Library

Kafka vs NATS: A Comparison for Message Processing

Kafka and NATS are both popular tools for message processing. This article provides a comparison between Kafka and NATS.

https://dzone.com/articles/kafka-vs-nats-message-processing

5.31K views, 07:05

DevOps&SRE Library

Go All the Way: Why Golang is Your Swiss Army Knife for Modern Development

At Oodle's inception, we faced a common dilemma: choosing the right technology stack to get started. With a small team proficient in Go and a big vision, we needed a language that could handle everything from application development to infrastructure management. After careful consideration, we chose Go, and it has proven to be our Swiss Army knife for modern development. Here's why.

https://blog.oodle.ai/go-all-the-way-why-golang-is-your-swiss-army-knife-for-modern-development

4.03K views, 15:07

DevOps&SRE Library

tfmv

tfmv is a CLI to rename Terraform resources, data sources, and modules and generate moved blocks.

https://github.com/suzuki-shunsuke/tfmv

3.72K views, 07:06

DevOps&SRE Library

Forwarded from Performance matters!

TCP congestion control algorithms (often loosely called flow control) exist to prevent network congestion and data loss.

Research in this area has never stopped, and today there are many options to choose from:

* Reno (1986)
* New Reno (1999)
* CUBIC (2004)
* FAST TCP (2005)
* BBRv1 (2016)
* BBRv2 (2019)
* BBRv3 (2023)
* ...

Linux uses CUBIC by default. However, the creators of BBR (Google) publish intriguing research in which they conclude:

BBR enables big throughput improvements on high-speed, long-haul links...
BBR enables significant reductions in latency in last-mile networks that connect users to the internet...

So should we simply switch to the new algorithm?

A better question is probably this: in which cases is each algorithm preferable?
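
Before comparing algorithms, it helps to know which one a host actually uses. The sketch below assumes a Linux machine: it reads the standard sysctl files under /proc/sys/net/ipv4 and queries the TCP_CONGESTION socket option. The helper names are illustrative and not from the post; on other systems both helpers simply return None.

```python
import socket
from pathlib import Path

# Linux-only location of the TCP sysctl files (assumption: the host runs Linux).
PROC_DIR = Path("/proc/sys/net/ipv4")


def read_sysctl(name):
    """Return the stripped contents of a TCP sysctl file, or None if unavailable."""
    try:
        return (PROC_DIR / name).read_text().strip()
    except OSError:
        return None


def socket_congestion_control():
    """Return the algorithm assigned to a fresh TCP socket, or None if unsupported."""
    opt = getattr(socket, "TCP_CONGESTION", None)  # constant exists only on Linux
    if opt is None:
        return None
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            raw = s.getsockopt(socket.IPPROTO_TCP, opt, 16)
    except OSError:
        return None
    # The kernel returns a NUL-padded name such as b"cubic\x00\x00...".
    return raw.split(b"\x00", 1)[0].decode()


if __name__ == "__main__":
    print("system default:", read_sysctl("tcp_congestion_control"))
    print("available:", read_sysctl("tcp_available_congestion_control"))
    print("new socket:", socket_congestion_control())
```

The same socket option can be set per connection with setsockopt, which allows trying another algorithm for a single application without changing the system default, provided the kernel has that algorithm available.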

————

Congestion control algorithms can be roughly divided into two types:
1. Loss-based (driven by packet loss): Reno, NewReno, CUBIC
2. Delay-based (driven by changes in RTT): FAST TCP, BBRv*

The main goal of any congestion control implementation is to use the link's bandwidth as efficiently as possible while balancing transfer speed against congestion avoidance.

The sending rate is regulated through the congestion window: the amount of data that can be sent without receiving an acknowledgment.
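
Since at most one congestion window of data can be in flight per round trip, the window and the RTT together cap the achievable rate at roughly cwnd / RTT. Here is a small sketch of that arithmetic; the function name and the sample numbers are illustrative, not taken from the post.

```python
def max_throughput_bps(cwnd_bytes, rtt_ms):
    """Upper bound on throughput in bits per second for a given window and RTT."""
    # One window per round trip: bytes -> bits, milliseconds -> seconds.
    return cwnd_bytes * 8 * 1000 / rtt_ms


if __name__ == "__main__":
    window = 64 * 1024  # a 64 KiB congestion window
    for rtt_ms in (100, 1):
        mbit = max_throughput_bps(window, rtt_ms) / 1e6
        print(f"RTT {rtt_ms} ms: {mbit:.2f} Mbit/s")
    # RTT 100 ms: 5.24 Mbit/s
    # RTT 1 ms: 524.29 Mbit/s
```

The same window yields a hundred times more throughput at 1 ms than at 100 ms, which is why the way the window grows matters so much on long-haul paths.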

The approaches differ in how they detect congestion.

Loss-based (CUBIC)

Algorithms of this type infer congestion from packet loss.

Did a duplicate ACK arrive, or did the retransmission timeout (RTO) fire? Then packets were lost, so the link is congested and the sender slows down.
After that, the rate increases as ACKs arrive until new losses are detected.

This approach can fill the queues along the path to the limit, which is itself what causes the losses. The response is reactive: congestion is registered only after it has occurred.
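
The resulting sawtooth can be illustrated with a toy simulation. This is a Reno-style additive-increase, multiplicative-decrease (AIMD) loop, not CUBIC itself, which grows the window along a cubic curve. Slow start is left out, and the capacity model (a loss occurs whenever the window exceeds a fixed number of packets) is a simplifying assumption.

```python
def simulate_aimd(capacity, rounds, start=1.0, increase=1.0, decrease=0.5):
    """Return the congestion window (in packets) at the start of each round trip."""
    cwnd = start
    history = []
    for _ in range(rounds):
        history.append(cwnd)
        if cwnd > capacity:
            # The path could not hold the window: a loss is detected, back off.
            cwnd = max(1.0, cwnd * decrease)
        else:
            # Every ACKed round trip grows the window by a fixed step.
            cwnd += increase
    return history


if __name__ == "__main__":
    for rtt_index, cwnd in enumerate(simulate_aimd(capacity=10, rounds=20)):
        print(f"round {rtt_index:2d}: cwnd = {cwnd:6.3f} " + "#" * int(cwnd))
```

The window overshoots the capacity before every back-off, which is the queue-filling behaviour described above: the sender learns about congestion only from the loss it has just caused.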

Delay-based (BBR)

In delay-based algorithms such as BBR, congestion is inferred from changes in latency:
* the minimum RTT (RTT_min) is taken as the baseline;
* if the current RTT (RTT_now) exceeds RTT_min, the algorithm assumes the link is congested and lowers the sending rate.

BBR thus tries to avoid filling the queues, which reduces latency.
Its approach is more preventive: it heads off congestion before it appears.
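
The RTT rule above can be sketched as a toy controller. This implements only the simplified rule as stated here and is not the real BBR, which builds a model from estimated bottleneck bandwidth and minimum RTT and paces packets accordingly. The queueing model (each packet beyond the path capacity adds a proportional share of the base RTT) and all names are assumptions made for illustration.

```python
def simulate_delay_based(capacity, base_rtt_ms, rounds, start=1.0, step=1.0):
    """Return (cwnd, rtt_now) pairs for a toy RTT-driven congestion controller."""
    cwnd = start
    rtt_min = float("inf")
    history = []
    for _ in range(rounds):
        # Packets beyond what the path can hold wait in a queue and add delay.
        queued = max(0.0, cwnd - capacity)
        rtt_now = base_rtt_ms * (1 + queued / capacity)
        rtt_min = min(rtt_min, rtt_now)
        history.append((cwnd, rtt_now))
        if rtt_now > rtt_min:
            cwnd = max(1.0, cwnd - step)  # delay is rising: back off slightly
        else:
            cwnd += step  # no queueing observed: probe for more bandwidth
    return history


if __name__ == "__main__":
    for cwnd, rtt in simulate_delay_based(capacity=10, base_rtt_ms=20, rounds=16):
        print(f"cwnd = {cwnd:4.1f}  rtt = {rtt:5.1f} ms")
```

In this toy model the window hovers around the path capacity and the queue never holds more than one packet, whereas a loss-based sender keeps growing until something is dropped.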

————

CUBIC loses to BBR on networks with high RTT, such as the internet. This is because the rate grows slowly after a loss is detected: ACKs arrive with a delay.

Inside data centers, where RTT is low, CUBIC should do better: fast ACKs speed up the growth of the sending rate.

BBR, in turn, may offer no advantage on such networks. During traffic bursts it lowers the rate to avoid filling the queues, so the link is not fully used. In addition, algorithms can conflict, with one implementation grabbing bandwidth and crowding out the others. Real wars :)

In short, as usual, be careful!

Further reading:
- https://blog.apnic.net/2017/05/09/bbr-new-kid-tcp-block/
- https://book.systemsapproach.org/congestion.html
- https://tcpcc.systemsapproach.org/

tags: #network #tcp

Google Cloud Blog

TCP BBR congestion control comes to GCP – your Internet just got faster | Google Cloud Blog

3.79K views, 10:03

DevOps&SRE Library

GitOps Secrets with Argo CD, Hashicorp Vault and the External Secret Operator

In this post, we showcase the External Secret Operator and Hashicorp Vault and focus on 2 important aspects.

- How to avoid saving ANY secrets in Git, including tokens for fetching the application secrets
- How to refresh secrets automatically without pod restarts and application deployments

https://medium.com/containers-101/gitops-secrets-with-argo-cd-hashicorp-vault-and-the-external-secret-operator-eb1eec1dab0d

4.81K views, 15:06

DevOps&SRE Library

A hands-on lab: Why running as root in Kubernetes containers is dangerous?

https://medium.com/@marcin.wasiucionek/why-is-running-as-root-in-kubernetes-containers-dangerous-e5f1a116080e

3.7K views, 07:01

DevOps&SRE Library

Securing Secrets in Confidential Containers: Usage patterns to avoid

https://itnext.io/securing-secrets-in-confidential-containers-usage-patterns-to-avoid-941388cde546

3.94K views, 15:05
