88 views 23:46

https://reliabilityengineering.substack.com/p/things-that-makes-a-good-site-reliability?utm_campaign=posts-open-in-app&triedRedirect=true

Reliability Engineering

Things That Makes a Good Site Reliability Engineer

The success of an SRE hinges not only on technical expertise but also on soft skills and work habits. Here are some essential practices and traits that can make you an exemplary SRE. Time Management: Effective time management is the cornerstone of any successful…

90 views 11:33

DevOps drawer

https://2023.nixcon.org/recordings/

all video recordings from the last NixCON 2023, Darmstadt, DE

86 views edited 09:52

DevOps drawer

http://abstraction.blog/2023/06/13/cloud-alerting-strategy

Alerting is an essential step of monitoring. Monitoring gives you visibility into the health of your systems. The benefits of alerting are:
• An alert can contain enough contextual information to help us quickly get started on diagnostic activities.
• Alerting can be used to invoke remediation functions such as autoscaling.
• Alerts can also enable cost-awareness by watching budgets and limits.
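A minimal sketch of the first and third points, assuming a hypothetical budget check. The `Alert` type, the `check_budget` function, the field names, and the 80% warning ratio are illustrative assumptions and do not come from the linked article:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Alert:
    """Hypothetical alert payload that carries context for diagnosis."""
    name: str
    severity: str
    summary: str
    context: dict = field(default_factory=dict)


def check_budget(service: str, spent: float, budget: float,
                 warn_ratio: float = 0.8) -> Optional[Alert]:
    """Return an Alert once spend reaches warn_ratio of the budget, else None."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    ratio = spent / budget
    if ratio < warn_ratio:
        return None
    # At or above 100% of the budget the alert is escalated.
    severity = "critical" if ratio >= 1.0 else "warning"
    return Alert(
        name="BudgetThresholdExceeded",
        severity=severity,
        summary=f"{service} has used {ratio:.0%} of its budget",
        # Context fields let a responder start diagnosing without extra lookups.
        context={"service": service, "spent": spent, "budget": budget},
    )
```

The returned object could be handed to whatever notification or remediation hook a team already uses; that wiring is left out here on purpose.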

Abstraction.blog

An Alerting strategy for the cloud

There aren't many articles out there on alerting strategies. I found that out when I was developing one myself to implement a robust alerting system. It's been a couple of years since then and not much has changed. Some gems of knowledge on alerting remain…

95 views 13:06

DevOps drawer

https://goteleport.com/blog/kubernetes-audit-logging/

In this guide, you’ll learn the basics of Kubernetes audit logging, as well as advice for how to set it up and choose an appropriate backend. You’ll also learn about best practices for getting the most value from the processes.

Goteleport

6 Best Practices for Kubernetes Audit Logging

A list of best practices for Kubernetes auditing, starting with guidelines for how to create a solid auditing policy foundation.

119 views 13:17

DevOps drawer

paperless-ngx

A community-supported supercharged version of paperless: scan, index and archive all your physical documents

https://github.com/paperless-ngx/paperless-ngx

87 views 07:47

DevOps drawer

Feature Flags vs. Feature Management: A Technical Deep Dive for SREs

https://www.cloudbees.com/blog/feature-flag-vs-feature-management

80 views 07:48

DevOps drawer

kubeseal-convert

A tool for importing secrets from pre-existing secrets management systems (e.g. Vault, Secrets Manager) into a SealedSecret.

https://github.com/EladLeev/kubeseal-convert

80 views 07:48

DevOps drawer

krr

Robusta KRR (Kubernetes Resource Recommender) is a CLI tool for optimizing resource allocation in Kubernetes clusters. It gathers pod usage data from Prometheus and recommends requests and limits for CPU and memory. This reduces costs and improves performance.
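The general idea of turning usage samples into requests and limits can be sketched as follows. This is an illustrative assumption and not KRR's actual strategy: the `percentile` and `recommend` helpers, the 95th-percentile CPU rule, and the 15% memory buffer are all hypothetical choices, and the samples are passed in directly instead of being queried from Prometheus.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def recommend(cpu_cores, memory_mib, cpu_pct=95, mem_buffer=0.15):
    """Suggest a CPU request and a memory request/limit from usage samples.

    Assumed rule (hypothetical): CPU request is a high percentile of usage,
    memory is the observed peak plus a safety buffer.
    """
    cpu_request = percentile(cpu_cores, cpu_pct)
    mem = math.ceil(max(memory_mib) * (1 + mem_buffer))
    return {
        "cpu_request": cpu_request,
        "memory_request_mib": mem,
        "memory_limit_mib": mem,
    }
```

Using a percentile for CPU tolerates short spikes, while sizing memory from the peak is more conservative because exceeding a memory limit gets the container killed.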

https://github.com/robusta-dev/krr

77 views 07:48

DevOps drawer

End-to-end Testing of Kubernetes Resources with the e2e-framework

https://medium.com/programming-kubernetes/end-to-end-testing-of-kubernetes-resources-with-the-e2e-framework-ac52e7e58db8

63 views 07:49

DevOps drawer

Understand how graceful shutdown can achieve zero downtime during k8s rolling update

https://dev.to/yutaroyamanaka/understand-how-graceful-shutdown-can-achieve-zero-downtime-during-k8s-rolling-update-15eh

63 views 07:49

DevOps drawer

In modern cloud-native environments, Kafka consumers are increasingly deployed within Kubernetes. This setup offers benefits in scalability and deployment ease but also introduces the need for sophisticated scaling strategies that can adapt to the volatile nature of Kafka’s data streams.
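A simplified sketch of lag-based scaling for a Kafka consumer group, to make the idea concrete. The `desired_replicas` function and its parameter names are hypothetical, and real autoscalers such as KEDA apply additional rules (cooldowns, per-partition lag, activation thresholds) that are not modelled here:

```python
import math


def desired_replicas(total_lag, lag_threshold, partitions,
                     min_replicas=0, max_replicas=100):
    """Estimate how many consumers are needed to work off the current lag.

    total_lag:     sum of consumer lag across all partitions (messages)
    lag_threshold: lag one consumer is expected to handle (assumed tuning knob)
    partitions:    partition count of the topic
    """
    if lag_threshold <= 0:
        raise ValueError("lag_threshold must be positive")
    wanted = math.ceil(total_lag / lag_threshold)
    # Within one consumer group, consumers beyond the partition count sit idle.
    wanted = min(wanted, partitions)
    return max(min_replicas, min(wanted, max_replicas))
```

For example, a lag of 250 messages with a threshold of 100 asks for 3 consumers, while a lag of 1000 on a 6-partition topic is capped at 6.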

https://kedify.io/resources/blog/keda-kafka-improve-performance-by-62-15-at-peak-loads/

Kedify

KEDA + Kafka: Improve performance by 62.15% at peak loads | Kedify

Cut cloud costs by 20%+, auto‑scale any workloads including HTTP, gRPC & ML workloads, and gain centralized multi‑cluster control and insights.

67 views 07:50

DevOps drawer

How Wise reduced AWS RDS maintenance downtimes from 10 minutes to 100 milliseconds is an interesting story for those who do DB operations.

From time to time, it's necessary to apply changes that require downtime. However, it's unacceptable to have long "maintenance windows" nowadays. So, one has to be creative.

#dba #mariadb

Medium

How Wise reduced AWS RDS maintenance downtimes from 10 minutes to 100 milliseconds

A story of a fruitful collaboration between Site Reliability and Database Engineering teams

84 views 12:04

DevOps drawer

dotenvx

a better dotenv–from the creator of `dotenv`

https://github.com/dotenvx/dotenvx

86 views 21:59

DevOps drawer

Kafka 101

Originally developed at LinkedIn in 2011, Apache Kafka is one of the most popular open-source Apache projects out there. So far it has had a total of 24 notable releases and, most intriguingly, its code base has grown at an average rate of 24% with each of those releases.

https://highscalability.com/unnoscriptd-2

99 views 16:07

DevOps drawer

Becoming a Senior Site Reliability Engineer: A Guide to Upskilling

Learn how to upskill yourself to become a senior site reliability engineer

https://reliabilityengineering.substack.com/p/becoming-a-senior-site-reliability

114 views 06:45

DevOps drawer

https://github.com/qdm12/gluetun

GitHub

GitHub - qdm12/gluetun: VPN client in a thin Docker container for multiple VPN providers, written in Go, and using OpenVPN or Wireguard…

VPN client in a thin Docker container for multiple VPN providers, written in Go, and using OpenVPN or Wireguard, DNS over TLS, with a few proxy servers built-in. - qdm12/gluetun

89 views 09:18

DevOps drawer

Tetragon is a flexible Kubernetes-aware security observability and runtime enforcement tool that applies policy and filtering directly with eBPF, allowing for reduced observation overhead, tracking of any process, and real-time enforcement of policies.

https://tetragon.io/

Tetragon - eBPF-based Security Observability and Runtime Enforcement

Tetragon is a sub-project under Cilium and a proud CNCF project. eBPF-based Security Observability and Runtime Enforcement. Tetragon is a flexible Kubernetes-aware security observability and runtime enforcement tool that applies policy and filtering directly…

90 views 20:49

DevOps drawer

In this article, the Exness SOC (Security Operations Center) team shares approaches to monitoring and detecting threats in the K8s environment.

https://scribe.rip/exness-blog/threat-detection-in-the-k8s-environment-d5fdcd88a094

70 views 20:54

DevOps drawer

GPU Virtualization in K8s: Challenges and State of the Art

Kubernetes schedules GPU workloads by assigning a whole device to a single job exclusively. This one-to-one relationship leads to massive GPU underutilization, especially for interactive jobs, characterized by significant idle periods and infrequent bursts of heavy GPU usage. Current solutions enable GPU sharing by statically assigning a fixed slice of GPU memory to each co-located job. These solutions are not suitable for interactive scenarios since the number of co-located jobs is limited by the size of physical GPU memory. Consequently, users must know the GPU memory demand of their jobs before submitting them for execution, which is impractical.
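The memory limit on co-location can be illustrated with a short calculation. The `max_colocated_jobs` helper and the GPU and slice sizes used in the example are hypothetical and not taken from the linked article:

```python
def max_colocated_jobs(gpu_memory_gib: float, slice_gib: float) -> int:
    """Number of jobs that fit when each gets a fixed slice of GPU memory."""
    if gpu_memory_gib < 0 or slice_gib <= 0:
        raise ValueError("gpu_memory_gib must be >= 0 and slice_gib must be > 0")
    # Static slicing: a job holds its slice even while idle, so only
    # whole slices count toward capacity.
    return int(gpu_memory_gib // slice_gib)
```

With an assumed 16 GiB device and 4 GiB slices, `max_colocated_jobs(16, 4)` gives 4 jobs, regardless of how much of each slice the jobs actually use while idle.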

https://www.arrikto.com/blog/gpu-virtualization-in-k8s-challenges-and-state-of-the-art

69 views 18:58

DevOps drawer

Kubernetes Events — News feed of your cluster

Understand Kubernetes Events and learn to use kubectl events to monitor and troubleshoot your cluster’s issues effectively.

https://decisivedevops.com/kubernetes-events-news-feed-of-your-kubernetes-cluster-826e08892d7a

68 views 19:02
