DevOps & SRE notes
12K subscribers
42 photos
19 files
2.5K links
Helpful articles and tools for DevOps & SRE

WhatsApp: https://whatsapp.com/channel/0029Vb79nmmHVvTUnc4tfp2F

For paid consultation (RU/EN), contact: @tutunak


All ways to support the channel: https://telegra.ph/How-support-the-channel-02-19
In the second part of the DevOps project, the focus is on deploying ArgoCD and monitoring tools such as Prometheus and Grafana to a Kubernetes cluster. The blog post covers installing ArgoCD, deploying Prometheus using Helm charts, setting up monitoring for ArgoCD, visualizing ArgoCD metrics with Grafana dashboards, and continuously deploying applications with ArgoCD. The author also recommends K8sgpt, a tool that analyzes the cluster for errors and potential issues. The next post will cover configuring Alertmanager for notifications, setting up Slack alerts, and installing Loki for logs to round out the monitoring stack.

https://blog.devgenius.io/optimizing-kubernetes-deployments-with-argocd-and-prometheus-aa86c11e2bba
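For reference, a minimal sketch of the installation steps the post walks through, assuming helm is on PATH and kubectl points at the target cluster; the release names and namespaces are illustrative, not taken from the article:

```
# Sketch: install ArgoCD and kube-prometheus-stack with Helm via Python.
# Assumes `helm` is installed and kubectl's current context is the target cluster.
import subprocess

def run(*cmd: str) -> None:
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# ArgoCD from the official Argo Helm repo
run("helm", "repo", "add", "argo", "https://argoproj.github.io/argo-helm")
# Prometheus + Grafana via the kube-prometheus-stack chart
run("helm", "repo", "add", "prometheus-community",
    "https://prometheus-community.github.io/helm-charts")
run("helm", "repo", "update")

run("helm", "upgrade", "--install", "argocd", "argo/argo-cd",
    "--namespace", "argocd", "--create-namespace")
run("helm", "upgrade", "--install", "monitoring",
    "prometheus-community/kube-prometheus-stack",
    "--namespace", "monitoring", "--create-namespace")
```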
A new Terraform version has been released. Importing existing infrastructure into the Terraform state has become easier thanks to config-driven import.
https://www.hashicorp.com/blog/terraform-1-5-brings-config-driven-import-and-checks
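A small sketch of the config-driven import workflow: declare an import block and let `terraform plan -generate-config-out` draft the matching resource configuration. The resource address and bucket name below are hypothetical placeholders.

```
# Sketch: Terraform 1.5 config-driven import, driven from Python.
# The aws_s3_bucket address and bucket id are placeholder assumptions.
import pathlib
import subprocess

IMPORT_BLOCK = """
import {
  to = aws_s3_bucket.assets
  id = "my-existing-bucket"
}
"""

pathlib.Path("import.tf").write_text(IMPORT_BLOCK)
subprocess.run(["terraform", "init"], check=True)
# Writes generated HCL for the imported resource into generated.tf for review
subprocess.run(["terraform", "plan", "-generate-config-out=generated.tf"], check=True)
```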
Streaming alert evaluation scales better than the traditional approach of polling a time-series database, overcoming high dimensionality/cardinality limitations and giving engineers more reliable, real-time alerting. The transition to the streaming path has opened the door to more advanced use cases and has allowed multiple platform teams at Netflix to generate and maintain alerts programmatically without affecting other users. The streaming paradigm may also help tackle correlation problems in observability and offer new opportunities for the metrics and events verticals, as well as for logs and traces.

https://netflixtechblog.com/improved-alerting-with-atlas-streaming-eval-e691c60dc61e
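A toy contrast of the two approaches (an illustration of the paradigm only, not how Netflix's Atlas implements it): instead of periodically querying a time-series store, the alert condition is evaluated on each data point as it streams in.

```
# Toy illustration of streaming alert evaluation vs. polling a TSDB.
from typing import Iterable, Iterator, Tuple

THRESHOLD = 0.9  # illustrative alert threshold

def streaming_eval(points: Iterable[Tuple[str, float]]) -> Iterator[str]:
    """Evaluate the alert condition on every point as it arrives,
    so no periodic query against a high-cardinality store is needed."""
    for series, value in points:
        if value > THRESHOLD:
            yield f"ALERT {series}: {value:.2f} > {THRESHOLD}"

if __name__ == "__main__":
    incoming = [("cpu.node1", 0.42), ("cpu.node2", 0.95), ("cpu.node1", 0.97)]
    for alert in streaming_eval(incoming):
        print(alert)
```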
In this post, the author discusses potential PostgreSQL pitfalls that may not affect small databases, but can cause issues when databases grow.
https://philbooth.me/blog/nine-ways-to-shoot-yourself-in-the-foot-with-postgresql
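One pitfall commonly covered in lists like this is long-lived "idle in transaction" sessions, which hold back vacuum and cause bloat as tables grow; a minimal check for it, where the DSN and the 5-minute threshold are placeholder assumptions and the query is not taken from the post:

```
# Illustrative check for sessions stuck "idle in transaction" in PostgreSQL.
import psycopg2

QUERY = """
SELECT pid, usename, now() - xact_start AS open_for, left(query, 80) AS last_query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND now() - xact_start > interval '5 minutes'
ORDER BY open_for DESC;
"""

with psycopg2.connect("dbname=app user=postgres host=localhost") as conn:
    with conn.cursor() as cur:
        cur.execute(QUERY)
        for pid, user, open_for, last_query in cur.fetchall():
            print(f"pid={pid} user={user} open_for={open_for} query={last_query!r}")
```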
Pipedrive Infra manages numerous Kubernetes clusters across different clouds, including AWS and on-premise OpenStack. They had been experiencing intermittently failing pod health checks, which became more frequent over time. After an extensive investigation, the team discovered that kubelet was initiating TCP sessions to pods using random source ports within the same range that Kubernetes reserves for nodeports. This caused the TCP SYN-ACK to be redirected to other pods, leading to failed health checks. The fix was to disallow the nodeport range as a source-port range for outgoing TCP sessions, a single line of code that effectively resolved the issue.

https://medium.com/pipedrive-engineering/solving-the-mystery-of-pods-health-checks-failures-in-kubernetes-55b375493d03
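The underlying idea is to keep the kernel from choosing ephemeral source ports out of the nodeport range; a sketch of that via net.ipv4.ip_local_reserved_ports (the exact one-liner in the article may differ, and 30000-32767 is Kubernetes' default nodeport range, assumed here):

```
# Sketch: reserve the Kubernetes nodeport range so the kernel never picks those
# ports as ephemeral source ports for outgoing TCP sessions (e.g. kubelet probes).
# Needs root and applies only to the node it runs on.
import pathlib

NODEPORT_RANGE = "30000-32767"
SYSCTL = pathlib.Path("/proc/sys/net/ipv4/ip_local_reserved_ports")

current = SYSCTL.read_text().strip()
if NODEPORT_RANGE not in current:
    new_value = f"{current},{NODEPORT_RANGE}" if current else NODEPORT_RANGE
    SYSCTL.write_text(new_value)
print("ip_local_reserved_ports =", SYSCTL.read_text().strip())
```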
Efficient GPU utilization is crucial for minimizing infrastructure expenses, especially in large Kubernetes clusters running AI and HPC workloads. NVIDIA MIG enables partitioning GPUs into smaller slices, but using MIG in Kubernetes through the NVIDIA GPU Operator alone has limitations due to static configurations. Dynamic MIG Partitioning addresses these limitations by automating the creation and deletion of MIG profiles based on real-time workload requirements, ensuring optimal GPU utilization. The nos module works alongside the NVIDIA GPU Operator to implement dynamic MIG partitioning, simplifying the management of MIG configurations and reducing operational costs.

https://towardsdatascience.com/dynamic-mig-partitioning-in-kubernetes-89db6cdde7a3
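From the workload side, dynamic partitioning is driven by pods requesting a MIG profile as an extended resource; a sketch with the kubernetes Python client, where the profile name (nvidia.com/mig-1g.5gb), namespace and image are illustrative assumptions:

```
# Sketch: a pod requesting a single MIG slice as an extended resource, the kind
# of request a dynamic partitioner such as nos reacts to when carving up GPUs.
from kubernetes import client, config

config.load_kube_config()

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="mig-demo", namespace="default"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="cuda-smoke-test",
                image="nvcr.io/nvidia/cuda:12.2.0-base-ubuntu22.04",
                command=["nvidia-smi"],
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/mig-1g.5gb": "1"},
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```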
Have you ever heard of a company migrating from a microservice architecture back to a monolith?
Moving our service to a monolith reduced our infrastructure cost by over 90%. It also increased our scaling capabilities. Today, we’re able to handle thousands of streams and we still have capacity to scale the service even further. Moving the solution to Amazon EC2 and Amazon ECS also allowed us to use the Amazon EC2 compute saving plans that will help drive costs down even further.
https://www.primevideotech.com/video-streaming/scaling-up-the-prime-video-audio-video-monitoring-service-and-reducing-costs-by-90