DevOps & SRE notes – Telegram
Helpful articles and tools for DevOps & SRE

WhatsApp: https://whatsapp.com/channel/0029Vb79nmmHVvTUnc4tfp2F

For paid consultation (RU/EN), contact: @tutunak


All ways to support https://telegra.ph/How-support-the-channel-02-19
Security training is a fundamental part of maintaining a secure and resilient organization. This article from PagerDuty outlines their approach to security training, detailing how they empower employees to recognize and mitigate security threats. Learn about the key components of their security training program, including best practices, ongoing education, and the importance of fostering a security-conscious culture across the company.
https://www.pagerduty.com/blog/security-training-at-pagerduty/
🎵 Do you listen to tech podcasts?
Final results: Yes 56%, No 44%
Running GPU-accelerated workloads, especially large language models (LLMs), on Amazon EKS can significantly enhance performance for AI and machine learning applications. This article from Prodigy Engineering explains how to configure and manage GPU-accelerated workloads on EKS. Learn about the necessary steps, best practices, and challenges involved in optimizing Kubernetes clusters to run GPU-intensive tasks efficiently.

https://medium.com/prodigy-engineering/running-gpu-accelerated-llm-workloads-on-eks-9928c07d30ea
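As a rough illustration of what a GPU-accelerated workload looks like to the scheduler, the Python sketch below builds a minimal Pod manifest that requests GPUs through the `nvidia.com/gpu` extended resource. It assumes the NVIDIA device plugin is already running on the cluster's GPU nodes; the function name, pod name, and image are hypothetical and not taken from the article.

```python
import json


def gpu_pod_manifest(name: str, image: str, gpus: int = 1) -> dict:
    """Build a minimal Pod manifest that requests `gpus` NVIDIA GPUs.

    Assumption: the NVIDIA device plugin exposes the `nvidia.com/gpu`
    resource on the nodes. GPUs are requested under `limits`.
    """
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,  # hypothetical image, replace with your own
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Print JSON that could be piped into `kubectl apply -f -`.
    manifest = gpu_pod_manifest("llm-inference", "example.com/llm-server:latest")
    print(json.dumps(manifest, indent=2))
```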
Kubernetes can offer tremendous benefits, but it's not without its challenges. This article from Encore shares real-world "horror stories" from Kubernetes environments, highlighting common mistakes and pitfalls teams have faced. Through these cautionary tales, learn how to avoid misconfigurations, optimize cluster performance, and prevent operational disasters in your own Kubernetes deployments.

https://encore.dev/blog/horror-stories-k8s
DNS issues can be particularly troublesome when using NGINX as a reverse proxy. This article by Hwchiu on Medium addresses common DNS-related problems encountered in NGINX reverse proxy setups, explaining the root causes and offering solutions to resolve them. Learn about configuration tips, troubleshooting steps, and best practices to ensure reliable DNS resolution in your NGINX reverse proxy deployments.

https://hwchiu.medium.com/nginx-reverse-proxy-dns-issue-671d911dc5fa
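One cause commonly behind this class of problem, noted here as general NGINX behavior rather than as the article's specific finding: a hostname written literally in `proxy_pass` is resolved when the configuration is loaded and then reused, so a backend whose IP changes later keeps receiving traffic at the stale address. The Python sketch below models the difference between resolving once and re-resolving after a TTL. `ResolveOnce`, `TtlResolver`, and the injected lookup function are hypothetical names for illustration, not NGINX APIs.

```python
import time
from typing import Callable

Lookup = Callable[[str], str]  # maps a hostname to an IP address


class ResolveOnce:
    """Resolves the hostname a single time at construction and never again."""

    def __init__(self, host: str, lookup: Lookup) -> None:
        self._ip = lookup(host)

    def address(self) -> str:
        return self._ip


class TtlResolver:
    """Re-resolves the hostname once the cached answer is `ttl` seconds old."""

    def __init__(
        self,
        host: str,
        lookup: Lookup,
        ttl: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._host = host
        self._lookup = lookup
        self._ttl = ttl
        self._clock = clock
        self._ip = lookup(host)
        self._fetched_at = clock()

    def address(self) -> str:
        now = self._clock()
        if now - self._fetched_at >= self._ttl:
            # Cached answer expired: ask DNS again and restart the timer.
            self._ip = self._lookup(self._host)
            self._fetched_at = now
        return self._ip
```

In NGINX itself, the usual remedy is to set the `resolver` directive and put the upstream hostname in a variable used by `proxy_pass`, which makes NGINX re-resolve the name at request time instead of only at configuration load.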
Securing Kubernetes clusters requires understanding both offensive and defensive strategies. This article by Ridho Adya explores the various attack vectors and defense mechanisms for Kubernetes environments. Learn how to identify vulnerabilities, execute common attack techniques, and implement best practices for defending your Kubernetes clusters against potential threats.

https://medium.com/@ridhoadya/unveiling-the-battlefield-attacking-and-defending-kubernetes-clusters-9702cdbe941a
Karpenter 1.0, recently announced by AWS, is a powerful open-source Kubernetes cluster autoscaling tool designed to optimize resource provisioning in real time. This blog post from AWS highlights the key features of Karpenter, explaining how it improves the scalability and efficiency of Kubernetes clusters by automatically adjusting compute resources based on workload demands. Learn how Karpenter 1.0 can simplify cluster management and enhance operational efficiency.

https://aws.amazon.com/blogs/containers/announcing-karpenter-1-0/
Terraform drift detection is essential for ensuring that your infrastructure remains consistent with your code. This article from Let's Do DevOps explores how to implement and manage drift detection in Terraform environments. Learn about the tools, techniques, and best practices for identifying infrastructure drift and keeping your deployments aligned with their intended state.

https://www.letsdodevops.com/p/lets-do-devops-terraform-drift-detection
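One simple way to check for drift, sketched here as an assumption and not necessarily the approach the article takes, is to run `terraform plan -detailed-exitcode` on a schedule and interpret its exit code: 0 means the plan is empty, 2 means the plan contains changes, and 1 means the command failed. The names `PlanResult`, `classify_exit_code`, and `detect_drift` are hypothetical.

```python
import subprocess
from enum import Enum


class PlanResult(Enum):
    IN_SYNC = "in_sync"
    DRIFTED = "drifted"
    ERROR = "error"


def classify_exit_code(code: int) -> PlanResult:
    """Map the exit code of `terraform plan -detailed-exitcode` to a result."""
    if code == 0:
        return PlanResult.IN_SYNC  # plan succeeded with an empty diff
    if code == 2:
        return PlanResult.DRIFTED  # plan succeeded with a non-empty diff
    return PlanResult.ERROR  # 1 (or anything unexpected) is treated as failure


def detect_drift(workdir: str) -> PlanResult:
    """Run a plan in `workdir` (an already initialized Terraform directory)."""
    proc = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
        check=False,  # exit code 2 is an expected outcome, not an exception
    )
    return classify_exit_code(proc.returncode)
```

A non-empty plan can also come from code changes that have not been applied yet, so this check is most meaningful when run against the branch that was last applied.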
Reducing networking costs is crucial for optimizing cloud infrastructure, especially when managing traffic between tools like Flux and GitHub. This article from Tenets explores strategies for minimizing networking expenses by optimizing traffic flow between Flux and GitHub. Learn how to implement cost-saving measures without compromising performance or security in your continuous deployment workflows.

https://medium.com/tenets/saving-networking-costs-for-traffic-flow-between-flux-github-b1cebc76fd41
Using TargetGroupBinding with AWS Load Balancer Controller enables more efficient traffic routing to Kubernetes workloads. This AWS blog post explores common patterns for configuring TargetGroupBinding to integrate AWS Load Balancers with Kubernetes services. Learn how to leverage these patterns to optimize network traffic, enhance scalability, and ensure high availability for your Kubernetes applications.

https://aws.amazon.com/blogs/containers/patterns-for-targetgroupbinding-with-aws-load-balancer-controller/
Automating credential rotation is a key practice for maintaining security in cloud environments. This article from Mixpanel Engineering explains how to automate the rotation of credentials using Terraform. It covers the setup, tools, and processes for securely rotating secrets and API keys, ensuring that your infrastructure remains secure without manual intervention.

https://engineering.mixpanel.com/automate-rotating-credentials-using-terraform-b0e7dab4d793
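The core decision in any rotation scheme is when a credential counts as too old. The Python sketch below shows that check in isolation, assuming a fixed maximum age per credential. It does not reproduce the article's Terraform setup, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def next_rotation(created_at: datetime, max_age: timedelta) -> datetime:
    """Return the moment at which the credential must be rotated."""
    return created_at + max_age


def rotation_due(
    created_at: datetime,
    max_age: timedelta,
    now: Optional[datetime] = None,
) -> bool:
    """Return True once the credential has reached its maximum age."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= next_rotation(created_at, max_age)


if __name__ == "__main__":
    created = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print(rotation_due(created, timedelta(days=90)))
```

An automated pipeline would run a check like this on a schedule and, when it returns True, issue a new credential before revoking the old one so that consumers are never left without a valid secret.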