DevOps & SRE notes – Telegram
Helpful articles and tools for DevOps & SRE

WhatsApp: https://whatsapp.com/channel/0029Vb79nmmHVvTUnc4tfp2F

For paid consultation (RU/EN), contact: @tutunak


All the ways to support the channel: https://telegra.ph/How-support-the-channel-02-19
This talk covers data agility and the challenges of building large-scale data systems. Martin Kleppmann discusses the complexity of integrating multiple components such as databases, caches, search engines, and graph systems. He presents event streams, with Kafka and Samza as example systems, as a way to improve scalability and reduce complexity by processing data through a unified, ordered log. He emphasizes loose coupling between components, event-driven architectures, and stream processing as the route to a more scalable and maintainable system.

https://www.youtube.com/watch?v=b_H4FFE3wP0
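
The "unified, ordered log" idea can be illustrated with a small in-memory sketch. This is not Kafka's or Samza's API; `EventLog`, `Consumer`, and the two derived views are hypothetical names chosen for illustration. It assumes a single process and only shows how independent consumers can build a cache and a search index from the same ordered stream.

```python
from collections import defaultdict


class EventLog:
    """Append-only, ordered log; consumers read it by offset."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)
        return len(self._events) - 1  # offset of the newly appended event

    def read_from(self, offset):
        return self._events[offset:]


class Consumer:
    """Keeps its own offset, so each consumer catches up independently."""

    def __init__(self, log, apply):
        self.log = log
        self.apply = apply
        self.offset = 0

    def poll(self):
        for event in self.log.read_from(self.offset):
            self.apply(event)
            self.offset += 1


log = EventLog()

# Two derived views fed by the same log (hypothetical examples).
cache = {}
index = defaultdict(set)


def update_cache(event):
    cache[event["id"]] = event["text"]


def update_index(event):
    for word in event["text"].lower().split():
        index[word].add(event["id"])


cache_consumer = Consumer(log, update_cache)
index_consumer = Consumer(log, update_index)

log.append({"id": 1, "text": "Kafka stores ordered events"})
log.append({"id": 2, "text": "Samza processes events"})

cache_consumer.poll()
index_consumer.poll()
```

Because the writers only append to the log and each view is rebuilt from it, adding a third view would mean adding another consumer rather than changing the producers.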
👍32
Policy enforcement is critical in Kubernetes environments to ensure security and compliance. This article by Javier Canizalez explains how to use Gatekeeper to restrict the kubectl exec command, enhancing security by preventing unauthorized access to running containers. Learn about the steps to configure Gatekeeper for policy enforcement and how to restrict potentially dangerous operations within your Kubernetes clusters.

https://medium.com/@javier-canizalez/policy-enforcement-in-kubernetes-restricting-kubectl-exec-with-gatekeeper-7e99823465c9
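
For orientation, the decision such a policy makes can be sketched in plain Python. Gatekeeper policies are written in Rego, so this is not the article's policy and not Gatekeeper's API; it only mirrors the logic, assuming the request has the shape of a Kubernetes AdmissionRequest, where `kubectl exec` appears as a `CONNECT` operation on the `pods/exec` subresource. The group name `cluster-debuggers` is a hypothetical exemption.

```python
# Hypothetical group that is still allowed to exec into pods.
ALLOWED_GROUPS = {"cluster-debuggers"}


def review_exec(request):
    """Return (allowed, message) for an AdmissionRequest-shaped dict."""
    is_exec = (
        request.get("operation") == "CONNECT"
        and request.get("resource", {}).get("resource") == "pods"
        and request.get("subResource") == "exec"
    )
    if not is_exec:
        # Anything that is not an exec request is outside this policy.
        return True, ""

    user_info = request.get("userInfo", {})
    if set(user_info.get("groups", [])) & ALLOWED_GROUPS:
        return True, ""

    user = user_info.get("username", "unknown")
    return False, f"exec into pods is not permitted for user {user}"


exec_request = {
    "operation": "CONNECT",
    "resource": {"group": "", "version": "v1", "resource": "pods"},
    "subResource": "exec",
    "userInfo": {"username": "alice", "groups": ["developers"]},
}
allowed, message = review_exec(exec_request)
```

In a real cluster the equivalent rule would live in a Gatekeeper ConstraintTemplate; the sketch is only meant to make the matching conditions explicit.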
👍4
Upgrading AWS EKS clusters can be complex, but using a blue-green deployment strategy can make the process more seamless and reduce downtime. This article from OneFootball Locker Room explains how to optimize EKS cluster upgrades using the blue-green tactic. Learn how this approach ensures smooth transitions between cluster versions, minimizes risk, and maintains high availability during the upgrade process.

https://medium.com/onefootball-locker-room/from-blue-to-green-optimizing-aws-eks-clusters-upgrade-with-blue-green-tactic-2ee7c4920755
👍3
Security training is a fundamental part of maintaining a secure and resilient organization. This article from PagerDuty outlines their approach to security training, detailing how they empower employees to recognize and mitigate security threats. Learn about the key components of their security training program, including best practices, ongoing education, and the importance of fostering a security-conscious culture across the company.

https://www.pagerduty.com/blog/security-training-at-pagerduty/
👍4
🎵 Do you listen to tech podcasts?
Final results: Yes 56%, No 44%
👍2
Running GPU-accelerated workloads, especially large language models (LLMs), on Amazon EKS can significantly enhance performance for AI and machine learning applications. This article from Prodigy Engineering explains how to configure and manage GPU-accelerated workloads on EKS. Learn about the necessary steps, best practices, and challenges involved in optimizing Kubernetes clusters to run GPU-intensive tasks efficiently.

https://medium.com/prodigy-engineering/running-gpu-accelerated-llm-workloads-on-eks-9928c07d30ea
👍2
Kubernetes can offer tremendous benefits, but it's not without its challenges. This article from Encore shares real-world "horror stories" from Kubernetes environments, highlighting common mistakes and pitfalls teams have faced. Through these cautionary tales, learn how to avoid misconfigurations, optimize cluster performance, and prevent operational disasters in your own Kubernetes deployments.

https://encore.dev/blog/horror-stories-k8s
👍5💩1
DNS issues can be particularly troublesome when using NGINX as a reverse proxy. This article by Hwchiu on Medium addresses common DNS-related problems encountered in NGINX reverse proxy setups, explaining the root causes and offering solutions to resolve them. Learn about configuration tips, troubleshooting steps, and best practices to ensure reliable DNS resolution in your NGINX reverse proxy deployments.

https://hwchiu.medium.com/nginx-reverse-proxy-dns-issue-671d911dc5fa
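
One frequently reported cause of this class of problem, which may or may not be the exact one the article covers, is that a proxy resolves an upstream hostname once and keeps using the address after the backend has moved. The sketch below is a generic illustration in Python, not NGINX configuration: `CachedResolver`, the fake DNS table, and the injected clock are all hypothetical, and the point is only that honoring a TTL bounds how long a stale address can be served.

```python
import time


class CachedResolver:
    """Caches lookups, but re-resolves once an entry is older than the TTL."""

    def __init__(self, resolve, ttl_seconds, clock=time.monotonic):
        self._resolve = resolve      # function: hostname -> address
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the demo needs no sleeping
        self._cache = {}             # hostname -> (address, expires_at)

    def lookup(self, hostname):
        now = self._clock()
        entry = self._cache.get(hostname)
        if entry is not None and now < entry[1]:
            return entry[0]          # still fresh, serve from cache
        address = self._resolve(hostname)
        self._cache[hostname] = (address, now + self._ttl)
        return address


# Fake DNS table and fake clock (assumptions for the demo).
records = {"backend.internal": "10.0.0.1"}
now = [0.0]
resolver = CachedResolver(records.__getitem__, ttl_seconds=30, clock=lambda: now[0])

first = resolver.lookup("backend.internal")

records["backend.internal"] = "10.0.0.2"  # the backend gets a new address
now[0] = 10.0
within_ttl = resolver.lookup("backend.internal")  # cached entry is still used

now[0] = 31.0
after_ttl = resolver.lookup("backend.internal")   # entry expired, re-resolved
```

A resolver with no expiry at all would keep returning the first address indefinitely, which is the failure mode described above.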
👍4💯3
Securing Kubernetes clusters requires understanding both offensive and defensive strategies. This article by Ridho Adya explores the various attack vectors and defense mechanisms for Kubernetes environments. Learn how to identify vulnerabilities, execute common attack techniques, and implement best practices for defending your Kubernetes clusters against potential threats.

https://medium.com/@ridhoadya/unveiling-the-battlefield-attacking-and-defending-kubernetes-clusters-9702cdbe941a
👍6
Karpenter 1.0, recently announced by AWS, is an open-source Kubernetes cluster autoscaler designed to optimize resource provisioning in real time. This blog post from AWS highlights the key features of Karpenter, explaining how it improves the scalability and efficiency of Kubernetes clusters by automatically adjusting compute resources based on workload demands. Learn how Karpenter 1.0 can simplify cluster management and enhance operational efficiency.

https://aws.amazon.com/blogs/containers/announcing-karpenter-1-0/
🔥6👍42