DevOps & SRE notes – Telegram
Helpful articles and tools for DevOps & SRE

WhatsApp: https://whatsapp.com/channel/0029Vb79nmmHVvTUnc4tfp2F

For paid consultation (RU/EN), contact: @tutunak


All ways to support https://telegra.ph/How-support-the-channel-02-19
This blog post argues for integrating action items into incident reviews, on the grounds that learning and identifying improvements are inseparable parts of the process. The author contends that discussing potential actions during a review deepens understanding and leads to more effective system improvements, challenging the notion that a focus on actions detracts from learning.

https://incident.io/blog/why-i-like-discussing-actions-items-in-incident-reviews
This blog post discusses strategies for building resilient applications on Kubernetes, emphasizing the importance of proper configuration to harness the platform's dynamic nature. It covers key topics such as configuring health probes, handling pod termination gracefully, and implementing pod distribution strategies to improve application stability and reduce downtime.

https://jaadds.medium.com/building-resilient-applications-on-kubernetes-9e9e4edb4d33
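
As a rough illustration of the graceful-termination point in the post above: Kubernetes sends SIGTERM to a container before it is killed, so a worker can stop picking up new work once that signal arrives. The sketch below is not taken from the article; `GracefulShutdown` and `run_worker` are hypothetical names, and the "work" is a placeholder.

```python
import signal
import threading


class GracefulShutdown:
    """Tracks whether the process has been asked to stop (hypothetical helper)."""

    def __init__(self):
        self._stop = threading.Event()

    def install(self):
        # Kubernetes sends SIGTERM first, then SIGKILL once the
        # termination grace period expires. Call this from the main thread.
        signal.signal(signal.SIGTERM, self._handle)

    def _handle(self, signum, frame):
        # Only record the request; the worker loop decides when to exit.
        self._stop.set()

    @property
    def stopping(self):
        return self._stop.is_set()


def run_worker(shutdown, jobs):
    """Process jobs until they run out or a shutdown has been requested."""
    done = []
    for job in jobs:
        if shutdown.stopping:
            break  # stop taking new work; earlier jobs have already finished
        done.append(job * 2)  # placeholder for real work
    return done
```

In a real service, `install()` would be called once at startup, and the loop would also close connections and flush buffers before returning.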
This blog post argues against discussing action items during incident reviews, emphasizing that these meetings should focus on learning and understanding system behavior. The author contends that dedicating time to action items reduces opportunities for valuable insights, as incident reviews offer a unique chance for diverse team members to explore the intricacies of complex socio-technical systems.

https://surfingcomplexity.blog/2024/09/28/why-i-dont-like-discussing-action-items-during-incident-reviews/
This guide provides instructions on constructing an AI agent for Site Reliability Engineering (SRE). It offers insights into leveraging artificial intelligence to enhance operational efficiency and reliability in software systems.

https://www.aptible.ai/guides/how-to-build-an-ai-agent-for-sre
This blog post discusses Gitpod's decision to move away from Kubernetes for hosting cloud development environments. It details the challenges they faced in using Kubernetes for this purpose, including resource management, security, and operational complexities.

https://www.gitpod.io/blog/we-are-leaving-kubernetes
This blog post discusses Temporal Cloud's expansion to become a multi-cloud platform, detailing their journey from AWS to Google Cloud. It explores the challenges of multi-cloud deployments and how Temporal leveraged its own technology to create cloud-agnostic workflows, enabling easier management of infrastructure across different cloud providers.

https://temporal.io/blog/multi-cloud-thats-one-small-step-for-temporal-one-giant-leap-for-reliability
This article provides a comprehensive guide on selecting appropriate Postgres indexes to optimize database performance. It covers the fundamentals of indexes, their benefits and drawbacks, and offers practical advice on when and how to implement them effectively.


https://incident.io/blog/choosing-the-right-postgres-indexes
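
To make the "when does an index help" question concrete, here is a small sketch that compares a query plan before and after adding an index. It deliberately uses Python's built-in SQLite rather than Postgres so it runs without a database server; the planner output differs between the two, but the scan-versus-index-lookup distinction is the same idea. The `users` table and `idx_users_email` index are hypothetical and not taken from the article.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # Each row is (id, parent, notused, detail); the detail column is the text.
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

lookup = "SELECT * FROM users WHERE email = ?"

# Without an index on email, the planner has to scan the whole table.
plan_before = query_plan(conn, lookup, ("user42@example.com",))

# With the index, the planner can look the row up directly.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(conn, lookup, ("user42@example.com",))

print("before:", plan_before)
print("after: ", plan_after)
```

In Postgres the equivalent check is running `EXPLAIN` on the query. The trade-off is that every index also has to be maintained on writes.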
This guide demonstrates how to create an event-driven image processing pipeline using AWS Lambda, S3, and Terraform. When an image is uploaded to a source S3 bucket, it triggers a Lambda function that pixelates the image in real time and stores the processed versions in a separate S3 bucket.

https://deepak-tyagi.medium.com/image-pixelator-event-driven-real-time-s3-to-lambda-image-processing-workflow-using-terraform-6fb88ba5cdf5
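
The two pieces of logic such a Lambda function needs can be sketched without any AWS dependencies: reading the bucket and key out of the S3 event, and pixelating by averaging tiles. This is a minimal sketch and not the article's code; `uploaded_objects` and `pixelate` are hypothetical names, and the image is assumed to be a rectangular grid of grayscale integers, where a real function would use an image library and an S3 client.

```python
from urllib.parse import unquote_plus


def uploaded_objects(event):
    """Extract (bucket, key) pairs from an S3 event notification payload."""
    return [
        # Object keys arrive URL-encoded in S3 notifications.
        (rec["s3"]["bucket"]["name"], unquote_plus(rec["s3"]["object"]["key"]))
        for rec in event.get("Records", [])
    ]


def pixelate(pixels, block):
    """Replace every block x block tile with the integer average of its values."""
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    out = [row[:] for row in pixels]  # leave the input untouched
    for top in range(0, height, block):
        for left in range(0, width, block):
            ys = range(top, min(top + block, height))
            xs = range(left, min(left + block, width))
            tile = [pixels[y][x] for y in ys for x in xs]
            avg = sum(tile) // len(tile)
            for y in ys:
                for x in xs:
                    out[y][x] = avg
    return out
```

A handler would call `uploaded_objects(event)`, download each object, run `pixelate` at one or more block sizes, and write the results to the destination bucket.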
This blog post covers Advanced Network Observability for container networking in Azure Kubernetes Service (AKS).

https://pixelrobots.co.uk/2024/06/advanced-network-observability-supercharging-container-network-observability-in-azure-kubernetes-service-aks/
This blog post introduces a solution for deploying serverless Streamlit applications using Terraform on AWS. It describes a scalable architecture that leverages Amazon ECS with Fargate, CloudFront for content delivery, and a CI/CD pipeline using CodePipeline and CodeBuild. The solution aims to simplify the deployment process, allowing developers to focus on building AI-powered apps without managing infrastructure.

https://aws.amazon.com/blogs/devops/accelerate-serverless-streamlit-app-deployment-with-terraform/
This article discusses using the Monkale CoreDNS Manager Operator to manage internal DNS in air-gapped Kubernetes (k3s) clusters. It provides a comprehensive guide on exposing k3s CoreDNS to the network, utilizing it as a DNS server, and managing DNS zones using Custom Resource Definitions (CRDs).

https://itnext.io/managing-internal-dns-in-air-gapped-k3s-clusters-with-monkale-coredns-manager-operator-fa1c9136cc2c