DevOps & SRE notes – Telegram
Helpful articles and tools for DevOps & SRE

WhatsApp: https://whatsapp.com/channel/0029Vb79nmmHVvTUnc4tfp2F

For paid consultation (RU/EN), contact: @tutunak


Ways to support the channel: https://telegra.ph/How-support-the-channel-02-19
Lawrence Jones explains how to measure and interpret latency in distributed systems. He argues for percentiles and histograms over simple averages, since an average can hide the slow tail that users actually notice.
https://blog.lawrencejones.dev/latency/
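To make the averages-versus-percentiles point concrete, here is a minimal sketch. It is not taken from the article: the sample data, the `percentile` helper (nearest-rank method), and the `histogram` bucket bounds are all assumptions chosen for illustration.

```python
import statistics


def percentile(samples, p):
    """Nearest-rank percentile for an integer p in 1..100 (illustrative helper)."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 1 <= p <= 100:
        raise ValueError("p must be an integer between 1 and 100")
    ordered = sorted(samples)
    # Integer ceiling of p * n / 100 avoids floating-point rounding surprises.
    rank = (p * len(ordered) + 99) // 100
    return ordered[rank - 1]


def histogram(samples, upper_bounds):
    """Count samples per bucket; each bucket holds values <= its upper bound."""
    counts = {bound: 0 for bound in upper_bounds}
    counts["+Inf"] = 0
    for value in samples:
        for bound in upper_bounds:
            if value <= bound:
                counts[bound] += 1
                break
        else:
            # Value exceeded every finite bound.
            counts["+Inf"] += 1
    return counts


# Hypothetical data: 98 fast requests (20 ms) and 2 very slow ones (2000 ms).
latencies_ms = [20] * 98 + [2000] * 2

mean_ms = statistics.mean(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)
buckets = histogram(latencies_ms, [50, 500])

print(f"mean={mean_ms} ms, p50={p50_ms} ms, p99={p99_ms} ms")
print(buckets)
```

With this made-up sample the mean is 59.6 ms, a value no request actually experienced, while p50 (20 ms) and p99 (2000 ms) describe the typical and the tail request separately. The histogram shows the same split as two populated buckets.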
The Adore Me tech team describes how they adopted GitOps with Flux and the practices they settled on. The post covers their implementation strategy, the challenges they overcame, and the benefits of a fully declarative approach to continuous delivery.
https://adoreme.tech/mastering-gitops-with-flux-adoreme-024b56ac397b
This article from "login:" magazine explains how Google's SRE teams are adopting the STAMP (System-Theoretic Accident Model and Processes) framework. This shift moves from preventing individual component failures to managing complex system interactions for improved reliability.
https://www.usenix.org/publications/loginonline/evolution-sre-google
This blog post is a guide to implementing SMART-on-FHIR authentication for AWS HealthLake using Terraform. It walks through the necessary configurations for HealthLake, Cognito, and Lambda to create a secure healthcare application.
https://medium.com/@kczpl/how-to-implement-smart-on-fhir-with-aws-healthlake-using-terraform-130389a1c0b8
#friday

Well, why not take part in this never-ending challenge
Ari Zilka's article for The New Stack discusses the challenges in the data observability market, highlighting how proprietary systems create data silos and limit value. The piece advocates for open standards like OpenTelemetry to foster interoperability and innovation.
https://thenewstack.io/the-looming-crisis-in-the-data-observability-market/
FluxCD UI - Coming soon

The Flux Status Page is a lightweight, mobile-friendly web interface providing real-time visibility into your GitOps pipelines. Embedded directly within the Flux Operator, it requires no additional installation steps.

Designed for DevOps engineers and platform teams, the Status Page offers direct insight into your Kubernetes clusters. It allows you to track app deployments, monitor controller readiness, and troubleshoot issues instantly, without needing to access the CLI.

Built with security in mind, the interface is strictly read-only, ensuring it never interferes with Flux controllers or compromises cluster security. Together with the Flux MCP Server, it provides a comprehensive solution for on-call monitoring and Agentic AI incident response in production environments.

https://github.com/controlplaneio-fluxcd/flux-operator/pull/488
This piece offers a detailed look into the system architecture that powers Netflix's streaming service. It covers the company's cloud-native approach, its use of microservices, and its sophisticated content delivery network (CDN).
https://www.clickittech.com/software-development/netflix-architecture/
Karsten Schnitter's post on the OpenSearch blog explores how to visualize metrics ingested with OpenTelemetry using OpenSearch Dashboards. The author provides example visualizations for monitoring Kubernetes container metrics.
https://opensearch.org/blog/opentelemetry-metrics-visualization/