DevOps & SRE notes – Telegram
Helpful articles and tools for DevOps & SRE

WhatsApp: https://whatsapp.com/channel/0029Vb79nmmHVvTUnc4tfp2F

For paid consultation (RU/EN), contact: @tutunak


All ways to support https://telegra.ph/How-support-the-channel-02-19
The OpenTelemetry Collector is a powerful tool for gathering, processing, and exporting telemetry data from various sources. This article by Frankel provides a deep dive into the OpenTelemetry Collector, explaining its architecture, key features, and how to set it up. Learn how to use the OpenTelemetry Collector to improve observability in your systems by centralizing and standardizing the collection of metrics, traces, and logs.

https://blog.frankel.ch/opentelemetry-collector/
Upgrading Amazon EKS worker nodes is crucial for maintaining security, performance, and access to new features. This AWS blog post explains how to use Karpenter to automate the upgrade of EKS worker nodes, specifically handling node drift. Learn about the process and best practices to ensure smooth upgrades, minimize downtime, and maintain consistency in your Kubernetes environment.

https://aws.amazon.com/blogs/containers/how-to-upgrade-amazon-eks-worker-nodes-with-karpenter-drift/
Performance regressions in cloud environments can be challenging to diagnose and resolve. This article from DoltHub discusses a "spooky" performance regression issue encountered with AWS Elastic Block Store (EBS). It explores the investigative steps taken to identify the root cause, the lessons learned, and best practices for monitoring and mitigating similar issues in cloud storage systems.

https://www.dolthub.com/blog/2023-11-22-spooky-performance-regression-aws-ebs/
Detecting and managing infrastructure drift is crucial for maintaining the desired state of your AWS resources. This article from ShipMonk's Product Development blog explains how to implement drift checks in Terraform for AWS environments. Learn about tools and techniques to identify, monitor, and remediate drift, ensuring your infrastructure remains consistent and compliant with your configurations.
https://pd.shipmonk.com/terraform-aws-drift-checks/
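One common way to script a drift check is to run `terraform plan -detailed-exitcode` on a schedule and act on the exit code. The sketch below is an illustration under that assumption and is not necessarily the approach taken in the linked article. The function names and the `workdir` parameter are hypothetical.

```python
import subprocess

# Exit codes of `terraform plan -detailed-exitcode`:
#   0 = plan succeeded, no changes
#   1 = error
#   2 = plan succeeded, changes are present
def classify_plan_exit(code: int) -> str:
    """Map a `terraform plan -detailed-exitcode` exit code to a label."""
    if code == 0:
        return "in-sync"
    if code == 2:
        return "drift"
    return "error"


def check_drift(workdir: str) -> str:
    """Run a plan in `workdir` (hypothetical path) and classify the result.

    Assumes the `terraform` binary is on PATH and the directory has
    already been initialised with `terraform init`.
    """
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return classify_plan_exit(result.returncode)
```

Note that exit code 2 reports any difference between the configuration and the real infrastructure, so unapplied code changes show up the same way as out-of-band changes.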
Managing access to Amazon RDS instances securely is vital for protecting your data and maintaining compliance. This article from SymOps discusses strategies for controlling and auditing access to RDS instances in AWS environments. Learn about best practices, tools, and techniques to enhance security, streamline access management, and ensure that only authorized users can interact with your databases.

https://blog.symops.com/post/rds-access
Understanding the internals of GNU/Linux, including file descriptors, pipes, terminals, user sessions, process groups, and daemons, is essential for Site Reliability Engineers (SREs). This comprehensive guide by Biriukov covers these critical concepts, explaining how they function and interconnect within a Linux environment. Learn how these components work together to manage processes and sessions, providing a foundation for advanced system troubleshooting and performance optimization.

https://biriukov.dev/docs/fd-pipe-session-terminal/0-sre-should-know-about-gnu-linux-shell-related-internals-file-denoscriptors-pipes-terminals-user-sessions-process-groups-and-daemons/
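As a small illustration of two of the concepts listed above, file descriptors and pipes, here is a minimal Python sketch. It is not taken from the guide. The helper name `pipe_roundtrip` is made up for this example, and it assumes the message is small enough to fit in the kernel pipe buffer, since otherwise the write would block with no reader running.

```python
import os


def pipe_roundtrip(message: bytes) -> bytes:
    """Write `message` into a pipe and read it back from the other end."""
    # os.pipe() returns two file descriptors, which are plain integers:
    # one for the read end and one for the write end.
    read_fd, write_fd = os.pipe()
    try:
        os.write(write_fd, message)
    finally:
        # Closing the write end is what lets the reader see end-of-file.
        os.close(write_fd)

    chunks = []
    try:
        while True:
            chunk = os.read(read_fd, 4096)
            if not chunk:  # an empty read means EOF
                break
            chunks.append(chunk)
    finally:
        os.close(read_fd)
    return b"".join(chunks)
```

Calling `pipe_roundtrip(b"hello")` returns the same bytes. If the write end were left open, the read loop would block forever waiting for more data.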
Monitoring Redis metrics is crucial for maintaining optimal performance and ensuring system reliability. This article from Sematext outlines key Redis metrics to monitor, such as memory usage, latency, and command processing. Learn about best practices for tracking and analyzing these metrics to prevent issues, optimize performance, and ensure the smooth operation of your Redis instances.

https://sematext.com/blog/redis-metrics/
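Many of these metrics come from the Redis `INFO` command, which returns `key:value` lines grouped under `# Section` headers. The sketch below parses that text and derives a cache hit ratio from `keyspace_hits` and `keyspace_misses`. It is an illustration rather than code from the article. The helper names and the sample values are made up.

```python
from typing import Dict, Optional


def parse_info(text: str) -> Dict[str, str]:
    """Parse Redis INFO output into a flat dict of field name -> raw value."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and section headers
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def hit_ratio(fields: Dict[str, str]) -> Optional[float]:
    """Return keyspace hits / (hits + misses), or None if there were no lookups."""
    hits = int(fields.get("keyspace_hits", "0"))
    misses = int(fields.get("keyspace_misses", "0"))
    total = hits + misses
    return hits / total if total else None


# Made-up sample in the INFO format (real output has many more fields).
SAMPLE_INFO = (
    "# Memory\r\n"
    "used_memory:1048576\r\n"
    "# Stats\r\n"
    "instantaneous_ops_per_sec:120\r\n"
    "keyspace_hits:900\r\n"
    "keyspace_misses:100\r\n"
)

sample_fields = parse_info(SAMPLE_INFO)
```

With the sample above, `hit_ratio(sample_fields)` evaluates to 0.9, and `sample_fields["used_memory"]` holds the memory usage in bytes as a string.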
Monitoring Redis metrics is essential for ensuring optimal performance and reliability of your Redis instances. This article from Sematext explores the key Redis metrics you should track, including memory usage, latency, and command performance. Learn how to leverage these metrics to identify potential issues, optimize resource usage, and maintain a high-performing Redis environment.

https://semaphoreci.com/blog/security-cloud-environment
Understanding different team types is crucial for structuring effective organizations and fostering collaboration. This article from IT Revolution outlines the four team types in modern software development: Stream-aligned, Enabling, Complicated-Subsystem, and Platform teams. Learn how each team type functions, their responsibilities, and how they can work together to deliver value efficiently and improve overall organizational performance.

https://itrevolution.com/articles/four-team-types/
Speeding up container image builds is vital for efficient CI/CD pipelines. This article from the CD Foundation discusses how to optimize container image builds using Tekton Pipelines. Discover strategies and best practices for reducing build times, enhancing build efficiency, and leveraging Tekton's capabilities to streamline your development workflow.
https://cd.foundation/blog/2023/10/12/speed-up-container-image-builds-tekton-pipelines/