DevOps & SRE notes – Telegram
Helpful articles and tools for DevOps & SRE

WhatsApp: https://whatsapp.com/channel/0029Vb79nmmHVvTUnc4tfp2F

For paid consultation (RU/EN), contact: @tutunak


All ways to support https://telegra.ph/How-support-the-channel-02-19
🚀 Golang Notes 🐹

Looking for a place to level up your Go skills? Join Golang Notes and stay ahead in the world of Golang!

What you'll find:
🔹 Best practices and coding tips
🔹 Latest updates from the Go ecosystem
🔹 Useful tools, snippets, and guides
🔹 Community discussions and expert insights

👨‍💻 Whether you're a beginner or an experienced developer, this channel has something for you!

🔗 Join now
The article "Autoscaling with Keda and Prometheus Using Custom Metrics in Go" on *Medium* provides a detailed guide on how to implement autoscaling in Kubernetes using Keda and Prometheus. It demonstrates creating custom Prometheus metrics in a Go application, deploying it on Kubernetes, and configuring Prometheus to scrape these metrics. The article then shows how to integrate Keda with Prometheus to scale pods based on custom metrics, such as the number of HTTP requests or product orders, ensuring dynamic resource allocation during varying traffic conditions.


https://medium.com/vakifbank-teknoloji/autoscaling-with-keda-and-prometheus-using-custom-metrics-in-go-558a64668fc4
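A rough sketch of the arithmetic behind this kind of scaling, not taken from the article: with an average-value target, the Horizontal Pod Autoscaler that KEDA drives aims for roughly ceil(metric total / threshold) replicas. The function name, the default replica bounds, and the sample numbers are assumptions made for illustration, and the sketch ignores HPA tolerance, stabilization windows, and scale-to-zero.

```python
import math


def desired_replicas(metric_total: float, threshold: float,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Approximate replica count for an average-value target.

    metric_total: value returned by the Prometheus query (e.g. requests/sec).
    threshold:    per-pod target configured on the scaler (hypothetical value).
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    raw = math.ceil(metric_total / threshold)
    # Clamp to the configured bounds (illustrative defaults, not KEDA's).
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    for total in (0, 450, 5000):
        print(total, "->", desired_replicas(total, threshold=100))
```

With a threshold of 100 per pod, a query result of 450 asks for 5 pods, while very low or very high values are held at the minimum and maximum.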
👍3
The blog post highlights the security risks of automating Terraform lifecycle management. It discusses how malicious actors can exploit Terraform automation platforms, such as HashiCorp Cloud and Atlantis, by creating custom providers or using data sources to execute malicious code during the terraform plan phase. This can expose sensitive cloud credentials and compromise entire cloud environments. The article emphasizes the need for secure defaults and validation mechanisms in these platforms to mitigate such risks.

https://snyk.io/blog/gitflops-dangers-of-terraform-automation-platforms/
👍2
In his article "TTR: the out-of-control metric," Lorin Hochstein critiques the application of the Time-to-Resolve (TTR) metric in incident management. He argues that since incidents represent periods when systems are out of control, applying statistical analyses to TTR is ineffective and does not lead to meaningful improvements.

https://surfingcomplexity.blog/2024/11/23/ttr-the-out-of-control-metric/
👍2
Richard Artoul explores the distinctions between "shared nothing" and "shared storage" architectures, particularly within data streaming contexts. He highlights how shared storage systems, by decoupling data from metadata, offer enhanced flexibility and scalability compared to traditional shared-nothing models.
https://www.warpstream.com/blog/the-case-for-shared-storage
👍3
The blog post examines how increasing CPU utilization can lead to higher latency, affecting overall system performance. Through various experiments, the authors observed that latency increases as CPU usage rises, which highlights the importance of optimizing system efficiency to maintain performance under varying loads.

https://github.blog/engineering/architecture-optimization/breaking-down-cpu-speed-how-utilization-impacts-performance/
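One textbook way to build intuition for this effect, which is not the method used in the post: in a simple M/M/1 queueing model, mean response time is the service time divided by (1 - utilization), so latency climbs sharply as a server approaches saturation. The 10 ms service time and the function name below are assumptions chosen for illustration only.

```python
def mm1_mean_latency(service_time_ms: float, utilization: float) -> float:
    """Mean response time in an M/M/1 queue: S / (1 - rho)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return service_time_ms / (1.0 - utilization)


if __name__ == "__main__":
    # Hypothetical 10 ms service time at increasing utilization levels.
    for rho in (0.5, 0.8, 0.9, 0.95):
        latency = mm1_mean_latency(10.0, rho)
        print(f"{rho:.0%} busy -> about {latency:.0f} ms")
```

Real systems add further effects on top of queueing, so measured numbers such as those in the post matter more than the model.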
👍2
🐍 Python Notes 🐍

Stay on top of your Python skills with concise notes, tips, and tricks for every level of developer! Whether you're a beginner or advanced, these notes cover everything from basic syntax to advanced libraries and real-world applications.

📘 Comprehensive Python Guides
⚙️ Practical Coding Tips & Tricks
🚀 Master Python, Step by Step

Subscribe now and boost your Python knowledge! 📲
🔥4
Define sleep and wake-up cycles for your Kubernetes resources. Automatically schedule the shutdown of Deployments, CronJobs, StatefulSets and HorizontalPodAutoscalers that occupy resources in your cluster, and wake them up only when you need them, thereby reducing overall power consumption.

https://github.com/rekuberate-io/sleepcycles
❤‍🔥4
The article discusses the challenges of maintaining Service Level Objectives (SLOs) in a microservices environment. The team redefined their Critical User Journeys (CUJs) and implemented end-to-end (E2E) testing to automate SLO maintenance, resulting in a 99% reduction in maintenance time and immediate impact assessment during incidents.

https://engineering.mercari.com/en/blog/entry/20241204-keeping-user-journey-slos-up-to-date-with-e2e-testing-in-a-microservices-architecture/
👍6
In December 2024, AWS introduced a visual deployment timeline feature for CloudFormation, enhancing the infrastructure-as-code service with real-time visualization of resource provisioning sequences. This timeline offers a graphical representation of the order and duration of resource deployments, providing insights into dependencies and potential bottlenecks.

https://www.infoq.com/news/2024/12/cloudformation-visual-deployment/
👍4
In the article "The Karpenter Transformation," Nadav Buchman from Fiverr Engineering discusses the company's migration of their Kubernetes compute nodes to Karpenter, an open-source Kubernetes node lifecycle manager developed by AWS.
https://medium.com/fiverr-engineering/the-karpenter-transformation-1c278294bd9b
👍41